Scene Understanding for Autonomous Vehicles

This project puts into practice concepts and techniques for developing deep neural networks that perform three main computer vision tasks: object detection, segmentation and recognition. These concepts are applied in a specific environment: autonomous vehicles.



Guillem Delgado guillemdelgado
Francisco Roldan franroldans
Jordi Gené Jordi-Gene-Mola

Supervisors: robertbenavente and lluisgomez.

Final report

The final report, which details and summarizes the work done each week of the project, can be found here.

Final Slides

The final slides for the presentation, detailing and summarizing the work done each week, can be found here.


Datasets Analysis

We have manually inspected the data we worked with to facilitate the interpretation of the results obtained. Find the dataset analysis here.


The weights of the different models can be found here. Because the weight files are very large, only the most successful experiments for each dataset and network are included. However, if the weights of an experiment you are interested in are missing, just open an issue and we will update this Google Drive with your request ASAP.

Object recognition - Week 2


To choose a well-performing object recognition network for our system, we tested several CNNs with different architectures. By changing parameters in code/config/ we were able to test different datasets and networks. In addition, we implemented and tested a deep network with stochastic depth based on residual blocks, which can be found in code/models/.
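As a rough illustration of the stochastic depth idea mentioned above, the following minimal sketch computes a linearly decaying survival probability per residual block and applies one block to a plain list of floats. It is not the code in code/models/: the function names, the `final_survival` value and the toy residual function are assumptions made for illustration, and the activation after the addition is omitted.

```python
import random


def survival_probabilities(num_blocks, final_survival=0.5):
    """Linear decay rule: p_l = 1 - (l / L) * (1 - p_L) for l = 1..L."""
    return [1.0 - (l / num_blocks) * (1.0 - final_survival)
            for l in range(1, num_blocks + 1)]


def stochastic_residual(x, residual_fn, survival, training, rng=random):
    """Apply one residual block with stochastic depth to a list of floats."""
    if training:
        # Drop the whole residual branch with probability 1 - survival.
        if rng.random() >= survival:
            return list(x)
        return [xi + ri for xi, ri in zip(x, residual_fn(x))]
    # At test time the branch is always kept, scaled by its survival probability.
    return [xi + survival * ri for xi, ri in zip(x, residual_fn(x))]


def toy_residual(values):
    """Hypothetical residual branch: doubles every input value."""
    return [2.0 * v for v in values]


probs = survival_probabilities(4, final_survival=0.5)
test_output = stochastic_residual([1.0, 2.0], toy_residual, 0.5, training=False)
```

Early blocks get a survival probability close to 1 and later blocks are dropped more often, so the expected depth during training is shorter than the full network.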


The framework's code is divided as follows:

  • callbacks/ : Folder that handles all the different callbacks involved during training.
  • config/ : Folder that contains all configuration files for the different experiments done and handles them.
  • initializations/ : Useful tools for weights initialization.
  • layers/ : Folder that contains layers not present in Keras such as Deconvolution.
  • metrics/ : Tools for model evaluation
  • models/ : Folder that handles all the different models involved in the project.
  • tools/ : Useful tools to manage deep learning projects.


Results of the different experiments.

How to run the code

See this README to know how to run the code and run the experiments.


  1. Testing the framework:
  • Analyze the dataset; a summary can be found in the Datasets Analysis section.
  • Calculate the accuracy on train and test sets.
  • Evaluate different techniques in the configuration file.
  • Transfer learning to another dataset.
  • Understand the configuration file.
  2. Train networks on different datasets:
  • VGG model from scratch.
  • VGG model fine-tuning with ImageNet weights.
  3. Implementing a new Neural Network:
  • Integrate the new model into the framework.
  • Evaluate the new model on the TT100K dataset.
  4. Boost performance:
  • Data Augmentation.
  • Data Preprocessing.
  • Comparison of optimizers.

Object detection - Week 3 and 4


For object detection we considered two single-shot models: You Only Look Once (YOLO), along with its smaller variant Tiny-YOLO, and the Single-Shot Multibox Detector (SSD). All these models were trained to detect a variety of traffic signs in the TT100K detection dataset and to detect pedestrians, cars and trucks in the Udacity dataset. Faster-RCNN was also tried, but it is not included because of difficulties upgrading it to the newest Keras version.


The contributions done for these weeks are:

  • layers/ : Layers needed for the SSD model
  • models/ : SSD Model.
  • tools/ssd_utils : Utils needed for the SSD Model.


Results of the different experiments.

How to run the code

See this README to know how to run the code and run the experiments.


  1. YOLOv2 model on the TT100K dataset:
  • Analyze the dataset; a summary can be found in the Datasets Analysis section.
  • Calculate the F-score.
  2. Summary of references:
  • Summary of YOLO and F-RCNN.
  3. Implementing a new Neural Network:
  • Integrate the new model (SSD) into the framework.
  • Evaluate the new model on BOTH datasets.
  4. Train the networks on a different dataset:
  • Evaluate YOLO on BOTH datasets.
  • Evaluate SSD on BOTH datasets.
  5. Boost performance:
  • Data Augmentation.
  • Data Preprocessing.
  • Comparison of optimizers.
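
The F-score step in the list above can be sketched as follows. This is a minimal, generic version that assumes the detections have already been matched against the ground truth, so that counts of true positives, false positives and false negatives are available; the function name and arguments are hypothetical and the matching criterion used by the framework is not shown.

```python
def f_score(true_positives, false_positives, false_negatives, beta=1.0):
    """Compute the F-beta score from detection counts (beta=1 gives F1)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # With no predictions or no ground-truth objects the score is defined as 0 here.
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    beta_sq = beta ** 2
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Example: 8 correct detections, 2 spurious detections, 2 missed objects.
example_f1 = f_score(true_positives=8, false_positives=2, false_negatives=2)
```

In the example both precision and recall are 0.8, so the F1 score is also 0.8.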

Object Segmentation - Week 5 and 6


Three different models were tried during these weeks. Starting with FCN-8, we explored regularization methods not tried in previous weeks, such as batch normalization, and trained the model on different datasets (CamVid, Cityscapes and KITTI). SegNet and U-Net were also adapted to Keras 2.0 and tested on the CamVid dataset.


The contributions done for these weeks are:

  • models/ U-Net Model
  • models/ SegNet Model
  • Script that generates the mask predictions.
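
The mask-prediction script listed above is not reproduced here. As a rough sketch of the general idea, and assuming the segmentation network outputs one score per class for every pixel in an array of shape (height, width, classes), a mask can be obtained by taking the highest-scoring class per pixel. The function names and the two-class palette below are hypothetical.

```python
import numpy as np


def scores_to_mask(scores):
    """Turn (H, W, C) per-class scores into an (H, W) mask of class indices."""
    return np.argmax(scores, axis=-1).astype(np.uint8)


def colorize(mask, palette):
    """Map each class index to an RGB colour using a (C, 3) palette."""
    return palette[mask]


# Toy 2x2 image with 2 classes; the values are made up for illustration.
toy_scores = np.array([[[0.9, 0.1], [0.2, 0.8]],
                       [[0.4, 0.6], [0.7, 0.3]]])
toy_palette = np.array([[0, 0, 0], [255, 255, 255]], dtype=np.uint8)

toy_mask = scores_to_mask(toy_scores)
toy_rgb = colorize(toy_mask, toy_palette)
```

The coloured array can then be saved as an image with any imaging library for visual inspection.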


Results of the different experiments.

How to run the code

See this README to know how to run the code and run the experiments.


  1. FCN8 model:
  2. Summary of references:
  • Summary of FCN and SegNet.
  3. Implementing a new Neural Network:
  • Integrate the new model (SegNet) into the framework.
  • Integrate the new model (U-Net) into the framework.
  4. Train the networks on a different dataset:
  • Evaluate FCN8 for Cityscapes and KITTI.
  5. Boost performance:
  • Comparison of batch sizes.
  • Comparison of learning rates.
  • Comparison of optimizers.
  • Batch normalization.
  • Data augmentation for U-Net.
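
For readers unfamiliar with the batch normalization item above, the following minimal sketch shows what the operation computes at training time on a batch of feature vectors. It is an illustration only, not the layer used in the experiments: the function name is hypothetical, and the running statistics used at inference time are left out.

```python
import numpy as np


def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch axis, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta


# Toy batch of 3 samples with 2 features on very different scales.
toy_batch = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
normalized = batch_norm_train(toy_batch, gamma=np.ones(2), beta=np.zeros(2))
```

After normalization both features have approximately zero mean and unit variance across the batch, regardless of their original scale.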



