guillemdelgado/mcv-m5
Scene Understanding for Autonomous Vehicles

This project aims to put into practice concepts and techniques for developing deep neural networks that address three main computer vision tasks: object detection, segmentation and recognition. These concepts are applied in a specific environment: autonomous vehicles.

Authors

DeepDrivers

Authors | Emails | Github
Guillem Delgado | guillem.delgado@gmail.com | guillemdelgado
Francisco Roldan | fran.roldans@gmail.com | franroldans
Jordi Gené | jordigenemola.1@gmail.com | Jordi-Gene-Mola

robertbenavente and lluisgomez as supervisors.

Final report

The final report, detailing and summarizing the weekly work done in the project, can be found here.

Final Slides

The final slides for the presentations, detailing and summarizing the weekly work, can be found here.

4/4/2018

Datasets Analysis

We have manually inspected the data we have worked with, to facilitate the interpretation of the results obtained. The dataset analysis can be found here.

Weights

The weights of the different models can be found here. Since the weight files are large, only the most successful experiments for each dataset and network are included. However, if the weights of an experiment you are interested in are missing, just open an issue and we will update this Google Drive with your request as soon as possible.

Object recognition - Week 2

Abstract

In order to choose a well-performing object recognition network for our system, we have tested several CNNs with different architectures. By changing parameters in code/config/tt100k_classif.py we were able to test different datasets and different networks. In addition, we implemented and tested a deep network with stochastic depth, based on residual blocks, which can be found in code/models/stochastic_depth.py.
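As an illustration of the stochastic-depth idea, here is a minimal sketch of the linear-decay survival rule from Huang et al. (not the exact code in code/models/stochastic_depth.py): block l of L is kept during training with probability that decays linearly from 1 down to p_L, and at test time the residual branch is scaled by that probability instead.

```python
import random

def survival_probability(l, L, p_L=0.5):
    """Linear-decay survival probability for residual block l of L."""
    return 1.0 - (l / L) * (1.0 - p_L)

def stochastic_residual(x, residual_fn, l, L, p_L=0.5, training=True):
    """Apply a residual block that is randomly dropped during training.

    When the block is dropped, only the identity shortcut is kept;
    at test time the residual branch is scaled by its survival
    probability rather than being dropped.
    """
    p = survival_probability(l, L, p_L)
    if training:
        if random.random() < p:
            return x + residual_fn(x)
        return x  # block dropped: identity shortcut only
    return x + p * residual_fn(x)
```

With p_L = 0.5, the first block always survives and the last survives half the time, so later (less critical) blocks are skipped more often.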

Code

The framework's code is divided as follows:

  • callbacks/ : Folder that handles all the different callbacks involved during training.
  • config/ : Folder that contains all configuration files for the different experiments done and handles them.
  • initializations/ : Useful tools for weights initialization.
  • layers/ : Folder that contains layers not present in Keras such as Deconvolution.
  • metrics/ : Tools for model evaluation.
  • models/ : Folder that handles all the different models involved in the project.
  • tools/ : Useful tools to manage deep learning projects.

Results

Results of the different experiments.

How to run the code

See this README to know how to run the code and run the experiments.

Goals

  1. Testing the framework:
  • Analyze the dataset; a summary can be found in the Datasets Analysis section.
  • Calculate the accuracy on the train and test sets.
  • Evaluate different techniques in the configuration file.
  • Transfer learning to another dataset.
  • Understand the configuration file.
  2. Train networks on different datasets:
  • VGG model from scratch.
  • VGG model fine-tuned with ImageNet weights.
  3. Implementing a new Neural Network:
  • Integrate the new model into the framework.
  • Evaluate the new model on the TT100K dataset.
  4. Boost performance:
  • Data augmentation.
  • Data preprocessing.
  • Comparison of optimizers.
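The data preprocessing step listed above can be sketched as rescaling pixel values and subtracting a per-channel mean computed on the training set. The function and parameter names below are illustrative, not the framework's actual API:

```python
import numpy as np

def preprocess_batch(images, mean=None, rescale=1.0 / 255.0):
    """Rescale pixel values and subtract a per-channel mean.

    `images` is an array of shape (N, H, W, C); `mean` is a
    length-C per-channel mean computed on the training set.
    """
    x = images.astype(np.float32) * rescale
    if mean is not None:
        x = x - np.asarray(mean, dtype=np.float32)
    return x
```

Centering the inputs this way typically makes optimization better conditioned, especially when training from scratch.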

Object detection - Week 3 and 4

Abstract

For object detection we have considered two single-shot models: You Only Look Once (YOLO), using its smaller variant Tiny-YOLO, and the Single-Shot Multibox Detector (SSD). Both models have been trained to detect a variety of traffic signs in the TT100K detection dataset, and to detect pedestrians, cars and trucks in the Udacity dataset. Faster R-CNN was tried but is not included, due to difficulties upgrading it to the newest Keras version.

Code

The contributions done for these weeks are:

  • layers/ssd_layers.py : Layers needed for the SSD Model.
  • models/ssd.py : SSD Model.
  • tools/ssd_utils : Utils needed for the SSD Model.

Results

Results of the different experiments.

How to run the code

See this README to know how to run the code and run the experiments.

Goals

  1. YOLOv2 model on the TT100K dataset:
  • Analyze the dataset; a summary can be found in the Datasets Analysis section.
  • Calculate the F-score.
  2. Summary of references:
  • Summary of YOLO and Faster R-CNN.
  3. Implementing a new Neural Network:
  • Integrate the new model (SSD) into the framework.
  • Evaluate the new model on BOTH datasets.
  4. Train the networks on a different dataset:
  • Evaluate YOLO on BOTH datasets.
  • Evaluate SSD on BOTH datasets.
  5. Boost performance:
  • Data augmentation.
  • Data preprocessing.
  • Comparison of optimizers.
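The F-score evaluation mentioned above can be sketched as greedy IoU matching of predicted boxes to ground-truth boxes, then computing the harmonic mean of precision and recall. This is a simplified single-class illustration, not the framework's evaluation code:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def f_score(detections, ground_truth, iou_threshold=0.5):
    """Greedily match detections to ground truth one-to-one, then F1."""
    matched = set()
    tp = 0
    for det in detections:
        for i, gt in enumerate(ground_truth):
            if i not in matched and iou(det, gt) >= iou_threshold:
                matched.add(i)
                tp += 1
                break
    if tp == 0:
        return 0.0
    fp = len(detections) - tp   # predictions with no matching ground truth
    fn = len(ground_truth) - tp # ground-truth boxes that were missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A full evaluation would additionally split matches by class and by confidence threshold, but the TP/FP/FN bookkeeping is the same.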

Object Segmentation - Week 5 and 6

Abstract

Three different models have been tried during these weeks. Starting with FCN-8, we explored different regularization methods not tried in previous weeks, such as batch normalization, and trained the model on different datasets (CamVid, Cityscapes and KITTI). SegNet and U-Net have also been adapted to Keras 2.0 and tested on the CamVid dataset.
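Segmentation models like these are commonly scored with mean per-class intersection-over-union on the predicted label maps. The following is a minimal sketch of that metric, not the framework's own evaluation code:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean per-class intersection-over-union for integer label maps.

    `pred` and `target` are integer arrays of the same shape holding
    class ids; classes absent from both maps are skipped so they do
    not distort the mean.
    """
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class not present in either map
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))
```

Averaging per class (rather than per pixel) keeps rare classes, such as pedestrians in road scenes, from being drowned out by large background regions.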

Code

The contributions done for these weeks are:

  • models/unet.py : U-Net Model.
  • models/segnet.py : SegNet Model.
  • predictions.py : Script that generates the mask predictions.

Results

Results of the different experiments.

How to run the code

See this README to know how to run the code and run the experiments.

Goals

  1. FCN8 model.
  2. Summary of references:
  • Summary of FCN and SegNet.
  3. Implementing a new Neural Network:
  • Integrate the new model (SegNet) into the framework.
  • Integrate the new model (U-Net) into the framework.
  4. Train the networks on a different dataset:
  • Evaluate FCN8 on Cityscapes and KITTI.
  5. Boost performance:
  • Comparison of batch sizes.
  • Comparison of learning rates.
  • Comparison of optimizers.
  • Batch normalization.
  • Data augmentation for U-Net.
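The comparisons of batch sizes, learning rates and optimizers listed above amount to sweeping a small hyperparameter grid. A hedged sketch of how such configurations could be enumerated (build_configs is a hypothetical helper, not part of the framework, and the values shown are illustrative):

```python
from itertools import product

def build_configs(batch_sizes=(4, 8), learning_rates=(1e-3, 1e-4),
                  optimizers=("adam", "sgd")):
    """Enumerate one config dict per (batch size, lr, optimizer) combination."""
    return [
        {"batch_size": b, "learning_rate": lr, "optimizer": opt}
        for b, lr, opt in product(batch_sizes, learning_rates, optimizers)
    ]
```

Each resulting dict would correspond to one experiment run with its own configuration file, so results stay directly comparable across the grid.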

References

Summaries

Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).

He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.

Huang, Gao, et al. "Deep networks with stochastic depth." European conference on computer vision. Springer, 2016.

Ren, Shaoqing, et al. "Faster r-cnn: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.

Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

Redmon, Joseph, and Ali Farhadi. "YOLO9000: better, faster, stronger." arXiv preprint arXiv:1612.08242 (2016).

Liu, Wei, et al. "SSD: Single shot multibox detector." European conference on computer vision. Springer, Cham, 2016.

Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional networks for semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.

Badrinarayanan, Vijay, Alex Kendall, and Roberto Cipolla. "SegNet: A deep convolutional encoder-decoder architecture for image segmentation." arXiv preprint arXiv:1511.00561 (2015).

About

Master in Computer Vision - M5 Visual recognition
