Datasets and code for reproducing results for IV2018 paper "Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation".

pmeletis/IV2018-hierarchical-semantic-segmentation-for-heterogeneous-datasets

Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation (IV 2018)

Code for reproducing results for IV2018 paper "Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation".

Panagiotis Meletis and Gijs Dubbelman (2018). Training of convolutional networks on multiple heterogeneous datasets for street scene semantic segmentation. The 29th IEEE Intelligent Vehicles Symposium (IV 2018), full paper on arXiv.

If you find our work useful for your research, please cite the following paper:

@inproceedings{heterogeneous2018,
  title={Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation},
  author={Panagiotis Meletis and Gijs Dubbelman},
  booktitle={2018 IEEE Intelligent Vehicles Symposium (IV)},
  year={2018}
}

Code usage

See here.

Paper summary

The discriminative power and generalization capabilities of convolutional networks are vital for deploying semantic segmentation systems in the wild. These properties can be obtained by training a single net on multiple datasets.

Combined training on multiple datasets is hampered by several factors, mainly:

  • different levels of detail in the labels (e.g. person label in dataset A vs pedestrian and rider labels in dataset B)
  • different annotation types (e.g. per-pixel annotations in dataset A vs bounding box annotations in dataset B)
  • class imbalances between datasets (e.g. class person has 10^3 annotated pixels in dataset A and 10^6 pixels in dataset B)
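The first challenge can be illustrated with a minimal sketch. It assumes a hypothetical two-level hierarchy in which `person` is the parent of `pedestrian` and `rider`, as in the example above. The `PARENT` table and the `supervision_targets` helper are illustrative names and are not taken from the repository code.

```python
# Hypothetical two-level label hierarchy (assumption for illustration only).
# Class names follow the person / pedestrian / rider example in the text.
PARENT = {"pedestrian": "person", "rider": "person"}


def supervision_targets(label):
    """Return (level-1 target, level-2 target) for a dataset label.

    A fine-grained label supervises both levels of the hierarchy.
    A coarse label supervises only the top level, so its level-2
    target is None and would be skipped when computing a loss.
    """
    if label in PARENT:
        return PARENT[label], label
    return label, None


# Dataset A only knows "person"; dataset B distinguishes its subclasses.
print(supervision_targets("person"))      # ('person', None)
print(supervision_targets("pedestrian"))  # ('person', 'pedestrian')
print(supervision_targets("rider"))       # ('person', 'rider')
```

Under this assumption, pixels from both datasets can contribute to the top-level decision, while only dataset B contributes to the pedestrian vs rider decision.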

To address these challenges, we propose constructing a hierarchy of classifiers. Hierarchical Semantic Segmentation is based on ResNet50. Its main novelty compared to other semantic segmentation systems is that a single model can handle a variety of different datasets with disjoint sets of semantic classes. Our system also runs in real time (18 fps at 512x1024 resolution). Figures 1-3 below provide sample results from 3 different datasets.
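As a rough sketch of how decisions from a hierarchy of classifiers might be combined for a single pixel, the snippet below walks from a root classifier down through subclassifiers. It is not the repository's implementation: the function names, the score dictionaries, and the traffic sign subclass names are all hypothetical and chosen only for illustration.

```python
def argmax(scores):
    """Return the class name with the highest score."""
    return max(scores, key=scores.get)


def hierarchical_decision(root_scores, sub_scores):
    """Combine per-level decisions for one pixel.

    root_scores: dict mapping level-1 class name -> score.
    sub_scores:  dict mapping a parent class name -> dict of child scores.
                 Only classes that have a subclassifier appear here.
    Returns the list of decisions from the root down to the finest level.
    """
    path = [argmax(root_scores)]
    # Descend as long as the current decision has its own subclassifier.
    while path[-1] in sub_scores:
        path.append(argmax(sub_scores[path[-1]]))
    return path


# Hypothetical scores for one pixel (all values and class names are made up).
root = {"road": 0.1, "person": 0.2, "traffic sign": 0.7}
subs = {
    "person": {"pedestrian": 0.6, "rider": 0.4},
    "traffic sign": {"speed limit": 0.8, "priority": 0.2},
    "speed limit": {"limit 50": 0.3, "limit 80": 0.7},
}
print(hierarchical_decision(root, subs))
# ['traffic sign', 'speed limit', 'limit 80']
```

In this sketch a pixel classified as `road` stops at the first level, while a traffic sign pixel receives decisions at three levels, which mirrors the L1-L3 decisions mentioned in the figure captions.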

Figure 1. Cityscapes validation split image examples - top: input images, center: predictions, bottom: ground truth. The network predictions include decisions from L1-L3 levels of the hierarchy. Note that the ground truth includes only one traffic sign superclass (yellow) and no road attribute markings.

Figure 2. Mapillary Vistas validation split image examples - top: input images, center: predictions, bottom: ground truth. The network predictions include decisions from L1-L3 levels of the hierarchy. Note that the ground truth does not include traffic sign subclasses.

Figure 3. TSDB test split image examples - top: input images, center: predictions, bottom: ground truth. The network predictions include decisions from L1-L3 levels of the hierarchy. Note that the ground truth includes only traffic sign bounding boxes, since the remaining pixels are unlabeled.
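To make the bounding-box ground truth concrete, the following sketch rasterizes boxes into a per-pixel label map in which every pixel outside a box is marked as unlabeled. This only illustrates the annotation format and is not how the repository processes the data: the `IGNORE` value of 255, the box layout `(x_min, y_min, x_max, y_max, class_id)`, and the function name are assumptions.

```python
IGNORE = 255  # assumed id for "unlabeled" pixels (hypothetical choice)


def boxes_to_label_map(height, width, boxes):
    """Rasterize bounding boxes into a per-pixel label map.

    boxes: list of (x_min, y_min, x_max, y_max, class_id) tuples, with
           x_max and y_max exclusive. Pixels outside every box keep the
           IGNORE id, so they would not contribute to a training loss.
    """
    label_map = [[IGNORE] * width for _ in range(height)]
    for x0, y0, x1, y1, class_id in boxes:
        # Clip each box to the image bounds before filling it.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                label_map[y][x] = class_id
    return label_map


# One 2x2 box with a hypothetical class id 7 inside a 4x6 image.
for row in boxes_to_label_map(4, 6, [(1, 1, 3, 3, 7)]):
    print(row)
```

Note that a box labels every pixel it covers, including background pixels near the sign boundary, so such labels are weaker than per-pixel annotations.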
