# *Task* - Domain Adaptation for Semantic Image Segmentation

## Summary

Develop a methodology which optimizes a neural network for semantic image segmentation trained on public datasets with regard to its predictive performance on data affected by domain shift.

- [Background and Motivation](#background-and-motivation)
- [Task](#task)
- [Required Tools and Data](#required-tools-and-data)
- [Hints](#hints)

## Background and Motivation

The main purpose of supervised learning is to repeatedly feed a training algorithm with input-label pairs (samples) such that the optimized model not only performs well on the training data, but also generalizes well to unseen data (validation/test data). Generalizing to unseen data that is similar to the training data is typically easier than generalizing to unseen and very dissimilar data. For example, a model trained on image data from Aachen on a sunny day will most likely perform better on other images of Aachen on a sunny day than on images of New York on a rainy day. Ideally, one would always be able to train models on data similar to the target domain in which the model is supposed to be used.

The collection and labeling of suitable datasets for a particular supervised learning task, however, is usually associated with a lot of (manual) effort. In the case of image segmentation, creating a new dataset requires the oftentimes manual annotation of every single image in the dataset by associating every pixel with a particular semantic class. One example of an input image for semantic image segmentation and a corresponding semantic segmentation label is shown below.

![](./assets/semantic-image-segmentation.png)

Considering the labeling effort, it is desirable to make the most of publicly available datasets. Careful application of data augmentation, domain adaptation, and hyperparameter tuning techniques has the potential to improve the generalization capabilities of trained models, allowing them to perform better on new domains (e.g., a different sensor setup or a different environment).

## Task

The task is to develop a methodology which optimizes a neural network for semantic image segmentation trained on public datasets with regard to its predictive performance on data affected by domain shift.

### Subtasks

> ***Note:*** *The subtasks listed below do not have to be followed strictly. They serve the purpose of guiding you along your own research for this topic.*

1. Search for and choose **two** publicly available datasets for semantic image segmentation related to driving (e.g. *Cityscapes*).
1. Research augmentation and domain adaptation techniques for semantic image segmentation, such as ...
   - image flipping
   - modification of image brightness, contrast, etc.
   - merging of classes (e.g., if training dataset distinguishes between cars and buses, but target domain only cares about vehicles in general)
   - use of some (labeled) target domain samples in training dataset, possibly given extra weight as compared to standard samples
   - ...
1. Research training techniques to improve generalization, such as ...
   - dropout
   - L1/L2 regularization
   - ...
1. Implement a TensorFlow data pipeline, possibly including online data augmentation.
1. Implement a TensorFlow model for semantic image segmentation.
1. Train models on both selected public datasets and evaluate their performance on each other (suggested metric: *Mean IoU*).
   1. Train on dataset 1 training data; evaluate on dataset 2 validation data and on ika's validation data.
   1. Train on dataset 2 training data; evaluate on dataset 1 validation data and on ika's validation data.
   1. Train on the combined training data of datasets 1 and 2; evaluate on dataset 1 validation data, dataset 2 validation data, and ika's validation data.
1. Iterate on the trainings with different augmentation, domain adaptation, generalization techniques, and other hyperparameters in order to optimize generalization capabilities of the trained models, especially with regard to ika's validation dataset.
1. Document your research, developed approach, and evaluations in a Jupyter notebook report. Explain and reproduce individual parts of your implemented functions with exemplary data.
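
Some of the techniques named in the subtasks above can be sketched independently of any particular framework. The block below is a minimal NumPy sketch, not a reference implementation: the function names (`merge_classes`, `flip_pair`, `mean_iou`), the class IDs, and the ignore ID `255` are assumptions made for illustration. It shows class merging via a lookup table, a horizontal flip applied jointly to image and label, and Mean IoU computed from a confusion matrix.

```python
import numpy as np


def merge_classes(label, class_map, ignore_id=255):
    """Remap source class IDs to (coarser) target class IDs via a lookup table."""
    lut = np.full(256, ignore_id, dtype=np.uint8)  # unmapped IDs become ignore_id
    for source_id, target_id in class_map.items():
        lut[source_id] = target_id
    return lut[label]


def flip_pair(image, label):
    """Flip image and label horizontally together so that pixels stay aligned."""
    return image[:, ::-1], label[:, ::-1]


def mean_iou(y_true, y_pred, num_classes, ignore_id=255):
    """Mean IoU over all classes that occur in the labels or the predictions.

    Assumes predictions only contain IDs in [0, num_classes).
    """
    valid = y_true != ignore_id
    y_true = y_true[valid].astype(np.int64)
    y_pred = y_pred[valid].astype(np.int64)
    # Confusion matrix: rows are true classes, columns are predicted classes.
    confusion = np.bincount(
        num_classes * y_true + y_pred, minlength=num_classes**2
    ).reshape(num_classes, num_classes)
    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    present = union > 0
    return float(np.mean(intersection[present] / union[present]))


# Hypothetical source IDs: 0 = road, 1 = car, 2 = bus
label = np.array([[0, 1, 2],
                  [2, 1, 0]], dtype=np.uint8)
merged = merge_classes(label, {0: 0, 1: 1, 2: 1})  # car and bus become "vehicle"

image = np.arange(18).reshape(2, 3, 3)  # dummy 2x3 RGB image
flipped_image, flipped_label = flip_pair(image, merged)

prediction = np.array([[0, 1, 0],
                       [1, 1, 0]], dtype=np.uint8)
miou = mean_iou(merged, prediction, num_classes=2)
print(merged, flipped_label, miou, sep="\n")
```

Geometric augmentations such as flipping have to be applied to image and label alike, whereas photometric changes (brightness, contrast) only affect the image. In a TensorFlow pipeline, the same ideas would be expressed with TensorFlow operations inside the dataset mapping function.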

## Required Tools and Data

### Tools

- TensorFlow
- Image Segmentation Training Pipeline & Model *(see [ACDC Exercise: Semantic Image Segmentation](https://github.com/ika-rwth-aachen/acdc-notebooks/blob/main/section_2_sensor_data_processing/1_semantic_image_segmentation.ipynb))*

### Data

- [ika's validation dataset](data/ika-dataset/)
- *(to be found)* two publicly available datasets for semantic image segmentation related to driving

## Hints

### Relevant ACDC Sections

- **Sensor Data Processing Algorithms**
  - Image Segmentation

### TFDS Datasets

[TensorFlow Datasets](https://www.tensorflow.org/datasets) (TFDS) is a collection of datasets that are ready to use with TensorFlow. It may already contain datasets suitable for semantic image segmentation and thus save you from having to parse data files from disk. All TFDS datasets are exposed as [`tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) objects, ready to be passed to [`model.fit`](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit).

The example below shows how easy it is to load a TFDS dataset, in this case the famous [MNIST dataset](https://www.tensorflow.org/datasets/catalog/mnist). Note that this dataset is not related to semantic image segmentation.

```python
# install required Python packages via pip

import sys
!{sys.executable} -m pip install \
    tensorflow==2.10.0 \
    tensorflow-datasets==4.4.0
```



```python
# load MNIST

import tensorflow_datasets as tfds

ds = tfds.load("mnist")
print("The TFDS MNIST dataset contains two different splits of data:")
print(ds)
print("The dataset has two features, 'image' and 'label'. Here is one sample:")
tfds.as_dataframe(ds["train"].take(1))
```

```
The TFDS MNIST dataset contains two different splits of data:
{'test': <PrefetchDataset element_spec={'image': TensorSpec(shape=(28, 28, 1), dtype=tf.uint8, name=None), 'label': TensorSpec(shape=(), dtype=tf.int64, name=None)}>, 'train': <PrefetchDataset element_spec={'image': TensorSpec(shape=(28, 28, 1), dtype=tf.uint8, name=None), 'label': TensorSpec(shape=(), dtype=tf.int64, name=None)}>}
The dataset has two features, 'image' and 'label'. Here is one sample:
```


| | image | label |
|---|---|---|
| 0 | 28×28×1 array of `uint8` pixel values in the range 0 to 255 (omitted here) | 4 |
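
As the sample above shows, a dataset loaded this way yields feature dictionaries, whereas `model.fit` expects `(input, target)` tuples. One option is to pass `as_supervised=True` to `tfds.load` for datasets that support it; another is to map the dictionaries yourself. The following is a minimal sketch of the second option. To stay self-contained, it uses random data as a stand-in for a real segmentation dataset; the feature names `"image"` and `"label"`, the tensor shapes, and the helper `to_training_pair` are assumptions and will differ between datasets.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for a segmentation dataset with dictionary features
# (assumed names "image" and "label"; real datasets may use other names).
samples = {
    "image": np.random.randint(0, 256, size=(4, 8, 8, 3), dtype=np.uint8),
    "label": np.random.randint(0, 3, size=(4, 8, 8, 1), dtype=np.uint8),
}
ds = tf.data.Dataset.from_tensor_slices(samples)


def to_training_pair(sample):
    """Convert a feature dictionary into an (input, target) tuple."""
    image = tf.cast(sample["image"], tf.float32) / 255.0  # scale to [0, 1]
    label = tf.cast(sample["label"], tf.int32)
    return image, label


train_ds = (
    ds.map(to_training_pair, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(4)
    .batch(2)
    .prefetch(tf.data.AUTOTUNE)
)

images, labels = next(iter(train_ds))
print(images.shape, labels.shape)
```

Online data augmentation would typically be added as a further `map` step before batching, so that each epoch sees differently augmented samples.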
