# *Task* - Panoptic Image Segmentation

## Summary

Develop a methodology to perform panoptic image segmentation and evaluate its performance on semantic segmentation, instance segmentation, and panoptic segmentation tasks.

- [Background and Motivation](#background-and-motivation)
- [Task](#task)
- [Required Tools and Data](#required-tools-and-data)
- [Hints](#hints)

## Background and Motivation

*Semantic segmentation* in images is a popular visual perception task for automated driving applications. By assigning a semantic class label to each pixel in an image, semantic segmentation provides a semantic understanding of the vehicle's current environment. While it is especially helpful for detecting non-object regions like the drivable road space, it cannot differentiate between multiple instances of the same class, e.g., multiple vehicles. The detection and segmentation of individual objects is instead known as *instance segmentation*, where the algorithm assigns all pixels of an object instance a label that is unique to that instance. Going one step further, *panoptic segmentation* is the combination of *semantic segmentation* and *instance segmentation*: it assigns a semantic label to every pixel and, in addition, an instance label to every pixel that belongs to an object instance. This yields a more complete and informative representation of the visual scene than either semantic or instance segmentation alone, which can help improve the performance of downstream tasks such as object tracking.

![](./assets/semantic-instance-panoptic-segmentation.png)
[*Image Source*](https://www.mdpi.com/2079-9292/11/12/1884#)
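
To make the relation between the three representations concrete, the following minimal sketch merges a semantic map and an instance map into a single panoptic ID map. It assumes an encoding of the form `class_id * label_divisor + instance_id`, which resembles conventions used by some driving datasets but is not prescribed by this task. The names `to_panoptic`, `from_panoptic`, and `LABEL_DIVISOR`, as well as the toy class IDs, are hypothetical.

```python
import numpy as np

# Assumption: panoptic_id = class_id * LABEL_DIVISOR + instance_id.
# The divisor must exceed the maximum number of instances per class.
LABEL_DIVISOR = 1000


def to_panoptic(semantic, instance, thing_classes, label_divisor=LABEL_DIVISOR):
    """Merge a semantic map and an instance map into one panoptic ID map."""
    semantic = np.asarray(semantic, dtype=np.int32)
    instance = np.asarray(instance, dtype=np.int32)
    is_thing = np.isin(semantic, list(thing_classes))
    # Non-object ("stuff") pixels get instance ID 0, object ("thing") pixels keep theirs.
    return semantic * label_divisor + np.where(is_thing, instance, 0)


def from_panoptic(panoptic, label_divisor=LABEL_DIVISOR):
    """Split a panoptic ID map back into its semantic and instance maps."""
    return panoptic // label_divisor, panoptic % label_divisor


# Toy 2x4 image: class 0 = road (no instances), class 1 = car (two instances)
semantic = np.array([[0, 0, 1, 1],
                     [0, 1, 1, 1]])
instance = np.array([[0, 0, 1, 2],
                     [0, 1, 1, 2]])

panoptic = to_panoptic(semantic, instance, thing_classes={1})
print(panoptic)
# [[   0    0 1001 1002]
#  [   0 1001 1001 1002]]
```

The semantic map alone cannot separate the two cars, and the instance map alone says nothing about the road; the panoptic map carries both pieces of information per pixel.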

## Task

The task is to develop a methodology to perform panoptic image segmentation and evaluate its performance on semantic segmentation, instance segmentation, and panoptic segmentation tasks.

### Subtasks

> ***Note:*** *The subtasks listed below do not have to be followed strictly. They serve the purpose of guiding you along your own research for this topic.*

1. Search for publicly available datasets for panoptic image segmentation related to driving.
2. Research existing panoptic image segmentation methodologies.
3. Implement a TensorFlow (or PyTorch) data pipeline for loading and pre-processing a chosen panoptic image segmentation dataset.
4. Implement a TensorFlow (or PyTorch) model for panoptic image segmentation.
5. Train a model for panoptic image segmentation on the chosen dataset.
6. Iterate on the training with different augmentation, generalization techniques, and/or other hyperparameters in order to optimize generalization capabilities of the trained model.
7. Qualitatively and quantitatively evaluate the trained model on the test split of the chosen dataset.
   - semantic segmentation performance
   - instance segmentation performance
   - panoptic segmentation performance
8. Qualitatively and quantitatively evaluate the trained model on ika's validation dataset (quantitative evaluation only for the semantic segmentation task).
9. Document your research, developed approach, and evaluations in a Jupyter notebook report. Explain and reproduce individual parts of your implemented functions with exemplary data.
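
For the quantitative panoptic evaluation in subtask 7, a commonly used metric is *Panoptic Quality* (PQ): per class, the summed IoU of matched segment pairs divided by `TP + 0.5 * FP + 0.5 * FN`, where a predicted and a ground-truth segment of the same class match if their IoU exceeds 0.5. The sketch below is a simplified single-image version under stated assumptions: both maps use the hypothetical `class_id * label_divisor + instance_id` encoding, and void or crowd regions, which benchmark implementations usually treat specially, are ignored. The function name `panoptic_quality` and the toy data are illustrative only.

```python
from collections import defaultdict

import numpy as np


def panoptic_quality(pred, gt, label_divisor=1000, iou_threshold=0.5):
    """Return (mean PQ over classes, per-class PQ) for one image.

    Assumes both maps encode segments as class_id * label_divisor + instance_id.
    """
    pred, gt = np.asarray(pred), np.asarray(gt)
    iou_sum = defaultdict(float)
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    matched_pred = set()

    pred_ids = [int(p) for p in np.unique(pred)]
    for g in (int(g) for g in np.unique(gt)):
        cls = g // label_divisor
        gt_mask = gt == g
        match = None
        for p in pred_ids:
            if p // label_divisor != cls:
                continue  # only segments of the same class can match
            pred_mask = pred == p
            intersection = np.logical_and(gt_mask, pred_mask).sum()
            union = np.logical_or(gt_mask, pred_mask).sum()
            iou = float(intersection) / float(union)
            if iou > iou_threshold:  # IoU > 0.5 makes the match unique
                match = (p, iou)
                break
        if match is None:
            fn[cls] += 1  # ground-truth segment without a match
        else:
            matched_pred.add(match[0])
            tp[cls] += 1
            iou_sum[cls] += match[1]

    for p in pred_ids:
        if p not in matched_pred:
            fp[p // label_divisor] += 1  # predicted segment without a match

    classes = set(tp) | set(fp) | set(fn)
    per_class = {
        c: iou_sum[c] / (tp[c] + 0.5 * fp[c] + 0.5 * fn[c]) for c in classes
    }
    return float(np.mean(list(per_class.values()))), per_class


# Toy example: class 0 = road, class 1 = car with instances 1001 and 1002
gt = np.array([[0, 0, 1001, 1001],
               [0, 0, 1002, 1002]])
pred = np.array([[0, 0, 1001, 1001],
                 [0, 0, 0, 1002]])

pq, per_class = panoptic_quality(pred, gt)
print(round(pq, 2))  # 0.65
```

In the toy example, the road segment matches with an IoU of 0.8, the first car matches perfectly, and the second car reaches an IoU of exactly 0.5, which is not above the threshold and therefore counts as one false negative plus one false positive.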

## Required Tools and Data

### Tools

- TensorFlow/PyTorch
- *(optional)* Image Segmentation Training Pipeline & Model *(see [ACDC Exercise: Semantic Image Segmentation](https://github.com/ika-rwth-aachen/acdc-notebooks/blob/main/section_2_sensor_data_processing/1_semantic_image_segmentation.ipynb))*

### Data

- [ika's validation dataset](data/ika-dataset/)
- *(to be found)* publicly available datasets for panoptic image segmentation related to driving

## Hints

### Relevant ACDC Sections

- **Sensor Data Processing Algorithms**
  - Image Segmentation

### TFDS Datasets

[TensorFlow Datasets](https://www.tensorflow.org/datasets) (TFDS) is a collection of datasets that are ready to use with TensorFlow. It may already contain datasets of interest for panoptic image segmentation and thus save you from having to parse data files from disk. All TFDS datasets are exposed as [`tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) objects, ready to be passed to [`model.fit`](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit).

The example below shows how easy it is to load a TFDS dataset, in this case the famous [MNIST dataset](https://www.tensorflow.org/datasets/catalog/mnist). Note that this dataset is not related to semantic image segmentation.

```python
# install required Python packages via pip (run inside a Jupyter notebook)

import sys
!{sys.executable} -m pip install \
    tensorflow==2.10.0 \
    tensorflow-datasets==4.4.0
```


```python
# load MNIST

import tensorflow_datasets as tfds

ds = tfds.load("mnist")
print("The TFDS MNIST dataset contains two different splits of data:")
print(ds)
print("The dataset has two features, 'image' and 'label'. Here is one sample:")
tfds.as_dataframe(ds["train"].take(1))
```

```
The TFDS MNIST dataset contains two different splits of data:
{'test': <PrefetchDataset element_spec={'image': TensorSpec(shape=(28, 28, 1), dtype=tf.uint8, name=None), 'label': TensorSpec(shape=(), dtype=tf.int64, name=None)}>, 'train': <PrefetchDataset element_spec={'image': TensorSpec(shape=(28, 28, 1), dtype=tf.uint8, name=None), 'label': TensorSpec(shape=(), dtype=tf.int64, name=None)}>}
The dataset has two features, 'image' and 'label'. Here is one sample:
```


|   | image | label |
|---|-------|-------|
| 0 | `uint8` array of shape (28, 28, 1) showing a handwritten digit (pixel values omitted) | 4 |
