Semantic segmentation has become an important component of self-driving vehicles. It allows the car to understand its surroundings by classifying every pixel of the input image.
To run inference with the pre-trained models, use:

```python
from segmentor import Segmentor

seg = Segmentor()
classes_output, img_viz = seg.semantic_segmentation(image=image, visualization=True)
```
classes_output is the pixel-wise classification result across all categories. img_viz is an RGB visualization image generated from the segmentation result.
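For illustration, here is how a dense per-class score map could be reduced to one label per pixel. Note that the shape `(H, W, num_classes)` and the argmax step are my assumptions about the format of `classes_output`, not taken from the library:

```python
import numpy as np

# Hypothetical per-pixel class scores: height 4, width 6, 19 Cityscapes classes.
scores = np.random.rand(4, 6, 19)

# Reduce to a single class index per pixel by taking the highest-scoring class.
label_map = np.argmax(scores, axis=-1)

print(label_map.shape)  # (4, 6)
```

A label map like this is what a color palette can then turn into a visualization image.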
The best way to run an actual test is with test.py. Specify the input image by changing the image path in the script. The pre-trained weights are included with the project.
The Cityscapes Dataset
To train the model, please download the Cityscapes dataset, which can be found here.
Remember to preprocess the data using this Jupyter notebook:
Data Preprocessing.ipynb. The notebook will generate the label files (train_labels.csv and val_labels.csv) used during training.
My data is organized as follows:

```
Cityscape
│   train_labels.csv
│   val_labels.csv
└───training
│   └───aachen
│   └───augsburg
│   ...
└───training_gt
│   └───aachen
│   └───augsburg
│   ...
└───val
│   └───frankfurt
│   └───lindau
└───val_gt
    └───frankfurt
    └───lindau
```
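As a rough sketch of the preprocessing step (the pairing logic and CSV columns here are my assumptions, not the actual contents of Data Preprocessing.ipynb), a labels CSV for the layout above could be built by pairing each image under training/ with the file at the same relative path under training_gt/:

```python
import csv
from pathlib import Path

def build_labels_csv(image_root, gt_root, out_csv):
    """Pair every image under image_root with the file of the same
    relative path under gt_root, writing the pairs to a CSV."""
    image_root, gt_root = Path(image_root), Path(gt_root)
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "label"])  # assumed column names
        for img in sorted(image_root.rglob("*.png")):
            gt = gt_root / img.relative_to(image_root)
            if gt.exists():
                writer.writerow([str(img), str(gt)])
```

For example, `build_labels_csv("Cityscape/training", "Cityscape/training_gt", "Cityscape/train_labels.csv")` would produce the training CSV. In the real Cityscapes dataset the ground-truth filenames carry a different suffix than the images, so the pairing rule would need adjusting accordingly.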
There are two training scripts:
- train.py is the ICNet training script.
- utils.py contains all the categories (classes); you can modify them based on your dataset.
Below is an overview of the different segmentation models in this project.
ICNet (Image Cascade Network) is a real-time semantic segmentation model developed by Zhao et al. at The Chinese University of Hong Kong. Their paper shows that ICNet can achieve ~70% mIoU on the Cityscapes dataset while running at ~30 FPS. After some testing, I found ICNet to be a great choice for self-driving applications. (I am currently using the network in my self-driving golf cart project.)
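The cascade idea behind ICNet is to run heavy computation only on downsampled copies of the image and fuse the results back at full resolution. The following is a toy illustration of that multi-resolution principle, not the actual ICNet architecture:

```python
import numpy as np

def downsample(img, factor):
    """Naive average pooling by an integer factor."""
    h, w = img.shape[:2]
    return img[:h - h % factor, :w - w % factor].reshape(
        h // factor, factor, w // factor, factor, -1).mean(axis=(1, 3))

def upsample(img, factor):
    """Nearest-neighbour upsampling by an integer factor."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

image = np.random.rand(64, 128, 3)

# Three-branch pyramid: full, half, and quarter resolution.
half = downsample(image, 2)      # (32, 64, 3)
quarter = downsample(image, 4)   # (16, 32, 3)

# A real network would process each branch with convolutions of different
# depths; here we just fuse the pyramid back to full resolution to show
# that the cascaded shapes line up.
fused = (image + upsample(half, 2) + upsample(quarter, 4)) / 3
print(fused.shape)  # (64, 128, 3)
```

In ICNet the low-resolution branch carries the deep network, which is what makes real-time inference possible.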
Here is a simple benchmark comparison between ICNet and other popular semantic segmentation models. These images and visualizations are from the original ICNet paper, which can be found here.
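For reference, the mIoU metric quoted in these benchmarks is computed per class as the intersection over union of predicted and ground-truth pixels, then averaged over classes. A minimal sketch:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union over all classes present in pred or gt."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([[0, 0], [1, 1]])
gt   = np.array([[0, 1], [1, 1]])
print(mean_iou(pred, gt, num_classes=2))  # (0.5 + 2/3) / 2 ≈ 0.583
```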
If you have questions, comments or concerns, please contact me at email@example.com.