Semantic Segmentation with MobileNets and PSP Module


Environment Details: The code has been tested with Ubuntu 16.04 LTS, Intel i5-3570, 4 Cores @ 3.40 GHz, NVIDIA GeForce GTX 1080

First, install NVIDIA drivers and check whether they are working with nvidia-smi.

  • CUDA 8.0.61

    • Install both the Base Installer and Patch 2, and decline whenever the installer asks to overwrite the NVIDIA drivers.
  • cuDNN v6.0 (for CUDA 8.0)

    • Download cudnn-8.0-linux-x64-v6.0.tgz and run the following (the copy commands may need root privileges):
      tar -xzvf cudnn-8.0-linux-x64-v6.0.tgz
      cp cuda/lib64/* /usr/local/cuda/lib64/
      cp cuda/include/cudnn.h /usr/local/cuda/include/
  • Tensorflow 1.3

    • To install TF 1.3
      pip install tensorflow-gpu==1.3
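
After copying the cuDNN files, it can help to confirm which version actually ended up in the CUDA include directory. The following is a minimal sketch, not part of the repository: it assumes the header was copied to /usr/local/cuda/include/cudnn.h as in the steps above, and it relies on the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros that cudnn.h defines in this cuDNN generation. The function names are hypothetical.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) parsed from the text of cudnn.h, or None."""
    found = {}
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Each macro appears as a line such as "#define CUDNN_MAJOR 6".
        match = re.search(r"^\s*#define\s+" + name + r"\s+(\d+)",
                          header_text, re.MULTILINE)
        if match is None:
            return None
        found[name] = int(match.group(1))
    return (found["CUDNN_MAJOR"], found["CUDNN_MINOR"], found["CUDNN_PATCHLEVEL"])


def installed_cudnn_version(header_path="/usr/local/cuda/include/cudnn.h"):
    """Read the header copied during installation; None if it cannot be read."""
    try:
        with open(header_path) as handle:
            return parse_cudnn_version(handle.read())
    except OSError:
        return None


if __name__ == "__main__":
    print("cuDNN version found:", installed_cudnn_version())
```

If the function returns None, the header is either missing or was copied to a different location than the one assumed here.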



  1. Initialize the model with the pretrained MobileNet weights. The remaining weights, those of the PSP module from PSPNet, are initialized from scratch (convolution weights with the Xavier initializer, biases set to 0).
python --pretrained_mobilenet=MobileNetPreTrained/model.ckpt-906808 --save_model=MobileNetPSP

The initialized weights have already been saved in MobileNetPSP, so this step can be skipped.
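
To make the initialization of the PSP module weights concrete, here is a minimal sketch of Xavier (Glorot) uniform initialization for one convolution layer, with zero biases. It is written in plain Python for illustration and is not the repository's TensorFlow code. The layer shape used in the example is arbitrary, and the uniform variant is an assumption (the text above only says "Xavier initializer").

```python
import math
import random


def xavier_uniform_limit(kernel_h, kernel_w, in_channels, out_channels):
    """Bound of the uniform range: sqrt(6 / (fan_in + fan_out))."""
    fan_in = kernel_h * kernel_w * in_channels
    fan_out = kernel_h * kernel_w * out_channels
    return math.sqrt(6.0 / (fan_in + fan_out))


def init_conv_layer(kernel_h, kernel_w, in_channels, out_channels, seed=0):
    """Return (weights, biases) as flat lists for one convolution layer."""
    rng = random.Random(seed)
    limit = xavier_uniform_limit(kernel_h, kernel_w, in_channels, out_channels)
    count = kernel_h * kernel_w * in_channels * out_channels
    # Convolution weights are drawn uniformly from [-limit, limit].
    weights = [rng.uniform(-limit, limit) for _ in range(count)]
    # Biases start at zero, as described in the step above.
    biases = [0.0] * out_channels
    return weights, biases


if __name__ == "__main__":
    # Illustrative shape only: a 1x1 convolution from 64 to 16 channels.
    w, b = init_conv_layer(1, 1, 64, 16)
    print(len(w), len(b), xavier_uniform_limit(1, 1, 64, 16))
```

The bound shrinks as the layer gets wider, which keeps the variance of activations roughly constant from layer to layer at the start of training.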

  2. Download the Cityscapes dataset and keep it in the directory structure shown below:
├── leftImg8bit
├── gtCoarse
└── gtFine

gtFine and gtCoarse should contain the *_gtFine_labelTrainIds.png files, which are the ground truth labels generated with cityscapesScripts.

Lists of image and ground truth paths are needed for training. This step has already been done, and the resulting lists are saved in the list directory.
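
For readers who need to regenerate such a list, the following is a minimal sketch of how image and label paths can be paired. It is not the repository's own script. It assumes the standard Cityscapes layout with split and city subfolders (for example leftImg8bit/train/aachen/) and the standard file suffixes. The output format, one "image_path label_path" pair per line, is an assumption and may differ from the lists shipped in the list directory.

```python
import os

IMAGE_SUFFIX = "_leftImg8bit.png"


def build_pairs(data_dir, split="train", label_dir="gtFine",
                label_suffix="_gtFine_labelTrainIds.png"):
    """Pair every image of a split with its trainId label, skipping unlabeled images."""
    image_root = os.path.join(data_dir, "leftImg8bit", split)
    label_root = os.path.join(data_dir, label_dir, split)
    pairs = []
    for city in sorted(os.listdir(image_root)):
        city_dir = os.path.join(image_root, city)
        if not os.path.isdir(city_dir):
            continue
        for name in sorted(os.listdir(city_dir)):
            if not name.endswith(IMAGE_SUFFIX):
                continue
            # "aachen_000000_000019_leftImg8bit.png" -> "aachen_000000_000019"
            stem = name[:-len(IMAGE_SUFFIX)]
            label_path = os.path.join(label_root, city, stem + label_suffix)
            if os.path.isfile(label_path):
                pairs.append((os.path.join(city_dir, name), label_path))
    return pairs


def write_list(pairs, list_path):
    """Write one 'image_path label_path' line per pair (assumed format)."""
    with open(list_path, "w") as handle:
        for image_path, label_path in pairs:
            handle.write(image_path + " " + label_path + "\n")
```

Images without a matching label file are skipped rather than reported, which suits a quick sanity check but would hide a missing label-generation step.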

  3. Train the model using
python --data_dir=PATH_TO_cityscapes-images_FOLDER --log_dir=logs/train1 --num_epochs=80 --gpu=0 --update_beta=True --update_mean_var=True

Training proceeds in three stages. First, train for a while with both --update_beta=True and --update_mean_var=True. Then set update_mean_var to False and continue training. Finally, train with --update_beta=False --update_mean_var=False for the remaining epochs. To stop training, change parameters and resume from a given epoch, pass the saved checkpoint with --pretrained_checkpoint together with --start_epoch. For example:

python --pretrained_checkpoint=logs/train1/checkpoints/model.ckpt --log_dir=logs/train1 --num_epochs=80 --start_epoch=81 --gpu=0 --update_mean_var=False --update_beta=True

Training Procedure on Fine Dataset:

python --log_dir=logs/train1 --gpu=0 --num_epochs=80 --optimizer=momentum --update_mean_var=True --update_beta=True
python --pretrained_checkpoint=logs/train1/checkpoints/model.ckpt-80 --log_dir=logs/train1 --gpu=0 --num_epochs=80 --start_epoch=81 --update_mean_var=False --update_beta=True
python --pretrained_checkpoint=logs/train1/checkpoints/model.ckpt-160 --log_dir=logs/train1 --gpu=0 --num_epochs=100 --start_epoch=161 --update_mean_var=False --update_beta=False
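
The bookkeeping behind these three commands (which checkpoint to resume from and which epoch to start at) can be sketched as follows. This is an illustrative helper, not part of the repository. It assumes, as the commands above suggest, that --num_epochs counts the epochs of the current stage and that the checkpoint written after epoch N is named model.ckpt-N. The training script name, --gpu and --optimizer are deliberately left out.

```python
def build_stages(log_dir, stage_specs):
    """Build the argument list for each training stage.

    stage_specs is a list of (num_epochs, update_mean_var, update_beta) tuples.
    """
    stages = []
    finished = 0  # epochs completed by all previous stages
    for num_epochs, update_mean_var, update_beta in stage_specs:
        args = ["--log_dir=" + log_dir, "--num_epochs=%d" % num_epochs]
        if finished > 0:
            # Resume from the checkpoint written at the end of the previous stage.
            args.append("--pretrained_checkpoint=%s/checkpoints/model.ckpt-%d"
                        % (log_dir, finished))
            args.append("--start_epoch=%d" % (finished + 1))
        args.append("--update_mean_var=%s" % update_mean_var)
        args.append("--update_beta=%s" % update_beta)
        stages.append(args)
        finished += num_epochs
    return stages


# The fine-dataset schedule listed above: 80, 80 and 100 epochs.
FINE_SCHEDULE = [(80, True, True), (80, False, True), (100, False, False)]

if __name__ == "__main__":
    for stage_args in build_stages("logs/train1", FINE_SCHEDULE):
        print(" ".join(stage_args))
```

Deriving the start epoch and checkpoint name from the running total avoids the off-by-one mistakes that are easy to make when editing the commands by hand.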

The training procedure is also provided as a script in the repository. Set DATASET_DIR in it before running.

chmod +x
Training on Coarse dataset

For training on the Coarse dataset, change --data_list=list/train_list.txt to point to the coarse training list.

Its training procedure is likewise provided in the repository.


Models can be evaluated by

python --checkpoint_path=logs/train1/model.ckpt

Evaluation scripts are also provided in the repository. One evaluates a single model after its variables DATASET_DIR, DATA_LIST, CHECKPOINT_FOLDER and EPOCH have been set. Another evaluates all models in the range START to END. A third evaluates the models listed in EPOCHS.
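
Selecting the checkpoints for a range of epochs or for an explicit list can be sketched as below. This is a hypothetical helper, not one of the repository's scripts. It assumes checkpoints are named model.ckpt-N as in the training commands, that END is inclusive, and that a checkpoint counts as present when its .index file exists (the usual layout of TensorFlow's checkpoint format).

```python
import os


def epoch_range(start, end):
    """Epochs from start to end, with end assumed to be inclusive."""
    return list(range(start, end + 1))


def checkpoint_paths(checkpoint_folder, epochs):
    """Checkpoint prefix for each requested epoch."""
    return [os.path.join(checkpoint_folder, "model.ckpt-%d" % epoch)
            for epoch in epochs]


def existing_checkpoints(checkpoint_folder, epochs):
    """Keep only the checkpoints whose .index file is present on disk."""
    return [path for path in checkpoint_paths(checkpoint_folder, epochs)
            if os.path.isfile(path + ".index")]


if __name__ == "__main__":
    # Range-style selection (START=78, END=80) and list-style selection.
    print(existing_checkpoints("logs/train1/checkpoints", epoch_range(78, 80)))
    print(existing_checkpoints("logs/train1/checkpoints", [80, 160, 260]))
```

Each returned prefix can then be passed as --checkpoint_path to the evaluation command.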


After training on the fine-annotated Cityscapes dataset, evaluation on the Cityscapes validation set gives 61% mIoU without flipping.
Inference Time: 52ms on GPU (TF 1.3), 3.34s on CPU (this is wrong!)
Trained Weights Size: 69MB
A few results are shown below.

Input Image Prediction Ground Truth
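
The mIoU metric reported above is the per-class intersection over union, TP / (TP + FP + FN), averaged over classes. The sketch below shows one way to compute it from flattened label and prediction lists in plain Python; it is not the repository's evaluation code. It assumes the Cityscapes trainId convention in which 255 marks ignored pixels, and it skips classes that never occur in either the labels or the predictions, which is a design choice of this sketch.

```python
def confusion_matrix(labels, predictions, num_classes, ignore_label=255):
    """Rows are ground truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for truth, predicted in zip(labels, predictions):
        if truth == ignore_label:
            continue  # ignored pixels do not count towards the score
        matrix[truth][predicted] += 1
    return matrix


def mean_iou(matrix):
    """Average of TP / (TP + FP + FN) over the classes that occur."""
    ious = []
    for cls in range(len(matrix)):
        true_pos = matrix[cls][cls]
        false_neg = sum(matrix[cls]) - true_pos
        false_pos = sum(row[cls] for row in matrix) - true_pos
        denominator = true_pos + false_pos + false_neg
        if denominator > 0:
            ious.append(true_pos / float(denominator))
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 255]
    predictions = [0, 1, 1, 1, 0]
    print(mean_iou(confusion_matrix(labels, predictions, num_classes=2)))
```

For a dataset-level score, the confusion matrix would be accumulated over all validation images before calling mean_iou, rather than averaging per-image scores.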


  • Train on Coarse Dataset and report results.
  • Add auxiliary loss.


  1. Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017 [arxiv]
  2. Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia. Pyramid Scene Parsing Network, 2017 [arxiv]

