by Panqu Wang, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou, and Garrison Cottrell.


This repository contains the code for Understanding Convolution for Semantic Segmentation (WACV 2018), which achieved state-of-the-art results on the CityScapes, PASCAL VOC 2012, and KITTI Road benchmarks.


We tested our code on:

Ubuntu 16.04, Python 2.7 with

MXNet (0.11.0), numpy (1.13.1), cv2 (3.2.0), PIL (4.2.1), and cython (0.25.2)
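
To compare a local environment against the versions listed above, a small check along these lines may help. It is only a sketch: the import names in `TESTED` (for example `PIL` and `Cython`) are assumptions about how the packages are exposed, and `installed_version` and `report` are hypothetical helpers, not part of this repository.

```python
import importlib

# Versions listed in this README. The module names used as keys are
# assumptions about how each package is imported.
TESTED = {
    "mxnet": "0.11.0",
    "numpy": "1.13.1",
    "cv2": "3.2.0",
    "PIL": "4.2.1",
    "Cython": "0.25.2",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Some modules do not expose __version__; report that instead of failing.
    return getattr(module, "__version__", "unknown")


def report(tested=TESTED):
    """Map each module name to a (tested version, installed version) pair."""
    return {name: (version, installed_version(name))
            for name, version in tested.items()}


if __name__ == "__main__":
    for name, (want, have) in sorted(report().items()):
        status = "ok" if have == want else "differs"
        print("%-8s tested=%-8s installed=%-10s %s" % (name, want, have, status))
```

A differing version is not necessarily a problem; the list above only records what the code was tested with.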


  1. Clone the repository:

    git clone
    python develop --user
  2. Download the pretrained model from Google Drive.

  3. Build MXNet (only tested on the TuSimple version):

    git clone --recursive
    vim make/ (set USE_CUDA = 1, point USE_CUDA_PATH at your CUDA installation, and set USE_CUDNN = 1 to enable GPU usage)
    make -j
    cd python
    python develop --user

    For more MXNet tutorials, please refer to the official documentation.

  4. Training:

    cd train
    python ../configs/train/train_cityscapes.cfg

    The paths/dirs in the .cfg file need to be specified by the user.

  5. Testing:

    cd test
    python ../configs/test/test_full_image.cfg

    The paths/dirs in the .cfg file need to be specified by the user.

  6. Results:

    Modify the result_dir path in the config file to save the label map and visualizations. The expected scores are:

    ('ss' denotes single-scale testing and 'ms' denotes multi-scale testing)

    • ResNet101-DUC-HDC on CityScapes test set (mIoU): 79.1(ss) / 80.1(ms)
    • ResNet152-DUC on VOC2012 (mIoU): 83.1(ss)
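
The scores above are reported as mIoU (mean intersection over union). As a reminder of what that number measures, here is a minimal sketch of the usual definition, computed from a confusion matrix. It is not this repository's evaluation code, and the official benchmark servers may differ in details such as ignored labels; `mean_iou` is a hypothetical helper name.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] counts pixels whose ground-truth class is i and whose
    predicted class is j. For each class, IoU = TP / (TP + FP + FN).
    """
    ious = []
    for c in range(len(confusion)):
        true_pos = confusion[c][c]
        gt_total = sum(confusion[c])                     # TP + FN
        pred_total = sum(row[c] for row in confusion)    # TP + FP
        union = gt_total + pred_total - true_pos
        if union > 0:  # skip classes absent from both ground truth and prediction
            ious.append(float(true_pos) / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Two classes: IoU is 8/11 for class 0 and 9/12 for class 1.
    print(mean_iou([[8, 2], [1, 9]]))
```

Reported scores such as 79.1 correspond to this value expressed as a percentage.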

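For step 3, the three build settings can be checked before running make. The sketch below parses Makefile-style `KEY = value` text; it assumes the build config uses that form and that an unset `USE_CUDA_PATH` is written as `NONE`. `parse_make_flags` and `missing_gpu_settings` are hypothetical helpers, not part of MXNet or this repository.

```python
import re

# Settings named in step 3 that must be enabled for a GPU build.
REQUIRED_FLAGS = {"USE_CUDA": "1", "USE_CUDNN": "1"}


def parse_make_flags(text):
    """Parse 'KEY = value' lines from Makefile-style text, ignoring comments."""
    flags = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"^(\w+)\s*=\s*(.*)$", line)
        if match:
            flags[match.group(1)] = match.group(2).strip()
    return flags


def missing_gpu_settings(text):
    """Return a list of settings that still need to be changed."""
    flags = parse_make_flags(text)
    problems = [
        "%s should be %s" % (key, want)
        for key, want in sorted(REQUIRED_FLAGS.items())
        if flags.get(key) != want
    ]
    # Assumption: an unset CUDA path is written as NONE or left empty.
    if flags.get("USE_CUDA_PATH", "NONE") in ("", "NONE"):
        problems.append("USE_CUDA_PATH should point to the CUDA installation")
    return problems


if __name__ == "__main__":
    sample = (
        "USE_CUDA = 1\n"
        "USE_CUDA_PATH = /usr/local/cuda  # assumed install location\n"
        "USE_CUDNN = 0\n"
    )
    print(missing_gpu_settings(sample))  # ['USE_CUDNN should be 1']
```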

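Because steps 4 and 5 both depend on user-specified paths in the .cfg files, it can save time to check those paths before launching a long run. The following is a rough sketch, assuming the files are INI-style and readable with the standard config parser, and that path options have names ending in `dir` or `path` (as `result_dir` does). `missing_paths` is a hypothetical helper, not part of this repository.

```python
import os

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7


def missing_paths(cfg_file, suffixes=("dir", "path")):
    """List (section, option, value) entries whose path does not exist.

    Only options whose names end with one of `suffixes` are checked; that
    naming convention is an assumption, so adjust it to the actual file.
    """
    parser = configparser.RawConfigParser()
    if not parser.read(cfg_file):
        raise IOError("could not read %s" % cfg_file)
    missing = []
    for section in parser.sections():
        for option, value in parser.items(section):
            if option.endswith(suffixes) and not os.path.exists(value):
                missing.append((section, option, value))
    return missing


if __name__ == "__main__":
    import sys
    for section, option, value in missing_paths(sys.argv[1]):
        print("[%s] %s = %s does not exist" % (section, option, value))
```

Output directories that the scripts create themselves would also be reported here, so treat the result as a list of things to review rather than a list of errors.
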
If you find this repository useful for your research, please consider citing:

  title={Understanding convolution for semantic segmentation},
  author={Wang, Panqu and Chen, Pengfei and Yuan, Ye and Liu, Ding and Huang, Zehua and Hou, Xiaodi and Cottrell, Garrison},
  journal={arXiv preprint arXiv:1702.08502},


Please contact or .