Semantic Segmentation Architectures Implemented in PyTorch
Clone or download


license pypi DOI

Semantic Segmentation Algorithms Implemented in PyTorch

This repository aims at mirroring popular semantic segmentation architectures in PyTorch.

Networks implemented

  • PSPNet - With support for loading pretrained models w/o caffe dependency
  • ICNet - With optional batchnorm and pretrained models
  • FRRN - Model A and B
  • FCN - All 1 (FCN32s), 2 (FCN16s) and 3 (FCN8s) stream variants
  • U-Net - With optional deconvolution and batchnorm
  • Link-Net - With multiple resnet backends
  • Segnet - With Unpooling using Maxpool indices


DataLoaders implemented


  • pytorch >=0.4.0
  • torchvision ==0.2.0
  • scipy
  • tqdm
  • tensorboardX

One-line installation

pip install -r requirements.txt


  • Download data for desired dataset(s) from list of URLs here.
  • Extract the zip / tar and modify the path appropriately in your config.yaml


Setup config file

# Model Configuration
    arch: <name> [options: 'fcn[8,16,32]s, unet, segnet, pspnet, icnet, icnetBN, linknet, frrn[A,B]'

# Data Configuration
    dataset: <name> [options: 'pascal, camvid, ade20k, mit_sceneparsing_benchmark, cityscapes, nyuv2, sunrgbd, vistas'] 
    train_split: <split_to_train_on>
    val_split: <spit_to_validate_on>
    img_rows: 512
    img_cols: 1024
    path: <path/to/data>

# Training Configuration
    n_workers: 64
    train_iters: 35000
    batch_size: 16
    val_interval: 500
    print_interval: 25
        name: <loss_type> [options: 'cross_entropy, bootstrapped_cross_entropy, multi_scale_crossentropy']

    # Optmizer Configuration
        name: <optimizer_name> [options: 'sgd, adam, adamax, asgd, adadelta, adagrad, rmsprop']
        lr: 1.0e-3

        # Warmup LR Configuration
        warmup_iters: <iters for lr warmup>
        mode: <'constant' or 'linear' for warmup'>
        gamma: <gamma for warm up>
    # Augmentations Configuration
        gamma: x                                     #[gamma varied in 1 to 1+x]
        hue: x                                       #[hue varied in -x to x]
        brightness: x                                #[brightness varied in 1-x to 1+x]
        saturation: x                                #[saturation varied in 1-x to 1+x]
        contrast: x                                  #[contrast varied in 1-x to 1+x]
        rcrop: [h, w]                                #[crop of size (h,w)]
        translate: [dh, dw]                          #[reflective translation by (dh, dw)]
        rotate: d                                    #[rotate -d to d degrees]
        scale: [h,w]                                 #[scale to size (h,w)]
        ccrop: [h,w]                                 #[center crop of (h,w)]
        hflip: p                                     #[flip horizontally with chance p]
        vflip: p                                     #[flip vertically with chance p]

    # LR Schedule Configuration
        name: <schedule_type> [options: 'constant_lr, poly_lr, multi_step, cosine_annealing, exp_lr']

    # Resume from checkpoint  
    resume: <path_to_checkpoint>

To train the model :

python [-h] [--config [CONFIG]] 

--config                Configuration file to use

To validate the model :

usage: [-h] [--config [CONFIG]] [--model_path [MODEL_PATH]]
                       [--eval_flip] [--measure_time]

  --config              Config file to be used
  --model_path          Path to the saved model
  --eval_flip           Enable evaluation with flipped image | True by default
  --measure_time        Enable evaluation with time (fps) measurement | True
                        by default

To test the model w.r.t. a dataset on custom images(s):

python [-h] [--model_path [MODEL_PATH]] [--dataset [DATASET]]
               [--dcrf [DCRF]] [--img_path [IMG_PATH]] [--out_path [OUT_PATH]]
  --model_path          Path to the saved model
  --dataset             Dataset to use ['pascal, camvid, ade20k etc']
  --dcrf                Enable DenseCRF based post-processing
  --img_path            Path of the input image
  --out_path            Path of the output segmap

If you find this code useful in your research, please consider citing:

    Author = {Meet P Shah},
    Title = {Semantic Segmentation Architectures Implemented in PyTorch.},
    Journal = {},
    Year = {2017}