
EMANet

A PyTorch implementation of EMANet based on ICCV 2019 paper Expectation-Maximization Attention Networks for Semantic Segmentation.

Requirements

  • pytorch & torchvision
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
  • opencv
pip install opencv-python
  • tensorboard
pip install tensorboard
  • pycocotools
pip install git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI
  • fvcore
pip install git+https://github.com/facebookresearch/fvcore
  • panopticapi
pip install git+https://github.com/cocodataset/panopticapi.git
  • cityscapesScripts
pip install git+https://github.com/mcordts/cityscapesScripts.git
  • detectron2
pip install git+https://github.com/facebookresearch/detectron2.git@master
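Whether all of the above are importable can be checked with a short standard-library script. This is a minimal sketch; note that some import names differ from the pip package names (e.g. opencv-python imports as cv2), and the exact names below are assumptions:

```python
import importlib.util

# Import names (not pip names) of the packages listed above; these
# spellings are assumptions, e.g. opencv-python installs as "cv2".
REQUIRED = ["torch", "torchvision", "cv2", "pycocotools", "fvcore",
            "panopticapi", "cityscapesscripts", "detectron2"]

def missing_packages(names=REQUIRED):
    """Return the subset of names that cannot be found on this system."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    gaps = missing_packages()
    if gaps:
        print("Missing packages:", ", ".join(gaps))
    else:
        print("All requirements satisfied.")
```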

Datasets

For the datasets that detectron2 natively supports, the data is assumed to exist in a directory called datasets/ under the directory where you launch the program, with the following directory structure:

Expected dataset structure for COCO:

coco/
  annotations/
    panoptic_{train,val}2017.json
  panoptic_{train,val}2017/
  # png annotations
  panoptic_stuff_{train,val}2017/  # generated by the script mentioned below

Run ./datasets/prepare_coco.py to extract semantic annotations from the panoptic annotations.
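Before launching training, the layout above can be verified with a quick path check. This is a minimal sketch; the datasets/coco root is an assumption based on the default location described above, and check_coco_layout is not part of the repository:

```python
from pathlib import Path

def check_coco_layout(root="datasets/coco"):
    """Return the expected COCO paths (see above) that are missing."""
    root = Path(root)
    expected = [
        root / "annotations" / "panoptic_train2017.json",
        root / "annotations" / "panoptic_val2017.json",
        root / "panoptic_train2017",
        root / "panoptic_val2017",
        root / "panoptic_stuff_train2017",  # generated by prepare_coco.py
        root / "panoptic_stuff_val2017",    # generated by prepare_coco.py
    ]
    return [str(p) for p in expected if not p.exists()]
```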

Expected dataset structure for Cityscapes:

cityscapes/
  gtFine/
    train/
      aachen/
        color.png, instanceIds.png, labelIds.png, polygons.json,
        labelTrainIds.png
      ...
    val/
    test/
  leftImg8bit/
    train/
    val/
    test/

Run ./datasets/prepare_cityscapes.py to create labelTrainIds.png.
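Conceptually, that conversion is a per-pixel labelId-to-trainId lookup defined by cityscapesScripts. A minimal sketch, showing only a few entries of the standard Cityscapes mapping (the full table lives in cityscapesScripts):

```python
# Partial Cityscapes labelId -> trainId mapping; 255 marks ignored
# classes. The complete table is defined in cityscapesScripts.
ID_TO_TRAIN_ID = {
    0: 255,   # unlabeled -> ignore
    7: 0,     # road
    8: 1,     # sidewalk
    11: 2,    # building
    26: 13,   # car
}

def to_train_ids(label_row):
    """Convert one row of labelIds pixels to train ids."""
    return [ID_TO_TRAIN_ID.get(pixel, 255) for pixel in label_row]
```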

Before training, download the ImageNet pre-trained backbone models (ResNet50, ResNet101 and ResNet152) and unzip them into the epochs directory.

Training

To train a model, run

python train_net.py --config-file <config.yaml>

For example, to launch end-to-end EMANet training with a ResNet-50 backbone on the COCO dataset using 8 GPUs, run:

python train_net.py --config-file configs/r50_coco.yaml --num-gpus 8
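For scripting multiple runs, the same command line can be assembled programmatically. A minimal sketch of a hypothetical wrapper (build_train_cmd is not part of the repository):

```python
# Hypothetical helper to build the launch command shown above;
# pass the result to subprocess.run(...) to start training.
def build_train_cmd(config, num_gpus=8, extra_opts=()):
    cmd = ["python", "train_net.py", "--config-file", config,
           "--num-gpus", str(num_gpus)]
    cmd.extend(extra_opts)  # e.g. ("--eval-only", "MODEL.WEIGHTS", "...")
    return cmd
```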

Evaluation

Model evaluation can be done similarly:

python train_net.py --config-file configs/r50_coco.yaml --num-gpus 8 --eval-only MODEL.WEIGHTS epochs/model.pth
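The trailing MODEL.WEIGHTS epochs/model.pth tokens are detectron2-style config overrides given as KEY VALUE pairs. A minimal sketch of how such pairs can be parsed (the real handling is done by detectron2's config system, not this function):

```python
def parse_opts(opts):
    """Turn a flat list of KEY VALUE tokens into a dict of overrides."""
    if len(opts) % 2 != 0:
        raise ValueError("opts must be KEY VALUE pairs")
    return dict(zip(opts[0::2], opts[1::2]))
```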

Results

There are some differences between this implementation and the official implementation:

  1. The image sizes for Multi-Scale Training are (640, 672, 704, 736, 768, 800) for the COCO dataset;
  2. The image sizes for Multi-Scale Training are (800, 832, 864, 896, 928, 960, 992, 1024) for the Cityscapes dataset;
  3. No RandomCrop is used;
  4. The learning rate policy is WarmupCosineLR.
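Points 1 and 4 can be sketched in a few lines: multi-scale training picks a random short-edge size per image, and WarmupCosineLR ramps the learning rate up linearly before decaying it with a cosine curve. The warmup length below is an assumption; the real schedule is detectron2's WarmupCosineLR:

```python
import math
import random

# Short-edge sizes for COCO multi-scale training, as listed above.
COCO_SIZES = (640, 672, 704, 736, 768, 800)

def sample_train_size(sizes=COCO_SIZES, rng=random):
    """Pick the short-edge length for one training image."""
    return rng.choice(sizes)

def warmup_cosine_lr(step, base_lr, max_steps, warmup_steps=1000):
    """Learning rate at a given iteration: linear warmup, cosine decay."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```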

COCO

| Name | train time (s/iter) | inference time (s/im) | train mem (GB) | PA % | mean PA % | mean IoU % | FW IoU % | download link |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R50 | 1.04 | 0.11 | 11.14 | 80.49 | 53.92 | 42.71 | 68.69 | model \| xxi8 |
| R101 | 1.55 | 0.18 | 17.92 | 81.16 | 54.54 | 43.61 | 69.50 | model \| 1jhd |
| R152 | 1.95 | 0.23 | 23.88 | 81.73 | 56.53 | 45.15 | 70.40 | model \| wka6 |

Cityscapes

| Name | train time (s/iter) | inference time (s/im) | train mem (GB) | PA % | mean PA % | mean IoU % | FW IoU % | download link |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R50 | 0.81 | 0.11 | 11.22 | 95.13 | 80.01 | 72.28 | 91.09 | model \| x2d5 |
| R101 | 1.11 | 0.14 | 14.69 | 95.35 | 81.77 | 74.02 | 91.47 | model \| t2m1 |
| R152 | 1.37 | 0.15 | 18.87 | 95.48 | 82.97 | 75.12 | 91.68 | model \| vqeq |
