A light-weight deep learning library based on Caffe

shenfalong/SegModel

SegModel

This repository is for Semantic Segmentation via Structured Patch Prediction, Context CRF and Guidance CRF.

@inproceedings{shen2017segmodel,
  author = {Falong Shen and Gan Rui and Shuicheng Yan and Gang Zeng},
  title = {Semantic Segmentation via Structured Patch Prediction, Context CRF and Guidance CRF},
  booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2017}
}

Installation

This library is based on Caffe. cuDNN 7 and NCCL 1 are required. Please follow the Caffe installation instructions.

Include

  • Implementation details introduced in the paper, including training and test code.
  • GPU memory reuse by assigning a different flow to each data blob, which saves about half of the memory in the training stage.
  • Efficient multi-GPU execution.
  • Support for multi-batch normalization.
  • Support for training generative adversarial networks.

Scripts

The scripts are written in MATLAB. Please run the scripts in the matlab folder.

Put these models into matlab/caffemodel/ and modify the model name in matcaffe_fcn.m. An ensemble of the three models should reach 79.2% mIoU on the Cityscapes test set.
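As an illustration of what ensembling segmentation models can look like, the sketch below averages per-pixel class probabilities over several models and takes the argmax. This is a minimal example under stated assumptions, not the procedure used in matcaffe_fcn.m: the function name `ensemble_predict` and the flattened list-of-lists data layout are hypothetical, and the repository may combine its models differently.

```python
def ensemble_predict(prob_maps):
    """Average per-pixel class probabilities over models, then take the argmax.

    prob_maps: one entry per model; each entry is a list of per-pixel
    probability vectors (pixels flattened). This layout is an assumption
    made for the example, not the repository's actual output format.
    """
    num_models = len(prob_maps)
    num_pixels = len(prob_maps[0])
    labels = []
    for p in range(num_pixels):
        num_classes = len(prob_maps[0][p])
        # Mean probability of each class at pixel p across all models.
        mean = [sum(m[p][c] for m in prob_maps) / num_models
                for c in range(num_classes)]
        # Index of the class with the highest mean probability.
        labels.append(max(range(num_classes), key=mean.__getitem__))
    return labels


# Three toy "models", two pixels, three classes.
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
model_b = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]
model_c = [[0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]
print(ensemble_predict([model_a, model_b, model_c]))  # [1, 2]
```

Averaging probabilities rather than voting on hard labels lets a confident model outweigh two marginal ones, as happens for the first pixel in the toy data.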

Datasets

The PASCAL VOC 2012 semantic segmentation benchmark contains 20 foreground object classes and one background class. The original dataset contains 1464 train, 1449 val, and 1456 test images with pixel-level labels for training, validation, and testing, respectively. The dataset is augmented with extra annotations, resulting in 10582 training images. However, the labeling strategy of the extra annotations is not exactly consistent with the original annotations. Please refer to the label images for more details.

The Cityscapes dataset consists of 2975 training images and 500 validation images, both with pixel-wise annotations. There are also 19,998 additional images with coarse annotations. There are 19 categories in this dataset and there is no background category. All images show street scenes in European cities and were taken by car-mounted cameras. Note that every image in this dataset is 1024 × 2048 pixels.
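The mIoU figure quoted for Cityscapes is the mean, over classes, of intersection over union. A minimal sketch of that metric on flattened label lists is shown below. The function name `mean_iou` is hypothetical, and the `ignore_label=255` default is an assumption about how unlabeled pixels are encoded; the official benchmark accumulates counts over the whole dataset with its own evaluation scripts.

```python
def mean_iou(pred, gt, num_classes, ignore_label=255):
    """Mean intersection-over-union computed from a confusion matrix.

    pred, gt: flattened lists of predicted and ground-truth class indices.
    ignore_label: ground-truth value excluded from scoring (assumed here).
    """
    # confusion[i][j] counts pixels with ground truth i predicted as j.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, g in zip(pred, gt):
        if g == ignore_label:
            continue
        confusion[g][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        # Classes absent from both prediction and ground truth are skipped.
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes; the last pixel is ignored.
gt = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, gt, num_classes=3), 4))  # 0.7222
```

For Cityscapes, `num_classes` would be 19, and no background class enters the average.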

The data for this benchmark comes from the ADE20K dataset, which contains more than 20K scene-centric images exhaustively annotated with objects and object parts. Specifically, the benchmark is divided into 20K images for training, 2K images for validation, and another batch of held-out images for testing. In total, 150 semantic categories are included for evaluation, covering stuff such as sky, road, and grass, and discrete objects such as person, car, and bed. Note that objects are non-uniformly distributed across the images, mimicking natural object occurrence in everyday scenes.
