Hybrid Task Cascade for Instance Segmentation

English | 简体中文

1 Introduction

This project reproduces HTC based on the PaddleDetection framework.

Cascade is a classic yet powerful architecture that has boosted performance on various tasks. However, how to introduce cascade into instance segmentation remains an open question. A simple combination of Cascade R-CNN and Mask R-CNN brings only limited gain. In this work, the authors propose a new framework, Hybrid Task Cascade (HTC), which differs in two important aspects: (1) instead of performing cascaded refinement on the detection and segmentation tasks separately, it interweaves them for joint multi-stage processing; (2) it adopts a fully convolutional branch to provide spatial context, which helps distinguish hard foreground from cluttered background. Overall, the framework progressively learns more discriminative features while integrating complementary features at each stage. Without bells and whistles, a single HTC obtains 38.4% mask AP, a 1.5% improvement over a strong Cascade Mask R-CNN baseline on the MSCOCO dataset. Moreover, the overall system achieves 48.6 mask AP on the test-challenge split, ranking 1st in the COCO 2018 Challenge Object Detection Task.
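To make the interleaving concrete, the sketch below outlines the three-stage information flow in plain Python. It is a schematic only: `roi_align`, `box_heads`, `mask_heads`, `semantic_head`, and `fuse` are hypothetical placeholders, not the actual PaddleDetection modules used in this repository.

```python
# Schematic sketch of HTC's interleaved cascade (NOT the actual
# PaddleDetection code). roi_align, box_heads, mask_heads,
# semantic_head and fuse are hypothetical placeholders.
def htc_forward(fpn_feats, proposals, box_heads, mask_heads,
                semantic_head, roi_align, fuse, num_stages=3):
    # Fully convolutional branch: semantic segmentation supplies the
    # spatial context that is shared by every stage.
    semantic_feat, sem_loss = semantic_head(fpn_feats)
    losses = {"semantic": sem_loss}

    prev_mask_feat = None
    for t in range(num_stages):
        # Box branch refines the proposals coming from the previous
        # stage, with the semantic feature added as extra context.
        box_feat = roi_align(fpn_feats, proposals) + roi_align([semantic_feat], proposals)
        proposals, losses[f"bbox_stage{t}"] = box_heads[t](box_feat, proposals)

        # Interleaving: the mask branch of stage t sees the boxes refined
        # by stage t AND the mask features produced by stage t-1.
        mask_feat = roi_align(fpn_feats, proposals) + roi_align([semantic_feat], proposals)
        if prev_mask_feat is not None:
            mask_feat = fuse(mask_feat, prev_mask_feat)
        prev_mask_feat, losses[f"mask_stage{t}"] = mask_heads[t](mask_feat)

    return losses
```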

Paper:

  • [1] K. Chen et al., "Hybrid Task Cascade for Instance Segmentation," arXiv:1901.07518 [cs], Apr. 2019. Accessed: Aug. 31, 2021. [Online]. Available: http://arxiv.org/abs/1901.07518

Reference project:

AI Studio link:

2 Accuracy

| Model | Style | box AP | mask AP |
| --- | --- | --- | --- |
| htc-R-50-FPN (official) | PyTorch | 42.3 | 37.4 |
| htc-R-50-FPN (mine) | PaddlePaddle | 42.6 | 37.9 |

Model & log download: Baidu Web Drive (access code: yc1r)

Detailed information:

weights
├── checkpoints
│   ├── htc_r50_fpn_1x_coco_resnet.pdparams
│   ├── htc_r50_fpn_1x_coco.pdparams
├── output
│   ├── htc_r50_fpn_1x_coco
│   │   ├── model_final.pdparams

3 Dataset

COCO 2017 + stuffthingmaps_trainval2017

  • Dataset size:
    • train: 118287 images
    • val: 5000 images
  • Data format: images (a quick count check follows below)
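To confirm that the downloaded data matches the image counts listed above, a minimal sanity check with pycocotools can be used. The dataset/coco path below is an assumption about where the COCO annotations are unpacked, following PaddleDetection's usual layout.

```python
# Sanity-check the COCO 2017 image counts listed above.
# Assumes annotations were unpacked to dataset/coco/annotations/.
from pycocotools.coco import COCO

for split, expected in [("train2017", 118287), ("val2017", 5000)]:
    coco = COCO(f"dataset/coco/annotations/instances_{split}.json")
    n_images = len(coco.getImgIds())
    print(f"{split}: {n_images} images (expected {expected})")
```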

4 Environment

  • Hardware: GPU, CPU

  • Framework:

    • PaddlePaddle >= 2.1.2

5 Quick start

step1: clone

# clone this repo
git clone https://github.com/laihuihui/htc.git
cd htc

Install dependencies:

pip install -r requirements.txt

step2: train

python tools/train.py -c configs/htc/htc_r50_fpn_1x_coco.yml

If training is interrupted, it can be resumed with the --resume (or -r) parameter. For example, --resume output/htc_r50_fpn_1x_coco/3 resumes training from the checkpoint saved at epoch 3:

python tools/train.py -c configs/htc/htc_r50_fpn_1x_coco.yml --resume output/htc_r50_fpn_1x_coco/3 

Perform evaluation during training with the --eval parameter:

python tools/train.py -c configs/htc/htc_r50_fpn_1x_coco.yml --eval

To train with multiple GPUs, launch distributed training:

python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/htc/htc_r50_fpn_1x_coco.yml

The output is:

Epoch: [0] [   0/7329] learning_rate: 0.000020 loss_rpn_cls: 0.691306 loss_rpn_reg: 0.054590 loss_bbox_cls_stage0: 4.189201 loss_bbox_reg_stage0: 0.064000 loss_bbox_cls_stage1: 2.481206 loss_bbox_reg_stage1: 0.016608 loss_bbox_cls_stage2: 1.106741 

step3: evaluation

python tools/eval.py -c configs/htc/htc_r50_fpn_1x_coco.yml -o weights=output/htc_r50_fpn_1x_coco/model_final.pdparams

6 Code structure

6.1 Main structure

├─config                          
├─dataset                         
├─ppdet                           
├─output                          
├─log                             
├─tools                           
│   ├─eval.py                     
│   ├─train.py                    
│  README.md                      
│  README_cn.md                   
│  README_paddeldetection_cn.md   
│  requirement.txt                

6.2 Parameter description

Parameters related to training and evaluation are passed to tools/train.py as command-line flags (a simplified parser sketch follows the table):

| Parameter | Default | Description |
| --- | --- | --- |
| -c | None (required) | Path to the configuration file |
| --eval | False (optional) | Whether to run evaluation during training |
| --resume / -r | None (optional) | Resume training from a checkpoint |
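The actual flags are defined inside tools/train.py on top of PaddleDetection's shared argument parser; the argparse sketch below is only a simplified stand-in that illustrates how they behave, not the repository's real code.

```python
# Hedged sketch: a simplified equivalent of the flags accepted by
# tools/train.py, not the actual PaddleDetection parser.
import argparse

parser = argparse.ArgumentParser(description="HTC training")
parser.add_argument("-c", "--config", required=True,
                    help="path to the YAML configuration file")
parser.add_argument("--eval", action="store_true",
                    help="run evaluation periodically during training")
parser.add_argument("-r", "--resume", default=None,
                    help="checkpoint prefix to resume from, "
                         "e.g. output/htc_r50_fpn_1x_coco/3")
args = parser.parse_args()
```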

6.3 Training process

Single-GPU training

python tools/train.py -c $config_file

Multi-GPU training

python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c $config_file

In this case, the program writes each process's output log to ./log:

.
├── log
│   ├── endpoints.log
│   ├── workerlog.0
│   ├── workerlog.1
│   ├── workerlog.2
│   └── workerlog.3

Training output

After training starts, you will get output similar to the following. Each training batch prints the current epoch, step, and loss values.

Epoch: [0] [   0/7329] learning_rate: 0.000020 loss_rpn_cls: 0.691306 loss_rpn_reg: 0.054590 loss_bbox_cls_stage0: 4.189201 loss_bbox_reg_stage0: 0.064000 loss_bbox_cls_stage1: 2.481206 loss_bbox_reg_stage1: 0.016608 loss_bbox_cls_stage2: 1.106741 
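If you want to track convergence, the loss values can be pulled back out of these lines. The small parser below is a sketch written against the log format shown above; it is not part of the repository.

```python
# Extract named loss values from training log lines like the one above.
# The regular expression only reflects the log format shown here.
import re

LOSS_RE = re.compile(r"(loss\w*):\s*([0-9.]+)")

def parse_losses(log_path):
    history = []
    with open(log_path) as f:
        for line in f:
            pairs = LOSS_RE.findall(line)
            if pairs:
                history.append({name: float(value) for name, value in pairs})
    return history

# Example usage: losses = parse_losses("log/workerlog.0")
```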

6.4 Evaluation process

python tools/eval.py -c $config_file -o weights=$weight_file

Pre-trained model: weights/output/htc_r50_fpn_1x_coco/model_final.pdparams from the Baidu Web Drive (access code: yc1r)

7 Model information

For other information about the model, please refer to the following table:

| Information | Description |
| --- | --- |
| Author | huihui lai |
| Date | 2021.08 |
| Framework version | Paddle 2.1.2 |
| Application scenarios | Object detection, instance segmentation |
| Supported hardware | GPU, CPU |
| Download link | Pre-trained model & logs (access code: yc1r) |
| Online operation | Notebook, Script |
