English | 简体中文
- Hybrid Task Cascade for Instance Segmentation
This project reproduces HTC based on the PaddleDetection framework.
Cascade is a classic yet powerful architecture that has boosted performance on various tasks. However, how to introduce cascade to instance segmentation remains an open question. A simple combination of Cascade R-CNN and Mask R-CNN only brings limited gain. In this work, the authors propose a new framework, Hybrid Task Cascade (HTC), which differs in two important aspects: (1) instead of performing cascaded refinement on these two tasks separately, it interweaves them for joint multi-stage processing; (2) it adopts a fully convolutional branch to provide spatial context, which helps distinguish hard foreground from cluttered background. Overall, this framework can learn more discriminative features progressively while integrating complementary features in each stage. Without bells and whistles, a single HTC obtains 38.4% and 1.5% improvement over a strong Cascade Mask R-CNN baseline on the MSCOCO dataset. Moreover, the overall system achieves 48.6 mask AP on the test-challenge split, ranking 1st in the COCO 2018 Challenge Object Detection Task.
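To make the two design points above concrete, here is a minimal, illustrative sketch of the interleaved cascade in plain Python. Every component in it (rpn, semantic_branch, roi_align, box_head, mask_head) is a hypothetical stub standing in for the real ppdet modules; the sketch only shows how boxes, masks, and the semantic context flow between stages, not the actual implementation in this repo.

```python
import numpy as np

def rpn(feat):
    # Stub: produce dummy region proposals (x1, y1, x2, y2).
    return np.zeros((100, 4))

def semantic_branch(feat):
    # Stub: fully convolutional branch providing spatial context features.
    return feat

def roi_align(feat, rois):
    # Stub: pooled per-RoI feature vectors.
    return np.zeros((len(rois), 256))

def box_head(stage, roi_feat):
    # Stub: stage-specific box head returning refined boxes.
    return np.zeros((roi_feat.shape[0], 4))

def mask_head(stage, roi_feat, prev):
    # Stub: stage-specific mask head; also hands its features to the next stage.
    masks = np.zeros((roi_feat.shape[0], 28, 28))
    return masks, roi_feat

def htc_forward(feat, num_stages=3):
    proposals = rpn(feat)
    sem = semantic_branch(feat)                      # (2) spatial context branch
    mask_info = None
    box_preds, mask_preds = [], []
    for s in range(num_stages):
        # Box branch refines the proposals coming from the previous stage;
        # pooled semantic features are added to the RoI features.
        roi_feat = roi_align(feat, proposals) + roi_align(sem, proposals)
        boxes = box_head(s, roi_feat)
        box_preds.append(boxes)
        # (1) Interleaving: the mask head of stage s sees the boxes refined at
        # stage s and the mask features handed over from stage s-1.
        mask_roi = roi_align(feat, boxes) + roi_align(sem, boxes)
        masks, mask_info = mask_head(s, mask_roi, mask_info)
        mask_preds.append(masks)
        proposals = boxes                            # feed refined boxes forward
    return box_preds, mask_preds

if __name__ == "__main__":
    box_preds, mask_preds = htc_forward(np.zeros((256, 64, 64)))
    print(len(box_preds), len(mask_preds))           # 3 stages of predictions
```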
Paper:
- [1] K. Chen et al., “Hybrid Task Cascade for Instance Segmentation,” arXiv:1901.07518 [cs], Apr. 2019. [Online]. Available: http://arxiv.org/abs/1901.07518
Reference project:
AI Studio links:
- Notebook: https://aistudio.baidu.com/aistudio/projectdetail/2253839
- Script: https://aistudio.baidu.com/aistudio/clusterprojectdetail/2270473
Model | Style | Box AP | Mask AP |
---|---|---|---|
HTC-R-50-FPN (official) | PyTorch | 42.3 | 37.4 |
HTC-R-50-FPN (this repo) | PaddlePaddle | 42.6 | 37.9 |
Model & log download address: Baidu Web Drive (access code: yc1r)
Detailed information:
weights
├── checkpoints
│ ├── htc_r50_fpn_1x_coco_resnet.pdparams
│ ├── htc_r50_fpn_1x_coco.pdparams
├── output
│ ├── htc_r50_fpn_1x_coco
│ │ ├── model_final.pdparams
COCO 2017 + stuffthingmaps_trainval2017
- Dataset size:
  - train: 118287 images
  - val: 5000 images
- Data format: images
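If you want to verify the dataset layout, the image counts above can be checked with pycocotools. This is just a quick sanity-check sketch; the annotation paths are an assumption based on the default dataset/coco layout used by PaddleDetection and may differ in your setup.

```python
# Sanity-check the train/val image counts listed above.
# The annotation paths below are assumptions (default dataset/coco layout).
from pycocotools.coco import COCO

train = COCO("dataset/coco/annotations/instances_train2017.json")
val = COCO("dataset/coco/annotations/instances_val2017.json")
print("train images:", len(train.getImgIds()))  # expected: 118287
print("val images:", len(val.getImgIds()))      # expected: 5000
```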
- Hardware: GPU, CPU
- Framework:
  - PaddlePaddle >= 2.1.2
# clone this repo
git clone https://github.com/laihuihui/htc.git
cd htc
Install dependencies:
pip install -r requirements.txt
python tools/train.py -c configs/htc/htc_r50_fpn_1x_coco.yml
If training is interrupted, it can be resumed with the --resume (or -r) parameter. For example, --resume output/htc_r50_fpn_1x_coco/3 resumes training from epoch 3:
python tools/train.py -c configs/htc/htc_r50_fpn_1x_coco.yml --resume output/htc_r50_fpn_1x_coco/3
To perform evaluation during training, add the --eval parameter:
python tools/train.py -c configs/htc/htc_r50_fpn_1x_coco.yml --eval
For distributed training on multiple GPUs:
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/htc/htc_r50_fpn_1x_coco.yml
The output is:
Epoch: [0] [ 0/7329] learning_rate: 0.000020 loss_rpn_cls: 0.691306 loss_rpn_reg: 0.054590 loss_bbox_cls_stage0: 4.189201 loss_bbox_reg_stage0: 0.064000 loss_bbox_cls_stage1: 2.481206 loss_bbox_reg_stage1: 0.016608 loss_bbox_cls_stage2: 1.106741
python tools/eval.py -c configs/htc/htc_r50_fpn_1x_coco.yml -o weights=output/htc_r50_fpn_1x_coco/model_final.pdparams
├─configs
├─dataset
├─ppdet
├─output
├─log
├─tools
│ ├─eval.py
│ ├─train.py
│ README.md
│ README_cn.md
│ README_paddeldetection_cn.md
│ requirements.txt
Parameters related to training and evaluation can be set in tools/train.py, as follows:
Parameter | Default | Description |
---|---|---|
-c | None, mandatory | Path to the configuration file |
--eval | False, optional | Whether to perform evaluation during training |
--resume or -r | None, optional | Resume training from a checkpoint |
python tools/train.py -c $config_file
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c $config_file
In this case, the program will write the log output of each process to ./log:
.
├── log
│ ├── endpoints.log
│ ├── workerlog.0
│ ├── workerlog.1
│ ├── workerlog.2
│ └── workerlog.3
After training starts, you will get output similar to the following. Each training batch prints the current epoch, step, and loss values.
Epoch: [0] [ 0/7329] learning_rate: 0.000020 loss_rpn_cls: 0.691306 loss_rpn_reg: 0.054590 loss_bbox_cls_stage0: 4.189201 loss_bbox_reg_stage0: 0.064000 loss_bbox_cls_stage1: 2.481206 loss_bbox_reg_stage1: 0.016608 loss_bbox_cls_stage2: 1.106741
python tools/eval.py -c $config_file -o weights=$weight_file
Pretrained model: weights/output/htc_r50_fpn_1x_coco/model_final.pdparams, available on Baidu Web Drive (access code: yc1r).
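Once downloaded, a quick way to confirm the checkpoint is readable is to load it with paddle.load and list a few parameters. This is a minimal sketch assuming the file is a standard state dict saved by PaddleDetection:

```python
# Quick check that the pretrained weights can be read.
import paddle

state_dict = paddle.load("weights/output/htc_r50_fpn_1x_coco/model_final.pdparams")
print("number of parameter tensors:", len(state_dict))
for name in list(state_dict.keys())[:5]:
    # Print a few parameter names and their shapes.
    print(name, tuple(state_dict[name].shape))
```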
For other information about the model, please refer to the following table:
Information | Description |
---|---|
Author | huihui lai |
Date | 2021.08 |
Framework version | Paddle 2.1.2 |
Application scenarios | Object detection, instance segmentation |
Supported hardware | GPU, CPU |
Download link | Pretrained model & logs (access code: yc1r) |
Online run | Notebook, Script |