This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Add RetinaNet Implementation #102

Merged: 56 commits merged into facebookresearch:master on Feb 15, 2019

Conversation

@chengyangfu (Contributor) commented Nov 3, 2018

Hi,
This PR contains the RetinaNet implementation. The following table lists the models, which use ResNet-50, ResNet-101, and ResNeXt-101-32x8d as backbones.

GPU: Pascal Titan X
PyTorch: v1.0.0
Inference time is measured with a batch size of 1.

| Model | Detectron accuracy | Current accuracy | Inference time (s/im) | Download |
|---|---|---|---|---|
| RetinaNet_R-50-FPN_1x | 35.7 | 36.3 | 0.102 | model |
| RetinaNet_R-101-FPN_1x | 37.7 | 38.5 | 0.123 | model |
| RetinaNet_X-101-32x8d-FPN_1x | 39.5 | 39.8 | 0.200 | model |
| RetinaNet_R-50-FPN_P5_1x | 35.7 | 36.2 | 0.097 | model |
| RetinaNet_R-101-FPN_P5_1x | 37.7 | 38.5 | 0.121 | model |

Added `_C.TEST.DETECTIONS_PER_IMG = 100`. With this cap enabled, mAP drops by 0.1.
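For reference, a minimal sketch of the kind of top-k filtering such a cap performs after NMS (the helper name is hypothetical; the real logic lives in the RetinaNet postprocessing):

```python
import torch

def limit_detections(scores, detections_per_img=100):
    # Keep only the indices of the top-scoring detections for one image,
    # mirroring what TEST.DETECTIONS_PER_IMG caps.
    num = scores.numel()
    if num <= detections_per_img:
        return torch.arange(num)
    # kthvalue returns the smallest score among the kept top-k detections.
    thresh, _ = torch.kthvalue(scores.cpu(), num - detections_per_img + 1)
    keep = scores >= thresh.item()
    return torch.nonzero(keep).squeeze(1)
```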

Parts not implemented:

  • Class-specific bbox prediction
  • Softmax focal loss

Updated 02/02/2019
Identified the reason why this branch gets higher AP.

| Branch | Accuracy (AP/AP50/AP75/APs/APm/APl) | Difference |
|---|---|---|
| This | 36.3/55.2/38.9/19.7/39.9/49.0 | BoxCoder(10, 10, 5, 5); add *4 in classification loss normalization |
| retinanet-detectron | 35.6/55.8/37.7/19.6/39.3/48.2 | BoxCoder(1, 1, 1, 1) |
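For context, the BoxCoder weights scale the box-regression targets, which changes the effective magnitude of the smooth-L1 loss relative to the classification loss. A minimal standalone sketch of the encoding step (simplified from the usual Faster R-CNN parameterization; the repo's actual BoxCoder also applies a +1 box-size convention):

```python
import torch

def encode(gt_boxes, anchors, weights=(10., 10., 5., 5.)):
    # Compute the (dx, dy, dw, dh) regression targets, scaled by the weights.
    wx, wy, ww, wh = weights

    anchor_w = anchors[:, 2] - anchors[:, 0]
    anchor_h = anchors[:, 3] - anchors[:, 1]
    anchor_cx = anchors[:, 0] + 0.5 * anchor_w
    anchor_cy = anchors[:, 1] + 0.5 * anchor_h

    gt_w = gt_boxes[:, 2] - gt_boxes[:, 0]
    gt_h = gt_boxes[:, 3] - gt_boxes[:, 1]
    gt_cx = gt_boxes[:, 0] + 0.5 * gt_w
    gt_cy = gt_boxes[:, 1] + 0.5 * gt_h

    # With weights (10, 10, 5, 5) the targets are 10x/5x larger than with
    # (1, 1, 1, 1), which shifts the balance between the two losses.
    return torch.stack((
        wx * (gt_cx - anchor_cx) / anchor_w,
        wy * (gt_cy - anchor_cy) / anchor_h,
        ww * torch.log(gt_w / anchor_w),
        wh * torch.log(gt_h / anchor_h),
    ), dim=1)
```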

Updated 01/30/2019
After updating PyTorch to v1.0.0, the inference time dropped by around 15~20%.
The inference times in the table have been updated accordingly.


Updated 01/26/2019
Added the RetinaNet_X-101-32x8d-FPN_1x model.
AP: 39.8
Inference time: 0.200 seconds.


Updated 01/25/2019
In my first version, I accidentally used P5 to generate P6, instead of C5 as used in Detectron and the paper.
The following table compares the performance of the two settings.

Accuracy format: AP/AP50/AP75/APs/APm/APl.

| Model | C5 | P5 |
|---|---|---|
| RetinaNet_R-50-FPN_1x | 36.3/55.2/38.9/19.7/39.9/49.0 | 36.2/55.1/38.7/19.7/39.5/48.6 |
| RetinaNet_R-101-FPN_1x | 38.5/57.6/41.0/20.8/42.3/51.7 | 38.5/57.9/41.3/21.0/42.8/51.3 |
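The difference comes down to which feature map feeds the extra strided convolutions that produce P6 and P7. A sketch modeled on the repo's LastLevelP6P7 module (simplified; weight initialization omitted):

```python
import torch.nn.functional as F
from torch import nn

class LastLevelP6P7(nn.Module):
    # Generates RetinaNet's extra P6/P7 levels from either C5 (the paper /
    # Detectron setting) or P5, depending on the channel count passed in.
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.p6 = nn.Conv2d(in_channels, out_channels, 3, stride=2, padding=1)
        self.p7 = nn.Conv2d(out_channels, out_channels, 3, stride=2, padding=1)
        self.use_P5 = in_channels == out_channels

    def forward(self, c5, p5):
        x = p5 if self.use_P5 else c5
        p6 = self.p6(x)
        p7 = self.p7(F.relu(p6))
        return [p6, p7]
```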

Updated 01/23/2019

Trained the model without the "divide by 4" in the regression loss.
Performance:

| Model | AP | AP50 | AP75 | APs | APm | APl |
|---|---|---|---|---|---|---|
| RetinaNet_R-50-FPN_1x | 29.6 | 45.0 | 31.5 | 13.9 | 31.4 | 41.2 |

Updated 11/20/2018

The matching part is slightly different from the Detectron version.
In Detectron's matching, anchors with IoU >= 0.5 are considered positive examples, and anchors with IoU <= 0.4 are negative examples. Then, for the low-quality matches (the best-matching anchor for each ground-truth box), Detectron only keeps those with IoU >= 0.4.

P.S. In Detectron, some anchors occur in both fg_inds and bg_inds. Although in line 230 Detectron removes all low-quality positive examples with IoU < 0.4, I think the num_fg calculation in line 231 is not correct.

I also tested the threshold used for low-quality positive examples, from 0.5 down to 0.0.

| Threshold | AP | AP50 | AP75 | Note |
|---|---|---|---|---|
| 0.5 | 35.5 | 53.7 | 38.1 | |
| 0.4 | 36.0 | 54.1 | 38.6 | Detectron version |
| 0.3 | 36.1 | 54.5 | 38.9 | |
| 0.2 | 36.1 | 54.5 | 38.7 | |
| 0.0 | 36.2 | 55.0 | 38.7 | This branch |
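For reference, a minimal sketch of the matching rule under discussion; the low-quality threshold is the knob varied in the table (the helper is illustrative, not the repo's Matcher class):

```python
import torch

def match_anchors(iou, fg_thresh=0.5, bg_thresh=0.4, low_quality_thresh=0.0):
    # iou: (num_gt, num_anchors) pairwise IoU matrix.
    matched_vals, matches = iou.max(dim=0)
    labels = torch.full_like(matches, -1)      # -1 = ignore
    labels[matched_vals >= fg_thresh] = 1      # positive
    labels[matched_vals <= bg_thresh] = 0      # negative

    # Low-quality matches: the best anchor for each ground-truth box is
    # forced positive, but only if its IoU clears low_quality_thresh.
    # 0.4 reproduces the Detectron rule; 0.0 reproduces this branch.
    best_per_gt, _ = iou.max(dim=1)
    anchor_idx = torch.nonzero(iou == best_per_gt[:, None], as_tuple=True)[1]
    keep = matched_vals[anchor_idx] >= low_quality_thresh
    labels[anchor_idx[keep]] = 1
    return labels, matches
```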

@facebook-github-bot

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@facebook-github-bot added the CLA Signed label on Nov 3, 2018
@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@fmassa (Contributor) commented Nov 3, 2018

This is really awesome, thanks a lot for the PR!

I'll have a closer look at it next week, let us know the result of the training!

@chengyangfu (Contributor, Author)

Finishing the training.
The RetinaNet model with the X_101_32x8d backbone takes too much time to train right now. Because the CVPR submission deadline is coming up, our lab does not have spare machines to train this one. If anyone can train it, I would highly appreciate it!

@fmassa (Contributor) commented Nov 5, 2018

No worries about X_101_32x8d training, we can do it on our side.

@fmassa (Contributor) left a comment

Once again thanks a lot for this awesome PR!

This is not a complete review yet.

One question I have: I think we might want to move _C.RETINANET into _C.MODEL.RETINANET, but let's wait until @rbgirshick comments on that.

Review comments (since resolved) on:
  • maskrcnn_benchmark/config/defaults.py
  • maskrcnn_benchmark/modeling/backbone/fpn.py
  • maskrcnn_benchmark/modeling/rpn/anchor_generator.py
  • maskrcnn_benchmark/modeling/rpn/retinanet.py (two threads)
  • maskrcnn_benchmark/structures/boxlist_ops.py
@rbgirshick

@chengyangfu nice work! Do you know what implementation differences might have caused the improvement in box AP relative to the Detectron implementation?

I'm also curious whether you need a C++ implementation of the sigmoid focal loss, or whether you can simply use a Python implementation built on torch.nn.functional. Ideally it would be simplified to the Python version.

Generate Empty BoxLists instead of [] in retinanet_infer
@chengyangfu (Contributor, Author) commented Nov 6, 2018

@rbgirshick
Regarding the C++ implementation of the sigmoid focal loss: I have tested this in another project of mine. The Python version needs to discard the ignored examples first and then calculate the focal loss, whereas the C++/CUDA version can fuse the focal loss with that selection. The C++/CUDA version definitely has a lower memory footprint and runs faster. The critical part is the selection: if I compute labels >= 0 and use it to gather the positive and negative examples, the performance drops quickly. Given the large number of positive and negative examples during training, I think that's reasonable.

The following is the Python version of the focal loss I tested (shown here as a complete module):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SigmoidFocalLoss(nn.Module):
    # Python-only sigmoid focal loss; label 0 is treated as background.
    # The gamma/alpha defaults here are the standard values from the paper.
    def __init__(self, gamma=2.0, alpha=0.25):
        super().__init__()
        self.gamma = gamma
        self.alpha = alpha

    def forward(self, inputs, targets):
        N = inputs.size(0)
        C = inputs.size(1)
        # One-hot encode the target labels.
        class_mask = inputs.new_zeros((N, C))
        ids = targets.view(-1, 1)
        class_mask.scatter_(1, ids, 1.)

        # Drop the background column; each remaining column acts as an
        # independent binary classifier.
        class_mask = class_mask[:, 1:]
        inputs = inputs[:, 1:]

        P = torch.sigmoid(inputs)
        # PC is p_t: the probability assigned to the ground-truth outcome.
        PC = P * class_mask + (1 - P) * (1 - class_mask)
        alpha = self.alpha * class_mask + (1 - self.alpha) * (1 - class_mask)
        focal_weight = alpha * (1 - PC).pow(self.gamma)
        loss = F.binary_cross_entropy_with_logits(inputs, class_mask,
                                                  focal_weight)
        return loss
```
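A quick usage sketch for the module above (shapes are illustrative; the class count assumes COCO's 80 classes plus a background column):

```python
import torch

criterion = SigmoidFocalLoss(gamma=2.0, alpha=0.25)
logits = torch.randn(8, 81)           # 8 anchors, 80 classes + background
targets = torch.randint(0, 81, (8,))  # label 0 = background
print(criterion(logits, targets))     # scalar loss
```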

Add NUM_DETECTIONS_PER_IMAGE
@zimenglan-sysu-512 (Contributor)

Hi @laibe,
I tried to run your yaml files, but encountered an OOM (IMS_PER_BATCH=2).

@fmassa (Contributor) commented Feb 14, 2019

Hi @chengyangfu,

Thanks for the benchmark!

I believe this is a consequence of the operations not being fused in PyTorch 1.0.0. I think there have been some improvements recently that made it better, but I'd need to check.

cc @ailzhang for the performance timings and memory

Let's keep the CUDA implementation for now, then, and dispatch to the Python implementation if the tensor is on CPU. How does that sound?

Then this will be ready to be merged.

Once again, thanks a lot for this awesome contribution!

@laibe commented Feb 14, 2019

@zimenglan-sysu-512 what's your training setup? I was using:
1x GeForce GTX 1080 Ti
CUDA runtime version: 9.0.176
with a max memory usage during training of 5132 MB.

@zimenglan-sysu-512 (Contributor) commented Feb 14, 2019

Thanks @laibe.
I made a mistake that resulted in the OOM.
By the way, do you use the CUDA implementation or the Python implementation?

After switching to the CUDA version, retinanet_MobileNetV2-96-FPN_1x needs 5146 MB of memory, while retinanet_MobileNetV2-FPN_1x.pth needs 5486 MB.

@ailzhang

The numbers might make sense given the current fusion logic in the JIT. @waochaol @zou3519, could you also help check on the JIT numbers? Thanks!

@chengyangfu (Contributor, Author)

@fmassa
Sounds good to me. I just updated maskrcnn_benchmark/layers/sigmoid_focal_loss.py according to the suggestion.
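For reference, a minimal sketch of that dispatch (function names are illustrative; the fused CUDA kernel and the pure-PyTorch fallback are assumed to be defined elsewhere in the file):

```python
def sigmoid_focal_loss(logits, targets, gamma, alpha):
    # Use the fused CUDA kernel on GPU and fall back to the
    # Python implementation on CPU.
    if logits.is_cuda:
        return sigmoid_focal_loss_cuda(logits, targets, gamma, alpha)
    return sigmoid_focal_loss_python(logits, targets, gamma, alpha)
```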

@fmassa (Contributor) left a comment

Great, thanks!

@fmassa fmassa merged commit 6b1ab01 into facebookresearch:master Feb 15, 2019
@fmassa (Contributor) commented Feb 15, 2019

That's an awesome contribution @chengyangfu , thanks a lot for all your effort!

@buaaMars

@chengyangfu
Hi,
Great work!
I have a question about rpn/retinanet/inference.py (line 77): why do you reshape box_regression when it was permute_and_flatten'd just on the previous line? According to rpn/utils.py (line 13), (N, -1, 4) is exactly the shape box_regression already has. Do you reshape it again to handle some special cases?

Thanks a lot.
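For context, a sketch of the permute_and_flatten helper being referenced (modeled on rpn/utils.py; treat the exact body as an approximation):

```python
def permute_and_flatten(layer, N, A, C, H, W):
    # (N, A*C, H, W) -> (N, H*W*A, C): move the per-anchor channel dimension
    # last, so predictions can be concatenated across FPN levels.
    layer = layer.view(N, -1, C, H, W)
    layer = layer.permute(0, 3, 4, 1, 2)
    layer = layer.reshape(N, -1, C)
    return layer
```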

@as1392 commented Jul 24, 2019

Ah... I finally realized why the model zoo does not have these trained weights... Did removing OUT_CHANNELS: 256 from the backbone config destroy the trained networks? I hope someone updates/converts these weights :(

Edit: OK, never mind this comment. It was just giving scores lower than 0.7 (on the whole image). Try predictions on COCO_val2014_000000355257.jpg.

@dedoogong

Could you please make the sigmoid focal loss CUDA implementation support running in FP16?

Thank you

@simaiden commented Sep 9, 2019

Can I use this model to train on my custom dataset, as in #521?

@chenjoya (Contributor)

Supplementary performance of retinanet_r101fpn_2x on COCO minival:

| AP | AP50 | AP75 | APs | APm | APl |
|---|---|---|---|---|---|
| 0.3878 | 0.5811 | 0.4132 | 0.2081 | 0.4256 | 0.5183 |

@adizhol commented Dec 2, 2019

Hi @chengyangfu :)
Why is the focal loss (the sum of the losses) divided by the number of positive labels plus the total number of labels (N = len(labels))?

```python
retinanet_cls_loss = self.box_cls_loss_func(
    box_cls,
    labels
) / (pos_inds.numel() + N)
```

Lyears pushed a commit to Lyears/maskrcnn-benchmark that referenced this pull request Jun 28, 2020
* Add RetinaNet parameters in cfg.

* hot fix.

* Add the retinanet head module now.

* Add the function to generate the anchors for RetinaNet.

* Add the SigmoidFocalLoss cuda operator.

* Fix the bug in the extra layers.

* Change the normalizer for SigmoidFocalLoss

* Support multiscale in training.

* Add retinanet training script.

* Add the inference part of RetinaNet.

* Fix the bug when building the extra layers in retinanet.
Update the matching part in retinanet_loss.

* Add the first version of the inference of RetinaNet.
Need to check it again to see if there is any room for speed improvement.

* Remove the retinanet_R-50-FPN_2x.yaml first.

* Optimize the retinanet postprocessing.

* quick fix.

* Add script for training RetinaNet with ResNet101 backbone.

* Move cfg.RETINANET to cfg.MODEL.RETINANET

* Remove the variables which are not used.

* revert boxlist_ops.
Generate Empty BoxLists instead of [] in retinanet_infer

* Remove the unused commented lines.
Add NUM_DETECTIONS_PER_IMAGE

* remove the unused code.

* Move retinanet related files under Modeling/rpn/retinanet

* Add retinanet_X_101_32x8d_FPN_1x.yaml script.
This model is not fully validated. I only trained it around 5000
iterations and everything is fine.

* set RETINANET.PRE_NMS_TOP_N to 0 in level 5 (P7), because the previous setting may generate zero detections and could cause the program to break.
This matches the original Detectron setting.

* Fix the rpn only bug when the training ends.

* Minor improvements

* Comments and add Python-only implementation

* Bugfix and remove commented code

* keep generalized_rcnn the same.
Move the build_retinanet inside build_rpn.

* Add USE_C5 in the MODEL.RETINANET

* Add two configs using P5 to generate P6.

* fix the bug when loading the Caffe2 ImageNet pretrained model.

* Reduce the code duplication between the RPN loss and the RetinaNet loss.

* Remove the unused comment.

* Remove the hard coded number of classes.

* share the forward part of rpn inference.

* fix the bug in rpn inference.

* Remove the conditional part in the inference.

* Bug fix: add the utils file for permute and flatten of the box
prediction layers.

* Update the comment.

* quick fix. Adding import cat.

* quick fix: forgot to include an import.

* Adjust the normalization part according to Detectron's setting.

* Use the bbox reg normalization term.

* Clean the code according to recent review.

* Use the CUDA version for training on GPU now, and the Python version for training on CPU.

* rename the directory to retinanet.

* Make the train and val datasets consistent with the Mask R-CNN setting.

* add comment.