I see no reason why we can't add object detection to robosat for specific use-cases.
The pre-processing and post-processing need to be slightly adapted to work with bounding boxes, but otherwise we can probably re-use 90% of what's already there.
This ticket tracks the task of implementing RetinaNet as an object detection architecture:
RetinaNet, because it is a state-of-the-art single-shot object detection architecture that follows our 80/20 philosophy: we favor simplicity and maintainability, and focus on the 20% of the causes responsible for 80% of the effects. It's simple, elegant, and on par with the more complex Faster R-CNN in terms of accuracy and runtime.
Here are the three basic ideas; please read the papers for in-depth details:
Use a feature pyramid network (FPN) as the backbone. FPNs augment backbones like ResNet with top-down and lateral connections (a bit similar to what the U-Net does) to handle features at multiple scales.
On top of the FPN, build two heads: one for object classification and one for bounding box regression. Expect on the order of ~100k candidate boxes.
Use focal loss because the ratio of positive to negative bounding boxes is heavily skewed. Focal loss adapts the standard cross-entropy loss by down-weighting the loss for easy samples (based on confidence); a rough sketch follows the paper links below.
Focal Loss
Feature Pyramid Network (FPN)
RetinaNet
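To make the focal loss idea concrete, here is a minimal PyTorch sketch. The function name, the binary single-class setup, and the normalization by the number of positive anchors are illustrative assumptions for this example, not robosat code; see the Focal Loss paper for the exact formulation.

```python
# Hypothetical sketch of focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t),
# for a binary (object vs. background) anchor classification task.
import torch
import torch.nn.functional as F


def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """logits: raw predictions of shape (N,), one per anchor.
    targets: float 0/1 labels of shape (N,)."""
    # Per-anchor binary cross entropy, kept un-reduced so we can re-weight it.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")

    # p_t is the model's probability for the true class of each anchor.
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)

    # alpha_t balances positives vs. negatives; (1 - p_t)^gamma down-weights easy samples.
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    loss = alpha_t * (1 - p_t) ** gamma * bce

    # Normalize by the number of positive anchors, as done in the paper.
    return loss.sum() / targets.sum().clamp(min=1)
```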
Tasks
Read the FPN paper
Read the Focal Loss paper
Implement FPN
Implement RetinaNet
Spec out and handle differences in pre- and post-processing
#46 switches our encoder to a pre-trained ResNet. We can now implement a feature pyramid network and put it on top of the ResNet for two use-cases: to improve segmentation and to move us towards RetinaNet for object detection. These two use-cases can then be expressed as two separate heads on top of the FPN.
#75 implements a Feature Pyramid Network (FPN) on top of the pre-trained ResNet. In addition, it adds segmentation heads to the FPN. The RetinaNet work can now happen in parallel on top of that FPN; a rough sketch of the layout follows below.
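As a rough illustration of the shared-backbone layout, and not the actual code from #75, here is a PyTorch sketch of a top-down pathway with lateral connections over three ResNet stages, plus small task-specific heads. All module and function names (TinyFPN, make_head) are made up for this example.

```python
# Minimal sketch: FPN-style top-down pathway with lateral connections,
# producing pyramid maps that shared heads can run over.
import torch.nn as nn
import torch.nn.functional as F


class TinyFPN(nn.Module):
    def __init__(self, c3_channels, c4_channels, c5_channels, out_channels=256):
        super().__init__()
        # Lateral 1x1 convs project each backbone stage to a common width.
        self.lat3 = nn.Conv2d(c3_channels, out_channels, kernel_size=1)
        self.lat4 = nn.Conv2d(c4_channels, out_channels, kernel_size=1)
        self.lat5 = nn.Conv2d(c5_channels, out_channels, kernel_size=1)
        # 3x3 convs smooth the merged maps.
        self.smooth3 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
        self.smooth4 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, c3, c4, c5):
        # Top-down pathway: upsample the coarser map and add the lateral projection.
        p5 = self.lat5(c5)
        p4 = self.smooth4(self.lat4(c4) + F.interpolate(p5, scale_factor=2, mode="nearest"))
        p3 = self.smooth3(self.lat3(c3) + F.interpolate(p4, scale_factor=2, mode="nearest"))
        return p3, p4, p5


def make_head(out_features, channels=256):
    # Small conv subnet shared across pyramid levels; the last conv predicts
    # either class scores or box offsets per anchor location.
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(channels, out_features, kernel_size=3, padding=1),
    )


# For example, with 9 anchors per location and a single class:
# classification_head = make_head(out_features=9 * 1)
# regression_head = make_head(out_features=9 * 4)
# A segmentation head for the other use-case would attach to the same pyramid maps.
```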