# Custom Backbones

This tutorial does not cover basic concepts; be sure to read the Introduction Guide before continuing.

## Imports

In [1]:
from mantisshrimp.imports import *
from mantisshrimp import *
import albumentations as A

## Setup

Skip to the next section if you only want to read about how to use custom backbones. This section is only required if you're running the complete notebook.

Let's continue using the pets dataset and grab everything we need:

In [2]:
image_dir = datasets.pets.load()
parser = datasets.pets.parser(image_dir, mask=False)

We're going to use an 80% train, 20% valid data split.

In [3]:
data_splitter = RandomSplitter([.8, .2])
train_records, valid_records = parser.parse(data_splitter)
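
To make the 80/20 split concrete, here is a minimal standard-library sketch of a random split. It is not `RandomSplitter`'s implementation; the helper name `random_split` and the `seed` argument are hypothetical and exist only for illustration.

```python
import random

def random_split(items, fractions, seed=42):
    """Shuffle items and cut them into consecutive groups sized by fractions."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    groups, start = [], 0
    for i, fraction in enumerate(fractions):
        # The last group takes whatever is left, so rounding never drops an item.
        if i == len(fractions) - 1:
            end = len(shuffled)
        else:
            end = start + round(fraction * len(shuffled))
        groups.append(shuffled[start:end])
        start = end
    return groups

# Hypothetical record ids standing in for the parsed records.
train_ids, valid_ids = random_split(range(10), [0.8, 0.2])
```

With ten ids, this yields eight training ids and two validation ids, with no overlap between the groups.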

Now we can quickly set up some transforms:

In [4]:
imagenet_mean, imagenet_std = imagenet_stats

valid_tfms = AlbuTransform(
    [
        A.LongestMaxSize(384),
        A.Normalize(mean=imagenet_mean, std=imagenet_std),
    ]
)

train_tfms = AlbuTransform(
    [
        A.LongestMaxSize(384),
        A.RandomSizedBBoxSafeCrop(320, 320, p=0.3),
        A.HorizontalFlip(),
        A.ShiftScaleRotate(rotate_limit=20),
        A.RGBShift(always_apply=True),
        A.RandomBrightnessContrast(),
        A.Blur(blur_limit=(1, 3)),
        A.Normalize(mean=imagenet_mean, std=imagenet_std),
    ]
)
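
As a rough illustration of what `A.LongestMaxSize(384)` does to image dimensions, the sketch below scales an image so that its longest side becomes 384 while keeping the aspect ratio. The helper `longest_max_size` is hypothetical, and the exact rounding used by Albumentations may differ.

```python
def longest_max_size(height, width, max_size=384):
    """Return (height, width) after scaling the longest side to max_size."""
    scale = max_size / max(height, width)
    return round(height * scale), round(width * scale)

# A 375x500 (height x width) image: the width is the longest side.
resized = longest_max_size(375, 500)
# A portrait image: the height is the longest side.
resized_portrait = longest_max_size(600, 300)
```

Both the training and validation pipelines start with this step, so the model sees images whose longest side is at most 384 pixels.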

And the datasets:

In [5]:
train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)

## Backbones

With the setup done, we're now ready to start exploring custom backbones.

### Models with FPN

- Feature Pyramid Network (FPN) is an addition to the original Faster RCNN, introduced in a 2017 paper.

- For more information, please check the original paper on Faster RCNN with FPN.

- FPN improves the performance of Faster RCNN.

- Mantisshrimp supports various ResNet-style architectures as backbones for FPN.

- It supports the following backbones: "resnet18", "resnet34", "resnet50", "resnet101", "resnet152", "resnext50_32x4d", "resnext101_32x8d", "wide_resnet50_2", "wide_resnet101_2".

- Passing `pretrained=True` creates a backbone with weights pretrained on ImageNet.

In [6]:
resnet_101_backbone = MantisFasterRCNN.get_backbone_by_name("resnet101", fpn=True, pretrained=True)

In [7]:
resnet_152_backbone = MantisFasterRCNN.get_backbone_by_name("resnet152", fpn=True, pretrained=True)

### Models without FPN

- The original Faster RCNN, published in 2015, did not have FPN.

- These architectures do not use FPN, and they too can use multiple backbones.

- Without FPN, Mantisshrimp supports the backbones "resnet18", "resnet34", "resnet50", "resnet101", "resnet152", "resnext101_32x8d", "mobilenet", "vgg11", "vgg13", "vgg16", "vgg19".

- Mantisshrimp thus supports the architectures that can be used with FPN as well as additional CNN models.

- As an example, let us instantiate a MobileNetV2 backbone.

- Passing `pretrained=True` creates a backbone with weights pretrained on ImageNet.

In [8]:
mobilenet_v2_backbone = MantisFasterRCNN.get_backbone_by_name("mobilenet", fpn=False, pretrained=True)

- You can pass all the arguments that torchvision's Faster RCNN accepts, e.g. anchor boxes, IoU thresholds, etc.
- Have a look at those arguments in the torchvision documentation.

## The complete model

To use the custom backbone, just pass it as the `backbone` argument to the model, simple!

In [9]:
model = MantisFasterRCNN(num_classes=2, backbone=mobilenet_v2_backbone)

And then, continue the rest of the steps normally.

Create the dataloader:

In [10]:
train_dl = model.dataloader(train_ds, shuffle=True, batch_size=8, num_workers=2)
valid_dl = model.dataloader(valid_ds, batch_size=8, num_workers=2)
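
Detection batches cannot simply be stacked, because each image may have a different number of boxes. The sketch below shows a common collate pattern for such data. Whether `model.dataloader` does exactly this internally is an assumption; `detection_collate` and the placeholder samples are hypothetical.

```python
def detection_collate(batch):
    """Turn [(image, target), ...] into (images, targets) without stacking."""
    images, targets = zip(*batch)
    return list(images), list(targets)

# Placeholder samples: strings stand in for image tensors.
batch = [
    ("image_a", {"boxes": [[0, 0, 10, 10]]}),
    ("image_b", {"boxes": [[1, 1, 5, 5], [2, 2, 8, 8]]}),
]
images, targets = detection_collate(batch)
```

Keeping images and targets as parallel lists lets each target hold its own number of boxes.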

And let's use fastai for training this time:

In [11]:
from mantisshrimp.engines.fastai import *

In [14]:
learn = rcnn_learner(dls=[train_dl, valid_dl], model=model)

In [13]:
learn.fine_tune(20, 1e-4)

## A more customized backbone

- When using these customizations, make sure you understand them. Training will raise errors if they are not set properly.
- Make sure your parameters fit both your data and your model.
- You can also pass the same arguments that torchvision's FasterRCNN accepts.
- These torchvision parameters work for models both with and without FPN.

Here is everything that can be customized. Parameters set to None take the default values from torchvision.

```
num_classes=None,
# transform parameters
min_size=800, max_size=1333,
image_mean=None, image_std=None,

# RPN parameters
rpn_anchor_generator=None, rpn_head=None, rpn_pre_nms_top_n_train=2000, 
rpn_pre_nms_top_n_test=1000, rpn_post_nms_top_n_train=2000, rpn_post_nms_top_n_test=1000, 
rpn_nms_thresh=0.7, rpn_fg_iou_thresh=0.7, rpn_bg_iou_thresh=0.3,
rpn_batch_size_per_image=256, rpn_positive_fraction=0.5,


# Box parameters
box_roi_pool=None, box_head=None, box_predictor=None, box_score_thresh=0.05, 
box_nms_thresh=0.5, box_detections_per_img=100,
box_fg_iou_thresh=0.5, box_bg_iou_thresh=0.5, box_batch_size_per_image=512, 
box_positive_fraction=0.25, bbox_reg_weights=None
```

- Do check these in torchvision https://pytorch.org/docs/stable/_modules/torchvision/models/detection/faster_rcnn.html
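
To give a feel for what `rpn_fg_iou_thresh` and `rpn_bg_iou_thresh` in the list above control, here is a simplified sketch of labelling an anchor by its best IoU with the ground-truth boxes. It is an illustration under simplifying assumptions, not torchvision's matcher, which applies additional rules omitted here; the helper names `iou` and `label_anchor` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def label_anchor(anchor, gt_boxes, fg_iou_thresh=0.7, bg_iou_thresh=0.3):
    """Label an anchor from its best overlap with any ground-truth box."""
    best = max(iou(anchor, gt) for gt in gt_boxes)
    if best >= fg_iou_thresh:
        return "foreground"
    if best < bg_iou_thresh:
        return "background"
    return "ignored"  # between the two thresholds: not used for training

gt_boxes = [(0, 0, 100, 100)]
anchors = [(0, 0, 100, 90), (0, 0, 100, 50), (80, 80, 180, 180)]
labels = [label_anchor(a, gt_boxes) for a in anchors]
```

Raising the foreground threshold makes positive anchors rarer but tighter. Lowering the background threshold widens the band of ignored anchors.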

Let's say you need to modify the AnchorGenerator.

(Here it is set to the default value, which is what would be used if it were None. You can experiment with it.)

In [None]:
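# AnchorGenerator is imported explicitly in case it is not already in scope
from torchvision.models.detection.rpn import AnchorGenerator

anchor_sizes = ((32,), (64,), (128,), (256,), (512,))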
aspect_ratios = ((0.5, 1.0, 2.0),) * len(anchor_sizes)
ft_anchor_generator = AnchorGenerator(anchor_sizes, aspect_ratios)
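
The sketch below shows, in plain Python, roughly which anchor shapes these sizes and aspect ratios describe. It assumes the aspect ratio means height divided by width and that each anchor keeps an area of about `size ** 2`. Any rounding done by torchvision is left out, and `anchor_shapes` is a hypothetical helper.

```python
import math

def anchor_shapes(sizes, ratios):
    """Return (width, height) pairs for every size and aspect ratio combination."""
    shapes = []
    for size in sizes:
        for ratio in ratios:
            height = size * math.sqrt(ratio)
            width = size / math.sqrt(ratio)
            shapes.append((width, height))
    return shapes

anchor_sizes = ((32,), (64,), (128,), (256,), (512,))
aspect_ratios = ((0.5, 1.0, 2.0),) * len(anchor_sizes)

# One entry per feature map level, each with three anchor shapes.
shapes_per_level = [
    anchor_shapes(sizes, ratios)
    for sizes, ratios in zip(anchor_sizes, aspect_ratios)
]
```

With five size tuples there is one set of anchors per feature map level, which matches the FPN backbones that output several feature maps.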

- And if you need a custom Region of Interest (RoI) pooler:

In [None]:
ft_roi_pooler = torchvision.ops.MultiScaleRoIAlign(
    featmap_names=['0', '1', '2', '3'],
    output_size=7,
    sampling_ratio=2,
)

You can also set an image mean and std specific to your data.

- If they are not given explicitly, the ImageNet mean and std are used automatically.

In [None]:
ft_mean = [0.485, 0.456, 0.406] # ImageNet mean
ft_std = [0.229, 0.224, 0.225] # ImageNet std
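
If you want statistics for your own data, here is a minimal standard-library sketch of computing per-channel mean and std and applying them to a pixel. The helpers `channel_stats` and `normalize_pixel` and the sample pixel values are hypothetical; a real dataset would accumulate these statistics over all training images.

```python
import statistics

def channel_stats(pixels):
    """Per-channel mean and population std of (r, g, b) pixels in [0, 1]."""
    channels = list(zip(*pixels))
    mean = [statistics.mean(channel) for channel in channels]
    std = [statistics.pstdev(channel) for channel in channels]
    return mean, std

def normalize_pixel(rgb, mean, std):
    """Subtract the mean and divide by the std, channel by channel."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]

# Two made-up pixels standing in for a whole dataset.
pixels = [(0.2, 0.3, 0.6), (0.4, 0.5, 0.8)]
data_mean, data_std = channel_stats(pixels)
normalized = normalize_pixel(pixels[0], data_mean, data_std)
```

The resulting lists have the same shape as `ft_mean` and `ft_std`, one value per colour channel.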

Just pass them when you instantiate the model:

In [None]:
# The argument is `num_classes`, matching the parameter list above
model = MantisFasterRCNN(
    num_classes=2,
    backbone=resnet_101_backbone,
    image_mean=ft_mean,
    image_std=ft_std,
    rpn_anchor_generator=ft_anchor_generator,
    box_roi_pool=ft_roi_pooler,
)

## Saving the model

The model is just a plain PyTorch model, so it can be saved normally:

In [28]:
torch.save(model.state_dict(), "mantiss_faster_rcnn.pt")

## Conclusion

- It's really simple to create models with custom backbones
- You don't have to worry about how backbones should be connected with Faster RCNN; Mantisshrimp does that for you