# **Facebook’s D2Go Brings Detectron2 To Mobile**

Facebook recently introduced D2Go, a toolkit built on top of Detectron2 for memory-efficient, end-to-end training and deployment of deep learning computer vision models on mobile devices.

To read more about it, please refer to [this article](https://analyticsindiamag.com/facebooks-d2go-brings-detectron2-to-mobile/).

## **Inference with a Pre-trained Model on D2Go**

D2Go requires Python 3.7+ and PyTorch 1.7+, together with a compatible CUDA GPU runtime. It also depends on TorchVision, Detectron2 and MobileVision. The code below follows the official D2Go notebook. First install the common dependencies, then the nightly builds of PyTorch and TorchVision that are compatible with CUDA 10.2.
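The version requirements above can be checked programmatically, either before or after the installs. The following is a minimal sketch: the helper names `parse_version` and `check_environment` are hypothetical and not part of D2Go, and only major and minor version numbers are compared.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Extract (major, minor) from a version string such as '1.9.0.dev20210401+cu102'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment(min_python=(3, 7), min_torch=(1, 7)):
    """Report whether Python and PyTorch meet the stated minimum versions."""
    report = {
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_ok": False,
        "cuda_available": False,
    }
    # Only import torch if it is installed, so the check also runs on a bare runtime
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch_ok"] = parse_version(torch.__version__) >= min_torch
        report["cuda_available"] = torch.cuda.is_available()
    return report


print(check_environment())
```

If `torch_ok` or `cuda_available` is `False` after the install cells have run, the runtime most likely needs a restart or a GPU-enabled machine type.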

In [None]:
!python -m pip install pip --upgrade --user -q --no-warn-script-location
!python -m pip install numpy pandas seaborn matplotlib scipy statsmodels scikit-learn tensorflow keras opencv-python pillow scikit-image torch torchvision \
     tqdm --user -q --no-warn-script-location


In [None]:
# install nightly build of PyTorch, TorchVision, CUDA 10.2
!python -m pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu102/torch_nightly.html -U --user -q
# install Detectron2 from the source
!python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'  --user -q

The runtime must be restarted after installation; this is done once all packages are in place. Next, install MobileVision from its source code.

In [None]:
!python -m pip install 'git+https://github.com/facebookresearch/mobile-vision.git' --user -q

With the prerequisites installed, install D2Go from Facebook AI Research's official GitHub repository.

In [None]:
!python -m pip install 'git+https://github.com/facebookresearch/d2go.git' --user -q

In [None]:

# restart the kernel so that the freshly installed packages are picked up
import IPython
IPython.Application.instance().kernel.do_shutdown(True)

Import a pre-trained Faster R-CNN FBNetV3A model from the D2Go model zoo and load its checkpoint.

In [None]:
from d2go.model_zoo import model_zoo
model = model_zoo.get('faster_rcnn_fbnetv3a_C4.yaml', trained=True) 

Download a sample image from the COCO dataset to run inference on.

In [1]:
!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O input.jpg

In [None]:
import cv2
from matplotlib import pyplot as plt
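# read the image (OpenCV loads it in BGR channel order)
img = cv2.imread("./input.jpg")
# reverse the channels to RGB, which is what matplotlib expects
plt.imshow(img[:, :, ::-1])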

D2Go's `DemoPredictor` class wraps the pre-trained model and can be used to run inference on the downloaded image.

In [None]:
from d2go.utils.demo_predictor import DemoPredictor
predictor = DemoPredictor(model)
outputs = predictor(img)

The predicted object classes and their bounding boxes can be obtained using the following code.

In [None]:
print(outputs["instances"].pred_classes)
print(outputs["instances"].pred_boxes)
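The `Instances` object returned by the predictor holds class indices and confidence scores rather than class names. A small helper can turn them into a readable per-class count. This is a sketch under stated assumptions: `summarize_detections` is a hypothetical helper, and the toy input below is made up so the block runs without a model.

```python
from collections import Counter


def summarize_detections(class_ids, scores, class_names, score_threshold=0.5):
    """Count detections per class name, keeping only sufficiently confident ones."""
    counts = Counter()
    for class_id, score in zip(class_ids, scores):
        if score >= score_threshold:
            counts[class_names[class_id]] += 1
    return dict(counts)


# Hypothetical usage with the predictor output from the cell above:
#   from detectron2.data import MetadataCatalog
#   instances = outputs["instances"].to("cpu")
#   names = MetadataCatalog.get("coco_2017_train").thing_classes
#   summarize_detections(instances.pred_classes.tolist(),
#                        instances.scores.tolist(), names)

# Toy input (made-up values, not model output) so the sketch runs on its own
toy_names = ["person", "bicycle", "car"]
toy_summary = summarize_detections([0, 0, 2, 1], [0.9, 0.8, 0.4, 0.7], toy_names)
print(toy_summary)  # {'person': 2, 'bicycle': 1}
```

The threshold of 0.5 is an arbitrary default chosen for illustration; a lower value keeps more, less certain detections.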

The detected objects, with their classes and bounding boxes, can be visualized using the following code.

In [None]:
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
# Reverse the channel order BGR -> RGB
v = Visualizer(img[:, :, ::-1], MetadataCatalog.get("coco_2017_train"))
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
# get_image() returns an RGB array, which matplotlib displays directly
plt.imshow(out.get_image())

## **Custom Training in D2Go**

Download the balloon dataset from the Matterport Mask R-CNN releases and unzip the compressed file.

In [None]:
# download, decompress the data
!wget https://github.com/matterport/Mask_RCNN/releases/download/v2.1/balloon_dataset.zip
!unzip -o balloon_dataset.zip > /dev/null 

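D2Go expects datasets in COCO format or as Detectron2's standard list of dataset dicts. The balloon dataset ships with its own annotation file (`via_region_data.json`), so the following helper function converts it into the required format, and the cell after it registers the train and val splits.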

In [None]:
import os
import json
import numpy as np
from detectron2.structures import BoxMode
def get_balloon_dicts(img_dir):
    json_file = os.path.join(img_dir, "via_region_data.json")
    with open(json_file) as f:
        imgs_anns = json.load(f)
    dataset_dicts = []
    for idx, v in enumerate(imgs_anns.values()):
        record = {}
        filename = os.path.join(img_dir, v["filename"])
        height, width = cv2.imread(filename).shape[:2]
        record["file_name"] = filename
        record["image_id"] = idx
        record["height"] = height
        record["width"] = width
        annos = v["regions"]
        objs = []
        for _, anno in annos.items():
            assert not anno["region_attributes"]
            anno = anno["shape_attributes"]
            px = anno["all_points_x"]
            py = anno["all_points_y"]
            # shift each vertex to the pixel center, then flatten to [x0, y0, x1, y1, ...]
            poly = [(x + 0.5, y + 0.5) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]
            obj = {
                "bbox": [np.min(px), np.min(py), np.max(px), np.max(py)],
                "bbox_mode": BoxMode.XYXY_ABS,
                "segmentation": [poly],
                "category_id": 0,
            }
            objs.append(obj)
        record["annotations"] = objs
        dataset_dicts.append(record)
    return dataset_dicts 
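To make the conversion concrete, the per-region step inside `get_balloon_dicts` can be traced on a single polygon. The sketch below reimplements that step without Detectron2 or NumPy; `region_to_box_and_polygon` is a hypothetical name and the coordinates are made up.

```python
def region_to_box_and_polygon(all_points_x, all_points_y):
    """Mirror the per-region conversion: an XYXY box plus a flat polygon."""
    # bounding box in absolute (x_min, y_min, x_max, y_max) order, i.e. XYXY_ABS
    bbox = [min(all_points_x), min(all_points_y),
            max(all_points_x), max(all_points_y)]
    # shift each vertex by half a pixel, then flatten to [x0, y0, x1, y1, ...]
    polygon = []
    for x, y in zip(all_points_x, all_points_y):
        polygon.extend([x + 0.5, y + 0.5])
    return bbox, polygon


# Made-up triangle, for illustration only
bbox, polygon = region_to_box_and_polygon([10, 30, 20], [5, 5, 25])
print(bbox)     # [10, 5, 30, 25]
print(polygon)  # [10.5, 5.5, 30.5, 5.5, 20.5, 25.5]
```

The flat polygon list is the shape stored under `"segmentation"`, wrapped in an outer list because one object may consist of several polygons.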

In [None]:
for d in ["train", "val"]:
    DatasetCatalog.register("balloon_" + d, lambda d=d: get_balloon_dicts("balloon/" + d))
    MetadataCatalog.get("balloon_" + d).set(thing_classes=["balloon"], evaluator_type="coco")
balloon_metadata = MetadataCatalog.get("balloon_train") 

Now that the dataset has been converted, the conversion should be verified. The following code samples three random images from the training set and displays them along with their annotations.

In [None]:
import random
dataset_dicts = get_balloon_dicts("balloon/train")
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=balloon_metadata, scale=0.5)
    out = visualizer.draw_dataset_dict(d)
    plt.figure()
    # get_image() already returns RGB, so no channel reversal is needed for matplotlib
    plt.imshow(out.get_image())
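Visual inspection can be complemented by a programmatic check. The sketch below assumes records shaped like the ones `get_balloon_dicts` returns, with boxes in XYXY order; `find_record_problems` is a hypothetical helper, not a Detectron2 API, and the toy record is made up.

```python
REQUIRED_KEYS = ("file_name", "image_id", "height", "width", "annotations")


def find_record_problems(record):
    """Return a list of human-readable problems found in one dataset record."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in record]
    if problems:
        return problems
    for index, obj in enumerate(record["annotations"]):
        x_min, y_min, x_max, y_max = obj["bbox"]
        if not (0 <= x_min < x_max <= record["width"]):
            problems.append(f"annotation {index}: x range outside image")
        if not (0 <= y_min < y_max <= record["height"]):
            problems.append(f"annotation {index}: y range outside image")
        # a polygon needs at least 3 vertices, i.e. 6 flattened coordinates
        if len(obj["segmentation"][0]) < 6:
            problems.append(f"annotation {index}: polygon has fewer than 3 points")
    return problems


# Made-up record: the second annotation is deliberately broken
toy_record = {
    "file_name": "toy.jpg", "image_id": 0, "height": 100, "width": 200,
    "annotations": [
        {"bbox": [10, 5, 30, 25],
         "segmentation": [[10.5, 5.5, 30.5, 5.5, 20.5, 25.5]]},
        {"bbox": [150, 40, 260, 80],
         "segmentation": [[150.5, 40.5, 260.5, 80.5]]},
    ],
}
for problem in find_record_problems(toy_record):
    print(problem)
```

On the real data, the same check could be run as a loop over `dataset_dicts`, collecting the problems per `file_name`.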

The dataset is in the right format. We now fine-tune a pre-trained FBNetV3A Faster R-CNN model on this dataset.

In [None]:
for txt in ["train", "val"]:
    MetadataCatalog.get("balloon_" + txt).set(thing_classes=["balloon"], evaluator_type="coco")
from d2go.runner import Detectron2GoRunner
def prepare_for_launch():
    runner = Detectron2GoRunner()
    cfg = runner.get_default_cfg()
    cfg.merge_from_file(model_zoo.get_config_file("faster_rcnn_fbnetv3a_C4.yaml"))
    cfg.MODEL_EMA.ENABLED = False
    cfg.DATASETS.TRAIN = ("balloon_train",)
    cfg.DATASETS.TEST = ("balloon_val",)
    cfg.DATALOADER.NUM_WORKERS = 2
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("faster_rcnn_fbnetv3a_C4.yaml")  # Let training initialize from model zoo
    cfg.SOLVER.IMS_PER_BATCH = 2
    cfg.SOLVER.BASE_LR = 0.00025  # pick a good LR
    cfg.SOLVER.MAX_ITER = 600    # a few hundred iterations are enough for this toy dataset; you will need to train longer for a practical dataset
    cfg.SOLVER.STEPS = []        # do not decay learning rate
    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128   # faster, and good enough for this toy dataset (default: 512)
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # only has one class (balloon). (see https://detectron2.readthedocs.io/tutorials/datasets.html#update-the-config-for-new-datasets)
    # NOTE: this config means the number of classes, but a few popular unofficial tutorials incorrectly use num_classes+1 here.
    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
    return cfg, runner
cfg, runner = prepare_for_launch()
model = runner.build_model(cfg)
runner.do_train(cfg, model, resume=False) 
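The solver settings are expressed in iterations rather than epochs, so a rough conversion helps judge how long the run is. The sketch below assumes a training split of 61 images, which is an assumption about the balloon dataset; substitute the length of the list returned by `get_balloon_dicts("balloon/train")` for the real value. `approximate_epochs` is a hypothetical helper.

```python
def approximate_epochs(max_iter, ims_per_batch, num_train_images):
    """Estimate how many passes over the training set a run performs."""
    if num_train_images <= 0:
        raise ValueError("num_train_images must be positive")
    images_seen = max_iter * ims_per_batch
    return images_seen / num_train_images


# MAX_ITER and IMS_PER_BATCH come from the config above; 61 is an assumed dataset size
epochs = approximate_epochs(max_iter=600, ims_per_batch=2, num_train_images=61)
print(f"{epochs:.1f} epochs")
```

Doubling `IMS_PER_BATCH` while keeping `MAX_ITER` fixed doubles the number of images seen, which is why the two settings are usually tuned together with the learning rate.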

Once the model is fine-tuned on the balloon training set, we can evaluate it on the validation set.

In [None]:
metrics = runner.do_test(cfg, model)

The metrics can be printed using the following code.

In [None]:
print(metrics)
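Evaluation results of this kind are typically nested dictionaries, which are hard to read when printed directly. The sketch below flattens any nested dictionary into `path: value` rows. The nesting used in the toy input (model tag, dataset, task, metric) is an assumption, and its zero values are placeholders for illustration, not results from this run.

```python
def flatten_metrics(metrics, prefix=""):
    """Flatten a nested dict into {"a/b/c": value} form."""
    flat = {}
    for key, value in metrics.items():
        path = f"{prefix}/{key}" if prefix else str(key)
        if isinstance(value, dict):
            # recurse into sub-dictionaries, extending the path
            flat.update(flatten_metrics(value, path))
        else:
            flat[path] = value
    return flat


# Placeholder structure and values, for illustration only
toy_metrics = {"default": {"balloon_val": {"bbox": {"AP": 0.0, "AP50": 0.0}}}}
for path, value in flatten_metrics(toy_metrics).items():
    print(f"{path}: {value}")
```

Calling `flatten_metrics(metrics)` on the real result should work the same way as long as the returned object is built from dictionaries.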

## **Related Articles:**

> * [Facebook D2Go to Mobile](https://analyticsindiamag.com/facebooks-d2go-brings-detectron2-to-mobile/)

> * [Multi Class Image Classification with Tensorflow and Keras](https://analyticsindiamag.com/multi-label-image-classification-with-tensorflow-keras/)

> * [Transfer Learning in Tensorflow Keras](https://analyticsindiamag.com/a-practical-guide-to-implement-transfer-learning-in-tensorflow/)

> * [Differentiable Augmentation for Data-Efficient GAN Training](https://analyticsindiamag.com/guide-to-differentiable-augmentation-for-data-efficient-gan-training/)

> * [Guide to Albumentation](https://analyticsindiamag.com/hands-on-guide-to-albumentation/)

> * [Google STAC](https://analyticsindiamag.com/googles-stac-ssl-framework-for-object-detection/)

> * [Comparison of Transfer Learning with Multi Class Classification](https://analyticsindiamag.com/practical-comparison-of-transfer-learning-models-in-multi-class-image-classification/)
