## Export mmdetection models to CoreML format

This notebook converts models from the [mmdetection](https://github.com/open-mmlab/mmdetection) library to CoreML format. It also works around a number of dependency problems that I ran into while converting these models.


In [None]:
# first things first: install the requirements.
!pip install -r requirements.txt

In [None]:
# import libraries
import os
import cv2
import numpy as np

We need to build libtorch and install mmdetection and mmdeploy, following these docs:

https://mmdeploy.readthedocs.io/en/latest/01-how-to-build/macos-arm64.html

https://mmdeploy.readthedocs.io/en/latest/05-supported-backends/coreml.html
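
Since the builds below take a while, it can be worth checking up front that the command-line tools they call are on the `PATH`. This is only a minimal sketch: `missing_tools` is a hypothetical helper name, and the tool list is simply what the cells in this notebook invoke (`git`, `cmake`, `make`).

```python
import shutil

def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

# Tools invoked by the build cells in this notebook.
missing = missing_tools(["git", "cmake", "make"])
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All build tools found")
```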

### 1. Build libtorch, the slow part.

In [None]:
PYTORCH_VERSION = "2.0.0"
PYTORCH_DIR = "third_party/pytorch"

Clone the pinned PyTorch version (~8 min)

In [None]:
if not os.path.exists(PYTORCH_DIR):
    !git clone --recursive --depth 1 --branch v{PYTORCH_VERSION} https://github.com/pytorch/pytorch {PYTORCH_DIR}

Build libtorch from source (~16 min)

In [None]:
!cd {PYTORCH_DIR} && \
mkdir -p build && cd build && \
cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DPYTHON_EXECUTABLE=`which python` \
    -DCMAKE_INSTALL_PREFIX=install \
    -DDISABLE_SVE=ON

!cd {PYTORCH_DIR}/build/ && make -j4 && make install
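
After such a long build, a quick check that the install step produced the CMake package files can save a failed mmdeploy build later. A minimal sketch, assuming the install prefix used above (`build/install`) and that the CMake package file is named `TorchConfig.cmake`; `torch_cmake_config` is a hypothetical helper name.

```python
import os

def torch_cmake_config(pytorch_dir):
    """Path where the build above is expected to install TorchConfig.cmake."""
    return os.path.join(pytorch_dir, "build", "install", "share", "cmake",
                        "Torch", "TorchConfig.cmake")

config_path = torch_cmake_config("third_party/pytorch")
print(config_path, "exists:", os.path.exists(config_path))
```

If the file is missing, the `make install` step most likely did not finish.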

### 2. Build mmdeploy

We also need to build mmdeploy, roughly following the library [docs](https://mmdeploy.readthedocs.io/en/latest/01-how-to-build/macos-arm64.html).

In [None]:
MMDEPLOY_DIR = "third_party/mmdeploy/"
COMMIT_HASH = "bc75c9d6c8940aa03d0e1e5b5962bd930478ba77"
!git -C {MMDEPLOY_DIR} pull || git clone --recursive https://github.com/open-mmlab/mmdeploy.git {MMDEPLOY_DIR}
!cd {MMDEPLOY_DIR} && git reset --hard {COMMIT_HASH}
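
To confirm that the checkout really sits on the pinned commit, the output of `git rev-parse HEAD` can be compared with `COMMIT_HASH`. This is a sketch, not part of the original workflow; `head_matches` and `current_head` are hypothetical helper names, and an abbreviated pinned hash is treated as a prefix match.

```python
import subprocess

def head_matches(head_sha, pinned_sha):
    """True if the checked-out commit matches the pinned (possibly abbreviated) hash."""
    head_sha = head_sha.strip().lower()
    pinned_sha = pinned_sha.strip().lower()
    return bool(pinned_sha) and head_sha.startswith(pinned_sha)

def current_head(repo_dir):
    """Full hash of HEAD in `repo_dir`, obtained via `git rev-parse HEAD`."""
    result = subprocess.run(["git", "-C", repo_dir, "rev-parse", "HEAD"],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Usage (requires the clone above to exist):
# assert head_matches(current_head(MMDEPLOY_DIR), COMMIT_HASH)
```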

First, there is an error in mmdeploy's backend_ops CMakeLists.txt: it sets the C++ standard to C++14 instead of C++17, so we patch it before building. [See the issue](https://github.com/open-mmlab/mmdeploy/issues/2638) for more details.

In [None]:
with open(f"{MMDEPLOY_DIR}/csrc/mmdeploy/backend_ops/CMakeLists.txt", "r") as f, \
        open(f"{MMDEPLOY_DIR}/csrc/mmdeploy/backend_ops/CMakeLists_fixed.txt", "w") as f_out:
    content = f.readlines()
    for ln in content:
        if "set(CMAKE_CXX_STANDARD 14)" in ln:
            ln = "    set(CMAKE_CXX_STANDARD 17)\n"
        f_out.write(ln)

os.rename(f'{MMDEPLOY_DIR}/csrc/mmdeploy/backend_ops/CMakeLists_fixed.txt', f'{MMDEPLOY_DIR}/csrc/mmdeploy/backend_ops/CMakeLists.txt')
print("CMakeLists.txt updated")
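
The same substitution can be written as a small pure function, which makes it easy to see how many lines were changed and that running the patch twice is harmless. A sketch only: `bump_cxx_standard` is a hypothetical helper, and the `sample` text is made up for illustration rather than copied from mmdeploy.

```python
def bump_cxx_standard(cmake_text, old="14", new="17"):
    """Replace set(CMAKE_CXX_STANDARD <old>) with <new>; return (text, replacements)."""
    target = f"set(CMAKE_CXX_STANDARD {old})"
    count = cmake_text.count(target)
    patched_text = cmake_text.replace(target, f"set(CMAKE_CXX_STANDARD {new})")
    return patched_text, count

# Made-up CMake snippet, only to exercise the function.
sample = "if (MSVC)\nelse()\n    set(CMAKE_CXX_STANDARD 14)\nendif()\n"
patched, n = bump_cxx_standard(sample)
print("replacements on first pass:", n)
print("replacements on second pass:", bump_cxx_standard(patched)[1])
```

A replacement count of zero on the first pass would suggest that the upstream file has changed and the patch is no longer needed.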

In [None]:
Torch_DIR = os.path.join(os.getcwd(), PYTORCH_DIR, "build/install/share/cmake/Torch")

print("will compile mmdeploy using torch from ", Torch_DIR)
!cd {MMDEPLOY_DIR} && \
    mkdir -p build && cd build && \
    cmake -DMMDEPLOY_TARGET_BACKENDS=coreml -DTorch_DIR={Torch_DIR} .. && \
    make -j4 && make install

In [None]:
!cd {MMDEPLOY_DIR} && pip install -v -e .

### 3. Install mmdetection from source

In [None]:
MMDETECTION_DIR = "third_party/mmdetection/"
COMMIT_HASH = "cfd5d3a985b0249de009b67d04f37263e11cdf3"
!git -C {MMDETECTION_DIR} pull || git clone --recursive https://github.com/open-mmlab/mmdetection.git {MMDETECTION_DIR}
!cd {MMDETECTION_DIR} && git reset --hard {COMMIT_HASH}

In [None]:
!mim install mmengine
!mim install "mmcv >=2.0.0rc4, < 2.1.0"

In [None]:
!cd {MMDETECTION_DIR}; pip install -v -e .

### 4. Finally, convert the detector model!

Download the models

In [None]:
#if not os.path.exists("retinanet_r18_fpn_1x_coco_20220407_171055-614fd399.pth"):
#    !wget https://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r18_fpn_1x_coco/retinanet_r18_fpn_1x_coco_20220407_171055-614fd399.pth
if not os.path.exists("rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth"):
    !wget https://download.openmmlab.com/mmdetection/v3.0/rtmdet/rtmdet_tiny_8xb32-300e_coco/rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth

Now we will convert the model to CoreML, hopefully. I got an error when running the conversion script from a notebook, because it tried to open a matplotlib window. That's why `MPLBACKEND="template"` is set in the command.

In [None]:
# !cd third_party/ && MPLBACKEND="template" python mmdeploy/tools/deploy.py \
#     mmdeploy/configs/mmdet/detection/detection_coreml_static-800x1344.py \
#     mmdetection/configs/retinanet/retinanet_r18_fpn_1x_coco.py \
#     ../retinanet_r18_fpn_1x_coco_20220407_171055-614fd399.pth \
#     mmdetection/demo/demo.jpg \
#     --work-dir ../work_dir/retinanet \
#     --device cpu \
#     --dump-info


!cd third_party/ && MPLBACKEND="template" python mmdeploy/tools/deploy.py \
    mmdeploy/configs/mmdet/detection/detection_coreml_static-800x1344.py \
    mmdetection/configs/rtmdet/rtmdet_tiny_8xb32-300e_coco.py \
    ../rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth \
    mmdetection/demo/demo.jpg \
    --work-dir ../work_dir/rtmdet \
    --device cpu \
    --dump-info
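
The conversion script only sees `MPLBACKEND` if the variable is part of the environment of the `python` process itself. When launching a script from Python instead of the shell, one way to guarantee that is to pass the environment explicitly. A minimal sketch, with `run_with_env` as a hypothetical helper; the child command here just prints the variable instead of running the real conversion.

```python
import os
import subprocess
import sys

def run_with_env(cmd, **extra_env):
    """Run `cmd` with `extra_env` added to a copy of the current environment."""
    env = {**os.environ, **extra_env}
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)

# The child process sees MPLBACKEND even though it is not set in this notebook.
result = run_with_env(
    [sys.executable, "-c", "import os; print(os.environ['MPLBACKEND'])"],
    MPLBACKEND="template",
)
print(result.stdout.strip())
```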

If everything worked out, you should have a .mlpackage with the CoreML converted model (including NMS). It's a good idea to open it in Xcode to check some metadata.

In [None]:
#!du -sh retinanet_r18_fpn_1x_coco_20220407_171055-614fd399.pth
#!du -sh work_dir/retinanet/end2end.mlpackage/

!du -sh rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth
!du -sh work_dir/rtmdet/end2end.mlpackage/
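
An `.mlpackage` is a directory, so comparing sizes from Python means summing the files inside it. A rough sketch of that (`dir_size_bytes` is a hypothetical helper; it sums apparent file sizes, so the number can differ slightly from what `du` reports), demonstrated on a temporary directory rather than the real package.

```python
import os
import tempfile

def dir_size_bytes(path):
    """Total size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

# Demo on a throwaway directory holding a single 1024-byte file.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "weights.bin"), "wb") as f:
        f.write(b"\0" * 1024)
    demo_size = dir_size_bytes(tmp)
print(demo_size, "bytes")

# Usage on the converted model:
# dir_size_bytes("work_dir/rtmdet/end2end.mlpackage") / 1e6  # size in MB
```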

The conversion script creates a lot of files in the `work_dir` directory. MMDeploy runs inference with both the PyTorch and the CoreML model so that we can check whether the conversion worked:

In [None]:
import ipyplot

images = ["work_dir/rtmdet/output_pytorch.jpg", "work_dir/rtmdet/output_coreml.jpg"]
labels = ["PyTorch", "CoreML"]
ipyplot.plot_images(images, labels, img_width=400)

### 5. CoreML inference

Now let's run inference with coremltools directly. This means that we will not use mmdetection to pre-process the images, and we need to work out the output format ourselves.

The pre-processing parameters come from these files:
- `mmdetection/configs/_base_/datasets/coco_detection.py`
- `mmdetection/configs/rtmdet/rtmdet_l_8xb32-300e_coco.py`

Note that the mean/std values can differ between detectors, so check the config of the model you are converting.

In [None]:
def preprocess_image(image_path):
    im = cv2.imread(image_path)
    im = cv2.resize(im, (1344, 800))
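    # NOTE: if the model config sets bgr_to_rgb=False, the mean/std below are in BGR order and this conversion should be dropped
    im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)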
    im = im.astype(np.float32)
    # no need for im /= 255, since mean and std are expressed on the 0-255 scale

    # mean and std values taken from the rtmdet_l_8xb32-300e_coco.py file
    mean = np.array([103.53, 116.28, 123.675], dtype=np.float32)
    std = np.array([57.375, 57.12, 58.395], dtype=np.float32)

    im -= mean
    im /= std

    im = im.transpose(2, 0, 1) # HWC -> CHW
    im = np.expand_dims(im, 0)  # Add batch dimension.

    return im
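
The remark about not dividing by 255 can be checked with a bit of arithmetic: normalizing on the 0-255 scale gives the same result as scaling the pixel to 0-1 first and using mean/255 and std/255. A small sketch with plain Python floats; the `pixel` values are made up for illustration.

```python
import math

mean_255 = [103.53, 116.28, 123.675]
std_255 = [57.375, 57.12, 58.395]
pixel = [12.0, 130.0, 255.0]  # hypothetical 3-channel pixel on the 0-255 scale

def normalize(values, mean, std):
    """Channel-wise (value - mean) / std."""
    return [(v - m) / s for v, m, s in zip(values, mean, std)]

# Normalize directly on the 0-255 scale...
direct = normalize(pixel, mean_255, std_255)
# ...and after rescaling pixel, mean and std to the 0-1 scale.
scaled = normalize([v / 255 for v in pixel],
                   [m / 255 for m in mean_255],
                   [s / 255 for s in std_255])

same = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(direct, scaled))
print("equivalent:", same)
```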

The output of the CoreML model (for RTMDet) is a dictionary with two items:
- dets: a tensor of size 1x200x5 holding the 200 boxes predicted by the model. Each box comes with four coordinates and a score (to which we should apply a threshold); the boxes are returned in decreasing order of score, which is cool!
- labels: an array of 200 integers giving the class index (COCO) of each box

In [None]:

from mmdet.datasets.coco import CocoDataset
coco_classes = CocoDataset.METAINFO["classes"]

def postprocess_output(output, threshold=0.5):
    boxes = output["dets"][0]       # (200, 5): four box coordinates + score
    labels = output["labels"][0]    # (200,): class index of each box
    keep = boxes[:, 4] > threshold  # boolean mask of boxes above threshold

    detections = []
    for box, label in zip(boxes[keep], labels[keep]):
        # cast to plain Python types so indexing and formatting always work
        detections.append({"label": coco_classes[int(label)], "box": box[:4].tolist(), "score": float(box[4])})

    return detections
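
The thresholding logic can be sanity-checked without running the model by feeding it a hand-made output dictionary. The sketch below is a variation, not the function above: `top_detections` is a hypothetical name, the fake output and class names are invented, and it relies on the boxes arriving in decreasing score order so it can stop at the first box below the threshold. If that ordering is not guaranteed for your model, filter every box instead.

```python
from itertools import takewhile

def top_detections(output, class_names, threshold=0.5):
    """Keep boxes scoring above `threshold`, relying on descending score order."""
    rows = zip(output["dets"][0], output["labels"][0])
    kept = takewhile(lambda pair: pair[0][4] > threshold, rows)
    return [
        {"label": class_names[int(label)],
         "box": [float(v) for v in box[:4]],
         "score": float(box[4])}
        for box, label in kept
    ]

# Invented output with 3 boxes instead of 200, same layout as described above.
fake_output = {
    "dets": [[[10, 20, 110, 220, 0.9], [5, 5, 50, 50, 0.6], [0, 0, 8, 8, 0.1]]],
    "labels": [[2, 0, 1]],
}
found = top_detections(fake_output, class_names=["person", "bicycle", "car"])
print(found)
```

The same function also accepts the NumPy arrays returned by the model, since it only iterates and indexes.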

A quick function to plot the detections:

In [None]:
import matplotlib.pyplot as plt
import matplotlib.patches as patches

def plot_detections(image_path, detections, detector_size):
    im = cv2.imread(image_path)
    fig, ax = plt.subplots(1, figsize=(10, 10))
    
    ax.imshow(cv2.cvtColor(im, cv2.COLOR_BGR2RGB))

    for detection in detections:
        box = detection["box"]
        # scale box from detector coordinates back to the original image; detector_size is (width, height)
        scale_x = im.shape[1] / detector_size[0]
        scale_y = im.shape[0] / detector_size[1]
        box = [box[0] * scale_x, box[1] * scale_y, box[2] * scale_x, box[3] * scale_y]
        
        label = detection["label"]
        score = detection["score"]
        rect = patches.Rectangle((box[0], box[1]), box[2] - box[0], box[3] - box[1], 
                                 linewidth=2, edgecolor='r', facecolor='none')
        ax.add_patch(rect)
        ax.text(box[0], box[1], f"{label} {score:.2f}", backgroundcolor="white", color="red")

    plt.axis('off')
    plt.show()
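
The rescaling step is the part most likely to go wrong (for example by swapping width and height), so here it is in isolation. A sketch with a hypothetical `scale_box` helper; both sizes are given as (width, height), and the 640x427 image size is just an example value.

```python
def scale_box(box, image_size, detector_size):
    """Map [x1, y1, x2, y2] from detector coordinates to original image coordinates.

    Both `image_size` and `detector_size` are (width, height).
    """
    scale_x = image_size[0] / detector_size[0]
    scale_y = image_size[1] / detector_size[1]
    x1, y1, x2, y2 = box
    return [x1 * scale_x, y1 * scale_y, x2 * scale_x, y2 * scale_y]

# A box covering the whole 1344x800 detector input should cover the whole image.
full = scale_box([0, 0, 1344, 800], image_size=(640, 427), detector_size=(1344, 800))
print(full)
```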



In [None]:
import coremltools as ct

model = ct.models.MLModel("work_dir/rtmdet/end2end.mlpackage", compute_units=ct.ComputeUnit.ALL)
pred = model.predict({"input": preprocess_image("third_party/mmdetection/demo/demo.jpg")})

plot_detections("third_party/mmdetection/demo/demo.jpg", postprocess_output(pred), detector_size=(1344, 800))

