# Deploy an Object Detection Model Using ModelCI

MMDetection is a well-known open-source object detection toolbox based on PyTorch. Refer to <https://arxiv.org/abs/1906.07155> for more details.

By walking through this tutorial, you will be able to:

- Load a pretrained MMDetection model
- Convert an MMDetection model into ONNX format
- Register and retrieve models with ModelHub

## 1. Prerequisites

To run this notebook, you have to upgrade PyTorch from 1.5.0 to 1.8.0.
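A quick way to confirm that the version requirement above is met is to compare the installed version string against 1.8.0. The sketch below is only an illustration: `parse_version` and `meets_minimum` are hypothetical helpers, not part of ModelCI or PyTorch, and they only look at the leading numeric components of the version string.

```python
import re


def parse_version(version: str) -> tuple:
    """Return the leading numeric components, e.g. '1.8.0+cu102' -> (1, 8, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0
    return tuple(int(part) for part in match.groups(default="0"))


def meets_minimum(version: str, minimum: str = "1.8.0") -> bool:
    """Hypothetical helper: True if `version` is at least `minimum`."""
    return parse_version(version) >= parse_version(minimum)


try:
    import torch
    status = "OK" if meets_minimum(torch.__version__) else "needs an upgrade to 1.8.0"
    print("PyTorch", torch.__version__, status)
except ImportError:
    print("PyTorch is not installed")
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "1.10.0" as older than "1.8.0".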
 
### 1.1 Installation of MMDetection

First, install MMDetection by following the official instructions: <https://mmdetection.readthedocs.io/en/latest/get_started.html#installation>

In [1]:
!pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
!git clone https://github.com/open-mmlab/mmdetection.git
!cd mmdetection && pip install -q -r requirements/build.txt && pip install -q -v -e .

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Looking in links: https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html
Collecting mmcv-full
  Downloading https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/mmcv_full-1.3.2-cp37-cp37m-manylinux1_x86_64.whl (16.6 MB)
     |████████████████████████████████| 16.6 MB 231 kB/s
Installing collected packages: mmcv-full
Successfully installed mmcv-full-1.3.2
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Obtaining file:///home/modelci/NTU/ML-Model-CI/example/notebook/mmdetection
Installing collected packages: mmdet
  Attempting uninstall: mmdet
    Found existing installation: mmdet 2.11.0
    Uninstalling mmdet-2.11.0:
      Successfully uninstalled mmdet-2.11.0
  Running setup.py develop for mmdet
Successfully installed mmdet


### 1.2 Start ModelCI Service
Next, start the ModelCI service. You need to set the environment variables at least once before starting it; refer to the [previous notebook](https://github.com/cap-ntu/ML-Model-CI/blob/master/example/notebook/image_classification_model_deployment.ipynb) for more details.

Run the `modelci service init` command in your own terminal so that the uvicorn server keeps running.

In [2]:
!modelci service init

2021-05-29 10:28:02.981282: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-05-29 10:28:32,723 - ml-modelci Docker Container Manager - INFO - Container name=mongo-43297 stared
2021-05-29 10:28:41,534 - ml-modelci Docker Container Manager - INFO - Container name=cadvisor-61767 started.
2021-05-29 10:28:50,751 - ml-modelci Docker Container Manager - INFO - Container name=dcgm-exporter-29046 started.
2021-05-29 10:29:02,601 - ml-modelci Docker Container Manager - INFO - gpu-metrics-exporter-65952 stared
2021-05-29 10:29:03,620 - modelci backend - INFO - Uvicorn server listening on http://localhost:8000, check full log at /home/modelci/tmp/modelci.log


After about a minute, all the services will have started.

## 2. Build MMDetection Model
### 2.1 Imports
We need to import the following functions:
- `preprocess_example_input`: generates a tensor and meta info from an example image file
- `build_model_from_cfg`: builds a model from a config file and a checkpoint file

In [3]:
from mmdet.core.export import preprocess_example_input, build_model_from_cfg

### 2.2 Model Config

An MMDetection model can be configured with either a dict or a config file. To keep things simple, we use a config file provided by MMDetection.

Note:

- You may need to manually download pretrained model checkpoints from the [MMDetection model zoo](https://github.com/open-mmlab/mmdetection/blob/master/docs/model_zoo.md).
- Only a few MMDetection models can be converted into ONNX format; refer to the [documentation](https://mmdetection.readthedocs.io/en/latest/tutorials/pytorch2onnx.html#list-of-supported-models-exportable-to-onnx) for more details.

In [4]:
config_file = 'mmdetection/configs/retinanet/retinanet_r50_fpn_1x_coco.py'
checkpoint_file = 'retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth'

### 2.3 Build Model
Then we can build our MMDetection model from the configuration above and the checkpoint file we have already downloaded.

In [5]:
model = build_model_from_cfg(config_file, checkpoint_file)

Use load_from_local loader


Before conversion, we need to modify the `forward` function so that it supplies the necessary `**kwargs` parameters, such as `img_metas`.

To obtain valid bbox data during ONNX tracing, we also need to feed the model a tensor generated from an image file rather than a random tensor.

In [6]:
input_config = {
    'input_shape': (1,3,224,224),
    'input_path': 'mmdetection/demo/demo.jpg',
    'normalize_cfg': {
        'mean': (123.675, 116.28, 103.53),
        'std': (58.395, 57.12, 57.375)
        }
}
one_img, one_meta = preprocess_example_input(input_config)

In [7]:
one_img.shape

torch.Size([1, 3, 224, 224])
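The `mean` and `std` values in `normalize_cfg` are the usual ImageNet statistics scaled to the 0-255 pixel range. As a rough illustration, and assuming the preprocessing helper applies the conventional per-channel `(value - mean) / std` transform to RGB pixels, the normalization of a single pixel can be sketched in plain Python. `normalize_pixel` is a hypothetical name used only for this example.

```python
# Per-channel statistics copied from the normalize_cfg above (RGB order assumed)
MEAN = (123.675, 116.28, 103.53)
STD = (58.395, 57.12, 57.375)


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Hypothetical helper: apply (value - mean) / std to each channel of one pixel."""
    return tuple((value - m) / s for value, m, s in zip(rgb, mean, std))


# A pixel equal to the channel means maps to zero in every channel
print(normalize_pixel(MEAN))  # (0.0, 0.0, 0.0)
# A pure white pixel ends up a couple of standard deviations above zero
print(normalize_pixel((255, 255, 255)))
```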

In [8]:
from functools import partial
model.forward = partial(model.forward, img_metas=[[one_meta]], return_loss=False)
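To see why `functools.partial` is used here, consider a toy stand-in for the detector. The exporter only passes the image tensor, so any extra keyword arguments have to be bound in advance. `ToyDetector` below is a made-up class for illustration and does not reflect the real MMDetection `forward` implementation.

```python
from functools import partial


class ToyDetector:
    """Made-up stand-in for a detector whose forward needs extra keyword arguments."""

    def forward(self, img, img_metas=None, return_loss=True):
        if return_loss:
            return {"loss": 0.0}
        return {"num_images": len(img), "metas": img_metas}


toy = ToyDetector()
# Bind the keyword arguments once so that callers only need to pass the image
toy.forward = partial(toy.forward, img_metas=[[{"filename": "demo.jpg"}]], return_loss=False)

result = toy.forward(["fake image tensor"])
print(result)
```

After the rebinding, `toy.forward` behaves like a single-argument function, which is the calling convention a tracer-based exporter expects.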

### 2.4 Save Model

In [9]:
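import torch
from pathlib import Path

# Location that the `weight` field of the YAML file in section 3.1 points to
torch_model_path = Path.home() / '.modelci/RetinaNet/PyTorch-PYTORCH/Object_Detection/1.pth'
# Create the parent directories if they do not exist yet
torch_model_path.parent.mkdir(parents=True, exist_ok=True)
# Save the whole model object rather than only its state dict
torch.save(model, torch_model_path)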
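The save paths used in this notebook appear to follow the pattern `<architecture>/<framework>-<engine>/<task>/<version>.<suffix>` under `~/.modelci`. The sketch below builds such a path from its components; `modelci_weight_path` is a hypothetical helper written for this example, not a ModelCI API, and the layout is inferred only from the paths shown in this notebook.

```python
import tempfile
from pathlib import Path


def modelci_weight_path(root, architecture, framework, engine, task, version, suffix):
    """Hypothetical helper: build <root>/<arch>/<framework>-<engine>/<task>/<version>.<suffix>."""
    path = Path(root) / architecture / f"{framework}-{engine}" / task / f"{version}.{suffix}"
    # Make sure the target directory exists before anything is saved there
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


# Use a temporary directory here so that the example does not touch ~/.modelci
with tempfile.TemporaryDirectory() as tmp_root:
    example = modelci_weight_path(
        tmp_root, "RetinaNet", "PyTorch", "ONNX", "Object_Detection", 1, "onnx"
    )
    relative = example.relative_to(tmp_root).as_posix()
    parent_created = example.parent.is_dir()

print(relative)  # RetinaNet/PyTorch-ONNX/Object_Detection/1.onnx
```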

## 3. Register the Model
We can convert the PyTorch model above into optimized formats, such as ONNX, through ModelCI.

### 3.1 Construct MLModel Instance

The following parameters need to be specified before model conversion:
- inputs: the model input info
- outputs: the model output info
- metric: the evaluation metric data

We can use a YAML file to construct the MLModel instance:

```yaml
weight: "~/.modelci/RetinaNet/PyTorch-PYTORCH/Object_Detection/1.pth"
architecture: RetinaNet
framework: PyTorch
engine: PYTORCH
version: 1
dataset: COCO
task: Object_Detection
metric:
  mAP: 0.365
inputs:
  - name: "input"
    shape: [ -1, 3, 224, 224 ]
    dtype: TYPE_FP32
outputs:
  - name: "BBOX"
    shape: [ -1, 100, 5 ]
    dtype: TYPE_FP32
  - name: "SCORE"
    shape: [ -1, 100 ]
    dtype: TYPE_FP32
convert: true
```

### 3.2 Register


In [10]:
!modelci modelhub publish -f ../retinanet.yml

2021-05-29 19:03:58.430792: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
{'data': {'id': ['60b21fac55849db050b511ba']}, 'status': True}
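The last line printed by the publish command is a Python dict literal containing the new model ID, which the following cells need. If you capture that line as a string, it can be parsed safely with `ast.literal_eval`. This is a minimal sketch: `extract_model_ids` is a hypothetical helper, and it assumes the response always has the `data`/`id`/`status` structure shown above.

```python
import ast

# The response line copied from the cell output above
publish_output = "{'data': {'id': ['60b21fac55849db050b511ba']}, 'status': True}"


def extract_model_ids(line):
    """Hypothetical helper: return the list of model IDs from a publish response line."""
    response = ast.literal_eval(line)  # parses literals only, never executes code
    if not response.get("status"):
        raise RuntimeError(f"publish failed: {response}")
    return response["data"]["id"]


model_ids = extract_model_ids(publish_output)
print(model_ids[0])  # 60b21fac55849db050b511ba
```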


In [11]:
!modelci modelhub detail 60b21fac55849db050b511ba

2021-05-29 19:04:31.137077: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
ID         Architec…  Framework  Version  Pretrained  Metric  Score  Task
                                          Dataset
60b21fac…  RetinaNet  PyTorch    1        COCO        mAP     0.365  Object
                                                                     Detection
Conve…
Model
Info
[

As shown above, MLModelCI attempts to convert PyTorch models into both TorchScript and ONNX formats automatically during registration.

This model cannot be converted into TorchScript format, but it does support ONNX conversion. Several factors, such as the model structure and the code style, can cause a model conversion to fail.

In this case, the automatic ONNX conversion also failed, with the error log `imgs must be a list, but got <class 'torch.Tensor'>`. We therefore need to specify the input image and convert the model manually in the next step.

## 4. Convert the Model

The following steps convert the model we just registered into ONNX format. Note that PyTorch must be upgraded to 1.8.0 to support this conversion.

In [12]:
from modelci.types.bo import IOShape
from modelci.types.trtis_objects import ModelInputFormat
from pathlib import Path
# path where the converted ONNX model will be saved
onnx_model_path = Path.home() / '.modelci/RetinaNet/PyTorch-ONNX/Object_Detection/1.onnx'
# specify input and output shapes; the input matches the 224x224 example tensor built in section 2.3
inputs = [IOShape([-1, 3, 224, 224], dtype=float, name='IMAGE', format=ModelInputFormat.FORMAT_NCHW)]
outputs = [
    IOShape([-1, 100, 5], dtype=float, name='BBOX'),
    IOShape([-1, 100], dtype=float, name='SCORE')
]
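Because the declared shapes are plain lists, it is easy to check them against the example tensor before starting the conversion. The sketch below assumes that `-1` marks a dimension of arbitrary size (here the batch dimension); `shape_matches` is a hypothetical helper and not part of ModelCI.

```python
def shape_matches(declared, actual):
    """Hypothetical helper: True if `actual` fits `declared`, where -1 matches any size."""
    if len(declared) != len(actual):
        return False
    return all(d == -1 or d == a for d, a in zip(declared, actual))


# The example tensor produced earlier has shape (1, 3, 224, 224)
print(shape_matches([-1, 3, 224, 224], (1, 3, 224, 224)))  # True
# A different spatial size is reported as a mismatch
print(shape_matches([-1, 3, 224, 224], (1, 3, 256, 256)))  # False
```

With the objects from this notebook, the second argument would be `tuple(one_img.shape)`.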

In [13]:
from modelci.hub.converter import convert
import torch
convert(
    model=model,
    src_framework='pytorch',
    dst_framework='onnx',
    save_path=onnx_model_path,
    inputs=inputs,
    outputs=outputs,
    model_input=[one_img],
    opset=11,
    optimize=False
)

  dtype=torch.long)
  assert cls_score.size()[-2:] == bbox_pred.size()[-2:]
  if k <= 0 or size <= 0:
  if nms_pre > 0:
  _, topk_inds = max_scores.topk(nms_pre)
  assert pred_bboxes.size(0) == bboxes.size(0)
  assert pred_bboxes.size(1) == bboxes.size(1)
  iou_threshold = torch.tensor([iou_threshold], dtype=torch.float32)
  score_threshold = torch.tensor([score_threshold], dtype=torch.float32)
  nms_pre = torch.tensor(pre_top_k, device=scores.device, dtype=torch.long)
  batch_inds = torch.randint(batch_size, (num_fake_det, 1))
  cls_inds = torch.randint(num_class, (num_fake_det, 1))
  box_inds = torch.randint(num_box, (num_fake_det, 1))
  after_top_k, device=scores.device, dtype=torch.long)
  if nms_after > 0:
  _, topk_inds = scores.topk(nms_after)
  "If indices include negative values, the exported graph will produce incorrect results.")
2021-05-29 19:05:50,486 - converter - INFO - ONNX format converted successfully
2021-05-29 19:05:50,486 - converter - INFO - ONNX format converted 

True

## 5. Deploy the Model as a Service

In [14]:
from modelci.hub.deployer.dispatcher import serve

batch_size = 1
server_name = 'RetinaNet'

serve(save_path=onnx_model_path, device='cuda:0', name=server_name, batch_size=batch_size)

<Container: 3276321e72>

In [15]:
!docker ps | grep RetinaNet

3276321e7280   mlmodelci/onnx-serving:latest-gpu   "/bin/sh -c 'python …"   11 seconds ago   Up 2 seconds    0.0.0.0:8001->8000/tcp, 0.0.0.0:8002->8001/tcp   RetinaNet
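The ports column in the `docker ps` output shows which host ports forward to the serving container. If you need those numbers programmatically, a regular expression over that column is enough. This is a small sketch; `parse_port_mappings` is a hypothetical helper, and it only handles the `ip:host->container/proto` form shown above.

```python
import re

# Matches entries such as "0.0.0.0:8001->8000/tcp"
PORT_PATTERN = re.compile(
    r"(?P<host_ip>[\d.]+):(?P<host_port>\d+)->(?P<container_port>\d+)/(?P<proto>\w+)"
)


def parse_port_mappings(ports_column):
    """Hypothetical helper: return (host_port, container_port, protocol) tuples."""
    return [
        (int(m["host_port"]), int(m["container_port"]), m["proto"])
        for m in PORT_PATTERN.finditer(ports_column)
    ]


mappings = parse_port_mappings("0.0.0.0:8001->8000/tcp, 0.0.0.0:8002->8001/tcp")
print(mappings)  # [(8001, 8000, 'tcp'), (8002, 8001, 'tcp')]
```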


## 6. Profile the Model

First, we retrieve the model and build a client for the ONNX serving platform.

In [16]:
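from modelci.persistence.service import ModelService
# Retrieve the model registered in section 3.2 by its ID
model = ModelService.get_model_by_id("60b21fac55849db050b511ba")
# Assumption: 11 is the numeric value of TYPE_FP32 in the TRTIS DataType enum
model.inputs[0].dtype = 11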

In [17]:
from modelci.hub.client.onnx_client import CVONNXClient
test_img_bytes = torch.rand(3, 224, 224)
onnx_client = CVONNXClient(test_img_bytes, model, batch_num=20, batch_size=1, asynchronous=False)

Then we can initialize a profiler and start the profiling process. Note that the batch size can only be 1 in this case.

In [18]:
from modelci.hub.profiler import Profiler
profiler = Profiler(model_info=model, server_name='RetinaNet', inspector=onnx_client)
dps = profiler.diagnose(device='cuda:0')

 latency: 0.2068 sec throughput: 4.8361 req/sec
 latency: 0.2623 sec throughput: 3.8130 req/sec
 latency: 0.2128 sec throughput: 4.6993 req/sec
 latency: 0.1491 sec throughput: 6.7060 req/sec
 latency: 0.2125 sec throughput: 4.7050 req/sec
 latency: 0.2243 sec throughput: 4.4590 req/sec
 latency: 0.1809 sec throughput: 5.5279 req/sec
 latency: 0.2041 sec throughput: 4.9000 req/sec
 latency: 0.2129 sec throughput: 4.6978 req/sec
 latency: 0.2573 sec throughput: 3.8865 req/sec
 latency: 0.1944 sec throughput: 5.1448 req/sec
 latency: 0.2100 sec throughput: 4.7616 req/sec
 latency: 0.2141 sec throughput: 4.6718 req/sec
 latency: 0.2286 sec throughput: 4.3745 req/sec
 latency: 0.2290 sec throughput: 4.3660 req/sec
 latency: 0.2660 sec throughput: 3.7590 req/sec
 latency: 0.2972 sec throughput: 3.3651 req/sec
 latency: 0.2097 sec throughput: 4.7695 req/sec
 latency: 0.2065 sec throughput: 4.8430 req/sec
 latency: 0.1932 sec throughput: 5.1764 req/sec
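The per-batch lines above can be summarized with a few lines of Python if you copy the output into a string. The sketch below parses three of the lines as a sample; `parse_profile_lines` is a hypothetical helper that assumes the exact `latency: ... sec throughput: ... req/sec` format printed here, not a ModelCI function.

```python
import re
from statistics import mean

LINE_PATTERN = re.compile(r"latency:\s*([\d.]+)\s*sec\s+throughput:\s*([\d.]+)\s*req/sec")


def parse_profile_lines(text):
    """Hypothetical helper: return (latency_sec, throughput_req_per_sec) pairs."""
    return [(float(lat), float(thr)) for lat, thr in LINE_PATTERN.findall(text)]


# Three lines copied from the profiler output above
sample = """
 latency: 0.2068 sec throughput: 4.8361 req/sec
 latency: 0.2623 sec throughput: 3.8130 req/sec
 latency: 0.2128 sec throughput: 4.6993 req/sec
"""

records = parse_profile_lines(sample)
latencies = [lat for lat, _ in records]
print(f"batches parsed: {len(records)}")
print(f"mean latency:   {mean(latencies):.4f} sec")
print(f"max latency:    {max(latencies):.4f} sec")
```

With a batch size of 1, each reported throughput is approximately the reciprocal of the corresponding latency, so the two columns carry the same information.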


testing device: GeForce MX110
total ba