# YOLOv5 on SageMaker: Build the Inference Image

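## 1 Overview
This chapter builds the inference image and pushes it to Amazon ECR. If a prebuilt image is already available, you can use it directly and skip the build.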

## 2 Runtime environment
Select the `pytorch_latest_p36` kernel.  
This chapter was tested with boto3 1.17.12 and sagemaker 2.26.0.

In [None]:
import boto3
import sagemaker
print(boto3.__version__)
print(sagemaker.__version__)

## 3 Local inference in the notebook (optional)

In [34]:
# -p makes the command succeed even when /opt/ml already exists
!sudo mkdir -p /opt/ml
!sudo chmod 777 /opt/ml


In [None]:
import os
# exist_ok=True avoids an error if the directory is already there
os.makedirs("/opt/ml/model", exist_ok=True)

In [None]:
!cp -r ../1-training/runs/ /opt/ml/model/
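
As a quick sanity check after the copy, the sketch below lists the PyTorch weight files found under the model directory. It assumes the default YOLOv5 layout, in which training writes weights to `runs/train/exp*/weights/*.pt`; `find_weights` is a hypothetical helper, not part of this project.

```python
from pathlib import Path


def find_weights(model_dir="/opt/ml/model"):
    """Return every .pt file under model_dir, sorted by path (hypothetical helper)."""
    return sorted(Path(model_dir).rglob("*.pt"))


# Usage in the notebook:
# for path in find_weights():
#     print(path)
```

An empty result usually means the training run directory was not where the `cp` command expected it.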

Open a new shell window and run `conda activate pytorch_latest_p36`. Then `cd` into the `2-inference/source` directory (this step is required) and run `python predictor.py`. A successful start prints the following:
```
-------------init_output_dir  /opt/ml/output_dir
 * Serving Flask app "predictor" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
```

In [None]:
# Change bucket and image_uri to point at your own image
!curl -H "Content-Type: application/json" -X POST --data '{"bucket":"junzhong","image_uri":"yolov5/training/images/val/000729.jpeg"}' http://127.0.0.1:5000/invocations
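
The same request can be sent from Python, which makes it easier to inspect the response in the notebook. This is a minimal sketch using only the standard library; `build_invocation_request` is a hypothetical helper, and the response format depends on `predictor.py`, so the usage lines simply print the raw body.

```python
import json
import urllib.request


def build_invocation_request(bucket, image_uri, host="127.0.0.1", port=5000):
    """Build the POST request that the curl command above sends (hypothetical helper)."""
    payload = json.dumps({"bucket": bucket, "image_uri": image_uri}).encode("utf-8")
    return urllib.request.Request(
        url="http://{}:{}/invocations".format(host, port),
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (requires the server to be running):
# req = build_invocation_request("junzhong", "yolov5/training/images/val/000729.jpeg")
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(resp.read().decode("utf-8"))
```

Pass `port=8080` to send the same request to the container started in section 7.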

In [None]:
# Delete the local model file; in a real deployment the model is passed in dynamically from S3
import os
model_file = "source/yolov5s.pt"
if os.path.isfile(model_file):
    os.remove(model_file)

## 4 Amazon Deep Learning Containers

* [List of available container images](https://github.com/aws/deep-learning-containers/blob/master/available_images.md)
* This chapter is based on the PyTorch inference image: `727897471807.dkr.ecr.cn-northwest-1.amazonaws.com.cn/pytorch-inference:1.6.0-gpu-py36-cu101-ubuntu16.04`
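
To make the structure of the image URI explicit, the sketch below splits it into registry account, region, domain, repository, and tag. The tag itself encodes the framework version, device type, Python version, CUDA version, and OS. The regular expression and `parse_ecr_uri` are illustrative assumptions, not an official parser.

```python
import re

# account.dkr.ecr.region.domain/repository:tag
ECR_URI = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\."
    r"(?P<domain>amazonaws\.com(?:\.cn)?)/(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_ecr_uri(uri):
    """Split an ECR image URI into its named parts (hypothetical helper)."""
    match = ECR_URI.match(uri)
    if match is None:
        raise ValueError("not an ECR image URI: {}".format(uri))
    return match.groupdict()


parts = parse_ecr_uri(
    "727897471807.dkr.ecr.cn-northwest-1.amazonaws.com.cn/"
    "pytorch-inference:1.6.0-gpu-py36-cu101-ubuntu16.04"
)
print(parts["region"], parts["repository"], parts["tag"])
```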

## 5 Set the repository name and tag

In [None]:
ecr_repository = 'yolov5-inference'
tag = 'latest'

## 6 Build image

In [None]:
# PyTorch inference base image in the AWS China region; do not modify
base_img='727897471807.dkr.ecr.cn-northwest-1.amazonaws.com.cn/pytorch-inference:1.6.0-gpu-py36-cu101-ubuntu16.04'
# Log in to the ECR registry that hosts the base image; do not modify
!aws ecr get-login-password --region cn-northwest-1 | docker login --username AWS --password-stdin 727897471807.dkr.ecr.cn-northwest-1.amazonaws.com.cn

In [None]:
!docker build -t $ecr_repository:$tag -f Dockerfile --build-arg BASE_IMG=$base_img .

## 7 Local inference with the container (optional)

In [None]:
import os
# exist_ok=True avoids an error if the directory is already there
os.makedirs("model", exist_ok=True)

In [None]:
!cp -r ../1-training/runs/ model/

If the local machine has a GPU, use `nvidia-docker run`; otherwise use `docker run`.
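
The choice can also be made programmatically. The sketch below builds the same argument list as the `docker run` cell that follows and picks the launcher based on whether a GPU seems to be present. It assumes that an installed NVIDIA driver puts `nvidia-smi` on `PATH`; `docker_run_command` is a hypothetical helper.

```python
import shutil


def docker_run_command(image, model_dir, has_gpu=None, port=8080):
    """Return the container launch command as an argument list (hypothetical helper)."""
    if has_gpu is None:
        # Assumption: a working NVIDIA driver install exposes nvidia-smi on PATH
        has_gpu = shutil.which("nvidia-smi") is not None
    runner = "nvidia-docker" if has_gpu else "docker"
    return [
        runner, "run",
        "-v", "{}:/opt/ml/model/".format(model_dir),
        "-p", "{0}:{0}".format(port),
        "-d", "--rm",
        image, "serve",
    ]


# Usage:
# import os, subprocess
# subprocess.run(docker_run_command("yolov5-inference:latest", os.path.abspath("model")), check=True)
```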

In [None]:
!docker run -v $(pwd)/model/:/opt/ml/model/ -p 8080:8080 -d --rm $ecr_repository:$tag serve
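
Because the container starts detached (`-d`), a request sent immediately afterwards can arrive before the server is ready. The sketch below polls a health endpoint until it answers. It assumes the image responds to `GET /ping` with HTTP 200, as SageMaker hosting expects of inference containers; this has not been checked against `predictor.py`, and `wait_until_healthy` is a hypothetical helper.

```python
import time
import urllib.request


def wait_until_healthy(url="http://127.0.0.1:8080/ping", timeout=60.0, interval=1.0):
    """Poll url until it returns HTTP 200; return False if timeout expires first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # URLError, HTTPError, connection resets, and timeouts are all OSError subclasses
            pass
        time.sleep(interval)
    return False


# Usage:
# if not wait_until_healthy():
#     print("container did not become healthy; check `docker logs`")
```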

In [None]:
# Change bucket and image_uri to point at your own image
!curl -H "Content-Type: application/json" -X POST --data '{"bucket":"junzhong","image_uri":"yolov5/training/images/val/000729.jpeg"}' http://127.0.0.1:8080/invocations

## 8 Push to ECR

In [None]:
!aws ecr create-repository --repository-name $ecr_repository
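
Re-running the cell above fails with `RepositoryAlreadyExistsException` once the repository exists. A minimal sketch of an idempotent alternative using the boto3 ECR client follows; `ensure_repository` is a hypothetical helper, and the client is passed in so the function does not depend on a particular session or region.

```python
def ensure_repository(ecr_client, name):
    """Create the ECR repository if it is missing; return True if it was created."""
    try:
        ecr_client.create_repository(repositoryName=name)
        return True
    except ecr_client.exceptions.RepositoryAlreadyExistsException:
        # The repository is already there, which is fine for our purposes
        return False


# Usage in the notebook:
# import boto3
# ensure_repository(boto3.client("ecr"), ecr_repository)
```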

In [None]:
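import boto3
region = boto3.session.Session().region_name
account_id = boto3.client('sts').get_caller_identity().get('Account')
# Registry host in the AWS China partition (note the .com.cn suffix)
registry = '{}.dkr.ecr.{}.amazonaws.com.cn'.format(account_id, region)
image_uri = '{}/{}:{}'.format(registry, ecr_repository, tag)
!docker tag $ecr_repository:$tag $image_uri
# `aws ecr get-login` is not available in AWS CLI v2; use get-login-password, as in section 6
!aws ecr get-login-password --region $region | docker login --username AWS --password-stdin $registry
!docker push $image_uri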
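
The cell above hard-codes the `amazonaws.com.cn` suffix, so it only works in the AWS China regions. If the notebook might also run elsewhere, the domain can be derived from the region name, as in this sketch. `ecr_image_uri` is a hypothetical helper, and the account ID in the example is a placeholder.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build an ECR image URI, choosing the domain from the region (hypothetical helper)."""
    # China regions (cn-north-1, cn-northwest-1) use the amazonaws.com.cn domain
    domain = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return "{}.dkr.ecr.{}.{}/{}:{}".format(account_id, region, domain, repository, tag)


# Placeholder account ID, for illustration only
print(ecr_image_uri("123456789012", "cn-northwest-1", "yolov5-inference"))
```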