In [1]:
import boto3
import sagemaker
import os
from sagemaker import get_execution_role

region = boto3.session.Session().region_name

role = get_execution_role()
print(region, role)

cn-northwest-1 arn:aws-cn:iam::690704700794:role/service-role/AmazonSageMaker-ExecutionRole-20200430T123312


We will demonstrate SageMaker inference in BYOC (bring your own container) mode, so we first need to package our container.

We use an AWS Deep Learning Container as the base image; the list of available images is at https://aws.amazon.com/cn/releasenotes/available-deep-learning-containers-images/

Remember to change the base image to match the region you are using.
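Because the base image lives in a region-specific registry, the `FROM` line of the Dockerfile has to change with the region. The following is a minimal sketch of composing that URI. The helper names `ecr_dns_suffix` and `dlc_image_uri` are hypothetical, the sketch assumes only the standard and China partitions, and the registry account ID for your region must be taken from the list linked above (the values used here are the ones that appear in this notebook's build log for `cn-northwest-1`).

```python
def ecr_dns_suffix(region):
    """Return the ECR DNS suffix: China regions use amazonaws.com.cn."""
    return 'amazonaws.com.cn' if region.startswith('cn-') else 'amazonaws.com'


def dlc_image_uri(registry_account, region, repository, tag):
    """Compose <account>.dkr.ecr.<region>.<suffix>/<repository>:<tag>."""
    return '{}.dkr.ecr.{}.{}/{}:{}'.format(
        registry_account, region, ecr_dns_suffix(region), repository, tag)


# Account ID and tag as they appear in this notebook's build log (cn-northwest-1).
base_image = dlc_image_uri(
    '727897471807', 'cn-northwest-1',
    'pytorch-inference', '1.5.0-gpu-py36-cu101-ubuntu16.04')
print(base_image)
```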

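# Two ways to build the container (choose one)

*  Build it yourself (slower in the China regions)

*   Use an existing image (recommended)

### For this task you can use the prebuilt Docker image

## Create the ECR repository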

In [2]:
# Run this cell only once to create the repository in ECR
import boto3

account_id = boto3.client('sts').get_caller_identity().get('Account')
ecr_repository = 'ocr-inference-container'
tag = ':latest'
uri_suffix = 'amazonaws.com'
if region in ['cn-north-1', 'cn-northwest-1']:
    uri_suffix = 'amazonaws.com.cn'
inference_repository_uri = '{}.dkr.ecr.{}.{}/{}'.format(account_id, region, uri_suffix, ecr_repository + tag)
print(inference_repository_uri)
ecr = '{}.dkr.ecr.{}.{}'.format(account_id, region, uri_suffix)



690704700794.dkr.ecr.cn-northwest-1.amazonaws.com.cn/ocr-inference-container:latest


In [None]:
!aws ecr create-repository --repository-name $ecr_repository
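The CLI call above fails if the repository already exists, which is why it should only be run once. As an alternative, here is a hedged sketch of an idempotent version using the boto3 ECR client calls `describe_repositories` and `create_repository`. The helper name `ensure_ecr_repository` is hypothetical, and the client is passed in so the function can be tried without AWS credentials.

```python
def ensure_ecr_repository(ecr_client, name):
    """Return the repository URI, creating the repository if it is missing."""
    try:
        found = ecr_client.describe_repositories(repositoryNames=[name])
        return found['repositories'][0]['repositoryUri']
    except ecr_client.exceptions.RepositoryNotFoundException:
        created = ecr_client.create_repository(repositoryName=name)
        return created['repository']['repositoryUri']


# Usage in the notebook (requires AWS credentials):
#   import boto3
#   print(ensure_ecr_repository(boto3.client('ecr'), ecr_repository))
```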

## Build the image

In [4]:
inference_repository_uri

'690704700794.dkr.ecr.cn-northwest-1.amazonaws.com.cn/ocr-inference-container:latest'

### Build and push

In [3]:
!aws ecr get-login-password --region cn-northwest-1 | docker login --username AWS --password-stdin 727897471807.dkr.ecr.cn-northwest-1.amazonaws.com.cn

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded


In [6]:
!aws ecr get-login-password --region $region | docker login --username AWS --password-stdin $ecr

# Build the Docker image, then tag it and push it to ECR
!docker build -t $ecr_repository ./

!docker tag {ecr_repository + tag} $inference_repository_uri
!docker push $inference_repository_uri

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
Sending build context to Docker daemon  126.5kB
Step 1/12 : FROM 727897471807.dkr.ecr.cn-northwest-1.amazonaws.com.cn/pytorch-inference:1.5.0-gpu-py36-cu101-ubuntu16.04
 ---> fc4de87c9036
Step 2/12 : RUN apt-get -y update && apt-get install -y --no-install-recommends          wget          nginx     && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> a75fa09342f2
Step 3/12 : WORKDIR /opt/ml/code
 ---> Using cache
 ---> d83a56c0b7c3
Step 4/12 : COPY source ./
 ---> b6be52ebee86
Step 5/12 : RUN pip install --upgrade pip  -i https://mirrors.163.com/pypi/simple/
 ---> Running in f0e3124bcc22
Looking in indexes: https://mirrors.163.com/pypi/simple/
Requirement already up-to-date: pip in /opt/conda/lib/python3.6/site-packages (20.1)
Removing intermediate container f0e3124bcc22
 ---> 2b16c13201c4
Step 6/12 : RUN pip install -r requirements.txt  -i https://mirrors.163.com/pypi/simple/
 ---> Runnin

Pull the existing image to your local machine and push it to your own ECR repository.

In [7]:
inference_repository_uri

'690704700794.dkr.ecr.cn-northwest-1.amazonaws.com.cn/ocr-inference-container:latest'

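### Note

**Change the model_uri below to the S3 path of the model produced in the training stage, in the form s3://YOUR_BUCKET_NAME/spoken/output/-x-x-x-x-x-x-x/output/model.tar.gz**. You can find the training job in the console; on the job's description page, locate "S3 model artifact" and copy the value.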

In [8]:
image = inference_repository_uri
# Update model_uri to the S3 URI of your own model
model_uri = 's3://dikers-data/sagemaker/ocr-pytorch-train/output/ocr-train-2020-08-03-13-42-44-937/output/model.tar.gz'
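A wrong `model_uri` only shows up later, when the endpoint fails to deploy, so a quick format check can save time. This is a minimal sketch that only inspects the string and does not contact S3; the helper name `looks_like_model_artifact` is hypothetical.

```python
def looks_like_model_artifact(uri):
    """True if uri has the form s3://<bucket>/<prefix>/model.tar.gz."""
    if not uri.startswith('s3://'):
        return False
    # Split "bucket/key..." at the first slash after the scheme.
    bucket, _, key = uri[len('s3://'):].partition('/')
    return bool(bucket) and (key == 'model.tar.gz' or key.endswith('/model.tar.gz'))


# Usage in the notebook:
#   assert looks_like_model_artifact(model_uri), 'model_uri must point at model.tar.gz'
```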



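The inference request is a JSON object with the following fields:

bucket: the S3 bucket that stores the image to run inference on

image_uri: the S3 key of that image, without the bucket name

class_count: the number of language classes, which is tightly coupled to training: use the same number of classes the model was trained with (for example, 5 if it was trained on 5 languages). The example payload in this notebook does not send this field.

**In other words, upload the image to S3 before sending an inference request.**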
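The upload step can be sketched with the boto3 S3 client's `upload_file(Filename, Bucket, Key)` call. The helper name `upload_image` and the default prefix `images` are assumptions for illustration; the client is passed in as an argument so the function itself does not need AWS credentials to be imported.

```python
import os
import posixpath


def upload_image(s3_client, local_path, bucket, prefix='images'):
    """Upload local_path under prefix and return the object key (no bucket name)."""
    key = posixpath.join(prefix, os.path.basename(local_path))
    s3_client.upload_file(local_path, bucket, key)
    return key


# Usage in the notebook (requires AWS credentials and a local file):
#   import boto3
#   image_uri = upload_image(boto3.client('s3'), 'test013.jpg', 'dikers-data')
```

The returned key is the value to send as the image location in the request, since the endpoint expects it without the bucket name.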

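## Test file format

`bucket` is the name of the bucket that holds the image to run inference on, and `image_uri` is that file's location in S3 without the bucket name, i.e. only the key (prefix plus file name). For example, for a file named test.jpg uploaded to the `image` directory of the S3 bucket `test_bucket`:

```

#s3://test_bucket/image/test.jpg

bucket = 'test_bucket'
image_uri = 'image/test.jpg'

```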
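The split between bucket and key can also be derived from a full `s3://` URI. A minimal sketch, assuming the request fields are `bucket` and `image_uri` as in the payload built in the next cell; the helper name `payload_from_s3_uri` is hypothetical.

```python
import json
from urllib.parse import urlparse


def payload_from_s3_uri(s3_uri):
    """Turn 's3://bucket/some/key' into the JSON request body for the endpoint."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != 's3' or not parsed.netloc:
        raise ValueError('not an S3 URI: {!r}'.format(s3_uri))
    # netloc is the bucket; the path minus its leading slash is the key.
    return json.dumps({'bucket': parsed.netloc,
                       'image_uri': parsed.path.lstrip('/')})


print(payload_from_s3_uri('s3://test_bucket/image/test.jpg'))
```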


In [9]:
import json
bucket = 'dikers-data'
image_uri = 'images/test013.jpg'


test_data = {
    'bucket' : bucket,
    'image_uri' : image_uri
}
payload = json.dumps(test_data)


In [10]:
print(payload)

{"bucket": "dikers-data", "image_uri": "images/test013.jpg"}


### Method 1: Using the SageMaker Python SDK

In [16]:
# The values below can be changed as needed

initial_instance_count = 1
instance_type = 'ml.m5.large'
endpoint_name= 'ocr-endpoint'

In [17]:
# Create the model

from sagemaker.model import Model
image = inference_repository_uri
model = Model(
            model_data=model_uri, 
            role=role,
            image=image)

Parameter image will be renamed to image_uri in SageMaker Python SDK v2.
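The warning above means the same cell breaks after upgrading to SDK v2, where the keyword is `image_uri`. One way to keep the cell working on both versions is to pick the keyword from the installed version. This is only a sketch: the helper name `model_kwargs` is hypothetical, and it assumes the version string starts with the major version number.

```python
def model_kwargs(image, model_data, role, sdk_version):
    """Build Model(...) keyword arguments for SageMaker Python SDK v1 or v2+."""
    major = int(sdk_version.split('.')[0])
    # v1 calls the container image "image"; v2 renamed it to "image_uri".
    image_key = 'image' if major < 2 else 'image_uri'
    return {image_key: image, 'model_data': model_data, 'role': role}


# Usage in the notebook:
#   import sagemaker
#   from sagemaker.model import Model
#   model = Model(**model_kwargs(image, model_uri, role, sagemaker.__version__))
```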


In [18]:
# Create the endpoint

model.deploy(
    initial_instance_count=initial_instance_count,
    instance_type=instance_type,
    endpoint_name=endpoint_name)

------------------!

In [19]:
# Create a predictor for inference

new_predictor = sagemaker.predictor.RealTimePredictor(
    endpoint=endpoint_name,
    content_type='application/json')

In [20]:
# Send the inference request

new_sm_response = new_predictor.predict(payload)

print(json.loads(new_sm_response.decode()))

{'name': 'hello'}


### Method 2: Using the boto3 SDK

In [None]:
# The values below can be changed as needed

model_name = 'ocr-demo'
endpoint_config_name = 'ocr-endpoint'
variant_name = 'ocr-endpoint'
initial_instance_count = 1
instance_type = 'ml.m5.large'
# Endpoint names may contain only letters, digits and hyphens
endpoint_name = 'ocr-endpoint'

In [None]:
import boto3

sm_client = boto3.client('sagemaker')

# create model object

spl_model_demo = sm_client.create_model(
    ModelName=model_name,
    PrimaryContainer={
        'Image': image,
        'Mode': 'SingleModel',
        'ModelDataUrl': model_uri,
    },
    ExecutionRoleArn= role, 
    EnableNetworkIsolation=False
)

In [None]:
# create endpoint config

spl_endpoint_config = sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            'VariantName': variant_name,
            'ModelName': model_name,
            'InitialInstanceCount': initial_instance_count,
            'InstanceType': instance_type
        },
    ]
)

In [None]:
# create endpoint

response = sm_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name
)

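Wait for the previous step to finish before sending inference requests. Creating the endpoint takes about 10 minutes; you can check its status in the console, and the endpoint is ready to use once it shows InService.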
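Instead of watching the console, the status can be polled from code with the SageMaker client's `describe_endpoint` call, which reports `EndpointStatus`. A minimal sketch follows; the helper name `wait_until_in_service` and its default intervals are assumptions, and the client and `sleep` function are passed in so the loop can be exercised without AWS.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30,
                          timeout_seconds=1800, sleep=time.sleep):
    """Poll describe_endpoint until the endpoint is InService.

    Raises RuntimeError if deployment fails, TimeoutError if it takes too long.
    """
    waited = 0
    while True:
        status = sm_client.describe_endpoint(
            EndpointName=endpoint_name)['EndpointStatus']
        if status == 'InService':
            return status
        if status == 'Failed':
            raise RuntimeError('endpoint {} failed to deploy'.format(endpoint_name))
        if waited >= timeout_seconds:
            raise TimeoutError('endpoint {} is still {}'.format(endpoint_name, status))
        sleep(poll_seconds)
        waited += poll_seconds


# Usage in the notebook:
#   wait_until_in_service(sm_client, endpoint_name)
```

boto3 also ships a built-in waiter, `sm_client.get_waiter('endpoint_in_service')`, which may be preferable when custom timeout handling is not needed.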

Inference code:

In [21]:
import boto3
import json
import time

region_name='cn-northwest-1'
profile_name='default'

session = boto3.session.Session(region_name=region_name, profile_name=profile_name)
client = session.client('sagemaker-runtime')

start_time = time.time()
spl_response=client.invoke_endpoint(EndpointName=endpoint_name,
        Body=payload,
        ContentType='application/json')
end_time = time.time()

print('time cost %s s' %(end_time - start_time))
print(json.loads(spl_response['Body'].read().decode()))

time cost 0.1443190574645996 s
{'name': 'hello'}
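For repeated calls, the request, timing and decoding steps from the cell above can be wrapped in one function. This is a sketch under the assumption that the endpoint returns a UTF-8 JSON body, as in the output shown; the helper name `invoke_json` is hypothetical and the runtime client is passed in as an argument.

```python
import json
import time


def invoke_json(runtime_client, endpoint_name, request):
    """Send request (a dict) as JSON; return (decoded_response, seconds)."""
    start = time.perf_counter()
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        Body=json.dumps(request),
        ContentType='application/json')
    body = response['Body'].read()  # reading the stream is part of the latency
    elapsed = time.perf_counter() - start
    return json.loads(body.decode('utf-8')), elapsed


# Usage in the notebook:
#   result, seconds = invoke_json(client, endpoint_name, test_data)
#   print('time cost %s s' % seconds, result)
```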
