# Updating a SageMaker Model Online

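## 1 Overview
This notebook demonstrates an online model update using two pre-trained models.  
Model A is deployed first and is then replaced by model B.  
The walkthrough uses PyTorch. With TensorFlow, only the class used for deployment differs; the commands that swap the model are the same.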

## 2 Runtime environment
Select the pytorch_latest_p36 kernel.  
This notebook was tested with boto3 1.17.109 and sagemaker 2.48.1.

In [None]:
import boto3,sagemaker
print(boto3.__version__)
print(sagemaker.__version__)

## 3 Get the data

In [None]:
from torchvision import datasets, transforms

datasets.MNIST('data', download=True, transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
]))

## 4 Permissions and environment variables

In [None]:
import boto3
from sagemaker.image_uris import retrieve

iam = boto3.client('iam')
roles = iam.list_roles(PathPrefix='/service-role')
role=""
for current_role in roles["Roles"]:
    if current_role["RoleName"].startswith("AmazonSageMaker-ExecutionRole-"):
        role=current_role["Arn"]
        break
# An empty role means no SageMaker execution role was found. Open https://cn-northwest-1.console.amazonaws.cn/sagemaker/home?region=cn-northwest-1#/notebook-instances/create first to create the IAM role.
print(role)

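## 5 Deploy model A

`model.deploy` performs three operations: (1) it creates the model, (2) it creates the endpoint configuration, and (3) it creates the endpoint.

To use your own model, see [Bring your own model](https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html?highlight=deploy#bring-your-own-model).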
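The three operations that `model.deploy` performs can be sketched with the low-level SageMaker client calls `create_model`, `create_endpoint_config` and `create_endpoint`. This is a minimal illustration rather than what the SDK does internally: the helper name `deploy_in_three_steps` and its parameters are hypothetical, it reuses one name for all three resources, and it omits the packaging of `mnist.py` that the SDK handles for you.

```python
def deploy_in_three_steps(client, name, image_uri, model_data_url, role_arn,
                          instance_type="ml.m5.large", instance_count=1):
    """Create a model, an endpoint configuration and an endpoint, all named `name`.

    `client` is assumed to be a boto3 SageMaker client,
    e.g. boto3.client("sagemaker").
    """
    # 1. Create the model: container image + model artifact + execution role.
    client.create_model(
        ModelName=name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # 2. Create the endpoint configuration: which model runs on which instances.
    client.create_endpoint_config(
        EndpointConfigName=name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
        }],
    )
    # 3. Create the endpoint from that configuration.
    return client.create_endpoint(EndpointName=name, EndpointConfigName=name)
```

Keeping the steps separate shows why a later model swap only needs a new endpoint configuration: the endpoint itself is just a pointer to one.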

In [None]:
model_data_a="s3://nwcd-samples/sagemaker/pytorch-mnist/model-acc91.tar.gz"

In [None]:
from sagemaker.pytorch.model import PyTorchModel
endpoint_name_a = "mnist"
model_a = PyTorchModel(role=role,
                        model_data=model_data_a,
                        entry_point="mnist.py",
                        framework_version='1.6.0',
                        py_version='py3')

# This step takes roughly 7-8 minutes.
predictor_a = model_a.deploy(initial_instance_count=1,
                                endpoint_name=endpoint_name_a,
                                instance_type="ml.m5.large"
                                )

## 6 Run inference with model A

In [None]:
import gzip 
import numpy as np
import random
import os

data_dir = 'data/MNIST/raw'
with gzip.open(os.path.join(data_dir, "t10k-images-idx3-ubyte.gz"), "rb") as f:
    images = np.frombuffer(f.read(), np.uint8, offset=16).reshape(-1, 28, 28).astype(np.float32)

In [None]:
image_size = 3
mask1 = random.sample(range(len(images)), image_size) # randomly select some of the test images
mask2 = np.array(mask1, dtype=int)  # np.int is deprecated and removed in newer NumPy releases
data = images[mask2]

In [None]:
from matplotlib import pyplot as plt
plt.figure(figsize=(2,2))
for index, mask in enumerate(mask1):
    plt.subplot(1,image_size,index+1)
    plt.axis('off')
    plt.imshow(images[mask])

In [None]:
response_a = predictor_a.predict(np.expand_dims(data, axis=1))
print(response_a)

## 7 Deploy model B

In [None]:
model_data_b="s3://nwcd-samples/sagemaker/pytorch-mnist/model-acc95.tar.gz"

In [None]:
from sagemaker.pytorch.model import PyTorchModel
endpoint_name_b = "mnistb"
model_b = PyTorchModel(role=role,
                        model_data=model_data_b,
                        entry_point="mnist.py",
                        framework_version='1.6.0',
                        py_version='py3')

# This step takes roughly 7-8 minutes.
predictor_b = model_b.deploy(initial_instance_count=1,
                                endpoint_name=endpoint_name_b,
                                instance_type="ml.m5.large"
                                )

In [None]:
response_b = predictor_b.predict(np.expand_dims(data, axis=1))
print(response_b)

## 8 Switch the endpoint to model B

Updating a model takes about as long as deploying a new one, roughly 7-8 minutes.

In [None]:
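import boto3
smclient = boto3.Session().client(service_name='sagemaker')
# Point endpoint "mnist" at the endpoint configuration "mnistb".
# That configuration was created in step 7 when model B was deployed
# (deploy named it after the endpoint).
response = smclient.update_endpoint(
    EndpointName='mnist',
    EndpointConfigName='mnistb')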
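`update_endpoint` returns before the switch has finished, so it can help to wait for the endpoint to return to `InService` before testing. The following is a minimal polling sketch built on `describe_endpoint`. The helper name `wait_for_endpoint`, the polling interval and the timeout are illustrative assumptions, not part of the original notebook.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=30,
                      timeout_seconds=1800, sleep=time.sleep):
    """Poll describe_endpoint until the endpoint is InService.

    Returns the name of the endpoint configuration now attached to the
    endpoint, so the caller can confirm that the switch took effect.
    `client` is assumed to be a boto3 SageMaker client.
    """
    in_progress = ("Creating", "Updating", "SystemUpdating", "RollingBack")
    waited = 0
    while True:
        desc = client.describe_endpoint(EndpointName=endpoint_name)
        status = desc["EndpointStatus"]
        if status == "InService":
            return desc["EndpointConfigName"]
        if status not in in_progress:
            reason = desc.get("FailureReason", "no reason given")
            raise RuntimeError(f"{endpoint_name} is {status}: {reason}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"{endpoint_name} still {status} after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

With the client from the cell above, `wait_for_endpoint(smclient, 'mnist')` should return `'mnistb'` once the update has completed.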

## 9 Test the updated model

Compare this result with the result from step 6.

In [None]:
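# predictor_a still targets the "mnist" endpoint, which now serves model B.
# A separate variable keeps the step 6 result in response_a for comparison.
response_updated = predictor_a.predict(np.expand_dims(data, axis=1))
print(response_updated)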
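Comparing raw outputs by eye is error-prone, so a small helper can reduce each response to predicted digits and list the images on which the two models disagree. This sketch assumes each response is a sequence of per-class score rows (one row per image, highest score wins), which depends on what `mnist.py` returns. The function names are hypothetical.

```python
def predicted_labels(scores):
    """Return the index of the highest score in each row."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in scores]


def compare_predictions(before, after):
    """Return (labels_before, labels_after, indices where the labels differ)."""
    labels_before = predicted_labels(before)
    labels_after = predicted_labels(after)
    changed = [i for i, (b, a) in enumerate(zip(labels_before, labels_after))
               if b != a]
    return labels_before, labels_after, changed
```

Pass the response saved in step 6 as `before` and the response from this step as `after`. The result from this step should also match the step 7 output (`response_b`), since both now come from model B.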

## 10 Clean up

In [None]:
predictor_a.delete_endpoint()

In [None]:
import boto3
sage = boto3.Session().client(service_name='sagemaker') 
sage.delete_endpoint(EndpointName="mnistb")
sage.delete_endpoint_config(EndpointConfigName="mnist")
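After the cleanup, it may be worth confirming that nothing billable is left behind. The sketch below lists endpoints and endpoint configurations whose names contain a given fragment, using `list_endpoints` and `list_endpoint_configs`. The helper name `find_leftovers` is hypothetical, and it reads only the first page of results, which is assumed to be enough for a small demo account.

```python
def find_leftovers(client, name_fragment):
    """List endpoints and endpoint configs whose names contain `name_fragment`.

    `client` is assumed to be a boto3 SageMaker client. Only the first page
    of each listing is inspected.
    """
    endpoints = client.list_endpoints(NameContains=name_fragment)["Endpoints"]
    configs = client.list_endpoint_configs(
        NameContains=name_fragment)["EndpointConfigs"]
    return {
        "endpoints": [e["EndpointName"] for e in endpoints],
        "endpoint_configs": [c["EndpointConfigName"] for c in configs],
    }
```

For example, `find_leftovers(sage, "mnist")` should return two empty lists once both endpoints and both configurations are gone. Endpoint deletion is asynchronous, so an endpoint may still appear for a short time in the `Deleting` state.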