# Creating an Endpoint with sklearn on SageMaker
Train a model with sklearn and deploy it to an Endpoint.

Reference: https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-python-sdk/scikit_learn_iris/scikit_learn_estimator_example_with_batch_transform.ipynb

In [1]:
import sagemaker
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()
role = get_execution_role()

## Preparing the data

In [2]:
import boto3
import numpy as np
import pandas as pd
import os

# Download the Iris dataset from the `sagemaker-sample-files` S3 bucket.
os.makedirs("./data", exist_ok=True)

s3_client = boto3.client("s3")
s3_client.download_file(
    "sagemaker-sample-files", "datasets/tabular/iris/iris.data", "./data/iris.csv"
)

# Encode the class names as numeric labels and move the label to the first column.
df_iris = pd.read_csv("./data/iris.csv", header=None)
df_iris[4] = df_iris[4].map({"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2})
iris = df_iris[[4, 0, 1, 2, 3]].to_numpy()
# Overwrite the downloaded file with the converted data.
np.savetxt("./data/iris.csv", iris, delimiter=",", fmt="%1.1f, %1.3f, %1.3f, %1.3f, %1.3f")

In [3]:
WORK_DIRECTORY = "data"

train_input = sagemaker_session.upload_data(
    WORK_DIRECTORY, key_prefix=f"workshop/sklearn-endpoint/{WORK_DIRECTORY}"
)

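## Training with sklearn
The model is implemented with sklearn in a script named `sklearn_custom_ml.py`, which the Estimator loads as its entry point.
`sklearn_custom_ml.py` implements the following parameters (passed via argparse).
Hyperparameters can also be passed to the script as additional parameters.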

- --output-data-dir
- --model-dir
- --train
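
As a rough illustration of the argument handling described above, here is a minimal sketch of how such an entry-point script might parse its parameters. It is not the actual `sklearn_custom_ml.py`: the `parse_args` helper and the hard-coded fallback paths are assumptions made for illustration, and the `--max_leaf_nodes` flag simply mirrors the hyperparameter passed to the Estimator in this notebook. `SM_OUTPUT_DATA_DIR`, `SM_MODEL_DIR`, and `SM_CHANNEL_TRAIN` are the environment variables set inside the SageMaker training container.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the arguments a SageMaker training script typically receives (illustrative helper)."""
    parser = argparse.ArgumentParser()

    # Hyperparameters arrive as ordinary command-line flags.
    parser.add_argument("--max_leaf_nodes", type=int, default=-1)

    # Directories default to the SM_* environment variables.
    # The literal fallback paths are assumptions so the sketch also runs locally.
    parser.add_argument(
        "--output-data-dir",
        type=str,
        default=os.environ.get("SM_OUTPUT_DATA_DIR", "/opt/ml/output/data"),
    )
    parser.add_argument(
        "--model-dir",
        type=str,
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"),
    )
    parser.add_argument(
        "--train",
        type=str,
        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"),
    )
    return parser.parse_args(argv)


# Local usage example with explicit arguments.
args = parse_args(["--train", "./data", "--max_leaf_nodes", "30"])
print(args.train, args.max_leaf_nodes, args.model_dir)
```

Passing an explicit list to `parse_args` makes the parsing logic easy to check locally before running a training job.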

In [4]:
from sagemaker.sklearn.estimator import SKLearn

sklearn = SKLearn(
    entry_point="sklearn_custom_ml.py",
    framework_version="1.0-1",  # 0.20.0, 0.23-1, etc. can also be selected
    instance_type="ml.c4.xlarge",
    role=role,
    sagemaker_session=sagemaker_session,
    hyperparameters={"max_leaf_nodes": 30},
)

In [5]:
sklearn.fit({"train": train_input})

2022-07-29 21:40:21 Starting - Starting the training job...
2022-07-29 21:40:47 Starting - Preparing the instances for training
ProfilerReport-1659130821: InProgress
.........
2022-07-29 21:42:20 Downloading - Downloading input data...
2022-07-29 21:42:40 Training - Downloading the training image...
2022-07-29 21:43:20 Training - Training image download completed. Training in progress..
2022-07-29 21:43:22,331 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
2022-07-29 21:43:22,335 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2022-07-29 21:43:22,345 sagemaker_sklearn_container.training INFO     Invoking user training script.
2022-07-29 21:43:22,842 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2022-07-29 21:43:22,855 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2022-07-29 21:43:22,871 sagema

## Creating the Endpoint
For simplicity, the endpoint is created here with the `deploy` method.
If needed, the model and endpoint_config can also be created separately.
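
For the separate route, the following is a minimal sketch of creating the endpoint configuration and the endpoint explicitly. It assumes a SageMaker model named `model_name` has already been registered. The helper names, the `"AllTraffic"` variant name, and the resource names in the usage note are hypothetical. `create_endpoint_config` and `create_endpoint` are operations of the boto3 `sagemaker` client.

```python
def build_endpoint_config_request(config_name, model_name,
                                  instance_type="ml.m5.xlarge", instance_count=1):
    """Build keyword arguments for the `create_endpoint_config` call (illustrative helper)."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # hypothetical variant name
                "ModelName": model_name,
                "InitialInstanceCount": instance_count,
                "InstanceType": instance_type,
            }
        ],
    }


def build_endpoint_request(endpoint_name, config_name):
    """Build keyword arguments for the `create_endpoint` call (illustrative helper)."""
    return {"EndpointName": endpoint_name, "EndpointConfigName": config_name}


def create_endpoint_from_model(sm_client, endpoint_name, config_name, model_name):
    """Create the endpoint config first, then the endpoint that refers to it."""
    sm_client.create_endpoint_config(
        **build_endpoint_config_request(config_name, model_name)
    )
    sm_client.create_endpoint(**build_endpoint_request(endpoint_name, config_name))
```

With real credentials, `sm_client` would be `boto3.client("sagemaker")`, for example `create_endpoint_from_model(boto3.client("sagemaker"), "my-endpoint", "my-endpoint-config", "my-model")`. Separating the request builders from the API calls keeps the configuration easy to inspect before anything is created.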

In [6]:
predictor = sklearn.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

-----!

## Verifying the Endpoint

In [7]:
endpoint_name = predictor.endpoint_name

In [8]:
import boto3
client = boto3.client("sagemaker-runtime")

response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    Body="5.1,3.5,1.4,0.2\n5.7,2.6,3.5,1.0",
    ContentType='text/csv',
    Accept='application/json'
)

In [9]:
response["Body"].read()

b'[0.0, 1.0]'
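
The request body and the response above can also be handled with small helpers. The following is a minimal sketch, assuming the endpoint returns a JSON list of numeric labels as shown above. `rows_to_csv`, `decode_predictions`, and `LABEL_NAMES` are names introduced here for illustration. The label names are the inverse of the mapping used in the data preparation step.

```python
import json

# Inverse of the label mapping applied when the data was prepared.
LABEL_NAMES = {0: "Iris-setosa", 1: "Iris-versicolor", 2: "Iris-virginica"}


def rows_to_csv(rows):
    """Serialize feature rows into the newline-separated CSV body sent to the endpoint."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def decode_predictions(body_bytes):
    """Convert the JSON response body into a list of class names."""
    labels = json.loads(body_bytes)
    return [LABEL_NAMES[int(label)] for label in labels]


body = rows_to_csv([[5.1, 3.5, 1.4, 0.2], [5.7, 2.6, 3.5, 1.0]])
print(body)
print(decode_predictions(b"[0.0, 1.0]"))  # ['Iris-setosa', 'Iris-versicolor']
```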

## Deleting the Endpoint

In [10]:
predictor.delete_endpoint()