# Part 4 : Deploy, Run Inference, Interpret Inference

<a id='overview-4'></a>

## [Overview](./0-AutoClaimFraudDetection.ipynb)
* [Notebook 0 : Overview, Architecture and Data Exploration](./0-AutoClaimFraudDetection.ipynb)
* [Notebook 1: Data Prep, Process, Store Features](./1-data-prep-e2e.ipynb)
* [Notebook 2: Train, Check Bias, Tune, Record Lineage, and Register a Model](./2-lineage-train-assess-bias-tune-registry-e2e.ipynb)
* [Notebook 3: Mitigate Bias, Train New Model, Store in Registry](./3-mitigate-bias-train-model2-registry-e2e.ipynb)
* **[Notebook 4: Deploy Model, Run Predictions](./4-deploy-run-inference-e2e.ipynb)**
  * **[Architecture](#deploy)**
  * **[Deploy an approved model and Run Inference via Feature Store](#deploy-model)**
  * **[Create a Predictor](#predictor)**
  * **[Run Predictions from Online FeatureStore](#run-predictions)**
* [Notebook 5 : Create and Run an End-to-End Pipeline to Deploy the Model](./5-pipeline-e2e.ipynb)

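In this section of the end-to-end use case, we deploy the mitigated model, which is the final production model for the fraud detection use case. We show how to run inference and how to use Clarify to interpret, or "explain", the model.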

### Load stored variables

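If you have run this notebook before, you can reuse the resources you already created in AWS. Run the cell below to load the previously created variables. You should see a printout of the existing variables. If nothing is printed, this is probably the first time you have run the notebook.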

In [1]:
%store -r
%store

Stored variables and their in-db values:
bucket                              -> 'sagemaker-us-east-2-143656149352'
claims_fg_name                      -> 'fraud-detect-demo-claims'
claims_table                        -> 'fraud-detect-demo-claims-1629447691'
clarify_bias_job_1_name             -> 'Clarify-Bias-2021-08-20-08-46-43-569'
col_order                           -> ['fraud', 'customer_gender_female', 'customer_gend
customers_fg_name                   -> 'fraud-detect-demo-customers'
customers_table                     -> 'fraud-detect-demo-customers-1629447692'
database_name                       -> 'sagemaker_featurestore'
hyperparameters                     -> {'max_depth': '3', 'eta': '0.2', 'objective': 'bin
model_1_name                        -> 'fraud-detect-demo-xgboost-pre-smote'
model_2_name                        -> 'fraud-detect-demo-xgboost-post-smote'
mp2_arn                             -> 'arn:aws:sagemaker:us-east-2:143656149352:model-pa
mpg_name                  

**<font color='red'>Important</font>: You must run the previous notebooks before you can retrieve variables with the StoreMagic command.**
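
Because later cells fail with a `NameError` when a stored variable is absent, it can help to check for the names this notebook relies on right after `%store -r`. The following is a minimal sketch, not part of the original notebook: `missing_variables` and `REQUIRED_VARIABLES` are hypothetical names, and the list covers only variables that are referenced later in this notebook.

```python
# Hypothetical helper (not from the original notebook): report which of the
# variables restored by %store -r are absent from a namespace.
REQUIRED_VARIABLES = [
    "model_2_name",
    "mpg_name",
    "claims_fg_name",
    "customers_fg_name",
    "col_order",
]


def missing_variables(namespace, required=REQUIRED_VARIABLES):
    """Return the names in `required` that are not defined in `namespace`."""
    return [name for name in required if name not in namespace]


# In the notebook you would pass globals(); a plain dict stands in for it here.
example_namespace = {
    "model_2_name": "fraud-detect-demo-xgboost-post-smote",
    "col_order": ["fraud"],
}
missing = missing_variables(example_namespace)
if missing:
    print("Run the earlier notebooks first; missing:", ", ".join(missing))
```

In the notebook itself, calling `missing_variables(globals())` after the `%store -r` cell would list anything that still needs to be created by an earlier notebook.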

### Import libraries

In [2]:
import json
import time
import boto3
import sagemaker
import numpy as np
import pandas as pd
import awswrangler as wr

### Set region, boto3 and SageMaker SDK variables

In [3]:
# Defaults to the region of the current SageMaker session; assign a different value to `region` to change it
region = sagemaker.Session().boto_region_name
print("Using AWS Region: {}".format(region))

Using AWS Region: us-east-2


In [4]:
boto3.setup_default_session(region_name=region)

boto_session = boto3.Session(region_name=region)

s3_client = boto3.client('s3', region_name=region)

sagemaker_boto_client = boto_session.client('sagemaker')

sagemaker_session = sagemaker.session.Session(
    boto_session=boto_session,
    sagemaker_client=sagemaker_boto_client)

sagemaker_role = sagemaker.get_execution_role()

account_id = boto3.client('sts').get_caller_identity()["Account"]

In [5]:
# variables used for parameterizing the notebook run
endpoint_name = f"{model_2_name}-endpoint"
endpoint_instance_count = 1
endpoint_instance_type = "ml.m4.xlarge"

predictor_instance_count = 1
predictor_instance_type = "ml.c5.xlarge"
batch_transform_instance_count = 1
batch_transform_instance_type = "ml.c5.xlarge"

<a id ='deploy'> </a>
## Architecture for this ML Lifecycle Stage : Deploy Model, Run Predictions
[overview](#overview-4)

![train-assess-tune-register](./images/e2e-3-pipeline-v3b.png)

<a id ='deploy-model'></a>

## Deploy an approved model and make predictions via Feature Store

[overview](#overview-4)

#### Approve the second model

In a real-world MLOps lifecycle, a model package is approved only after it has been evaluated by data scientists, subject matter experts, and auditors.

In [6]:
second_model_package = sagemaker_boto_client.list_model_packages(ModelPackageGroupName=mpg_name)[
    "ModelPackageSummaryList"
][0]
model_package_update = {
    "ModelPackageArn": second_model_package["ModelPackageArn"],
    "ModelApprovalStatus": "Approved",
}

update_response = sagemaker_boto_client.update_model_package(**model_package_update)
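
The cell above takes the first entry of `ModelPackageSummaryList`, which depends on the order in which the API returns the packages. As a hedged alternative, the sketch below picks the most recently created package explicitly by its `CreationTime` field and builds the same approval request. The helper names `latest_model_package` and `approval_request` are hypothetical, and the ARNs and timestamps are made-up placeholders.

```python
from datetime import datetime, timezone


def latest_model_package(summaries):
    """Return the summary with the most recent CreationTime."""
    if not summaries:
        raise ValueError("No model packages found in the group")
    return max(summaries, key=lambda summary: summary["CreationTime"])


def approval_request(summary):
    """Build the keyword arguments for update_model_package."""
    return {
        "ModelPackageArn": summary["ModelPackageArn"],
        "ModelApprovalStatus": "Approved",
    }


# Made-up entries shaped like the items of "ModelPackageSummaryList".
example_summaries = [
    {
        "ModelPackageArn": "arn:example:model-package/demo/1",
        "CreationTime": datetime(2021, 8, 20, 8, 0, tzinfo=timezone.utc),
    },
    {
        "ModelPackageArn": "arn:example:model-package/demo/2",
        "CreationTime": datetime(2021, 8, 20, 9, 0, tzinfo=timezone.utc),
    },
]
request = approval_request(latest_model_package(example_summaries))
print(request)
```

In the notebook, the resulting dictionary could be passed as `sagemaker_boto_client.update_model_package(**request)`.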

#### Create an endpoint config and an endpoint
Deploy the endpoint. This can take about 8 minutes.

In [7]:
primary_container = {'ModelPackageName': second_model_package['ModelPackageArn']}
endpoint_config_name=f'{model_2_name}-endpoint-config'
existing_configs = len(sagemaker_boto_client.list_endpoint_configs(NameContains=endpoint_config_name, MaxResults = 30)['EndpointConfigs'])

if existing_configs == 0:
    create_ep_config_response = sagemaker_boto_client.create_endpoint_config(
        EndpointConfigName=endpoint_config_name,
        ProductionVariants=[{
            'InstanceType': endpoint_instance_type,
            'InitialVariantWeight': 1,
            'InitialInstanceCount': endpoint_instance_count,
            'ModelName': model_2_name,
            'VariantName': 'AllTraffic'
        }]
    )
    %store endpoint_config_name

Stored 'endpoint_config_name' (str)


In [8]:
existing_endpoints = sagemaker_boto_client.list_endpoints(NameContains=endpoint_name, MaxResults = 30)['Endpoints']
if not existing_endpoints:
    create_endpoint_response = sagemaker_boto_client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name)
    %store endpoint_name

endpoint_info = sagemaker_boto_client.describe_endpoint(EndpointName=endpoint_name)
endpoint_status = endpoint_info['EndpointStatus']

while endpoint_status == 'Creating':
    endpoint_info = sagemaker_boto_client.describe_endpoint(EndpointName=endpoint_name)
    endpoint_status = endpoint_info['EndpointStatus']
    print('Endpoint status:', endpoint_status)
    if endpoint_status == 'Creating':
        time.sleep(60)

Stored 'endpoint_name' (str)
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: InService
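
The polling loop above runs until the status is no longer `Creating`, with no upper bound on the wait. The following sketch shows one way to add a timeout. It is an illustration under assumptions: `wait_for_endpoint` is a hypothetical helper, the default intervals are arbitrary, and `fake_describe` only stands in for `describe_endpoint` so that the example runs without AWS access.

```python
import time


def wait_for_endpoint(describe, endpoint_name, poll_seconds=60,
                      timeout_seconds=1800, sleep=time.sleep):
    """Poll `describe` until the endpoint leaves 'Creating' or the timeout elapses."""
    waited = 0
    while True:
        status = describe(EndpointName=endpoint_name)["EndpointStatus"]
        print("Endpoint status:", status)
        if status != "Creating":
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"{endpoint_name} still Creating after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Stand-in for sagemaker_boto_client.describe_endpoint (assumption for the demo).
_statuses = iter(["Creating", "Creating", "InService"])


def fake_describe(EndpointName):
    return {"EndpointStatus": next(_statuses)}


# Skip the real sleep so the demo finishes immediately.
final_status = wait_for_endpoint(fake_describe, "demo-endpoint", sleep=lambda seconds: None)
```

In the notebook you would pass `sagemaker_boto_client.describe_endpoint` as `describe` and check that the returned status is `InService` before creating the predictor. boto3 also provides an `endpoint_in_service` waiter for the SageMaker client, which may be a simpler option.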


<a id='predictor'> </a>

### Create a predictor

In [9]:
predictor = sagemaker.predictor.Predictor(
    endpoint_name=endpoint_name, sagemaker_session=sagemaker_session
)

### Sample a claim from the test data

In [10]:
dataset = pd.read_csv("data/dataset.csv")
train = dataset.sample(frac=0.8, random_state=0)
test = dataset.drop(train.index)
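# Take the scalar explicitly; calling int() on a one-element Series is deprecated in recent pandas
sample_policy_id = int(test.sample(1)["policy_id"].iloc[0])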

In [11]:
test.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000 entries, 0 to 4997
Data columns (total 48 columns):
 #   Column                           Non-Null Count  Dtype  
---  ------                           --------------  -----  
 0   Unnamed: 0                       1000 non-null   int64  
 1   policy_id                        1000 non-null   int64  
 2   customer_gender_female           1000 non-null   int64  
 3   customer_gender_male             1000 non-null   int64  
 4   policy_state_or                  1000 non-null   int64  
 5   injury_claim                     1000 non-null   float64
 6   policy_state_ca                  1000 non-null   int64  
 7   collision_type_na                1000 non-null   int64  
 8   police_report_available          1000 non-null   int64  
 9   incident_month                   1000 non-null   int64  
 10  policy_state_nv                  1000 non-null   int64  
 11  customer_age                     1000 non-null   int64  
 12  collision_type_front

### Get sample's claim data from online feature store

The code cell below simulates retrieving data in real time from a customer's insurance claim submission.

In [12]:
featurestore_runtime = boto_session.client(
    service_name="sagemaker-featurestore-runtime", region_name=region
)

feature_store_session = sagemaker.Session(
    boto_session=boto_session,
    sagemaker_client=sagemaker_boto_client,
    sagemaker_featurestore_runtime_client=featurestore_runtime,
)

<a id='run-predictions'> </a>
## Run Predictions on Multiple Claims

[overview](#overview-4)

In [13]:
import datetime

timer = []
MAXRECS = 100


def barrage_of_inference():
    # Pick a random policy from the held-out test set
    sample_policy_id = int(test.sample(1)["policy_id"].iloc[0])

    # Fetch the claim features from the online feature store
    claims_response = featurestore_runtime.get_record(
        FeatureGroupName=claims_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
    )

    if not claims_response.get("Record"):
        # Fail early; continuing would leave claims_df undefined
        raise LookupError(f"No claims record returned for policy {sample_policy_id}")

    claims_record = claims_response["Record"]
    claims_df = pd.DataFrame(claims_record).set_index("FeatureName")

    # Time only the customers lookup against the online feature store
    t0 = datetime.datetime.now()

    customers_response = featurestore_runtime.get_record(
        FeatureGroupName=customers_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
    )

    t1 = datetime.datetime.now()

    customer_record = customers_response["Record"]
    customer_df = pd.DataFrame(customer_record).set_index("FeatureName")

    # Order the features as in training and drop the label column
    blended_df = pd.concat([claims_df, customer_df]).loc[col_order].drop("fraud")
    data_input = ",".join(blended_df["ValueAsString"])

    results = predictor.predict(data_input, initial_args={"ContentType": "text/csv"})
    prediction = json.loads(results)

    # Record the elapsed lookup time in seconds
    timer.append((t1 - t0).total_seconds())

    return sample_policy_id, prediction


for i in range(MAXRECS):
    sample_policy_id, prediction = barrage_of_inference()
    print(f"Probability the claim from policy {int(sample_policy_id)} is fraudulent:", prediction)

Probability the claim from policy 4275 is fraudulent: 0.004453680943697691
Probability the claim from policy 6 is fraudulent: 0.0025477695744484663
Probability the claim from policy 4203 is fraudulent: 0.0060280426405370235
Probability the claim from policy 3032 is fraudulent: 0.025583526119589806
Probability the claim from policy 3322 is fraudulent: 0.005415594670921564
Probability the claim from policy 435 is fraudulent: 0.008797605521976948
Probability the claim from policy 4280 is fraudulent: 0.008219864219427109
Probability the claim from policy 3598 is fraudulent: 0.0020912406034767628
Probability the claim from policy 2060 is fraudulent: 0.0026974573265761137
Probability the claim from policy 2445 is fraudulent: 0.03141326084733009
Probability the claim from policy 1580 is fraudulent: 0.018626952543854713
Probability the claim from policy 2570 is fraudulent: 0.012887710705399513
Probability the claim from policy 1571 is fraudulent: 0.009747525677084923
Probability 

In [14]:
timer

[0.055528,
 0.010904,
 0.010296,
 0.015531,
 0.009257,
 0.011363,
 0.010686,
 0.01114,
 0.010639,
 0.010603,
 0.009144,
 0.009253,
 0.008001,
 0.008496,
 0.01081,
 0.008759,
 0.009071,
 0.009832,
 0.009386,
 0.01023,
 0.009602,
 0.008734,
 0.009512,
 0.01012,
 0.008678,
 0.008363,
 0.007918,
 0.009214,
 0.008701,
 0.010373,
 0.007721,
 0.008937,
 0.008652,
 0.014426,
 0.008472,
 0.010487,
 0.00951,
 0.01098,
 0.009535,
 0.009695,
 0.0086,
 0.009866,
 0.008824,
 0.008887,
 0.008709,
 0.009451,
 0.009261,
 0.010371,
 0.009849,
 0.008604,
 0.009123,
 0.009151,
 0.009229,
 0.00915,
 0.007966,
 0.00862,
 0.009586,
 0.010286,
 0.009264,
 0.00883,
 0.009874,
 0.00861,
 0.007907,
 0.008505,
 0.009337,
 0.008142,
 0.008959,
 0.00833,
 0.010141,
 0.007726,
 0.008245,
 0.010703,
 0.008917,
 0.008525,
 0.00851,
 0.008354,
 0.008517,
 0.008341,
 0.008291,
 0.007897,
 0.010368,
 0.007758,
 0.014702,
 0.008001,
 0.008769,
 0.00812,
 0.008633,
 0.008526,
 0.008281,
 0.008447,
 0.008468,
 0.008122,
 0.

Note: The `timer` list above records the first call and then the subsequent calls to the online feature store. The values are in seconds.

In [15]:
# Summarize the recorded feature store latencies (in seconds)
arr = np.array(timer)
print(
    "p95: {}, p99: {}, mean: {} for {} distinct feature store gets".format(
        np.percentile(arr, 95), np.percentile(arr, 99), np.mean(arr), MAXRECS
    )
)

p95: 0.011151149999999999, p99: 0.015930970000000204, mean: 0.00972615 for 100 distinct feature store gets
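
In the `timer` output above, the first measurement (0.055528) is several times larger than the ones that follow, so it can be useful to report it separately from the rest. The sketch below is one possible way to do that with the standard library only. `summarize_latencies` is a hypothetical helper, and the percentiles use `statistics.quantiles` with the `inclusive` method instead of the `np.percentile` call from the notebook.

```python
import statistics


def summarize_latencies(latencies):
    """Report the first call separately from the remaining (warm) calls."""
    if len(latencies) < 3:
        raise ValueError("Need the first call plus at least two warm calls")
    warm = latencies[1:]
    # 99 cut points; index 94 is the 95th percentile, index 98 the 99th.
    cuts = statistics.quantiles(warm, n=100, method="inclusive")
    return {
        "first_call": latencies[0],
        "warm_mean": statistics.mean(warm),
        "warm_p95": cuts[94],
        "warm_p99": cuts[98],
    }


# First five values copied from the `timer` output above.
example_timer = [0.055528, 0.010904, 0.010296, 0.015531, 0.009257]
summary = summarize_latencies(example_timer)
print(summary)
```

In the notebook, `summarize_latencies(timer)` would give the same breakdown for all recorded calls.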


### Pull customer data from Customers feature group

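When a customer submits an insurance claim online for instant approval, the insurance company needs to pull customer-specific data from the online feature store and add it to the claim data as input for the model prediction.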

In [16]:
customers_response = featurestore_runtime.get_record(
    FeatureGroupName=customers_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
)

customer_record = customers_response["Record"]
customer_df = pd.DataFrame(customer_record).set_index("FeatureName")


claims_response = featurestore_runtime.get_record(
    FeatureGroupName=claims_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
)

claims_record = claims_response["Record"]
claims_df = pd.DataFrame(claims_record).set_index("FeatureName")
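
The two lookups above assume that each response contains a `Record` list of `FeatureName` / `ValueAsString` pairs. When the identifier is not found, the response has no `Record`, and indexing it raises a `KeyError`. The sketch below shows one way to convert a response into a plain dictionary with an explicit error for the missing case. `record_to_dict` is a hypothetical helper, and the example response is made up.

```python
def record_to_dict(response):
    """Convert a get_record response into {feature_name: value_as_string}."""
    record = response.get("Record")
    if not record:
        raise LookupError("No Record returned for this identifier")
    return {item["FeatureName"]: item["ValueAsString"] for item in record}


# Made-up response shaped like the output of get_record.
example_response = {
    "Record": [
        {"FeatureName": "policy_id", "ValueAsString": "485"},
        {"FeatureName": "customer_age", "ValueAsString": "42"},
    ]
}
features = record_to_dict(example_response)
print(features)
```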

### Format the datapoint

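The datapoint must match the exact input format the model was trained on, with all features in the correct order. In this example, the `col_order` variable was saved earlier in the guide, when the training and test datasets were created.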

In [17]:
blended_df = pd.concat([claims_df, customer_df]).loc[col_order].drop("fraud")
data_input = ",".join(blended_df["ValueAsString"])
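
To make the ordering step concrete, here is a small pure-Python sketch of the same idea as the pandas cell above: walk through the saved column order, skip the label, and join the values into one CSV row. `build_csv_row` is a hypothetical helper, and the feature values and the shortened column order are made up for illustration.

```python
def build_csv_row(features, col_order, label="fraud"):
    """Join feature values in training order, leaving out the label column."""
    wanted = [name for name in col_order if name != label]
    missing = [name for name in wanted if name not in features]
    if missing:
        raise ValueError(f"Missing features: {missing}")
    return ",".join(str(features[name]) for name in wanted)


# Made-up column order and feature values; extra keys such as policy_id are ignored.
example_col_order = ["fraud", "customer_gender_female", "customer_age"]
example_features = {"customer_age": "42", "customer_gender_female": "1", "policy_id": "485"}
row = build_csv_row(example_features, example_col_order)
print(row)
```

Raising an error for missing features keeps a malformed row from reaching the endpoint, where a shifted column would silently change the prediction.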

### Make prediction

In [18]:
results = predictor.predict(data_input, initial_args={"ContentType": "text/csv"})
prediction = json.loads(results)
print(f"Probability the claim from policy {int(sample_policy_id)} is fraudulent:", prediction)

Probability the claim from policy 485 is fraudulent: 0.01206170953810215


___

<a id='aud-workflow-pipeline'></a>
### Next Notebook: [Create and Run an End-to-End Pipeline to Deploy the Model](./5-pipeline-e2e.ipynb)

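Now that you, as a data scientist, have manually experimented with each step of the machine learning workflow, you can take specific steps to speed up model creation and deployment without sacrificing the transparency and tracking that model lineage provides. In the next section, you create a pipeline that trains a new model on SageMaker, persists the model in SageMaker, adds the model to the registry, and deploys it as a SageMaker hosted endpoint.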