# Part 4 : Deploy, Run Inference, Interpret Inference

<a id='overview-4'></a>

## [Overview](./0-AutoClaimFraudDetection.ipynb)
* [Notebook 0 : Overview, Architecture and Data Exploration](./0-AutoClaimFraudDetection.ipynb)
* [Notebook 1: Data Prep, Process, Store Features](./1-data-prep-e2e.ipynb)
* [Notebook 2: Train, Check Bias, Tune, Record Lineage, and Register a Model](./2-lineage-train-assess-bias-tune-registry-e2e.ipynb)
* [Notebook 3: Mitigate Bias, Train New Model, Store in Registry](./3-mitigate-bias-train-model2-registry-e2e.ipynb)
* **[Notebook 4: Deploy Model, Run Predictions](./4-deploy-run-inference-e2e.ipynb)**
  * **[Architecture](#deploy)**
  * **[Deploy an approved model and Run Inference via Feature Store](#deploy-model)**
  * **[Create a Predictor](#predictor)**
  * **[Run Predictions from Online FeatureStore](#run-predictions)**
* [Notebook 5 : Create and Run an End-to-End Pipeline to Deploy the Model](./5-pipeline-e2e.ipynb)

In this section of the end-to-end use case, we deploy the bias-mitigated model that is the end product of this fraud detection use case. We show how to run inference and how to use Clarify to interpret, or "explain", the model.

### Install and/or update required third-party libraries

In [None]:
!python -m pip install -Uq pip
!python -m pip install -q awswrangler imbalanced-learn sagemaker boto3

### Load stored variables
Run the cell below to load any previously created variables. You should see a printout of the existing variables. If nothing is printed, either this is your first time running this notebook or you need to create the variables again by running the earlier notebooks.

In [1]:
%store -r
%store

Stored variables and their in-db values:
bucket                              -> 'sagemaker-us-east-1-875692608981'
claims_fg_name                      -> 'fraud-detect-demo-claims'
claims_preprocessed                 ->       policy_id  incident_severity  num_vehicles_i
claims_table                        -> 'fraud-detect-demo-claims-1637021687'
clarify_bias_job_1_name             -> 'Clarify-Bias-2021-11-16-01-24-37-110'
clarify_bias_job_2_name             -> 'Clarify-Bias-2021-11-16-02-36-07-707'
clarify_expl_job_name               -> 'Clarify-Explainability-2021-11-16-02-48-36-081'
col_order                           -> ['fraud', 'driver_relationship_spouse', 'num_insur
customers_fg_name                   -> 'fraud-detect-demo-customers'
customers_preprocessed              ->       policy_id  customer_age  customer_education 
customers_table                     -> 'fraud-detect-demo-customers-1637021688'
database_name                       -> 'sagemaker_featurestore'
dataset_uri    

**<font color='red'>Important</font>: You must have run the previous sequential notebooks to retrieve variables using the StoreMagic command.**

### Import libraries

In [2]:
import json
import time
import boto3
import sagemaker
import numpy as np
import pandas as pd
import awswrangler as wr

### Set region, boto3 and SageMaker SDK variables

In [3]:
# You can change this to a region of your choice
region = sagemaker.Session().boto_region_name
print("Using AWS Region: {}".format(region))

Using AWS Region: us-east-1


In [4]:
boto3.setup_default_session(region_name=region)

boto_session = boto3.Session(region_name=region)

s3_client = boto3.client("s3", region_name=region)

sagemaker_client = boto_session.client("sagemaker")

sagemaker_session = sagemaker.session.Session(
    boto_session=boto_session, sagemaker_client=sagemaker_client
)

sagemaker_role = sagemaker.get_execution_role()

account_id = boto3.client("sts").get_caller_identity()["Account"]

In [5]:
# variables used for parameterizing the notebook run
endpoint_name = f"{model_2_name}-endpoint"
endpoint_instance_count = 1
endpoint_instance_type = "ml.m4.xlarge"

predictor_instance_count = 1
predictor_instance_type = "ml.c5.xlarge"
batch_transform_instance_count = 1
batch_transform_instance_type = "ml.c5.xlarge"

<a id ='deploy'> </a>

## Architecture for this ML Lifecycle Stage: Deploy Model, Run Predictions
[overview](#overview-4)

![train-assess-tune-register](./images/e2e-3-pipeline-v3b.png)

<a id ='deploy-model'></a>

## Deploy an approved model and make predictions via Feature Store

[overview](#overview-4)

#### Approve the second model
In the real-life MLOps lifecycle, a model package gets approved after evaluation by data scientists, subject matter experts and auditors.

![train-assess-tune-register](./images/bestmodel.png)

In [7]:
sagemaker_client.list_model_packages(ModelPackageGroupName=mpg_name)["ModelPackageSummaryList"][0]

{'ModelPackageGroupName': 'fraud-detect-demo',
 'ModelPackageVersion': 2,
 'ModelPackageArn': 'arn:aws:sagemaker:us-east-1:875692608981:model-package/fraud-detect-demo/2',
 'ModelPackageDescription': 'XGBoost classifier to detect insurance fraud with SMOTE.',
 'CreationTime': datetime.datetime(2021, 11, 16, 3, 6, 57, 593000, tzinfo=tzlocal()),
 'ModelPackageStatus': 'Completed',
 'ModelApprovalStatus': 'PendingManualApproval'}

In [8]:
second_model_package = sagemaker_client.list_model_packages(ModelPackageGroupName=mpg_name)["ModelPackageSummaryList"][0]
model_package_update = {
    "ModelPackageArn": second_model_package["ModelPackageArn"],
    "ModelApprovalStatus": "Approved",
}

update_response = sagemaker_client.update_model_package(**model_package_update)
update_response

{'ModelPackageArn': 'arn:aws:sagemaker:us-east-1:875692608981:model-package/fraud-detect-demo/2',
 'ResponseMetadata': {'RequestId': '268eaf65-3577-4263-a912-553025d6a587',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '268eaf65-3577-4263-a912-553025d6a587',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '96',
   'date': 'Tue, 16 Nov 2021 03:09:14 GMT'},
  'RetryAttempts': 0}}
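The cells above take element `[0]` of `list_model_packages` without sort arguments, so they rely on the service's default ordering. The following is a minimal sketch, not the notebook's own method, that requests the newest package explicitly before approving it. The helper name `approve_latest_model_package` is hypothetical, and the client is passed in as an argument so the function can be exercised with a stub instead of a live AWS account.

```python
def approve_latest_model_package(sm_client, group_name):
    """Approve the most recently created package in a model package group.

    `sm_client` is assumed to be a boto3 SageMaker client, or a stub that
    exposes the same two methods. Returns the ARN of the approved package.
    """
    summaries = sm_client.list_model_packages(
        ModelPackageGroupName=group_name,
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )["ModelPackageSummaryList"]
    if not summaries:
        raise ValueError(f"No model packages found in group {group_name!r}")

    latest = summaries[0]
    arn = latest["ModelPackageArn"]
    # Skip the update call when the package is already approved.
    if latest.get("ModelApprovalStatus") != "Approved":
        sm_client.update_model_package(
            ModelPackageArn=arn, ModelApprovalStatus="Approved"
        )
    return arn
```

In this notebook the call would look like `approve_latest_model_package(sagemaker_client, mpg_name)`.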

#### Create an endpoint config and an endpoint
Deploy the endpoint. This might take about 8 minutes.

![train-assess-tune-register](./images/endpoint.png)

In [9]:
primary_container = {'ModelPackageName': second_model_package['ModelPackageArn']}
endpoint_config_name=f'{model_2_name}-endpoint-config'
existing_configs = len(sagemaker_client.list_endpoint_configs(NameContains=endpoint_config_name, MaxResults = 30)['EndpointConfigs'])

if existing_configs == 0:
    create_ep_config_response = sagemaker_client.create_endpoint_config(
        EndpointConfigName=endpoint_config_name,
        ProductionVariants=[{
            'InstanceType': endpoint_instance_type,
            'InitialVariantWeight': 1,
            'InitialInstanceCount': endpoint_instance_count,
            'ModelName': model_2_name,
            'VariantName': 'AllTraffic'
        }]
    )
    %store endpoint_config_name
    print(f"Endpoint Config name: {endpoint_config_name}")

Stored 'endpoint_config_name' (str)
Endpoint Config name: fraud-detect-demo-xgboost-post-smote-endpoint-config
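`NameContains` is a substring filter, so the existence check above also counts any config whose name merely contains `endpoint_config_name`. A minimal sketch of an exact-name check follows. The helper `endpoint_config_exists` is hypothetical, and it inspects only the first page of up to 30 results, as the original cell does.

```python
def endpoint_config_exists(sm_client, config_name):
    """Return True only if a config with exactly this name is listed.

    `sm_client` is assumed to be a boto3 SageMaker client or a compatible stub.
    Only the first page of results (up to 30 entries) is inspected.
    """
    response = sm_client.list_endpoint_configs(
        NameContains=config_name, MaxResults=30
    )
    return any(
        cfg["EndpointConfigName"] == config_name
        for cfg in response["EndpointConfigs"]
    )
```

With this helper, the guard in the cell above could be written as `if not endpoint_config_exists(sagemaker_client, endpoint_config_name):`.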


In [10]:
existing_endpoints = sagemaker_client.list_endpoints(NameContains=endpoint_name, MaxResults = 30)['Endpoints']
if not existing_endpoints:
    create_endpoint_response = sagemaker_client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name)
    %store endpoint_name
    print(f"Endpoint name: {endpoint_name}")

endpoint_info = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
endpoint_status = endpoint_info['EndpointStatus']

while endpoint_status == 'Creating':
    endpoint_info = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    endpoint_status = endpoint_info['EndpointStatus']
    print('Endpoint status:', endpoint_status)
    if endpoint_status == 'Creating':
        time.sleep(60)

Stored 'endpoint_name' (str)
Endpoint name: fraud-detect-demo-xgboost-post-smote-endpoint
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: InService
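The polling loop above exits as soon as the status is no longer `Creating`, so a `Failed` endpoint is not distinguished from a healthy one, and there is no upper bound on the wait. The sketch below is one possible variant that adds both checks. The helper name `wait_for_endpoint` and its default timeout of 30 minutes are assumptions for illustration, not values from the notebook.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, poll_seconds=60, timeout_seconds=1800):
    """Poll describe_endpoint until the endpoint is InService.

    Raises RuntimeError if the endpoint reports Failed, and TimeoutError if
    it is still not InService after `timeout_seconds`.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        print("Endpoint status:", status)
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint {endpoint_name} not InService after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

boto3 also provides a built-in waiter for the same purpose, `sagemaker_client.get_waiter("endpoint_in_service")`, which may be preferable when per-poll status output is not needed.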


<a id='predictor'> </a>

### Create a predictor

In [11]:
predictor = sagemaker.predictor.Predictor(
    endpoint_name=endpoint_name, sagemaker_session=sagemaker_session
)

predictor.enable_data_capture()

-------------------!

### Split Dataset

In [12]:
dataset = pd.read_csv("data/dataset.csv")
train = dataset.sample(frac=0.8, random_state=0)
test = dataset.drop(train.index)

In [13]:
test.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000 entries, 0 to 4997
Data columns (total 48 columns):
 #   Column                           Non-Null Count  Dtype  
---  ------                           --------------  -----  
 0   Unnamed: 0                       1000 non-null   int64  
 1   policy_id                        1000 non-null   int64  
 2   driver_relationship_spouse       1000 non-null   int64  
 3   num_insurers_past_5_years        1000 non-null   int64  
 4   policy_state_id                  1000 non-null   int64  
 5   vehicle_claim                    1000 non-null   float64
 6   authorities_contacted_fire       1000 non-null   int64  
 7   incident_type_collision          1000 non-null   int64  
 8   incident_severity                1000 non-null   int64  
 9   injury_claim                     1000 non-null   float64
 10  num_vehicles_involved            1000 non-null   int64  
 11  policy_liability                 1000 non-null   int64  
 12  authorities_contacte

### Get sample's claim data from online feature store
This will simulate getting data in real-time from a customer's insurance claim submission.

![train-assess-tune-register](./images/endpoint2.png)

In [14]:
featurestore_runtime = boto_session.client(
    service_name="sagemaker-featurestore-runtime", region_name=region
)

feature_store_session = sagemaker.Session(
    boto_session=boto_session,
    sagemaker_client=sagemaker_client,
    sagemaker_featurestore_runtime_client=featurestore_runtime,
)
feature_store_session

<sagemaker.session.Session at 0x7f344a4cf450>

<a id='run-predictions'> </a>

## Run Predictions on Multiple Claims

[overview](#overview-4)

In [15]:
import datetime

timer = []
MAXRECS = 100


def barrage_of_inference():
    # Pick one random policy from the test split.
    sample_policy_id = int(test.sample(1)["policy_id"].iloc[0])

    claims_response = featurestore_runtime.get_record(
        FeatureGroupName=claims_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
    )

    # Fail early with a clear message instead of hitting a NameError further down.
    if not claims_response.get("Record"):
        raise KeyError(f"No claims record returned for policy_id {sample_policy_id}")
    claims_df = pd.DataFrame(claims_response["Record"]).set_index("FeatureName")

    # Time only the customers lookup against the online Feature Store.
    t0 = datetime.datetime.now()

    customers_response = featurestore_runtime.get_record(
        FeatureGroupName=customers_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
    )

    t1 = datetime.datetime.now()

    customer_record = customers_response["Record"]
    customer_df = pd.DataFrame(customer_record).set_index("FeatureName")

    # Order the features as the model expects and drop the label column.
    blended_df = pd.concat([claims_df, customer_df]).loc[col_order].drop("fraud")
    data_input = ",".join(blended_df["ValueAsString"])

    results = predictor.predict(data_input, initial_args={"ContentType": "text/csv"})
    prediction = json.loads(results)

    # Record the lookup latency in seconds.
    arr = t1 - t0
    timer.append(arr.total_seconds())

    return sample_policy_id, prediction, arr


for i in range(MAXRECS):
    sample_policy_id, prediction, arr = barrage_of_inference()
    print(f"Probability the claim from policy {int(sample_policy_id)} is fraudulent: {prediction}  Time {str(arr).split(':')[2]}")

Probability the claim from policy 3521 is fraudulent: 0.008149230852723122  Time 00.065846
Probability the claim from policy 4017 is fraudulent: 0.02115439809858799  Time 00.008960
Probability the claim from policy 4016 is fraudulent: 0.007386560086160898  Time 00.008483
Probability the claim from policy 4106 is fraudulent: 0.08545010536909103  Time 00.008397
Probability the claim from policy 500 is fraudulent: 0.021511288359761238  Time 00.008972
Probability the claim from policy 3626 is fraudulent: 0.0041765919886529446  Time 00.008201
Probability the claim from policy 2591 is fraudulent: 0.006619956344366074  Time 00.009933
Probability the claim from policy 491 is fraudulent: 0.024231871590018272  Time 00.012815
Probability the claim from policy 1422 is fraudulent: 0.0077125937677919865  Time 00.009441
Probability the claim from policy 388 is fraudulent: 0.004656988196074963  Time 00.010078
Probability the claim from policy 2675 is fraudulent: 0.018705651164054

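Note: `timer` records the latency of each `get_record` call to the customers feature group in the online Feature Store. In the output shown above, the first call (about 0.066 s) is noticeably slower than the subsequent calls (roughly 0.008 to 0.013 s).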
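The function above measures latency with `datetime.datetime.now()`, which reads the wall clock. For short intervals, `time.perf_counter()` is a monotonic, high-resolution alternative from the standard library. The helper below is a hypothetical sketch of that approach, not part of the original notebook.

```python
import time


def timed_call(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed_seconds = time.perf_counter() - start
    return result, elapsed_seconds
```

A feature store lookup could then be timed with `response, seconds = timed_call(featurestore_runtime.get_record, FeatureGroupName=customers_fg_name, RecordIdentifierValueAsString=str(sample_policy_id))`.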

In [16]:
import numpy as np

# Summarize the latencies collected in `timer` (seconds per feature store get).
latencies = np.array(timer)
print(
    "p95: {}, p99: {}, mean: {} for {} distinct feature store gets".format(
        np.percentile(latencies, 95), np.percentile(latencies, 99), np.mean(latencies), MAXRECS
    )
)

p95: 0.01051735, p99: 0.013345310000000271, mean: 0.008838819999999999 for 100 distinct feature store gets
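To make the p95 and p99 figures concrete, here is a small standard-library sketch of the linear-interpolation percentile that `np.percentile` uses by default. The function name `percentile` is hypothetical, and the sketch is for illustration rather than a replacement for NumPy.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted data.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    # Interpolate between the two neighbouring observations.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight
```

For example, `percentile(timer, 95)` gives the latency below which 95% of the recorded lookups fall.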


### Pull customer data from Customers feature group
When a customer submits an insurance claim online for instant approval, the insurance company will need to pull customer-specific data from the online feature store to add to the claim data as input for a model prediction.
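A lookup of this kind can be wrapped in a small helper that returns the features as a plain dictionary and fails clearly when no record exists. This is a sketch under assumptions: the name `get_feature_record` is hypothetical, and `runtime_client` is assumed to be a `sagemaker-featurestore-runtime` client (or a stub with the same `get_record` method) whose response holds a `Record` list of `FeatureName` / `ValueAsString` pairs, as used elsewhere in this notebook.

```python
def get_feature_record(runtime_client, feature_group_name, record_id):
    """Fetch one record from an online feature group as {feature_name: value}."""
    response = runtime_client.get_record(
        FeatureGroupName=feature_group_name,
        RecordIdentifierValueAsString=str(record_id),
    )
    record = response.get("Record")
    if not record:
        raise KeyError(
            f"No record with id {record_id} in feature group {feature_group_name}"
        )
    # Values are returned as strings; conversion is left to the caller.
    return {feature["FeatureName"]: feature["ValueAsString"] for feature in record}
```

For example, `get_feature_record(featurestore_runtime, customers_fg_name, sample_policy_id)` would return the customer features for one policy.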

### Sample a claim from the test data

In [17]:
sample_policy_id = int(test.sample(1)["policy_id"].iloc[0])
print(f"Sample Policy ID: {sample_policy_id}")

Sample Policy ID: 1983


In [18]:
customers_response = featurestore_runtime.get_record(
    FeatureGroupName=customers_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
)

customer_record = customers_response["Record"]
customer_df = pd.DataFrame(customer_record).set_index("FeatureName")


claims_response = featurestore_runtime.get_record(
    FeatureGroupName=claims_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
)

claims_record = claims_response["Record"]
claims_df = pd.DataFrame(claims_record).set_index("FeatureName")
claims_df

| FeatureName | ValueAsString |
|---|---|
| policy_id | 1983.0 |
| incident_severity | 1.0 |
| num_vehicles_involved | 3.0 |
| num_injuries | 0.0 |
| num_witnesses | 1.0 |
| police_report_available | 1.0 |
| injury_claim | 8100.0 |
| vehicle_claim | 14388.0 |
| total_claim_amount | 22488.0 |
| incident_month | 2.0 |


### Format the datapoint
The datapoint must match the exact input format the model was trained on, with all features in the correct order. In this example, the `col_order` variable was saved when you created the train and test datasets earlier in the guide.

In [19]:
# Order the features as the model expects, drop the label, and serialize to CSV.
blended_df = pd.concat([claims_df, customer_df]).loc[col_order].drop("fraud")
data_input = ",".join(blended_df["ValueAsString"])
data_input

'0,1,0,14388.0,0,1,1,8100.0,3,0,0,0,0,0,0,1,0,1,2,0,55,0,0,0,143,0,3,2008,0,1,0,19,22488.0,0,750,0,0,1,0,1,5,1,1,3000,1'
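The same formatting step can be sketched without pandas, which makes the ordering logic explicit and reports missing features by name. The helper `build_csv_payload` is hypothetical. It assumes each record is a list of dictionaries with `FeatureName` and `ValueAsString` keys, as returned by `get_record`, and that the label column is named `fraud`.

```python
def build_csv_payload(claims_record, customer_record, col_order, label="fraud"):
    """Join feature values into one CSV row, ordered by col_order without the label."""
    values = {}
    for feature in list(claims_record) + list(customer_record):
        values[feature["FeatureName"]] = feature["ValueAsString"]

    feature_names = [name for name in col_order if name != label]
    missing = [name for name in feature_names if name not in values]
    if missing:
        raise KeyError(f"Missing features: {missing}")

    return ",".join(values[name] for name in feature_names)
```

With the variables from the cells above, `build_csv_payload(claims_record, customer_record, col_order)` should yield the same string as `data_input`, assuming every name in `col_order` other than the label is present in one of the two records.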

### Make prediction

In [20]:
results = predictor.predict(data_input, initial_args={"ContentType": "text/csv"})
prediction = json.loads(results)
print(f"Probability the claim from policy {int(sample_policy_id)} is fraudulent:", prediction)

Probability the claim from policy 1983 is fraudulent: 0.011180935427546501
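The endpoint returns a probability, so acting on it requires a decision threshold. The sketch below parses the response body and applies one. The helper name `flag_claim` and the default threshold of 0.5 are assumptions for illustration only. The notebook does not specify a threshold, and in practice it would be chosen from the costs of false positives and false negatives.

```python
import json


def flag_claim(response_body, threshold=0.5):
    """Parse an endpoint response and return (probability, is_flagged).

    `response_body` may be bytes or str holding a single JSON number.
    The threshold is an illustrative assumption, not a recommended value.
    """
    if isinstance(response_body, bytes):
        response_body = response_body.decode("utf-8")
    probability = float(json.loads(response_body))
    return probability, probability >= threshold
```

For the sample above, `flag_claim(results)` would return the probability together with `False`, since the score is well below 0.5.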


----

<a id='aud-workflow-pipeline'></a>

### Next Notebook: [Create and Run an End-to-End Pipeline to Deploy the Model](./5-pipeline-e2e.ipynb)
Now that you, as a data scientist, have manually experimented with each step in the machine learning workflow, you can automate those steps to allow faster model creation and deployment without sacrificing transparency and tracking via model lineage. In the next section, you will create a pipeline that trains a new model on SageMaker, persists the model in SageMaker, adds it to the model registry, and deploys it as a SageMaker hosted endpoint.