# Part 4: Deploy, Run Inference, Interpret Inference

<a id='overview-4'></a>

## [Overview](./0-AutoClaimFraudDetection.ipynb)
* [Notebook 0: Overview, Architecture and Data Exploration](./0-AutoClaimFraudDetection.ipynb)
* [Notebook 1: Data Prep, Process, Store Features](./1-data-prep-e2e.ipynb)
* [Notebook 2: Train, Check Bias, Tune, Record Lineage, and Register a Model](./2-lineage-train-assess-bias-tune-registry-e2e.ipynb)
* [Notebook 3: Mitigate Bias, Train New Model, Store in Registry](./3-mitigate-bias-train-model2-registry-e2e.ipynb)
* **[Notebook 4: Deploy Model, Run Predictions](./4-deploy-run-inference-e2e.ipynb)**
  * **[Architecture](#deploy)**
  * **[Deploy an approved model and Run Inference via Feature Store](#deploy-model)**
  * **[Create a Predictor](#predictor)**
  * **[Run Predictions from Online FeatureStore](#run-predictions)**
* [Notebook 5: Create and Run an End-to-End Pipeline to Deploy the Model](./5-pipeline-e2e.ipynb)

In this section of the end-to-end use case, we deploy the mitigated model that is the end product of this fraud detection use case. We show how to run inference and how to use Clarify to interpret, or "explain", the model.

### Install or update required third-party libraries

In [None]:
!python -m pip install -Uq pip
!python -m pip install -q awswrangler==2.2.0 imbalanced-learn==0.7.0 sagemaker==2.41.0 boto3==1.17.70

### Load stored variables
Run the cell below to load any previously created variables. You should see a print-out of the existing variables. If nothing is printed, either this is your first time running this notebook or the variables need to be created again.

In [3]:
%store -r
%store

Stored variables and their in-db values:
bucket                              -> 'sagemaker-us-east-1-875692608981'
claims_fg_name                      -> 'fraud-detect-demo-claims'
claims_table                        -> 'fraud-detect-demo-claims-1635991472'
clarify_bias_job_1_name             -> 'Clarify-Bias-2021-11-04-02-57-18-432'
clarify_bias_job_2_name             -> 'Clarify-Bias-2021-11-04-03-30-25-420'
clarify_expl_job_name               -> 'Clarify-Explainability-2021-11-04-03-43-09-860'
col_order                           -> ['fraud', 'incident_type_breakin', 'num_vehicles_i
customers_fg_name                   -> 'fraud-detect-demo-customers'
customers_table                     -> 'fraud-detect-demo-customers-1635991475'
database_name                       -> 'sagemaker_featurestore'
hyperparameters                     -> {'max_depth': '3', 'eta': '0.2', 'objective': 'bin
model_1_name                        -> 'fraud-detect-demo-xgboost-pre-smote'
model_2_name                

**<font color='red'>Important</font>: You must have run the previous sequential notebooks to retrieve variables using the StoreMagic command.**
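Because every later cell depends on these stored variables, it can help to fail fast when one is missing. The following is a minimal sketch, not part of the original notebook: `missing_variables` is a hypothetical helper, and the names listed are the stored variables that the cells in this notebook reference.

```python
# Hypothetical helper: report which expected %store variables are absent.
REQUIRED_VARIABLES = [
    "model_2_name",
    "mpg_name",
    "col_order",
    "claims_fg_name",
    "customers_fg_name",
]


def missing_variables(required, namespace):
    """Return the names in `required` that are not defined in `namespace`."""
    return [name for name in required if name not in namespace]


missing = missing_variables(REQUIRED_VARIABLES, globals())
if missing:
    print("Missing stored variables; rerun the earlier notebooks:", missing)
```

Run it right after `%store -r` so that a missing variable is reported before any AWS call is made.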

### Import libraries

In [4]:
import json
import time
import boto3
import sagemaker
import numpy as np
import pandas as pd
import awswrangler as wr

### Set region, boto3 and SageMaker SDK variables

In [5]:
# Defaults to the region of the current SageMaker session; set `region` explicitly to use another

region = sagemaker.Session().boto_region_name
print("Using AWS Region: {}".format(region))

Using AWS Region: us-east-1


In [6]:
boto3.setup_default_session(region_name=region)

boto_session = boto3.Session(region_name=region)

s3_client = boto3.client("s3", region_name=region)

sagemaker_boto_client = boto_session.client("sagemaker")

sagemaker_session = sagemaker.session.Session(
    boto_session=boto_session, sagemaker_client=sagemaker_boto_client
)

sagemaker_role = sagemaker.get_execution_role()

account_id = boto3.client("sts").get_caller_identity()["Account"]

In [7]:
# variables used for parameterizing the notebook run
endpoint_name = f"{model_2_name}-endpoint"
endpoint_instance_count = 1
endpoint_instance_type = "ml.m4.xlarge"

predictor_instance_count = 1
predictor_instance_type = "ml.c5.xlarge"
batch_transform_instance_count = 1
batch_transform_instance_type = "ml.c5.xlarge"

<a id ='deploy'> </a>

## Architecture for this ML Lifecycle Stage: Deploy Model, Run Predictions
[overview](#overview-4)

![train-assess-tune-register](./images/e2e-3-pipeline-v3b.png)

<a id ='deploy-model'></a>

## Deploy an approved model and make predictions via Feature Store

[overview](#overview-4)

#### Approve the second model
In a real-life MLOps lifecycle, a model package is approved only after evaluation by data scientists, subject matter experts, and auditors. In this notebook, we set the approval status directly.

In [8]:
second_model_package = sagemaker_boto_client.list_model_packages(ModelPackageGroupName=mpg_name)[
    "ModelPackageSummaryList"
][0]
model_package_update = {
    "ModelPackageArn": second_model_package["ModelPackageArn"],
    "ModelApprovalStatus": "Approved",
}

update_response = sagemaker_boto_client.update_model_package(**model_package_update)
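The cell above takes the first entry of `ModelPackageSummaryList`. If you prefer not to depend on the order in which the API returns packages, the selection can be made explicit. Below is a minimal sketch, assuming each summary carries the `CreationTime` field that `list_model_packages` returns; `newest_package` is a hypothetical helper and the sample ARNs are made up for illustration.

```python
from datetime import datetime


def newest_package(summaries):
    """Return the model package summary with the latest CreationTime."""
    if not summaries:
        raise ValueError("The model package group contains no packages")
    return max(summaries, key=lambda summary: summary["CreationTime"])


# Illustrative data only; real summaries come from list_model_packages.
example_summaries = [
    {
        "ModelPackageArn": "arn:example:model-package/demo/1",
        "CreationTime": datetime(2021, 11, 4, 3, 0),
    },
    {
        "ModelPackageArn": "arn:example:model-package/demo/2",
        "CreationTime": datetime(2021, 11, 4, 3, 40),
    },
]
print(newest_package(example_summaries)["ModelPackageArn"])
```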

#### Create an endpoint config and an endpoint
Create the endpoint configuration, then the endpoint. Deployment might take about 8 minutes.

In [9]:
primary_container = {'ModelPackageName': second_model_package['ModelPackageArn']}
endpoint_config_name=f'{model_2_name}-endpoint-config'
existing_configs = len(sagemaker_boto_client.list_endpoint_configs(NameContains=endpoint_config_name, MaxResults = 30)['EndpointConfigs'])

if existing_configs == 0:
    create_ep_config_response = sagemaker_boto_client.create_endpoint_config(
        EndpointConfigName=endpoint_config_name,
        ProductionVariants=[{
            'InstanceType': endpoint_instance_type,
            'InitialVariantWeight': 1,
            'InitialInstanceCount': endpoint_instance_count,
            'ModelName': model_2_name,
            'VariantName': 'AllTraffic'
        }]
    )
    %store endpoint_config_name

Stored 'endpoint_config_name' (str)
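Note that `NameContains` is a substring filter, so a config whose name merely contains `endpoint_config_name` would also count as existing. Below is a minimal sketch of an exact-name check, assuming the response shape of `list_endpoint_configs` (a list of dicts with an `EndpointConfigName` key); `config_exists` is a hypothetical helper and the sample names are illustrative.

```python
def config_exists(endpoint_configs, name):
    """Return True only if a listed config has exactly this name."""
    return any(config["EndpointConfigName"] == name for config in endpoint_configs)


# Illustrative listing only; real entries come from list_endpoint_configs.
listed = [
    {"EndpointConfigName": "demo-endpoint-config-old"},
    {"EndpointConfigName": "demo-endpoint-config"},
]
print(config_exists(listed, "demo-endpoint-config"))
print(config_exists(listed, "demo-endpoint"))
```

The second lookup matches both listed names as a substring, but neither exactly, so it reports that no such config exists.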


In [10]:
existing_endpoints = sagemaker_boto_client.list_endpoints(NameContains=endpoint_name, MaxResults = 30)['Endpoints']
if not existing_endpoints:
    create_endpoint_response = sagemaker_boto_client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name)
    %store endpoint_name

endpoint_info = sagemaker_boto_client.describe_endpoint(EndpointName=endpoint_name)
endpoint_status = endpoint_info['EndpointStatus']

while endpoint_status == 'Creating':
    endpoint_info = sagemaker_boto_client.describe_endpoint(EndpointName=endpoint_name)
    endpoint_status = endpoint_info['EndpointStatus']
    print('Endpoint status:', endpoint_status)
    if endpoint_status == 'Creating':
        time.sleep(60)

Stored 'endpoint_name' (str)
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: Creating
Endpoint status: InService
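The polling loop above runs until the status leaves `Creating`, with no upper bound on the wait and no check that the final status is `InService`. Below is a minimal sketch of a bounded variant; `wait_for_endpoint` and its parameters are hypothetical, and `describe` stands in for `sagemaker_boto_client.describe_endpoint`. The boto3 SageMaker client also provides an `endpoint_in_service` waiter, which may be a simpler option.

```python
import time


def wait_for_endpoint(
    describe, endpoint_name, poll_seconds=60, timeout_seconds=1800, sleep=time.sleep
):
    """Poll until the endpoint is InService; raise on failure or timeout."""
    waited = 0
    while True:
        status = describe(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status != "Creating":
            raise RuntimeError(f"Endpoint {endpoint_name} ended in status {status}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Endpoint {endpoint_name} still creating after {waited} s")
        sleep(poll_seconds)
        waited += poll_seconds


# Stand-in for describe_endpoint so the sketch runs without AWS access.
_statuses = iter(["Creating", "Creating", "InService"])


def fake_describe(EndpointName):
    return {"EndpointStatus": next(_statuses)}


print(wait_for_endpoint(fake_describe, "demo-endpoint", sleep=lambda seconds: None))
```

In the notebook, the call would look like `wait_for_endpoint(sagemaker_boto_client.describe_endpoint, endpoint_name)`.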


<a id='predictor'> </a>

### Create a predictor

In [11]:
predictor = sagemaker.predictor.Predictor(
    endpoint_name=endpoint_name, sagemaker_session=sagemaker_session
)

### Sample a claim from the test data

In [12]:
dataset = pd.read_csv("data/dataset.csv")
train = dataset.sample(frac=0.8, random_state=0)
test = dataset.drop(train.index)
sample_policy_id = int(test.sample(1)["policy_id"].iloc[0])

In [13]:
test.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000 entries, 0 to 4997
Data columns (total 48 columns):
 #   Column                           Non-Null Count  Dtype  
---  ------                           --------------  -----  
 0   Unnamed: 0                       1000 non-null   int64  
 1   policy_id                        1000 non-null   int64  
 2   incident_type_breakin            1000 non-null   int64  
 3   num_vehicles_involved            1000 non-null   int64  
 4   collision_type_na                1000 non-null   int64  
 5   policy_state_nv                  1000 non-null   int64  
 6   authorities_contacted_fire       1000 non-null   int64  
 7   customer_gender_male             1000 non-null   int64  
 8   authorities_contacted_ambulance  1000 non-null   int64  
 9   injury_claim                     1000 non-null   float64
 10  policy_deductable                1000 non-null   int64  
 11  collision_type_front             1000 non-null   int64  
 12  authorities_contacte

### Get sample's claim data from online feature store
This simulates retrieving data in real time from a customer's insurance claim submission. The cell below creates the Feature Store runtime client used for the lookups.

In [14]:
featurestore_runtime = boto_session.client(
    service_name="sagemaker-featurestore-runtime", region_name=region
)

feature_store_session = sagemaker.Session(
    boto_session=boto_session,
    sagemaker_client=sagemaker_boto_client,
    sagemaker_featurestore_runtime_client=featurestore_runtime,
)

<a id='run-predictions'> </a>

## Run Predictions on Multiple Claims

[overview](#overview-4)

In [15]:
import datetime

timer = []  # latency, in seconds, of each Customers feature group lookup
MAXRECS = 100


def barrage_of_inference():
    # Pick a random policy from the held-out test set
    sample_policy_id = int(test.sample(1)["policy_id"].iloc[0])

    claims_response = featurestore_runtime.get_record(
        FeatureGroupName=claims_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
    )

    if not claims_response.get("Record"):
        # Without a claims record there is nothing to score
        raise LookupError(f"No claims record returned for policy {sample_policy_id}")
    claims_df = pd.DataFrame(claims_response["Record"]).set_index("FeatureName")

    # Time only the Customers feature group lookup
    t0 = datetime.datetime.now()

    customers_response = featurestore_runtime.get_record(
        FeatureGroupName=customers_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
    )

    t1 = datetime.datetime.now()

    customer_record = customers_response["Record"]
    customer_df = pd.DataFrame(customer_record).set_index("FeatureName")

    # Order the features as at training time and drop the label column
    blended_df = pd.concat([claims_df, customer_df]).loc[col_order].drop("fraud")
    data_input = ",".join(blended_df["ValueAsString"])

    results = predictor.predict(data_input, initial_args={"ContentType": "text/csv"})
    prediction = json.loads(results)

    timer.append((t1 - t0).total_seconds())

    return sample_policy_id, prediction


for i in range(MAXRECS):
    sample_policy_id, prediction = barrage_of_inference()
    print(f"Probability the claim from policy {int(sample_policy_id)} is fraudulent:", prediction)

Probability the claim from policy 3740 is fraudulent: 0.12106682360172272
Probability the claim from policy 2667 is fraudulent: 0.005608935374766588
Probability the claim from policy 801 is fraudulent: 0.013018963858485222
Probability the claim from policy 4478 is fraudulent: 0.020848188549280167
Probability the claim from policy 275 is fraudulent: 0.013813871890306473
Probability the claim from policy 3836 is fraudulent: 0.010482619516551495
Probability the claim from policy 2791 is fraudulent: 0.007228601723909378
Probability the claim from policy 1955 is fraudulent: 0.11650000512599945
Probability the claim from policy 2667 is fraudulent: 0.005608935374766588
Probability the claim from policy 2954 is fraudulent: 0.003533664159476757
Probability the claim from policy 4042 is fraudulent: 0.37386268377304077
Probability the claim from policy 3237 is fraudulent: 0.007228601723909378
Probability the claim from policy 1808 is fraudulent: 0.01955937035381794
Probability the c

In [16]:
timer

[0.059719,
 0.009545,
 0.013062,
 0.012054,
 0.014059,
 0.009755,
 0.010177,
 0.011736,
 0.010717,
 0.012913,
 0.01318,
 0.009719,
 0.010391,
 0.010631,
 0.011139,
 0.010635,
 0.009286,
 0.010314,
 0.008947,
 0.009487,
 0.009279,
 0.008239,
 0.008908,
 0.00879,
 0.00986,
 0.009515,
 0.009892,
 0.008619,
 0.008853,
 0.00829,
 0.009192,
 0.009973,
 0.009538,
 0.01034,
 0.009005,
 0.008588,
 0.010041,
 0.008806,
 0.008841,
 0.010646,
 0.010396,
 0.009049,
 0.009247,
 0.008071,
 0.008042,
 0.011088,
 0.008095,
 0.008471,
 0.010173,
 0.008392,
 0.008582,
 0.008739,
 0.008754,
 0.008282,
 0.009852,
 0.0131,
 0.008456,
 0.00965,
 0.008407,
 0.01043,
 0.009084,
 0.008627,
 0.008486,
 0.009835,
 0.009006,
 0.008949,
 0.010101,
 0.008911,
 0.010445,
 0.008179,
 0.014384,
 0.009462,
 0.008765,
 0.020293,
 0.009559,
 0.009583,
 0.007928,
 0.008838,
 0.008323,
 0.013265,
 0.008512,
 0.00845,
 0.00892,
 0.010605,
 0.010422,
 0.009748,
 0.008009,
 0.008402,
 0.00838,
 0.008775,
 0.008309,
 0.008933,


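Note: `timer` holds the latency, in seconds, of each `get_record` call to the Customers feature group in the online Feature Store. The first entry is the initial call (about 0.06 s); the remaining entries are the subsequent calls, most of which take roughly 0.01 s.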

In [17]:
import numpy as np

arr = np.array(timer)
print(
    "p95: {}, p99: {}, mean: {} for {} distinct feature store gets".format(
        np.percentile(arr, 95), np.percentile(arr, 99), np.mean(arr), MAXRECS
    )
)

p95: 0.01318425, p99: 0.0206872600000002, mean: 0.010192139999999999 for 100 distinct feature store gets
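Because the first lookup is several times slower than the rest, it pulls the mean upward. Below is a minimal sketch that reports the first call separately from the later ones; `summarize_latencies` is a hypothetical helper, and the sample values are the first four entries of the `timer` output above.

```python
import statistics


def summarize_latencies(samples):
    """Split latencies (seconds) into the first call and the calls after it."""
    if len(samples) < 2:
        raise ValueError("Need at least two samples")
    first, rest = samples[0], samples[1:]
    return {
        "first_call": first,
        "later_mean": statistics.mean(rest),
        "later_max": max(rest),
    }


summary = summarize_latencies([0.059719, 0.009545, 0.013062, 0.012054])
print(summary)
```

In the notebook, `summarize_latencies(timer)` would give the same breakdown over all recorded lookups.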


### Pull customer data from Customers feature group
When a customer submits an insurance claim online for instant approval, the insurance company needs to pull customer-specific data from the online feature store and combine it with the claim data as input for a model prediction. The cell below retrieves both the customer record and the claim record for the sampled policy.

In [18]:
customers_response = featurestore_runtime.get_record(
    FeatureGroupName=customers_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
)

customer_record = customers_response["Record"]
customer_df = pd.DataFrame(customer_record).set_index("FeatureName")


claims_response = featurestore_runtime.get_record(
    FeatureGroupName=claims_fg_name, RecordIdentifierValueAsString=str(sample_policy_id)
)

claims_record = claims_response["Record"]
claims_df = pd.DataFrame(claims_record).set_index("FeatureName")

### Format the datapoint
The datapoint must match the input format the model was trained on, with all features in the same order and the `fraud` label column removed. In this example, the `col_order` variable was saved when you created the train and test datasets earlier in the guide.

In [19]:
blended_df = pd.concat([claims_df, customer_df]).loc[col_order].drop("fraud")
data_input = ",".join(blended_df["ValueAsString"])
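To make the formatting step concrete, here is a minimal sketch of the same idea without pandas. It assumes the record shape returned by `get_record` (a list of `FeatureName`/`ValueAsString` pairs); `record_to_dict` and `build_csv_row` are hypothetical helpers, and the feature values and the short `order` list are made up for illustration.

```python
def record_to_dict(record):
    """Map a Feature Store record (list of name/value dicts) to {name: value}."""
    return {feature["FeatureName"]: feature["ValueAsString"] for feature in record}


def build_csv_row(claims_record, customer_record, col_order, label="fraud"):
    """Join feature values in training-column order, leaving out the label."""
    features = {**record_to_dict(claims_record), **record_to_dict(customer_record)}
    columns = [name for name in col_order if name != label]
    missing = [name for name in columns if name not in features]
    if missing:
        raise KeyError(f"Features missing from the records: {missing}")
    return ",".join(features[name] for name in columns)


# Illustrative records only.
claims = [
    {"FeatureName": "policy_id", "ValueAsString": "42"},
    {"FeatureName": "fraud", "ValueAsString": "0"},
    {"FeatureName": "injury_claim", "ValueAsString": "1500.0"},
    {"FeatureName": "num_vehicles_involved", "ValueAsString": "2"},
]
customer = [
    {"FeatureName": "policy_id", "ValueAsString": "42"},
    {"FeatureName": "customer_gender_male", "ValueAsString": "1"},
]
order = ["fraud", "num_vehicles_involved", "customer_gender_male", "injury_claim"]
print(build_csv_row(claims, customer, order))  # 2,1,1500.0
```

Raising on a missing feature, rather than silently skipping it, keeps a shifted column from reaching the model as a valid-looking row.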

### Make prediction

In [20]:
results = predictor.predict(data_input, initial_args={"ContentType": "text/csv"})
prediction = json.loads(results)
print(f"Probability the claim from policy {int(sample_policy_id)} is fraudulent:", prediction)

Probability the claim from policy 4962 is fraudulent: 0.01727212592959404
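With the default deserializer, the predictor returns the score as raw bytes, which `json.loads` parses into a float. What to do with that probability is a business decision this notebook does not cover. As a minimal sketch, assuming a hypothetical review threshold of 0.5 (not a value from this use case) and a hypothetical `route_claim` helper, a claim could be routed like this:

```python
import json

REVIEW_THRESHOLD = 0.5  # hypothetical cut-off, chosen only for illustration


def route_claim(raw_response, threshold=REVIEW_THRESHOLD):
    """Parse the endpoint response and label the claim by its fraud probability."""
    probability = float(json.loads(raw_response))
    decision = "manual review" if probability >= threshold else "auto-approve"
    return probability, decision


print(route_claim(b"0.01727212592959404"))
```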


----

<a id='aud-workflow-pipeline'></a>

### Next Notebook: [Create and Run an End-to-End Pipeline to Deploy the Model](./5-pipeline-e2e.ipynb)
Now that you, as a data scientist, have manually experimented with each step in the machine learning workflow, you can automate those steps to create and deploy models faster without sacrificing transparency and tracking via model lineage. In the next section, you will create a pipeline that trains a new model on SageMaker, persists the model in SageMaker, adds the model to the registry, and deploys it as a SageMaker hosted endpoint.