# Module 4: Inference Patterns - Endpoint-Based Feature Lookup
**This notebook uses the feature groups created in `module 1` and `module 2` and the model trained in `module 3` to show how to look up features from the online feature store in real time from within an endpoint.**

**Note:** Please set the kernel to `Python 3 (Data Science)` and the instance type to `ml.t3.medium`.


# Contents

1. [Background](#Background)
2. [Setup](#Setup)
3. [Load feature groups names](#feature_names)
4. [Load the trained model to use for inference](#load_train_model)
5. [Deploy the model](#deploy_model)
6. [Make inference](#Real-time-inference-using-the-deployed-endpoint)


# Background
In this notebook, we demonstrate how to retrieve features from two online feature groups within an endpoint.

We use the feature sets derived and ingested into the online feature store in Modules 1 and 2, together with the model trained in Module 3: a SageMaker XGBoost binary classifier that predicts whether a given user adds a given product to their basket.

In this notebook, we deploy the trained model from its model artifact to a SageMaker endpoint for real-time inference. The endpoint reads features from two online feature groups (the customers and products feature groups created in Module 2). The request body contains only a customer ID and a product ID. The endpoint uses these IDs to retrieve the associated features from the low-latency online feature store and passes them to the model for real-time inference.

To do this, we create a custom inference script that specifies the feature groups and the features to retrieve. The script uses a custom library (`helper.py`) whose helper functions return the lookup results from Feature Store. The returned features are then fed into the model for inference.
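
The workshop's `helper.py` is not reproduced here. As a rough illustration of what an online lookup involves, the sketch below wraps the `get_record` call of the boto3 `sagemaker-featurestore-runtime` client. The function name `get_online_features` and its arguments are hypothetical and are not taken from `helper.py`.

```python
def get_online_features(client, feature_group_name, record_id, feature_names=None):
    """Fetch the latest record for one identifier from an online feature group.

    `client` is expected to be a boto3 "sagemaker-featurestore-runtime" client.
    Returns a dict mapping feature name to its string value, or an empty dict
    when no record exists for the identifier.
    """
    kwargs = {
        "FeatureGroupName": feature_group_name,
        "RecordIdentifierValueAsString": str(record_id),
    }
    if feature_names:
        # Leaving FeatureNames out asks for every feature in the group.
        kwargs["FeatureNames"] = list(feature_names)
    response = client.get_record(**kwargs)
    return {f["FeatureName"]: f["ValueAsString"] for f in response.get("Record", [])}


# Example usage (requires AWS credentials and an existing feature group):
# import boto3
# client = boto3.Session().client("sagemaker-featurestore-runtime")
# get_online_features(client, customers_feature_group_name, "C50")
```

Passing the client in as an argument keeps the function easy to exercise with a stand-in object when no AWS account is available.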

Take a few minutes to review the following architecture:

![Inference endpoint lookup](../images/m4_nb2_inference_pattern.png "Inference endpoint feature look up")


# Setup

In [None]:
from sagemaker.serializers import CSVSerializer
from sagemaker.inputs import TrainingInput
from sagemaker.predictor import Predictor
from sagemaker import get_execution_role
import pandas as pd
import numpy as np
import sagemaker
import logging
import json
import os
import sys
sys.path.append('..')
#!pip install boto3 --upgrade
import boto3


### Essentials


In [None]:
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())

In [None]:
sagemaker_execution_role = get_execution_role()
logger.info(f'Role = {sagemaker_execution_role}')
session = boto3.Session()
sagemaker_session = sagemaker.Session()
sagemaker_client = session.client(service_name="sagemaker")
default_bucket = sagemaker_session.default_bucket()
prefix = 'sagemaker-featurestore-workshop'
s3 = session.resource('s3')


# Load feature groups names <a class="anchor" id="feature_names"></a>

In [None]:
%store -r customers_feature_group_name
%store -r products_feature_group_name
%store -r endpoint_name

# Load the trained model to use for inference <a class="anchor" id="load_train_model"></a>
We use the model artifacts produced by the training job in Module 3 for inference, and customise the model's inference logic as described below.

In [None]:
%store -r training_jobName

from sagemaker.xgboost.model import XGBoostModel

training_job_info = sagemaker_client.describe_training_job(
    TrainingJobName=training_jobName
)
xgb_model_data = training_job_info["ModelArtifacts"]["S3ModelArtifacts"]


# Deploy the model <a class="anchor" id="deploy_model"></a>
We deploy our model to a real-time endpoint and customise it by passing an inference script as well as our helper Python library.

First we prepare an `inference_entry.py` entry script. Pay attention to the following details in the script:

1. The script defines a list, `feature_list`, that pairs each feature group name with feature names. You can select specific features by listing their names, or all features by specifying `*`.
For example:

To get selected features from the customer and product feature groups:

**`feature_list=['fscw-products-10-18-00-12: age_60-69, age_70-plus',
              'fscw-customers-10-18-00-12: category_packaged_cheese']`**

To get all features from the customer and product feature groups:

**`feature_list=['fscw-products-10-18-00-12:*',
              'fscw-customers-10-18-00-12:*']`**

2. Our model was built using all the features from the two feature groups, so we extract all of them.

3. The script defines a custom `input_fn` function. The request body consists of a customer ID and a product ID, which `input_fn` passes to the Feature Store client through the helper library.

4. The script prints the time taken to retrieve the features from the feature groups, so you can see it in the endpoint's CloudWatch logs.
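
To make the `feature_list` format from item 1 concrete, here is a minimal sketch of how such entries could be split into a feature group name and a list of feature names. `parse_feature_list` is a hypothetical function written for illustration, and the real parsing inside `helper.py` may differ.

```python
def parse_feature_list(feature_list):
    """Split 'group: feat_a, feat_b' entries into (group_name, feature_names).

    A feature spec of '*' is returned as None, meaning "all features".
    """
    parsed = []
    for entry in feature_list:
        # Everything before the first ':' is the feature group name.
        group_name, _, spec = entry.partition(":")
        spec = spec.strip()
        if spec == "*":
            names = None
        else:
            names = [name.strip() for name in spec.split(",") if name.strip()]
        parsed.append((group_name.strip(), names))
    return parsed


selected = parse_feature_list([
    "fscw-products-10-18-00-12: age_60-69, age_70-plus",
    "fscw-customers-10-18-00-12:*",
])
print(selected)
```

Representing `*` as `None` lets a caller decide whether to send an explicit list of feature names with the lookup or to request every feature in the group.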


In [None]:
%%writefile custom_library/inference_entry.py

import json
from io import StringIO
import os
import pickle as pkl
import joblib
import time
import sys
import subprocess
import numpy as np
from sagemaker_containers.beta.framework import (
    content_types,
    encoders,
    env,
    modules,
    transformer,
    worker,
)


import pandas as pd
import boto3
import sagemaker
import helper
import sagemaker_xgboost_container.encoder as xgb_encoders

boto_session = boto3.Session()
region = boto_session.region_name
print(region)

# Feature groups to query, with '*' selecting every feature.
# These names must match the feature groups that exist in your account.
feature_list = ['fscw-products-10-18-00-12:*', 'fscw-customers-10-18-00-12:*']


def model_fn(model_dir):
    """
    Deserialize and return fitted model.
    """
    model_file = "xgboost-model"
    # Use a context manager so the file handle is closed after loading.
    with open(os.path.join(model_dir, model_file), "rb") as f:
        booster = pkl.load(f)
    return booster

def input_fn(request_body, request_content_type):
    """
    The SageMaker XGBoost model server receives the request data body and the content type,
    and invokes the `input_fn`.
    Return a DMatrix (an object that can be passed to predict_fn).
    """
    print(request_content_type)
    if request_content_type == "text/csv":
        # The request body is "<customer_id>,<product_id>"; strip stray whitespace around each ID.
        params = [p.strip() for p in request_body.split(',')]
        id_dict = {'customer_id': params[0], 'product_id': params[1]}
        start = time.time()
        records = helper.get_latest_featureset_values_e(id_dict, feature_list)
        end = time.time()
        duration = end - start
        # This line ends up in the endpoint's CloudWatch logs.
        print("time to lookup features from two feature stores:", duration)
        # Join the feature values into a single CSV row for the model.
        records = ",".join([str(e) for e in records.values()])
        return xgb_encoders.csv_to_dmatrix(records)
    else:
        raise ValueError("{} not supported by script!".format(request_content_type))
        
    

In [None]:
from sagemaker.xgboost.model import XGBoostModel
from sagemaker.serializers import CSVSerializer

instance_type = "ml.m4.xlarge"

xgboost_inference_feature = XGBoostModel(
    model_data= xgb_model_data,
    role=sagemaker_execution_role,
    source_dir= './custom_library',
    entry_point="inference_entry.py",
    framework_version="1.2-2")

predictor_feature = xgboost_inference_feature.deploy(
    initial_instance_count=1, instance_type=instance_type)

# Make inferences using the deployed model <a class="anchor" id="Real-time-inference-using-the-deployed-endpoint"></a>

In [None]:
predictor_feature.serializer = CSVSerializer()

cust_id='C50'
prod_id='P2'
# No space after the comma, so the product ID is not sent with a leading space.
test_data = f'{cust_id},{prod_id}'
print(test_data)

print(predictor_feature.predict(test_data))
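
The custom `input_fn` prints the lookup duration, so each request leaves a timing line in the endpoint's CloudWatch logs. The sketch below shows one way to read those lines back with the CloudWatch Logs `filter_log_events` call. It assumes the endpoint writes to the usual `/aws/sagemaker/Endpoints/<endpoint name>` log group, and `find_lookup_timings` is a hypothetical helper that is not part of the workshop code.

```python
def find_lookup_timings(logs_client, endpoint_name, limit=20):
    """Return the feature lookup durations (in seconds) logged by the endpoint.

    `logs_client` is expected to be a boto3 "logs" client.
    """
    response = logs_client.filter_log_events(
        # Assumed log group naming for SageMaker endpoints.
        logGroupName=f"/aws/sagemaker/Endpoints/{endpoint_name}",
        # Quoted pattern: match the exact phrase printed by input_fn.
        filterPattern='"time to lookup features"',
        limit=limit,
    )
    durations = []
    for event in response.get("events", []):
        # The message ends with ": <seconds>", so take the text after the last colon.
        value = event["message"].rsplit(":", 1)[-1].strip()
        try:
            durations.append(float(value))
        except ValueError:
            continue  # skip lines that do not end in a number
    return durations


# Example usage (requires AWS credentials and a deployed endpoint):
# import boto3
# logs_client = boto3.client("logs")
# print(find_lookup_timings(logs_client, predictor_feature.endpoint_name))
```

Log events can take a short while to appear after a request, so an empty result immediately after the first prediction is not necessarily an error.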

### *Optional* - if you would like to experiment and see how fast features are returned from the two feature stores, you can run the same lookup logic as the inference endpoint with the following code

In [None]:
import numpy as np
import sagemaker
import time
import pandas as pd
from custom_library import helper

feature_list=['fscw-products-10-18-00-12:*','fscw-customers-10-18-00-12:*']


def input_fn(request_data, request_content_type):
    """Look up the features for one customer/product pair and time the lookup."""
    if request_content_type == "text/csv":
        # The payload is "<customer_id>,<product_id>"; strip stray whitespace around each ID.
        params = [p.strip() for p in request_data.split(',')]
        id_dict = {'customer_id': params[0], 'product_id': params[1]}
        start = time.time()
        records = helper.get_latest_featureset_values(id_dict, feature_list)
        end = time.time()
        duration = end - start
        print("time to lookup features from two feature stores:", duration)
        records = ",".join([str(e) for e in records.values()])
        return records, duration
    else:
        raise ValueError("{} not supported by script!".format(request_content_type))


In [None]:
cust_id='C45'
prod_id='P26'
payload= f'{cust_id},{prod_id}'
print(payload)

input_fn(payload, "text/csv")
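
A single call gives only one timing sample. The sketch below repeats a lookup several times and summarises the latencies. `measure_latency` is a hypothetical helper, and the lambda passed to it at the end is a stand-in so the block runs without AWS access. In this notebook you would pass a wrapper around `input_fn` instead, as shown in the comment.

```python
import math
import statistics
import time


def measure_latency(lookup, payload, runs=20):
    """Call `lookup(payload)` repeatedly and summarise the wall-clock latency in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        lookup(payload)
        durations.append(time.perf_counter() - start)
    durations.sort()
    # Nearest-rank 95th percentile of the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(durations)) - 1)
    return {
        "mean": statistics.mean(durations),
        "p95": durations[p95_index],
        "max": durations[-1],
    }


# Stand-in lookup so the sketch runs anywhere; in the notebook use, for example:
# measure_latency(lambda p: input_fn(p, "text/csv"), "C45,P26")
stats = measure_latency(lambda p: p.split(","), "C45,P26", runs=5)
print(stats)
```

The first call is often slower than later ones because clients and connections are created lazily, so the maximum is worth reading alongside the mean.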