# Module 4: Working with the Online Store
**This notebook uses the feature set and the XGBoost model created in `module-3`**

**Note:** Please set kernel to `Python 3 (Data Science)` and set instance to `ml.t3.medium` (2x vCPU, 4GB)

---

# Contents

1. [Background](#Background)
1. [Setup](#Setup)
1. [Create Feature Store helper functions](#Create-Feature-Store-helper-functions)
1. [Fetch a single Customer record from the Online Feature Store](#Fetch-a-single-Customer-record-from-the-Online-Feature-Store)
1. [Batch fetch multiple Product records from the Online Feature Store](#Batch-fetch-multiple-Product-records-from-the-Online-Feature-Store)
1. [Real time inference using the deployed endpoint](#Real-time-inference-using-the-deployed-endpoint)


<br>

<a id='Background'></a>

# Background

In this notebook, we will learn how to extract the records stored in the online Customer and Products feature sets created in `Module-1`.
We will then use the XGBoost model that was deployed to the endpoint in `Module-3`, making inference calls to it using the Customer and Product data we extracted from the feature store as parameters. 
Finally, we will process the response from the XGBoost endpoint to determine whether the customer should be offered a discount on each of the items in their basket.

![Data Flow](../images/FS4_1.png "Data Flow")

### Important Notes
##### `This module depends on Modules 1, 2 and 3 being run first, as it uses the feature groups created in Module 1 and the endpoint created in Module 3.`
##### To keep this example simple, there is no "Orders" feature group connecting the Customer to the Products in a basket, as there would be in a real-life scenario.

<br>

<a id='Setup'></a>

# Setup

##### Upgrade to the latest version of boto3
This ensures the `batch_get_record` API call is available for fetching multiple records from Feature Store in a single operation.

In [None]:
%%capture 

!pip install --upgrade boto3

### Imports

In [None]:
import sagemaker
import logging
import boto3

### Retrieve endpoint and Feature Store group names from previous modules

In [None]:
%store -r customers_feature_group_name
%store -r products_feature_group_name
%store -r endpoint_name

# give the endpoint a more meaningful name as we're using it for a different purpose here.
discount_endpoint_name=endpoint_name 

### Essentials

In [None]:
# Set up SageMaker specific variables
sagemaker_session = sagemaker.Session()

boto_session = boto3.Session()
sagemaker_runtime = boto_session.client('sagemaker-runtime') # for endpoint response
sagemaker_client = sagemaker_session.boto_session.client('sagemaker')
featurestore_runtime = boto_session.client(service_name='sagemaker-featurestore-runtime') # used for FS fetch

### Set up logging

In [None]:
# Use the module's __name__ (not the literal string '__name__') as the logger name
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())

logger.info(f'Using SageMaker version: {sagemaker.__version__}')

<br>

<a id='Create Feature Store helper functions'></a>

# Create Feature Store helper functions
##### These helper functions fetch records from Feature Store and convert them into typed Python dictionaries.
##### The primary function to be aware of is `get_online_feature_group_records`, which calls the `batch_get_record` API of the Feature Store runtime client.

In [None]:
# Helper function to convert the record pulled back from the Feature Store to data of type dictionary.
def _record_to_dict(rec, feature_types):
    tmp_dict = {}
    for f in rec:
        feature_name = f['FeatureName']
        string_feature_val = f['ValueAsString']
        feature_type = feature_types[feature_name]
        
        if feature_type == 'Integral':
            tmp_dict[f['FeatureName']] = int(string_feature_val)
        elif feature_type == 'Fractional':
            tmp_dict[f['FeatureName']] = float(string_feature_val)
        else:
            tmp_dict[f['FeatureName']] = string_feature_val

    return tmp_dict


def get_feature_definitions(fg_name):
    fgdescription = sagemaker_client.describe_feature_group(FeatureGroupName=fg_name)    
    return fgdescription 


def get_online_feature_group_records(fg_name, id_value_list):
    # This function demonstrates how to fetch a batch of records from the online
    # feature store in a single operation using batch_get_record.
    # Previously, we had to call the get_record API once per record and manage the
    # parallelization of those calls ourselves to achieve low latency.
    # Fetching one record at a time is time-consuming and adds operational complexity,
    # so we use batch_get_record() to read multiple records in a single, efficient API call.
    
    feature_defs = get_feature_definitions(fg_name)['FeatureDefinitions']
    feature_types = {}
    feature_names = []
    for fd in feature_defs:
        feature_names.append(fd['FeatureName'])
        feature_types[fd['FeatureName']] = fd['FeatureType']
        
    results = []
    
    identifiers = []
    ids_list = []
    for curr_id in id_value_list:
        record_identifier_value = str(curr_id)
        ids_list.append(record_identifier_value)
    
    identifiers.append({'FeatureGroupName': fg_name,
                        'RecordIdentifiersValueAsString': ids_list,
                        'FeatureNames': feature_names})
        
    resp = featurestore_runtime.batch_get_record(Identifiers=identifiers)
    
    for rec_dict in resp['Records']:
        results.append(_record_to_dict(rec_dict['Record'], feature_types))

    return results


# Count the number of records in the list returned from the feature store
def get_number_of_products_in_feature_set(records):
    # 'records' replaces the original parameter name 'dict', which shadowed the built-in type
    return len(records)
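
To make the shape of the data concrete, the sketch below converts one raw record, in the `FeatureName`/`ValueAsString` form handled by `_record_to_dict`, into a typed dictionary. The feature names, values, and the `to_typed_dict` helper are hypothetical and only illustrate the conversion; they are not taken from the actual feature groups.

```python
# Hypothetical raw record in the form handled by _record_to_dict:
# a list of {'FeatureName': ..., 'ValueAsString': ...} entries.
sample_record = [
    {'FeatureName': 'customer_id', 'ValueAsString': 'C50'},
    {'FeatureName': 'age_bucket', 'ValueAsString': '3'},
    {'FeatureName': 'spend_ratio', 'ValueAsString': '0.25'},
]

# Hypothetical feature types, as they would appear in a feature group's FeatureDefinitions.
sample_types = {
    'customer_id': 'String',
    'age_bucket': 'Integral',
    'spend_ratio': 'Fractional',
}

# Map each Feature Store type to a Python converter; any other type stays a string.
CONVERTERS = {'Integral': int, 'Fractional': float}


def to_typed_dict(record, feature_types):
    typed = {}
    for feature in record:
        name = feature['FeatureName']
        convert = CONVERTERS.get(feature_types[name], str)
        typed[name] = convert(feature['ValueAsString'])
    return typed


typed_record = to_typed_dict(sample_record, sample_types)
print(typed_record)
```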

<br>

<a id='Fetch a single Customer record from the Online Feature Store'></a>

# Fetch a single Customer record from the Online Feature Store

In [None]:
# Fetch details of the customer with the given ID from the Customer feature store group.
# Here, it is used for a single Customer record fetch, but it will also be used later to fetch multiple Product records.
customer_record = get_online_feature_group_records(customers_feature_group_name, ['C50'])


# store the customer_id as we'll need it in the summary section
customer_id=customer_record[0]["customer_id"]

# get the customer feature values as a dict_values view
customer_values = customer_record[0].values()

# convert to a list, e.g. [0, 0, 0, 0, 0, 0, 1, 0, 0]; the Product features are appended to this later.
# the slices drop positions 0 and 3, which are expected to hold customer_id and event_time;
# these are not needed in the vector passed to the inference endpoint
customer_features_list = list(customer_values)[1:3] + list(customer_values)[4:]
#print(customer_features_list) # should be nine Customer features in a list

In [None]:
customer_features_list
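
The positional slices above depend on the order in which the features come back. A minimal alternative sketch selects features by name instead, assuming the identifier and timestamp features are named `customer_id` and `event_time` as the comments suggest. The sample record, its other feature names, and the `feature_vector` helper are hypothetical.

```python
# Hypothetical customer record, shaped like one item returned by
# get_online_feature_group_records (feature name -> typed value).
sample_customer = {
    'customer_id': 'C50',
    'sex': 0,
    'is_married': 1,
    'event_time': '2021-01-01T00:00:00Z',
    'age_18_29': 0,
    'age_30_39': 1,
}

# Features the model does not consume (names assumed, see the note above).
EXCLUDED = ('customer_id', 'event_time')


def feature_vector(record, excluded=EXCLUDED):
    # Dicts preserve insertion order, so the remaining values keep their original order.
    return [value for name, value in record.items() if name not in excluded]


print(feature_vector(sample_customer))
```

For a record laid out like this sample, the result matches the positional slices, but it would not silently shift if a feature were added or reordered.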

<br>

<a id='Batch fetch multiple Product records from the Online Feature Store'></a>

# Batch fetch multiple Product records from the Online Feature Store
##### Fetch a list of randomly selected Products from the Products feature group created in module 1.
##### Up to 100 records can be fetched from an online Feature Store in a single batch operation.
##### For brevity, we will only fetch 10 Product records.

In [None]:
product_records = get_online_feature_group_records(products_feature_group_name, ['P256','P2','P6','P17','P28','P42','P71','P106','P242','P4'])

In [None]:
product_records[0]
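
Because a single batch fetch accepts at most 100 records, a longer list of IDs has to be split before fetching. The sketch below shows one way to do that. The `chunked` helper and the generated product IDs are illustrative only.

```python
MAX_BATCH_SIZE = 100  # limit stated above for a single batch fetch


def chunked(ids, size=MAX_BATCH_SIZE):
    # Yield consecutive slices of at most `size` identifiers.
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Hypothetical list of 250 product IDs: P1 ... P250
all_product_ids = [f'P{n}' for n in range(1, 251)]

batches = list(chunked(all_product_ids))
print([len(batch) for batch in batches])  # [100, 100, 50]
```

Each batch could then be passed to `get_online_feature_group_records` in turn and the returned lists concatenated.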

<br>

<a id='Real time inference using the deployed endpoint'></a>

# Real time inference using the deployed endpoint
##### Now we will combine the vector of Customer features with a vector of Product features, one per Product in the customer basket.
##### We will then pass the combined vectors to the inference endpoint we created in Module 3 in a single call, with one CSV row per Product.
##### The value returned from the inference call will determine if each Product in the Customer's basket should have a discount applied.

In [None]:
full_payload = ''

print('Customer ID is ' + customer_id)

# Process the set of features for each Product in the Customer's basket
products_in_basket_count = get_number_of_products_in_feature_set(product_records)

# Now iterate through the Product records and combine the data for each Product with that of the single Customer to whom the basket belongs.
for i in range(products_in_basket_count):
    
    # store the product_id of the current Product record as we'll need it after the inference
    product_id=product_records[i]["product_id"] 
    
    # get feature values for this Product from the record retrieved from Feature Store
    product_values = product_records[i].values() 
    
    # convert feature values to a list of product features, e.g.: [0, 0, 0, 0, 0, 0, 1, 0, .....,0]
    # be sure to skip the first two features (product_id and event_time) since the model doesn't use them
    product_features_list = list(product_values)[2:]    

    # Concatenate the transformed Customer and Product parameter lists into a single 29 x 1 vector
    inferencepayloadlist = customer_features_list + product_features_list

    # convert the combined payload list into a string ready for inference
    inferencepayloadstring=','.join([str(item) for item in inferencepayloadlist])
    
    # concatenate this payload onto the full multi-record payload, separating with a csv newline character
    full_payload = full_payload + inferencepayloadstring + '\n'

In [None]:
print(full_payload)

In [None]:
# Call the inference endpoint we created in module 3, passing all ten sets of Customer+Product parameters
endpoint_response = sagemaker_runtime.invoke_endpoint(EndpointName=discount_endpoint_name,
                                                      Body=full_payload,
                                                      ContentType='text/csv')

# Each rating in the response is expected to be in the range 0.0 - 1.0
product_discount_ratings = endpoint_response['Body'].read().decode()

In [None]:
product_discount_ratings

In [None]:
# Split out the individual ratings from the endpoint response (assumed to be comma-separated)
ratings_list = product_discount_ratings.split(',')
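
The split above assumes the endpoint returns comma-separated scores. Some model containers may instead return one score per line, so a more tolerant parser can be useful. The sketch below is a hypothetical helper (`parse_ratings` is not part of the original notebook) that accepts either separator and optionally checks that one score came back per Product.

```python
import re


def parse_ratings(response_text, expected_count=None):
    # Split on commas and/or whitespace (including newlines), dropping empty tokens.
    tokens = [t for t in re.split(r'[,\s]+', response_text.strip()) if t]
    ratings = [float(t) for t in tokens]
    if expected_count is not None and len(ratings) != expected_count:
        raise ValueError(f'Expected {expected_count} ratings, got {len(ratings)}')
    return ratings


print(parse_ratings('0.12,0.81,0.65'))
print(parse_ratings('0.12\n0.81\n0.65\n', expected_count=3))
```

In this notebook it could be called as `parse_ratings(product_discount_ratings, expected_count=len(product_records))`.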

In [None]:
# Set an arbitrary discount rating threshold between 0.0 and 1.0
discount_rating_threshold = 0.65

# Display discount or standard pricing for each Product in the Customer's basket.
for i in range(len(product_records)):
    rating = float(ratings_list[i])
    product_id = product_records[i]['product_id']
    if rating < discount_rating_threshold:
        print(f'   Product ID {product_id} discount rating = {rating:2.2f}: Standard list price.')
    else:
        print(f'   Product ID {product_id} discount rating = {rating:2.2f}: Offer discount or subscription to customer.')