# Building Personalize ReRanking

## Imports 

Python ships with a broad collection of libraries. We need to import some of those, along with installed packages such as boto3 (the AWS SDK for Python) and pandas and NumPy, which are core data science tools.

In [1]:
# Imports
import boto3
import json
import numpy as np
import pandas as pd
import time


Next, set up the clients your environment will use to communicate with Amazon Personalize. Creating a client does not itself call the service, so a permissions problem would only surface on the first API call.

In [2]:
# Configure the SDK to Personalize:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')
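
Creating the clients does not by itself prove connectivity. One way to check it, sketched below, is to make a cheap read-only call such as `list_dataset_groups`. The helper name `can_reach_personalize` is hypothetical and not part of the original notebook; it accepts any client object that exposes that method.

```python
def can_reach_personalize(client):
    """Return True if a cheap read-only Personalize call succeeds.

    `client` is assumed to be the object returned by
    boto3.client('personalize'); the helper itself is illustrative.
    """
    try:
        # Read-only call; maxResults=1 keeps the response small.
        client.list_dataset_groups(maxResults=1)
    except Exception as err:  # e.g. missing credentials, no permission, no network
        print("Could not reach Amazon Personalize: {}".format(err))
        return False
    return True
```

In this notebook it could be called as `can_reach_personalize(personalize)` right after the clients are created.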

In [3]:
%store -r dataset_group_arn
%store -r Rerank_arn
%store -r useritems
%store -r bucket

## Create the Solution and Version

In Amazon Personalize, a trained model is called a Solution. Each Solution can have many versions, each corresponding to the data that was available when the model was trained.

A recipe is an algorithm that has not yet been trained on your data. Amazon Personalize supports several recipes; here you'll select the personalized ranking recipe and use it to build your model.

### Select Recipe

In [4]:
recipe_arn = "arn:aws:personalize:::recipe/aws-personalized-ranking" 
# aws-personalized-ranking selected for this solution. 
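
The recipe ARN above is hard-coded. If you would rather look it up, a minimal sketch is to page through `list_recipes` and match on the recipe name. The helper `find_recipe_arn` is a hypothetical name introduced here for illustration, and `client` is assumed to be the `personalize` client created earlier.

```python
def find_recipe_arn(client, recipe_name):
    """Page through list_recipes and return the ARN of the named recipe, or None."""
    kwargs = {}
    while True:
        page = client.list_recipes(**kwargs)
        for recipe in page.get("recipes", []):
            if recipe["name"] == recipe_name:
                return recipe["recipeArn"]
        # Follow the pagination token until the service stops returning one.
        token = page.get("nextToken")
        if not token:
            return None
        kwargs["nextToken"] = token
```

For this notebook the call would be `find_recipe_arn(personalize, "aws-personalized-ranking")`.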

### Create and Wait for Solution

First you will create the solution with the API, then you will create a version. Training the model, and thus creating your solution version, takes several minutes. Once training has started and you see the in-progress notifications, it is a good time to take a break, grab a coffee, etc.

#### Create Solution

In [5]:
create_solution_response = personalize.create_solution(
    name = "personalize-demo-soln-personal-ranking",   # Please change the solution name if you are changing the recipe
    datasetGroupArn = dataset_group_arn,
    recipeArn = recipe_arn
)

prr_solution_arn = create_solution_response['solutionArn']
print(json.dumps(create_solution_response, indent=2))

{
  "solutionArn": "arn:aws:personalize:us-east-1:011777388888:solution/personalize-demo-soln-personal-ranking",
  "ResponseMetadata": {
    "RequestId": "677d6672-3e4e-420e-b028-918d35970306",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Thu, 27 Feb 2020 05:40:31 GMT",
      "x-amzn-requestid": "677d6672-3e4e-420e-b028-918d35970306",
      "content-length": "108",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Create Solution Version

In [6]:
create_solution_version_response = personalize.create_solution_version(
    solutionArn = prr_solution_arn
)

prr_solution_version_arn = create_solution_version_response['solutionVersionArn']
print(json.dumps(create_solution_version_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:011777388888:solution/personalize-demo-soln-personal-ranking/7c934dd6",
  "ResponseMetadata": {
    "RequestId": "23014adf-c7d9-4cc8-8b09-387a12507d62",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Thu, 27 Feb 2020 05:40:31 GMT",
      "x-amzn-requestid": "23014adf-c7d9-4cc8-8b09-387a12507d62",
      "content-length": "124",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Wait for Solution Version to Have ACTIVE Status

This will take at least 20 minutes.

In [7]:
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_solution_version_response = personalize.describe_solution_version(
        solutionVersionArn = prr_solution_version_arn
    )
    status = describe_solution_version_response["solutionVersion"]["status"]
    print("SolutionVersion: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

SolutionVersion: CREATE PENDING
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGRESS
SolutionVersion: CREATE IN_PROGR
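
The same polling pattern is used again for the campaign later on. A minimal sketch of a reusable version follows; unlike the loop above it raises on `CREATE FAILED` and on timeout rather than silently falling through. The name `wait_until_created` and its parameters are assumptions made for illustration, not part of the original notebook.

```python
import time


def wait_until_created(get_status, timeout_s=3 * 60 * 60, poll_s=60, sleep=time.sleep):
    """Poll get_status() until it returns ACTIVE; raise on failure or timeout."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        status = get_status()
        print("Status: {}".format(status))
        if status == "ACTIVE":
            return status
        if status == "CREATE FAILED":
            raise RuntimeError("Resource creation failed")
        sleep(poll_s)
    raise TimeoutError("Timed out waiting for ACTIVE status")


# Possible usage in this notebook (not executed here):
# wait_until_created(
#     lambda: personalize.describe_solution_version(
#         solutionVersionArn=prr_solution_version_arn
#     )["solutionVersion"]["status"]
# )
```

Passing the status lookup as a callable keeps the helper independent of which resource is being created, and the `sleep` parameter makes it easy to exercise without real delays.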

#### Get Metrics of Solution Version

Now that your solution and version exist, you can obtain metrics to judge the model's performance. These metrics are not particularly good because this is a demo dataset, but with larger, more complex datasets you should see improvements.

In [8]:
get_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = prr_solution_version_arn
)

print(json.dumps(get_solution_metrics_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:011777388888:solution/personalize-demo-soln-personal-ranking/7c934dd6",
  "metrics": {
    "coverage": 0.0014,
    "mean_reciprocal_rank_at_25": 0.0039,
    "normalized_discounted_cumulative_gain_at_10": 0.0058,
    "normalized_discounted_cumulative_gain_at_25": 0.0072,
    "normalized_discounted_cumulative_gain_at_5": 0.004,
    "precision_at_10": 0.001,
    "precision_at_25": 0.0006,
    "precision_at_5": 0.001
  },
  "ResponseMetadata": {
    "RequestId": "1dcd2836-7911-4164-81c0-24848be2dbb7",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Thu, 27 Feb 2020 06:35:36 GMT",
      "x-amzn-requestid": "1dcd2836-7911-4164-81c0-24848be2dbb7",
      "content-length": "419",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}
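
To make the metric names above more concrete, here is a small sketch of the textbook definitions of precision at K, reciprocal rank, and NDCG at K with binary relevance. It is an illustration under that assumption only: the exact computation Amazon Personalize performs, including how it averages over users, may differ. The item IDs are made up.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def reciprocal_rank(ranked, relevant, k):
    """1 / position of the first relevant item in the top k, or 0.0 if none."""
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def ndcg_at_k(ranked, relevant, k):
    """Discounted gain of the top k, normalized by the best possible ordering."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, item in enumerate(ranked[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


ranked = ["a", "b", "c", "d", "e"]   # hypothetical model output, best first
relevant = {"b", "e"}                # hypothetical items the user interacted with
print(precision_at_k(ranked, relevant, 5))    # 0.4
print(reciprocal_rank(ranked, relevant, 5))   # 0.5
print(round(ndcg_at_k(ranked, relevant, 5), 3))
```

Mean reciprocal rank is the average of the reciprocal rank over all evaluated users.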


## Create and Wait for the Campaign

Now that you have a working solution version, you will need to create a campaign to use it with your applications. A campaign is simply a hosted copy of your model. Again there will be a short wait, so after executing the cell you can take a quick break while the infrastructure is provisioned.

#### Create Campaign

In [9]:
create_campaign_response = personalize.create_campaign(
    name = "personalize-campaign-reranking",
    solutionVersionArn = prr_solution_version_arn,
    minProvisionedTPS = 1
)

prr_campaign_arn = create_campaign_response['campaignArn']
print(json.dumps(create_campaign_response, indent=2))

{
  "campaignArn": "arn:aws:personalize:us-east-1:011777388888:campaign/personalize-campaign-reranking",
  "ResponseMetadata": {
    "RequestId": "328d3ea6-9572-4fbf-a470-53a3df09dee3",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "content-type": "application/x-amz-json-1.1",
      "date": "Thu, 27 Feb 2020 06:35:36 GMT",
      "x-amzn-requestid": "328d3ea6-9572-4fbf-a470-53a3df09dee3",
      "content-length": "100",
      "connection": "keep-alive"
    },
    "RetryAttempts": 0
  }
}


#### Wait for Campaign to Have ACTIVE Status

In [10]:
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:
    describe_campaign_response = personalize.describe_campaign(
        campaignArn = prr_campaign_arn
    )
    status = describe_campaign_response["campaign"]["status"]
    print("Campaign: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)

Campaign: CREATE PENDING
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: CREATE IN_PROGRESS
Campaign: ACTIVE


## Get Sample ReRanking



In [11]:
# Getting a random item list
datasample = useritems.sample(n=20)
# Take the item IDs from the second column; the ranking API expects strings
sampleitemlist = [str(item) for item in datasample.iloc[:, 1]]

# Getting a random user
user_id, item_id, _ = useritems.sample().values[0]

# print("USER: {}".format(user_id))
# print("ItemList:")
# print(sampleitemlist)

In [12]:
# First load items into memory
%store -r bucket
allitemuri = f's3://{bucket}/items_w_metadata.csv'
items = pd.read_csv(allitemuri, sep=',', usecols=[0,1], names=['asin', 'title'], index_col='asin',header=0)

# print(items)

def get_allstore_products(asin):
    """
    Takes in an ID, returns a title
    """
    asin = str(asin)
    return items.loc[asin]['title']

#### Call GetPersonalizedRanking

Using the user that you obtained above, the lines below will rerank the sampled item list for that user and return the item titles in personalized order.


In [13]:
get_recommendations_response = personalize_runtime.get_personalized_ranking(
    campaignArn=prr_campaign_arn,
    inputList=sampleitemlist,
    userId = str(user_id)
)
# Update DF rendering
pd.set_option('display.max_rows', 30)

print("Recommendations for user: ", user_id)

item_list = get_recommendations_response['personalizedRanking']


recommendation_list = []

for item in item_list:
    title = get_allstore_products(item['itemId'])
    recommendation_list.append(title)  
    
recommendations_df = pd.DataFrame(recommendation_list, columns = ['Item Description'])
recommendations_df

Recommendations for user:  A377DC2CPICKNZ


| | Item Description |
|---|---|
| 0 | Hoppe's BoreSnake Rifle Bore Cleaner (Choose Y... |
| 1 | Lee Precision Universal Press Shell Holder Set... |
| 2 | GLOCK DISASSEMBLY TOOL |
| 3 | AccuShot Premium 1-Inch Weaver Style See-Thru ... |
| 4 | Pelican 1750 Protect Case Hard 50.5X13.5X5 175... |
| 5 | DeSantis Sof-Tuck S&W Shield Right Hand Tan |
| 6 | NcStar Ruger Mini 14/Mini 30 Ranch Rifle Weave... |
| 7 | Magpul iPhone 3 Field Case, Black |
| 8 | Bicycle Bell black Alloy mini by Biria |
| 9 | Sport-Brella Versa-Brella All Position Umbrell... |
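
To see what the reranking actually changed, it can help to compare each item's position in the input list with its position in the response. The sketch below assumes the response shape used above (a `personalizedRanking` list of entries with an `itemId` key); the helper name `rank_changes` and the sample IDs are hypothetical.

```python
def rank_changes(input_list, ranking_response):
    """Return (item_id, original_position, new_position) for each ranked item.

    Positions are zero-based. If an ID appears more than once in input_list,
    its last position is used.
    """
    original_position = {item_id: pos for pos, item_id in enumerate(input_list)}
    changes = []
    for new_pos, entry in enumerate(ranking_response["personalizedRanking"]):
        item_id = entry["itemId"]
        changes.append((item_id, original_position.get(item_id), new_pos))
    return changes


# Made-up data in the same shape as the API response
sample_input = ["B001", "B002", "B003"]
sample_response = {"personalizedRanking": [{"itemId": "B003"}, {"itemId": "B001"}, {"itemId": "B002"}]}
for item_id, before, after in rank_changes(sample_input, sample_response):
    print("{}: {} -> {}".format(item_id, before, after))
```

In this notebook the call would be `rank_changes(sampleitemlist, get_recommendations_response)`.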


## Update Lambda Configuration

In [14]:
lambdaclient = boto3.client('lambda')

response = lambdaclient.get_function(
    FunctionName=Rerank_arn,
)

Environment = response['Configuration']['Environment']
print(Environment)

#update Env var

Environment['Variables']['RANKING_ARN'] = prr_campaign_arn

response = lambdaclient.update_function_configuration(
    FunctionName=Rerank_arn,
    Environment=Environment
)

{'Variables': {'DDB_TABLE': 'pstore-Items', 'SEARCH_ARN': 'arn:aws:lambda:us-east-1:011777388888:function:pstore-Search', 'RANKING_ARN': 'arn:aws:personalize:us-east-1:387269085412:campaign/personalize-demo-soln-ranking', 'ESENDPOINT': 'vpc-pstore-domain-73zivgp5mdco232mldmdogsppm.us-east-1.es.amazonaws.com'}}
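
The cell above reads the existing environment before writing it back because, as far as I know, `update_function_configuration` replaces the function's environment variables as a whole rather than merging them. A slightly more defensive sketch of that merge is shown below. It assumes the configuration may have no `Environment` key at all (for example when the function has no variables yet), and it sends back only the `Variables` map. The helper name `merged_environment` is hypothetical.

```python
def merged_environment(function_config, updates):
    """Build an Environment argument that keeps existing variables and adds updates.

    `function_config` is assumed to be the 'Configuration' dict returned by
    get_function; only the 'Variables' map is carried over.
    """
    current = function_config.get("Environment", {}).get("Variables", {})
    variables = dict(current)   # copy so the original response is not mutated
    variables.update(updates)
    return {"Variables": variables}


# Made-up configuration in the same shape as the Lambda response
config = {"Environment": {"Variables": {"DDB_TABLE": "example-table"}}}
print(merged_environment(config, {"RANKING_ARN": "arn:example"}))
```

In this notebook the result could be passed as `Environment=merged_environment(response['Configuration'], {'RANKING_ARN': prr_campaign_arn})`.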


## Review

Using the code above, you have successfully trained a deep learning model to generate personalized rankings of Allstore merchandise based on prior user behavior. Think about other types of problems where this data is available and what it might look like to build a system like this to offer those recommendations.

Now you are ready to move on to the next notebook, `3.Building_Campaign_P-rank.ipynb`.

