# Training a Recommender <a class="anchor" id="top"></a>


## Outline

1. [Introduction](#intro)
1. [Create a "Top picks for you" Recommender](#recommenders)
1. [Evaluate recommenders](#eval)
1. [Using Evaluation Metrics](#usemetrics)


## Introduction <a class="anchor" id="intro"></a>

In the previous notebook, [`02_Train_Personalize_Model_01_Data.ipynb`](02_Train_Personalize_Model_01_Data.ipynb), you prepared datasets that represent user interactions and media catalog data, and created the corresponding Datasets in Amazon Personalize.

In this notebook we will train a Domain Optimized Recommender that returns video recommendations. The goal is to recommend videos that are relevant to a particular user.


## Define your Use Case <a class="anchor" id="usecase"></a>
[Back to top](#top)

There are a few guidelines for scoping a problem suitable for Amazon Personalize. We recommend the values below as a starting point, although the [official limits](https://docs.aws.amazon.com/personalize/latest/dg/limits.html) lie a little lower.

* Authenticated users
* At least 50 unique users
* At least 100 unique items
* At least 2 dozen interactions for each user 

Most of the time this is easily attainable, and if you are low in one category, you can often make up for it by having a larger number in another category.
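
The guidelines above can be checked against an interactions export before any data is imported. The following is a minimal sketch, assuming the interactions are available as `(user_id, item_id)` pairs. The function name `check_minimums` and the choice to check the per-user guideline as an average are assumptions made for illustration; they are not part of the Amazon Personalize API.

```python
from collections import Counter

# Starting-point thresholds taken from the guidelines above.
MIN_USERS = 50
MIN_ITEMS = 100
MIN_INTERACTIONS_PER_USER = 24  # "2 dozen"

def check_minimums(interactions):
    """Summarize (user_id, item_id) pairs against the suggested minimums.

    `interactions` is a hypothetical in-memory list; in practice it would be
    read from the interactions file prepared in the previous notebook.
    """
    per_user = Counter(user_id for user_id, _ in interactions)
    unique_items = {item_id for _, item_id in interactions}
    # Assumption: the per-user guideline is checked as an average per user.
    avg_per_user = len(interactions) / len(per_user) if per_user else 0.0
    return {
        "unique_users": len(per_user),
        "unique_items": len(unique_items),
        "avg_interactions_per_user": avg_per_user,
        "enough_users": len(per_user) >= MIN_USERS,
        "enough_items": len(unique_items) >= MIN_ITEMS,
        "enough_interactions": avg_per_user >= MIN_INTERACTIONS_PER_USER,
    }
```

A result with one flag set to `False` does not rule the dataset out, since a shortfall in one category can often be offset by a surplus in another.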

The user-item interaction data is key for getting started with the service. This means we need to look for use cases that generate that kind of data. A few common examples are:

1. Video-on-demand applications
1. E-commerce platforms

Defining your use case determines which data, and which types of data, you need.

### Train Models and create APIs for recommendations

In this section we will create a Video on Demand Use Case Optimized Recommender for the following use case:

1. [Top picks for you](https://docs.aws.amazon.com/personalize/latest/dg/VIDEO_ON_DEMAND-use-cases.html#top-picks-use-case): personalized content recommendations for a user that you specify. With this use case, Amazon Personalize automatically filters videos the user watched based on the userId that you specify and Watch events.


Similar to the previous notebook, start by importing the relevant packages, and set up a connection to Amazon Personalize using the SDK.

In [None]:
import time
from time import sleep
import json
from datetime import datetime
import uuid
import random
import boto3
import botocore
from botocore.exceptions import ClientError

In [None]:
# retrieve the saved variables from the previous notebook
%store -r

In [None]:
# Configure the SDK to Personalize:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

## How to train your Use Case Optimized Recommender

Below we will walk you through the steps we used to create these resources. 

### Ready... Set... Train!

Now that the data is imported and ready for use, we will create the Video on Demand Use Case Optimized Recommender for the "Top picks for you" use case described above.


## Create a "Top picks for you" Recommender <a class="anchor" id="recommenders"></a>
[Back to top](#top)

We'll create a pre-configured VIDEO_ON_DEMAND Recommender that matches our use case. 

Each domain has different use cases. When you create a recommender you create it for a specific use case, and each use case has different requirements for getting recommendations.

Let us look at the recommenders supported for the VIDEO_ON_DEMAND domain:

In [None]:
# Page through list_recipes until no nextToken is returned
list_kwargs = {'domain': 'VIDEO_ON_DEMAND'}
display_available_recipes = []
while True:
    available_recipes = personalize.list_recipes(**list_kwargs)
    display_available_recipes += available_recipes['recipes']
    if 'nextToken' not in available_recipes:
        break
    list_kwargs['nextToken'] = available_recipes['nextToken']
display(display_available_recipes)

We are going to create a recommender of the type "Top picks for you". This type of recommender offers personalized streaming content recommendations for a user that you specify. With this use case, Amazon Personalize automatically filters videos the user watched based on the userId that you specify and Watch events.

We will add:

```python
recommenderConfig = {"enableMetadataWithRecommendations": True}
```
to include item metadata in the response to a GetRecommendations request. This is optional, but if it is not enabled, the recommender returns only the itemIds of the recommended items. You can find more about this feature in the [documentation on creating recommenders](https://docs.aws.amazon.com/personalize/latest/dg/creating-recommenders.html).

In [None]:
try:
    create_recommender_response = personalize.create_recommender(
        name = recommender_top_picks_for_you_name,
        recipeArn = 'arn:aws:personalize:::recipe/aws-vod-top-picks',
        datasetGroupArn = workshop_dataset_group_arn,
        recommenderConfig = {"enableMetadataWithRecommendations": True}
    )
    workshop_recommender_top_picks_arn = create_recommender_response["recommenderArn"]
    
    print (json.dumps(create_recommender_response))
    print ('\nCreating the Top Picks For You recommender with workshop_recommender_top_picks_arn = {}'.format(workshop_recommender_top_picks_arn))
    
except personalize.exceptions.ResourceAlreadyExistsException as e:
    workshop_recommender_top_picks_arn =  'arn:aws:personalize:'+region+':'+account_id+':recommender/'+recommender_top_picks_for_you_name
    print('The Top Picks For You recommender {} already exists.'.format(workshop_recommender_top_picks_arn))
    print ('\nWe will be using the existing Top Picks For You recommender with workshop_recommender_top_picks_arn = {}'.format(workshop_recommender_top_picks_arn))
    
    

### View recommender creation status

We set up a loop to check the status of the recommender creation. Training can take more than 60 minutes.

In [None]:
max_time = time.time() + 10*60*60 # 10 hours
while time.time() < max_time:

    # Recommender top_picks_for_you
    version_response = personalize.describe_recommender(
        recommenderArn = workshop_recommender_top_picks_arn
    )
    status_top_picks = version_response["recommender"]["status"]

    # Stop polling as soon as the recommender reaches a terminal state
    if status_top_picks == "ACTIVE":
        print("Build succeeded for {}".format(workshop_recommender_top_picks_arn))
        print("The Top Picks for You recommender is ACTIVE")
        break
    elif status_top_picks == "CREATE FAILED":
        print("Build failed for {}".format(workshop_recommender_top_picks_arn))
        break

    print("The Top Picks for You recommender build is still in progress")
    print()
    time.sleep(60)
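
Once the recommender is ACTIVE, the metadata enabled at creation time can be requested together with the recommendations. The following is a minimal sketch, assuming a boto3 `personalize-runtime` client is passed in. The helper name `fetch_top_picks` is hypothetical, and the metadata column names must be columns that actually exist in your Items dataset.

```python
def fetch_top_picks(runtime_client, recommender_arn, user_id,
                    metadata_columns, num_results=10):
    """Request recommendations plus item metadata for one user.

    `runtime_client` is expected to be a boto3 'personalize-runtime' client.
    `metadata_columns` must name columns from your own Items dataset.
    """
    response = runtime_client.get_recommendations(
        recommenderArn=recommender_arn,
        userId=str(user_id),
        numResults=num_results,
        # Requires enableMetadataWithRecommendations to be True on the recommender.
        metadataColumns={"ITEMS": list(metadata_columns)},
    )
    # Each entry carries an 'itemId' and, when requested, a 'metadata' dict.
    return [
        (item["itemId"], item.get("metadata", {}))
        for item in response["itemList"]
    ]
```

In this notebook the call would look like `fetch_top_picks(personalize_runtime, workshop_recommender_top_picks_arn, some_user_id, some_columns)`, where `some_user_id` and `some_columns` are placeholders for a user ID from your interactions data and column names from your Items schema.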

## Evaluate recommenders <a class="anchor" id="eval"></a>
[Back to top](#top)

Amazon Personalize calculates evaluation metrics for a recommender based on a subset of the training data. The image below illustrates how Personalize splits the data. Given 10 users with 10 interactions each (a circle represents an interaction), the interactions are ordered from oldest to newest based on the timestamp. Personalize uses all of the interaction data from 90% of the users (blue circles) to train the model, and the remaining 10% of users for evaluation. For each user in the remaining 10%, the oldest 90% of their interaction data (green circles) is used as input for the call to the trained model. The newest 10% of their data (orange circle) is compared to the output produced by the model and used to calculate the evaluation metrics.

![personalize metrics](./images/personalize_metrics.png)
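
The split described above can be sketched in a few lines of Python. This is an illustration only, assuming interactions are `(user_id, item_id, timestamp)` tuples. The function name `split_for_evaluation` is hypothetical, and picking the held-out users with a seeded random shuffle is an assumption, not a statement about how Amazon Personalize selects them internally.

```python
import random

def split_for_evaluation(interactions, holdout_user_fraction=0.1,
                         holdout_interaction_fraction=0.1, seed=0):
    """Split (user_id, item_id, timestamp) tuples into three lists.

    Returns (train, eval_input, eval_truth), each a list of the same tuples.
    """
    by_user = {}
    for user_id, item_id, timestamp in interactions:
        by_user.setdefault(user_id, []).append((timestamp, item_id))

    # Assumption: held-out users are chosen at random.
    users = sorted(by_user)
    random.Random(seed).shuffle(users)
    n_holdout = max(1, round(len(users) * holdout_user_fraction))
    holdout_users = set(users[:n_holdout])

    train, eval_input, eval_truth = [], [], []
    for user_id, events in by_user.items():
        events.sort(key=lambda event: event[0])  # oldest first
        if user_id not in holdout_users:
            # All interactions of the remaining users go to training.
            train.extend((user_id, item, ts) for ts, item in events)
            continue
        # Newest interactions of a held-out user become the ground truth.
        n_truth = max(1, round(len(events) * holdout_interaction_fraction))
        cut = len(events) - n_truth
        eval_input.extend((user_id, item, ts) for ts, item in events[:cut])
        eval_truth.extend((user_id, item, ts) for ts, item in events[cut:])
    return train, eval_input, eval_truth
```

With 10 users and 10 interactions each, as in the image, this yields 90 training interactions, 9 input interactions for the single held-out user, and 1 ground-truth interaction.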

We recommend reading [the documentation](https://docs.aws.amazon.com/personalize/latest/dg/working-with-training-metrics.html) to understand the metrics, but we have also copied parts of the documentation below for convenience.

You need to understand the following terms regarding evaluation in Personalize:

* *Relevant recommendation* refers to a recommendation that matches a value in the testing data for the particular user.
* *Rank* refers to the position of a recommended item in the list of recommendations. Position 1 (the top of the list) is presumed to be the most relevant to the user.
* *Query* refers to the internal equivalent of a GetRecommendations call.

The metrics produced by Personalize are:
* **coverage**: The proportion of unique recommended items from all queries out of the total number of unique items in the training data (includes both the Items and Interactions datasets).
* **mean_reciprocal_rank_at_25**: The [mean of the reciprocal ranks](https://en.wikipedia.org/wiki/Mean_reciprocal_rank) of the first relevant recommendation out of the top 25 recommendations over all queries. This metric is appropriate if you're interested in the single highest ranked recommendation.
* **normalized_discounted_cumulative_gain_at_K**: Discounted gain assumes that recommendations lower on a list of recommendations are less relevant than higher recommendations. Therefore, each recommendation is discounted (given a lower weight) by a factor dependent on its position. To produce the [cumulative discounted gain](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) (DCG) at K, the discounted weights of the relevant recommendations in the top K are summed. The normalized discounted cumulative gain (NDCG) is the DCG divided by the ideal DCG, so that NDCG lies between 0 and 1. (The ideal DCG is the DCG obtained when the top K recommendations are sorted by relevance.) Amazon Personalize uses a weighting factor of 1/log(1 + position), where the top of the list is position 1. This metric rewards relevant items that appear near the top of the list, because the top of a list usually draws more attention.
* **precision_at_K**: The number of relevant recommendations out of the top K recommendations divided by K. This metric rewards precise recommendation of the relevant items.
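
To make the definitions above concrete, here is a minimal sketch that computes each metric for a list of recommended item IDs and a set of relevant item IDs. The function names are hypothetical and the code follows the textual definitions above; it is not the implementation Amazon Personalize uses. The logarithm base is not stated above, and it cancels out in NDCG, so the natural logarithm is used here as an assumption.

```python
import math

def precision_at_k(recommended, relevant, k):
    """Relevant items among the top k recommendations, divided by k."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

def reciprocal_rank_at_k(recommended, relevant, k):
    """1 / rank of the first relevant item in the top k, or 0.0 if none."""
    for position, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def ndcg_at_k(recommended, relevant, k):
    """DCG with weights 1/log(1 + position), divided by the ideal DCG."""
    hits = [item in relevant for item in recommended[:k]]
    dcg = sum(1.0 / math.log(1 + pos)
              for pos, hit in enumerate(hits, start=1) if hit)
    # Ideal ordering: the same top-k list sorted so relevant items come first.
    ideal = sum(1.0 / math.log(1 + pos) for pos in range(1, sum(hits) + 1))
    return dcg / ideal if ideal else 0.0

def coverage(all_recommendation_lists, catalog_items):
    """Unique recommended items over all queries / unique catalog items."""
    catalog = set(catalog_items)
    recommended = {item for recs in all_recommendation_lists for item in recs}
    return len(recommended & catalog) / len(catalog)

# One query: items "b" and "e" are the relevant ones for this user.
recs = ["a", "b", "c", "d", "e"]
relevant = {"b", "e"}
print(precision_at_k(recs, relevant, 5))        # 0.4
print(reciprocal_rank_at_k(recs, relevant, 5))  # 0.5
print(ndcg_at_k(recs, relevant, 5))
```

The reported `mean_reciprocal_rank_at_25` corresponds to averaging `reciprocal_rank_at_k(..., k=25)` over all queries; the other per-query metrics are averaged in the same way.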

Let's take a look at the evaluation metrics for the recommender.

## "Top Picks for You" recommender metrics

Retrieve the evaluation metrics for the "Top Picks For you" Recommender.

In [None]:
workshop_recommender_top_picks_metrics_response = personalize.describe_recommender(
    recommenderArn = workshop_recommender_top_picks_arn
)

# Read the status from the response fetched above, not from the earlier polling loop
status_top_picks = workshop_recommender_top_picks_metrics_response["recommender"]["status"]

if status_top_picks == "ACTIVE":
    for metric, value in workshop_recommender_top_picks_metrics_response['recommender']['modelMetrics'].items():
        print("{}: {}".format(metric, value))
elif status_top_picks == "CREATE FAILED":
    print("Build failed for {}".format(workshop_recommender_top_picks_arn))
else:
    print("The Top Picks for You recommender build is still in progress")
   


## Using evaluation metrics <a class="anchor" id="usemetrics"></a>
[Back to top](#top)

It is important to use evaluation metrics carefully. There are a number of factors to keep in mind.

* If there is an existing recommendation system in place, it will have influenced the users' interaction history that you use to train your new recommender. This means the evaluation metrics are biased in favor of the existing solution. If you work to push the evaluation metrics to match or exceed the existing solution, you may just be pushing the new recommender to behave like the existing solution and might not end up with something better.


Keeping in mind these factors, the evaluation metrics produced by Personalize are generally useful for two cases:
1. Comparing the performance of solution versions trained on the same recipe, but with different values for the hyperparameters and features (impression data etc)
1. Comparing the performance of solution versions trained on different recipes. Here also keep in mind that the recipes answer different use cases and comparing them to each other might not make sense in your solution.

Properly evaluating a recommendation system is always best done through A/B testing while measuring actual business outcomes. Since recommendations generated by a system usually influence the user behavior which it is based on, it is better to run small experiments and apply A/B testing for longer periods of time. Over time, the bias from the existing model will fade.

In [None]:
# %store dataset_dir
%store data_dir
%store interactions_filename
%store items_filename
%store workshop_dataset_group_arn
%store workshop_recommender_top_picks_arn
%store region
%store account_id
%store role_name
%store role_arn

[Go to the next notebook `04_Personalized_Emails_with_Amazon_Personalize_and_Generative_AI.ipynb`](04_Personalized_Emails_with_Amazon_Personalize_and_Generative_AI.ipynb) to continue.