# Recommenders and Solutions Recap and Evaluation<a class="anchor" id="top"></a>

## Outline

1. [Introduction](#intro)
1. [Create Domain Recommenders](#recommenders)
1. [Create Solutions](#solutions)
1. [Evaluate Solutions](#eval)
1. [Using Evaluation Metrics](#use)
1. [Deploy a Campaign](#deploy)
1. [Create Filters](#interact)
1. [Storing useful variables](#vars)

## Introduction <a class="anchor" id="intro"></a>

In the previous notebook we prepared three datasets that represent sample data for a Media & Entertainment application: user interactions, media catalog data, and subscriber/user data. To complete this workshop within the allotted time, we have already created several resources on your behalf. If you would like to create these yourself, run through notebook `04_Optional_Import_and_Training.ipynb`, which adds approximately 80 minutes to the workshop. Here is an overview of the resources that were created on your behalf.

[Domain dataset group](https://docs.aws.amazon.com/personalize/latest/dg/domain-dataset-groups.html): Use a Domain dataset group if you have a video on demand or e-commerce application and want Amazon Personalize to find the best configurations for your use cases. If you start with a Domain dataset group, you can also add custom resources such as solutions with solution versions trained with recipes for custom use cases. We created a Domain dataset group, which is where the three datasets were imported.

[Interactions Data](https://docs.aws.amazon.com/personalize/latest/dg/interactions-datasets.html): The Interactions dataset is the core dataset that Amazon Personalize uses to provide recommendations. In Media & Entertainment use cases, this data is usually imported from a system that records how customers interact with media assets, often a Customer Data Platform (CDP) or an analytics tool. For this lab we simulate customer interaction data using the [MovieLens dataset](https://grouplens.org/datasets/movielens/).

We imported a dataset with the following fields:

1. **UserID** - The user who interacted
1. **ItemID** - The item the user interacted with
1. **Timestamp** - The time at which the interaction occurred
1. **Event Type** - Categorical label of an event (clicked or watched).

Note: This is just a subset of the data that can be sent to provide or improve recommendations. We are not showing additional Amazon Personalize features such as contextual recommendations or modeling on impression data. For full information, see the documentation linked above.
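As a rough illustration of the four fields listed above, the sketch below writes a few interaction rows to CSV text. The column names (`USER_ID`, `ITEM_ID`, `TIMESTAMP`, `EVENT_TYPE`) are assumed to follow the Amazon Personalize schema naming convention, the helper `to_csv` is hypothetical, and the rows are invented for illustration rather than taken from MovieLens.

```python
import csv
import io

# Column names assumed to follow the Amazon Personalize schema convention.
FIELDS = ["USER_ID", "ITEM_ID", "TIMESTAMP", "EVENT_TYPE"]

# Hypothetical rows, invented for illustration only.
sample_rows = [
    {"USER_ID": "1", "ITEM_ID": "101", "TIMESTAMP": 1665836382, "EVENT_TYPE": "Click"},
    {"USER_ID": "1", "ITEM_ID": "101", "TIMESTAMP": 1665836400, "EVENT_TYPE": "Watch"},
    {"USER_ID": "2", "ITEM_ID": "205", "TIMESTAMP": 1665840000, "EVENT_TYPE": "Watch"},
]


def to_csv(rows):
    """Serialize interaction rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


interactions_csv = to_csv(sample_rows)
print(interactions_csv)
```

Timestamps here are Unix epoch seconds, and each row records one event for one user and one item.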


[Item Metadata](https://docs.aws.amazon.com/personalize/latest/dg/items-datasets.html): The item data that you can import into Amazon Personalize includes numerical and categorical metadata such as creation timestamp, price, genre, description, and availability. You import metadata about your items into an Amazon Personalize Items dataset. Some domains and recipes require an Items dataset.

Insert IMDb data description


[User Metadata](https://docs.aws.amazon.com/personalize/latest/dg/users-datasets.html): The user data that you can import into Amazon Personalize includes numerical and categorical metadata about your users, such as gender or loyalty membership. Since we do not have user data for the users in the MovieLens dataset, we generated a synthetic dataset for the purposes of this workshop, with the following fields:

1. **UserID** - The user who interacted
1. **Membershiplevel** - A randomly assigned membership level

### In this notebook we will accomplish the following:

Create Video on Demand Domain Recommenders for the following use cases:

1. [More like X](https://docs.aws.amazon.com/personalize/latest/dg/VIDEO_ON_DEMAND-use-cases.html#more-like-y-use-case): recommendations for movies that are similar to a movie that you specify. With this use case, Amazon Personalize automatically filters movies the user watched based on the userId that you specify and Watch events.

1. [Top picks for you](https://docs.aws.amazon.com/personalize/latest/dg/VIDEO_ON_DEMAND-use-cases.html#top-picks-use-case): personalized content recommendations for a user that you specify. With this use case, Amazon Personalize automatically filters videos the user watched based on the userId that you specify and Watch events.

Create a custom solution and solution versions for the following use case:

3. [Personalized-Ranking](https://docs.aws.amazon.com/personalize/latest/dg/working-with-predefined-recipes.html): will be used to rerank a list of movies.

![Workflow](images/image2.png)

To run this notebook, you need to have run the previous notebook, `01_Data_Layer.ipynb`, where you created a dataset and imported interaction data, item metadata, and user metadata into Amazon Personalize. At the end of that notebook, you saved some of the variable values, which you now need to load into this notebook.

In [4]:
%store -r

Similar to the previous notebook, start by importing the relevant packages, and set up a connection to Amazon Personalize using the SDK.

In [5]:
import time
from time import sleep
import json
from datetime import datetime
import uuid
import random
import boto3
import botocore
from botocore.exceptions import ClientError
import pandas as pd

In [6]:
# Configure the SDK to Personalize:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

## Retrieve your automated deployment variables

As mentioned at the top of this notebook, a dataset group, schemas, datasets, solutions, and campaigns have already been created for you. You can open another browser tab/window to view these resources in the Personalize AWS Console. 

The code below looks up the pre-created resources using the Personalize API.

In [61]:
# Configure the SDK to Personalize:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')
personalize_events = boto3.client(service_name='personalize-events')
from script import get_dataset_group_info, filter_arns, dataset_arns, schema_arns, event_tracker_arns, campaign_arns, recommender_arns, solution_arns
datasetGroupArn = get_dataset_group_info('personalize-poc-movielens')
for dataset in dataset_arns:
    if dataset.find("INTERACTIONS") != -1:
        interactions_dataset_arn = dataset
for dataset in dataset_arns:
    if dataset.find("ITEMS") != -1:
        items_dataset_arn = dataset
for dataset in dataset_arns:
    if dataset.find("USERS") != -1:
        users_dataset_arn = dataset
for schema in schema_arns:
    if schema.find("Interactions") != -1:
        interactions_schema_arn = schema
for schema in schema_arns:
    if schema.find("User") != -1:
        users_schema_arn = schema  
for schema in schema_arns:
    if schema.find("Item") != -1:
        items_schema_arn = schema  
for recommender in recommender_arns:
    if recommender.find("Personalization") != -1:
        personalization_campaign_arn = recommender
for recommender in recommender_arns:
    if recommender.find("sims") != -1:
        sims_campaign_arn = recommender
for campaign in campaign_arns:
    if campaign.find("Ranking") != -1:
        ranking_campaign_arn = campaign
for solution in solution_arns:
    if solution.find("ranking") != -1:
        ranking_solution_arn = solution

event_tracker_arn = event_tracker_arns[0]
with open('/opt/ml/metadata/resource-metadata.json') as notebook_info:
    data = json.load(notebook_info)
    resource_arn = data['ResourceArn']
    region = resource_arn.split(':')[3]
print(region)

ModuleNotFoundError: No module named 'script'

In [7]:
if workshop_training_complete:
    dataset_group_arn = workshop_dataset_group_arn
    interactions_dataset_arn = workshop_interactions_dataset_arn
    rerank_solution_version_arn = workshop_rerank_solution_version_arn
    rerank_campaign_arn = workshop_rerank_campaign_arn
    recommender_top_picks_arn = workshop_recommender_top_picks_arn
    recommender_more_like_x_arn = workshop_recommender_more_like_x_arn
    items_schema_arn = workshop_items_schema_arn
    users_schema_arn = workshop_users_schema_arn
    interactions_schema_arn = workshop_interactions_schema_arn
    users_dataset_arn = workshop_users_dataset_arn
    items_dataset_arn = workshop_items_dataset_arn

## Dataset groups and the interactions dataset <a class="anchor" id="group_dataset"></a>
[Back to top](#top)

The highest level of isolation and abstraction with Amazon Personalize is a *dataset group*. Information stored within one dataset group has no impact on any other dataset group or on models created from one; they are completely isolated. This allows you to run many experiments and is part of how we keep your models private and trained only on your data.

Before data can be imported, a dataset group must exist, along with a dataset in it that holds the interactions.

Dataset groups can house the following types of information:

* User-item-interactions
* Event streams (real-time interactions)
* User metadata
* Item metadata

The dataset group and the interactions dataset were already created for you, so print the dataset group ARN to confirm that the stored variables loaded correctly.

In [8]:
print(dataset_group_arn)

arn:aws:personalize:us-east-1:051545784337:dataset-group/workshop-personalize-poc-movielens
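As a quick sanity check, the ARN printed above can be split into its components to confirm the service, Region, and resource name. This is a minimal sketch that assumes the general ARN layout `arn:partition:service:region:account-id:resource`. The helper name `parse_arn` is hypothetical and is not part of boto3.

```python
def parse_arn(arn):
    """Split an ARN of the form arn:partition:service:region:account-id:resource."""
    # Limit the split so that any ':' inside the resource part is preserved.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"Not a valid ARN: {arn!r}")
    keys = ["prefix", "partition", "service", "region", "account_id", "resource"]
    return dict(zip(keys, parts))


# ARN copied from the cell output above.
example_arn = (
    "arn:aws:personalize:us-east-1:051545784337:"
    "dataset-group/workshop-personalize-poc-movielens"
)
info = parse_arn(example_arn)
print(info["service"], info["region"], info["resource"])
```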


### Evaluate solution versions 

Amazon Personalize calculates evaluation metrics for each solution version based on a subset of the training data. The image below illustrates how Personalize splits the data. Given 10 users with 10 interactions each (a circle represents an interaction), the interactions are ordered from oldest to newest based on the timestamp. Personalize uses all of the interaction data from 90% of the users (blue circles) to train the solution version, and the remaining 10% of users for evaluation. For each user in the remaining 10%, 90% of their interaction data (green circles) is used as input for the call to the trained model. The remaining 10% of their data (orange circle) is compared to the output produced by the model and used to calculate the evaluation metrics.

![personalize metrics](../../static/imgs/personalize_metrics.png)
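The split described above can be sketched in plain Python. This is only an illustration of the idea, not how Amazon Personalize implements it internally. The function name `split_for_evaluation`, the random choice of evaluation users, and the seed are all assumptions made for this example.

```python
import random


def split_for_evaluation(interactions_by_user, train_fraction=0.9,
                         input_fraction=0.9, seed=0):
    """Split users into training and evaluation sets.

    interactions_by_user maps a user id to a list of (timestamp, item_id).
    """
    users = sorted(interactions_by_user)
    # Assumption: evaluation users are picked at random.
    random.Random(seed).shuffle(users)
    n_train = int(round(len(users) * train_fraction))
    train_users, eval_users = users[:n_train], users[n_train:]

    # Training users contribute all of their interactions.
    train = {u: sorted(interactions_by_user[u]) for u in train_users}

    eval_input, eval_held_out = {}, {}
    for u in eval_users:
        ordered = sorted(interactions_by_user[u])  # oldest to newest
        cut = int(round(len(ordered) * input_fraction))
        eval_input[u] = ordered[:cut]      # fed to the trained model
        eval_held_out[u] = ordered[cut:]   # compared with the model output
    return train, eval_input, eval_held_out


# 10 users with 10 interactions each, mirroring the example in the text.
data = {f"user{u}": [(t, f"item{u}_{t}") for t in range(10)] for u in range(10)}
train, eval_input, eval_held_out = split_for_evaluation(data)
print(len(train), "training users;", len(eval_input), "evaluation user(s)")
```

With this toy data, nine users go entirely to training, and the one remaining user has nine interactions used as input and the newest one held out.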

We recommend reading [the documentation](https://docs.aws.amazon.com/personalize/latest/dg/working-with-training-metrics.html) to understand the metrics, but we have also copied parts of the documentation below for convenience.

You need to understand the following terms regarding evaluation in Personalize:

* *Relevant recommendation* refers to a recommendation that matches a value in the testing data for the particular user.
* *Rank* refers to the position of a recommended item in the list of recommendations. Position 1 (the top of the list) is presumed to be the most relevant to the user.
* *Query* refers to the internal equivalent of a GetRecommendations call.

The metrics produced by Personalize are:
* **coverage**: The proportion of unique recommended items from all queries out of the total number of unique items in the training data (includes both the Items and Interactions datasets).
* **mean_reciprocal_rank_at_25**: The [mean of the reciprocal ranks](https://en.wikipedia.org/wiki/Mean_reciprocal_rank) of the first relevant recommendation out of the top 25 recommendations over all queries. This metric is appropriate if you're interested in the single highest ranked recommendation.
* **normalized_discounted_cumulative_gain_at_K**: Discounted gain assumes that recommendations lower on a list of recommendations are less relevant than higher recommendations. Therefore, each recommendation is discounted (given a lower weight) by a factor dependent on its position. To produce the [cumulative discounted gain](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) (DCG) at K, each relevant discounted recommendation in the top K recommendations is summed together. The normalized discounted cumulative gain (NDCG) is the DCG divided by the ideal DCG, so that NDCG is between 0 and 1. (The ideal DCG is where the top K recommendations are sorted by relevance.) Amazon Personalize uses a weighting factor of 1/log(1 + position), where the top of the list is position 1. This metric rewards relevant items that appear near the top of the list, because the top of a list usually draws more attention.
* **precision_at_K**: The number of relevant recommendations out of the top K recommendations divided by K. This metric rewards precise recommendation of the relevant items.
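To make the definitions above concrete, here is a minimal sketch of each metric for a single query, assuming binary relevance (an item is either in the held-out test data or not). The function names are hypothetical and the exact computation inside Amazon Personalize may differ. The logarithm base is chosen as 2 here, which does not affect NDCG because the base cancels in the DCG / ideal DCG ratio.

```python
import math


def precision_at_k(recommended, relevant, k):
    """Relevant items among the top k recommendations, divided by k."""
    return sum(1 for item in recommended[:k] if item in relevant) / k


def reciprocal_rank_at_k(recommended, relevant, k):
    """1 / position of the first relevant item in the top k, else 0."""
    for position, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def ndcg_at_k(recommended, relevant, k):
    """DCG with weight 1/log(1 + position), divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(1 + position)
              for position, item in enumerate(recommended[:k], start=1)
              if item in relevant)
    # Ideal ordering: all relevant items placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(1 + p) for p in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


def coverage(all_recommendations, catalog):
    """Unique recommended items across all queries over unique catalog items."""
    recommended = set().union(*all_recommendations)
    return len(recommended & set(catalog)) / len(set(catalog))


# Hypothetical query: five recommendations, two of which are relevant.
recommended = ["A", "B", "C", "D", "E"]
relevant = {"B", "E"}
p5 = precision_at_k(recommended, relevant, 5)
rr = reciprocal_rank_at_k(recommended, relevant, 5)
ndcg5 = ndcg_at_k(recommended, relevant, 5)
print(p5, rr, round(ndcg5, 4))
```

The mean reciprocal rank is then the average of `reciprocal_rank_at_k` over all queries.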

Let's take a look at the evaluation metrics for each of the solutions produced in this notebook. Please note that your results might differ from the results described in the text of this notebook, due to the quality of the MovieLens dataset.

### Personalized ranking metrics

Retrieve the evaluation metrics for the personalized ranking solution version.

In [9]:
rerank_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = rerank_solution_version_arn
)

print(json.dumps(rerank_solution_metrics_response, indent=2))

{
  "solutionVersionArn": "arn:aws:personalize:us-east-1:051545784337:solution/workshop_personalize-poc-rerank/80054531",
  "metrics": {
    "coverage": 0.0782,
    "mean_reciprocal_rank_at_25": 0.2777,
    "normalized_discounted_cumulative_gain_at_10": 0.3065,
    "normalized_discounted_cumulative_gain_at_25": 0.3723,
    "normalized_discounted_cumulative_gain_at_5": 0.2446,
    "precision_at_10": 0.0589,
    "precision_at_25": 0.0364,
    "precision_at_5": 0.0643
  },
  "ResponseMetadata": {
    "RequestId": "2b4614f3-7dac-44ca-a0c0-c1d944937db8",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Sat, 15 Oct 2022 12:19:42 GMT",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "415",
      "connection": "keep-alive",
      "x-amzn-requestid": "2b4614f3-7dac-44ca-a0c0-c1d944937db8"
    },
    "RetryAttempts": 0
  }
}
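The response above is easier to read when the metrics are grouped by family and ordered by K. The sketch below copies the metric values from the output so that it runs on its own. In the notebook you would pass `rerank_solution_metrics_response["metrics"]` instead. The helper `metric_by_k` is a hypothetical name introduced for this example.

```python
# Values copied from the get_solution_metrics output shown above.
metrics = {
    "coverage": 0.0782,
    "mean_reciprocal_rank_at_25": 0.2777,
    "normalized_discounted_cumulative_gain_at_10": 0.3065,
    "normalized_discounted_cumulative_gain_at_25": 0.3723,
    "normalized_discounted_cumulative_gain_at_5": 0.2446,
    "precision_at_10": 0.0589,
    "precision_at_25": 0.0364,
    "precision_at_5": 0.0643,
}


def metric_by_k(metrics, prefix):
    """Return {K: value} for metrics named '<prefix>_at_<K>', sorted by K."""
    result = {}
    for name, value in metrics.items():
        if name.startswith(prefix + "_at_"):
            k = int(name.rsplit("_", 1)[1])
            result[k] = value
    return dict(sorted(result.items()))


for prefix in ("precision", "normalized_discounted_cumulative_gain"):
    for k, value in metric_by_k(metrics, prefix).items():
        print(f"{prefix}@{k}: {value:.4f}")
```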



## Using Evaluation Metrics <a class="anchor" id="use"></a>
[Back to top](#top)

It is important to use evaluation metrics carefully. Keep the following in mind:

* If there is an existing recommendation system in place, it will have influenced the user interaction history that you use to train your new solutions. This means the evaluation metrics are biased in favor of the existing system. If you work to push the evaluation metrics to match or exceed the existing system, you may simply be pushing your new solution to behave like the existing one, and might not end up with something better.


With this in mind, the evaluation metrics produced by Personalize are generally useful for two cases:
1. Comparing the performance of solution versions trained on the same recipe, but with different values for the hyperparameters and features (impression data, etc.).
1. Comparing the performance of solution versions trained on different recipes. Keep in mind that the recipes address different use cases, so comparing them to each other might not make sense in your solution.

Properly evaluating a recommendation system is always best done through A/B testing while measuring actual business outcomes. Since recommendations generated by a system usually influence the user behavior which it is based on, it is better to run small experiments and apply A/B testing for longer periods of time. Over time, the bias from the existing model will fade.
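One common way to run such an experiment is to assign each user to a variant deterministically, so the same user always sees the same experience. The sketch below is an assumption-laden illustration and not part of Amazon Personalize: the function name `assign_variant`, the experiment label, and the 10% treatment share are all hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment="recs-test", treatment_share=0.1):
    """Deterministically assign a user to 'treatment' or 'control'."""
    # Hash the experiment name with the user id so that different
    # experiments produce independent assignments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"


# Assign 10,000 hypothetical user ids and measure the treatment share.
assignments = [assign_variant(str(user_id)) for user_id in range(10000)]
share = assignments.count("treatment") / len(assignments)
print(f"treatment share: {share:.3f}")
```

Users in the treatment group would receive recommendations from the new solution, while the control group keeps the existing experience, and business outcomes are compared between the two groups.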

## Storing useful variables <a class="anchor" id="vars"></a>
[Back to top](#top)

Before exiting this notebook, run the following cell to save the ARNs for use in the next notebook.

In [10]:
%store rerank_solution_version_arn
%store recommender_top_picks_arn
%store recommender_more_like_x_arn
%store rerank_solution_arn
%store rerank_campaign_arn

Stored 'rerank_solution_version_arn' (str)
Stored 'recommender_top_picks_arn' (str)
Stored 'recommender_more_like_x_arn' (str)
Stored 'rerank_solution_arn' (str)
Stored 'rerank_campaign_arn' (str)


You're all set to move on to the last exploratory notebook: `03_Inference_Layer.ipynb`. Open it from the browser and you can start interacting with the Recommenders and Campaign and getting recommendations!