# Creating Recommenders and Solutions <a class="anchor" id="top"></a>

## Outline

1. [Introduction](#intro)
1. [How to train your Use Case Optimized Recommenders and Solution Versions](#recommenders)
1. [Create Solutions](#solutions)
1. [Evaluate Solutions](#eval)
1. [Using Evaluation Metrics](#use)
1. [Deploy a Campaign](#deploy)
1. [Create Filters](#filters)
1. [Storing Useful Variables](#vars)

To run this notebook, you need to have run [the previous notebook: `01_Data_Layer.ipynb`](01_Data_Layer.ipynb), where you prepared 3 datasets (interactions, items, and users) for use in Amazon Personalize. At the end of that notebook, you saved some variable values, which you now need to load into this notebook.

## Introduction <a class="anchor" id="intro"></a>

In the previous notebook we prepared 3 different datasets that represent our fictional retail store (User interactions, Product catalog data and buyer/user data) and created Datasets in Amazon Personalize for this data.

In this notebook we will define our use-case, train models and create APIs to get recommendations.

## Define your Use Case <a class="anchor" id="usecase"></a>
[Back to top](#top)

There are a few guidelines for scoping a problem suitable for Personalize. We recommend the values below as a starting point.

* At least 50 unique users (identified by a user_id)
* At least 100 unique items
* At least 2 dozen interactions for each user 

The [minimum data requirements](https://docs.aws.amazon.com/personalize/latest/dg/how-it-works-dataset-schema.html) can be found in the documentation.

Most of the time this is easily attainable, and if you are low in one category, you can often make up for it by having a larger number in another category.

The user-item-iteraction data is key for getting started with the service. This means we need to look for use cases that generate that kind of data, a few common examples are:

1. Video-on-demand applications
1. E-commerce platforms

Defining your use-case will inform what data and what type of data you need.

### Train Models and create APIs to get recommendations

In this section we will be creating ECOMMERCE Use Case Optimized Recommenders for the following use cases:

1. [Customers who viewed X also viewed](https://docs.aws.amazon.com/personalize/latest/dg/ECOMMERCE-use-cases.html#customers-also-viewed-use-case): Get recommendations for items that customers also viewed based on an item that you specify. With this use case, Amazon Personalize automatically filters items the user purchased based on the userId that you specify and `Purchase` events.

1. [Recommended For You](https://docs.aws.amazon.com/personalize/latest/dg/ECOMMERCE-use-cases.html#recommended-for-you-use-case): Get personalized recommendations for items based on a user that you specify. With this use case, Amazon Personalize automatically filters items the user purchased based on the userId that you specify and Purchase events. For better performance, include `Purchase` events along with the required `View` events.

in addition we will create a custom solution, solution version and campaign for the following use case:

3. [Personalized Ranking](https://docs.aws.amazon.com/personalize/latest/dg/personalized-ranking-recipes.html): will be used to rank a list of items.

All of these will be created within the same dataset group and with the same input data.

The following diagram shows the resources that we will create in this section. The part we are building  in this notebook is highlighted in blue with a dashed outline.

![Workflow](images/02_Training_Layer_Resources.jpg)

Similar to the previous notebook, start by importing the relevant packages, and set up a connection to Amazon Personalize using the SDK.

In [1]:
import time
from time import sleep
import json
from datetime import datetime
import uuid
import random
import boto3
import botocore
from botocore.exceptions import ClientError
import pandas as pd

In [2]:
%store -r

Unable to restore variable 'articles_mlfeatures', ignoring (use %store -d to forget!)
The error was: <class 'KeyError'>


In [3]:
# Configure the SDK to Personalize:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

## How to train your Use Case Optimized Recommenders and Solution Versions <a class="anchor" id="recommenders"></a>
[Back to top](#top)

As mentioned previously, a dataset group, schemas, datasets, solutions, and campaigns have already been created for you. You can open another browser tab/window to view these resources in the Personalize AWS Console.

Below, we will walk you through the steps we used to create these resources. Since they are already created, we will only be retrieving the automated deployment variables, however you can also run this code to train resources if you have not run the automation.

<div class="alert alert-block alert-warning">
<b>Note:</b> Please take into account that creating these resources in your own account will incur a cost. If you are not using the CloudFormation template, it will take upload and training time to do these steps through this notebook (this can be several hours).
</div>

## Ready... Set... Train! :

Now that the data is imported and ready for use, we will create ECOMMERCE Use Case Optimized Recommenders for the following use cases: [Customers who viewed X also viewed](https://docs.aws.amazon.com/personalize/latest/dg/ECOMMERCE-use-cases.html#customers-also-viewed-use-case) and [Recommended for you](hhttps://docs.aws.amazon.com/personalize/latest/dg/ECOMMERCE-use-cases.html#recommended-for-you-use-case).

We will also create a custom solution and solution versions for the use case [Personalized-Ranking](https://docs.aws.amazon.com/personalize/latest/dg/personalized-ranking-recipes.html).


### Create Use Case Optimized Recommenders <a class="anchor" id="recommenders"></a>
[Back to top](#top)

We'll start with pre-configured ECOMMERCE Recommenders that match some of our core use cases. Each domain has different use cases. When you create a recommender you create it for a specific use case, and each use case has different requirements for getting recommendations.

Let us look at the recommenders supported for the ECOMMERCE domain:

In [4]:
available_recipes = personalize.list_recipes(domain='ECOMMERCE')
display_available_recipes = available_recipes ['recipes']
available_recipes = personalize.list_recipes(domain='ECOMMERCE',nextToken=available_recipes['nextToken'])#paging to get the rest of the recipes 
display_available_recipes = display_available_recipes + available_recipes['recipes']
display(display_available_recipes)

[{'name': 'aws-ecomm-customers-who-viewed-x-also-viewed',
  'recipeArn': 'arn:aws:personalize:::recipe/aws-ecomm-customers-who-viewed-x-also-viewed',
  'status': 'ACTIVE',
  'creationDateTime': datetime.datetime(2019, 6, 10, 0, 0, tzinfo=tzlocal()),
  'lastUpdatedDateTime': datetime.datetime(2024, 6, 19, 16, 47, 19, 191000, tzinfo=tzlocal()),
  'domain': 'ECOMMERCE'},
 {'name': 'aws-ecomm-frequently-bought-together',
  'recipeArn': 'arn:aws:personalize:::recipe/aws-ecomm-frequently-bought-together',
  'status': 'ACTIVE',
  'creationDateTime': datetime.datetime(2019, 6, 10, 0, 0, tzinfo=tzlocal()),
  'lastUpdatedDateTime': datetime.datetime(2024, 6, 19, 16, 47, 19, 191000, tzinfo=tzlocal()),
  'domain': 'ECOMMERCE'},
 {'name': 'aws-ecomm-popular-items-by-purchases',
  'recipeArn': 'arn:aws:personalize:::recipe/aws-ecomm-popular-items-by-purchases',
  'status': 'ACTIVE',
  'creationDateTime': datetime.datetime(2019, 6, 10, 0, 0, tzinfo=tzlocal()),
  'lastUpdatedDateTime': datetime.dateti

[More use cases per domain](https://docs.aws.amazon.com/personalize/latest/dg/domain-use-cases.html).

### Create a "Customers who viewed X also viewed" Recommender

We are going to create a recommender of the type [Customers who viewed X also viewed](https://docs.aws.amazon.com/personalize/latest/dg/ECOMMERCE-use-cases.html#customers-also-viewed-use-case). This recommender gives recommendations for items that customers also viewed based on an item that you specify. With this use case, Amazon Personalize automatically filters items the user purchased based on the userId that you specify and `Purchase` events.

In [22]:
try:
    create_recommender_response = personalize.create_recommender(
      name = recommender_customers_who_viewed_name,
      recipeArn = 'arn:aws:personalize:::recipe/aws-ecomm-customers-who-viewed-x-also-viewed',
      datasetGroupArn = workshop_dataset_group_arn,
      recommenderConfig = {"enableMetadataWithRecommendations": True}
    )
    workshop_recommender_customers_who_viewed_arn = create_recommender_response["recommenderArn"]
    
    print (json.dumps(create_recommender_response))
    print ('\nCreating the Customers who viewed X also Viewed recommender with workshop_recommender_customers_who_viewed_arn = {}'.format(workshop_recommender_customers_who_viewed_arn))
    
except personalize.exceptions.ResourceAlreadyExistsException as e:
    workshop_recommender_customers_who_viewed_arn =  'arn:aws:personalize:'+region+':'+account_id+':recommender/'+recommender_customers_who_viewed_name
    print('The Customers who viewed X also Viewed recommender {} already exists.'.format(recommender_customers_who_viewed_name))
    print('\nWe will be using the existing Customers who viewed X also Viewed recommender with workshop_recommender_customers_who_viewed_arn = {}'.format(workshop_recommender_customers_who_viewed_arn))
    try:
        print('\nUpdating recommender to return metadata')
        response = personalize.update_recommender(
            recommenderArn = workshop_recommender_customers_who_viewed_arn,
            recommenderConfig={
                'enableMetadataWithRecommendations': True
            }
        )
    
    except personalize.exceptions.InvalidInputException as e:
        print('\nRecommender {} already returns metadata as desired'.format(workshop_recommender_customers_who_viewed_arn))

The Customers who viewed X also Viewed recommender workshop_viewed_x_also_viewed already exists.

We will be using the existing Customers who viewed X also Viewed recommender with workshop_recommender_customers_who_viewed_arn = arn:aws:personalize:us-east-1:381491864570:recommender/workshop_viewed_x_also_viewed

Updating recommender to return metadata

Recommender arn:aws:personalize:us-east-1:381491864570:recommender/workshop_viewed_x_also_viewed already returns metadata as desired


### Create a "Recommended For You" Recommender

We are going to create a second recommender of the type [Recommended for you](https://docs.aws.amazon.com/personalize/latest/dg/ECOMMERCE-use-cases.html#recommended-for-you-use-case). This type of recommender offers personalized recommendations for items based on a user that you specify. With this use case, Amazon Personalize automatically filters items the user purchased based on the userId that you specify and `Purchase` events.

In [23]:
try:
    create_recommender_response = personalize.create_recommender(
      name = recommender_recommended_for_you_name,
      recipeArn = 'arn:aws:personalize:::recipe/aws-ecomm-recommended-for-you',
      datasetGroupArn = workshop_dataset_group_arn,
      recommenderConfig = {"enableMetadataWithRecommendations": True}
    )
    workshop_recommender_recommended_for_you_arn = create_recommender_response["recommenderArn"]
    
    print (json.dumps(create_recommender_response))
    print ('\nCreating the Recommended For You recommender with workshop_recommender_recommended_for_you_arn = {}'.format(workshop_recommender_recommended_for_you_arn))
    
except personalize.exceptions.ResourceAlreadyExistsException as e:
    workshop_recommender_recommended_for_you_arn =  'arn:aws:personalize:'+region+':'+account_id+':recommender/'+recommender_recommended_for_you_name
    print('The Recommended For You For You recommender {} already exists.'.format(workshop_recommender_recommended_for_you_arn))
    print ('\nWe will be using the existing Recommended For You recommender with workshop_recommender_top_picks_arn = {}'.format(workshop_recommender_recommended_for_you_arn))
    try:
        print('\nUpdating recommender to return metadata')
        response = personalize.update_recommender(
            recommenderArn = workshop_recommender_recommended_for_you_arn,
            recommenderConfig={
                'enableMetadataWithRecommendations': True
            }
        )
    
    except personalize.exceptions.InvalidInputException as e:
        print('\nRecommender {} already returns metadata as desired'.format(workshop_recommender_recommended_for_you_arn))

The Recommended For You For You recommender arn:aws:personalize:us-east-1:381491864570:recommender/workshop_recommended_for_you already exists.

We will be using the existing Recommended For You recommender with workshop_recommender_top_picks_arn = arn:aws:personalize:us-east-1:381491864570:recommender/workshop_recommended_for_you

Updating recommender to return metadata

Recommender arn:aws:personalize:us-east-1:381491864570:recommender/workshop_recommended_for_you already returns metadata as desired


## Create Solutions <a class="anchor" id="solutions"></a>
[Back to top](#top)

Some use cases require a custom implementation. 

In Amazon Personalize, a specific variation of an algorithm is called a recipe. Different recipes are suitable for different situations. A trained model is called a solution, and each solution can have many versions that relate to a given volume of data when the model was trained.

Let's look at all available recipes that are not of a specific domain and can be used to create cusom solutions. 

In [12]:
available_recipes = personalize.list_recipes()
display_available_recipes = available_recipes ['recipes']
available_recipes = personalize.list_recipes(nextToken=available_recipes['nextToken'])#paging to get the rest of the recipes 
display_available_recipes = display_available_recipes + available_recipes['recipes']

display ([recipe  for recipe in display_available_recipes if 'domain' not in recipe])

[{'name': 'aws-item-affinity',
  'recipeArn': 'arn:aws:personalize:::recipe/aws-item-affinity',
  'status': 'ACTIVE',
  'creationDateTime': datetime.datetime(2021, 7, 15, 0, 0, tzinfo=tzlocal()),
  'lastUpdatedDateTime': datetime.datetime(2024, 6, 19, 16, 47, 19, 191000, tzinfo=tzlocal())},
 {'name': 'aws-item-attribute-affinity',
  'recipeArn': 'arn:aws:personalize:::recipe/aws-item-attribute-affinity',
  'status': 'ACTIVE',
  'creationDateTime': datetime.datetime(2021, 8, 25, 0, 0, tzinfo=tzlocal()),
  'lastUpdatedDateTime': datetime.datetime(2024, 6, 19, 16, 47, 19, 191000, tzinfo=tzlocal())},
 {'name': 'aws-next-best-action',
  'recipeArn': 'arn:aws:personalize:::recipe/aws-next-best-action',
  'status': 'ACTIVE',
  'creationDateTime': datetime.datetime(2023, 8, 11, 0, 0, tzinfo=tzlocal()),
  'lastUpdatedDateTime': datetime.datetime(2024, 6, 19, 16, 47, 19, 191000, tzinfo=tzlocal())},
 {'name': 'aws-personalized-ranking',
  'recipeArn': 'arn:aws:personalize:::recipe/aws-personalize

We want to rank a list of items for a specific user. This is useful if you have a collection of ordered items, such as search results, promotions, or curated lists, and you want to provide a personalized re-ranking for each of your users. To implement this use case, we will create a custom solution using the recipe.

The [Personalized-Ranking](https://docs.aws.amazon.com/personalize/latest/dg/personalized-ranking-recipes.html) recipe provides recommendations in ranked order based on predicted interest level. This recipe generates personalized rankings of items. A personalized ranking is a list of recommended items that are re-ranked for a specific user. This is useful if you have a collection of ordered items, such as search results, promotions, or curated lists, and you want to provide a personalized re-ranking for each of your users.

These custom solution will use the same datasets that we already implemented so all we need to do is create a solution and solution version for this recipe.

### Personalized Ranking

[Personalized-Ranking](https://docs.aws.amazon.com/personalize/latest/dg/personalized-ranking-recipes.html) takes in a list of items as well as a user. The items are then returned back in the order of most probable relevance for the user. Use cases are: to filter on unique categories that you do not have item metadata to create a filter, reorder a curated list by an expert, or when you have a broad collection that you would like better ordered for a particular user.

For our use case, using the product data, we could imagine that a retail application may want to create a shelf of Halloween products. We can generate these lists based on metadata we have. We would use personalized ranking to re-order the list of products for each user. 

We start by selecting the recipe.

In [13]:
workshop_rerank_recipe_arn = "arn:aws:personalize:::recipe/aws-personalized-ranking"

### Create the solution

First you create a solution using the recipe. Although you provide the dataset ARN in this step, the model is not yet trained. See this as an identifier instead of a trained model.

In [14]:
try:
    rerank_create_solution_response = personalize.create_solution(
        name = workshop_rerank_solution_name,
        datasetGroupArn = workshop_dataset_group_arn,
        recipeArn = workshop_rerank_recipe_arn
    )

    workshop_rerank_solution_arn = rerank_create_solution_response['solutionArn']
    print(json.dumps(rerank_create_solution_response, indent=2))

    print ('\nCreating the Personalize Ranking Solution with workshop_rerank_solution_arn = {}'.format(workshop_rerank_solution_arn))

except personalize.exceptions.ResourceAlreadyExistsException as e:
    workshop_rerank_solution_arn =  'arn:aws:personalize:'+region+':'+account_id+':solution/'+workshop_rerank_solution_name
    print('The Personalize Ranking Solution {} already exists.'.format(workshop_rerank_solution_arn))
    print ('\nWe will be using the existing Personalize Ranking Solution with workshop_rerank_solution_arn = {}'.format(workshop_rerank_solution_arn))
    

The Personalize Ranking Solution arn:aws:personalize:us-east-1:381491864570:solution/workshop_personalized_ranking_retail already exists.

We will be using the existing Personalize Ranking Solution with workshop_rerank_solution_arn = arn:aws:personalize:us-east-1:381491864570:solution/workshop_personalized_ranking_retail


Wait for the Solution to be created.

In [15]:
%%time

max_time = time.time() + 6*60*60 # 6 hours
while time.time() < max_time:
    describe_solution_response = personalize.describe_solution(
        solutionArn = workshop_rerank_solution_arn
    )
    status_solution =  describe_solution_response["solution"]['status']
    print("Solution: {}".format(status_solution))
    
    if status_solution == "ACTIVE":
        print("Build succeeded for {}".format(workshop_interactions_dataset_arn))
        break
        
    elif status_solution == "CREATE FAILED":
        print("Build failed for {}".format(workshop_interactions_dataset_arn))
        break
        
    if not status_solution == "ACTIVE":
        print("The Solution creation is still in progress")
    else:
        print("The Solution dataset is ACTIVE")
      

Solution: ACTIVE
Build succeeded for arn:aws:personalize:us-east-1:381491864570:dataset/personalize-immersion-day-retail/INTERACTIONS
CPU times: user 5.57 ms, sys: 0 ns, total: 5.57 ms
Wall time: 42.7 ms


### Create the solution version

Once you have a solution, you need to create a version in order to complete the model training. The training can take a while to complete, upwards of 25 minutes, and an average of 35 minutes for this recipe with our dataset. Normally, we would use a while loop to poll until the task is completed. 

In [16]:
workshop_rerank_solution_version_arn = None

solution_versions_list = personalize.list_solution_versions(
    solutionArn=workshop_rerank_solution_arn,
    maxResults=10
)['solutionVersions']

for solution_vers in solution_versions_list:
    if solution_vers['status'] in ['CREATE PENDING', 'CREATE IN_PROGRESS', 'ACTIVE']:
        workshop_rerank_solution_version_arn = solution_vers['solutionVersionArn']
    if workshop_rerank_solution_version_arn:
        break

if workshop_rerank_solution_version_arn:
    print ('\nWe will be using the existing Personalize Ranking Solution Version with workshop_rerank_solution_version_arn = {}'.format(workshop_rerank_solution_version_arn))
else:
    rerank_create_solution_version_response = personalize.create_solution_version(
        solutionArn = workshop_rerank_solution_arn
    )
    workshop_rerank_solution_version_arn = rerank_create_solution_version_response['solutionVersionArn']
    print(json.dumps(rerank_create_solution_version_response, indent=2))
    
    print ('\nTraining the Personalize Ranking Solution Version with workshop_rerank_solution_version_arn = {}'.format(workshop_rerank_solution_version_arn))
 


We will be using the existing Personalize Ranking Solution Version with workshop_rerank_solution_version_arn = arn:aws:personalize:us-east-1:381491864570:solution/workshop_personalized_ranking_retail/4faeb2e7


### View solution and Recommender creation status

To view the status updates in the console:

* In another browser tab you should already have the AWS Console up from opening this notebook instance. 
* Switch to that tab and search at the top for the service `Personalize`, then go to that service page. 
* Click `Dataset groups`.
* Click the name of your dataset group, if you did not change it, it is "personalize-poc-retail".
* Click `Recommenders`.
* You will see a list of the two recommenders you created above, including a column with the status of the recommender. Once it is `Active`, your recommender is ready.
* Click on `Custom Resources`. This opens up the list of custom resources that you have created.
* Click on `Solutions and Recipes` to see your re-ranking solutions. If you click on `workshop_personalized_ranking_retail` you can see the status of the solution versions. Once it is `Active`, your solution is ready to be reviewed. It is also capable of being deployed.

Or simply run the cell below to keep track of the recommenders and solution version creation status.

In [17]:
max_time = time.time() + 10*60*60 # 10 hours
while time.time() < max_time:

    # Recommender viewed_x_also_viewed
    version_response_recommender_viewed = personalize.describe_recommender(
        recommenderArn = workshop_recommender_customers_who_viewed_arn
    )
    status_viewed_x_also_viewed = version_response_recommender_viewed["recommender"]["status"]

    if status_viewed_x_also_viewed == "ACTIVE":
        print("Build succeeded for {}".format(workshop_recommender_customers_who_viewed_arn))
        
    elif status_viewed_x_also_viewed == "CREATE FAILED":
        print("Build failed for {}".format(workshop_recommender_customers_who_viewed_arn))
        break

    if not status_viewed_x_also_viewed == "ACTIVE":
        print('The recommender "Customers who viewed X also viewed" build is still in progress')
    else:
        print('The recommender "Customers who viewed X also viewed" is ACTIVE')

    # Recommender recommended_for_you
    version_response_recommender_recforu = personalize.describe_recommender(
        recommenderArn = workshop_recommender_recommended_for_you_arn
    )
    status_recommended_for_you = version_response_recommender_recforu["recommender"]["status"]

    if status_recommended_for_you == "ACTIVE":
        print("Build succeeded for {}".format(workshop_recommender_recommended_for_you_arn))
    elif status_recommended_for_you == "CREATE FAILED":
        print("Build failed for {}".format(workshop_recommender_recommended_for_you_arn))
        break

    if not status_recommended_for_you == "ACTIVE":
        print('The recommender "Recommended for you" build is still in progress')
    else:
        print('The recommender "Recommended for you" is ACTIVE')
        
    # Reranking Solution 
    version_response_rerank = personalize.describe_solution_version(
        solutionVersionArn = workshop_rerank_solution_version_arn
    )
    status_rerank_solution = version_response_rerank["solutionVersion"]["status"]

    if status_rerank_solution == "ACTIVE":
        print("Build succeeded for {}".format(workshop_rerank_solution_version_arn))
        
    elif status_rerank_solution == "CREATE FAILED":
        print("Build failed for {}".format(workshop_rerank_solution_version_arn))
        break

    if not status_rerank_solution == "ACTIVE":
        print("Rerank Solution Version build is still in progress")
    else:
        print("The Rerank solution is ACTIVE")
        
    if status_viewed_x_also_viewed == "ACTIVE" and status_recommended_for_you == 'ACTIVE' and status_rerank_solution == "ACTIVE":
        break

    print()
    time.sleep(60)

Build succeeded for arn:aws:personalize:us-east-1:381491864570:recommender/workshop_viewed_x_also_viewed
The recommender "Customers who viewed X also viewed" is ACTIVE
Build succeeded for arn:aws:personalize:us-east-1:381491864570:recommender/workshop_recommended_for_you
The recommender "Recommended for you" is ACTIVE
Build succeeded for arn:aws:personalize:us-east-1:381491864570:solution/workshop_personalized_ranking_retail/4faeb2e7
The Rerank solution is ACTIVE


## Deploy a Campaign <a class="anchor" id="deploy"></a>
[Back to top](#top)

Once a solution version is created, it is possible to get recommendations from them, and to get a feel for their overall behavior.

For real-time recommendations, after you prepare and import data and creating a solution, you are ready to deploy your solution version to generate recommendations. You deploy a solution version by creating an Amazon Personalize campaign. If you are getting batch recommendations, you don't need to create a campaign. For more information, see [Getting batch recommendations and user segments](https://docs.aws.amazon.com/personalize/latest/dg/recommendations-batch.html).

We will deploy a campaign for the personalized ranking solution version. 

### Create a campaign

A campaign is a hosted solution version: an endpoint which you can query for recommendations. Pricing is set by estimating throughput capacity (requests from users for personalization per second). When deploying a campaign, you set a minimum throughput per second (TPS) value. This service, like many within AWS, will automatically scale based on demand, but if latency is critical, you may want to provision ahead for larger demand. For this workshop, all minimum throughput thresholds are set to 1. For more information, see the [pricing page](https://aws.amazon.com/personalize/pricing/).

Once we're satisfied with our solution version, we need to create Campaigns for each solution version. When creating a campaign you specify the minimum transactions per second (`minProvisionedTPS`) that you expect to make against the service for this campaign. Personalize will automatically scale the inference endpoint up and down for the campaign to match demand but will never scale below `minProvisionedTPS`.

Let's create a campaign for our solution versions set at `minProvisionedTPS` of 1.

We will also add:

```python
    campaignConfig = {"enableMetadataWithRecommendations": True}
```
in order to inclde metadata when you ger recommendations. This is not required, but if not enabled, the recommender will only return the itemIDs of the recommended items. You can find more about this feature in the documentation([custom domain](https://docs.aws.amazon.com/personalize/latest/dg/campaigns.html)).

<div class="alert alert-block alert-warning">
<b>Important:</b> When you enable metadata in recommendations, you incur additional costs. For more information see Amazon Personalize pricing: https://aws.amazon.com/personalize/pricing/.
</div>



In [27]:
try:
    rerank_create_campaign_response = personalize.create_campaign(
        name = workshop_rerank_campaign_name,
        solutionVersionArn = workshop_rerank_solution_version_arn,
        minProvisionedTPS = 1,
        campaignConfig = {"enableMetadataWithRecommendations": True}
    )

    workshop_rerank_campaign_arn = rerank_create_campaign_response['campaignArn']
    print(json.dumps(rerank_create_campaign_response, indent=2))

    print ('\nCreating the personalize ranking campaign with workshop_rerank_campaign_arn = {}'.format(workshop_rerank_campaign_arn))

except personalize.exceptions.ResourceAlreadyExistsException as e:
    workshop_rerank_campaign_arn =  'arn:aws:personalize:'+region+':'+account_id+':campaign/'+workshop_rerank_campaign_name
    print('The personalize ranking campaign {} already exists.'.format(workshop_rerank_campaign_arn))
    print ('\nWe will be using the existing personalize ranking campaign with workshop_rerank_campaign_arn = {}'.format(workshop_rerank_campaign_arn))
    print('\nUpdating campaign to return metadata')
    try:
        response = personalize.update_campaign(
            campaignArn = workshop_rerank_campaign_arn,
            campaignConfig={
                'enableMetadataWithRecommendations': True
            }
        )
    except personalize.exceptions.InvalidInputException as e:
        print('\nCampaign {} already returns metadata as desired'.format(workshop_rerank_campaign_arn))

The personalize ranking campaign arn:aws:personalize:us-east-1:381491864570:campaign/workshop_personalized_ranking_retail_campaign already exists.

We will be using the existing personalize ranking campaign with workshop_rerank_campaign_arn = arn:aws:personalize:us-east-1:381491864570:campaign/workshop_personalized_ranking_retail_campaign

Updating campaign to return metadata


### View campaign creation status

This is how you view the status updates in the console:

* In another browser tab you should already have the AWS Console open from opening this notebook instance. 
* Switch to that tab and search at the top for the service `Personalize`, then go to that service page. 
* Click `Manage dataset groups`.
* Click the name of your dataset group.
* Click `Custom Resources`
* Click `Campaigns`.
* You will now see a list of all of the campaigns you created above, including a column with the status of the campaign. Once it is `Active`, your campaign is ready to be queried.

Or simply run the cell below to keep track of the campaign creation status of the campaign we created.

While you are waiting for this to complete you can learn more about campaigns in [the documentation](https://docs.aws.amazon.com/personalize/latest/dg/campaigns.html)

In [28]:
%%time

max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:

    version_response = personalize.describe_campaign(
        campaignArn = workshop_rerank_campaign_arn
    )
    status = version_response['campaign']['status']

    if status == 'ACTIVE':
        print('Build succeeded for {}'.format(workshop_rerank_campaign_arn))
    elif status == "CREATE FAILED":
        print('Build failed for {}'.format(workshop_rerank_campaign_arn))
        in_progress_campaigns.remove(workshop_rerank_campaign_arn)
    
    if status == 'ACTIVE' or status == 'CREATE FAILED':
        break
    else:
        print('The campaign build is still in progress')
        
    time.sleep(60)

Build succeeded for arn:aws:personalize:us-east-1:381491864570:campaign/workshop_personalized_ranking_retail_campaign
CPU times: user 3.94 ms, sys: 381 µs, total: 4.32 ms
Wall time: 40.2 ms


## Evaluate solution versions and recommenders <a class="anchor" id="eval"></a>
[Back to top](#top)

Personalize calculates these metrics based on a subset of the training data. The image below illustrates how Personalize splits the data. Given 10 users, with 10 interactions each (a circle represents an interaction), the interactions are ordered from oldest to newest based on the timestamp. Personalize uses all the interaction data from 90% of the users (blue circles) to train the solution version, and the remaining 10% for evaluation. For each of the users in the remaining 10%, 90% of their interaction data (green circles) is used as input for the call to the trained model. The remaining 10% of their data (orange circle) is compared to the output produced by the model and used to calculate the evaluation metrics.

![personalize metrics](images/personalize_metrics.png)

We recommend reading [the documentation](https://docs.aws.amazon.com/personalize/latest/dg/working-with-training-metrics.html) to understand the metrics, but we have also copied parts of the documentation below for convenience.

You need to understand the following terms regarding evaluation in Personalize:

* *Relevant recommendation* refers to a recommendation that matches a value in the testing data for the particular user.
* *Rank* refers to the position of a recommended item in the list of recommendations. Position 1 (the top of the list) is presumed to be the most relevant to the user.
* *Query* refers to the internal equivalent of a GetRecommendations call.

The metrics produced by Personalize are:
* **coverage**: The proportion of unique recommended items from all queries out of the total number of unique items in the training data (includes both the Items and Interactions datasets).
* **mean_reciprocal_rank_at_25**: The [mean of the reciprocal ranks](https://en.wikipedia.org/wiki/Mean_reciprocal_rank) of the first relevant recommendation out of the top 25 recommendations over all queries. This metric is appropriate if you're interested in the single highest ranked recommendation.
* **normalized_discounted_cumulative_gain_at_K**: Discounted gain assumes that recommendations lower on a list of recommendations are less relevant than higher recommendations. Therefore, each recommendation is discounted (given a lower weight) by a factor dependent on its position. To produce the [cumulative discounted gain](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) (DCG) at K, each relevant discounted recommendation in the top K recommendations is summed together. The normalized discounted cumulative gain (NDCG) is the DCG divided by the ideal DCG such that NDCG is between 0 - 1. (The ideal DCG is where the top K recommendations are sorted by relevance.) Amazon Personalize uses a weighting factor of 1/log(1 + position), where the top of the list is position 1. This metric rewards relevant items that appear near the top of the list, because the top of a list usually draws more attention.
* **precision_at_K**: The number of relevant recommendations out of the top K recommendations divided by K. This metric rewards precise recommendation of the relevant items.

Let's take a look at the evaluation metrics for each of the solutions produced in this notebook. Please note that your results might differ from the results described in the text of this notebook, due to the quality of the synthetic dataset. 

## "Customers who viewed X also viewed" recommender metrics

Retrieve the evaluation metrics for the "Customers who viewed X also viewed" recommender.

In [29]:
workshop_recommender_customers_who_viewed_metrics_response = personalize.describe_recommender(
    recommenderArn = workshop_recommender_customers_who_viewed_arn
)

for metric in workshop_recommender_customers_who_viewed_metrics_response['recommender']['modelMetrics']:
    print ("{}: {}".format(metric, workshop_recommender_customers_who_viewed_metrics_response['recommender']['modelMetrics'][metric]))

coverage: 0.9951
mean_reciprocal_rank_at_25: 0.3589
normalized_discounted_cumulative_gain_at_10: 0.3734
normalized_discounted_cumulative_gain_at_25: 0.4444
normalized_discounted_cumulative_gain_at_5: 0.3198
precision_at_10: 0.0972
precision_at_25: 0.0594
precision_at_5: 0.1355


* **coverage**: The proportion of unique recommended items from all queries out of the total number of unique items in the training data (includes both the Items and Interactions datasets).
* **mean_reciprocal_rank_at_25**: The [mean of the reciprocal ranks](https://en.wikipedia.org/wiki/Mean_reciprocal_rank) of the first relevant recommendation out of the top 25 recommendations over all queries. This metric is appropriate if you're interested in the single highest ranked recommendation.
* **normalized_discounted_cumulative_gain_at_K**: Discounted gain assumes that recommendations lower on a list of recommendations are less relevant than higher recommendations. Therefore, each recommendation is discounted (given a lower weight) by a factor dependent on its position. To produce the [cumulative discounted gain](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) (DCG) at K, each relevant discounted recommendation in the top K recommendations is summed together. The normalized discounted cumulative gain (NDCG) is the DCG divided by the ideal DCG such that NDCG is between 0 - 1. (The ideal DCG is where the top K recommendations are sorted by relevance.) Amazon Personalize uses a weighting factor of 1/log(1 + position), where the top of the list is position 1. This metric rewards relevant items that appear near the top of the list, because the top of a list usually draws more attention.
* **precision_at_K**: The number of relevant recommendations out of the top K recommendations divided by K. This metric rewards precise recommendation of the relevant items.

## "Recommended for you" recommender metrics

Retrieve the evaluation metrics for the "Recommended for you" Recommender.

In [30]:
workshop_recommender_recommended_for_you_metrics_response = personalize.describe_recommender(
    recommenderArn = workshop_recommender_recommended_for_you_arn
)

for metric in workshop_recommender_recommended_for_you_metrics_response['recommender']['modelMetrics']:
    print ("{}: {}".format(metric, workshop_recommender_recommended_for_you_metrics_response['recommender']['modelMetrics'][metric]))
   

coverage: 0.9765
mean_reciprocal_rank_at_25: 0.8359
normalized_discounted_cumulative_gain_at_10: 0.7491
normalized_discounted_cumulative_gain_at_25: 0.7801
normalized_discounted_cumulative_gain_at_5: 0.7229
precision_at_10: 0.1325
precision_at_25: 0.0623
precision_at_5: 0.2362


* **coverage**: The proportion of unique recommended items from all queries out of the total number of unique items in the training data (includes both the Items and Interactions datasets).
* **mean_reciprocal_rank_at_25**: The [mean of the reciprocal ranks](https://en.wikipedia.org/wiki/Mean_reciprocal_rank) of the first relevant recommendation out of the top 25 recommendations over all queries. This metric is appropriate if you're interested in the single highest ranked recommendation.
* **normalized_discounted_cumulative_gain_at_K**: Discounted gain assumes that recommendations lower on a list of recommendations are less relevant than higher recommendations. Therefore, each recommendation is discounted (given a lower weight) by a factor dependent on its position. To produce the [cumulative discounted gain](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) (DCG) at K, each relevant discounted recommendation in the top K recommendations is summed together. The normalized discounted cumulative gain (NDCG) is the DCG divided by the ideal DCG such that NDCG is between 0 - 1. (The ideal DCG is where the top K recommendations are sorted by relevance.) Amazon Personalize uses a weighting factor of 1/log(1 + position), where the top of the list is position 1. This metric rewards relevant items that appear near the top of the list, because the top of a list usually draws more attention.
* **precision_at_K**: The number of relevant recommendations out of the top K recommendations divided by K. This metric rewards precise recommendation of the relevant items.

## Personalized Ranking Metrics
Retrieve the evaluation metrics for the personalized ranking solution version.

In [31]:
rerank_solution_metrics_response = personalize.get_solution_metrics(
    solutionVersionArn = workshop_rerank_solution_version_arn
)

for metric in rerank_solution_metrics_response["metrics"]:
    print ("{}: {}".format(metric,rerank_solution_metrics_response["metrics"][metric] ))

coverage: 0.9684
mean_reciprocal_rank_at_25: 0.8342
normalized_discounted_cumulative_gain_at_10: 0.7478
normalized_discounted_cumulative_gain_at_25: 0.7779
normalized_discounted_cumulative_gain_at_5: 0.7234
precision_at_10: 0.1272
precision_at_25: 0.06
precision_at_5: 0.2274


* **coverage**: The proportion of unique recommended items from all queries out of the total number of unique items in the training data (includes both the Items and Interactions datasets).
* **mean_reciprocal_rank_at_25**: The [mean of the reciprocal ranks](https://en.wikipedia.org/wiki/Mean_reciprocal_rank) of the first relevant recommendation out of the top 25 recommendations over all queries. This metric is appropriate if you're interested in the single highest ranked recommendation.
* **normalized_discounted_cumulative_gain_at_K**: Discounted gain assumes that recommendations lower on a list of recommendations are less relevant than higher recommendations. Therefore, each recommendation is discounted (given a lower weight) by a factor dependent on its position. To produce the [cumulative discounted gain](https://en.wikipedia.org/wiki/Discounted_cumulative_gain) (DCG) at K, each relevant discounted recommendation in the top K recommendations is summed together. The normalized discounted cumulative gain (NDCG) is the DCG divided by the ideal DCG such that NDCG is between 0 - 1. (The ideal DCG is where the top K recommendations are sorted by relevance.) Amazon Personalize uses a weighting factor of 1/log(1 + position), where the top of the list is position 1. This metric rewards relevant items that appear near the top of the list, because the top of a list usually draws more attention.
* **precision_at_K**: The number of relevant recommendations out of the top K recommendations divided by K. This metric rewards precise recommendation of the relevant items.

## Using evaluation metrics <a class="anchor" id="usemetrics"></a>
[Back to top](#top)

It is important to use evaluation metrics carefully. There are a number of factors to keep in mind.

* If there is an existing recommendation system in place, this will have influenced the user's interaction history which you use to train your new solutions. This means the evaluation metrics are biased to favor the existing solution. If you work to push the evaluation metrics to match or exceed the existing solution, you may just be pushing the User Personalization to behave like the existing solution and might not end up with something better.


Keeping in mind these factors, the evaluation metrics produced by Personalize are generally useful for two cases:
1. Comparing the performance of solution versions trained on the same recipe, but with different values for the hyperparameters and features (impression data etc)
1. Comparing the performance of solution versions trained on different recipes. Here also keep in mind that the recipes answer different use cases and comparing them to each other might not make sense in your solution.

Properly evaluating a recommendation system is always best done through A/B testing while measuring actual business outcomes. Since recommendations generated by a system usually influence the user behavior which it is based on, it is better to run small experiments and apply A/B testing for longer periods of time. Over time, the bias from the existing model will fade.

Here we wait for our recommenders and campaign to be updated if they were created as part of an AWS workshop.

In [32]:
max_time = time.time() + 10*60*60 # 10 hours

try:
    while time.time() < max_time:

        # Personalized ranking Campaign
        up_version_response = personalize.describe_campaign(
            campaignArn = workshop_rerank_campaign_arn
        )

        pr_status_campaign_update = up_version_response['campaign']['latestCampaignUpdate']['status']

        if pr_status_campaign_update == 'ACTIVE':
            print('Update succeeded for {}'.format(workshop_rerank_campaign_arn))
        elif pr_status_campaign_update == "CREATE FAILED":
            print('Update failed for {}'.format(workshop_rerank_campaign_arn))
            break


        if not pr_status_campaign_update == "ACTIVE":
            print('The Personalized Ranking campaign update is still in progress')
        else:
            print("The Personalized Ranking campaign is ACTIVE")


        cwv_version_response = personalize.describe_recommender(
            recommenderArn = workshop_recommender_customers_who_viewed_arn
        )

        cwv_status_recommender_update = cwv_version_response['recommender']['latestRecommenderUpdate']['status']

        if cwv_status_recommender_update == 'ACTIVE':
            print('Update succeeded for {}'.format(workshop_recommender_customers_who_viewed_arn))
        elif cwv_status_recommender_update == "CREATE FAILED":
            print('Update failed for {}'.format(workshop_recommender_customers_who_viewed_arn))
            break

        if not cwv_status_recommender_update == "ACTIVE":
            print('The Customers who viewed recommender update is still in progress')
        else:
            print("The Customers who viewed X recommender is ACTIVE")

        rfy_version_response = personalize.describe_recommender(
            recommenderArn = workshop_recommender_recommended_for_you_arn
        )

        rfy_status_recommender_update = rfy_version_response['recommender']['latestRecommenderUpdate']['status']

        if rfy_status_recommender_update == 'ACTIVE':
            print('Update succeeded for {}'.format(workshop_recommender_recommended_for_you_arn))
        elif rfy_status_recommender_update == "CREATE FAILED":
            print('Update failed for {}'.format(workshop_recommender_recommended_for_you_arn))
            break

        if not rfy_status_recommender_update == "ACTIVE":
            print('The Recommended for you recommender update is still in progress')
        else:
            print("The Recommended for you recommender is ACTIVE")

        if cwv_status_recommender_update == "ACTIVE" and pr_status_campaign_update == 'ACTIVE' and rfy_status_recommender_update == 'ACTIVE':
            break

        sleep(60)
        print()
except KeyError as e:
    print("Recommenders and Campaigns likely created in notebook as no recent updates have been found")

Recommenders and Campaigns likely created in notebook as no recent updates have been found


## Storing useful variables <a class="anchor" id="vars"></a>
[Back to top](#top)

Before exiting this notebook, run the following cells to save the version ARNs for use in the next notebook.

In [33]:
%store workshop_recommender_customers_who_viewed_arn
%store workshop_recommender_recommended_for_you_arn
%store workshop_rerank_campaign_arn
%store workshop_rerank_solution_arn
%store workshop_rerank_solution_version_arn
%store workshop_rerank_campaign_arn
%store region
%store role_name
%store account_id

Stored 'workshop_recommender_customers_who_viewed_arn' (str)
Stored 'workshop_recommender_recommended_for_you_arn' (str)
Stored 'workshop_rerank_campaign_arn' (str)
Stored 'workshop_rerank_solution_arn' (str)
Stored 'workshop_rerank_solution_version_arn' (str)
Stored 'workshop_rerank_campaign_arn' (str)
Stored 'region' (str)
Stored 'role_name' (str)
Stored 'account_id' (str)


You're all set to move on to [the exploratory notebook `Retail_03_Inference_Layer.ipynb`](Retail_03_Inference_Layer.ipynb). Open it from the browser and you can start interacting with the Recommenders and Campaign and getting recommendations!