# Interacting with Recommenders, Campaigns and Filters <a class="anchor" id="top"></a>

Now that UnicornFlix has trained models for 3 different use cases (Top Picks for You, More like X and Reranking), we need to integrate them into our application. Amazon Personalize can make recommendations available via an Application Programming Interface (API). In addition Amazon Personalize includes features that allow you to easily integrate into applications and provide benefits like real time vending of recommendations based on recent application activity.

In this notebook, you will interact with campaigns and filters in Amazon Personalize.

1. [Introduction](#intro)
1. [Interact with Recommenders](#interact-recommenders)
1. [Interact with Campaigns](#interact-campaigns)
1. [Filters](#filters)
1. [Create Filters](#create-filters)
1. [Using Filters](#using-filters)
1. [Real-time Events](#real-time)
1. [Batch Recommendations](#batch)
1. [Wrap Up](#wrapup)

To run this notebook, you need to have run the previous notebooks, [`01_Data.ipynb`](01_Data.ipynb), and [`02_Training.ipynb`](02_Training.ipynb), where you created a dataset and imported interaction, item, and user metadata data into Amazon Personalize, created recommenders, solutions and campaigns.

## Introduction <a class="anchor" id="intro"></a>
[Back to top](#top)

At this point, you should have 2 Recommenders and one deployed Campaign. Once they are active, there are resources for querying the recommendations, and helper functions to digest the output into something more human-readable. 


In this Notebook we will interact with Recommenders and Campaigns and get recommendations. 

We will create and interact with filters and send live data to Amazon Personalize to see the effect of real-time interactions on recommendations.

The following diagram shows the resources that we will create in this section. with the section we are building  in this notebook highlighted in blue with a dashed outline.

![Workflow](images/03_Inference_Layer_Resources.jpg)

To get started, once again, we need to import libraries, load values from previous notebooks, and load the SDK.

In [1]:
import time
from time import sleep
import json
from datetime import datetime
import uuid
import random
import boto3
import pandas as pd

In [2]:
%store -r

In [3]:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

# Establish a connection to Personalize's event streaming
personalize_events = boto3.client(service_name='personalize-events')

Start by loading in the dataset which we can use for our lookup table.

In [4]:
# Create a dataframe for the items by reading in the correct source CSV
items_df = pd.read_csv(data_dir + '/imdb/items.csv', index_col=0)

# Render some sample data
items_df.head(5)

Unnamed: 0_level_0,TITLE,YEAR,IMDB_RATING,IMDB_NUMBEROFVOTES,PLOT,US_MATURITY_RATING_STRING,US_MATURITY_RATING,GENRES,CREATION_TIMESTAMP,PROMOTION
ITEM_ID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
tt0906325,Bagboy,2007,4,741,A teenager enters the competitive world of gro...,PG-13,13,Comedy,1167609600,False
tt0906665,Sukiyaki Western Django,2007,6,14582,A nameless gunfighter arrives in a town ripped...,R,17,Action|Western,1167609600,False
tt0907657,Once,2007,8,109787,A modern-day musical about a busker and an imm...,R,17,Drama|Music|Romance,1167609600,False
tt0910905,In the Electric Mist,2009,6,16592,A detective in post-Katrina New Orleans has a ...,R,17,Crime|Drama|Mystery|Thriller,1230768000,False
tt0910554,Frequently Asked Questions About Time Travel,2009,7,32051,"While drinking at their local pub, three socia...",PG-13,13,Comedy|Sci-Fi,1230768000,False


By defining the ID column as the index column it is easy to return a movie by just querying the ID. 

In [5]:
movie_id_example = 'tt0095016'
title = items_df.loc[movie_id_example]['TITLE']
print(title)

Die Hard


## Interact with recommenders <a class="anchor" id="interact-recommenders"></a>
[Back to top](#top)

Now that the recommenders have been trained, lets have a look at the recommendations we can get for our users!

### "More like X" Recommender

'More like X' requires an item and a user as input, and it will return items which users interact with in similar ways to their interaction with the input item. In this particular case the item is a movie. 

The cells below will handle getting recommendations from the "More like X" Recommender and rendering the results. Let's see what the recommendations are for the first item we looked at earlier in this notebook (Die Hard).

We will be using the `recommenderArn`, the `itemId`, the `userId` as well as the number or results we want, `numResults`.

In [6]:
# First pick a user
test_user_id = "1"

Let's use the capability of Amazon Personalize to return item metadata to get the movie title.

In [7]:
get_recommendations_response = personalize_runtime.get_recommendations(
    recommenderArn = workshop_recommender_more_like_x_arn,
    itemId = movie_id_example,
    userId = test_user_id,
    numResults = 20,
    metadataColumns = {
            "ITEMS": ['TITLE']
        }
)

First lets get the response directly from the get_recommendations API, which by default returns 25 items, but can be adjusted.

In [8]:
print(get_recommendations_response)

{'ResponseMetadata': {'RequestId': '9cb90645-d250-4dfe-acb7-5ce62d23c7b6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Wed, 28 Aug 2024 13:36:11 GMT', 'content-type': 'application/json', 'content-length': '2742', 'connection': 'keep-alive', 'x-amzn-requestid': '9cb90645-d250-4dfe-acb7-5ce62d23c7b6'}, 'RetryAttempts': 0}, 'itemList': [{'itemId': 'tt0103064', 'metadata': {'title': 'Terminator 2: Judgment Day'}}, {'itemId': 'tt0099423', 'metadata': {'title': 'Die Hard 2'}}, {'itemId': 'tt0078748', 'metadata': {'title': 'Alien'}}, {'itemId': 'tt0094226', 'metadata': {'title': 'The Untouchables'}}, {'itemId': 'tt0114746', 'metadata': {'title': '12 Monkeys'}}, {'itemId': 'tt0112573', 'metadata': {'title': 'Braveheart'}}, {'itemId': 'tt0086190', 'metadata': {'title': 'Star Wars: Episode VI - Return of the Jedi'}}, {'itemId': 'tt0120382', 'metadata': {'title': 'The Truman Show'}}, {'itemId': 'tt0112864', 'metadata': {'title': 'Die Hard with a Vengeance'}}, {'itemId': 'tt0120735', 'metadata

Depending on your Item catalog, this response will not be easy to quickly evaluate the recommendations, so lets use the function we created above.

In [9]:
item_list = get_recommendations_response['itemList']
print("If you liked " + items_df.loc[movie_id_example]['TITLE'] + ", you may also like:")
print()
for item in item_list:
    print(item['metadata']['title'])

If you liked Die Hard, you may also like:

Terminator 2: Judgment Day
Die Hard 2
Alien
The Untouchables
12 Monkeys
Braveheart
Star Wars: Episode VI - Return of the Jedi
The Truman Show
Die Hard with a Vengeance
Lock, Stock and Two Smoking Barrels
One Flew Over the Cuckoo's Nest
Aliens
Gladiator
The Princess Bride
Psycho
2001: A Space Odyssey
Interview with the Vampire: The Vampire Chronicles
The Blues Brothers
Batman Begins
Natural Born Killers


Congrats, this is your first list of recommendations! This list is fine, but it would be better to see the recommendations for similar movies render in a nice dataframe. Again, let's create a helper function to achieve this.

In [10]:
# Update DF rendering
pd.set_option('display.max_rows', 30)

def get_new_recommendations_df(recommendations_df, movie_id, user_id):
    # Get the movie name
    movie_name = items_df.loc[movie_id_example]['TITLE']
    # Get the recommendations
    get_recommendations_response = personalize_runtime.get_recommendations(
        recommenderArn = workshop_recommender_more_like_x_arn,
        itemId = str(movie_id),
        userId = user_id,
        numResults = 15,
        metadataColumns = {
            "ITEMS": ['TITLE']
        }
    )
    # Builda new dataframe of recommendations
    item_list = get_recommendations_response['itemList']
    recommendation_list = []
    for item in item_list:
        try:
            movie = item['metadata']['title']
            recommendation_list.append(movie)
        except:
            pass
    new_rec_df = pd.DataFrame(recommendation_list, columns = [movie_name])
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_df], axis=1)
    return recommendations_df

Let's sample some data from our dataset to test our "More like X" Recommender. Grab 5 random movies from our dataframe.

Note: We are going to show similar titles, so you may want to re-run the sample until you recognize some of the movies listed

In [11]:
samples = items_df.sample(5)
samples

Unnamed: 0_level_0,TITLE,YEAR,IMDB_RATING,IMDB_NUMBEROFVOTES,PLOT,US_MATURITY_RATING_STRING,US_MATURITY_RATING,GENRES,CREATION_TIMESTAMP,PROMOTION
ITEM_ID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
tt0892318,Letters to Juliet,2010,6,91343,Sophie dreams of becoming a writer and travels...,PG,8,Adventure|Comedy|Drama|Romance,1262304000,False
tt0071110,Airport 1975,1974,6,9449,"A 747 in flight collides with a small plane, a...",PG,8,Action|Drama|Thriller,126230400,False
tt0035753,Le Corbeau,1943,8,8893,A French village doctor becomes the target of ...,Not Rated,18,Crime|Drama|Mystery|Thriller,-852076800,False
tt3205376,Slow West,2015,7,42266,A young Scottish man travels across America in...,R,17,Action|Adventure|Drama|Romance|Thriller|Western,1420070400,False
tt0116448,The Great White Hype,1996,6,8350,"The boxing champ's promoter thinks, change is ...",R,17,Comedy|Sport,820454400,False


In [12]:
more_like_x_recommendations_df = pd.DataFrame()
movies = samples.index.tolist()

for movie in movies:
    more_like_x_recommendations_df = get_new_recommendations_df(more_like_x_recommendations_df, movie, test_user_id)

more_like_x_recommendations_df

Unnamed: 0,Die Hard,Die Hard.1,Die Hard.2,Die Hard.3,Die Hard.4
0,Little Ashes,Star Trek: The Motion Picture,Once Upon a Time in the West,Red Hill,Lawnmower Man 2: Beyond Cyberspace
1,Shrek 2,Star Trek IV: The Voyage Home,One Flew Over the Cuckoo's Nest,"Crazy, Stupid, Love.",My Best Friend's Wedding
2,"Crazy, Stupid, Love.",Men in Black,The Godfather: Part II,Knocked Up,The Waterboy
3,Up,Contact,Trainspotting,500 Days of Summer,Happy Gilmore
4,500 Days of Summer,Star Trek: Generations,Saving Private Ryan,Superbad,G.I. Jane
5,Inside Out,Star Trek: Insurrection,The Godfather,Pan's Labyrinth,The Cable Guy
6,The Hunger Games,The Hunt for Red October,Network,Hot Fuzz,Broken Arrow
7,Toy Story 3,Arachnophobia,Chinatown,Call Me by Your Name,The Nutty Professor
8,Meet the Fockers,Aliens,The Blues Brothers,There Will Be Blood,The Birdcage
9,School of Rock,TRON,Das Boot,The Devil Wears Prada,White Men Can't Jump


You may notice that some of the items look the same, hopefully not all of them do (this is more likely with a smaller # of interactions, which will be more common with the movielens small dataset). Inside our fictional streaming service UnicornFlix we can use this recommender to provide recommendations within a movie detail page with the heading "If you like this movie, you may also like".

### "Top picks for you" Recommender

"Top picks for you" supports personalization of the items for a specific user based on their past behavior and can intake real time events in order to alter recommendations for a user without retraining. 

Since "Top picks for you" relies on having a sampling of users, let's load the data we need for that and select 3 random users. Since Movielens does not include user data, we will select 3 random numbers from the range of user id's in the dataset.

In [13]:
users = random.sample(range(1, 600), 3)

Now we render the recommendations for our 3 random users from above. After that, we will explore real-time interactions before moving on to Personalized Ranking.

"Top picks for you" requires only a user as input, and it will return items that are relevant for that particular user. In this particular case the item is a movie.

The cells below will handle getting recommendations from the "Top picks for you" Recommender and rendering the results. 

We will be using the `recommenderArn`, the `userId` as well as the number or results we want, `numResults`.

Again, we create a helper function to render the results in a nice dataframe.

#### API call results

In [14]:
# Update DF rendering
pd.set_option('display.max_rows', 30)

def get_new_recommendations_df_users(recommendations_df, user_id):

    # Get the recommendations
    get_recommendations_response = personalize_runtime.get_recommendations(
        recommenderArn = workshop_recommender_top_picks_arn,
        userId = str(user_id),
        numResults = 15,
        metadataColumns = {
                "ITEMS": ['TITLE']
        }
    )
    # Build a new dataframe of recommendations
    item_list = get_recommendations_response['itemList']
    recommendation_list = []
    for item in item_list:
        try:
            movie = item['metadata']['title']
        except:
            movie = items_df.loc[item['itemId']]['TITLE']
        recommendation_list.append(movie)
    new_rec_df = pd.DataFrame(recommendation_list, columns = [user_id])
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_df], axis=1)
    return recommendations_df

In [15]:
recommendations_df_users = pd.DataFrame()

for user in users:
    recommendations_df_users = get_new_recommendations_df_users(recommendations_df_users, user)

recommendations_df_users

Unnamed: 0,411,131,591
0,What's Love Got to Do with It,The Terminator,Almost Famous
1,Muriel's Wedding,Apocalypse Now,"Crouching Tiger, Hidden Dragon"
2,Como agua para chocolate,Star Wars: Episode VI - Return of the Jedi,The Royal Tenenbaums
3,The Crush,Full Metal Jacket,Being John Malkovich
4,Smoke,Back to the Future,"O Brother, Where Art Thou?"
5,Nobody's Fool,Die Hard,Aladdin
6,Immortal Beloved,12 Monkeys,Cast Away
7,Searching for Bobby Fischer,Terminator 2: Judgment Day,What's Eating Gilbert Grape
8,Something to Talk About,Star Wars: Episode IV - A New Hope,Philadelphia
9,Bullets Over Broadway,A Clockwork Orange,Apollo 13


Here we clearly see that the recommendations for each user are different. This recommender could be used inside UnicornFlix on the main page under the title "Top picks for You", where each subscriber would be provided a list or titles that are personalized to their viewing habits, we will work on changing these in realtime later in this notebook.

If you were to need a cache for these results, you could start by running the API calls through all your users and store the results, or you could use a batch export.

## Interact with Campaigns <a class="anchor" id="interact-campaigns"></a>
[Back to top](#top)

### Personalized Ranking

The core use case for personalized ranking is to take a collection of items and to render them in priority or probable order of interest for a user. For a VOD application you want dynamically render a personalized shelf/rail/carousel based on some information (director, location, superhero franchise, movie time period, etc...). This may not be information that you have in your metadata, so an item metadata filter will not work, however you may have this information within you system to generate the item list. 

To demonstrate this, we will use the same user from before and a random collection of items.

In [16]:
rerank_user = user
rerank_items = items_df.sample(15).index.tolist()
rerank_list = []
for item in rerank_items:
    movie = items_df.loc[item]['TITLE']
    rerank_list.append(movie)
rerank_df = pd.DataFrame(rerank_list, columns = ['Un-Ranked'])
rerank_df

Unnamed: 0,Un-Ranked
0,The Return
1,Captain Phillips
2,Mondo cane
3,Mad City
4,The Lego Batman Movie
5,Free Willy
6,The Babysitter
7,Hotel Transylvania 2
8,Smooth Talk
9,Hard Eight


Now lets take that list and make the personalized ranking API call.

In [17]:
rerank_item_list = []
for item in rerank_items:
    rerank_item_list.append(str(item))
    
# Get recommended reranking
get_recommendations_response_rerank = personalize_runtime.get_personalized_ranking(
        campaignArn = workshop_rerank_campaign_arn,
        userId = str(rerank_user),
        inputList = rerank_item_list,
        metadataColumns = {
                "ITEMS": ['TITLE']
        }
)

We will add the reranked items as a second column to the original dataframe, for a side-by-side comparison.

In [18]:
ranked_list = []
item_list = get_recommendations_response_rerank['personalizedRanking']
for item in item_list:
    try: 
        movie = item['metadata']['title']
    except:
        movie = items_df.loc[item['itemId']]['TITLE']
    ranked_list.append(movie)
ranked_df = pd.DataFrame(ranked_list, columns = ['Re-Ranked'])
rerank_df = pd.concat([rerank_df, ranked_df], axis=1)
rerank_df

Unnamed: 0,Un-Ranked,Re-Ranked
0,The Return,Captain Phillips
1,Captain Phillips,Free Willy
2,Mondo cane,An Affair of Love
3,Mad City,Mondo cane
4,The Lego Batman Movie,The Allnighter
5,Free Willy,Hard Eight
6,The Babysitter,The Return
7,Hotel Transylvania 2,The Babysitter
8,Smooth Talk,Beneath the Planet of the Apes
9,Hard Eight,Mad City


You can see above how each entry was re-ordered based on the model's understanding of the user. This is a popular task when you have a collection of items to surface a user that cannot be easily categorized in your metadata, for instance "Critics picks" which are curated by a person.

# Filters <a class="anchor" id="filters"></a>

## Create Filters <a class="anchor" id="create-filters"></a>
[Back to top](#top)

Personalize can utilize either [static or dynamic filters](https://docs.aws.amazon.com/personalize/latest/dg/filter.html). Static filters are where the filter properties are built into the filter itself, which makes invocation simpler, but gives less flexibility. An example of this would be a "Horror" movie filter, which invokes the get_recommendations_response api with the specific filter of GENRE = Horror. In order to create a recommendation for each filter that would require 10+ filters. Personalize also supports dynamic filters, where the values can be passed at runtime, allowing for a single filter of GENRE, where the actual genre is passed at runtime. 

Filters can be created for fields of both Items and Events. 

A few common use cases for dynamic filters in Video On Demand are:

Categorical filters based on Item Metadata - Often your item metadata will have information about the title such as Genre, Keyword, Year, Director, Actor etc. Filtering on these can provide recommendations within that data, such as action movies, Steven Spielberg movies, Movies from 1995 etc.

Range based filters based on Item Metadata - Personalize supports range operations in both static and dynamic filters. Filtering based on a range can be used to create recommendations such as "Whats on now" in live tv scenarios, best of decade, Movies rated over 8/10 stars etc

Events - you may want to filter out certain events and provide results based on those events, such as moving a title from a "suggestions to watch" recommendation to a "watch again" recommendations.


### Create the Genre filter

Within UnicornFlix, our catalog includes content from many different genres (Action, Documentary, Sci-Fi etc). Since there are a lot of genres to filter on, we will create a dynamic filter using the dynamic variable $GENRE, this will allow us to pass in the variable at runtime rather than create a static filter for each genre.

In [19]:
filter_name = 'Genre'
try:
    create_genre_filter_response = personalize.create_filter(
        name=filter_name,
        datasetGroupArn = workshop_dataset_group_arn,
        filterExpression = 'INCLUDE ItemID WHERE Items.GENRES IN ($GENRE)'
    )
    
    genre_filter_arn = create_genre_filter_response['filterArn']

    print('Creating the Personalize filter with genre_filter_arn {}.'.format(genre_filter_arn))
    
except personalize.exceptions.ResourceAlreadyExistsException as e:
    genre_filter_arn = 'arn:aws:personalize:'+region+':'+account_id+':filter/'+filter_name
    print('The Personalize filter {} already exists.'.format(filter_name))
    print ('\nWe will be using the existing Personalize Filter with genre_filter_arn = {}'.format(genre_filter_arn))
    

Creating the Personalize filter with genre_filter_arn arn:aws:personalize:us-east-1:809697808660:filter/Genre.


### Create the Promotion filter

In [20]:
promotion_filter_name = 'promotion'
try:
    create_promotion_filter_response = personalize.create_filter(
        name=promotion_filter_name,
        datasetGroupArn = workshop_dataset_group_arn,
        filterExpression = 'INCLUDE ItemID WHERE Items.PROMOTION IN ("true")'
    )
    
    promotion_filter_arn = create_promotion_filter_response['filterArn']
    print('Creating the Personalize filter with promotion_filter_arn {}.'.format(promotion_filter_arn))
    

except personalize.exceptions.ResourceAlreadyExistsException as e:
    promotion_filter_arn = 'arn:aws:personalize:'+region+':'+account_id+':filter/'+promotion_filter_name
    print('The Personalize filter {} already exists.'.format(promotion_filter_name))
    print ('\nWe will be using the existing Personalize Filter with promotion_filter_arn = {}'.format(promotion_filter_arn))
    

Creating the Personalize filter with promotion_filter_arn arn:aws:personalize:us-east-1:809697808660:filter/promotion.


### Create the Year Range filter
Personalize can also filter based on numerical ranges. This can be helpful if you want to look for items that are within a given time window, above a certain rating etc. For that we will create a filter for decades.

In [21]:
year_range_filter_name = 'YearRange'
try:
    create_genre_filter_response = personalize.create_filter(
        name=year_range_filter_name,
        datasetGroupArn = workshop_dataset_group_arn,
        filterExpression = 'INCLUDE ItemID WHERE Items.YEAR >= $YEAR1 AND Items.YEAR < $YEAR2'
    )
    
    year_range_filter_arn = create_genre_filter_response['filterArn']
    print('Creating the Personalize filter with year_range_filter_arn {}.'.format(year_range_filter_arn))
    

except personalize.exceptions.ResourceAlreadyExistsException as e:
    year_range_filter_arn = 'arn:aws:personalize:'+region+':'+account_id+':filter/'+year_range_filter_name
    print('The Personalize filter {} already exists.'.format(year_range_filter_name))
    print ('\nWe will be using the existing Personalize Filter with year_range_filter_arn = {}'.format(year_range_filter_arn))
    
    
    

Creating the Personalize filter with year_range_filter_arn arn:aws:personalize:us-east-1:809697808660:filter/YearRange.


### Create filters for Watched and Unwatched movies
Lets also create 2 event filters for watched and unwatched content. The Top picks for you and More like X already have a filter on implemented to filter out watched events.

In [22]:
watched_filter_name = 'watched'

try:
    create_watched_filter_response = personalize.create_filter(
        name=watched_filter_name,
        datasetGroupArn = workshop_dataset_group_arn,
        filterExpression = 'INCLUDE ItemID WHERE Interactions.event_type IN ("Watch")'
    )
    
    watched_filter_arn = create_watched_filter_response['filterArn']
    print('Creating the Personalize filter with watched_filter_arn {}.'.format(watched_filter_arn))
    

except personalize.exceptions.ResourceAlreadyExistsException as e:
    watched_filter_arn = 'arn:aws:personalize:'+region+':'+account_id+':filter/'+watched_filter_name
    print('The Personalize filter {} already exists.'.format(watched_filter_name))
    print ('\nWe will be using the existing Personalize Filter with watched_filter_arn = {}'.format(watched_filter_arn))
    

unwatched_filter_name = 'unwatched'
try:
    create_unwatched_filter_response = personalize.create_filter(
        name = unwatched_filter_name,
        datasetGroupArn = workshop_dataset_group_arn,
        filterExpression = 'EXCLUDE ItemID WHERE Interactions.event_type IN ("Watch")'
    )
    unwatched_filter_arn = create_unwatched_filter_response['filterArn']
    print('Creating the Personalize filter with unwatched_filter_arn {}.'.format(unwatched_filter_arn))

except personalize.exceptions.ResourceAlreadyExistsException as e:
    unwatched_filter_arn = 'arn:aws:personalize:'+region+':'+account_id+':filter/'+unwatched_filter_name
    print('\nThe Personalize filter {} already exists.'.format(unwatched_filter_name))
    print ('\nWe will be using the existing Personalize Filter with unwatched_filter_arn = {}'.format(unwatched_filter_arn))
    

Creating the Personalize filter with watched_filter_arn arn:aws:personalize:us-east-1:809697808660:filter/watched.
Creating the Personalize filter with unwatched_filter_arn arn:aws:personalize:us-east-1:809697808660:filter/unwatched.


Before we move on we want to add those filters to a list as well so they can be used later.

In [23]:
interaction_filter_arns = [watched_filter_arn, unwatched_filter_arn]

In [24]:
interaction_filter_arns

['arn:aws:personalize:us-east-1:809697808660:filter/watched',
 'arn:aws:personalize:us-east-1:809697808660:filter/unwatched']

In [25]:
max_time = time.time() + 10*60*60 # 10 hours
while time.time() < max_time:

    # Genre filter
    version_response = personalize.describe_filter(
        filterArn = genre_filter_arn
    )
    status_genre_filter = version_response["filter"]["status"]

    if status_genre_filter == "ACTIVE":
        print("Build succeeded for {}".format(genre_filter_arn))
        
    elif status_genre_filter == "CREATE FAILED":
        print("Build failed for {}".format(genre_filter_arn))
        break
        
    if not status_genre_filter == "ACTIVE":
        print("The filter Genre is still in progress")
    else:
        print("The filter Genre is ACTIVE")

    # Promotion filter
    version_response = personalize.describe_filter(
        filterArn = promotion_filter_arn
    )
    status_promotion_filter = version_response["filter"]["status"]

    if status_promotion_filter == "ACTIVE":
        print("Build succeeded for {}".format(promotion_filter_arn))
        
    elif status_promotion_filter == "CREATE FAILED":
        print("Build failed for {}".format(promotion_filter_arn))
        break
        
    if not status_promotion_filter == "ACTIVE":
        print("The filter Promotion is still in progress")
    else:
        print("The filter Promotion is ACTIVE")   
        
        
    # Decade filter
    version_response = personalize.describe_filter(
        filterArn = year_range_filter_arn
    )
    status_year_range_filter = version_response["filter"]["status"]

    if status_year_range_filter == "ACTIVE":
        print("Build succeeded for {}".format(year_range_filter_arn))
        
    elif status_year_range_filter == "CREATE FAILED":
        print("Build failed for {}".format(year_range_filter_arn))
        break
        
    if not status_year_range_filter == "ACTIVE":
        print("The filter YearRange is still in progress")
    else:
        print("The filter YearRange is ACTIVE")

    # Watched filter
    version_response = personalize.describe_filter(
        filterArn = watched_filter_arn
    )
    status_watched_filter = version_response["filter"]["status"]

    if status_watched_filter == "ACTIVE":
        print("Build succeeded for {}".format(watched_filter_arn))
        
    elif status_watched_filter == "CREATE FAILED":
        print("Build failed for {}".format(watched_filter_arn))
        break
        
    if not status_watched_filter == "ACTIVE":
        print("The filter Watched is still in progress")
    else:
        print("The filter Watched is ACTIVE")
 
    # Unatched filter
    version_response = personalize.describe_filter(
        filterArn = unwatched_filter_arn
    )
    status_unwatched_filter = version_response["filter"]["status"]

    if status_unwatched_filter == "ACTIVE":
        print("Build succeeded for {}".format(unwatched_filter_arn))
        
    elif status_unwatched_filter == "CREATE FAILED":
        print("Build failed for {}".format(unwatched_filter_arn))
        break
        
    if not status_unwatched_filter == "ACTIVE":
        print("The filter Unwatched is still in progress")
    else:
        print("The filter Unwatched is ACTIVE")
    
    if status_genre_filter == "ACTIVE" and status_year_range_filter == 'ACTIVE' and status_promotion_filter  == 'ACTIVE' and status_watched_filter == "ACTIVE" and status_unwatched_filter == "ACTIVE":
        break

    print()
    time.sleep(30)

Build succeeded for arn:aws:personalize:us-east-1:809697808660:filter/Genre
The filter Genre is ACTIVE
Build succeeded for arn:aws:personalize:us-east-1:809697808660:filter/promotion
The filter Promotion is ACTIVE
The filter YearRange is still in progress
The filter Watched is still in progress
The filter Unwatched is still in progress

Build succeeded for arn:aws:personalize:us-east-1:809697808660:filter/Genre
The filter Genre is ACTIVE
Build succeeded for arn:aws:personalize:us-east-1:809697808660:filter/promotion
The filter Promotion is ACTIVE
Build succeeded for arn:aws:personalize:us-east-1:809697808660:filter/YearRange
The filter YearRange is ACTIVE
The filter Watched is still in progress
The filter Unwatched is still in progress

Build succeeded for arn:aws:personalize:us-east-1:809697808660:filter/Genre
The filter Genre is ACTIVE
Build succeeded for arn:aws:personalize:us-east-1:809697808660:filter/promotion
The filter Promotion is ACTIVE
Build succeeded for arn:aws:personalize

## Using Filters <a class="anchor" id="using-filters"></a>
[Back to top](#top)

Now that the Filters have been created we can use them to filter our recommendations.

### Get recommendations using a static filter

Earlier we chose to promote a specific set of movies (movies set in or about Las Vegas), which we created the item metadata field of "PROMOTION", and set it to true for 35 movies. We will use a static filter to ensure that these items are retured in recommendations. First we will use a static filter to ensure that ALL recommendations meet this criteria of the filter.

In [26]:
# Update DF rendering
pd.set_option('display.max_rows', 30)

def get_new_recommendations_w_promotions_only_df_users(recommendations_df, user_id):

    # Get the recommendations
    get_recommendations_response = personalize_runtime.get_recommendations(
        recommenderArn = workshop_recommender_top_picks_arn,
        userId = str(user_id),
        filterArn = promotion_filter_arn,
        numResults = 15,
        metadataColumns = {
            "ITEMS": ['TITLE']
        }
    )
    # Build a new dataframe of recommendations
    itemList = get_recommendations_response['itemList']
    recommendation_list = []
    for item in itemList:
        movie = item['metadata']['title']
        recommendation_list.append(movie)
    new_rec_df = pd.DataFrame(recommendation_list, columns = [user_id])
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_df], axis=1)
    return recommendations_df

In [27]:
recommendations_df_users = pd.DataFrame()

for user in users:
    recommendations_df_users = get_new_recommendations_w_promotions_only_df_users(recommendations_df_users, user)

recommendations_df_users

Unnamed: 0,411,131,591
0,Bugsy,The Hangover,Fear and Loathing in Las Vegas
1,Showgirls,Ocean's Eleven,Ocean's Eleven
2,Honeymoon in Vegas,Fear and Loathing in Las Vegas,Leaving Las Vegas
3,Leaving Las Vegas,Leaving Las Vegas,Dodgeball
4,Leprechaun 3,Diamonds Are Forever,The Hangover
5,The Gauntlet,The Hangover Part II,Rat Race
6,Fear and Loathing in Las Vegas,Next,Showgirls
7,The Cooler,Dodgeball,Bugsy
8,Breathless,Showgirls,Honeymoon in Vegas
9,Vegas Vacation,Smokin' Aces,The Hangover Part II


In this scenario for the 3 test users we got recommendations for previously, the static filter limits the recommendations to only those that meet the filter criteria. This does not provide a good balance of recommendations that are relevant and recommendations that meet the filter criteria. We could use this inside UnicornFlix under a carousel/rail named "Vegas Movies", as it is only providing recommendations for Vegas movies.

### Get recommendations using a promotions filter

Personalize has a capability to "promote" certain items, which can be defined as a promotion filter. A configurable subset of the recommendations returned will be from the items defined as promotion, in this case 30%. This should provide a balance of items that are the most relevant, and items from the promoted items. Note: The promoted items will be the most relevant items in the promoted items subset based on that users interactions.

In [28]:
# Update DF rendering
pd.set_option('display.max_rows', 30)

def get_new_recommendations_w_promotions_df_users(recommendations_df, user_id):
    # Get the recommendations
    get_recommendations_response = personalize_runtime.get_recommendations(
        recommenderArn = workshop_recommender_top_picks_arn,
        userId = str(user_id),
        numResults = 15,
        promotions = [{
        "name" : "vegas_promotion",
        "percentPromotedItems" : 30,
        "filterArn": promotion_filter_arn,
        }],
        metadataColumns = {
            "ITEMS": ['TITLE']
        }
    )
    # Build a new dataframe of recommendations
    itemList = get_recommendations_response['itemList']
    promotion_list = []
    recommendation_list = []
    for item in itemList:
        try:
            movie = item['metadata']['title']
        except:
            movie = items_df.loc[item['itemId']]['TITLE']
        try: 
            item["promotionName"]
            promotion_list.append('Y')
        except KeyError:
            promotion_list.append('N')
        recommendation_list.append(movie)
    new_rec_df = pd.DataFrame(recommendation_list, columns = [user_id])
    promotions_df = pd.DataFrame(promotion_list, columns = [user_id])
    promotions_df.columns = ['Promotion']
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_df], axis=1)
    recommendations_df = pd.concat([recommendations_df, promotions_df], axis=1)
    return recommendations_df

In [29]:
recommendations_df_users = pd.DataFrame()

for user in users:
    recommendations_df_users = get_new_recommendations_w_promotions_df_users(recommendations_df_users, user)

recommendations_df_users

Unnamed: 0,411,Promotion,131,Promotion.1,591,Promotion.2
0,Bugsy,Y,The Terminator,N,Almost Famous,N
1,What's Love Got to Do with It,N,Apocalypse Now,N,"Crouching Tiger, Hidden Dragon",N
2,Muriel's Wedding,N,Star Wars: Episode VI - Return of the Jedi,N,The Royal Tenenbaums,N
3,Showgirls,Y,Full Metal Jacket,N,Fear and Loathing in Las Vegas,Y
4,Honeymoon in Vegas,Y,Back to the Future,N,Being John Malkovich,N
5,Leaving Las Vegas,Y,The Hangover,Y,Ocean's Eleven,Y
6,Como agua para chocolate,N,Die Hard,N,"O Brother, Where Art Thou?",N
7,The Crush,N,12 Monkeys,N,Aladdin,N
8,Smoke,N,Ocean's Eleven,Y,Cast Away,N
9,Nobody's Fool,N,Fear and Loathing in Las Vegas,Y,Leaving Las Vegas,Y


We can see that promotions are mixed in with the most relevant items for the users. This is different from applying a filter directly where **all** items recommended would belong to the promotion. If a user already had certain promoted items in their recommendations before apllying the promotions, they will still be recommended. Inside UnicornFlix, we could use this as the main page "Top picks for you" carousel/rail, where the recommendations are across all of the catalog titles, with Vegas movies being inserted into the recommndations.

### Get recommendations using the date range filter

First lets create functions to get recommendations and pass in dynamic filter values.


In [30]:
def get_new_recommendations_df_by_year_range_filter(recommendations_df, user_id, year_range_filter_arn, filter_value1, filter_value2):
    # Get the recommendations
    get_recommendations_response = personalize_runtime.get_recommendations(
        recommenderArn = workshop_recommender_top_picks_arn,
        userId = str(user_id),
        filterArn = year_range_filter_arn,
        filterValues = {"YEAR1": filter_value1,"YEAR2": filter_value2},
        numResults = 15,
        metadataColumns = {
            "ITEMS": ['TITLE']
        }
    )
    # Build a new dataframe of recommendations
    item_list = get_recommendations_response['itemList']
    recommendation_list = []
    for item in item_list:
        movie = item['metadata']['title']
        recommendation_list.append(movie)
    filter_name = year_range_filter_arn.split('/')[1]
    new_rec_DF = pd.DataFrame(recommendation_list, columns = [filter_value1])
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_DF], axis=1)
    return recommendations_df

In [31]:
decades_to_filter = [1930,1940,1950,1960,1970,1980,1990,2000,2010]

In [32]:
# Iterate through Decades
recommendations_df_decade_shelves = pd.DataFrame()
for decade in decades_to_filter:
    recommendations_df_decade_shelves = get_new_recommendations_df_by_year_range_filter(recommendations_df_decade_shelves, user, year_range_filter_arn , str(decade), str(decade+10))
    
recommendations_df_decade_shelves

Unnamed: 0,1930,1940,1950,1960,1970,1980,1990,2000,2010
0,The Wizard of Oz,Casablanca,Singin' in the Rain,Dr. Strangelove or: How I Learned to Stop Worr...,Monty Python's Life of Brian,The Breakfast Club,Being John Malkovich,Almost Famous,The Artist
1,Modern Times,It's a Wonderful Life,Ben-Hur,The Graduate,Willy Wonka & the Chocolate Factory,Stand by Me,Aladdin,"Crouching Tiger, Hidden Dragon",Midnight in Paris
2,Gone with the Wind,Citizen Kane,North by Northwest,For a Few Dollars More,Jaws,Cinema Paradiso,What's Eating Gilbert Grape,The Royal Tenenbaums,The Help
3,Snow White and the Seven Dwarfs,Fantasia,Seven Samurai,The Sound of Music,The Exorcist,The Princess Bride,Philadelphia,"O Brother, Where Art Thou?",The Perks of Being a Wallflower
4,City Lights,The Treasure of the Sierra Madre,Vertigo,"The Good, the Bad and the Ugly",Chinatown,The Little Mermaid,Apollo 13,Cast Away,The Big Short
5,It Happened One Night,The Great Dictator,Some Like It Hot,Once Upon a Time in the West,Taxi Driver,Ferris Bueller's Day Off,Office Space,Shrek,50/50
6,Animal Crackers,Miracle on 34th Street,Rear Window,Mary Poppins,Annie Hall,Batman,Groundhog Day,American Psycho,Moneyball
7,Bringing Up Baby,Arsenic and Old Lace,12 Angry Men,Psycho,Monty Python and the Holy Grail,The Goonies,As Good as It Gets,"Monsters, Inc.",Captain Phillips
8,A Night at the Opera,The Philadelphia Story,All About Eve,2001: A Space Odyssey,The Deer Hunter,Amadeus,Beauty and the Beast,Moulin Rouge!,About Time
9,Mr. Smith Goes to Washington,Pinocchio,Limelight,The Jungle Book,Network,Platoon,Dogma,Road to Perdition,The Intouchables


We can see that creating just one filter, we are able to get a variety of subsets of movies that can be used to populate multiple carousels/rails in UnicornFlix ("1990s movies", "1980s movies" etc)

### Get recommendations unsing the genre filter

Now that we have the ability to generate recommendations via the "Top Picks For You" Recommender, we would like to extend this feature to all of the rails/shleves in the UnicornFlix app. We can use a dynamic filter to provide recommendations based on any item metadata field. In this case we will use the "GENRES" categorical information in the IMDb dataset. 

In [33]:
def get_new_recommendations_df_by_genre_filter(recommendations_df, user_id, genre_filter_arn, filter_values):
    # Get the recommendations
    get_recommendations_response = personalize_runtime.get_recommendations(
        recommenderArn = workshop_recommender_top_picks_arn,
        userId = str(user_id),
        filterArn = genre_filter_arn,
        filterValues = { "GENRE": "\"" + filter_values + "\""},
        numResults = 15,
        metadataColumns = {
            "ITEMS": ['TITLE']
        }
    )
    # Build a new dataframe of recommendations
    item_list = get_recommendations_response['itemList']
    recommendation_list = []
    for item in item_list:
        try:
            movie = item['metadata']['title']
        except:
            movie = items_df.loc[item['itemId']]['TITLE']
        recommendation_list.append(movie)
    filter_name = genre_filter_arn.split('/')[1]
    new_rec_DF = pd.DataFrame(recommendation_list, columns = [filter_values])
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_DF], axis=1)
    return recommendations_df

In [34]:
# Create a dataframe for the items by reading in the correct source CSV
items_meta_df = items_df
# Render some sample data
items_meta_df.head(5)

Unnamed: 0_level_0,TITLE,YEAR,IMDB_RATING,IMDB_NUMBEROFVOTES,PLOT,US_MATURITY_RATING_STRING,US_MATURITY_RATING,GENRES,CREATION_TIMESTAMP,PROMOTION
ITEM_ID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
tt0906325,Bagboy,2007,4,741,A teenager enters the competitive world of gro...,PG-13,13,Comedy,1167609600,False
tt0906665,Sukiyaki Western Django,2007,6,14582,A nameless gunfighter arrives in a town ripped...,R,17,Action|Western,1167609600,False
tt0907657,Once,2007,8,109787,A modern-day musical about a busker and an imm...,R,17,Drama|Music|Romance,1167609600,False
tt0910905,In the Electric Mist,2009,6,16592,A detective in post-Katrina New Orleans has a ...,R,17,Crime|Drama|Mystery|Thriller,1230768000,False
tt0910554,Frequently Asked Questions About Time Travel,2009,7,32051,"While drinking at their local pub, three socia...",PG-13,13,Comedy|Sci-Fi,1230768000,False


Now what we want to do is determine the genres to filter on, for that we need a list of all genres. First we will get all the unique values of the column GENRE, then split strings on | if they exist, everyone will then get added to a long list which will be converted to a set for efficiency. That set will then be made into a list so that it can be iterated, and we can then use the get recommendatioins API.

In [35]:
unique_genre_field_values = items_meta_df['GENRES'].unique()

genre_val_list = []

def process_for_bar_char(val, val_list):
    if '|' in val:
        values = val.split('|')
        for item in values:
            val_list.append(item)
    elif '(' in val:
        pass
    else:
        val_list.append(val)
    return val_list
    

for val in unique_genre_field_values:
    genre_val_list = process_for_bar_char(val, genre_val_list)

genres_to_filter = list(set(genre_val_list))

In [36]:
genres_to_filter

['Drama',
 'Thriller',
 'Sport',
 'Mystery',
 'War',
 'Sci-Fi',
 'Fantasy',
 'Family',
 'Musical',
 'Action',
 'Adventure',
 'Short',
 'Film-Noir',
 'Comedy',
 'Music',
 'Horror',
 'History',
 'News',
 'Biography',
 'Documentary',
 'Crime',
 'Romance',
 'Animation',
 'Western']

In [37]:
# Iterate through Genres
recommendations_df_genre_shelves = pd.DataFrame()
for genre in genres_to_filter:
    recommendations_df_genre_shelves = get_new_recommendations_df_by_genre_filter(recommendations_df_genre_shelves, user, genre_filter_arn , genre)
    
recommendations_df_genre_shelves

Unnamed: 0,Drama,Thriller,Sport,Mystery,War,Sci-Fi,Fantasy,Family,Musical,Action,...,Music,Horror,History,News,Biography,Documentary,Crime,Romance,Animation,Western
0,Almost Famous,Heat,Jerry Maguire,Mulholland Dr.,Life Is Beautiful,Gattaca,"Crouching Tiger, Hidden Dragon",Aladdin,Aladdin,"Crouching Tiger, Hidden Dragon",...,Almost Famous,The Others,Apollo 13,Blackfish,Blow,Bowling for Columbine,"O Brother, Where Art Thou?","Crouching Tiger, Hidden Dragon",Aladdin,Dances with Wolves
1,The Royal Tenenbaums,Road to Perdition,Raging Bull,The Others,Platoon,Jurassic Park,Being John Malkovich,Beauty and the Beast,Beauty and the Beast,The Boondock Saints,...,"O Brother, Where Art Thou?",28 Days Later...,Amadeus,Born Into Brothels: Calcutta's Red Light Kids,The Insider,Crumb,Heat,Aladdin,Beauty and the Beast,Unforgiven
2,Being John Malkovich,Mulholland Dr.,Hoop Dreams,The Fugitive,Mulan,Minority Report,Aladdin,Shrek,Moulin Rouge!,Jurassic Park,...,Amadeus,Interview with the Vampire: The Vampire Chroni...,Enemy at the Gates,"Food, Inc.",Amadeus,Hoop Dreams,American Psycho,Cast Away,Shrek,Dead Man
3,Cast Away,Training Day,Lagaan: Once Upon a Time in India,Minority Report,Enemy at the Gates,28 Days Later...,Groundhog Day,"Monsters, Inc.",The Little Mermaid,The Fugitive,...,Billy Elliot,The Exorcist,Black Hawk Down,13th,The Aviator,American Movie,Road to Perdition,Groundhog Day,"Monsters, Inc.",Tombstone
4,What's Eating Gilbert Grape,The Boondock Saints,Dodgeball,The Bourne Identity,Black Hawk Down,12 Monkeys,Beauty and the Beast,Toy Story,The Wizard of Oz,Batman,...,The Piano,Army of Darkness,Gandhi,Citizenfour,Gandhi,Spellbound,Blow,As Good as It Gets,Toy Story,Legends of the Fall
5,Philadelphia,The Others,Happy Gilmore,The Game,The Patriot,Total Recall,Dogma,The Princess Bride,"South Park: Bigger, Longer & Uncut",True Lies,...,Walk the Line,The Ring,The Patriot,Restrepo,Catch Me If You Can,Buena Vista Social Club,Training Day,Beauty and the Beast,The Little Mermaid,For a Few Dollars More
6,Apollo 13,Gattaca,Rocky,12 Monkeys,Braveheart,The Fifth Element,Shrek,The Little Mermaid,Mulan,Minority Report,...,The Pianist,Pitch Black,Braveheart,An Inconvenient Truth,Braveheart,Hearts of Darkness: A Filmmaker's Apocalypse,The Boondock Saints,Moulin Rouge!,Ice Age,"The Good, the Bad and the Ugly"
7,As Good as It Gets,Jurassic Park,Million Dollar Baby,Donnie Darko,Casablanca,Donnie Darko,"Monsters, Inc.",The Wizard of Oz,Singin' in the Rain,The Mummy,...,High Fidelity,Shaun of the Dead,Tombstone,Requiem for the American Dream,A Beautiful Mind,The Blue Planet,The Fugitive,Big Fish,"South Park: Bigger, Longer & Uncut",Once Upon a Time in the West
8,The Breakfast Club,The Insider,The Blind Side,Insomnia,Gone with the Wind,Vanilla Sky,Toy Story,Harry Potter and the Sorcerer's Stone,Willy Wonka & the Chocolate Factory,Die Hard with a Vengeance,...,8 Mile,Sleepy Hollow,Gone with the Wind,The Culture High,October Sky,Baraka,Minority Report,The Princess Bride,Mulan,Back to the Future Part III
9,Dogma,The Fugitive,The Sandlot,Vanilla Sky,Legends of the Fall,Signs,Big Fish,Ice Age,The Lion King,The Bourne Identity,...,School of Rock,The Thing,JFK,Burma VJ: Reporting from a Closed Country,Walk the Line,The Endurance,Gangs of New York,The Little Mermaid,Princess Mononoke,Of Mice and Men


Great! Now we have a personalized list of recommendations for each genre in our catalog. We can use this on the homepage for each of the carousels/rails that correspond to genre, such as "Action movies", "Documentaries" etc.

## Real-time Events<a class="anchor" id="real-time"></a>
[Back to top](#top)

The next topic is real-time events. Personalize has the ability to listen to events from your application in order to update the recommendations shown to the user. This is especially useful in media workloads, like video-on-demand, where a customer's intent may differ based on if they are watching with their children or on their own.

Additionally the events that are recorded via this system are stored until a delete call from you is issued, and they are used as historical data alongside the other interaction data you provided when you train your next models.

Start by creating an event tracker that is attached to the dataset group. This event tracker will add information to the dataset and will influence the recommendations.

In [38]:
event_tracker_name = 'MovieTracker'
try: 
    create_event_tracker_response = personalize.create_event_tracker(
        name = event_tracker_name,
        datasetGroupArn=workshop_dataset_group_arn
        )
    event_tracker_arn = create_event_tracker_response['eventTrackerArn']
    print(json.dumps(create_event_tracker_response, indent=2))
    print ('\nCreating the event_tracker with event_tracker_arn = {}'.format(event_tracker_arn))
    tracking_id = create_event_tracker_response['trackingId']
    print ('\nAnd trackingId = {}'.format(tracking_id))
    

except personalize.exceptions.ResourceAlreadyExistsException as e:
    event_tracker_list = personalize.list_event_trackers( 
        datasetGroupArn= workshop_dataset_group_arn
    )['eventTrackers']
    
    event_tracker_arn = event_tracker_list[0]['eventTrackerArn']
    
    describe_event_tracker_response = personalize.describe_event_tracker(
        eventTrackerArn=event_tracker_arn
    )
    tracking_id = describe_event_tracker_response['eventTracker']['trackingId']
    
    print ('\nThe the Event Tracker with event_tracker_name {} already exists'.format(event_tracker_name))
    print ('\nWe will be using the existing Event Tracker with event_tracker_arn = {}'.format(event_tracker_arn))
    print ('\nAnd trackingId = {}'.format(tracking_id))

{
  "eventTrackerArn": "arn:aws:personalize:us-east-1:809697808660:event-tracker/ca4f41f3",
  "trackingId": "55d3a75d-f05d-4fd1-a503-8b740c13c67c",
  "ResponseMetadata": {
    "RequestId": "29716de3-f0d4-43c1-9af9-5e27a30bf114",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Wed, 28 Aug 2024 13:40:57 GMT",
      "content-type": "application/x-amz-json-1.1",
      "content-length": "139",
      "connection": "keep-alive",
      "x-amzn-requestid": "29716de3-f0d4-43c1-9af9-5e27a30bf114",
      "strict-transport-security": "max-age=47304000; includeSubDomains",
      "x-frame-options": "DENY",
      "cache-control": "no-cache",
      "x-content-type-options": "nosniff"
    },
    "RetryAttempts": 0
  }
}

Creating the event_tracker with event_tracker_arn = arn:aws:personalize:us-east-1:809697808660:event-tracker/ca4f41f3

And trackingId = 55d3a75d-f05d-4fd1-a503-8b740c13c67c


In [39]:
max_time = time.time() + 3*60*60 # 3 hours
while time.time() < max_time:

    version_response = personalize.describe_event_tracker(
        eventTrackerArn = event_tracker_arn
    )
    status = version_response['eventTracker']['status']

    if status == 'ACTIVE':
        print('Build succeeded for {}'.format(event_tracker_arn))
    elif status == "CREATE FAILED":
        print('Build failed for {}'.format(wevent_tracker_arn))
        in_progress_campaigns.remove(event_tracker_arn)
    
    if status == 'ACTIVE' or status == 'CREATE FAILED':
        break
    else:
        print('The event tracker build is still in progress')
        
    time.sleep(15)

The event tracker build is still in progress
The event tracker build is still in progress
Build succeeded for arn:aws:personalize:us-east-1:809697808660:event-tracker/ca4f41f3


We will create some code that simulates a user interacting with a particular item. After running this code, you will get recommendations that differ from the results above.

We start by creating some methods for the simulation of real time events.

In [40]:
sessionDict = {}

def send_movie_click(user_id, item_id, event_type):
    """
    Simulates a click as an envent
    to send an event to Amazon Personalize's Event Tracker
    """
    # Configure Session
    session_dict = {}
    try:
        session_id = sessionDict[str(user_id)]
    except:
        session_dict[str(user_id)] = str(uuid.uuid1())
        session_id = session_dict[str(user_id)]
        
    # Configure Properties:
    event = {
    "itemId": str(item_id),
    }
    event_json = json.dumps(event)
        
    # Make Call
    
    personalize_events.put_events(
    trackingId = tracking_id,
    userId= str(user_id),
    sessionId = session_id,
    eventList = [{
        'sentAt': int(time.time()),
        'eventType': str(event_type),
        'properties': event_json
        }]
    )

def get_new_recommendations_df_users_real_time(recommendations_df, user_id, item_id, event_type):
    # Get the artist name (header of column)
    movie_name = items_df.loc[item_id]['TITLE']
    
    # Interact with different movies
    print('sending event ' + event_type + ' for ' + movie_name)
    send_movie_click(user_id=user_id, item_id=item_id,event_type=event_type)
    # Get the recommendations (note you should have a base recommendation DF created before)
    time.sleep(2)
    get_recommendations_response = personalize_runtime.get_recommendations(
        recommenderArn = workshop_recommender_top_picks_arn,
        userId = str(user_id),
        numResults = 15,
        metadataColumns = {
            "ITEMS": ['TITLE']
        }
    )
    # Build a new dataframe of recommendations
    item_list = get_recommendations_response['itemList']
    recommendation_list = []
    for item in item_list:
        try:
            movie = item['metadata']['title']
        except:
            movie = items_df.loc[item['itemId']]['TITLE']
        recommendation_list.append(movie)
    new_rec_df = pd.DataFrame(recommendation_list, columns = [movie_name])
    # Add this dataframe to the old one
    #recommendations_df = recommendations_df.join(new_rec_DF)
    recommendations_df = pd.concat([recommendations_df, new_rec_df], axis=1)
    return recommendations_df

At this point, we haven't generated any real-time events yet; we have only set up the code. To compare the recommendations before and after the real-time events, let's pick one user and generate the original recommendations for them.

In [41]:
# Get recommendations for the user
get_recommendations_response = personalize_runtime.get_recommendations(
        recommenderArn = workshop_recommender_top_picks_arn,
        userId = str(rerank_user),
        numResults = 15,
        metadataColumns = {
            "ITEMS": ['TITLE']
        }
    )

# Build a new dataframe for the recommendations
item_list = get_recommendations_response['itemList']
recommendation_list = []
for item in item_list:
    try:
        movie = item['metadata']['title']
    except:
        movie = items_df.loc[item['itemId']]['TITLE']
    recommendation_list.append(movie)
user_recommendations_df = pd.DataFrame(recommendation_list, columns = [rerank_user])
user_recommendations_df

Unnamed: 0,591
0,Almost Famous
1,"Crouching Tiger, Hidden Dragon"
2,The Royal Tenenbaums
3,Being John Malkovich
4,"O Brother, Where Art Thou?"
5,Aladdin
6,Cast Away
7,What's Eating Gilbert Grape
8,Philadelphia
9,Apollo 13


Now we have a list of recommendations for this user before we have applied any real-time events. Now let's pick 3 random artists which we will simulate our user interacting with, and then see how this changes the recommendations.

In [42]:
# Next generate 3 random movies
movies = items_df.sample(3).index.tolist()

In [43]:
# Note this will take about 15 seconds to complete due to the sleeps
for movie in movies:
    user_recommendations_df = get_new_recommendations_df_users_real_time(user_recommendations_df, rerank_user, movie,'click')
    time.sleep(5)
    

sending event click for Gone Baby Gone
sending event click for 21
sending event click for The Prince and the Pauper


Now we can look at how the click events changed the recommendations. Note: given that only a single movie was clicked between each get recomendations call, the changes may be subtle. The more interactions a user sends in the larger the change in recommendations will be.

In [44]:
user_recommendations_df

Unnamed: 0,591,Gone Baby Gone,21,The Prince and the Pauper
0,Almost Famous,The Fugitive,Almost Famous,Almost Famous
1,"Crouching Tiger, Hidden Dragon",Dog Day Afternoon,"Crouching Tiger, Hidden Dragon","Crouching Tiger, Hidden Dragon"
2,The Royal Tenenbaums,Chinatown,The Royal Tenenbaums,The Royal Tenenbaums
3,Being John Malkovich,Die Hard 2,Being John Malkovich,Being John Malkovich
4,"O Brother, Where Art Thou?",Citizen Kane,"O Brother, Where Art Thou?","O Brother, Where Art Thou?"
5,Aladdin,Contact,Aladdin,Aladdin
6,Cast Away,Eyes Wide Shut,Cast Away,Cast Away
7,What's Eating Gilbert Grape,Enemy of the State,What's Eating Gilbert Grape,What's Eating Gilbert Grape
8,Philadelphia,The Devil's Advocate,Philadelphia,Philadelphia
9,Apollo 13,Shooter,Apollo 13,Apollo 13


In the cell above, the first column after the index is the user's default recommendations from the "Top picks for you" recommender, and each column after that has as a header of the movie that they interacted with via a real time event, and the recommendations after this event occurred. 

Inside of UnicornFlix, we would use this feature to make sure that the list of recommendations are staying current with exactly what the viewer is interacting with the application.

The behavior may not shift very much or a lot; this is due to the relatively limited nature of this dataset and effect of a few random clicks. If you wanted to better understand this, try simulating clicking more movies to see the impact.

### Filtering on user's recent interactions
Now lets look at the event filters, which allow you to filter items based on the interaction data. For this dataset, it could be click or watch based on the data we imported, but could be based on whatever interaction schema you design (click, rate, like, watch, purchase etc.) 

We will create a new helper function to use the personalized ranking campaign, sice the recommenders have an inbuilt filter to filter out watched content.

In [45]:
def get_new_ranked_recommendations_df_by_static_filter(recommendations_df, user_id, rerank_item_list, filter_arn):
    
    # Todo: update with new feature     
    # Get the recommendations
    get_recommendations_response = personalize_runtime.get_personalized_ranking(
        campaignArn = workshop_rerank_campaign_arn,
        userId = str(user_id),
        inputList = rerank_item_list,
        filterArn = filter_arn,
        metadataColumns = {
            "ITEMS": ['TITLE']
        }
    )
    # Build a new dataframe of recommendations
    item_list = get_recommendations_response['personalizedRanking']
    recommendation_list = []
    for item in item_list:
        try:
            movie = item['metadata']['title']
        except:
            movie = items_df.loc[item['itemId']]['TITLE']
        recommendation_list.append(movie)


    filter_name = filter_arn.split('/')[1]
    new_rec_df = pd.DataFrame(recommendation_list, columns = [filter_name])
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_df], axis=1)
    return recommendations_df

In [46]:
recommendations_df_events = pd.DataFrame()
for filter_arn in interaction_filter_arns:
    recommendations_df_events = get_new_ranked_recommendations_df_by_static_filter(recommendations_df_events, rerank_user, rerank_item_list, filter_arn)
    
recommendations_df_events

Unnamed: 0,watched,unwatched
0,,Mondo cane
1,,Free Willy
2,,Captain Phillips
3,,Hard Eight
4,,Beneath the Planet of the Apes
5,,The Return
6,,Mad City
7,,Hotel Transylvania 2
8,,The Allnighter
9,,Underworld Awakening


It is very likely that the `watched` column of the dataframe will have several `NaN` values, this is because we randomly selected some movies from the catalog to rank and our randomly selected user has not watched them.

Now lets send a watch event in for the top 4 unwatched recommendations, which would simulate watching 4 movies. In a VOD application, you may choose to send in an event after they have watched a significant amount (over 75%) of a piece of content. Sending at 100% complete could miss people that stop short of the credits.

In [47]:
ranked_unwatched_recommendations_response = personalize_runtime.get_personalized_ranking(
    campaignArn = workshop_rerank_campaign_arn,
    userId = str(rerank_user),
    inputList = rerank_item_list,
    filterArn = filter_arn,
    metadataColumns = {
        "ITEMS": ['TITLE']
    })

item_list = ranked_unwatched_recommendations_response['personalizedRanking'][:4]

# Todo: update with new feature  

for item in item_list:
    print('sending event Watch for ' + items_df.loc[item['itemId']]['TITLE'])
    send_movie_click(user_id=rerank_user, item_id=item['itemId'], event_type='Watch')
    time.sleep(5)

sending event Watch for Mondo cane
sending event Watch for Free Willy
sending event Watch for Captain Phillips
sending event Watch for Hard Eight


Now we can look at the event filters to see the updated watched and unwatched recommendations 

In [48]:
recommendations_df_events = pd.DataFrame()
for filter_arn in interaction_filter_arns:
    recommendations_df_events = get_new_ranked_recommendations_df_by_static_filter(recommendations_df_events, rerank_user, rerank_item_list, filter_arn)
recommendations_df_events

Unnamed: 0,watched,unwatched
0,Hard Eight,Mad City
1,Captain Phillips,An Affair of Love
2,Free Willy,The Return
3,Mondo cane,The Babysitter
4,,The Allnighter
5,,Smooth Talk
6,,Hotel Transylvania 2
7,,Beneath the Planet of the Apes
8,,Underworld Awakening
9,,The Lego Batman Movie


Inside of UnicornFlix we could use this feature to filter out movies that have already been watched, so that recommendations will be for movies that the viewer has not already seen. Another feature that can be developed is a "Watch again" carousel/rail, where you can recommend content that has been seen in the past, but due to current viewing habits, may be interesting to the viewer to revisit.

## Wrap up <a class="anchor" id="wrapup"></a>
[Back to top](#top)

With that you now have a fully working collection of models to tackle various recommendation and personalization scenarios, as well as the skills to manipulate customer data to better integrate with the service, and a knowledge of how to do all this over APIs and by leveraging open source data science tools.

You'll want to make sure that you clean up all of the resources deployed during this POC. We have provided a separate notebook which shows you how to identify and delete the resources in [`Media_06_Clean_Up.ipynb`](Media_06_Clean_Up.ipynb). Optionally feel free to investigate the section on outbound marketing [`Media_05_Optional_Personalized_Emails_with_Amazon_Personalize_and_Generative_AI`](Media_05_Optional_Personalized_Emails_with_Amazon_Personalize_and_Generative_AI.ipynb)showing how to create a what to watch list for customers who may not have logged onto the platform in a while.

In [49]:
%store workshop_dataset_group_arn
%store region
%store role_name

Stored 'workshop_dataset_group_arn' (str)
Stored 'region' (str)
Stored 'role_name' (str)
