# Interacting with Campaigns <a class="anchor" id="top"></a>

In this notebook, you will deploy and interact with campaigns in Amazon Personalize.

1. [Introduction](#intro)
1. [Create campaigns](#create)
1. [Interact with campaigns](#interact)
1. [Batch recommendations](#batch)
1. [Wrap up](#wrapup)

## Introduction <a class="anchor" id="intro"></a>
[Back to top](#top)

At this point, you should have several solutions and at least one solution version for each. Once a solution version is created, you can deploy it to get recommendations and to get a feel for its overall behavior.

This notebook starts off by deploying each of the solution versions from the previous notebook into individual campaigns. Once they are active, there are resources for querying the recommendations, and helper functions to digest the output into something more human-readable. 

As you work with your customer on Amazon Personalize, you can modify the helper functions to fit the structure of their data input files so that the additional rendering keeps working.

To get started, once again, we need to import libraries, load values from previous notebooks, and load the SDK.

In [1]:
import time
from time import sleep
import json
from datetime import datetime
import uuid
import random

import boto3
import pandas as pd

In [2]:
%store -r

In [3]:
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

# Establish a connection to Personalize's event streaming
personalize_events = boto3.client(service_name='personalize-events')

## Interact with campaigns <a class="anchor" id="interact"></a>
[Back to top](#top)

Now that all campaigns are deployed and active, we can start to get recommendations via an API call. Each campaign is based on a different recipe, and the recipes behave in slightly different ways because they serve different use cases. We will cover the campaigns in a different order than in previous notebooks, so that we deal with the possible complexities in ascending order (i.e. simplest first).

First, let's create a supporting function to help make sense of the results returned by a Personalize campaign. Personalize returns only an `item_id`. This is great for keeping data compact, but it means you need to query a database or lookup table to get a human-readable result for the notebooks. We will create a helper function to return a human-readable result from the MovieLens dataset.

Start by loading in the dataset which we can use for our lookup table.

In [4]:
# Create a dataframe for the items by reading in the correct source CSV
items_df = pd.read_csv(dataset_dir + '/movies.csv', sep=',', usecols=[0,1], encoding='latin-1', dtype={'movieId': "object", 'title': "str"},index_col=0)

# Render some sample data
items_df.head(5)

movieId,title
1,Toy Story (1995)
2,Jumanji (1995)
3,Grumpier Old Men (1995)
4,Waiting to Exhale (1995)
5,Father of the Bride Part II (1995)


By defining the ID column as the index column, it is trivial to return a movie by just querying the ID. Movie #589 should be Terminator 2: Judgment Day.

In [5]:
movie_id_example = 589
title = items_df.loc[movie_id_example]['title']
print(title)

Terminator 2: Judgment Day (1991)


That isn't terrible, but it would get messy to repeat this everywhere in our code, so the function below will clean that up.

In [6]:
def get_movie_by_id(movie_id, movie_df=items_df):
    """
    This takes in a movie_id from Personalize so it will be a string,
    converts it to an int, and then does a lookup in a default or specified
    dataframe.
    
    A really broad try/except clause was added in case anything goes wrong.
    
    Feel free to add more debugging or filtering here to improve results if
    you hit an error.
    """
    try:
        return movie_df.loc[int(movie_id)]['title']
    except Exception:
        return "Error obtaining title"

Now let's test a few simple values to check our error catching.

In [7]:
# A known good id (The Princess Bride)
print(get_movie_by_id(movie_id="1197"))
# A bad type of value
print(get_movie_by_id(movie_id="987.9393939"))
# Really bad values
print(get_movie_by_id(movie_id="Steve"))

Princess Bride, The (1987)
Error obtaining title
Error obtaining title


Great! Now we have a way of rendering results. 

### SIMS

SIMS requires just an item as input, and it will return items which users interact with in similar ways to their interaction with the input item. In this particular case the item is a movie. 

The cells below will handle getting recommendations from SIMS and rendering the results. Let's see what the recommendations are for the first item we looked at earlier in this notebook (Terminator 2: Judgment Day).

In [8]:
get_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = sims_campaign_arn,
    itemId = str(589),
)

In [9]:
item_list = get_recommendations_response['itemList']
for item in item_list:
    print(get_movie_by_id(movie_id=item['itemId']))

Jurassic Park (1993)
Braveheart (1995)
Terminator, The (1984)
Fugitive, The (1993)
Speed (1994)
Crimson Tide (1995)
GoldenEye (1995)
Batman (1989)
Clear and Present Danger (1994)
True Lies (1994)
Mask, The (1994)
Die Hard: With a Vengeance (1995)
In the Line of Fire (1993)
Lion King, The (1994)
Forrest Gump (1994)
Ghost (1990)
Apollo 13 (1995)
Star Trek: Generations (1994)
Cliffhanger (1993)
Firm, The (1993)
Die Hard (1988)
Mission: Impossible (1996)
Seven (a.k.a. Se7en) (1995)
Indiana Jones and the Last Crusade (1989)
Mrs. Doubtfire (1993)


Congrats, this is your first list of recommendations! This list is fine, but it would be better to see the recommendations for our sample collection of movies rendered in a nice dataframe. Again, let's create a helper function to achieve this.

In [10]:
# Update DF rendering
pd.set_option('display.max_rows', 30)

def get_new_recommendations_df(recommendations_df, movie_ID):
    # Get the movie name
    movie_name = get_movie_by_id(movie_ID)
    # Get the recommendations
    get_recommendations_response = personalize_runtime.get_recommendations(
        campaignArn = sims_campaign_arn,
        itemId = str(movie_ID),
    )
    # Build a new dataframe of recommendations
    item_list = get_recommendations_response['itemList']
    recommendation_list = []
    for item in item_list:
        movie = get_movie_by_id(item['itemId'])
        recommendation_list.append(movie)
    new_rec_DF = pd.DataFrame(recommendation_list, columns = [movie_name])
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_DF], axis=1)
    return recommendations_df

Now, let's test the helper function with several different movies. Let's sample some data from our dataset to test our SIMS campaign. Grab 5 random movies from our dataframe.

Note: We are going to show similar titles, so you may want to re-run the sample until you recognize some of the movies listed.

In [11]:
samples = items_df.sample(5)
samples

movieId,title
4339,Von Ryan's Express (1965)
71732,I Sell the Dead (2008)
91860,"Way South, The (De weg naar het zuiden) (1981)"
39234,North Country (2005)
38164,"All This, and Heaven Too (1940)"


In [12]:
sims_recommendations_df = pd.DataFrame()
movies = samples.index.tolist()

for movie in movies:
    sims_recommendations_df = get_new_recommendations_df(sims_recommendations_df, movie)

sims_recommendations_df

Unnamed: 0,Von Ryan's Express (1965),I Sell the Dead (2008),"Way South, The (De weg naar het zuiden) (1981)",North Country (2005),"All This, and Heaven Too (1940)"
0,"Ox-Bow Incident, The (1943)","Shawshank Redemption, The (1994)",De platte jungle (1978),"Shawshank Redemption, The (1994)","Accused, The (1988)"
1,"Dolce Vita, La (1960)",Forrest Gump (1994),Too Big to Fail (2011),Forrest Gump (1994),Absence of Malice (1981)
2,"Long, Hot Summer, The (1958)",Pulp Fiction (1994),Partly Cloudy (2009),Pulp Fiction (1994),'Salem's Lot (2004)
3,Requiem for a Heavyweight (1962),"Silence of the Lambs, The (1991)",Blade Runner 2049 (2017),"Silence of the Lambs, The (1991)",Adam's Rib (1949)
4,"Suddenly, Last Summer (1959)","Matrix, The (1999)",Wall Street (1987),"Matrix, The (1999)",All That Jazz (1979)
5,Marty (1955),Braveheart (1995),Cosmos,Braveheart (1995),10 (1979)
6,Meet John Doe (1941),Schindler's List (1993),"Three Billboards Outside Ebbing, Missouri (2017)",Schindler's List (1993),'Til There Was You (1997)
7,All the King's Men (1949),Star Wars: Episode IV - A New Hope (1977),,Star Wars: Episode IV - A New Hope (1977),84 Charing Cross Road (1987)
8,Lifeboat (1944),Jurassic Park (1993),,Jurassic Park (1993),All That Heaven Allows (1955)
9,The Diary of Anne Frank (1959),Terminator 2: Judgment Day (1991),,Terminator 2: Judgment Day (1991),3 Women (Three Women) (1977)


You may notice that many of the items look the same across columns; hopefully not all of them do (this is more likely with a smaller number of interactions, which is more common with the MovieLens small dataset). This shows that the evaluation metrics should not be the only thing you rely on when evaluating your solution version. So when this happens, what can you do to improve the results?

This is a good time to think about the hyperparameters of the Personalize recipes. The SIMS recipe has a `popularity_discount_factor` hyperparameter (see [documentation](https://docs.aws.amazon.com/personalize/latest/dg/native-recipe-sims.html)). Leveraging this hyperparameter allows you to control the nuance you see in the results. This parameter and its behavior will be unique to every dataset you encounter, and depends on the goals of the business. You can iterate on the value of this hyperparameter until you are satisfied with the results, or you can start by leveraging Personalize's hyperparameter optimization (HPO) feature. For more information on hyperparameters and HPO tuning, see the [documentation](https://docs.aws.amazon.com/personalize/latest/dg/customizing-solution-config-hpo.html).
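As a rough illustration of iterating on this hyperparameter, the sketch below builds a `solutionConfig` that pins `popularity_discount_factor` to a chosen value. The helper names `build_sims_solution_config` and `create_sims_solution` are hypothetical, and the 0 to 1 range check is an assumption based on the SIMS documentation, so verify it against the current docs before relying on it. The function that calls `create_solution` is defined but not invoked here.

```python
def build_sims_solution_config(popularity_discount_factor):
    """Return a solutionConfig dict that pins one SIMS hyperparameter."""
    # Assumption: the documented range for this hyperparameter is 0 to 1.
    if not 0.0 <= popularity_discount_factor <= 1.0:
        raise ValueError("popularity_discount_factor must be between 0 and 1")
    # Hyperparameter values are passed to Personalize as strings.
    return {
        "algorithmHyperParameters": {
            "popularity_discount_factor": str(popularity_discount_factor)
        }
    }


def create_sims_solution(personalize_client, name, dataset_group_arn, discount):
    """Create a SIMS solution with an explicit discount factor (hypothetical helper)."""
    return personalize_client.create_solution(
        name=name,
        datasetGroupArn=dataset_group_arn,
        recipeArn="arn:aws:personalize:::recipe/aws-sims",
        solutionConfig=build_sims_solution_config(discount),
    )


config = build_sims_solution_config(0.3)
print(config)
```

Training one solution version per candidate value and comparing the rendered recommendations side by side is one way to judge the effect on your own dataset.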

### User Personalization

HRNN is one of the more advanced algorithms provided by Amazon Personalize. It supports personalization of the items for a specific user based on their past behavior and can take in real-time events in order to alter recommendations for a user without retraining.

Since HRNN relies on having a sampling of users, let's load the data we need for that and select 3 random users. Since MovieLens does not include user data, we will select 3 random numbers from the range of user IDs in the dataset.

In [13]:
if not USE_FULL_MOVIELENS:
    users = random.sample(range(1, 600), 3)
else:
    users = random.sample(range(1, 162000), 3)
users

[320, 403, 59]

Now we render the recommendations for our 3 random users from above. After that, we will explore real-time interactions before moving on to Personalized Ranking.

Again, we create a helper function to render the results in a nice dataframe.

#### API call results

In [14]:
# Update DF rendering
pd.set_option('display.max_rows', 30)

def get_new_recommendations_df_users(recommendations_df, user_id):
    # Get the recommendations for this user
    get_recommendations_response = personalize_runtime.get_recommendations(
        campaignArn = userpersonalization_campaign_arn,
        userId = str(user_id),
    )
    # Build a new dataframe of recommendations
    item_list = get_recommendations_response['itemList']
    recommendation_list = []
    for item in item_list:
        movie = get_movie_by_id(item['itemId'])
        recommendation_list.append(movie)
    new_rec_DF = pd.DataFrame(recommendation_list, columns = [user_id])
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_DF], axis=1)
    return recommendations_df

In [15]:
recommendations_df_users = pd.DataFrame()
#users = users_df.sample(3).index.tolist()

for user in users:
    recommendations_df_users = get_new_recommendations_df_users(recommendations_df_users, user)

recommendations_df_users

Unnamed: 0,320,403,59
0,X2: X-Men United (2003),"Hangover, The (2009)",Network (1976)
1,"Matrix Revolutions, The (2003)",Dr. Strangelove or: How I Learned to Stop Worr...,American Graffiti (1973)
2,Avatar (2009),Clueless (1995),Chinatown (1974)
3,Spider-Man 2 (2004),Léon: The Professional (a.k.a. The Profession...,This Is Spinal Tap (1984)
4,Iron Man (2008),Forgetting Sarah Marshall (2008),"Sting, The (1973)"
5,Mr. & Mrs. Smith (2005),Juno (2007),Gandhi (1982)
6,Spider-Man (2002),Reservoir Dogs (1992),Raging Bull (1980)
7,"Hitchhiker's Guide to the Galaxy, The (2005)",Borat: Cultural Learnings of America for Make ...,Amadeus (1984)
8,Fantastic Four (2005),Sleepless in Seattle (1993),Trainspotting (1996)
9,Serenity (2005),Mr. & Mrs. Smith (2005),"Big Chill, The (1983)"


Here we clearly see that the recommendations for each user are different. If you need a cache for these results, you could start by running the API calls for all your users and storing the results, or you could use a batch export, which will be covered later in this notebook.
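A minimal sketch of the first caching approach is shown below. The names `build_recommendation_cache` and `fake_fetch` are hypothetical; in this notebook the fetch function would wrap `personalize_runtime.get_recommendations` and pull the `itemId` values out of `itemList`. A stand-in is used here so the example runs without AWS access.

```python
def build_recommendation_cache(user_ids, fetch_recommendations):
    """Call fetch_recommendations once per user and store the results by user ID."""
    cache = {}
    for user_id in user_ids:
        # Personalize user IDs are strings, so normalize the key.
        cache[str(user_id)] = fetch_recommendations(str(user_id))
    return cache


def fake_fetch(user_id):
    """Stand-in for a real campaign call; returns three made-up item IDs."""
    return [f"item-{user_id}-{rank}" for rank in range(3)]


cache = build_recommendation_cache([320, 403, 59], fake_fetch)
print(cache["320"])
```

A cache like this goes stale as users generate new events, so in practice you would probably want to refresh it on a schedule or invalidate entries when a user interacts.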

### Static and Dynamic Filters

Let's interact with the static filters we created in the previous notebook, and use dynamic filters in real time.

A few common use cases for dynamic filters in video on demand (VOD) are:

- Categorical filters based on item metadata (that aren't range-based): Often your item metadata will have information about the title, such as genre, keyword, year, director, or actor. Filtering on these can provide recommendations within that data, such as action movies, Steven Spielberg movies, or movies from 1995.

- Events: You may want to filter out certain events and provide results based on those events, such as moving a title from a "suggestions to watch" recommendation to a "watch again" recommendation.

Now let's apply item filters to see recommendations for one of these users within each decade of our static filters.


In [71]:
def get_new_recommendations_df_by_static_filter(recommendations_df, user_id, filter_arn):
    # Get the recommendations for this user, restricted by the static filter
    get_recommendations_response = personalize_runtime.get_recommendations(
        campaignArn = userpersonalization_campaign_arn,
        userId = str(user_id),
        filterArn = filter_arn
    )
    # Build a new dataframe of recommendations
    item_list = get_recommendations_response['itemList']
    recommendation_list = []
    for item in item_list:
        movie = get_movie_by_id(item['itemId'])
        recommendation_list.append(movie)
    #print(recommendation_list)
    filter_name = filter_arn.split('/')[1]
    new_rec_DF = pd.DataFrame(recommendation_list, columns = [filter_name])
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_DF], axis=1)
    return recommendations_df

In [72]:
def get_new_recommendations_df_by_dynamicfilter(recommendations_df, user_id, genre_filter_arn, filter_values):
    # Get the recommendations for this user, restricted by the dynamic filter
    get_recommendations_response = personalize_runtime.get_recommendations(
        campaignArn = userpersonalization_campaign_arn,
        userId = str(user_id),
        filterArn = genre_filter_arn,
        filterValues = { "GENRE": "\"" + filter_values + "\""}
    )
    # Build a new dataframe of recommendations
    item_list = get_recommendations_response['itemList']
    recommendation_list = []
    for item in item_list:
        movie = get_movie_by_id(item['itemId'])
        recommendation_list.append(movie)
    filter_name = genre_filter_arn.split('/')[1]
    new_rec_DF = pd.DataFrame(recommendation_list, columns = [filter_values])
    # Add this dataframe to the old one
    recommendations_df = pd.concat([recommendations_df, new_rec_DF], axis=1)
    return recommendations_df
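The `filterValues` argument above wraps a single genre in escaped double quotes. A small helper can make that quoting explicit. This is a sketch only: `format_filter_values` is a hypothetical name, and the comma-separated form for several values is an assumption based on the Personalize filter documentation, so check it against the expression your filter actually uses.

```python
def format_filter_values(values):
    """Quote each value and join with commas, the way filterValues strings are written."""
    if isinstance(values, str):
        values = [values]
    return ",".join('"' + value + '"' for value in values)


# A single genre yields the same string the function above builds by hand.
print(format_filter_values("Comedy"))

# Assumed multi-value form for a filter such as: ... WHERE Items.GENRE IN ($GENRE)
filter_values = {"GENRE": format_filter_values(["Comedy", "Drama"])}
print(filter_values)
```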

In [70]:
recommendations_df_genre_shelves = pd.DataFrame()
for genre in genres_to_filter:
    recommendations_df_genre_shelves = get_new_recommendations_df_by_dynamicfilter(recommendations_df_genre_shelves, user, genre_filter_arn , genre)
    
recommendations_df_genre_shelves

Unnamed: 0,Comedy,Thriller,Fantasy,Horror,Crime,Western,Sci-Fi,Romance,Film-Noir,Action,Animation,Children,Adventure,Drama,Mystery,War,Musical,Documentary,IMAX
0,Network (1976),Chinatown (1974),Being John Malkovich (1999),Poltergeist (1982),Chinatown (1974),Blazing Saddles (1974),Blade Runner (1982),Annie Hall (1977),Chinatown (1974),Apocalypse Now (1979),Bobik Visiting Barbos (1977),"Christmas Story, A (1983)","Man Who Would Be King, The (1975)",Network (1976),Chinatown (1974),"Killing Fields, The (1984)",Pink Floyd: The Wall (1982),"Last Waltz, The (1978)",Apollo 13 (1995)
1,American Graffiti (1973),Escape from Alcatraz (1979),Young Frankenstein (1974),"Exorcist, The (1973)","Sting, The (1973)","Outlaw Josey Wales, The (1976)",Sleeper (1973),Three Days of the Condor (3 Days of the Condor...,L.A. Confidential (1997),Blade Runner (1982),Spirited Away (Sen to Chihiro no kamikakushi) ...,The Star Wars Holiday Special (1978),"Outlaw Josey Wales, The (1976)",American Graffiti (1973),L.A. Confidential (1997),Apocalypse Now (1979),"Muppet Movie, The (1979)",Richard Pryor Live on the Sunset Strip (1982),Everest (1998)
2,This Is Spinal Tap (1984),Badlands (1973),Monty Python and the Holy Grail (1975),"Shining, The (1980)",Trainspotting (1996),Butch Cassidy and the Sundance Kid (1969),Pi (1998),Manhattan (1979),Blood Simple (1984),"Professional, The (Le professionnel) (1981)",Who Framed Roger Rabbit? (1988),E.T. the Extra-Terrestrial (1982),"Gods Must Be Crazy, The (1980)",Gandhi (1982),"Conversation, The (1974)","Deer Hunter, The (1978)",Beat Street (1984),Bowling for Columbine (2002),Alaska: Spirit of the Wild (1997)
3,"Sting, The (1973)",L.A. Confidential (1997),"Princess Bride, The (1987)",Jaws (1975),"Godfather: Part II, The (1974)",For a Few Dollars More (Per qualche dollaro in...,Donnie Darko (2001),High Fidelity (2000),Miller's Crossing (1990),"Outlaw Josey Wales, The (1976)","Fox and the Hound, The (1981)",For the Love of Benji (1977),City of God (Cidade de Deus) (2002),Raging Bull (1980),Three Days of the Condor (3 Days of the Condor...,"Boot, Das (Boat, The) (1981)","Secret Policeman's Other Ball, The (1982)",Eddie Murphy Delirious (1983),Africa: The Serengeti (1994)
4,Trainspotting (1996),All the President's Men (1976),Carrie (1976),Gremlins (1984),Badlands (1973),High Plains Drifter (1973),Close Encounters of the Third Kind (1977),Out of Sight (1998),Touch of Evil (1958),Scarface (1983),Lupin III: The Castle Of Cagliostro (Rupan san...,"Muppet Movie, The (1979)",Never Cry Wolf (1983),Amadeus (1984),Blue Velvet (1986),"Pianist, The (2002)",Stop Making Sense (1984),"Bill Cosby, Himself (1983)",Titanica (1992)
5,"Big Chill, The (1983)",Three Days of the Condor (3 Days of the Condor...,"Dark Crystal, The (1982)","Amityville Horror, The (1979)",L.A. Confidential (1997),Little Big Man (1970),WarGames (1983),Saturday Night Fever (1977),"Long Goodbye, The (1973)",Beverly Hills Cop (1984),"Secret of NIMH, The (1982)",Who Framed Roger Rabbit? (1988),"Straight Story, The (1999)",Trainspotting (1996),"Exorcist, The (1973)",Platoon (1986),"Wizard of Oz, The (1939)",Animals are Beautiful People (1974),T-Rex: Back to the Cretaceous (1998)
6,Annie Hall (1977),Poltergeist (1982),"Christmas Carol, A (1977)",My Bloody Valentine (1981),Taxi Driver (1976),Once Upon a Time in the West (C'era una volta ...,Stalker (1979),"Graduate, The (1967)",Mulholland Drive (2001),City of God (Cidade de Deus) (2002),Priklyucheniya Kapitana Vrungelya (1979),"Fox and the Hound, The (1981)",Austin Powers: International Man of Mystery (1...,"Big Chill, The (1983)","Little Girl Who Lives Down the Lane, The (1976)",Stripes (1981),"Blues Brothers, The (1980)",Roger & Me (1989),Michael Jordan to the Max (2000)
7,Blazing Saddles (1974),Taxi Driver (1976),"City of Lost Children, The (Cité des enfants ...","Omen, The (1976)",Marathon Man (1976),Unforgiven (1992),"Road Warrior, The (Mad Max 2) (1981)",Adaptation (2002),"Grifters, The (1990)",Jaws (1975),Wallace & Gromit: The Wrong Trousers (1993),"Secret of NIMH, The (1982)",Close Encounters of the Third Kind (1977),Escape from Alcatraz (1979),"Usual Suspects, The (1995)","Tin Drum, The (Blechtrommel, Die) (1979)","Decline of Western Civilization, The (1981)",Koyaanisqatsi (a.k.a. Koyaanisqatsi: Life Out ...,Harry Potter and the Goblet of Fire (2005)
8,Monty Python's Life of Brian (1979),"Insider, The (1999)",Spirited Away (Sen to Chihiro no kamikakushi) ...,Alien (1979),Fargo (1996),"Good, the Bad and the Ugly, The (Buono, il bru...",Alien (1979),Shakespeare in Love (1998),"Maltese Falcon, The (1941)",Austin Powers: International Man of Mystery (1...,Wallace & Gromit: The Best of Aardman Animatio...,"Great Muppet Caper, The (1981)",First Blood (Rambo: First Blood) (1982),"Man Who Would Be King, The (1975)",Eyes Wide Shut (1999),M*A*S*H (a.k.a. MASH) (1970),Hair (1979),It Came from Hollywood (1982),More (1998)
9,Trading Places (1983),Blue Velvet (1986),Time Bandits (1981),"Silence of the Lambs, The (1991)",Serpico (1973),High Noon (1952),Ghostbusters (a.k.a. Ghost Busters) (1984),Witness (1985),"Third Man, The (1949)",Rocky III (1982),Akira (1988),Something Wicked This Way Comes (1983),"Road Warrior, The (Mad Max 2) (1981)","Godfather: Part II, The (1974)",Gosford Park (2001),Patton (1970),Fame (1980),"War Room, The (1993)","Dark Knight, The (2008)"


The static decade filters work the same way: the next cell shows the recommendations for movies within a given decade. Within a VOD application, you could easily create shelves (also known as rails or carousels) by using these filters. Depending on the information you have about your items, you could also filter on additional information such as keyword or year/decade.

In [73]:
recommendations_df_decade_shelves = pd.DataFrame()
for filter_arn in meta_filter_decade_arns:
    recommendations_df_decade_shelves = get_new_recommendations_df_by_static_filter(recommendations_df_decade_shelves, user, filter_arn)

recommendations_df_decade_shelves

Unnamed: 0,1950s,1960s,1970s,1980s,1990s,2000s,2010s
0,Touch of Evil (1958),Cool Hand Luke (1967),Network (1976),This Is Spinal Tap (1984),Trainspotting (1996),Requiem for a Dream (2000),Buster's Mal Heart (2017)
1,Some Like It Hot (1959),"Hustler, The (1961)",American Graffiti (1973),Gandhi (1982),L.A. Confidential (1997),"Royal Tenenbaums, The (2001)",Wild Horses (2015)
2,Dial M for Murder (1954),"Producers, The (1968)",Chinatown (1974),Raging Bull (1980),"Insider, The (1999)",Snatch (2000),Wild Tales (2014)
3,On the Waterfront (1954),Butch Cassidy and the Sundance Kid (1969),"Sting, The (1973)",Amadeus (1984),Boogie Nights (1997),City of God (Cidade de Deus) (2002),Dope (2015)
4,Rear Window (1954),Midnight Cowboy (1969),Annie Hall (1977),"Big Chill, The (1983)",Fargo (1996),High Fidelity (2000),Gone Girl (2014)
5,"Seventh Seal, The (Sjunde inseglet, Det) (1957)",Bonnie and Clyde (1967),Escape from Alcatraz (1979),Trading Places (1983),True Romance (1993),Amores Perros (Love's a Bitch) (2000),"Grand Budapest Hotel, The (2014)"
6,Vertigo (1958),"Graduate, The (1967)","Man Who Would Be King, The (1975)",Poltergeist (1982),Sling Blade (1996),Almost Famous (2000),John From (2015)
7,High Noon (1952),For a Few Dollars More (Per qualche dollaro in...,"Godfather: Part II, The (1974)","Killing Fields, The (1984)",Being John Malkovich (1999),Traffic (2000),The African Doctor (2016)
8,"Streetcar Named Desire, A (1951)","Man for All Seasons, A (1966)",Badlands (1973),Blue Velvet (1986),Swingers (1996),Gosford Park (2001),"Big Short, The (2015)"
9,Witness for the Prosecution (1957),Psycho (1960),Blazing Saddles (1974),Blade Runner (1982),"Big Lebowski, The (1998)",About Schmidt (2002),The Boy Next Door (2015)
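The decade filters themselves were created in the previous notebook and are not shown here. As an assumption about what such a static filter might look like, the sketch below builds a filter expression over the `YEAR` item metadata field. The helper name `decade_filter_expression` is hypothetical, and the resulting string would be passed as `filterExpression` to `personalize.create_filter`.

```python
def decade_filter_expression(start_year):
    """Build a static filter expression covering one decade of Items.YEAR."""
    end_year = start_year + 10
    return (
        f"INCLUDE ItemID WHERE Items.YEAR >= {start_year} "
        f"AND Items.YEAR < {end_year}"
    )


# One expression per shelf, from the 1950s through the 2010s.
for decade in range(1950, 2020, 10):
    print(decade_filter_expression(decade))
```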


In [23]:
# Create a dataframe for the items by reading in the correct source CSV
items_meta_df = pd.read_csv(data_dir + '/item-meta.csv', sep=',', index_col=0)

# Render some sample data
items_meta_df.head(10)

ITEM_ID,GENRE,YEAR
1,Adventure|Animation|Children|Comedy|Fantasy,1995
2,Adventure|Children|Fantasy,1995
3,Comedy|Romance,1995
4,Comedy|Drama|Romance,1995
5,Comedy,1995
6,Action|Crime|Thriller,1995
7,Comedy|Romance,1995
8,Adventure|Children,1995
9,Action,1995
10,Action|Adventure|Thriller,1995


Next, we want to determine the genres to filter on, and for that we need a list of all genres. First we get all the unique values of the GENRE column, then split each string on `|` where it occurs. Each genre is added to a long list, which is converted to a set to remove duplicates. That set is then turned back into a list so that it can be iterated, and we can then use the get recommendations API.

In [24]:
unique_genre_field_values = items_meta_df['GENRE'].unique()

genre_val_list = []

def process_for_bar_char(val, val_list):
    if '|' in val:
        values = val.split('|')
        for item in values:
            val_list.append(item)
    elif '(' in val:
        pass
    else:
        val_list.append(val)
    return val_list
    

for val in unique_genre_field_values:
    genre_val_list = process_for_bar_char(val, genre_val_list)

genres_to_filter = list(set(genre_val_list))

In [25]:
genres_to_filter

['Comedy',
 'Thriller',
 'Fantasy',
 'Horror',
 'Crime',
 'Western',
 'Sci-Fi',
 'Romance',
 'Film-Noir',
 'Action',
 'Animation',
 'Children',
 'Adventure',
 'Drama',
 'Mystery',
 'War',
 'Musical',
 'Documentary',
 'IMAX']
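The same deduplication can be written more compactly by collecting the genres directly into a set. The sketch below uses a hypothetical helper, `unique_genres`, and a few made-up sample rows. It mirrors the rule above of skipping unsplit values that contain a parenthesis (for example a placeholder such as "(no genres listed)"), and sorts the result so the order is stable between runs.

```python
def unique_genres(genre_fields):
    """Collect the distinct genres from pipe-delimited GENRE values."""
    genres = set()
    for field in genre_fields:
        # Skip single-value placeholders such as "(no genres listed)".
        if "|" not in field and "(" in field:
            continue
        # split() returns a one-element list when there is no "|".
        genres.update(field.split("|"))
    return sorted(genres)


sample_fields = [
    "Adventure|Children|Fantasy",
    "Comedy",
    "(no genres listed)",
    "Comedy|Romance",
]
print(unique_genres(sample_fields))
```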

In [75]:
# Iterate through Genres
recommendations_df_genre_shelves = pd.DataFrame()
for genre in genres_to_filter:
    recommendations_df_genre_shelves = get_new_recommendations_df_by_dynamicfilter(recommendations_df_genre_shelves, user, genre_filter_arn , genre)
    
recommendations_df_genre_shelves

Unnamed: 0,Comedy,Thriller,Fantasy,Horror,Crime,Western,Sci-Fi,Romance,Film-Noir,Action,Animation,Children,Adventure,Drama,Mystery,War,Musical,Documentary,IMAX
0,Network (1976),Chinatown (1974),Being John Malkovich (1999),Poltergeist (1982),Chinatown (1974),Blazing Saddles (1974),Blade Runner (1982),Annie Hall (1977),Chinatown (1974),Apocalypse Now (1979),Bobik Visiting Barbos (1977),"Christmas Story, A (1983)","Man Who Would Be King, The (1975)",Network (1976),Chinatown (1974),"Killing Fields, The (1984)",Pink Floyd: The Wall (1982),"Last Waltz, The (1978)",Apollo 13 (1995)
1,American Graffiti (1973),Escape from Alcatraz (1979),Young Frankenstein (1974),"Exorcist, The (1973)","Sting, The (1973)","Outlaw Josey Wales, The (1976)",Sleeper (1973),Three Days of the Condor (3 Days of the Condor...,L.A. Confidential (1997),Blade Runner (1982),Spirited Away (Sen to Chihiro no kamikakushi) ...,E.T. the Extra-Terrestrial (1982),"Outlaw Josey Wales, The (1976)",American Graffiti (1973),L.A. Confidential (1997),Apocalypse Now (1979),"Muppet Movie, The (1979)",Richard Pryor Live on the Sunset Strip (1982),Everest (1998)
2,This Is Spinal Tap (1984),Badlands (1973),Monty Python and the Holy Grail (1975),"Shining, The (1980)",Trainspotting (1996),Butch Cassidy and the Sundance Kid (1969),Pi (1998),Manhattan (1979),Blood Simple (1984),"Professional, The (Le professionnel) (1981)",Who Framed Roger Rabbit? (1988),For the Love of Benji (1977),"Gods Must Be Crazy, The (1980)",Gandhi (1982),"Conversation, The (1974)","Deer Hunter, The (1978)","Secret Policeman's Other Ball, The (1982)",Bowling for Columbine (2002),Alaska: Spirit of the Wild (1997)
3,"Sting, The (1973)",L.A. Confidential (1997),"Princess Bride, The (1987)",My Bloody Valentine (1981),"Godfather: Part II, The (1974)",For a Few Dollars More (Per qualche dollaro in...,Donnie Darko (2001),High Fidelity (2000),Miller's Crossing (1990),"Outlaw Josey Wales, The (1976)","Fox and the Hound, The (1981)","Muppet Movie, The (1979)",City of God (Cidade de Deus) (2002),Raging Bull (1980),Three Days of the Condor (3 Days of the Condor...,"Boot, Das (Boat, The) (1981)",Stop Making Sense (1984),Eddie Murphy Delirious (1983),Africa: The Serengeti (1994)
4,Trainspotting (1996),All the President's Men (1976),Carrie (1976),Jaws (1975),Badlands (1973),High Plains Drifter (1973),Close Encounters of the Third Kind (1977),Out of Sight (1998),Touch of Evil (1958),Scarface (1983),Lupin III: The Castle Of Cagliostro (Rupan san...,Who Framed Roger Rabbit? (1988),Never Cry Wolf (1983),Amadeus (1984),Blue Velvet (1986),"Pianist, The (2002)","Wizard of Oz, The (1939)","Bill Cosby, Himself (1983)",Titanica (1992)
5,"Big Chill, The (1983)",Three Days of the Condor (3 Days of the Condor...,"Christmas Carol, A (1977)",Gremlins (1984),L.A. Confidential (1997),Little Big Man (1970),WarGames (1983),Saturday Night Fever (1977),"Long Goodbye, The (1973)",Beverly Hills Cop (1984),"Secret of NIMH, The (1982)","Fox and the Hound, The (1981)","Straight Story, The (1999)",Trainspotting (1996),"Exorcist, The (1973)",Platoon (1986),"Blues Brothers, The (1980)",Animals are Beautiful People (1974),T-Rex: Back to the Cretaceous (1998)
6,Annie Hall (1977),Poltergeist (1982),"Dark Crystal, The (1982)","Amityville Horror, The (1979)",Taxi Driver (1976),Once Upon a Time in the West (C'era una volta ...,Stalker (1979),"Graduate, The (1967)",Mulholland Drive (2001),City of God (Cidade de Deus) (2002),Priklyucheniya Kapitana Vrungelya (1979),"Secret of NIMH, The (1982)",Austin Powers: International Man of Mystery (1...,"Big Chill, The (1983)","Little Girl Who Lives Down the Lane, The (1976)",Stripes (1981),"Decline of Western Civilization, The (1981)",Roger & Me (1989),Michael Jordan to the Max (2000)
7,Blazing Saddles (1974),Taxi Driver (1976),"City of Lost Children, The (Cité des enfants ...","Omen, The (1976)",Marathon Man (1976),Unforgiven (1992),"Road Warrior, The (Mad Max 2) (1981)",Adaptation (2002),"Grifters, The (1990)",Jaws (1975),Wallace & Gromit: The Wrong Trousers (1993),"Great Muppet Caper, The (1981)",Close Encounters of the Third Kind (1977),Escape from Alcatraz (1979),"Usual Suspects, The (1995)","Tin Drum, The (Blechtrommel, Die) (1979)",Hair (1979),Koyaanisqatsi (a.k.a. Koyaanisqatsi: Life Out ...,Harry Potter and the Goblet of Fire (2005)
8,Monty Python's Life of Brian (1979),"Insider, The (1999)",Spirited Away (Sen to Chihiro no kamikakushi) ...,Don't Look Now (1973),Fargo (1996),"Good, the Bad and the Ugly, The (Buono, il bru...",Alien (1979),Shakespeare in Love (1998),"Maltese Falcon, The (1941)",Captain America (1979),Wallace & Gromit: The Best of Aardman Animatio...,Something Wicked This Way Comes (1983),First Blood (Rambo: First Blood) (1982),"Man Who Would Be King, The (1975)",Eyes Wide Shut (1999),M*A*S*H (a.k.a. MASH) (1970),Beat Street (1984),It Came from Hollywood (1982),More (1998)
9,Trading Places (1983),Blue Velvet (1986),Time Bandits (1981),Alien (1979),Serpico (1973),High Noon (1952),Ghostbusters (a.k.a. Ghost Busters) (1984),Witness (1985),"Third Man, The (1949)",Austin Powers: International Man of Mystery (1...,Akira (1988),Wallace & Gromit: The Wrong Trousers (1993),"Road Warrior, The (Mad Max 2) (1981)","Godfather: Part II, The (1974)",Gosford Park (2001),Patton (1970),Fame (1980),"War Room, The (1993)","Dark Knight, The (2008)"


The next topic is real-time events. Personalize can listen to events from your application in order to update the recommendations shown to the user. This is especially useful in media workloads, like video on demand, where a customer's intent may differ depending on whether they are watching with their children or on their own.

Additionally, the events recorded through this system are stored until you issue a delete call, and they are used as historical data alongside the other interaction data you provided when you train your next models.

#### Real time events

Start by creating an event tracker that is attached to the dataset group.

In [76]:
response = personalize.create_event_tracker(
    name='MovieTracker',
    datasetGroupArn=dataset_group_arn
)
print(response['eventTrackerArn'])
print(response['trackingId'])
TRACKING_ID = response['trackingId']
event_tracker_arn = response['eventTrackerArn']

arn:aws:personalize:us-east-1:832194813872:event-tracker/dff308a5
1d0cebad-76de-48d8-9a55-9a591f6f4c76


We will create some code that simulates a user interacting with a particular item. After running this code, you will get recommendations that differ from the results above.

We start by creating some methods for the simulation of real time events.

In [77]:
session_dict = {}

def send_movie_click(USER_ID, ITEM_ID, EVENT_TYPE):
    """
    Simulates a click as an event
    and sends it to Amazon Personalize's Event Tracker
    """
    # Configure Session: reuse this user's session ID, or create one on first use
    try:
        session_ID = session_dict[str(USER_ID)]
    except KeyError:
        session_dict[str(USER_ID)] = str(uuid.uuid1())
        session_ID = session_dict[str(USER_ID)]
        
    # Configure Properties:
    event = {
    "itemId": str(ITEM_ID),
    }
    event_json = json.dumps(event)
        
    # Make Call
    
    personalize_events.put_events(
    trackingId = TRACKING_ID,
    userId= str(USER_ID),
    sessionId = session_ID,
    eventList = [{
        'sentAt': int(time.time()),
        'eventType': str(EVENT_TYPE),
        'properties': event_json
        }]
    )

def get_new_recommendations_df_users_real_time(recommendations_df, user_id, item_id, event_type):
    # Get the movie title (header of column)
    movie_name = get_movie_by_id(item_id)
    # Interact with different movies
    print('sending event ' + event_type + ' for ' + get_movie_by_id(item_id))
    send_movie_click(USER_ID=user_id, ITEM_ID=item_id, EVENT_TYPE=event_type)
    # Get the recommendations (note you should have a base recommendation DF created before)
    get_recommendations_response = personalize_runtime.get_recommendations(
        campaignArn = userpersonalization_campaign_arn,
        userId = str(user_id),
    )
    # Build a new dataframe of recommendations
    item_list = get_recommendations_response['itemList']
    recommendation_list = []
    for item in item_list:
        movie = get_movie_by_id(item['itemId'])
        recommendation_list.append(movie)
    new_rec_DF = pd.DataFrame(recommendation_list, columns = [movie_name])
    # Add this dataframe to the old one
    #recommendations_df = recommendations_df.join(new_rec_DF)
    recommendations_df = pd.concat([recommendations_df, new_rec_DF], axis=1)
    return recommendations_df

At this point, we haven't generated any real-time events yet; we have only set up the code. To compare the recommendations before and after the real-time events, let's pick one user and generate the original recommendations for them.

In [78]:
# First pick a user
user_id = user

# Get recommendations for the user
get_recommendations_response = personalize_runtime.get_recommendations(
        campaignArn = userpersonalization_campaign_arn,
        userId = str(user_id),
    )

# Build a new dataframe for the recommendations
item_list = get_recommendations_response['itemList']
recommendation_list = []
for item in item_list:
    movie = get_movie_by_id(item['itemId'])
    recommendation_list.append(movie)
user_recommendations_df = pd.DataFrame(recommendation_list, columns = [user_id])
user_recommendations_df

Unnamed: 0,59
0,Network (1976)
1,American Graffiti (1973)
2,Chinatown (1974)
3,This Is Spinal Tap (1984)
4,"Sting, The (1973)"
5,Gandhi (1982)
6,Raging Bull (1980)
7,Amadeus (1984)
8,Trainspotting (1996)
9,"Big Chill, The (1983)"


Ok, so now we have a list of recommendations for this user before we have applied any real-time events. Now let's pick 3 random movies that our simulated user will interact with, and then see how this changes the recommendations.

In [79]:
# Next generate 3 random movies
movies = items_df.sample(3).index.tolist()

In [80]:
# Note this will take about 15 seconds to complete due to the sleeps
for movie in movies:
    user_recommendations_df = get_new_recommendations_df_users_real_time(user_recommendations_df, user_id, movie,'click')
    time.sleep(5)

sending event click for Glory Road (2006)
sending event click for Letter, The (1940)
sending event click for Born to Kill (1947)


Now we can look at how the click events changed the recommendations.

In [81]:
user_recommendations_df

Unnamed: 0,59,Glory Road (2006),"Letter, The (1940)",Born to Kill (1947)
0,Network (1976),Network (1976),Network (1976),L.A. Confidential (1997)
1,American Graffiti (1973),American Graffiti (1973),American Graffiti (1973),Touch of Evil (1958)
2,Chinatown (1974),Chinatown (1974),Chinatown (1974),Bonnie and Clyde (1967)
3,This Is Spinal Tap (1984),This Is Spinal Tap (1984),This Is Spinal Tap (1984),"Killing, The (1956)"
4,"Sting, The (1973)","Sting, The (1973)","Sting, The (1973)","Killers, The (1946)"
5,Gandhi (1982),Gandhi (1982),Gandhi (1982),Goodfellas (1990)
6,Raging Bull (1980),Raging Bull (1980),Raging Bull (1980),White Heat (1949)
7,Amadeus (1984),Amadeus (1984),Amadeus (1984),"Grifters, The (1990)"
8,Trainspotting (1996),Trainspotting (1996),Trainspotting (1996),Trainspotting (1996)
9,"Big Chill, The (1983)","Big Chill, The (1983)","Big Chill, The (1983)",Miller's Crossing (1990)


In the cell above, the first column after the index is the user's default recommendations from User Personalization. Each column after that is headed by the movie the user interacted with via a real-time event, and contains the recommendations returned after that event occurred.

The behavior may not shift very much; this is due to the relatively limited nature of this dataset and the small effect of a few random clicks. To better understand this, try simulating clicks on more movies, and you should see a more pronounced impact.
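
One way to quantify how much the recommendations shifted is to count, for each column, how many titles it shares with the baseline column. The snippet below is a minimal sketch under stated assumptions: `overlap_with_baseline` is a hypothetical helper that is not part of the notebook, and the toy dataframe only stands in for `user_recommendations_df`.

```python
import pandas as pd


def overlap_with_baseline(recs_df: pd.DataFrame) -> pd.Series:
    """Count, per column, how many titles also appear in the first column."""
    baseline = set(recs_df.iloc[:, 0].dropna())
    # apply() runs once per column and returns one count per column header
    return recs_df.apply(lambda col: len(baseline & set(col.dropna())))


# Toy data standing in for user_recommendations_df (assumption for illustration)
toy_df = pd.DataFrame({
    "baseline": ["Network (1976)", "Chinatown (1974)", "Trainspotting (1996)"],
    "after click 1": ["Network (1976)", "Chinatown (1974)", "Trainspotting (1996)"],
    "after click 2": ["Touch of Evil (1958)", "Goodfellas (1990)", "Trainspotting (1996)"],
})

overlap = overlap_with_baseline(toy_df)
print(overlap)
```

A lower count means the recommendations moved further away from the baseline after that event.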

Now let's look at the event filters, which allow you to filter items based on the interaction data. For this dataset, the event type could be click or watch, based on the data we imported, but filters can be based on whatever interaction schema you design (click, rate, like, watch, purchase, etc.). For VOD shelves, you could move a title from "Top picks for you" to "Watch again": the "Watch again" recommendations are based on the user's current interactions, but only recommend titles that have already been watched.


In [82]:
recommendations_df_events = pd.DataFrame()
for filter_arn in interaction_filter_arns:
    recommendations_df_events = get_new_recommendations_df_by_static_filter(recommendations_df_events, user, filter_arn)
    
recommendations_df_events

Unnamed: 0,watched,unwatched
0,Network (1976),Touch of Evil (1958)
1,Chinatown (1974),Bonnie and Clyde (1967)
2,"Big Chill, The (1983)",Goodfellas (1990)
3,"Godfather: Part II, The (1974)",White Heat (1949)
4,L.A. Confidential (1997),"Grifters, The (1990)"
5,Taxi Driver (1976),Trainspotting (1996)
6,"Killing Fields, The (1984)",Miller's Crossing (1990)
7,Apocalypse Now (1979),Detour (1945)
8,Blue Velvet (1986),M (1931)
9,Blade Runner (1982),No Country for Old Men (2007)


Now let's send a watch event for 4 unwatched recommendations, which simulates watching 4 movies. In a VOD application, you may choose to send an event after the user has watched a significant amount (over 75%) of a piece of content. Sending only at 100% complete could miss people who stop short of the credits.

In [83]:
# Get the recommendations (filter_arn still holds the last filter from the loop above)
top_unwatched_recommendations_response = personalize_runtime.get_recommendations(
    campaignArn = userpersonalization_campaign_arn,
    userId = str(user_id),
    filterArn = filter_arn,
    numResults=4)
item_list = top_unwatched_recommendations_response['itemList']
for item in item_list:
    print('sending event watch for ' + get_movie_by_id(item['itemId']))
    send_movie_click(USER_ID=user_id, ITEM_ID=item['itemId'], EVENT_TYPE='watch')
    time.sleep(10)

sending event watch for American Graffiti (1973)
sending event watch for This Is Spinal Tap (1984)
sending event watch for Sting, The (1973)
sending event watch for Gandhi (1982)
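
The completion threshold mentioned above could be checked in the player before calling `send_movie_click` with the `watch` event type. This is only a sketch: `should_send_watch_event`, its parameters, and the 0.75 default are assumptions for illustration, and the right threshold depends on your application.

```python
def should_send_watch_event(position_seconds: float,
                            duration_seconds: float,
                            threshold: float = 0.75) -> bool:
    """Return True once the viewer has played at least `threshold` of the title."""
    if duration_seconds <= 0:
        # Unknown or invalid duration: do not emit a watch event
        return False
    return position_seconds / duration_seconds >= threshold


# A 2-hour movie (7200 seconds) at different playback positions
early = should_send_watch_event(1800, 7200)        # 25% watched
at_threshold = should_send_watch_event(5400, 7200)  # exactly 75% watched
near_end = should_send_watch_event(7000, 7200)      # stopped during the credits
print(early, at_threshold, near_end)
```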


Now we can look at the event filters again to see the updated watched and unwatched recommendations.

In [84]:
recommendations_df_events = pd.DataFrame()
for filter_arn in interaction_filter_arns:
    recommendations_df_events = get_new_recommendations_df_by_static_filter(recommendations_df_events, user, filter_arn)
    
recommendations_df_events

Unnamed: 0,watched,unwatched
0,L.A. Confidential (1997),Cool Hand Luke (1967)
1,Taxi Driver (1976),Boogie Nights (1997)
2,Midnight Express (1978),"Insider, The (1999)"
3,Gandhi (1982),Raging Bull (1980)
4,Fargo (1996),"Man Who Would Be King, The (1975)"
5,"Sting, The (1973)","Hustler, The (1961)"
6,Network (1976),Amadeus (1984)
7,American Graffiti (1973),Goodfellas (1990)
8,"Killing Fields, The (1984)",Requiem for a Dream (2000)
9,Apocalypse Now (1979),Magnolia (1999)


### Personalized Ranking

The core use case for personalized ranking is to take a collection of items and render them in priority or probable order of interest for a user. For a VOD application, you may want to dynamically render a personalized shelf/rail/carousel based on some information (director, location, superhero franchise, movie time period, etc.). This may not be information that you have in your metadata, so an item metadata filter will not work; however, you may have this information within your system to generate the item list.

To demonstrate this, we will use the same user from before and a random collection of items.

In [85]:
rerank_user = user
rerank_items = items_df.sample(25).index.tolist()

Now build a nice dataframe that shows the input data.

In [86]:
rerank_list = []
for item in rerank_items:
    movie = get_movie_by_id(item)
    rerank_list.append(movie)
rerank_df = pd.DataFrame(rerank_list, columns = ['Un-Ranked'])
rerank_df

Unnamed: 0,Un-Ranked
0,Rain Man (1988)
1,Into the Forest (2015)
2,Mayhem (2017)
3,Pusher (1996)
4,"Lady from Shanghai, The (1947)"
5,Shanghai Triad (Yao a yao yao dao waipo qiao) ...
6,Amateur (1994)
7,Beasts of the Southern Wild (2012)
8,"Curse of Frankenstein, The (1957)"
9,Tears of the Sun (2003)


Then make the personalized ranking API call.

In [87]:
# Convert user to string:
user_id = str(rerank_user)
rerank_item_list = []
for item in rerank_items:
    rerank_item_list.append(str(item))
    
# Get recommended reranking
get_recommendations_response_rerank = personalize_runtime.get_personalized_ranking(
        campaignArn = rerank_campaign_arn,
        userId = user_id,
        inputList = rerank_item_list
)

Now add the reranked items as a second column to the original dataframe, for a side-by-side comparison.

In [88]:
ranked_list = []
item_list = get_recommendations_response_rerank['personalizedRanking']
for item in item_list:
    movie = get_movie_by_id(item['itemId'])
    ranked_list.append(movie)
ranked_df = pd.DataFrame(ranked_list, columns = ['Re-Ranked'])
rerank_df = pd.concat([rerank_df, ranked_df], axis=1)
rerank_df

Unnamed: 0,Un-Ranked,Re-Ranked
0,Rain Man (1988),Almost Famous (2000)
1,Into the Forest (2015),Rain Man (1988)
2,Mayhem (2017),Shanghai Triad (Yao a yao yao dao waipo qiao) ...
3,Pusher (1996),"Hunt, The (Jagten) (2012)"
4,"Lady from Shanghai, The (1947)",Blind Fury (1989)
5,Shanghai Triad (Yao a yao yao dao waipo qiao) ...,Trial and Error (1997)
6,Amateur (1994),Sherlock Holmes and Dr. Watson: Acquaintance (...
7,Beasts of the Southern Wild (2012),Futurama: Bender's Game (2008)
8,"Curse of Frankenstein, The (1957)",Beasts of the Southern Wild (2012)
9,Tears of the Sun (2003),Tears of the Sun (2003)


You can see above how each entry was re-ordered based on the model's understanding of the user. This is a popular task when you have a collection of items to surface to a user, such as a list of promotions.
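
To see how far each item moved, you can compare its position in the input list with its position in the ranked response. The snippet below is a minimal sketch: `rank_changes` is a hypothetical helper, and the single-letter item IDs are placeholders rather than real data from this notebook.

```python
def rank_changes(original, reranked):
    """Map each item to (old position - new position); positive means it moved up."""
    new_position = {item: idx for idx, item in enumerate(reranked)}
    return {
        item: old_idx - new_position[item]
        for old_idx, item in enumerate(original)
        if item in new_position
    }


# Placeholder item IDs standing in for rerank_item_list and the ranked response
original = ["A", "B", "C", "D"]
reranked = ["C", "A", "D", "B"]

changes = rank_changes(original, reranked)
print(changes)
```

In the notebook, the second argument could be built from the `itemId` values in `get_recommendations_response_rerank['personalizedRanking']`.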

## Batch recommendations <a class="anchor" id="batch"></a>
[Back to top](#top)

There are many cases where you may want to have a larger dataset of exported recommendations. Recently, Amazon Personalize launched batch recommendations as a way to export a collection of recommendations to S3. In this example, we will walk through how to do this for the User Personalization solution. For more information about batch recommendations, please see the [documentation](https://docs.aws.amazon.com/personalize/latest/dg/getting-recommendations.html#recommendations-batch). This feature applies to all recipes, but the output format will vary.

A simple implementation looks like this:

```python
import boto3

personalize_rec = boto3.client(service_name='personalize')

personalize_rec.create_batch_inference_job(
    solutionVersionArn = "Solution version ARN",
    jobName = "Batch job name",
    roleArn = "IAM role ARN",
    jobInput = 
       {"s3DataSource": {"path": "S3 input path"}},
    jobOutput = 
       {"s3DataDestination": {"path": "S3 output path"}}
)
```

The SDK import, the solution version ARN, and the role ARN have already been determined. This just leaves an input, an output, and a job name to be defined.

Starting with the input, for User Personalization it looks like:


```JSON
{"userId": "4638"}
{"userId": "663"}
{"userId": "3384"}
```

This should yield an output that looks like this:

```JSON
{"input":{"userId":"4638"}, "output": {"recommendedItems": ["296", "1", "260", "318"]}}
{"input":{"userId":"663"}, "output": {"recommendedItems": ["1393", "3793", "2701", "3826"]}}
{"input":{"userId":"3384"}, "output": {"recommendedItems": ["8368", "5989", "40815", "48780"]}}
```

The output is a JSON Lines file. It consists of individual JSON objects, one per line. So we will need to put in more work later to digest the results in this format.
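
Because each line is an independent JSON object, the file can be parsed one line at a time with `json.loads`. The following is a minimal sketch using the sample output shown above; `parse_batch_output` is a hypothetical helper name, and an in-memory `io.StringIO` stands in for the downloaded file.

```python
import io
import json


def parse_batch_output(lines):
    """Parse JSON Lines batch output into {userId: [recommended item IDs]}."""
    results = {}
    for raw_line in lines:
        raw_line = raw_line.strip()
        if not raw_line:
            continue  # tolerate blank lines, e.g. a trailing newline
        record = json.loads(raw_line)
        results[record["input"]["userId"]] = record["output"]["recommendedItems"]
    return results


# In-memory stand-in for the .out file that the batch job writes to S3
sample_output = io.StringIO(
    '{"input":{"userId":"4638"}, "output": {"recommendedItems": ["296", "1", "260", "318"]}}\n'
    '{"input":{"userId":"663"}, "output": {"recommendedItems": ["1393", "3793", "2701", "3826"]}}\n'
    '{"input":{"userId":"3384"}, "output": {"recommendedItems": ["8368", "5989", "40815", "48780"]}}\n'
)

recommendations_by_user = parse_batch_output(sample_output)
print(recommendations_by_user["663"])
```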

### Building the input file

When you are using the batch feature, you specify the users that you'd like to receive recommendations for when the job has completed. The cell below reuses the users selected earlier, builds the input file, and saves it to disk. From there, you will upload it to S3 to use in the API call later.

In [89]:
# We will use the same users from before
users
# Write the file to disk
json_input_filename = "json_input.json"
with open(data_dir + "/" + json_input_filename, 'w') as json_input:
    for user_id in users:
        json_input.write('{"userId": "' + str(user_id) + '"}\n')

In [90]:
# Showcase the input file:
!cat $data_dir"/"$json_input_filename

{"userId": "320"}
{"userId": "403"}
{"userId": "59"}
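
As an alternative to concatenating strings by hand, the same JSON Lines content could be produced with `json.dumps`, which takes care of quoting and escaping. This is a sketch only: `build_batch_input` is a hypothetical helper that is not used elsewhere in the notebook.

```python
import json


def build_batch_input(user_ids):
    """Return JSON Lines text with one {"userId": ...} object per line."""
    return "".join(
        json.dumps({"userId": str(user_id)}) + "\n"
        for user_id in user_ids
    )


# Same user IDs as in the file shown above
batch_input = build_batch_input([320, 403, 59])
print(batch_input, end="")
```

The returned string can be written to the input file in place of the per-line `write` calls.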


Upload the file to S3 and save the path as a variable for later.

In [91]:
# Upload files to S3
boto3.Session().resource('s3').Bucket(bucket_name).Object(json_input_filename).upload_file(data_dir+"/"+json_input_filename)
s3_input_path = "s3://" + bucket_name + "/" + json_input_filename
print(s3_input_path)

s3://832194813872-us-east-1-personalizepocvod/json_input.json


Batch recommendations read the input from the file we've uploaded to S3. Similarly, batch recommendations will save the output to a file in S3. So we define the output path where the results should be saved.

In [92]:
# Define the output path
s3_output_path = "s3://" + bucket_name + "/"
print(s3_output_path)

s3://832194813872-us-east-1-personalizepocvod/


Now just make the call to kick off the batch export process.

In [93]:
batchInferenceJobArn = personalize.create_batch_inference_job (
    solutionVersionArn = userpersonalization_solution_version_arn,
    jobName = "VOD-POC-Batch-Inference-Job-UserPersonalization_" + str(round(time.time()*1000)),
    roleArn = role_arn,
    jobInput = 
     {"s3DataSource": {"path": s3_input_path}},
    jobOutput = 
     {"s3DataDestination":{"path": s3_output_path}}
)
batchInferenceJobArn = batchInferenceJobArn['batchInferenceJobArn']

Run the while loop below to track the status of the batch recommendation call. This can take around 30 minutes to complete, because Personalize needs to stand up the infrastructure to perform the task. We are testing the feature with a dataset of only 3 users, which is not an efficient use of this mechanism. Normally, you would only use this feature for bulk processing, in which case the efficiencies will become clear.

In [94]:
current_time = datetime.now()
print("Import Started on: ", current_time.strftime("%I:%M:%S %p"))

max_time = time.time() + 6*60*60 # 6 hours
while time.time() < max_time:
    describe_dataset_inference_job_response = personalize.describe_batch_inference_job(
        batchInferenceJobArn = batchInferenceJobArn
    )
    status = describe_dataset_inference_job_response["batchInferenceJob"]['status']
    print("DatasetInferenceJob: {}".format(status))
    
    if status == "ACTIVE" or status == "CREATE FAILED":
        break
        
    time.sleep(60)
    
current_time = datetime.now()
print("Import Completed on: ", current_time.strftime("%I:%M:%S %p"))

Import Started on:  05:13:33 PM
DatasetInferenceJob: CREATE PENDING
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInferenceJob: CREATE IN_PROGRESS
DatasetInfer
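
The polling loop above follows a pattern that recurs across these notebooks, so it can be factored into a small helper. This is a minimal sketch, not part of the original notebook: `wait_for_status`, `get_status`, and the fake status sequence are hypothetical names used for illustration. In the notebook, `get_status` would wrap the `personalize.describe_batch_inference_job` call and return the `status` field.

```python
import time


def wait_for_status(get_status, terminal_statuses,
                    poll_seconds=60, timeout_seconds=6 * 60 * 60):
    """Call get_status() until it returns a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        print("Status: {}".format(status))
        if status in terminal_statuses:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Still {!r} after {} seconds".format(status, timeout_seconds)
            )
        time.sleep(poll_seconds)


# Fake status source standing in for the describe call (assumption for illustration)
fake_statuses = iter(["CREATE PENDING", "CREATE IN_PROGRESS", "ACTIVE"])
final_status = wait_for_status(
    lambda: next(fake_statuses),
    terminal_statuses={"ACTIVE", "CREATE FAILED"},
    poll_seconds=0,
)
```

Raising on timeout, rather than silently leaving the loop, makes it harder to continue to the download step with a job that never finished.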

In [95]:
s3 = boto3.client('s3')
export_name = json_input_filename + ".out"
s3.download_file(bucket_name, export_name, data_dir+"/"+export_name)

# Update DF rendering
pd.set_option('display.max_rows', 30)
columns = []
with open(data_dir+"/"+export_name) as json_file:
    # The output is JSON Lines: parse one JSON object per line
    for raw_line in json_file:
        if not raw_line.strip():
            continue  # skip blank lines
        line = json.loads(raw_line)
        # Extract the user ID for the column header
        col_header = "User: " + line['input']['userId']
        # Look up the movie title for every recommended item
        recommendation_list = [get_movie_by_id(item) for item in line['output']['recommendedItems']]
        columns.append(pd.Series(recommendation_list, name=col_header))
# Combine one column per user into a single dataframe
bulk_recommendations_df = pd.concat(columns, axis=1)
bulk_recommendations_df

Unnamed: 0,User: 320,User: 403,User: 59
0,X2: X-Men United (2003),"Hangover, The (2009)",Network (1976)
1,"Matrix Revolutions, The (2003)",Dr. Strangelove or: How I Learned to Stop Worr...,American Graffiti (1973)
2,Avatar (2009),Clueless (1995),Chinatown (1974)
3,Spider-Man 2 (2004),Léon: The Professional (a.k.a. The Profession...,This Is Spinal Tap (1984)
4,Iron Man (2008),Forgetting Sarah Marshall (2008),"Sting, The (1973)"
5,Mr. & Mrs. Smith (2005),Juno (2007),Gandhi (1982)
6,Spider-Man (2002),Reservoir Dogs (1992),Raging Bull (1980)
7,"Hitchhiker's Guide to the Galaxy, The (2005)",Borat: Cultural Learnings of America for Make ...,Amadeus (1984)
8,Fantastic Four (2005),Sleepless in Seattle (1993),Trainspotting (1996)
9,Serenity (2005),Mr. & Mrs. Smith (2005),"Big Chill, The (1983)"


## Wrap up <a class="anchor" id="wrapup"></a>
[Back to top](#top)

With that you now have a fully working collection of models to tackle various recommendation and personalization scenarios, as well as the skills to manipulate customer data to better integrate with the service, and a knowledge of how to do all this over APIs and by leveraging open source data science tools.

Use these notebooks as a guide to getting started with your customers for POCs. As you find missing components, or discover new approaches, cut a pull request and provide any additional helpful components that may be missing from this collection.

You'll want to make sure that you clean up all of the resources deployed during this POC. We have provided a separate notebook which shows you how to identify and delete the resources in `06_Clean_Up_Resources.ipynb`.

In [97]:
%store event_tracker_arn
%store batchInferenceJobArn

Stored 'event_tracker_arn' (str)
Stored 'batchInferenceJobArn' (str)
