# Using Personalize campaigns on synthetic cars data
This notebook takes advantage of campaigns that have been built in the other notebooks.
There are specific sections for each model:

1. [Personalized Ranking](#Exercise-the-personalized-ranking-campaign)
2. [HRNN](#Exercise-the-hrnn-campaign)
3. [SIMS](#Exercise-the-SIMS-campaign)
4. [HRNN-Metadata](#Exercise-the-hrnn-metadata-campaign)
5. [Popularity Count](#Exercise-the-popularity-campaign)

In addition, we have a section for experimenting with 
[Personalize Event Tracker](#Use-real-time-events).

## Imports, overall settings, initialization

In [492]:
import json
import boto3
import time
import datetime
import pandas as pd
from sklearn.utils import shuffle

account_num = '<your-account>'
dataset_group_name = 'car-dg10'

dg_arn = 'arn:aws:personalize:us-east-1:{}:dataset-group/{}'.format(account_num, dataset_group_name)

NUM_CLUSTERS = 20

cars_filename         = 'car_items.csv'
users_filename        = 'users.csv'
interactions_filename = 'interactions.csv'
int_exp_filename      = 'interactions_expanded.csv'

ranking_arn       = 'arn:aws:personalize:us-east-1:{}:campaign/car-personalized-ranking'.format(account_num)
sims_arn          = 'arn:aws:personalize:us-east-1:{}:campaign/car-sims'.format(account_num)
hrnn_arn          = 'arn:aws:personalize:us-east-1:{}:campaign/car-hrnn'.format(account_num)
hrnn_metadata_arn = 'arn:aws:personalize:us-east-1:{}:campaign/car-hrnn-metadata'.format(account_num)
pop_arn           = 'arn:aws:personalize:us-east-1:{}:campaign/car-popularity'.format(account_num)

In [414]:
personalize         = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')
personalize_events  = boto3.client('personalize-events')

In [415]:
int_df = pd.read_csv(interactions_filename)
int_df = int_df.astype({'ITEM_ID': str})
int_df = int_df.astype({'USER_ID': str})

In [416]:
def date_to_string(ts):
    return datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')

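# Note: int_expanded_df is loaded in a later cell (In [419]); run that cell first
# when executing the notebook from top to bottom.
int_expanded_df['WHEN'] = int_expanded_df['TIMESTAMP'].apply(date_to_string)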

In [427]:
def show_item_interaction_history(int_df, item_id):
    _tmp_df = int_df[int_df.ITEM_ID == item_id].sort_values('TIMESTAMP')
    print(_tmp_df.shape)
    return _tmp_df[['USER_ID','ITEM_ID',#'WHEN',
                    'FAV','YEAR','GENDER','SALARY']]

In [426]:
def show_user_interaction_history(int_df, user_id):
    _tmp_df = int_df[int_df.USER_ID == int(user_id)].sort_values('TIMESTAMP')
    print(_tmp_df.shape)
    return _tmp_df[['USER_ID','ITEM_ID',#'WHEN',
                    'FAV','YEAR','PRICE','MILEAGE']]

In [419]:
int_expanded_df = pd.read_csv(int_exp_filename)

In [420]:
items_to_rank = int_expanded_df.sample(10)
items_to_rank.head(3)

Unnamed: 0,USER_ID,ITEM_ID,TIMESTAMP,MAKE,MODEL,YEAR,MILEAGE,PRICE,AGE,GENDER,LOCATION,SALARY,FAV_CLUSTER,FAV_MODEL,FAV
54972,13835,27149,1562854527,Ford,Fusion,2012,105039,25301,42,MALE,90706,39362,9,4,OLDISH-Ford-Fusion
657699,1233,21628,1562856351,Ford,Explorer,2016,51072,36096,34,FEMALE,93307,26462,10,5,NEWISH-Ford-Explorer
552922,17395,19930,1562854283,Nissan,Leaf,2019,4754,41511,34,FEMALE,91710,55185,14,7,NEWISH-Nissan-Leaf


In [423]:
def print_item(item_id):
    tmp = int_expanded_df[int_expanded_df.ITEM_ID == item_id].iloc[0]
    print('Id: {}, Make: {}, Model: {}, Fav: {}, Year: {}, Age: {}'.format(item_id,
         tmp['MAKE'], tmp['MODEL'], tmp['FAV'], tmp['YEAR'], tmp['AGE']))

Skip ahead to try out various campaigns:

1. [Personalized Ranking](#Exercise-the-personalized-ranking-campaign)
2. [HRNN](#Exercise-the-hrnn-campaign)
3. [SIMS](#Exercise-the-SIMS-campaign)
4. [HRNN-Metadata](#Exercise-the-hrnn-metadata-campaign)
5. [Popularity Count](#Exercise-the-popularity-campaign)

In addition, we have a section for experimenting with 
[Personalize Event Tracker](#Use-real-time-events).

## Exercise the personalized ranking campaign
Here we want to see Personalize re-rank a set of search results. For our sample, we pass
a user who likes oldish cars and expect oldish cars to appear closer to the top. Likewise, we
pass a user who likes newish cars and expect the higher-ranked cars to be newish.
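
As a minimal sketch of how such a check could be scored, the snippet below finds the position of the first item from a target cluster in a ranked list. The response dictionary only mimics the `personalizedRanking` / `itemId` shape used later in this notebook; the item ids, the `sample_item_cluster` lookup, and the helper name `first_rank_of_cluster` are hypothetical and exist only for illustration.

```python
def first_rank_of_cluster(ranked_items, item_cluster, target_cluster):
    """Return the 0-based position of the first item in target_cluster, or None."""
    for position, item in enumerate(ranked_items):
        if item_cluster.get(item['itemId']) == target_cluster:
            return position
    return None  # no item from the target cluster was in the list


# Hypothetical response, shaped like the ranking output consumed below.
sample_response = {
    'personalizedRanking': [
        {'itemId': '101'},
        {'itemId': '102'},
        {'itemId': '103'},
    ]
}

# Hypothetical item id -> car cluster lookup.
sample_item_cluster = {'101': 3, '102': 7, '103': 7}

rank = first_rank_of_cluster(
    sample_response['personalizedRanking'], sample_item_cluster, 7)
print(rank)
```

Returning `None` when nothing matches keeps "not found" distinct from a genuine rank, so the caller can decide how to treat that case.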

In [441]:
full_df = pd.DataFrame(columns=['USER_ID','FAV','FAV_CLUSTER'])

for i in range(NUM_CLUSTERS):
    tmp_df = int_expanded_df[int_expanded_df.FAV_CLUSTER == i][['USER_ID','FAV','FAV_CLUSTER']].sample(1)
    full_df = pd.concat([full_df, tmp_df])
ranking_user_df = shuffle(full_df)
ranking_user_list = ranking_user_df['USER_ID'].values.astype(str).tolist()
ranking_user_df.head(25)

Unnamed: 0,USER_ID,FAV,FAV_CLUSTER
709852,26570,OLDISH-Toyota-Camry,7
552163,14592,NEWISH-Nissan-Leaf,14
23901,17159,OLDISH-Toyota-Prius,13
747661,18128,OLDISH-Toyota-Sienna,1
670246,7993,NEWISH-Nissan-Rogue,2
114504,18521,OLDISH-Nissan-Altima,5
694952,11671,NEWISH-Nissan-Altima,4
394130,21050,OLDISH-Nissan-Rogue,3
746124,22079,OLDISH-Ford-Explorer,11
479973,18329,OLDISH-Nissan-Leaf,15


In [442]:
full_df = pd.DataFrame(columns=['ITEM_ID','FAV'])

for i in range(NUM_CLUSTERS):
    tmp_df = int_expanded_df[int_expanded_df.FAV_CLUSTER == i][['ITEM_ID','FAV']].sample(1)
    full_df = pd.concat([full_df, tmp_df])
ranking_item_df = shuffle(full_df)
ranking_item_list = ranking_item_df['ITEM_ID'].values.astype(str).tolist()
ranking_item_df.head(25)

Unnamed: 0,ITEM_ID,FAV
746475,31294,NEWISH-Toyota-Sienna
277527,24799,NEWISH-Ford-Mustang
513609,13709,NEWISH-Toyota-Camry
670816,32076,NEWISH-Nissan-Rogue
240014,30851,OLDISH-Ford-Fusion
664606,19229,NEWISH-Ford-Fusion
528862,20833,NEWISH-Toyota-Prius
675684,29204,OLDISH-Ford-Explorer
393065,28398,OLDISH-Nissan-Rogue
747956,28340,OLDISH-Toyota-Sienna


In [443]:
def print_ranking_target_df(user_id, input_df, target_cluster):
    print('\nRanking for user: {}'.format(user_id))
    
    _input_list = input_df['ITEM_ID'].values.astype(str).tolist()
    
    personalized_ranking_response = personalize_runtime.get_personalized_ranking(
        campaignArn = ranking_arn, userId = str(user_id), inputList = _input_list)
    
    i = 0
    _rank = len(_input_list)
    for item in personalized_ranking_response['personalizedRanking']:
        item_id = item['itemId']
        tmp = int_expanded_df[int_expanded_df.ITEM_ID == int(item_id)].iloc[0]
        _fav_cluster = tmp['FAV_CLUSTER']
        if (target_cluster == _fav_cluster) and (_rank == len(_input_list)):
            _rank = i
        print('Id: {}, Make: {}, Model: {}, Year: {}, Price: {}, Fav: {}'.format(item_id,
         tmp['MAKE'], tmp['MODEL'], tmp['YEAR'], tmp['PRICE'], tmp['FAV']))
        i += 1
    return _rank

In [444]:
full_df = pd.DataFrame(columns=['ITEM_ID','FAV','FAV_CLUSTER'])

random_item_df = shuffle(int_expanded_df[['ITEM_ID','FAV', 'FAV_CLUSTER']].sample(25))
random_item_list = random_item_df['ITEM_ID'].values.astype(str).tolist()
random_item_df.head(25)

Unnamed: 0,ITEM_ID,FAV,FAV_CLUSTER
707748,18084,NEWISH-Ford-Fusion,8
98023,26836,NEWISH-Ford-Fusion,8
477625,31361,OLDISH-Nissan-Leaf,15
619796,28406,NEWISH-Toyota-Prius,12
137405,24264,NEWISH-Ford-Explorer,10
499964,22335,OLDISH-Ford-Fusion,9
290444,26012,NEWISH-Nissan-Leaf,14
575143,22626,NEWISH-Ford-Explorer,10
484782,28758,OLDISH-Ford-Fusion,9
515422,23962,NEWISH-Toyota-Camry,6


#### Try personalized ranking on a set of random items
Here we take some random items and see how well Personalize can re-rank them
for each of a set of users with a known bias to specific car clusters. Ideally,
we would find that a matching car would rise as close to the 0th rank as possible.
We use a curated list of users that cover each car cluster preference. 

Note that it is likely that some of the users will have a preference that is not
covered by the list of random items. In those cases, the best case is that Personalize
re-ranks the list simply based on popularity.
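
The evaluation loop in this section resets the rank to 0 when no matching item is available, and 0 is also the best possible rank. One alternative, sketched here under the assumption that a missing match is recorded as `None`, is to average only over the users whose preferred cluster was actually present. The helper name `mean_rank` and the sample values are hypothetical.

```python
def mean_rank(ranks):
    """Average the ranks that were found; None marks users with no matching item."""
    found = [r for r in ranks if r is not None]
    if not found:
        return None  # nothing to average
    return sum(found) / len(found)


# Hypothetical per-user ranks; two users had no matching item in the list.
example_ranks = [0, 2, None, 1, None]
print(mean_rank(example_ranks))
```

With this variant, users without a candidate item neither help nor hurt the average.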

In [447]:
rank_total = 0
for i in range(NUM_CLUSTERS):
    user_fav = ranking_user_df.iloc[i]['FAV']
    user_fav_cluster = ranking_user_df.iloc[i]['FAV_CLUSTER']
    print('\nRanking for user that prefers: {}'.format(user_fav))
    rank = print_ranking_target_df(ranking_user_list[i], random_item_df, user_fav_cluster)
    if (random_item_df.shape[0] == rank):
        print('**desired cluster was not found in the item set')
        rank = 0 # reset to not penalize when no item was available
    else:
        print('**rank {}'.format(rank))
    rank_total += rank

# rank_total accumulates one rank per user, so average over the number of users
print('\nRank average: {:.2}'.format(rank_total/NUM_CLUSTERS))


Ranking for user that prefers: OLDISH-Toyota-Camry

Ranking for user: 26570
Id: 23525, Make: Toyota, Model: Camry, Year: 2013, Price: 28379, Fav: OLDISH-Toyota-Camry
Id: 21683, Make: Toyota, Model: Camry, Year: 2011, Price: 22290, Fav: OLDISH-Toyota-Camry
Id: 22626, Make: Ford, Model: Explorer, Year: 2014, Price: 33155, Fav: NEWISH-Ford-Explorer
Id: 28758, Make: Ford, Model: Fusion, Year: 2012, Price: 19006, Fav: OLDISH-Ford-Fusion
Id: 24264, Make: Ford, Model: Explorer, Year: 2015, Price: 30363, Fav: NEWISH-Ford-Explorer
Id: 24798, Make: Ford, Model: Fusion, Year: 2012, Price: 20968, Fav: OLDISH-Ford-Fusion
Id: 31650, Make: Ford, Model: Explorer, Year: 2014, Price: 25364, Fav: NEWISH-Ford-Explorer
Id: 30621, Make: Ford, Model: Fusion, Year: 2012, Price: 27925, Fav: OLDISH-Ford-Fusion
Id: 23761, Make: Ford, Model: Fusion, Year: 2009, Price: 18105, Fav: OLDISH-Ford-Fusion
Id: 27275, Make: Ford, Model: Fusion, Year: 2012, Price: 26359, Fav: OLDISH-Ford-Fusion
Id: 24792, Make: Ford, Mode

Id: 24792, Make: Ford, Model: Fusion, Year: 2012, Price: 25730, Fav: OLDISH-Ford-Fusion
Id: 25171, Make: Ford, Model: Fusion, Year: 2014, Price: 29928, Fav: NEWISH-Ford-Fusion
Id: 27275, Make: Ford, Model: Fusion, Year: 2012, Price: 26359, Fav: OLDISH-Ford-Fusion
Id: 23761, Make: Ford, Model: Fusion, Year: 2009, Price: 18105, Fav: OLDISH-Ford-Fusion
Id: 22335, Make: Ford, Model: Fusion, Year: 2011, Price: 24328, Fav: OLDISH-Ford-Fusion
Id: 23962, Make: Toyota, Model: Camry, Year: 2015, Price: 36790, Fav: NEWISH-Toyota-Camry
Id: 26836, Make: Ford, Model: Fusion, Year: 2016, Price: 38408, Fav: NEWISH-Ford-Fusion
Id: 26498, Make: Toyota, Model: Prius, Year: 2017, Price: 41505, Fav: NEWISH-Toyota-Prius
**desired cluster was not found in the item set

Ranking for user that prefers: NEWISH-Nissan-Rogue

Ranking for user: 7993
Id: 26012, Make: Nissan, Model: Leaf, Year: 2015, Price: 29590, Fav: NEWISH-Nissan-Leaf
Id: 31361, Make: Nissan, Model: Leaf, Year: 2009, Price: 19462, Fav: OLDISH-Niss

Id: 25917, Make: Ford, Model: Fusion, Year: 2015, Price: 37779, Fav: NEWISH-Ford-Fusion
Id: 31650, Make: Ford, Model: Explorer, Year: 2014, Price: 25364, Fav: NEWISH-Ford-Explorer
Id: 22626, Make: Ford, Model: Explorer, Year: 2014, Price: 33155, Fav: NEWISH-Ford-Explorer
Id: 17289, Make: Ford, Model: Explorer, Year: 2013, Price: 31651, Fav: OLDISH-Ford-Explorer
Id: 25171, Make: Ford, Model: Fusion, Year: 2014, Price: 29928, Fav: NEWISH-Ford-Fusion
Id: 26836, Make: Ford, Model: Fusion, Year: 2016, Price: 38408, Fav: NEWISH-Ford-Fusion
Id: 19526, Make: Ford, Model: Explorer, Year: 2011, Price: 16019, Fav: OLDISH-Ford-Explorer
Id: 26498, Make: Toyota, Model: Prius, Year: 2017, Price: 41505, Fav: NEWISH-Toyota-Prius
Id: 26761, Make: Toyota, Model: Prius, Year: 2015, Price: 28876, Fav: NEWISH-Toyota-Prius
Id: 28406, Make: Toyota, Model: Prius, Year: 2014, Price: 26315, Fav: NEWISH-Toyota-Prius
**desired cluster was not found in the item set

Ranking for user that prefers: OLDISH-Ford-Explor

Id: 31361, Make: Nissan, Model: Leaf, Year: 2009, Price: 19462, Fav: OLDISH-Nissan-Leaf
Id: 26012, Make: Nissan, Model: Leaf, Year: 2015, Price: 29590, Fav: NEWISH-Nissan-Leaf
Id: 22626, Make: Ford, Model: Explorer, Year: 2014, Price: 33155, Fav: NEWISH-Ford-Explorer
Id: 31650, Make: Ford, Model: Explorer, Year: 2014, Price: 25364, Fav: NEWISH-Ford-Explorer
Id: 23962, Make: Toyota, Model: Camry, Year: 2015, Price: 36790, Fav: NEWISH-Toyota-Camry
Id: 28758, Make: Ford, Model: Fusion, Year: 2012, Price: 19006, Fav: OLDISH-Ford-Fusion
Id: 26761, Make: Toyota, Model: Prius, Year: 2015, Price: 28876, Fav: NEWISH-Toyota-Prius
Id: 24264, Make: Ford, Model: Explorer, Year: 2015, Price: 30363, Fav: NEWISH-Ford-Explorer
Id: 28406, Make: Toyota, Model: Prius, Year: 2014, Price: 26315, Fav: NEWISH-Toyota-Prius
Id: 18084, Make: Ford, Model: Fusion, Year: 2014, Price: 33353, Fav: NEWISH-Ford-Fusion
Id: 26498, Make: Toyota, Model: Prius, Year: 2017, Price: 41505, Fav: NEWISH-Toyota-Prius
Id: 24798, M

#### Try personalized ranking on a curated set of items with each car cluster covered
Here we take a curated set of items, with one item for each car cluster. Personalize
should be able to re-rank in such a way that the specific item that would best match
the user rises to the 0th position.

In [448]:
rank_total = 0
for i in range(NUM_CLUSTERS):
    user_fav = ranking_user_df.iloc[i]['FAV']
    user_fav_cluster = ranking_user_df.iloc[i]['FAV_CLUSTER']
    print('\nRanking for user that prefers: {}'.format(user_fav))
    rank = print_ranking_target_df(ranking_user_list[i], ranking_item_df, user_fav_cluster)
    if (ranking_item_df.shape[0] == rank):
        print('**desired cluster was not found in the item set')
        rank = 0 # reset to not penalize when no item was available
    else:
        print('**rank {}'.format(rank))
    rank_total += rank

print('\nRank average: {:.2}'.format(rank_total/len(ranking_item_list)))


Ranking for user that prefers: OLDISH-Toyota-Camry

Ranking for user: 26570
Id: 27578, Make: Toyota, Model: Camry, Year: 2012, Price: 27464, Fav: OLDISH-Toyota-Camry
Id: 28340, Make: Toyota, Model: Sienna, Year: 2010, Price: 15007, Fav: OLDISH-Toyota-Sienna
Id: 24799, Make: Ford, Model: Mustang, Year: 2015, Price: 29057, Fav: NEWISH-Ford-Mustang
Id: 31932, Make: Ford, Model: Explorer, Year: 2014, Price: 27450, Fav: NEWISH-Ford-Explorer
Id: 30851, Make: Ford, Model: Fusion, Year: 2011, Price: 24515, Fav: OLDISH-Ford-Fusion
Id: 31764, Make: Ford, Model: Mustang, Year: 2011, Price: 19764, Fav: OLDISH-Ford-Mustang
Id: 38490, Make: Nissan, Model: Altima, Year: 2012, Price: 25630, Fav: OLDISH-Nissan-Altima
Id: 32076, Make: Nissan, Model: Rogue, Year: 2014, Price: 26049, Fav: NEWISH-Nissan-Rogue
Id: 20833, Make: Toyota, Model: Prius, Year: 2014, Price: 25782, Fav: NEWISH-Toyota-Prius
Id: 31294, Make: Toyota, Model: Sienna, Year: 2016, Price: 36832, Fav: NEWISH-Toyota-Sienna
Id: 24567, Make: 

Id: 28398, Make: Nissan, Model: Rogue, Year: 2013, Price: 24785, Fav: OLDISH-Nissan-Rogue
Id: 24122, Make: Nissan, Model: Altima, Year: 2017, Price: 36868, Fav: NEWISH-Nissan-Altima
Id: 24799, Make: Ford, Model: Mustang, Year: 2015, Price: 29057, Fav: NEWISH-Ford-Mustang
Id: 13709, Make: Toyota, Model: Camry, Year: 2014, Price: 31690, Fav: NEWISH-Toyota-Camry
Id: 20481, Make: Nissan, Model: Leaf, Year: 2012, Price: 22918, Fav: OLDISH-Nissan-Leaf
Id: 31294, Make: Toyota, Model: Sienna, Year: 2016, Price: 36832, Fav: NEWISH-Toyota-Sienna
Id: 31764, Make: Ford, Model: Mustang, Year: 2011, Price: 19764, Fav: OLDISH-Ford-Mustang
Id: 28340, Make: Toyota, Model: Sienna, Year: 2010, Price: 15007, Fav: OLDISH-Toyota-Sienna
Id: 24567, Make: Toyota, Model: Prius, Year: 2011, Price: 19000, Fav: OLDISH-Toyota-Prius
Id: 32076, Make: Nissan, Model: Rogue, Year: 2014, Price: 26049, Fav: NEWISH-Nissan-Rogue
Id: 27578, Make: Toyota, Model: Camry, Year: 2012, Price: 27464, Fav: OLDISH-Toyota-Camry
Id: 38

Id: 31294, Make: Toyota, Model: Sienna, Year: 2016, Price: 36832, Fav: NEWISH-Toyota-Sienna
Id: 31764, Make: Ford, Model: Mustang, Year: 2011, Price: 19764, Fav: OLDISH-Ford-Mustang
Id: 24122, Make: Nissan, Model: Altima, Year: 2017, Price: 36868, Fav: NEWISH-Nissan-Altima
Id: 32076, Make: Nissan, Model: Rogue, Year: 2014, Price: 26049, Fav: NEWISH-Nissan-Rogue
Id: 28340, Make: Toyota, Model: Sienna, Year: 2010, Price: 15007, Fav: OLDISH-Toyota-Sienna
Id: 24799, Make: Ford, Model: Mustang, Year: 2015, Price: 29057, Fav: NEWISH-Ford-Mustang
Id: 28398, Make: Nissan, Model: Rogue, Year: 2013, Price: 24785, Fav: OLDISH-Nissan-Rogue
Id: 27223, Make: Nissan, Model: Leaf, Year: 2015, Price: 36224, Fav: NEWISH-Nissan-Leaf
Id: 20481, Make: Nissan, Model: Leaf, Year: 2012, Price: 22918, Fav: OLDISH-Nissan-Leaf
Id: 13709, Make: Toyota, Model: Camry, Year: 2014, Price: 31690, Fav: NEWISH-Toyota-Camry
Id: 38490, Make: Nissan, Model: Altima, Year: 2012, Price: 25630, Fav: OLDISH-Nissan-Altima
Id: 31

## Exercise the hrnn campaign
Here we try out the HRNN campaign. We ask Personalize for recommendations for a particular user. Our hope is that
it detects whether this user likes old or new cars and returns a list accordingly.
In the best case, the recommended list of cars entirely matches the user's preferred car
cluster.
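
To put a number on "entirely matches", one option is the fraction of recommended items that fall in the user's preferred cluster. The sketch below assumes a plain dictionary from item id to cluster; the helper name `cluster_match_fraction` and all sample ids are hypothetical.

```python
def cluster_match_fraction(recommended_ids, item_cluster, preferred_cluster):
    """Fraction of recommended items that belong to preferred_cluster."""
    if not recommended_ids:
        return 0.0  # avoid dividing by zero on an empty recommendation list
    matches = sum(
        1 for item_id in recommended_ids
        if item_cluster.get(item_id) == preferred_cluster
    )
    return matches / len(recommended_ids)


# Hypothetical item id -> car cluster lookup and recommendation list.
sample_item_cluster = {'201': 10, '202': 10, '203': 11, '204': 10}
sample_recs = ['201', '202', '203', '204']

fraction = cluster_match_fraction(sample_recs, sample_item_cluster, 10)
print(fraction)
```

A value of 1.0 corresponds to the best case described above.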

In [431]:
users_to_try = int_expanded_df.sample(10)
users_to_try[['USER_ID','FAV']].head(3)

Unnamed: 0,USER_ID,FAV
192874,17333,NEWISH-Ford-Explorer
365859,9103,OLDISH-Nissan-Rogue
135563,9116,NEWISH-Ford-Explorer


In [432]:
show_user_interaction_history(int_expanded_df, users_to_try.iloc[0]['USER_ID'])

(40, 15)


Unnamed: 0,USER_ID,ITEM_ID,FAV,YEAR,PRICE,MILEAGE
192858,17333,29848,NEWISH-Ford-Explorer,2016,40901,49154
192851,17333,24270,NEWISH-Ford-Explorer,2017,38651,39171
192873,17333,27064,NEWISH-Ford-Explorer,2015,30061,65353
192857,17333,24328,NEWISH-Ford-Explorer,2014,33450,79576
192874,17333,31550,NEWISH-Ford-Explorer,2014,28581,75791
192869,17333,28923,NEWISH-Ford-Explorer,2016,39178,48862
192849,17333,25907,NEWISH-Ford-Explorer,2015,31822,62902
192859,17333,22651,NEWISH-Ford-Explorer,2017,38455,30202
192877,17333,15478,NEWISH-Ford-Explorer,2016,37318,54694
192842,17333,25442,NEWISH-Ford-Explorer,2014,26450,77625


In [450]:
for i in range(10):
    user_id = str(users_to_try.iloc[i]['USER_ID'])
    fav     = users_to_try.iloc[i]['FAV']
    print('Getting recommendations for user: {}, who likes: {}'.format(user_id, fav))
    response = personalize_runtime.get_recommendations(campaignArn=hrnn_arn, 
                                                       userId=user_id, 
                                                       numResults=10)
    items = response['itemList']
    for item in items:
        print_item(int(item['itemId']))
    print('')

Getting recommendations for user: 17333, who likes: NEWISH-Ford-Explorer
Id: 23006, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2017, Age: 39
Id: 24998, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2014, Age: 40
Id: 26348, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2017, Age: 40
Id: 24522, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2015, Age: 36
Id: 21642, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2015, Age: 41
Id: 29385, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2016, Age: 39
Id: 24231, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2016, Age: 36
Id: 23317, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2015, Age: 36
Id: 24844, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2015, Age: 40
Id: 25166, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2016, Age: 38

Getting recommendations for user: 9103, who likes: OLDIS

## Exercise the SIMS campaign
Here we experiment with the SIMS campaign. We loop through a list of items that have at
least some historical interactions.
For each car, we expect the similar cars to be close in age, make and model.
We leverage car clusters and would like to see Personalize generate a list of similar cars
that come entirely from the same car cluster.

In [452]:
items_to_try = int_expanded_df.sample(10)
items_to_try[['ITEM_ID','FAV']].head(5)

Unnamed: 0,ITEM_ID,FAV
139404,19080,NEWISH-Ford-Explorer
53493,24319,OLDISH-Ford-Fusion
263770,24889,OLDISH-Toyota-Camry
12877,19768,OLDISH-Ford-Explorer
74707,27500,NEWISH-Ford-Fusion


In [454]:
desired_num_results = 10
for i in range(items_to_try.shape[0]):
    item_id     = str(items_to_try.iloc[i]['ITEM_ID'])
    fav         = items_to_try.iloc[i]['FAV']
    fav_cluster = items_to_try.iloc[i]['FAV_CLUSTER']
    
    print('Getting items similar to: {}, which is a: {}'.format(item_id, fav))
    response = personalize_runtime.get_recommendations(campaignArn=sims_arn, 
                                                       itemId=item_id, 
                                                       numResults=desired_num_results)
    items = response['itemList']
    match = 0
    actual_num_results = len(items)
    for item in items:
        _curr_cluster = int_expanded_df[int_expanded_df.ITEM_ID == int(item['itemId'])].iloc[0]['FAV_CLUSTER']
        if fav_cluster == _curr_cluster:
            match += 1
        print_item(int(item['itemId']))
    print('Matched {:.2} ({}/{})'.format(match/actual_num_results, match, actual_num_results))
    print('')

Getting items similar to: 19080, which is a: NEWISH-Ford-Explorer
Id: 21070, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2015, Age: 40
Id: 34991, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2014, Age: 37
Id: 18088, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2019, Age: 44
Id: 25093, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2016, Age: 42
Id: 25770, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2016, Age: 38
Id: 28252, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2014, Age: 40
Id: 32727, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2018, Age: 38
Id: 23366, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2016, Age: 39
Id: 19563, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2015, Age: 38
Id: 28963, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2014, Age: 38
Matched 1.0 (10/10)

Getting items similar to: 24319, which is a

Id: 21816, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2013, Age: 41
Id: 33730, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2012, Age: 40
Matched 1.0 (10/10)

Getting items similar to: 12643, which is a: NEWISH-Ford-Explorer
Id: 24835, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2014, Age: 45
Id: 23851, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2015, Age: 35
Id: 16509, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2014, Age: 41
Id: 21992, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2016, Age: 37
Id: 26764, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2014, Age: 40
Id: 21955, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2015, Age: 39
Id: 28755, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2017, Age: 41
Id: 22125, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2015, Age: 38
Matched 1.0 (8/8)



## Exercise the popularity campaign
Personalize provides a baseline recommender that relies on the simple popularity of an item.
Here we compare its results with our own definition of "popular".

Our popularity is driven simply by the total count of
interactions for each item. We expect significant overlap between our list and the one from Personalize.
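
The count-based definition and the overlap check can be illustrated without pandas. This is only a sketch on made-up data: the item ids, the `service_top` list standing in for the Personalize result, and the helper name `top_n_items` are hypothetical.

```python
from collections import Counter


def top_n_items(item_ids, n):
    """Return the n most frequently occurring item ids, most popular first."""
    return [item_id for item_id, _count in Counter(item_ids).most_common(n)]


# Hypothetical interaction log: one entry per interaction.
sample_interactions = ['a', 'b', 'a', 'c', 'a', 'b', 'd']

our_top = top_n_items(sample_interactions, 2)   # counted from the interactions
service_top = ['a', 'c']                        # stand-in for the service's list

overlap = set(our_top) & set(service_top)
print(our_top, overlap)
```

Comparing sets ignores ordering, which matches the question asked in this section: how many of the top items the two lists share.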

#### First let's see the results from Personalize

In [384]:
NUM_MOST_POPULAR = 10

popularity_response = personalize_runtime.get_recommendations(campaignArn=pop_arn, 
                                                              userId='0', 
                                                              numResults=NUM_MOST_POPULAR)
pop_items = popularity_response['itemList']
for item in pop_items:
    print_item(int(item['itemId']))
    
# Keep the ids as integers so they compare cleanly with the integer ITEM_ID column
personalized_pop = []
for p in pop_items:
    personalized_pop.append(int(p['itemId']))

Id: 25281, Make: Ford, Model: Mustang, Fav: OLDISH-Ford-Mustang, Year: 2013, Age: 36
Id: 25589, Make: Ford, Model: Mustang, Fav: OLDISH-Ford-Mustang, Year: 2010, Age: 46
Id: 24064, Make: Ford, Model: Mustang, Fav: OLDISH-Ford-Mustang, Year: 2012, Age: 46
Id: 25652, Make: Toyota, Model: Sienna, Fav: NEWISH-Toyota-Sienna, Year: 2015, Age: 37
Id: 26289, Make: Toyota, Model: Sienna, Fav: NEWISH-Toyota-Sienna, Year: 2014, Age: 43
Id: 25124, Make: Toyota, Model: Sienna, Fav: NEWISH-Toyota-Sienna, Year: 2015, Age: 37
Id: 24667, Make: Ford, Model: Explorer, Fav: OLDISH-Ford-Explorer, Year: 2010, Age: 38
Id: 25612, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2016, Age: 47
Id: 26239, Make: Ford, Model: Mustang, Fav: OLDISH-Ford-Mustang, Year: 2013, Age: 35
Id: 24920, Make: Ford, Model: Mustang, Fav: OLDISH-Ford-Mustang, Year: 2012, Age: 40


#### Now let's see the actual popularity counts of the historical interactions

In [391]:
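# value_counts() sorts by count, descending; keep just the item ids in an 'index' column.
# Building the column explicitly avoids relying on reset_index() column names,
# which differ between pandas versions.
most_popular = pd.DataFrame({'index': int_expanded_df['ITEM_ID'].value_counts().index})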
ten_most_popular = most_popular.head(10)
ten_most_popular.head(10)

Unnamed: 0,index
0,25281
1,25589
2,24064
3,25612
4,26289
5,25652
6,23126
7,26495
8,27038
9,24920


#### Now compare the two lists

In [398]:
overlap = ten_most_popular[ten_most_popular['index'].isin(personalized_pop)].shape[0]
print('We asked Personalize for {} most popular.'.format(NUM_MOST_POPULAR))
print('Of those, {} are truly most popular.'.format(overlap))
not_overlap = ten_most_popular[~ten_most_popular['index'].isin(personalized_pop)]
print('\nThese {} were truly top 10, but Personalize did NOT think so:'.format(not_overlap.shape[0]))
not_overlap.head()

We asked Personalize for 10 most popular.
Of those, 7 are truly most popular.

These 3 were truly top 10, but Personalize did NOT think so:


Unnamed: 0,index
6,23126
7,26495
8,27038


## Use real time events
Here we use the Personalize event tracker to add some events on the fly after a campaign
has been deployed. We then show the impact on the recommendations.
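
When several clicks need to be recorded for one user, they could be grouped into batches rather than sent one call at a time. The sketch below only builds the payloads and makes no AWS call. The event dictionaries copy the `sentAt` / `eventType` / `properties` shape used in this section; the `'click'` event type, the helper name `build_event_batches`, and the batch size of 10 are assumptions to verify against the PutEvents documentation.

```python
import json
import time

MAX_EVENTS_PER_CALL = 10  # assumed per-call limit; confirm in the PutEvents docs


def build_event_batches(item_ids, event_type, sent_at=None,
                        batch_size=MAX_EVENTS_PER_CALL):
    """Build event dictionaries for item_ids and split them into batches."""
    sent_at = time.time() if sent_at is None else sent_at
    events = [
        {
            'sentAt': sent_at,
            'eventType': event_type,
            'properties': json.dumps({'itemId': str(item_id)}),
        }
        for item_id in item_ids
    ]
    # Slice the flat list into consecutive chunks of at most batch_size events.
    return [events[i:i + batch_size] for i in range(0, len(events), batch_size)]


# Hypothetical example: 23 clicked item ids, fixed timestamp for repeatability.
batches = build_event_batches(range(23), 'click', sent_at=0)
print([len(batch) for batch in batches])
```

Each batch could then be passed as the `eventList` argument of `put_events`, together with the tracking id, user id, and session id.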

In [494]:
response = personalize.create_event_tracker(
    name='CarClickTracker',
    datasetGroupArn=dg_arn
)
print(response['eventTrackerArn'])
print(response['trackingId'])

TRACKING_ID = response['trackingId']

arn:aws:personalize:us-east-1:355151823911:event-tracker/1574f70c
f95a6715-a8a7-407c-9e82-7849235e5f27


In [495]:
session_dict = {}

In [496]:
import uuid

def send_car_click(user_id, item_id, ts):
    """
    Simulates a click to send an event to Amazon Personalize's Event Tracker
    """
    # Configure Session
    try:
        session_ID = session_dict[user_id]
    except KeyError:
        # First event for this user: start a new session
        session_dict[user_id] = str(uuid.uuid1())
        session_ID = session_dict[user_id]
        
    # Configure Properties:
    event = {
        'itemId': str(item_id)
    }
    event_json = json.dumps(event)
        
    # Make Call
    personalize_events.put_events(
        trackingId = TRACKING_ID,
        userId     = str(user_id),
        sessionId  = session_ID,
        eventList  = [{
            'sentAt': ts,
            'eventType': 'EVENT_TYPE',
            'properties': event_json
            }]
    )

In [497]:
def send_car_clicks(user_id, items):
    # TODO: send all events in a single array instead of one call for each item
    for item in items:
        send_car_click(user_id, item, time.time())

In [498]:
def recommend_cars(user_id):
    response = personalize_runtime.get_recommendations(campaignArn=hrnn_arn, 
                                                       userId=str(user_id), 
                                                       numResults=25)
    items = response['itemList']
    for item in items:
        print_item(int(item['itemId']))
    print('')

In [513]:
sample_user = int_expanded_df.sample(1).iloc[0]['USER_ID']
sample_user_cluster = int_expanded_df[int_expanded_df.USER_ID == sample_user].iloc[0]['FAV_CLUSTER']
sample_user_fav = int_expanded_df[int_expanded_df.USER_ID == sample_user].iloc[0]['FAV']
print('user: {}, cluster: {}, fav: {}'.format(sample_user, sample_user_cluster, sample_user_fav))

user: 18050, cluster: 7, fav: OLDISH-Toyota-Camry


In [514]:
new_cluster = sample_user_cluster + 1
if (new_cluster == NUM_CLUSTERS):
    new_cluster = 0
new_fav = int_expanded_df[int_expanded_df.FAV_CLUSTER == new_cluster].iloc[0]['FAV']
print('new cluster: {}, new fav: {}'.format(new_cluster, new_fav))

new cluster: 8, new fav: NEWISH-Ford-Fusion


In [515]:
print('Before any real time events, Personalize should recommend {} cars...\n'.format(sample_user_fav))
recommend_cars(sample_user)

Before any real time events, Personalize should recommend OLDISH-Toyota-Camry cars...

Id: 23113, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 41
Id: 26734, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 44
Id: 31008, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2013, Age: 39
Id: 23234, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2007, Age: 44
Id: 28809, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2013, Age: 44
Id: 24088, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2011, Age: 43
Id: 29885, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2009, Age: 39
Id: 23990, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2013, Age: 44
Id: 20823, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 37
Id: 20490, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2013, Age: 36
Id: 22996, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry

In [516]:
new_car_cluster = int_expanded_df[int_expanded_df.FAV_CLUSTER == new_cluster].sample(10)
new_car_cluster[['FAV','ITEM_ID','YEAR','PRICE']].head(3)

Unnamed: 0,FAV,ITEM_ID,YEAR,PRICE
456780,NEWISH-Ford-Fusion,23259,2015,35760
414943,NEWISH-Ford-Fusion,28473,2016,36284
247864,NEWISH-Ford-Fusion,25032,2015,35148


In [517]:
new_items_clicked = new_car_cluster['ITEM_ID'].values
new_items_clicked

array([23259, 28473, 25032, 25132, 26306, 27296, 31273, 24111, 22178,
       26182])

In [518]:
send_car_clicks(sample_user, new_items_clicked)

In [519]:
int_expanded_df[int_expanded_df.USER_ID == sample_user]['FAV'].value_counts()

OLDISH-Toyota-Camry    90
Name: FAV, dtype: int64

In [520]:
print('Now this same user has started to like {} cars.'.format(new_fav))
print('Lets see if Personalize picks up on this real time change in intent...')
recommend_cars(sample_user)

Now this same user has started to like NEWISH-Ford-Fusion cars.
Lets see if Personalize picks up on this real time change in intent...
Id: 23961, Make: Ford, Model: Fusion, Fav: NEWISH-Ford-Fusion, Year: 2015, Age: 39
Id: 28719, Make: Ford, Model: Fusion, Fav: NEWISH-Ford-Fusion, Year: 2014, Age: 38
Id: 22004, Make: Ford, Model: Fusion, Fav: NEWISH-Ford-Fusion, Year: 2015, Age: 45
Id: 20727, Make: Ford, Model: Fusion, Fav: NEWISH-Ford-Fusion, Year: 2015, Age: 37
Id: 25717, Make: Ford, Model: Fusion, Fav: NEWISH-Ford-Fusion, Year: 2014, Age: 41
Id: 26504, Make: Ford, Model: Fusion, Fav: NEWISH-Ford-Fusion, Year: 2016, Age: 33
Id: 25513, Make: Ford, Model: Fusion, Fav: NEWISH-Ford-Fusion, Year: 2015, Age: 43
Id: 23942, Make: Ford, Model: Fusion, Fav: NEWISH-Ford-Fusion, Year: 2014, Age: 39
Id: 25705, Make: Ford, Model: Fusion, Fav: NEWISH-Ford-Fusion, Year: 2016, Age: 43
Id: 20666, Make: Ford, Model: Fusion, Fav: NEWISH-Ford-Fusion, Year: 2019, Age: 37
Id: 27567, Make: Ford, Model: Fusio

## Exercise the hrnn-metadata campaign
Here we try out the HRNN-Metadata campaign.
We ask Personalize for recommendations for a particular user. Our hope is that
it detects whether this user likes old or new cars and returns a list accordingly.

In [521]:
users_to_try = int_expanded_df.sample(10)
users_to_try[['USER_ID','FAV']].head(3)

Unnamed: 0,USER_ID,FAV
172216,18778,OLDISH-Toyota-Camry
159448,20489,NEWISH-Ford-Fusion
711906,20560,NEWISH-Ford-Fusion


In [522]:
show_user_interaction_history(int_expanded_df, users_to_try.iloc[0]['USER_ID'])

(40, 15)


Unnamed: 0,USER_ID,ITEM_ID,FAV,YEAR,PRICE,MILEAGE
172229,18778,30817,OLDISH-Toyota-Camry,2012,20449,105711
172226,18778,20823,OLDISH-Toyota-Camry,2012,27578,106170
172212,18778,23858,OLDISH-Toyota-Camry,2013,25988,98421
172224,18778,24944,OLDISH-Toyota-Camry,2010,14430,135575
172220,18778,24501,OLDISH-Toyota-Camry,2013,23518,91704
172231,18778,25001,OLDISH-Toyota-Camry,2011,18194,126837
172237,18778,15043,OLDISH-Toyota-Camry,2013,25151,97141
172216,18778,19603,OLDISH-Toyota-Camry,2011,24108,129333
172209,18778,24404,OLDISH-Toyota-Camry,2012,26803,108132
172200,18778,21491,OLDISH-Toyota-Camry,2010,15419,136966


In [523]:
for i in range(10):
    user_id = str(users_to_try.iloc[i]['USER_ID'])
    fav     = users_to_try.iloc[i]['FAV']
    print('Getting recommendations for user: {}, who likes: {}'.format(user_id, fav))
    response = personalize_runtime.get_recommendations(campaignArn=hrnn_metadata_arn, 
                                                       userId=user_id, 
                                                       numResults=10)
    items = response['itemList']
    for item in items:
        print_item(int(item['itemId']))
    print('')

Getting recommendations for user: 18778, who likes: OLDISH-Toyota-Camry
Id: 22472, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 40
Id: 23174, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 36
Id: 27324, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2011, Age: 43
Id: 28133, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2011, Age: 38
Id: 23609, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2011, Age: 43
Id: 26345, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2010, Age: 40
Id: 25152, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 42
Id: 26613, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 39
Id: 24239, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 39
Id: 24040, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2011, Age: 41

Getting recommendations for user: 20489, who likes: NEWISH-Ford-Fusion
Id: 23

Id: 24824, Make: Nissan, Model: Leaf, Fav: OLDISH-Nissan-Leaf, Year: 2011, Age: 42
Id: 24637, Make: Nissan, Model: Leaf, Fav: OLDISH-Nissan-Leaf, Year: 2012, Age: 34
Id: 23736, Make: Nissan, Model: Leaf, Fav: OLDISH-Nissan-Leaf, Year: 2011, Age: 42
Id: 23215, Make: Nissan, Model: Leaf, Fav: OLDISH-Nissan-Leaf, Year: 2012, Age: 42
Id: 25575, Make: Nissan, Model: Leaf, Fav: OLDISH-Nissan-Leaf, Year: 2013, Age: 42
Id: 26137, Make: Nissan, Model: Leaf, Fav: OLDISH-Nissan-Leaf, Year: 2012, Age: 42
Id: 25146, Make: Nissan, Model: Leaf, Fav: OLDISH-Nissan-Leaf, Year: 2011, Age: 41
Id: 29055, Make: Nissan, Model: Leaf, Fav: OLDISH-Nissan-Leaf, Year: 2011, Age: 40
Id: 24302, Make: Nissan, Model: Leaf, Fav: OLDISH-Nissan-Leaf, Year: 2011, Age: 37
Id: 26043, Make: Nissan, Model: Leaf, Fav: OLDISH-Nissan-Leaf, Year: 2011, Age: 42



### Paste in ITEM_ID results from Personalize console to see results in context

In [None]:
import numpy as np  # needed for np.asarray below

results_from_console = ''

item_ids = np.asarray([22163,
24920,
22006,
27841,
20432,
23609,
28250,
25526,
28305,
28143,
26224,
23330,
25106,
21044,
21693,
25787,
24186,
21239,
24041,
26370,
22595,
29357,
15431,
26717,
25327])
# ITEM_ID is read from the CSV as an integer column, so match against the integer ids
int_expanded_df[int_expanded_df.ITEM_ID.isin(item_ids)]