# Using Personalize campaigns on synthetic cars data
This notebook exercises the campaigns that were built in the other notebooks.
There are specific sections for each model:

1. [Personalized Ranking](#Exercise-the-personalized-ranking-campaign)
2. [HRNN](#Exercise-the-hrnn-campaign)
3. [SIMS](#Exercise-the-SIMS-campaign)
4. [HRNN-Metadata](#Exercise-the-hrnn-metadata-campaign)
5. [Popularity Count](#Exercise-the-popularity-campaign)

In addition, we have a section for experimenting with 
[Personalize Event Tracker](#Use-real-time-events).

## Imports, overall settings, initialization

In [573]:
import json
import boto3
import time
import datetime
import pandas as pd
from sklearn.utils import shuffle

region      = '<region>'
account_num = '<your-account>'
dataset_group_name = 'car-dg10'

dg_arn = 'arn:aws:personalize:{}:{}:dataset-group/{}'.format(region, account_num, dataset_group_name)

cars_filename         = 'car_items.csv'
users_filename        = 'users.csv'
interactions_filename = 'interactions.csv'
int_exp_filename      = 'interactions_expanded.csv'

ranking_arn       = 'arn:aws:personalize:{}:{}:campaign/car-personalized-ranking'.format(region, account_num)
sims_arn          = 'arn:aws:personalize:{}:{}:campaign/car-sims'.format(region, account_num)
hrnn_arn          = 'arn:aws:personalize:{}:{}:campaign/car-hrnn'.format(region, account_num)
hrnn_metadata_arn = 'arn:aws:personalize:{}:{}:campaign/car-hrnn-metadata'.format(region, account_num)
pop_arn           = 'arn:aws:personalize:{}:{}:campaign/car-popularity-count'.format(region, account_num)

In [574]:
personalize         = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')
personalize_events  = boto3.client('personalize-events')

In [575]:
def show_item_interaction_history(int_df, item_id):
    _tmp_df = int_df[int_df.ITEM_ID == item_id].sort_values('TIMESTAMP')
    print(_tmp_df.shape)
    return _tmp_df[['USER_ID','ITEM_ID','WHEN',
                    'FAV','YEAR','GENDER','SALARY']]

In [576]:
def show_user_interaction_history(int_df, user_id):
    _tmp_df = int_df[int_df.USER_ID == int(user_id)].sort_values('TIMESTAMP')
    print(_tmp_df.shape)
    return _tmp_df[['USER_ID','ITEM_ID','WHEN',
                    'FAV','YEAR','PRICE','MILEAGE']]

In [577]:
def date_to_string(ts):
    return datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
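
Note that `datetime.datetime.fromtimestamp` converts to the local time zone of the machine running the notebook, so the `WHEN` strings can differ between environments. A minimal sketch of a time-zone-independent variant, assuming UTC is acceptable for display (`date_to_string_utc` is a hypothetical helper, not part of the original notebook):

```python
import datetime

def date_to_string_utc(ts):
    """Format a Unix timestamp (in seconds) as a UTC 'YYYY-MM-DD HH:MM:SS' string."""
    dt = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)
    return dt.strftime('%Y-%m-%d %H:%M:%S')

print(date_to_string_utc(0))
```

It could be swapped in for `date_to_string` when the notebook is run on machines in different time zones.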

In [578]:
int_expanded_df = pd.read_csv(int_exp_filename)

int_expanded_df['WHEN'] = int_expanded_df['TIMESTAMP'].apply(date_to_string)

NUM_CLUSTERS = len(int_expanded_df.FAV_CLUSTER.value_counts())
print('{} clusters'.format(NUM_CLUSTERS))

20 clusters


In [579]:
items_to_rank = int_expanded_df.sample(10)
items_to_rank.head(3)

Unnamed: 0,USER_ID,ITEM_ID,TIMESTAMP,MAKE,MODEL,YEAR,MILEAGE,PRICE,AGE,GENDER,LOCATION,SALARY,FAV_CLUSTER,FAV_MODEL,FAV,WHEN
365868,9103,26193,1562854025,Nissan,Rogue,2011,121466,25235,40,FEMALE,926,26306,3,1,OLDISH-Nissan-Rogue,2019-07-11 14:07:05
478136,18349,26607,1562858608,Nissan,Leaf,2013,98573,23012,34,MALE,90250,33656,15,7,OLDISH-Nissan-Leaf,2019-07-11 15:23:28
364941,19861,23083,1562858100,Nissan,Rogue,2012,105212,20423,48,MALE,11219,26648,3,1,OLDISH-Nissan-Rogue,2019-07-11 15:15:00


In [580]:
def print_item(item_id):
    tmp = int_expanded_df[int_expanded_df.ITEM_ID == item_id].iloc[0]
    print('Id: {}, Make: {}, Model: {}, Fav: {}, Year: {}, Age: {}'.format(item_id,
         tmp['MAKE'], tmp['MODEL'], tmp['FAV'], tmp['YEAR'], tmp['AGE']))

Skip ahead to try out various campaigns:

1. [Personalized Ranking](#Exercise-the-personalized-ranking-campaign)
2. [HRNN](#Exercise-the-hrnn-campaign)
3. [SIMS](#Exercise-the-SIMS-campaign)
4. [HRNN-Metadata](#Exercise-the-hrnn-metadata-campaign)
5. [Popularity Count](#Exercise-the-popularity-campaign)

In addition, we have a section for experimenting with 
[Personalize Event Tracker](#Use-real-time-events).

## Exercise the Personalized Ranking campaign
Here we want to see Personalize re-rank a set of search results. For our sample, we pass
a user who likes oldish cars and expect oldish cars to appear closer to the top. Likewise, we
pass a user who likes newish cars and expect the higher-ranked cars to be newish.
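
One way to quantify "closer to the top" is the 0-based position of the first re-ranked item that belongs to the user's preferred cluster. The sketch below shows that measure on plain Python data, without calling Personalize. The function name, the item ids, and the cluster numbers are illustrative assumptions.

```python
def first_match_rank(ranked_item_ids, cluster_by_item, target_cluster):
    """Return the 0-based position of the first item in target_cluster.

    Returns len(ranked_item_ids) when no item belongs to the target cluster.
    """
    for position, item_id in enumerate(ranked_item_ids):
        if cluster_by_item.get(item_id) == target_cluster:
            return position
    return len(ranked_item_ids)

# Hypothetical re-ranked list and item-to-cluster lookup
ranked = ['a', 'b', 'c']
clusters = {'a': 6, 'b': 15, 'c': 5}
print(first_match_rank(ranked, clusters, 5))
```

A result of 0 means the best match was ranked first. A result equal to the list length means the preferred cluster was not in the list at all.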

In [581]:
full_df = pd.DataFrame(columns=['USER_ID','FAV','FAV_CLUSTER'])

for i in range(NUM_CLUSTERS):
    tmp_df = int_expanded_df[int_expanded_df.FAV_CLUSTER == i][['USER_ID','FAV','FAV_CLUSTER']].sample(1)
    full_df = pd.concat([full_df, tmp_df])
ranking_user_df = shuffle(full_df)
ranking_user_list = ranking_user_df['USER_ID'].values.astype(str).tolist()
ranking_user_df.head(25)

Unnamed: 0,USER_ID,FAV,FAV_CLUSTER
309709,8485,NEWISH-Toyota-Camry,6
48341,15421,OLDISH-Ford-Fusion,9
746667,18105,NEWISH-Toyota-Sienna,0
561163,20855,NEWISH-Toyota-Prius,12
361013,15660,OLDISH-Nissan-Rogue,3
733135,5205,OLDISH-Ford-Mustang,17
110296,11910,OLDISH-Nissan-Altima,5
172060,11549,OLDISH-Toyota-Camry,7
29405,10804,OLDISH-Toyota-Prius,13
62981,12409,NEWISH-Ford-Fusion,8


In [582]:
full_df = pd.DataFrame(columns=['ITEM_ID','FAV'])

for i in range(NUM_CLUSTERS):
    tmp_df = int_expanded_df[int_expanded_df.FAV_CLUSTER == i][['ITEM_ID','FAV']].sample(1)
    full_df = pd.concat([full_df, tmp_df])
ranking_item_df = shuffle(full_df)
ranking_item_list = ranking_item_df['ITEM_ID'].values.astype(str).tolist()
ranking_item_df.head(25)

Unnamed: 0,ITEM_ID,FAV
644015,28342,OLDISH-Nissan-Leaf
748243,21096,OLDISH-Toyota-Sienna
148160,20498,OLDISH-Toyota-Prius
360650,24021,OLDISH-Nissan-Rogue
351295,18590,NEWISH-Nissan-Leaf
245054,23142,NEWISH-Ford-Fusion
316380,26083,NEWISH-Toyota-Camry
628151,25566,OLDISH-Ford-Explorer
735500,27301,OLDISH-Ford-Mustang
182185,23056,OLDISH-Toyota-Camry


In [583]:
def print_ranking_target_df(user_id, input_df, target_cluster):
    print('\nRanking for user: {}'.format(user_id))
    
    _input_list = input_df['ITEM_ID'].values.astype(str).tolist()
    
    personalized_ranking_response = personalize_runtime.get_personalized_ranking(
        campaignArn = ranking_arn, userId = str(user_id), inputList = _input_list)
    
    # _rank stays at len(_input_list) unless an item from target_cluster is found
    _rank = len(_input_list)
    for i, item in enumerate(personalized_ranking_response['personalizedRanking']):
        item_id = item['itemId']
        tmp = int_expanded_df[int_expanded_df.ITEM_ID == int(item_id)].iloc[0]
        _fav_cluster = tmp['FAV_CLUSTER']
        # Record only the position of the first item from the target cluster
        if target_cluster == _fav_cluster and _rank == len(_input_list):
            _rank = i
        print('Id: {}, Make: {}, Model: {}, Year: {}, Price: {}, Fav: {}'.format(item_id,
         tmp['MAKE'], tmp['MODEL'], tmp['YEAR'], tmp['PRICE'], tmp['FAV']))
    return _rank

In [584]:
# Sample 25 random interactions; the items they reference form the list to re-rank
random_item_df = shuffle(int_expanded_df[['ITEM_ID','FAV', 'FAV_CLUSTER']].sample(25))
random_item_list = random_item_df['ITEM_ID'].values.astype(str).tolist()
random_item_df.head(25)

Unnamed: 0,ITEM_ID,FAV,FAV_CLUSTER
230577,22495,OLDISH-Ford-Fusion,9
745794,25791,OLDISH-Toyota-Rav4,19
452102,28885,OLDISH-Ford-Explorer,11
180151,26618,OLDISH-Toyota-Camry,7
94107,19919,NEWISH-Ford-Fusion,8
10697,27817,OLDISH-Ford-Explorer,11
691933,26258,NEWISH-Nissan-Altima,4
552821,25530,NEWISH-Nissan-Leaf,14
358567,33534,OLDISH-Nissan-Altima,5
50365,22961,OLDISH-Ford-Fusion,9


#### Try personalized ranking on a set of random items
Here we take some random items and see how well Personalize re-ranks them
for each user in a set with a known bias toward specific car clusters. Ideally,
a matching car rises as close to rank 0 as possible.
We use a curated list of users that covers each car cluster preference.

Note that some users will likely have a preference that is not
covered by the list of random items. In those cases, the best we can expect is that Personalize
re-ranks the list based on popularity.
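
When a user's preferred cluster is absent from the item set, counting that user as rank 0 pulls the average toward the best possible score. One alternative, sketched here as an assumption rather than as the notebook's method, is to average only over users whose cluster was actually present. `average_rank` is a hypothetical helper.

```python
def average_rank(ranks, list_length):
    """Average the ranks of users whose cluster was found; ignore the rest.

    A rank equal to list_length is treated as 'cluster not in the item set'.
    Returns None when no user had a matching item.
    """
    found = [r for r in ranks if r < list_length]
    if not found:
        return None
    return sum(found) / len(found)

# Illustrative ranks for four users against a 25-item list (25 means 'not found')
print(average_rank([0, 2, 25, 4], list_length=25))
```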

In [585]:
rank_total = 0
for i in range(NUM_CLUSTERS):
    user_fav = ranking_user_df.iloc[i]['FAV']
    user_fav_cluster = ranking_user_df.iloc[i]['FAV_CLUSTER']
    print('\nRanking for user that prefers: {}'.format(user_fav))
    rank = print_ranking_target_df(ranking_user_list[i], random_item_df, user_fav_cluster)
    if (random_item_df.shape[0] == rank):
        print('**desired cluster was not found in the item set')
        rank = 0 # reset to not penalize when no item was available
    else:
        print('**rank {}'.format(rank))
    rank_total += rank

# Average over the users that were ranked, not over the items in the list
print('\nRank average: {:.2}'.format(rank_total/NUM_CLUSTERS))


Ranking for user that prefers: NEWISH-Toyota-Camry

Ranking for user: 8485
Id: 35337, Make: Toyota, Model: Camry, Year: 2014, Price: 28501, Fav: NEWISH-Toyota-Camry
Id: 20883, Make: Nissan, Model: Leaf, Year: 2011, Price: 17397, Fav: OLDISH-Nissan-Leaf
Id: 33534, Make: Nissan, Model: Altima, Year: 2013, Price: 28534, Fav: OLDISH-Nissan-Altima
Id: 34717, Make: Nissan, Model: Altima, Year: 2013, Price: 30582, Fav: OLDISH-Nissan-Altima
Id: 18281, Make: Ford, Model: Explorer, Year: 2014, Price: 33208, Fav: NEWISH-Ford-Explorer
Id: 26258, Make: Nissan, Model: Altima, Year: 2014, Price: 33292, Fav: NEWISH-Nissan-Altima
Id: 17217, Make: Ford, Model: Fusion, Year: 2016, Price: 35261, Fav: NEWISH-Ford-Fusion
Id: 22824, Make: Ford, Model: Explorer, Year: 2009, Price: 14616, Fav: OLDISH-Ford-Explorer
Id: 27817, Make: Ford, Model: Explorer, Year: 2013, Price: 27265, Fav: OLDISH-Ford-Explorer
Id: 23936, Make: Toyota, Model: Prius, Year: 2012, Price: 24662, Fav: OLDISH-Toyota-Prius
Id: 19919, Make:

Id: 26258, Make: Nissan, Model: Altima, Year: 2014, Price: 33292, Fav: NEWISH-Nissan-Altima
Id: 25791, Make: Toyota, Model: Rav4, Year: 2012, Price: 22753, Fav: OLDISH-Toyota-Rav4
Id: 27240, Make: Nissan, Model: Leaf, Year: 2015, Price: 34652, Fav: NEWISH-Nissan-Leaf
Id: 35337, Make: Toyota, Model: Camry, Year: 2014, Price: 28501, Fav: NEWISH-Toyota-Camry
Id: 25530, Make: Nissan, Model: Leaf, Year: 2017, Price: 36843, Fav: NEWISH-Nissan-Leaf
Id: 22824, Make: Ford, Model: Explorer, Year: 2009, Price: 14616, Fav: OLDISH-Ford-Explorer
Id: 27817, Make: Ford, Model: Explorer, Year: 2013, Price: 27265, Fav: OLDISH-Ford-Explorer
Id: 33534, Make: Nissan, Model: Altima, Year: 2013, Price: 28534, Fav: OLDISH-Nissan-Altima
Id: 17234, Make: Toyota, Model: Camry, Year: 2010, Price: 22678, Fav: OLDISH-Toyota-Camry
Id: 28885, Make: Ford, Model: Explorer, Year: 2012, Price: 28363, Fav: OLDISH-Ford-Explorer
Id: 34717, Make: Nissan, Model: Altima, Year: 2013, Price: 30582, Fav: OLDISH-Nissan-Altima
Id: 

Id: 25791, Make: Toyota, Model: Rav4, Year: 2012, Price: 22753, Fav: OLDISH-Toyota-Rav4
Id: 34717, Make: Nissan, Model: Altima, Year: 2013, Price: 30582, Fav: OLDISH-Nissan-Altima
Id: 33534, Make: Nissan, Model: Altima, Year: 2013, Price: 28534, Fav: OLDISH-Nissan-Altima
Id: 26258, Make: Nissan, Model: Altima, Year: 2014, Price: 33292, Fav: NEWISH-Nissan-Altima
Id: 25530, Make: Nissan, Model: Leaf, Year: 2017, Price: 36843, Fav: NEWISH-Nissan-Leaf
Id: 22495, Make: Ford, Model: Fusion, Year: 2013, Price: 27441, Fav: OLDISH-Ford-Fusion
Id: 21035, Make: Ford, Model: Fusion, Year: 2013, Price: 24892, Fav: OLDISH-Ford-Fusion
Id: 27240, Make: Nissan, Model: Leaf, Year: 2015, Price: 34652, Fav: NEWISH-Nissan-Leaf
Id: 20883, Make: Nissan, Model: Leaf, Year: 2011, Price: 17397, Fav: OLDISH-Nissan-Leaf
Id: 17217, Make: Ford, Model: Fusion, Year: 2016, Price: 35261, Fav: NEWISH-Ford-Fusion
Id: 30623, Make: Ford, Model: Fusion, Year: 2010, Price: 17397, Fav: OLDISH-Ford-Fusion
Id: 18281, Make: For

Id: 20883, Make: Nissan, Model: Leaf, Year: 2011, Price: 17397, Fav: OLDISH-Nissan-Leaf
Id: 33534, Make: Nissan, Model: Altima, Year: 2013, Price: 28534, Fav: OLDISH-Nissan-Altima
Id: 26258, Make: Nissan, Model: Altima, Year: 2014, Price: 33292, Fav: NEWISH-Nissan-Altima
Id: 34717, Make: Nissan, Model: Altima, Year: 2013, Price: 30582, Fav: OLDISH-Nissan-Altima
Id: 35337, Make: Toyota, Model: Camry, Year: 2014, Price: 28501, Fav: NEWISH-Toyota-Camry
Id: 22323, Make: Toyota, Model: Prius, Year: 2011, Price: 16343, Fav: OLDISH-Toyota-Prius
Id: 25791, Make: Toyota, Model: Rav4, Year: 2012, Price: 22753, Fav: OLDISH-Toyota-Rav4
Id: 22824, Make: Ford, Model: Explorer, Year: 2009, Price: 14616, Fav: OLDISH-Ford-Explorer
Id: 23936, Make: Toyota, Model: Prius, Year: 2012, Price: 24662, Fav: OLDISH-Toyota-Prius
Id: 21035, Make: Ford, Model: Fusion, Year: 2013, Price: 24892, Fav: OLDISH-Ford-Fusion
Id: 30623, Make: Ford, Model: Fusion, Year: 2010, Price: 17397, Fav: OLDISH-Ford-Fusion
Id: 29307,

#### Try personalized ranking on a curated set of items with each car cluster covered
Here we take a curated set of items, with one item for each car cluster. Personalize
should be able to re-rank the list so that the item best matching
the user rises to rank 0.

In [586]:
rank_total = 0
for i in range(NUM_CLUSTERS):
    user_fav = ranking_user_df.iloc[i]['FAV']
    user_fav_cluster = ranking_user_df.iloc[i]['FAV_CLUSTER']
    print('\nRanking for user that prefers: {}'.format(user_fav))
    rank = print_ranking_target_df(ranking_user_list[i], ranking_item_df, user_fav_cluster)
    if (ranking_item_df.shape[0] == rank):
        print('**desired cluster was not found in the item set')
        rank = 0 # reset to not penalize when no item was available
    else:
        print('**rank {}'.format(rank))
    rank_total += rank

print('\nRank average: {:.2}'.format(rank_total/len(ranking_item_list)))


Ranking for user that prefers: NEWISH-Toyota-Camry

Ranking for user: 8485
Id: 26083, Make: Toyota, Model: Camry, Year: 2015, Price: 32542, Fav: NEWISH-Toyota-Camry
Id: 27030, Make: Ford, Model: Mustang, Year: 2016, Price: 37311, Fav: NEWISH-Ford-Mustang
Id: 28342, Make: Nissan, Model: Leaf, Year: 2011, Price: 18026, Fav: OLDISH-Nissan-Leaf
Id: 24021, Make: Nissan, Model: Rogue, Year: 2010, Price: 21995, Fav: OLDISH-Nissan-Rogue
Id: 22430, Make: Nissan, Model: Altima, Year: 2013, Price: 26351, Fav: OLDISH-Nissan-Altima
Id: 25670, Make: Ford, Model: Explorer, Year: 2015, Price: 35804, Fav: NEWISH-Ford-Explorer
Id: 25518, Make: Nissan, Model: Altima, Year: 2016, Price: 37217, Fav: NEWISH-Nissan-Altima
Id: 23142, Make: Ford, Model: Fusion, Year: 2015, Price: 29901, Fav: NEWISH-Ford-Fusion
Id: 20498, Make: Toyota, Model: Prius, Year: 2013, Price: 22501, Fav: OLDISH-Toyota-Prius
Id: 30762, Make: Nissan, Model: Rogue, Year: 2014, Price: 26870, Fav: NEWISH-Nissan-Rogue
Id: 19088, Make: Toyot

Id: 24021, Make: Nissan, Model: Rogue, Year: 2010, Price: 21995, Fav: OLDISH-Nissan-Rogue
Id: 19088, Make: Toyota, Model: Rav4, Year: 2013, Price: 28700, Fav: OLDISH-Toyota-Rav4
Id: 18590, Make: Nissan, Model: Leaf, Year: 2014, Price: 28751, Fav: NEWISH-Nissan-Leaf
Id: 23056, Make: Toyota, Model: Camry, Year: 2011, Price: 25610, Fav: OLDISH-Toyota-Camry
Id: 28342, Make: Nissan, Model: Leaf, Year: 2011, Price: 18026, Fav: OLDISH-Nissan-Leaf
Id: 27030, Make: Ford, Model: Mustang, Year: 2016, Price: 37311, Fav: NEWISH-Ford-Mustang
Id: 22430, Make: Nissan, Model: Altima, Year: 2013, Price: 26351, Fav: OLDISH-Nissan-Altima
Id: 25670, Make: Ford, Model: Explorer, Year: 2015, Price: 35804, Fav: NEWISH-Ford-Explorer
Id: 20498, Make: Toyota, Model: Prius, Year: 2013, Price: 22501, Fav: OLDISH-Toyota-Prius
Id: 29605, Make: Toyota, Model: Prius, Year: 2019, Price: 49461, Fav: NEWISH-Toyota-Prius
Id: 25566, Make: Ford, Model: Explorer, Year: 2011, Price: 20757, Fav: OLDISH-Ford-Explorer
Id: 26083,

Id: 22809, Make: Toyota, Model: Rav4, Year: 2016, Price: 36958, Fav: NEWISH-Toyota-Rav4
Id: 21096, Make: Toyota, Model: Sienna, Year: 2010, Price: 21760, Fav: OLDISH-Toyota-Sienna
Id: 30762, Make: Nissan, Model: Rogue, Year: 2014, Price: 26870, Fav: NEWISH-Nissan-Rogue
Id: 27301, Make: Ford, Model: Mustang, Year: 2011, Price: 25569, Fav: OLDISH-Ford-Mustang
Id: 17073, Make: Toyota, Model: Sienna, Year: 2015, Price: 37126, Fav: NEWISH-Toyota-Sienna
Id: 22430, Make: Nissan, Model: Altima, Year: 2013, Price: 26351, Fav: OLDISH-Nissan-Altima
Id: 19088, Make: Toyota, Model: Rav4, Year: 2013, Price: 28700, Fav: OLDISH-Toyota-Rav4
Id: 24021, Make: Nissan, Model: Rogue, Year: 2010, Price: 21995, Fav: OLDISH-Nissan-Rogue
Id: 25518, Make: Nissan, Model: Altima, Year: 2016, Price: 37217, Fav: NEWISH-Nissan-Altima
Id: 29605, Make: Toyota, Model: Prius, Year: 2019, Price: 49461, Fav: NEWISH-Toyota-Prius
Id: 27030, Make: Ford, Model: Mustang, Year: 2016, Price: 37311, Fav: NEWISH-Ford-Mustang
Id: 28

Id: 27301, Make: Ford, Model: Mustang, Year: 2011, Price: 25569, Fav: OLDISH-Ford-Mustang
Id: 21096, Make: Toyota, Model: Sienna, Year: 2010, Price: 21760, Fav: OLDISH-Toyota-Sienna
Id: 29605, Make: Toyota, Model: Prius, Year: 2019, Price: 49461, Fav: NEWISH-Toyota-Prius
Id: 25566, Make: Ford, Model: Explorer, Year: 2011, Price: 20757, Fav: OLDISH-Ford-Explorer
Id: 24021, Make: Nissan, Model: Rogue, Year: 2010, Price: 21995, Fav: OLDISH-Nissan-Rogue
Id: 21116, Make: Ford, Model: Fusion, Year: 2012, Price: 28209, Fav: OLDISH-Ford-Fusion
Id: 25670, Make: Ford, Model: Explorer, Year: 2015, Price: 35804, Fav: NEWISH-Ford-Explorer
Id: 23056, Make: Toyota, Model: Camry, Year: 2011, Price: 25610, Fav: OLDISH-Toyota-Camry
Id: 18590, Make: Nissan, Model: Leaf, Year: 2014, Price: 28751, Fav: NEWISH-Nissan-Leaf
Id: 23142, Make: Ford, Model: Fusion, Year: 2015, Price: 29901, Fav: NEWISH-Ford-Fusion
**rank 0

Ranking for user that prefers: NEWISH-Nissan-Leaf

Ranking for user: 16094
Id: 18590, Make

## Exercise the hrnn campaign
Here we try out the HRNN campaign. We ask Personalize for recommendations for a particular user, hoping
it detects whether this user likes old or new cars and returns a list accordingly.
In the best case, the recommended list of cars entirely matches the user's preferred car
cluster.
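
To put a number on "entirely matches", one option is the fraction of recommended items that fall in the user's preferred cluster. The sketch below computes that fraction from plain Python data. The function name, the item ids, and the cluster numbers are illustrative assumptions, not values from the dataset.

```python
def cluster_match_rate(recommended_item_ids, cluster_by_item, preferred_cluster):
    """Fraction of recommended items that fall in the user's preferred cluster."""
    if not recommended_item_ids:
        return 0.0
    hits = sum(1 for item_id in recommended_item_ids
               if cluster_by_item.get(item_id) == preferred_cluster)
    return hits / len(recommended_item_ids)

# Hypothetical recommendations and item-to-cluster lookup
recommended = [1, 2, 3, 4]
clusters = {1: 9, 2: 9, 3: 0, 4: 2}
print(cluster_match_rate(recommended, clusters, preferred_cluster=9))
```

With the notebook's data, the lookup could be built from the `ITEM_ID` and `FAV_CLUSTER` columns of `int_expanded_df`, and the ids taken from `response['itemList']` after converting them to integers.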

In [587]:
users_to_try = int_expanded_df.sample(10)
users_to_try[['USER_ID','FAV']].head(3)

Unnamed: 0,USER_ID,FAV
239788,24654,OLDISH-Ford-Fusion
699494,7022,NEWISH-Toyota-Prius
127207,22641,NEWISH-Ford-Explorer


In [588]:
show_user_interaction_history(int_expanded_df, users_to_try.iloc[0]['USER_ID'])

(20, 16)


Unnamed: 0,USER_ID,ITEM_ID,WHEN,FAV,YEAR,PRICE,MILEAGE
239796,24654,15443,2019-07-11 13:47:00,OLDISH-Ford-Fusion,2013,30925,98618
239787,24654,27255,2019-07-11 13:49:00,OLDISH-Ford-Fusion,2013,28172,90303
239782,24654,19500,2019-07-11 13:53:00,OLDISH-Ford-Fusion,2013,27413,98248
239798,24654,16140,2019-07-11 13:59:00,OLDISH-Ford-Fusion,2012,20806,110471
239791,24654,17822,2019-07-11 14:07:00,OLDISH-Ford-Fusion,2011,20877,128695
239794,24654,17444,2019-07-11 14:14:18,OLDISH-Ford-Fusion,2013,26746,95360
239785,24654,24930,2019-07-11 14:16:18,OLDISH-Ford-Fusion,2011,21176,122879
239780,24654,27660,2019-07-11 14:17:00,OLDISH-Ford-Fusion,2012,23097,108977
239789,24654,22805,2019-07-11 14:20:18,OLDISH-Ford-Fusion,2013,24976,94450
239792,24654,23571,2019-07-11 14:26:18,OLDISH-Ford-Fusion,2013,29013,92089


In [589]:
for i in range(10):
    user_id = str(users_to_try.iloc[i]['USER_ID'])
    fav     = users_to_try.iloc[i]['FAV']
    print('Getting recommendations for user: {}, who likes: {}'.format(user_id, fav))
    response = personalize_runtime.get_recommendations(campaignArn=hrnn_arn, 
                                                       userId=user_id, 
                                                       numResults=10)
    items = response['itemList']
    for item in items:
        print_item(int(item['itemId']))
    print('')

Getting recommendations for user: 24654, who likes: OLDISH-Ford-Fusion
Id: 24497, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2012, Age: 30
Id: 27549, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2012, Age: 30
Id: 23906, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2010, Age: 30
Id: 26110, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2011, Age: 43
Id: 28824, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2012, Age: 30
Id: 22609, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2009, Age: 36
Id: 20203, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2010, Age: 30
Id: 29032, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2013, Age: 41
Id: 22744, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2012, Age: 39
Id: 25517, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2012, Age: 41

Getting recommendations for user: 7022, who likes: NEWISH-Toyota-Prius
Id: 27242, Make: Toyota, Mo

Id: 31294, Make: Toyota, Model: Sienna, Fav: NEWISH-Toyota-Sienna, Year: 2016, Age: 40
Id: 25612, Make: Toyota, Model: Sienna, Fav: NEWISH-Toyota-Sienna, Year: 2018, Age: 40
Id: 26802, Make: Nissan, Model: Rogue, Fav: NEWISH-Nissan-Rogue, Year: 2014, Age: 35
Id: 20238, Make: Nissan, Model: Rogue, Fav: NEWISH-Nissan-Rogue, Year: 2015, Age: 38
Id: 31349, Make: Toyota, Model: Sienna, Fav: NEWISH-Toyota-Sienna, Year: 2014, Age: 40
Id: 25848, Make: Toyota, Model: Sienna, Fav: NEWISH-Toyota-Sienna, Year: 2014, Age: 35
Id: 21680, Make: Nissan, Model: Rogue, Fav: NEWISH-Nissan-Rogue, Year: 2014, Age: 35



## Exercise the SIMS campaign
Here we experiment with the SIMS campaign. We loop through a list of items that have at
least some historical interactions.
For each car, we expect the similar cars to be close in age and to share its make and model.
We use the car clusters to check this: ideally, every car in the list that Personalize returns
comes from the same cluster as the input car.

In [590]:
items_to_try = int_expanded_df.sample(10)
items_to_try[['ITEM_ID','FAV']].head(5)

Unnamed: 0,ITEM_ID,FAV
265454,24956,OLDISH-Toyota-Camry
122653,19043,NEWISH-Ford-Explorer
388007,18244,OLDISH-Ford-Fusion
683076,25636,OLDISH-Ford-Fusion
84390,26583,OLDISH-Toyota-Prius


In [591]:
desired_num_results = 10
for i in range(items_to_try.shape[0]):
    item_id     = str(items_to_try.iloc[i]['ITEM_ID'])
    fav         = items_to_try.iloc[i]['FAV']
    fav_cluster = items_to_try.iloc[i]['FAV_CLUSTER']
    
    print('Getting items similar to: {}, which is a: {}'.format(item_id, fav))
    response = personalize_runtime.get_recommendations(campaignArn=sims_arn, 
                                                       itemId=item_id, 
                                                       numResults=desired_num_results)
    items = response['itemList']
    match = 0
    actual_num_results = len(items)
    for item in items:
        # Look up the cluster of the recommended item, not of the query item
        _curr_cluster = int_expanded_df[int_expanded_df.ITEM_ID == int(item['itemId'])].iloc[0]['FAV_CLUSTER']
        if fav_cluster == _curr_cluster:
            match += 1
        print_item(int(item['itemId']))
    if actual_num_results:
        print('Matched {:.2} ({}/{})'.format(match/actual_num_results, match, actual_num_results))
    else:
        print('No similar items returned')
    print('')

Getting items similar to: 24956, which is a: OLDISH-Toyota-Camry
Id: 10573, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2009, Age: 41
Id: 35665, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 39
Id: 23099, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2013, Age: 40
Id: 34976, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2013, Age: 38
Id: 22308, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2013, Age: 43
Id: 19593, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 34
Id: 19727, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 35
Id: 18170, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2012, Age: 44
Id: 16741, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2013, Age: 42
Id: 35351, Make: Toyota, Model: Camry, Fav: OLDISH-Toyota-Camry, Year: 2011, Age: 39
Matched 1.0 (10/10)

Getting items similar to: 19043, which is a: NEWISH-Ford-Explore

Id: 18810, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2015, Age: 40
Matched 1.0 (10/10)



## Exercise the popularity campaign
Personalize provides a baseline recommender based on the simple popularity of an item.
Here we compare its results with our own definition of "popular".

Our popularity is simply the total count of interactions for each item.
We expect significant overlap between our list and the one from Personalize.
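
The comparison can be illustrated without pandas or Personalize. The sketch below counts interactions per item with `collections.Counter`, takes the top `n`, and counts how many of them also appear in a second list. The helper names and sample ids are illustrative assumptions. Item ids returned by a campaign are strings, so both lists should be converted to a common type before comparing.

```python
from collections import Counter

def top_n_by_count(item_ids, n):
    """Return the n most frequently occurring item ids, most frequent first."""
    return [item_id for item_id, _ in Counter(item_ids).most_common(n)]

def overlap_count(ours, theirs):
    """Number of items present in both lists, ignoring order."""
    return len(set(ours) & set(theirs))

interactions = [101, 102, 101, 103, 101, 102, 104]  # illustrative ITEM_ID column
ours = top_n_by_count(interactions, 2)
theirs = [int(item_id) for item_id in ['102', '105']]  # illustrative campaign result
print(ours, overlap_count(ours, theirs))
```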

#### First let's see the results from Personalize

In [592]:
NUM_MOST_POPULAR = 10

popularity_response = personalize_runtime.get_recommendations(campaignArn=pop_arn, 
                                                              userId='0', 
                                                              numResults=NUM_MOST_POPULAR)
pop_items = popularity_response['itemList']
for item in pop_items:
    print_item(int(item['itemId']))
    
# Item ids come back as strings; convert to int to match the ITEM_ID column
personalized_pop = [int(p['itemId']) for p in pop_items]

Id: 26830, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2013, Age: 35
Id: 24573, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2016, Age: 39
Id: 26135, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2013, Age: 41
Id: 26316, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2009, Age: 35
Id: 27095, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2011, Age: 44
Id: 25478, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2013, Age: 41
Id: 25863, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2017, Age: 33
Id: 24231, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2016, Age: 36
Id: 25248, Make: Ford, Model: Fusion, Fav: OLDISH-Ford-Fusion, Year: 2013, Age: 35
Id: 24831, Make: Ford, Model: Explorer, Fav: NEWISH-Ford-Explorer, Year: 2014, Age: 45


#### Now let's see the actual popularity counts of the historical interactions

In [593]:
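# value_counts() orders items by interaction count, most popular first.
# Building the frame from its index keeps the column name 'index' across pandas versions.
most_popular = pd.DataFrame({'index': int_expanded_df['ITEM_ID'].value_counts().index})
ten_most_popular = most_popular.head(10)
ten_most_popular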

Unnamed: 0,index
0,24573
1,26830
2,26135
3,26316
4,25248
5,25863
6,25478
7,27095
8,24231
9,25869


#### Now compare the two lists

In [594]:
overlap = ten_most_popular[ten_most_popular['index'].isin(personalized_pop)].shape[0]
print('We asked Personalize for {} most popular.'.format(NUM_MOST_POPULAR))
print('Of those, {} are truly most popular.'.format(overlap))
not_overlap = ten_most_popular[~ten_most_popular['index'].isin(personalized_pop)]
print('\nThese {} were truly top 10, but Personalize did NOT think so:'.format(not_overlap.shape[0]))
not_overlap.head()

We asked Personalize for 10 most popular.
Of those, 9 are truly most popular.

These 1 were truly top 10, but Personalize did NOT think so:


Unnamed: 0,index
9,25869


## Use real time events
Here we use the event tracker mechanism of Personalize to send events on the fly after
a campaign has been deployed. We then show the impact on the recommendations.
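
Events do not have to be sent one call at a time. The sketch below groups click events into batches that could each be passed as the `eventList` argument of `put_events`. The batch size of 10 is an assumption about the per-request limit and should be checked against the PutEvents API reference. `build_event_batches` is a hypothetical helper.

```python
import json

MAX_EVENTS_PER_CALL = 10  # assumption: per-request limit on eventList entries

def build_event_batches(item_ids, sent_at, event_type='EVENT_TYPE',
                        batch_size=MAX_EVENTS_PER_CALL):
    """Group click events into lists suitable for the eventList parameter."""
    events = [{'sentAt': sent_at,
               'eventType': event_type,
               'properties': json.dumps({'itemId': str(item_id)})}
              for item_id in item_ids]
    # Slice the flat event list into consecutive chunks of at most batch_size
    return [events[i:i + batch_size] for i in range(0, len(events), batch_size)]

batches = build_event_batches(range(25), sent_at=1562854025)
print([len(b) for b in batches])
```

Each batch would then be sent with a single `personalize_events.put_events(...)` call, using the same `trackingId`, `userId`, and `sessionId` arguments as a single-event call.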

In [595]:
response = personalize.create_event_tracker(
    name='CarClickTracker',
    datasetGroupArn=dg_arn
)
print(response['eventTrackerArn'])
print(response['trackingId'])

TRACKING_ID = response['trackingId']

arn:aws:personalize:us-east-1:355151823911:event-tracker/1574f70c
b55db42b-7896-40a4-8f5b-e1c461f61470


In [596]:
session_dict = {}

In [597]:
import uuid

def send_car_click(user_id, item_id, ts):
    """
    Simulates a click to send an event to Amazon Personalize's Event Tracker
    """
    # Configure Session: reuse the user's session id, creating one on first use
    if user_id not in session_dict:
        session_dict[user_id] = str(uuid.uuid1())
    session_ID = session_dict[user_id]
        
    # Configure Properties:
    event = {
        'itemId': str(item_id)
    }
    event_json = json.dumps(event)
        
    # Make Call
    personalize_events.put_events(
        trackingId = TRACKING_ID,
        userId     = str(user_id),
        sessionId  = session_ID,
        eventList  = [{
            'sentAt': ts,
            'eventType': 'EVENT_TYPE',
            'properties': event_json
            }]
    )

In [598]:
def send_car_clicks(user_id, items):
    # TODO: send all events in a single array instead of one call for each item
    for item in items:
        send_car_click(user_id, item, time.time())

In [599]:
def recommend_cars(user_id):
    response = personalize_runtime.get_recommendations(campaignArn=hrnn_arn, 
                                                       userId=str(user_id), 
                                                       numResults=25)
    items = response['itemList']
    for item in items:
        print_item(int(item['itemId']))
    print('')

In [600]:
sample_user = int_expanded_df.sample(1).iloc[0]['USER_ID']
sample_user_cluster = int_expanded_df[int_expanded_df.USER_ID == sample_user].iloc[0]['FAV_CLUSTER']
sample_user_fav = int_expanded_df[int_expanded_df.USER_ID == sample_user].iloc[0]['FAV']
print('user: {}, cluster: {}, fav: {}'.format(sample_user, sample_user_cluster, sample_user_fav))

user: 15688, cluster: 12, fav: NEWISH-Toyota-Prius


In [601]:
new_cluster = sample_user_cluster + 1
if (new_cluster == NUM_CLUSTERS):
    new_cluster = 0
new_fav = int_expanded_df[int_expanded_df.FAV_CLUSTER == new_cluster].iloc[0]['FAV']
print('new cluster: {}, new fav: {}'.format(new_cluster, new_fav))

new cluster: 13, new fav: OLDISH-Toyota-Prius


In [602]:
print('Before any real time events, Personalize should recommend {} cars...\n'.format(sample_user_fav))
recommend_cars(sample_user)

Before any real time events, Personalize should recommend NEWISH-Toyota-Prius cars...

Id: 27242, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2014, Age: 38
Id: 22172, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2015, Age: 35
Id: 25486, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2016, Age: 40
Id: 23346, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2014, Age: 36
Id: 27600, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2014, Age: 46
Id: 24546, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2014, Age: 51
Id: 22654, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2016, Age: 38
Id: 21029, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2014, Age: 37
Id: 23085, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2017, Age: 32
Id: 26417, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius, Year: 2014, Age: 37
Id: 21006, Make: Toyota, Model: Prius, Fav: NEWISH-Toyota-Prius

In [621]:
new_car_cluster = int_expanded_df[int_expanded_df.FAV_CLUSTER == new_cluster].sample(100)
new_car_cluster[['FAV','ITEM_ID','YEAR','PRICE']].head(3)

Unnamed: 0,FAV,ITEM_ID,YEAR,PRICE
24757,OLDISH-Toyota-Prius,19705,2011,22404
631375,OLDISH-Toyota-Prius,26571,2012,25370
395750,OLDISH-Toyota-Prius,29168,2012,22500


In [622]:
new_items_clicked = new_car_cluster['ITEM_ID'].values
new_items_clicked

array([19705, 26571, 29168, 15761, 26795, 29665, 21651, 20170, 20536,
       29606, 23591, 20439, 28707, 18529, 28784, 19886, 22589, 35073,
       36365, 17861, 27209, 33885, 24701, 26457, 22938, 34642, 33591,
       21111, 28442, 19295, 24991, 17726, 24911, 26571, 29936, 29199,
       20103, 31233, 23936, 30726, 22984, 21990, 15599, 25643, 19022,
       19937, 26486, 33965, 28310, 27037, 24739, 22573, 26186, 19203,
       19884, 25179, 19937, 18645, 18263, 30599, 22889, 21513, 27448,
       31949, 30009, 24462, 22253, 36571, 27439, 24777, 26858, 28779,
       24578, 28565, 15961, 23815, 26106, 18920, 24107, 25124, 24788,
       30096, 29654, 29581, 25938, 23398, 23809, 27691, 24616, 25676,
       22025, 26269, 36915, 32885, 21220, 20678, 30909, 25828, 23939,
       31993])

In [623]:
send_car_clicks(sample_user, new_items_clicked)

In [624]:
int_expanded_df[int_expanded_df.USER_ID == sample_user]['FAV'].value_counts()

NEWISH-Toyota-Prius    60
Name: FAV, dtype: int64

In [625]:
print('Now this same user has started to like {} cars.'.format(new_fav))
print("Let's see if Personalize picks up on this real time change in intent...")
recommend_cars(sample_user)

Now this same user has started to like OLDISH-Toyota-Prius cars.
Let's see if Personalize picks up on this real time change in intent...
Id: 25606, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2012, Age: 41
Id: 22372, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2013, Age: 36
Id: 26576, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2010, Age: 45
Id: 24787, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2011, Age: 41
Id: 27579, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2013, Age: 42
Id: 26650, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2011, Age: 36
Id: 25179, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2011, Age: 46
Id: 26569, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2013, Age: 37
Id: 22223, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2013, Age: 39
Id: 23473, Make: Toyota, Model: Prius, Fav: OLDISH-Toyota-Prius, Year: 2013, Age: 45
Id: 28223, Mak

## Exercise the hrnn-metadata campaign
Here we try out the HRNN-Metadata campaign.
We ask Personalize for recommendations for a particular user, hoping
it detects whether this user likes old or new cars and returns a list accordingly.

In [626]:
users_to_try = int_expanded_df.sample(10)
users_to_try[['USER_ID','FAV']].head(3)

Unnamed: 0,USER_ID,FAV
515757,18856,NEWISH-Toyota-Camry
588817,19548,NEWISH-Ford-Fusion
89020,18739,NEWISH-Ford-Fusion


In [627]:
show_user_interaction_history(int_expanded_df, users_to_try.iloc[0]['USER_ID'])

(40, 16)


Unnamed: 0,USER_ID,ITEM_ID,WHEN,FAV,YEAR,PRICE,MILEAGE
515737,18856,22091,2019-07-11 14:02:44,NEWISH-Toyota-Camry,2015,35998,66012
515743,18856,17263,2019-07-11 14:03:12,NEWISH-Toyota-Camry,2014,28432,75172
515722,18856,22290,2019-07-11 14:04:01,NEWISH-Toyota-Camry,2014,32936,78919
515734,18856,26255,2019-07-11 14:04:44,NEWISH-Toyota-Camry,2015,28050,68901
515736,18856,24234,2019-07-11 14:05:12,NEWISH-Toyota-Camry,2015,31225,61548
515732,18856,19874,2019-07-11 14:06:01,NEWISH-Toyota-Camry,2015,29406,63872
515738,18856,17617,2019-07-11 14:08:44,NEWISH-Toyota-Camry,2019,47918,702
515757,18856,20433,2019-07-11 14:09:12,NEWISH-Toyota-Camry,2015,35358,69157
515759,18856,14791,2019-07-11 14:10:01,NEWISH-Toyota-Camry,2018,44687,22971
515740,18856,24232,2019-07-11 14:13:01,NEWISH-Toyota-Camry,2014,31755,78485


In [628]:
for i in range(10):
    user_id = str(users_to_try.iloc[i]['USER_ID'])
    fav     = users_to_try.iloc[i]['FAV']
    print('Getting recommendations for user: {}, who likes: {}'.format(user_id, fav))
    response = personalize_runtime.get_recommendations(campaignArn=hrnn_metadata_arn, 
                                                       userId=user_id, 
                                                       numResults=10)
    items = response['itemList']
    for item in items:
        print_item(int(item['itemId']))
    print('')

Getting recommendations for user: 18856, who likes: NEWISH-Toyota-Camry
Id: 19916, Make: Toyota, Model: Camry, Fav: NEWISH-Toyota-Camry, Year: 2015, Age: 40
Id: 23048, Make: Toyota, Model: Camry, Fav: NEWISH-Toyota-Camry, Year: 2015, Age: 32
Id: 22662, Make: Toyota, Model: Camry, Fav: NEWISH-Toyota-Camry, Year: 2014, Age: 32
Id: 26319, Make: Toyota, Model: Camry, Fav: NEWISH-Toyota-Camry, Year: 2015, Age: 37
Id: 26891, Make: Toyota, Model: Camry, Fav: NEWISH-Toyota-Camry, Year: 2015, Age: 32
Id: 26007, Make: Toyota, Model: Camry, Fav: NEWISH-Toyota-Camry, Year: 2015, Age: 43
Id: 22537, Make: Toyota, Model: Camry, Fav: NEWISH-Toyota-Camry, Year: 2015, Age: 35
Id: 29601, Make: Toyota, Model: Camry, Fav: NEWISH-Toyota-Camry, Year: 2015, Age: 41
Id: 23963, Make: Toyota, Model: Camry, Fav: NEWISH-Toyota-Camry, Year: 2015, Age: 33
Id: 18478, Make: Toyota, Model: Camry, Fav: NEWISH-Toyota-Camry, Year: 2015, Age: 37

Getting recommendations for user: 19548, who likes: NEWISH-Ford-Fusion
Id: 25