# Third Environment : 
## Recommender System - Implicit Feedback 
_Mariem Kachouri & Dinara Veshchezerova_

In [1]:
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
import numpy as np
import gc
import requests 
from time import sleep
import json
import tensorflow as tf
from keras.models import Model, Sequential
from keras.layers import Embedding, Flatten, Input, Dense, Dropout, Concatenate, Lambda, Dot

from keras.regularizers import l2

Using TensorFlow backend.


In [2]:
## Importing Useful Functions predefined
%run useful_functions.py
from useful_functions import *

## Reset the Environment & Getting initial Data

In [3]:
user_id = '0H3BRZ9M0BQP3SFPSCL3'
base_url ='http://35.180.178.243/'
url_reset=base_url+"reset"
url_predict = base_url+'predict'
params = {'user_id':user_id}
r = requests.get(url=url_reset,params=params) # get history of rating
data = r.json()
data.keys()

dict_keys(['nb_items', 'state_history', 'rewards_history', 'action_history', 'next_state', 'nb_users'])

Remarques:
- Ordre: State => Action ==> Rewards 
- action_history : position of item recommended 
- reward_history : if bought => price of that item if not => 0
- The initial state returns the next_user_id for first items recommendation 

_Nota bene_ : In the training, we need to score all sublist of the available list ata that particular state for a given user an take the argmax defining the item_position.

__Interpretation:__<br>
- 1st value: user_id 
- 2nd value: item_id
- 3rd value: item_price
- 4rd, etc : features that can be related to users or items ( check if values change from item to item or from user to user)
==> For each line, we have info about a specific item <br>

_Prediciton output_ <br> The prediciton is the index of the item recommended for that user on the state t, so if we recommend the item_id =4 for user =28, we return the index of this item in the given list ( inut of network) ==> argmax  
<br>
One user has a certain proba to buy an item: 
- if a user doesn't buy an item, reward =0
- if someone buys an item, the reward = price of item

_Nota Bene_
<br>
Predict Function:
- reward 
- state = vector contains features for all items and users available.
    One feature j encodes 4 things  <br> state[j][0]= user, state[j][1] = item , state[j][2] = prices and state[j][3]= variables that encode additional information 

In [4]:
nb_users = data['nb_users'] # 100
nb_items = data['nb_items'] # 30

action_history = data['action_history']
state_history = data['state_history']
rewards_history = data['rewards_history']
next_state = data['next_state']

print('Number of users {}'.format(nb_users))
print('Number of items {}'.format(nb_items))
print('Nb of items available in the next state to predict: {}'.format(len(next_state)))

Number of users 100
Number of items 30
Nb of items available in the next state to predict: 29


__Remarks:__ <br>
We need to be able to proedict items on new users ==> Cold Start Problem. 
- Idea 1: Search for similiar user profiles based on user features only
- Idea 2: Search for similar use rprofiles based on user features and item features

### Retreiving user ids from the state history 


In [5]:
users_ids = list(zip(*list(list(zip(*state_history))[0])))[0]
print('Nb of user Ids: ', len(users_ids))

Nb of user Ids:  200


### Constructing the positive data / negative data for training

- Positive items are items that were bought by users at a given state 
- An item is bought whenever it has been recommended (cf. action_history for item positions recommended for each user ) and has a positive reward > 0. In that case, the reward is the price item that was bought.

#### Retrieving positive rewards

In [6]:
pos_rewards = compute_pos_rewards(rewards_history)
print('Number of positive rewards:', len(list(pos_rewards.keys())))

Number of positive rewards: 56


#### Creating positive DataFrame with useful columns

In [7]:
pos_data = create_pos_data(pos_rewards,state_history,action_history)
pos_data.head(10)

Unnamed: 0,user_id,item_id,feat_users,pos_items
0,32,12,"[1.1806423144867988, 1.5568314111775972]",[12]
1,27,4,"[1.1251761464092285, 0.07990664573756712]",[4]
2,20,12,"[2.034459066599278, 1.8831918709363333]","[12, 19]"
3,53,26,"[1.3750243939570619, 0.09872131728706157]","[26, 11]"
4,47,28,"[1.6685096967568889, 1.3387683610407908]","[28, 23]"
5,17,26,"[1.0715302452505464, 0.13321528616412603]",[26]
6,35,10,"[0.4235280995174412, 2.524573897879229]",[10]
7,57,9,"[1.8066458341008933, -0.047780364773431394]","[9, 1]"
8,60,23,"[1.7308233079966897, 1.544679869684752]",[23]
9,85,17,"[0.42458554507811497, 1.3111569340734799]",[17]


## Fitting 1st model  without New Users

In [8]:
deep_match_model, deep_triplet_model = build_models(nb_users, nb_items, user_dim=32,
                                                    item_dim= 15, n_hidden =2, hidden_size=64,
                                                    dropout=0.1,l2_reg=0)

In [9]:
print(deep_triplet_model.summary())

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
user_input (InputLayer)         (None, 1)            0                                            
__________________________________________________________________________________________________
positive_item_input (InputLayer (None, 1)            0                                            
__________________________________________________________________________________________________
negative_item_input (InputLayer (None, 1)            0                                            
__________________________________________________________________________________________________
user_embedding (Embedding)      (None, 1, 32)        3200        user_input[0][0]                 
__________________________________________________________________________________________________
item_embed

In [10]:
deep_triplet_model.compile(loss=identity_loss, optimizer='adam')
fake_y = np.ones_like(pos_data['user_id'])

n_epochs = 50

for i in range(n_epochs):
    # Sample new negatives to build different triplets at each epoch
    triplet_inputs = sample_triplets(pos_data,random_seed=i)

    # Fit the model incrementally by doing a single pass over the
    # sampled triplets.
    model_triplet = deep_triplet_model.fit(triplet_inputs, fake_y, shuffle=True, 
                                           batch_size=32,
                                           epochs=1, 
                                           verbose=2)

#     # Monitor the convergence of the model
#     test_auc = average_roc_auc(deep_match_model, pos_data, pos_data_test)
    print("Epoch %d/%d:"% (i + 1, n_epochs))

Epoch 1/1
 - 1s - loss: 1.0000
Epoch 1/50:
Epoch 1/1
 - 0s - loss: 0.9994
Epoch 2/50:
Epoch 1/1
 - 0s - loss: 1.0001
Epoch 3/50:
Epoch 1/1
 - 0s - loss: 0.9991
Epoch 4/50:
Epoch 1/1
 - 0s - loss: 0.9985
Epoch 5/50:
Epoch 1/1
 - 0s - loss: 1.0015
Epoch 6/50:
Epoch 1/1
 - 0s - loss: 0.9979
Epoch 7/50:
Epoch 1/1
 - 0s - loss: 0.9975
Epoch 8/50:
Epoch 1/1
 - 0s - loss: 0.9949
Epoch 9/50:
Epoch 1/1
 - 0s - loss: 0.9960
Epoch 10/50:
Epoch 1/1
 - 0s - loss: 0.9959
Epoch 11/50:
Epoch 1/1
 - 0s - loss: 0.9970
Epoch 12/50:
Epoch 1/1
 - 0s - loss: 0.9927
Epoch 13/50:
Epoch 1/1
 - 0s - loss: 0.9935
Epoch 14/50:
Epoch 1/1
 - 0s - loss: 0.9841
Epoch 15/50:
Epoch 1/1
 - 0s - loss: 0.9866
Epoch 16/50:
Epoch 1/1
 - 0s - loss: 0.9929
Epoch 17/50:
Epoch 1/1
 - 0s - loss: 0.9861
Epoch 18/50:
Epoch 1/1
 - 0s - loss: 0.9851
Epoch 19/50:
Epoch 1/1
 - 0s - loss: 0.9748
Epoch 20/50:
Epoch 1/1
 - 0s - loss: 0.9701
Epoch 21/50:
Epoch 1/1
 - 0s - loss: 0.9767
Epoch 22/50:
Epoch 1/1
 - 0s - loss: 0.9667
Epoch 23/5

## 1st Model : Computing Predicitions

In [11]:
nb_iters = 1000
rewards = 0
nb_reward_pos=0
new_users=[]

for i in range(nb_iters):
    sleep(0.1) # sleep to let the API breathe and allow others to call requests
    next_user = np.asarray([next_state[0][0] for i in range(len(next_state))])
    list_items = np.asarray(list(list(zip(*next_state))[1]))
    
    predictions = deep_match_model.predict([next_user, list_items])
    recommended_item = np.argmax(predictions)

    params['recommended_item'] = recommended_item 
    r=requests.get(url=url_predict,params=params)
    d=r.json()
    reward= d['reward'] # previous reward for the recommended item predicted
    
    # check how many times the item recommended was actually bought
    if reward > 0 : 
        nb_reward_pos+=1
    
    # check how many times we had to predict for a new user
    else:
        if next_state[0][0] not in list(pos_data.user_id.unique()):
            new_users.append(next_state[0][0])
    
    print('user: %s | recommended item position: %s | reward: %s' % (next_user[0],params['recommended_item'],d['reward']))
    next_state = d['state']
    
    rewards += reward
    
print('Average reward: ', rewards/nb_iters)
print('Percentage of positive rewards: ', 100*(nb_reward_pos/nb_iters), '%')

user: 38 | recommended item position: 18 | reward: 0
user: 20 | recommended item position: 17 | reward: 530.1222904842982
user: 6 | recommended item position: 19 | reward: 127.52225704880965
user: 54 | recommended item position: 19 | reward: 127.52225704880965
user: 66 | recommended item position: 19 | reward: 0
user: 6 | recommended item position: 18 | reward: 0
user: 20 | recommended item position: 1 | reward: 0
user: 8 | recommended item position: 19 | reward: 127.52225704880965
user: 58 | recommended item position: 19 | reward: 127.52225704880965
user: 53 | recommended item position: 18 | reward: 127.52225704880965
user: 20 | recommended item position: 1 | reward: 0
user: 39 | recommended item position: 19 | reward: 0
user: 79 | recommended item position: 18 | reward: 127.52225704880965
user: 31 | recommended item position: 18 | reward: 127.52225704880965
user: 97 | recommended item position: 18 | reward: 0
user: 13 | recommended item position: 18 | reward: 0
user: 80 | recommended

user: 88 | recommended item position: 17 | reward: 0
user: 68 | recommended item position: 19 | reward: 0
user: 34 | recommended item position: 18 | reward: 530.1222904842982
user: 28 | recommended item position: 1 | reward: 0
user: 17 | recommended item position: 19 | reward: 0
user: 5 | recommended item position: 19 | reward: 0
user: 35 | recommended item position: 18 | reward: 127.52225704880965
user: 69 | recommended item position: 19 | reward: 0
user: 49 | recommended item position: 19 | reward: 0
user: 4 | recommended item position: 19 | reward: 127.52225704880965
user: 20 | recommended item position: 1 | reward: 0
user: 96 | recommended item position: 18 | reward: 127.52225704880965
user: 28 | recommended item position: 1 | reward: 148.45669210471638
user: 43 | recommended item position: 19 | reward: 127.52225704880965
user: 7 | recommended item position: 18 | reward: 0
user: 78 | recommended item position: 12 | reward: 473.31218421752004
user: 57 | recommended item position: 17

ConnectionError: HTTPConnectionPool(host='35.180.178.243', port=80): Max retries exceeded with url: /predict?recommended_item=10&user_id=0H3BRZ9M0BQP3SFPSCL3 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001E7CD8D4390>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',))

In [21]:
print('Nb of unique new users: {}'.format(len(list(set(new_users) - set(pos_data.user_id.unique().tolist())))))
print('Out of the 1000 steps, %.2f ot the times, we had to predict for new users' %((len(new_users)/1000)*100))

Nb of unique new users: 60
Out of the 1000 steps, 46.80 ot the times, we had to predict for new users


__Interpretation:__ <br> This observation can explain why the average reward is so low and how we had only 28% of the recommened items actually being bought bu users. The model acts randomly on new users and isn't really efficient.

## Fitting 2nd Model with adding coavariates
_Nota Bene_:
- First 2 are related to users 
- Three remaining features  are related to items ( as they change from an item to another given the same user)

In [58]:
user_id = '0H3BRZ9M0BQP3SFPSCL3'
base_url ='http://35.180.178.243/'
url_reset=base_url+"reset"
url_predict = base_url+'predict'
params = {'user_id':user_id}
r = requests.get(url=url_reset,params=params) # get history of rating
data = r.json()
data.keys()

nb_users = data['nb_users'] # 100
nb_items = data['nb_items'] # 30

action_history = data['action_history']
state_history = data['state_history']
rewards_history = data['rewards_history']
next_state = data['next_state']

print('Number of users {}'.format(nb_users))
print('Number of items {}'.format(nb_items))
print('Nb of items available in the next state to predict: {}'.format(len(next_state)))


Number of users 100
Number of items 30
Nb of items available in the next state to predict: 29


In [60]:
pos_rewards = compute_pos_rewards(rewards_history)
print('Number of positive rewards:', len(list(pos_rewards.keys())))
pos_data = create_pos_data(pos_rewards,state_history,action_history)
pos_data.head(10)

Number of positive rewards: 70


Unnamed: 0,user_id,item_id,pos_items
0,75,27,"[27, 17]"
1,24,12,"[12, 14, 29]"
2,6,16,"[16, 15]"
3,74,7,"[7, 2]"
4,98,13,[13]
5,3,15,"[15, 2]"
6,35,29,"[29, 21]"
7,34,14,"[14, 22]"
8,45,14,[14]
9,78,14,[14]


In [12]:
deep_match_model2, deep_triplet_model2 = build_models_covariates(nb_users, nb_items, user_dim=32,
                                                                item_dim= 15, n_hidden =2, hidden_size=64,
                                                                dropout=0.1,l2_reg=0)

In [63]:
print(deep_triplet_model2.summary())

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
user_input (InputLayer)         (None, 1)            0                                            
__________________________________________________________________________________________________
positive_item_input (InputLayer (None, 1)            0                                            
__________________________________________________________________________________________________
negative_item_input (InputLayer (None, 1)            0                                            
__________________________________________________________________________________________________
user_embedding (Embedding)      (None, 1, 32)        3200        user_input[0][0]                 
__________________________________________________________________________________________________
feature_us

_Nota Bene_:<br> _Use the function sample quintuplets that take into account user features and item features in the sampling so we could take into consideration covariates in the fitting._

In [66]:
deep_triplet_model2.compile(loss=identity_loss, optimizer='adam')
fake_y = np.ones_like(pos_data['user_id'])

n_epochs = 50

for i in range(n_epochs):
    # Sample new negatives to build different quintuplets at each epoch
    inputs = sample_quintuplets(pos_data,state_history, random_seed=i)
    
    # Fit the model incrementally by doing a single pass over the sampled quintuplets.
    deep_triplet_model2.fit(inputs, fake_y, shuffle=True, 
                            batch_size=32,
                            epochs=1, 
                            verbose=2)

#     # Monitor the convergence of the model
#     test_auc = average_roc_auc(deep_match_model, pos_data, pos_data_test)
    print("Epoch %d/%d:"% (i + 1, n_epochs))

Epoch 1/1
 - 1s - loss: 0.9995
Epoch 1/50:
Epoch 1/1
 - 0s - loss: 1.0062
Epoch 2/50:
Epoch 1/1
 - 0s - loss: 0.9858
Epoch 3/50:
Epoch 1/1
 - 0s - loss: 1.0084
Epoch 4/50:
Epoch 1/1
 - 0s - loss: 1.0100
Epoch 5/50:
Epoch 1/1
 - 0s - loss: 0.9839
Epoch 6/50:
Epoch 1/1
 - 0s - loss: 0.9543
Epoch 7/50:
Epoch 1/1
 - 0s - loss: 0.9754
Epoch 8/50:
Epoch 1/1
 - 0s - loss: 0.9787
Epoch 9/50:
Epoch 1/1
 - 0s - loss: 0.9940
Epoch 10/50:
Epoch 1/1
 - 0s - loss: 0.9493
Epoch 11/50:
Epoch 1/1
 - 0s - loss: 0.9748
Epoch 12/50:
Epoch 1/1
 - 0s - loss: 0.9916
Epoch 13/50:
Epoch 1/1
 - 0s - loss: 1.0127
Epoch 14/50:
Epoch 1/1
 - 0s - loss: 0.9313
Epoch 15/50:
Epoch 1/1
 - 0s - loss: 0.9235
Epoch 16/50:
Epoch 1/1
 - 0s - loss: 0.9462
Epoch 17/50:
Epoch 1/1
 - 0s - loss: 0.9666
Epoch 18/50:
Epoch 1/1
 - 0s - loss: 0.9675
Epoch 19/50:
Epoch 1/1
 - 0s - loss: 0.8331
Epoch 20/50:
Epoch 1/1
 - 0s - loss: 0.8566
Epoch 21/50:
Epoch 1/1
 - 0s - loss: 0.8615
Epoch 22/50:
Epoch 1/1
 - 0s - loss: 0.9299
Epoch 23/5

## 2nd Model with Covariates - Computing Predictions  

In [67]:
nb_iters = 1000
rewards = 0
nb_reward_pos=0
new_users =[]

for i in range(nb_iters):
    sleep(0.05) # sleep to let the API breathe and allow others to call requests
    next_user = np.asarray([next_state[0][0] for i in range(len(next_state))])
    list_items = np.asarray(list(list(zip(*next_state))[1]))  # available items 
    list_feat_user = np.expand_dims(np.asarray([next_state[0][3:5] for i in range(len(next_state))]), axis=1)
    list_feat_items = np.expand_dims(np.asarray([next_state[0][5:] for i in range(len(next_state))]), axis=1)
   
    predictions = deep_match_model2.predict([next_user, list_items, list_feat_user, list_feat_items])
    recommended_item = np.argmax(predictions) # position item 

    params['recommended_item'] = recommended_item 
    r=requests.get(url=url_predict,params=params)
    d=r.json()
    reward= d['reward'] # previous reward for the recommended item predicted
    if reward > 0 : 
        nb_reward_pos+=1 
    else:
        new_users.append(next_state[0][0])
    
    print('user: %s |recommended item position: %s | recommended item id %s | reward: %s' % (next_user[0],params['recommended_item'],next_state[params['recommended_item']][1],d['reward']))
    next_state = d['state']
    
    rewards += reward
    
print('Average reward: ', rewards/nb_iters)
print('Percentage of positive rewards: ', 100*(nb_reward_pos/nb_iters), '%')

user: 41 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 50 |recommended item position: 15 | recommended item id 15 | reward: 283.5629260026308
user: 33 |recommended item position: 15 | recommended item id 15 | reward: 283.5629260026308
user: 5 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 27 |recommended item position: 14 | recommended item id 15 | reward: 283.5629260026308
user: 52 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 47 |recommended item position: 14 | recommended item id 15 | reward: 283.5629260026308
user: 85 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 87 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 24 |recommended item position: 13 | recommended item id 15 | reward: 0
user: 39 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 53 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 5 |recomm

user: 74 |recommended item position: 13 | recommended item id 15 | reward: 0
user: 49 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 49 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 72 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 93 |recommended item position: 14 | recommended item id 14 | reward: 0
user: 51 |recommended item position: 15 | recommended item id 15 | reward: 283.5629260026308
user: 38 |recommended item position: 2 | recommended item id 2 | reward: 733.3329741483906
user: 91 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 12 |recommended item position: 15 | recommended item id 15 | reward: 283.5629260026308
user: 55 |recommended item position: 15 | recommended item id 15 | reward: 283.5629260026308
user: 9 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 68 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 92 |recommended

user: 98 |recommended item position: 14 | recommended item id 15 | reward: 0
user: 59 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 59 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 82 |recommended item position: 2 | recommended item id 2 | reward: 733.3329741483906
user: 25 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 25 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 19 |recommended item position: 13 | recommended item id 15 | reward: 0
user: 54 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 16 |recommended item position: 15 | recommended item id 15 | reward: 283.5629260026308
user: 31 |recommended item position: 14 | recommended item id 15 | reward: 283.5629260026308
user: 16 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 43 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 13 |recommended item posit

user: 83 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 41 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 34 |recommended item position: 2 | recommended item id 2 | reward: 733.3329741483906
user: 57 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 60 |recommended item position: 13 | recommended item id 14 | reward: 570.0406368218858
user: 43 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 71 |recommended item position: 14 | recommended item id 14 | reward: 0
user: 92 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 67 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 36 |recommended item position: 5 | recommended item id 7 | reward: 0
user: 25 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 66 |recommended item position: 2 | recommended item id 2 | reward: 733.3329741483906
user: 44 |recommended item position: 18 

user: 38 |recommended item position: 5 | recommended item id 7 | reward: 0
user: 90 |recommended item position: 4 | recommended item id 5 | reward: 0
user: 74 |recommended item position: 12 | recommended item id 14 | reward: 0
user: 79 |recommended item position: 13 | recommended item id 14 | reward: 0
user: 49 |recommended item position: 13 | recommended item id 14 | reward: 570.0406368218858
user: 44 |recommended item position: 4 | recommended item id 5 | reward: 0
user: 97 |recommended item position: 2 | recommended item id 2 | reward: 733.3329741483906
user: 82 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 58 |recommended item position: 18 | recommended item id 21 | reward: 190.007453893661
user: 57 |recommended item position: 15 | recommended item id 15 | reward: 283.5629260026308
user: 11 |recommended item position: 17 | recommended item id 21 | reward: 0
user: 1 |recommended item position: 13 | recommended item id 14 | reward: 570.0406368218858
user: 

user: 58 |recommended item position: 21 | recommended item id 27 | reward: 0
user: 89 |recommended item position: 2 | recommended item id 2 | reward: 733.3329741483906
user: 0 |recommended item position: 13 | recommended item id 15 | reward: 0
user: 86 |recommended item position: 18 | recommended item id 21 | reward: 190.007453893661
user: 88 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 5 |recommended item position: 13 | recommended item id 14 | reward: 570.0406368218858
user: 0 |recommended item position: 13 | recommended item id 15 | reward: 283.5629260026308
user: 63 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 42 |recommended item position: 13 | recommended item id 14 | reward: 570.0406368218858
user: 63 |recommended item position: 18 | recommended item id 21 | reward: 190.007453893661
user: 37 |recommended item position: 2 | recommended item id 2 | reward: 733.3329741483906
user: 4 |recommended item position: 18 | recommended

user: 38 |recommended item position: 22 | recommended item id 29 | reward: 0
user: 10 |recommended item position: 12 | recommended item id 14 | reward: 0
user: 42 |recommended item position: 22 | recommended item id 27 | reward: 0
user: 52 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 68 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 48 |recommended item position: 13 | recommended item id 14 | reward: 0
user: 3 |recommended item position: 18 | recommended item id 21 | reward: 190.007453893661
user: 61 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 25 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 35 |recommended item position: 23 | recommended item id 27 | reward: 304.7203211537428
user: 8 |recommended item position: 5 | recommended item id 7 | reward: 379.2067124531398
user: 72 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 27 |recommended item position

user: 50 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 28 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 37 |recommended item position: 22 | recommended item id 27 | reward: 0
user: 78 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 62 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 22 |recommended item position: 13 | recommended item id 14 | reward: 0
user: 41 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 70 |recommended item position: 13 | recommended item id 14 | reward: 0
user: 46 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 91 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 94 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 13 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 45 |recommended item position: 6 | recommended item id 7 | reward: 0
user: 7 |

user: 11 |recommended item position: 21 | recommended item id 27 | reward: 304.7203211537428
user: 47 |recommended item position: 4 | recommended item id 5 | reward: 0
user: 60 |recommended item position: 4 | recommended item id 5 | reward: 0
user: 14 |recommended item position: 15 | recommended item id 22 | reward: 0
user: 77 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 37 |recommended item position: 22 | recommended item id 27 | reward: 304.7203211537428
user: 9 |recommended item position: 22 | recommended item id 27 | reward: 0
user: 84 |recommended item position: 6 | recommended item id 7 | reward: 0
user: 31 |recommended item position: 2 | recommended item id 2 | reward: 733.3329741483906
user: 67 |recommended item position: 15 | recommended item id 15 | reward: 0
user: 2 |recommended item position: 13 | recommended item id 14 | reward: 0
user: 86 |recommended item position: 22 | recommended item id 27 | reward: 0
user: 41 |recommended item position: 2

user: 53 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 83 |recommended item position: 23 | recommended item id 27 | reward: 0
user: 52 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 23 |recommended item position: 15 | recommended item id 15 | reward: 283.5629260026308
user: 5 |recommended item position: 4 | recommended item id 5 | reward: 0
user: 7 |recommended item position: 13 | recommended item id 14 | reward: 0
user: 1 |recommended item position: 21 | recommended item id 27 | reward: 0
user: 46 |recommended item position: 18 | recommended item id 21 | reward: 0
user: 42 |recommended item position: 5 | recommended item id 7 | reward: 0
user: 91 |recommended item position: 13 | recommended item id 14 | reward: 0
user: 62 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 32 |recommended item position: 12 | recommended item id 14 | reward: 0
user: 94 |recommended item position: 17 | recommended item id 21 | rewa

In [68]:
print('Nb of unique new users: {}'.format(len(list(set(new_users) - set(pos_data.user_id.unique().tolist())))))
print('Out of the 1000 steps, %.2f ot the times, we had to predict for new users' %((len(new_users)/1000)*100))

Nb of unique new users: 54
Out of the 1000 steps, 66.20 ot the times, we had to predict for new users


__Interepretation:__ <br> 
More than 50% of the users id in the prediction phase to whom we recommend an item but they didn't actually bu it are new users. So, even if the model we built can deal with new users by randomly assigning them an ebedding, the prediction for new users that the mdoel has never seen in the training is rather senseless. We need to tackle the issue of cold start to improve the results.

## Fitting 3rd Model : Tackling Cold Start Issue (e.g New Users)

In [25]:
from sklearn.metrics.pairwise import cosine_similarity

In [59]:
user_id = '0H3BRZ9M0BQP3SFPSCL3'
base_url ='http://35.180.178.243/'
url_reset=base_url+"reset"
url_predict = base_url+'predict'
params = {'user_id':user_id}
r = requests.get(url=url_reset,params=params) # get history of rating
data = r.json()

nb_users = data['nb_users'] # 100
nb_items = data['nb_items'] # 30

action_history = data['action_history']
state_history = data['state_history']
rewards_history = data['rewards_history']
next_state = data['next_state']

print('Number of users {}'.format(nb_users))
print('Number of items {}'.format(nb_items))
print('Nb of items available in the next state to predict: {}'.format(len(next_state)))

Number of users 100
Number of items 30
Nb of items available in the next state to predict: 29


In [60]:
pos_rewards = compute_pos_rewards(rewards_history)
print('Number of positive rewards:', len(list(pos_rewards.keys())))
pos_data = create_pos_data(pos_rewards,state_history,action_history)
pos_data.head(10)

Number of positive rewards: 65


Unnamed: 0,user_id,item_id,feat_users,pos_items
0,52,22,"[-0.379528722621125, 0.6902862540249266]",[22]
1,53,21,"[1.8416495727134352, 1.806010370038425]",[21]
2,81,17,"[1.086847454873052, 0.8684914040481226]","[17, 21]"
3,84,17,"[1.0367880465703698, 0.09369720060721975]",[17]
4,83,23,"[1.898735023598991, 0.7574940929001662]","[23, 3, 11]"
5,27,29,"[1.8671942361860272, 0.9671288871118074]","[29, 27]"
6,68,26,"[2.4831855830139635, 0.801874786422626]","[26, 20]"
7,81,21,"[1.086847454873052, 0.8684914040481226]","[17, 21]"
8,2,5,"[0.369341734608177, -0.47277370650214556]","[5, 12]"
9,3,21,"[1.7656903496039487, 2.3965140148006077]","[21, 13]"


In [13]:
deep_match_model3, deep_triplet_model3 = build_models_covariates(nb_users, nb_items, user_dim=32,
                                                                item_dim= 15, n_hidden =2, hidden_size=64,
                                                                dropout=0.1,l2_reg=0)

In [14]:
deep_triplet_model3.compile(loss=identity_loss, optimizer='adam')
fake_y = np.ones_like(pos_data['user_id'])

n_epochs = 50

for i in range(n_epochs):
    # Sample new negatives to build different triplets at each epoch
    inputs = sample_quintuplets(pos_data,state_history, random_seed=i)
    
    # Fit the model incrementally by doing a single pass over the sampled quintuplets.
    deep_triplet_model3.fit(inputs, fake_y, shuffle=True, 
                            batch_size=32,
                            epochs=1, 
                            verbose=2)

#     # Monitor the convergence of the model
#     test_auc = average_roc_auc(deep_match_model, pos_data, pos_data_test)
    print("Epoch %d/%d:"% (i + 1, n_epochs))

Epoch 1/1
 - 1s - loss: 0.9534
Epoch 1/50:
Epoch 1/1
 - 0s - loss: 0.9778
Epoch 2/50:
Epoch 1/1
 - 0s - loss: 1.0235
Epoch 3/50:
Epoch 1/1
 - 0s - loss: 0.9855
Epoch 4/50:
Epoch 1/1
 - 0s - loss: 0.9586
Epoch 5/50:
Epoch 1/1
 - 0s - loss: 0.9837
Epoch 6/50:
Epoch 1/1
 - 0s - loss: 0.9388
Epoch 7/50:
Epoch 1/1
 - 0s - loss: 0.9321
Epoch 8/50:
Epoch 1/1
 - 0s - loss: 1.0018
Epoch 9/50:
Epoch 1/1
 - 0s - loss: 1.0158
Epoch 10/50:
Epoch 1/1
 - 0s - loss: 0.9810
Epoch 11/50:
Epoch 1/1
 - 0s - loss: 0.8949
Epoch 12/50:
Epoch 1/1
 - 0s - loss: 0.9621
Epoch 13/50:
Epoch 1/1
 - 0s - loss: 0.9857
Epoch 14/50:
Epoch 1/1
 - 0s - loss: 0.8418
Epoch 15/50:
Epoch 1/1
 - 0s - loss: 0.8469
Epoch 16/50:
Epoch 1/1
 - 0s - loss: 0.8870
Epoch 17/50:
Epoch 1/1
 - 0s - loss: 0.8404
Epoch 18/50:
Epoch 1/1
 - 0s - loss: 0.9408
Epoch 19/50:
Epoch 1/1
 - 0s - loss: 0.9569
Epoch 20/50:
Epoch 1/1
 - 0s - loss: 0.9851
Epoch 21/50:
Epoch 1/1
 - 0s - loss: 0.8449
Epoch 22/50:
Epoch 1/1
 - 0s - loss: 0.8112
Epoch 23/5

## 3rd Model : Computing Predictions 
_Idea:_<br> _If the user id in the next state isn't in the user ids of pos_data , then compute cosine similarity between this new user and alla other users present in pos data using then features. We then, take the most similar one (having greatest cosine similarity value) and predict for him instead._

In [64]:
nb_iters = 1000
rewards = 0
nb_reward_pos=0
new_users =[]

for i in range(nb_iters):
    sleep(0.05) # sleep to let the API breathe and allow others to call requests
    
    if next_state[0][0] in pos_data.user_id.unique().tolist():
        next_user = np.asarray([next_state[0][0] for i in range(len(next_state))])
        list_feat_user = np.expand_dims(np.asarray([next_state[0][3:5] for i in range(len(next_state))]), axis=1)

    
    else:
        #predict items based on users' profile similarity 
        most_similar_user_id = compute_most_similar(state_history,next_state,pos_data) 
        next_user = np.asarray([most_similar_user_id for i in range(len(next_state))])
        list_feat_user = list(pos_data.loc[pos_data.user_id==most_similar_user_id,'feat_users'])[0]
        list_feat_user = np.expand_dims(np.asarray([list_feat_user for i in range(len(next_state))]), axis=1)

    
    list_items = np.asarray(list(list(zip(*next_state))[1]))
    list_feat_items = np.expand_dims(np.asarray([next_state[0][5:] for i in range(len(next_state))]), axis=1)

    predictions = deep_match_model3.predict([next_user, list_items, list_feat_user, list_feat_items])
    recommended_item = np.argmax(predictions)
   
    params['recommended_item'] = recommended_item 
    r=requests.get(url=url_predict,params=params)
    d=r.json()
    reward= d['reward'] # previous reward for the recommended item predicted
    if reward > 0 : 
        nb_reward_pos+=1 
    else:
        if next_state[0][0] not in list(pos_data.user_id.unique()):
            new_users.append(next_state[0][0])  
            
    print('user: %s |recommended item position: %s | recommended item id %s | reward: %s' % (next_user[0],params['recommended_item'],next_state[params['recommended_item']][1],d['reward']))
    next_state = d['state']
    
    rewards += reward
    
print('Average reward: ', rewards/nb_iters)
print('Percentage of positive rewards: ', 100*(nb_reward_pos/nb_iters), '%')

user: 11 |recommended item position: 27 | recommended item id 28 | reward: 505.6366927552675
user: 41 |recommended item position: 28 | recommended item id 28 | reward: 505.6366927552675
user: 9 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 83 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 34 |recommended item position: 27 | recommended item id 28 | reward: 0
user: 91 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 27 |recommended item position: 10 | recommended item id 10 | reward: 509.8745430624738
user: 77 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 55 |recommended item position: 27 | recommended item id 28 | reward: 0
user: 83 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 9 |recommended item position: 28 | recommended item id 28 | reward: 505.6366927552675
user: 3 |recommended item position: 26 | recommended item id 28 | reward: 505.6366927552675

user: 41 |recommended item position: 17 | recommended item id 17 | reward: 610.1407021847199
user: 52 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 8 |recommended item position: 8 | recommended item id 8 | reward: 0
user: 24 |recommended item position: 27 | recommended item id 28 | reward: 505.6366927552675
user: 58 |recommended item position: 10 | recommended item id 10 | reward: 509.8745430624738
user: 50 |recommended item position: 27 | recommended item id 28 | reward: 0
user: 9 |recommended item position: 27 | recommended item id 28 | reward: 505.6366927552675
user: 5 |recommended item position: 8 | recommended item id 8 | reward: 505.1394608715629
user: 70 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 24 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 65 |recommended item position: 28 | recommended item id 28 | reward: 505.6366927552675
user: 9 |recommended item position: 16 | recommended item id 17 

user: 52 |recommended item position: 27 | recommended item id 28 | reward: 0
user: 41 |recommended item position: 16 | recommended item id 17 | reward: 610.1407021847199
user: 9 |recommended item position: 10 | recommended item id 10 | reward: 509.8745430624738
user: 13 |recommended item position: 17 | recommended item id 17 | reward: 0
user: 58 |recommended item position: 17 | recommended item id 17 | reward: 0
user: 27 |recommended item position: 8 | recommended item id 8 | reward: 0
user: 50 |recommended item position: 27 | recommended item id 28 | reward: 0
user: 34 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 78 |recommended item position: 17 | recommended item id 17 | reward: 610.1407021847199
user: 77 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 9 |recommended item position: 3 | recommended item id 3 | reward: 0
user: 33 |recommended item position: 26 | recommended item id 28 | reward: 0
user: 58 |recommended item positio

user: 39 |recommended item position: 26 | recommended item id 28 | reward: 0
user: 33 |recommended item position: 28 | recommended item id 28 | reward: 505.6366927552675
user: 78 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 24 |recommended item position: 17 | recommended item id 17 | reward: 0
user: 37 |recommended item position: 10 | recommended item id 10 | reward: 509.8745430624738
user: 53 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 75 |recommended item position: 9 | recommended item id 10 | reward: 509.8745430624738
user: 33 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 41 |recommended item position: 17 | recommended item id 19 | reward: 0
user: 52 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 53 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 52 |recommended item position: 17 | recommended item id 17 | reward: 610.1407021847199
user: 9 |reco

user: 30 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 42 |recommended item position: 14 | recommended item id 16 | reward: 491.73252110224854
user: 91 |recommended item position: 27 | recommended item id 28 | reward: 505.6366927552675
user: 58 |recommended item position: 17 | recommended item id 17 | reward: 610.1407021847199
user: 52 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 77 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 77 |recommended item position: 3 | recommended item id 3 | reward: 0
user: 95 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 91 |recommended item position: 9 | recommended item id 10 | reward: 509.8745430624738
user: 95 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 77 |recommended item position: 16 | recommended item id 19 | reward: 0
user: 68 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 13 |reco

user: 13 |recommended item position: 26 | recommended item id 28 | reward: 505.6366927552675
user: 58 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 5 |recommended item position: 9 | recommended item id 12 | reward: 0
user: 77 |recommended item position: 28 | recommended item id 28 | reward: 505.6366927552675
user: 5 |recommended item position: 9 | recommended item id 12 | reward: 114.97650483514862
user: 15 |recommended item position: 8 | recommended item id 8 | reward: 0
user: 39 |recommended item position: 10 | recommended item id 10 | reward: 509.8745430624738
user: 50 |recommended item position: 27 | recommended item id 28 | reward: 505.6366927552675
user: 25 |recommended item position: 10 | recommended item id 10 | reward: 509.8745430624738
user: 77 |recommended item position: 7 | recommended item id 8 | reward: 0
user: 14 |recommended item position: 27 | recommended item id 28 | reward: 0
user: 35 |recommended item position: 15 | recommended item id 17

user: 1 |recommended item position: 8 | recommended item id 8 | reward: 0
user: 84 |recommended item position: 15 | recommended item id 19 | reward: 0
user: 75 |recommended item position: 9 | recommended item id 12 | reward: 0
user: 95 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 58 |recommended item position: 17 | recommended item id 19 | reward: 0
user: 61 |recommended item position: 27 | recommended item id 28 | reward: 505.6366927552675
user: 52 |recommended item position: 10 | recommended item id 12 | reward: 0
user: 52 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 1 |recommended item position: 8 | recommended item id 8 | reward: 0
user: 95 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 78 |recommended item position: 8 | recommended item id 8 | reward: 0
user: 44 |recommended item position: 7 | recommended item id 12 | reward: 114.97650483514862
user: 30 |recommended item position: 28 | recommended

user: 37 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 91 |recommended item position: 15 | recommended item id 17 | reward: 0
user: 52 |recommended item position: 28 | recommended item id 28 | reward: 0
user: 3 |recommended item position: 16 | recommended item id 17 | reward: 610.1407021847199
user: 58 |recommended item position: 13 | recommended item id 16 | reward: 491.73252110224854
user: 5 |recommended item position: 12 | recommended item id 16 | reward: 0
user: 81 |recommended item position: 14 | recommended item id 16 | reward: 491.73252110224854
user: 86 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 7 |recommended item position: 7 | recommended item id 8 | reward: 0
user: 37 |recommended item position: 15 | recommended item id 17 | reward: 0
user: 39 |recommended item position: 10 | recommended item id 10 | reward: 509.8745430624738
user: 31 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 34 |recom

user: 81 |recommended item position: 15 | recommended item id 19 | reward: 0
user: 95 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 77 |recommended item position: 17 | recommended item id 17 | reward: 0
user: 52 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 14 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 37 |recommended item position: 16 | recommended item id 17 | reward: 0
user: 7 |recommended item position: 7 | recommended item id 8 | reward: 0
user: 24 |recommended item position: 15 | recommended item id 19 | reward: 0
user: 7 |recommended item position: 7 | recommended item id 8 | reward: 0
user: 41 |recommended item position: 10 | recommended item id 10 | reward: 0
user: 78 |recommended item position: 17 | recommended item id 19 | reward: 0
user: 7 |recommended item position: 7 | recommended item id 8 | reward: 505.1394608715629
user: 65 |recommended item position: 3 | recommended item id 4 | rewa

## Fitting 4rth Model =  Model 3 + Adding Price in Features
_Nota Bene:_<br> Need to normalize prices because not all features are in the same scale.

In [65]:
user_id = '0H3BRZ9M0BQP3SFPSCL3'
base_url ='http://35.180.178.243/'
url_reset=base_url+"reset"
url_predict = base_url+'predict'
params = {'user_id':user_id}
r = requests.get(url=url_reset,params=params) # get history of rating
data = r.json()

nb_users = data['nb_users'] # 100
nb_items = data['nb_items'] # 30

action_history = data['action_history']
state_history = data['state_history']
rewards_history = data['rewards_history']
next_state = data['next_state']

print('Number of users {}'.format(nb_users))
print('Number of items {}'.format(nb_items))
print('Nb of items available in the next state to predict: {}'.format(len(next_state)))

Number of users 100
Number of items 30
Nb of items available in the next state to predict: 29


In [66]:
pos_rewards = compute_pos_rewards(rewards_history)
print('Number of positive rewards:', len(list(pos_rewards.keys())))
pos_data = create_pos_data(pos_rewards,state_history,action_history)
pos_data.head(10)

Number of positive rewards: 52


Unnamed: 0,user_id,item_id,feat_users,pos_items
0,93,16,"[2.9493430717002065, 2.996145971616058]",[16]
1,89,23,"[0.7968648968847498, 2.0501409403269424]","[23, 16]"
2,16,12,"[0.3078891175808073, 1.119812459334277]","[12, 23]"
3,72,13,"[0.858769056683026, 0.5526218496252273]",[13]
4,99,17,"[0.7962814242781673, 2.850400094941593]",[17]
5,81,18,"[-0.5711821058248812, 1.0687847961838348]",[18]
6,75,21,"[1.8353870900821736, 0.3264639985627831]",[21]
7,47,0,"[-0.15476523001431275, 0.8454412027736513]","[0, 14]"
8,82,29,"[-1.047106817747653, 0.9369903481815889]",[29]
9,98,16,"[0.8645458007963118, 1.0210574111418278]","[16, 14, 7]"


In [15]:
deep_match_model4, deep_triplet_model4 = build_models_covariates_price(nb_users, nb_items, user_dim=32,
                                                                        item_dim= 15, n_hidden =2, hidden_size=64,
                                                                        dropout=0.1,l2_reg=0)

_Nota Bene:_ <br> Make use of the sample quintuplet price function because it takes into account price features before fitting.

In [72]:
deep_triplet_model4.compile(loss=identity_loss, optimizer='adam')
fake_y = np.ones_like(pos_data['user_id'])

n_epochs = 50

for i in range(n_epochs):
    # Sample new negatives to build different triplets at each epoch
    inputs = sample_quintuplets_price(pos_data,state_history, random_seed=i)
    
    # Fit the model incrementally by doing a single pass over the sampled quintuplets.
    deep_triplet_model4.fit(inputs, fake_y, shuffle=True, 
                            batch_size=32,
                            epochs=1, 
                            verbose=2)

#     # Monitor the convergence of the model
#     test_auc = average_roc_auc(deep_match_model, pos_data, pos_data_test)
    print("Epoch %d/%d:"% (i + 1, n_epochs))

Epoch 1/1
 - 20s - loss: 0.9879
Epoch 1/50:
Epoch 1/1
 - 0s - loss: 1.0187
Epoch 2/50:
Epoch 1/1
 - 0s - loss: 0.9854
Epoch 3/50:
Epoch 1/1
 - 0s - loss: 0.9948
Epoch 4/50:
Epoch 1/1
 - 0s - loss: 0.9838
Epoch 5/50:
Epoch 1/1
 - 0s - loss: 0.9899
Epoch 6/50:
Epoch 1/1
 - 0s - loss: 0.9899
Epoch 7/50:
Epoch 1/1
 - 0s - loss: 0.9820
Epoch 8/50:
Epoch 1/1
 - 0s - loss: 0.9770
Epoch 9/50:
Epoch 1/1
 - 0s - loss: 0.9602
Epoch 10/50:
Epoch 1/1
 - 0s - loss: 0.9345
Epoch 11/50:
Epoch 1/1
 - 0s - loss: 1.0007
Epoch 12/50:
Epoch 1/1
 - 0s - loss: 0.9878
Epoch 13/50:
Epoch 1/1
 - 0s - loss: 0.9165
Epoch 14/50:
Epoch 1/1
 - 0s - loss: 0.9711
Epoch 15/50:
Epoch 1/1
 - 0s - loss: 1.0198
Epoch 16/50:
Epoch 1/1
 - 0s - loss: 0.9533
Epoch 17/50:
Epoch 1/1
 - 0s - loss: 0.9473
Epoch 18/50:
Epoch 1/1
 - 0s - loss: 0.8601
Epoch 19/50:
Epoch 1/1
 - 0s - loss: 0.9119
Epoch 20/50:
Epoch 1/1
 - 0s - loss: 0.9291
Epoch 21/50:
Epoch 1/1
 - 0s - loss: 0.8739
Epoch 22/50:
Epoch 1/1
 - 0s - loss: 0.8242
Epoch 23/

In [74]:
nb_iters = 1000
rewards = 0
nb_reward_pos=0
new_users =[]

for i in range(nb_iters):
    sleep(0.05) # sleep to let the API breathe and allow others to call requests
    list_items = np.asarray(list(list(zip(*next_state))[1]))  # available items 
    
    if next_state[0][0] in pos_data.user_id.unique().tolist():
        next_user = np.asarray([next_state[0][0] for i in range(len(next_state))])
        list_feat_user = np.expand_dims(np.asarray([next_state[0][3:5] for i in range(len(next_state))]), axis=1)

    
    else:
        #predict items based on users' profile similarity 
        most_similar_user_id = compute_most_similar(state_history,next_state,pos_data) 
        next_user = np.asarray([most_similar_user_id for i in range(len(next_state))])
        list_feat_user = list(pos_data.loc[pos_data.user_id==most_similar_user_id,'feat_users'])[0]
        list_feat_user = np.expand_dims(np.asarray([list_feat_user for i in range(len(next_state))]), axis=1)


    
    prices =list(list(zip(*next_state))[1])
    
    feat_items = [next_state[0][5:] for i in range(len(next_state))]
    price_mean = np.asarray(prices).mean()
    price_std = np.asarray(prices).std()
    prices_norm = [(price-price_mean)/price_std for price in prices ] 
    
    for i,feature in  enumerate(feat_items):
        feature.append(prices_norm[i])
    
    list_feat_items = np.expand_dims(np.asarray(feat_items), axis=1)
   
    predictions = deep_match_model4.predict([next_user, list_items, list_feat_user, list_feat_items])
    recommended_item = np.argmax(predictions) # position item 

    params['recommended_item'] = recommended_item 
    r=requests.get(url=url_predict,params=params)
    d=r.json()
    reward= d['reward'] # previous reward for the recommended item predicted
    if reward > 0 : 
        nb_reward_pos+=1 
    else:
        new_users.append(next_state[0][0])
    
    print('user: %s |recommended item position: %s | recommended item id %s | reward: %s' % (next_user[0],params['recommended_item'],next_state[params['recommended_item']][1],d['reward']))
    next_state = d['state']
    
    rewards += reward
    
print('Average reward: ', rewards/nb_iters)
print('Percentage of positive rewards: ', 100*(nb_reward_pos/nb_iters), '%')

user: 15 |recommended item position: 0 | recommended item id 0 | reward: 203.13197832458124
user: 17 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 89 |recommended item position: 16 | recommended item id 16 | reward: 0
user: 94 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 67 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 19 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 66 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 89 |recommended item position: 0 | recommended item id 0 | reward: 203.13197832458124
user: 7 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 33 |recommended item position: 16 | recommended item id 16 | reward: 0
user: 58 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 29 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 60 |recommended item position: 0 | recommended item id 0 

user: 15 |recommended item position: 14 | recommended item id 16 | reward: 0
user: 99 |recommended item position: 0 | recommended item id 0 | reward: 203.13197832458124
user: 29 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 6 |recommended item position: 15 | recommended item id 16 | reward: 0
user: 29 |recommended item position: 0 | recommended item id 0 | reward: 203.13197832458124
user: 42 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 60 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 47 |recommended item position: 0 | recommended item id 0 | reward: 203.13197832458124
user: 67 |recommended item position: 1 | recommended item id 2 | reward: 603.6925570939856
user: 33 |recommended item position: 27 | recommended item id 29 | reward: 0
user: 52 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 67 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 90 |recommended item p

user: 76 |recommended item position: 2 | recommended item id 2 | reward: 0
user: 98 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 19 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 98 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 58 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 7 |recommended item position: 15 | recommended item id 16 | reward: 0
user: 15 |recommended item position: 14 | recommended item id 16 | reward: 424.1694205414542
user: 98 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 56 |recommended item position: 15 | recommended item id 16 | reward: 0
user: 13 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 15 |recommended item position: 5 | recommended item id 7 | reward: 0
user: 10 |recommended item position: 16 | recommended item id 16 | reward: 424.1694205414542
user: 29 |recommended item position: 0 | recommended item id 

user: 54 |recommended item position: 14 | recommended item id 14 | reward: 0
user: 0 |recommended item position: 0 | recommended item id 0 | reward: 203.13197832458124
user: 42 |recommended item position: 0 | recommended item id 0 | reward: 203.13197832458124
user: 60 |recommended item position: 0 | recommended item id 0 | reward: 203.13197832458124
user: 0 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 29 |recommended item position: 1 | recommended item id 2 | reward: 603.6925570939856
user: 79 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 47 |recommended item position: 5 | recommended item id 7 | reward: 372.08063048983627
user: 16 |recommended item position: 0 | recommended item id 0 | reward: 203.13197832458124
user: 94 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 13 |recommended item position: 11 | recommended item id 12 | reward: 288.4518687144317
user: 19 |recommended item position: 13 | recommended it

user: 60 |recommended item position: 6 | recommended item id 7 | reward: 0
user: 15 |recommended item position: 13 | recommended item id 16 | reward: 0
user: 80 |recommended item position: 6 | recommended item id 7 | reward: 0
user: 15 |recommended item position: 5 | recommended item id 7 | reward: 372.08063048983627
user: 17 |recommended item position: 5 | recommended item id 7 | reward: 372.08063048983627
user: 82 |recommended item position: 5 | recommended item id 7 | reward: 0
user: 56 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 19 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 98 |recommended item position: 0 | recommended item id 0 | reward: 203.13197832458124
user: 89 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 98 |recommended item position: 9 | recommended item id 12 | reward: 0
user: 17 |recommended item position: 12 | recommended item id 15 | reward: 0
user: 98 |recommended item position: 0 | rec

user: 90 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 80 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 99 |recommended item position: 6 | recommended item id 7 | reward: 0
user: 19 |recommended item position: 0 | recommended item id 2 | reward: 0
user: 29 |recommended item position: 5 | recommended item id 7 | reward: 372.08063048983627
user: 98 |recommended item position: 15 | recommended item id 16 | reward: 0
user: 15 |recommended item position: 12 | recommended item id 15 | reward: 0
user: 66 |recommended item position: 16 | recommended item id 16 | reward: 424.1694205414542
user: 82 |recommended item position: 5 | recommended item id 7 | reward: 0
user: 98 |recommended item position: 9 | recommended item id 12 | reward: 288.4518687144317
user: 13 |recommended item position: 12 | recommended item id 14 | reward: 0
user: 8 |recommended item position: 19 | recommended item id 23 | reward: 0
user: 19 |recommended item position: 27 |

user: 33 |recommended item position: 14 | recommended item id 15 | reward: 838.9284039750416
user: 98 |recommended item position: 15 | recommended item id 16 | reward: 424.1694205414542
user: 7 |recommended item position: 14 | recommended item id 16 | reward: 0
user: 58 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 19 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 33 |recommended item position: 14 | recommended item id 15 | reward: 838.9284039750416
user: 47 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 99 |recommended item position: 14 | recommended item id 15 | reward: 838.9284039750416
user: 0 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 24 |recommended item position: 11 | recommended item id 12 | reward: 0
user: 58 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 19 |recommended item position: 13 | recommended item id 14 | reward: 0
user: 72 |recommended it

user: 29 |recommended item position: 5 | recommended item id 7 | reward: 372.08063048983627
user: 29 |recommended item position: 4 | recommended item id 6 | reward: 0
user: 76 |recommended item position: 1 | recommended item id 2 | reward: 0
user: 27 |recommended item position: 4 | recommended item id 6 | reward: 0
user: 67 |recommended item position: 12 | recommended item id 14 | reward: 0
user: 64 |recommended item position: 3 | recommended item id 6 | reward: 175.11590538768039
user: 19 |recommended item position: 14 | recommended item id 14 | reward: 639.0122643458554
user: 7 |recommended item position: 1 | recommended item id 2 | reward: 603.6925570939856
user: 98 |recommended item position: 0 | recommended item id 0 | reward: 0
user: 98 |recommended item position: 7 | recommended item id 15 | reward: 0
user: 52 |recommended item position: 25 | recommended item id 29 | reward: 114.6113219892532
user: 66 |recommended item position: 10 | recommended item id 12 | reward: 288.45186871

user: 99 |recommended item position: 12 | recommended item id 14 | reward: 0
user: 58 |recommended item position: 5 | recommended item id 7 | reward: 0
user: 75 |recommended item position: 5 | recommended item id 7 | reward: 0
user: 98 |recommended item position: 19 | recommended item id 29 | reward: 114.6113219892532
user: 56 |recommended item position: 15 | recommended item id 16 | reward: 0
user: 60 |recommended item position: 14 | recommended item id 16 | reward: 0
user: 82 |recommended item position: 5 | recommended item id 7 | reward: 372.08063048983627
user: 76 |recommended item position: 24 | recommended item id 29 | reward: 114.6113219892532
user: 20 |recommended item position: 1 | recommended item id 2 | reward: 603.6925570939856
user: 47 |recommended item position: 5 | recommended item id 7 | reward: 0
user: 10 |recommended item position: 12 | recommended item id 14 | reward: 639.0122643458554
user: 0 |recommended item position: 15 | recommended item id 16 | reward: 424.1694

Results with the last model are pretty satisfactory and most logical ones we've obtained so far. 