# Collaborative Filtering for Implicit Feedback Datasets

For more details, see the article by Hu, Koren, and Volinsky: http://yifanhu.net/PUB/cf.pdf

Facebook uses this implementation: http://www.benfrederickson.com/fast-implicit-matrix-factorization/

See also: https://jessesw.com/Rec-System/


I used the code from the following page: https://www.kaggle.com/wikhung/implicit-cf-tensorflow-implementation

In [1]:
import numpy as np 
import pandas as pd 
import tensorflow as tf
import random

#### Importing the data:

Collaborative filtering for implicit feedback needs a record of interactions between users and items. Types of implicit feedback include purchase history, browsing history, search patterns, or even mouse movements. For example, a user who purchased many books by the same author probably likes that author.

The data used here is the MovieLens 100K dataset: 100,000 ratings from 943 users on 1,682 movies (items).
In this example, each rating is treated as an interaction. In a real system the interaction could instead be, for example, the fraction of a video that was watched.

Here we pretend that the rating is the number of times the user clicked on the video.

In [2]:
path = 'u.data' # data path
df = pd.read_csv(path, sep='\t', names=['User', 'Item', 'Click', 'Timestamp'], header=None)
df = df.drop('Timestamp', axis=1) # Removing Timestamp
print(df.head())

   User  Item  Click
0   196   242      3
1   186   302      3
2    22   377      1
3   244    51      2
4   166   346      1


#### Sort the df by User, Item, and Click

The rows are sorted by user, then by item, then by click count:

In [3]:
df_Sorted = df.sort_values(['User', 'Item', 'Click'])
print("Total number of user-item interactions = {}\n".format(len(df_Sorted)))  # len() counts rows; .size would count rows * columns
print(df_Sorted.head())

Total number of user-item interactions = 100000

       User  Item  Click
32236     1     1      5
23171     1     2      3
83307     1     3      4
62631     1     4      3
47638     1     5      3


#### Drop duplicate records (if someone watched the same item twice, keep the row with the most clicks):

In [4]:
clean_df = df_Sorted.drop_duplicates(['User', 'Item'], keep = 'last')
print("Number of rows in clean_df = {}\n".format(len(clean_df)))  # len() counts rows; .size would count rows * columns
print(clean_df.head())

Number of rows in clean_df = 100000

       User  Item  Click
32236     1     1      5
23171     1     2      3
83307     1     3      4
62631     1     4      3
47638     1     5      3
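
To illustrate what `keep='last'` does after the sort, here is a minimal sketch on a made-up frame (the values are hypothetical and not taken from MovieLens). Because the rows are sorted by `Click` within each `(User, Item)` pair, the last duplicate is the one with the largest click count.

```python
import pandas as pd

# Hypothetical toy data: user 1 appears twice for item 10.
toy = pd.DataFrame({'User': [1, 1, 2],
                    'Item': [10, 10, 10],
                    'Click': [4, 2, 5]})

# Sort so that, within each (User, Item) pair, the largest Click comes last.
toy_sorted = toy.sort_values(['User', 'Item', 'Click'])

# Keep only the last row of each (User, Item) pair.
toy_clean = toy_sorted.drop_duplicates(['User', 'Item'], keep='last')
print(toy_clean)
```

In the MovieLens data the row count is the same before and after this step, so there were no duplicate pairs to drop.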


In [5]:
n_User = len(clean_df.User.unique())
n_Item = len(clean_df.Item.unique())

print('There are {0} users and {1} items in the data'.format(n_User, n_Item))

There are 943 users and 1682 items in the data


#### If we build a matrix of Users x Items, how many cells in the matrix will be filled?
Fraction of cells that are filled (strictly speaking this is the density of the matrix; the code stores it in a variable named `sparsity`):

In [6]:
sparsity = clean_df.shape[0] / float(n_User * n_Item)
print('{:.2%} of the user-item matrix is filled'.format(sparsity))

6.30% of the user-item matrix is filled
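
As a worked check of the figure above, using the 100,000 ratings, 943 users, and 1,682 items of this dataset:

\begin{equation}
\frac{100000}{943 \times 1682} = \frac{100000}{1586126} \approx 0.0630
\end{equation}

So about 6.30% of the cells are filled and the remaining 93.70% are empty.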


#### User Item preference matrix:
Here we build a matrix that records which movies each user has seen. The preference $p_{ui}$ of user $u$ for item $i$ is defined as:

\begin{equation}
p_{ui} =    \begin{cases}
    1, & r_{ui} > 0.\\
    0, & r_{ui} = 0.
  \end{cases}
  \end{equation}
 
$r_{ui}$: the number of times user $u$ clicked on (or otherwise interacted with) item $i$.

$p_{ui}$: if user $u$ consumed item $i$ $(r_{ui} > 0)$, we take this as an indication that $u$ likes $i$ $(p_{ui} = 1)$.
If $u$ never consumed $i$, we assume no preference $(p_{ui} = 0)$.

The preference matrix $p_{ui}$:

In [7]:
User_Item_pref = clean_df.copy()
User_Item_pref.loc[User_Item_pref['Click'] > 0, 'Click'] = 1  # .loc avoids chained assignment
User_Item_pref = User_Item_pref.pivot(index='User', columns='Item', values='Click')
User_Item_pref.fillna(0, inplace=True)
User_Item_pref = User_Item_pref.values

print("Single user preference on each items: {}".format(User_Item_pref[0:1])) 
print("Shape of the User_Item_pref matrix: {}".format(User_Item_pref.shape))

Single user preference on each items: [[1. 1. 1. ... 0. 0. 0.]]
Shape of the User_Item_pref matrix: (943, 1682)
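
The binarization rule for $p_{ui}$ can be checked on a small array. This is a minimal sketch with made-up click counts; the names `r_toy` and `p_toy` are introduced here only for illustration.

```python
import numpy as np

# Hypothetical click counts r_ui for 2 users x 4 items (0 = no interaction).
r_toy = np.array([[3, 0, 1, 0],
                  [0, 5, 0, 0]])

# p_ui = 1 where r_ui > 0, else 0.
p_toy = (r_toy > 0).astype(float)
print(p_toy)
```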


#### User Item interaction matrix:

`User_Item_interactions` is the matrix of click counts $r_{ui}$ for each user and item.

In [8]:
User_Item_interactions = clean_df.pivot(index='User', columns='Item', values='Click')
User_Item_interactions.fillna(0, inplace=True)
User_Item_interactions = User_Item_interactions.values
print("Single user clicks on each items: {}".format(User_Item_interactions[1:2])) 
print("Shape of the User_Item_interactions matrix: {}".format(User_Item_interactions.shape))

Single user clicks on each items: [[4. 0. 0. ... 0. 0. 0.]]
Shape of the User_Item_interactions matrix: (943, 1682)


In [9]:
k = 10 # Number of top k items we want to recommend for the user

# View_counts holds the number of items viewed by each user
# (the row sum of the binary preference matrix):
View_counts = User_Item_pref.sum(axis=1)

# buyers_idx holds the indices of the users who have seen at least 2*k items:
buyers_idx = np.where(View_counts >= k*2)[0] 
print('{0} users viewed {1} or more items'.format(len(buyers_idx), k*2))

943 users viewed 20 or more items


In [10]:
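test_frac = 0.2  # Fraction of eligible users held out (validation + test)
test_users_idx = np.random.choice(buyers_idx,
                                  size = int(np.ceil(len(buyers_idx) * test_frac)),
                                  replace = False)

# Split the held-out users evenly into validation users and test users
val_users_idx = test_users_idx[:int(len(test_users_idx) / 2)]
test_users_idx = test_users_idx[int(len(test_users_idx) / 2):]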
print("Randomly selected users for test set : {}".format(test_users_idx))

Randomly selected users for test set : [144 898 394 512  82 672 774 634 780 170 375 573 923 580 890 823 464 414
 528 617 722 566 802 136 591 649 402 692 145 514 435 604  86 544 918  91
  90 140 875 679 851 839 224 328 813 760 764 105 751 399 834 929 763 520
  87 479 795 127 723 174 525 726 395 519 786 132 725  64 808 849 545 397
 803 646 543 387 814 861 666 169 119 590 388  73 248 699  29  98 440 256
 503 908 325 656 238]


#### A function that masks held-out preferences in the training matrix:

In [11]:
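def data_process(interaction, dat, train, test, user_idx, k):
    """Hide k randomly chosen seen items for each user in user_idx.

    The hidden entries are set to zero in train and interaction (both are
    modified in place) and copied from dat into test.
    """
    for user in user_idx:
        purchases = np.where(dat[user, :] == 1)[0]  # Items this user has seen
        mask = np.random.choice(purchases, size = k, replace = False)
        interaction[user, mask] = 0
        train[user, mask] = 0
        test[user, mask] = dat[user, mask]
    return train, test, interaction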
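
The effect of the masking can be seen on a tiny example. This is only a sketch: the toy arrays and the fixed seed are assumptions made for illustration, and the function body is repeated so that the block runs on its own.

```python
import numpy as np

def data_process(interaction, dat, train, test, user_idx, k):
    # Same logic as the function above, repeated so this sketch is self-contained.
    for user in user_idx:
        purchases = np.where(dat[user, :] == 1)[0]
        mask = np.random.choice(purchases, size=k, replace=False)
        interaction[user, mask] = 0
        train[user, mask] = 0
        test[user, mask] = dat[user, mask]
    return train, test, interaction

np.random.seed(0)  # Fixed seed only to make the toy run repeatable

# Hypothetical toy data: 2 users x 6 items.
toy_pref = np.array([[1., 1., 1., 1., 0., 0.],
                     [1., 0., 1., 0., 1., 1.]])
toy_clicks = toy_pref * 3          # Pretend every seen item was clicked 3 times
toy_train = toy_pref.copy()
toy_test = np.zeros_like(toy_pref)

# Hide k = 2 items of user 0 only.
toy_train, toy_test, toy_clicks = data_process(
    toy_clicks, toy_pref, toy_train, toy_test, user_idx=[0], k=2)

# For user 0, two seen items remain in train and two moved to test.
print(toy_train[0].sum(), toy_test[0].sum())
```

Every seen item ends up in exactly one of the two matrices, so `toy_train + toy_test` reproduces `toy_pref`.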

In [12]:
zero_matrix = np.zeros(shape = (n_User, n_Item))
train_matrix = User_Item_pref.copy()
test_matrix = zero_matrix.copy()
val_matrix = zero_matrix.copy()

# Mask the train matrix and create the validation and test matrices
train_matrix, val_matrix, User_Item_interactions = data_process(
    User_Item_interactions, User_Item_pref, train_matrix, val_matrix, val_users_idx, k)
train_matrix, test_matrix, User_Item_interactions = data_process(
    User_Item_interactions, User_Item_pref, train_matrix, test_matrix, test_users_idx, k)

In [13]:
print(train_matrix.shape)
print(val_matrix.shape)

(943, 1682)
(943, 1682)


In [14]:
# Let's take a look at what was actually accomplished.
# The preferences held out in the test matrix are masked (zero) in the train matrix.
held_out_items = test_matrix[test_users_idx[0], :].nonzero()[0]
print(train_matrix[test_users_idx[0], held_out_items])

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]


### TensorFlow
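Important: several of the hyperparameters (for example `n_features`, `lambda_c`, and `lr`) should be tuned, for example by grid search.

For the math, see the article linked at the top. The code uses the TensorFlow 1.x graph API (`tf.placeholder`, `tf.Session`).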
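
A grid search over these hyperparameters could be organized roughly as follows. This is a sketch under stated assumptions: `grid_search`, `dummy_evaluate`, and the candidate values are hypothetical and not part of the original code. In practice `dummy_evaluate` would be replaced by a function that trains the model and returns the validation precision.

```python
import itertools

def grid_search(evaluate, grid):
    """Return (best_params, best_score) over the Cartesian product of `grid`.

    `evaluate` is a hypothetical callable that takes the hyperparameters as
    keyword arguments and returns a score where higher is better.
    """
    best_params, best_score = None, float('-inf')
    names = sorted(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical search space; the keys mirror variable names used in this notebook.
grid = {'n_features': [5, 10, 20],
        'lambda_c': [0.001, 0.01, 0.1],
        'lr': [0.01, 0.05]}

# Stand-in scorer so the sketch runs; it does NOT train anything.
def dummy_evaluate(n_features, lambda_c, lr):
    return -abs(n_features - 10) - abs(lambda_c - 0.01) - abs(lr - 0.05)

best_params, best_score = grid_search(dummy_evaluate, grid)
print(best_params, best_score)
```

The score used for selection should come from the validation users, so that the test users stay untouched until the final evaluation.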

In [15]:
tf.reset_default_graph() # Start from a fresh graph

pref = tf.placeholder(tf.float32, (n_User, n_Item))  # Binary preference matrix p_ui
interactions = tf.placeholder(tf.float32, (n_User, n_Item)) # Click-count matrix r_ui
users_idx = tf.placeholder(tf.int32, (None))

In [16]:
#n_features: Number of latent features to be extracted. (Hyperparameter)
n_features = 10 

# The X matrix represents the user latent preferences with a shape of user x latent features
X = tf.Variable(tf.truncated_normal([n_User, n_features], mean = 0, stddev = 0.05))

# The Y matrix represents the item latent features with a shape of item x latent features
Y = tf.Variable(tf.truncated_normal([n_Item, n_features], mean = 0, stddev = 0.05))

# Initialization of the confidence parameter alpha (a trainable variable here)
conf_alpha = tf.Variable(tf.random_uniform([1], 0, 1))

In [17]:
# Initialize the user bias vector with shape (n_User, 1)
user_bias = tf.Variable(tf.truncated_normal([n_User, 1], stddev = 0.2))

# Concatenate the bias vector to the user matrix.
# A column of ones is appended as well, so that the matrix product
# also picks up the item biases.
X_plus_bias = tf.concat([X, 
                         user_bias,
                         tf.ones((n_User, 1), dtype = tf.float32)], axis = 1)

In [18]:
# Initialize the item bias vector
item_bias = tf.Variable(tf.truncated_normal([n_Item, 1], stddev = 0.2))

# Concatenate the bias vector to the item matrix.
# A column of ones is added as well, in the position that lines up with the user bias column.
Y_plus_bias = tf.concat([Y, 
                         tf.ones((n_Item, 1), dtype = tf.float32),
                         item_bias],
                         axis = 1)
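
The concatenation trick can be verified numerically. The following NumPy sketch uses made-up sizes and random values (all `*_toy` names are introduced here for illustration) and the same column order as `X_plus_bias` and `Y_plus_bias` above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_feat = 3, 4, 2   # Hypothetical toy sizes

X_toy = rng.normal(size=(n_users, n_feat))
Y_toy = rng.normal(size=(n_items, n_feat))
bu_toy = rng.normal(size=(n_users, 1))  # User biases
bi_toy = rng.normal(size=(n_items, 1))  # Item biases

# Users: [factors, user bias, ones]; items: [factors, ones, item bias].
X_aug = np.concatenate([X_toy, bu_toy, np.ones((n_users, 1))], axis=1)
Y_aug = np.concatenate([Y_toy, np.ones((n_items, 1)), bi_toy], axis=1)

# One matrix product ...
via_concat = X_aug @ Y_aug.T
# ... equals the factor term plus both biases, broadcast over users and items.
explicit = X_toy @ Y_toy.T + bu_toy + bi_toy.T

print(np.allclose(via_concat, explicit))
```

The user bias column meets the ones column of the item matrix and the ones column of the user matrix meets the item bias column, which is why the order of the appended columns matters.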

In [19]:
# Here, we finally multiply the matrices together to estimate the predicted preferences
pred_pref = tf.matmul(X_plus_bias, Y_plus_bias, transpose_b=True)

# Construct the confidence matrix from the clicks and the alpha parameter
conf = 1 + conf_alpha * interactions
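
To make the confidence weighting concrete, here is a small NumPy sketch with a made-up `alpha_toy` value and made-up click counts. Note one difference from the article: Hu et al. treat $\alpha$ as a fixed constant, whereas this notebook declares `conf_alpha` as a `tf.Variable`, so the optimizer updates it together with the factors.

```python
import numpy as np

alpha_toy = 0.5  # Hypothetical fixed value, for illustration only

# Hypothetical click counts r_ui for 2 users x 2 items.
r_toy = np.array([[0., 2.],
                  [4., 0.]])

# c_ui = 1 + alpha * r_ui: unobserved cells get the minimum confidence of 1,
# and confidence grows linearly with the number of clicks.
c_toy = 1 + alpha_toy * r_toy
print(c_toy)
```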

In [20]:
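# Confidence-weighted squared error between observed and predicted preferences
cost = tf.reduce_sum(tf.multiply(conf, tf.square(tf.subtract(pref, pred_pref))))
# L2 penalty on the factors and biases (tf.nn.l2_loss returns sum(t ** 2) / 2)
l2_sqr = tf.nn.l2_loss(X) + tf.nn.l2_loss(Y) + tf.nn.l2_loss(user_bias) + tf.nn.l2_loss(item_bias)

lambda_c = 0.01 # Regularization strength (hyperparameter)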
loss = cost + lambda_c * l2_sqr
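
Written out, the loss built in the cells above corresponds to the following expression. The symbols $b_u$ and $b_i$ are shorthand introduced here for the entries of `user_bias` and `item_bias`, $x_u$ and $y_i$ are rows of `X` and `Y`, $\alpha$ is `conf_alpha`, and $\lambda$ is `lambda_c`. The factor $\frac{1}{2}$ comes from `tf.nn.l2_loss`, which returns half the sum of squares.

\begin{equation}
\mathcal{L} = \sum_{u,i} c_{ui} \left( p_{ui} - \hat{p}_{ui} \right)^2
  + \frac{\lambda}{2} \left( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 + \sum_u b_u^2 + \sum_i b_i^2 \right)
\end{equation}

with

\begin{equation}
\hat{p}_{ui} = x_u^\top y_i + b_u + b_i, \qquad c_{ui} = 1 + \alpha \, r_{ui}
\end{equation}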

In [21]:
lr = 0.05
optimize = tf.train.AdagradOptimizer(learning_rate = lr).minimize(loss)

In [22]:
# Computes the mean precision of the top k recommended items over the given users
def top_k_precision(predicted, mat, k, user_idx):
    """predicted is a (users x items) matrix of predicted preference scores;
    mat is a (users x items) matrix whose nonzero entries mark the relevant items."""
    precisions = []
    
    for user in user_idx:
        rec = np.argsort(-predicted[user, :]) # Item indices sorted by score, from high to low
        top_k = rec[:k] # Getting the top k items
        labels = mat[user, :].nonzero()[0] # Relevant items for this user
        precision = len(set(top_k) & set(labels)) / float(k) # Fraction of the top k that are relevant
        precisions.append(precision)
    return np.mean(precisions)
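
A quick sanity check of the metric on made-up numbers. The toy arrays are assumptions for illustration, and the function body is repeated in compact form so that the block runs on its own.

```python
import numpy as np

def top_k_precision(predicted, mat, k, user_idx):
    # Same logic as the function above, repeated so this sketch is self-contained.
    precisions = []
    for user in user_idx:
        top_k = np.argsort(-predicted[user, :])[:k]
        labels = mat[user, :].nonzero()[0]
        precisions.append(len(set(top_k) & set(labels)) / float(k))
    return np.mean(precisions)

# Hypothetical scores for 1 user x 5 items, and the items held out for that user.
toy_scores = np.array([[0.9, 0.1, 0.8, 0.3, 0.7]])
toy_heldout = np.array([[1., 0., 0., 1., 1.]])

# The top-2 items are 0 and 2; only item 0 is held out, so precision@2 is 1/2.
p_at_2 = top_k_precision(toy_scores, toy_heldout, k=2, user_idx=[0])
print(p_at_2)
```

For the validation and test users in this notebook exactly `k` items are held out per user, so precision at `k` and recall at `k` coincide for them.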

In [23]:
saver = tf.train.Saver()

iterations = 70
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    for i in range(iterations):
        sess.run(optimize, feed_dict = {pref: train_matrix,
                                        interactions: User_Item_interactions})
        
        if i % 10 == 0:
            mod_loss = sess.run(loss, feed_dict = {pref: train_matrix,
                                                   interactions: User_Item_interactions})
            mod_pred = pred_pref.eval()
            train_precision = top_k_precision(mod_pred, train_matrix, k, val_users_idx)
            val_precision = top_k_precision(mod_pred, val_matrix, k, val_users_idx)
            print('Iterations {0}...'.format(i),
                  'Training Loss {:.2f}...'.format(mod_loss),
                  'Train Precision {:.3f}...'.format(train_precision),
                  'Val Precision {:.3f}'.format(val_precision)
                )

    recommendation = pred_pref.eval() # Rows are users; columns hold the predicted
                                      # preference score for each movie
    print("Shape of the recommendation matrix : {}\n".format(recommendation.shape))
    test_precision = top_k_precision(recommendation, test_matrix, k, test_users_idx)
    print('The average test recommendation precision for all users = {:.2f}%'.format(test_precision*100))
    
    #np.save('recommendation.npy', recommendation) # to save the recommendation matrix
    # No explicit sess.close() is needed: the with block closes the session.

('Iterations 0...', 'Training Loss 421353.38...', 'Train Precision 0.213...', 'Val Precision 0.029')
('Iterations 10...', 'Training Loss 158854.97...', 'Train Precision 0.476...', 'Val Precision 0.079')
('Iterations 20...', 'Training Loss 135906.41...', 'Train Precision 0.524...', 'Val Precision 0.091')
('Iterations 30...', 'Training Loss 124850.24...', 'Train Precision 0.556...', 'Val Precision 0.091')
('Iterations 40...', 'Training Loss 117339.80...', 'Train Precision 0.560...', 'Val Precision 0.090')
('Iterations 50...', 'Training Loss 111481.17...', 'Train Precision 0.578...', 'Val Precision 0.090')
('Iterations 60...', 'Training Loss 106546.64...', 'Train Precision 0.591...', 'Val Precision 0.089')
Shape of the recommendation matrix : (943, 1682)

The average test recommendation precision for all users = 12.32%
