# Recommendation Systems: Collaborative Filtering

## Introduction

Let's start with a definition of the problem. A recommender system is software that exploits users' preferences to suggest items to them. It helps users find what they are looking for and discover interesting items they have not seen before.

Deep Learning has become very popular because of its ability to solve complex tasks such as image recognition, natural language processing, speech recognition, and more.

Companies like Netflix, Facebook and Google use Deep Learning to determine what appeals most to their users and to keep their engagement high.

Two different approaches may be adopted to solve the recommendation problem:

- Content Based
- Collaborative Filtering

The former exploits item descriptions to infer a rating; the latter exploits a user's neighborhood, based on the idea that similar users give similar ratings to items. In this tutorial we will cover the Collaborative Filtering approach.
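
To make the neighborhood idea concrete, here is a minimal sketch of user-based collaborative filtering on a made-up 4x4 rating matrix. The matrix values, the choice of cosine similarity, and the helper names `cosine_similarity` and `predict_rating` are illustrative assumptions; this is not the AutoRec model built later in the tutorial.

```python
import numpy as np

# Toy user x item rating matrix (0 means "not rated"); values are made up.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def cosine_similarity(a, b):
    """Cosine similarity between two rating vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def predict_rating(R, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num, den = 0.0, 0.0
    for other in range(R.shape[0]):
        if other == user or R[other, item] == 0:
            continue  # skip the user themselves and users who did not rate the item
        sim = cosine_similarity(R[user], R[other])
        num += sim * R[other, item]
        den += abs(sim)
    return num / den if den > 0 else 0.0

# User 0 has not rated item 2; the most similar user (user 1) rated it low.
pred = predict_rating(R, user=0, item=2)
print(round(pred, 2))
```

Because user 1 is far more similar to user 0 than users 2 and 3 are, the low rating from user 1 dominates the weighted average.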

Collaborative filtering (CF) models aim to exploit information about users' preferences for items (e.g. star ratings) to provide personalised recommendations. Owing to the Netflix challenge, a panoply of different CF models have been proposed, with popular choices being matrix factorisation and neighbourhood models. This first tutorial on Recommendation Systems follows AutoRec, a CF model based on the autoencoder paradigm; interest in this paradigm stems from the successes of (deep) neural network models for vision and speech tasks. In this tutorial we implement a simple AutoRec system.

<img src="images/recsys0.png">

This is a shallow neural net with only one hidden layer <font color='red'>(##TODO: maybe you can change that as part of an exercise)</font>. We feed in all the ratings a user has given and expect a more generalized rating distribution for that user to come out. We can use this output to estimate what their ratings would be for movies they haven't watched.
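
The forward pass of such a one-hidden-layer autoencoder can be sketched in plain NumPy. The tiny layer sizes, the random weights, and the names `W_enc`, `W_dec` and `reconstruct` are assumptions made for illustration; the tutorial builds the real network in TensorFlow further down.

```python
import numpy as np

rng = np.random.default_rng(0)
n_movies, n_hidden = 6, 3          # tiny sizes chosen only for illustration

# Randomly initialised parameters of a one-hidden-layer autoencoder
W_enc = rng.normal(size=(n_movies, n_hidden))
b_enc = np.zeros(n_hidden)
W_dec = rng.normal(size=(n_hidden, n_movies))
b_dec = np.zeros(n_movies)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reconstruct(ratings):
    """Encode a batch of rating vectors, then decode them back."""
    hidden = sigmoid(ratings @ W_enc + b_enc)   # compressed representation
    return hidden @ W_dec + b_dec               # one predicted rating per movie

# One user who rated three of the six movies (0 = not rated)
user = np.array([[5.0, 0.0, 3.0, 0.0, 0.0, 1.0]])
predicted = reconstruct(user)
print(predicted.shape)   # (1, 6)
```

The output has one value per movie, including the movies the user never rated; after training, those values serve as the predicted ratings.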

## 0. Imports

In this tutorial we are going to build a recommender system using TensorFlow. The code uses the TensorFlow 1.x API (`tf.placeholder`, `tf.Session`). We'll also use other useful packages such as:

- NumPy
- Pandas
- scikit-learn
- Matplotlib

Let's import all of them!


In [None]:
# Importing tensorflow
import tensorflow as tf
# Importing some more libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error as MSE

## 1. Dataset

For this tutorial we will be using the MovieLens 1M dataset, which contains about 1 million ratings from roughly 6,000 users on roughly 4,000 movies. Ratings are stored in the file "ratings.dat" in the following format:

UserID::MovieID::Rating::Timestamp

We are also given "movies.dat" in the following format:

MovieID::Movie::Genre
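
As a small illustration of the `::`-separated layout, the sketch below parses one ratings line by hand. The function name `parse_rating` and the sample values are assumptions for illustration; the tutorial itself loads the files with pandas.

```python
def parse_rating(line):
    """Split one 'UserID::MovieID::Rating::Timestamp' line into typed fields."""
    user_id, movie_id, rating, timestamp = line.strip().split("::")
    return int(user_id), int(movie_id), int(rating), int(timestamp)

# Sample line in the ratings.dat layout (values are illustrative)
sample = "1::1193::5::978300760"
record = parse_rating(sample)
print(record)
```

Since the separator is two characters long, pandas cannot use its fast C parser for it, which is why the loading code later passes `engine='python'`.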

We have to split our dataset into a training set and a test set. The training set is used to train our model, and the test set is used only to evaluate the learned model. We split the dataset using the Hold-Out 80/20 protocol, so 80% of the ratings for each user are kept in the training set and the remaining 20% are moved to the test set. If you have a dataset with few ratings, K-Fold would be a better choice of splitting protocol. Entries that have no rating should be filled with 0. Could you do that?

- <font color='red'>(##TODO: Please complete the dataset for the missing values with 0.)</font> 



- <font color='red'>(##TODO: Divide the training and testing dataset with the Hold-out protocol described above. ##HINT: You may find <a href="http://scikit-learn.org/stable/modules/classes.html#module-sklearn.model_selection">this</a> helpful.)</font>
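
One way to picture the per-user 80/20 hold-out described above is with a mask over a dense rating matrix. This is a hedged NumPy sketch on toy data; the function name `holdout_split` and its parameters are assumptions. Note that it is a different technique from scikit-learn's `train_test_split`, which splits whole rows (here, whole users) rather than each user's individual ratings.

```python
import numpy as np

def holdout_split(R, test_fraction=0.2, seed=0):
    """Move roughly `test_fraction` of each user's observed ratings to a test matrix."""
    rng = np.random.default_rng(seed)
    train = R.copy()
    test = np.zeros_like(R)
    for u in range(R.shape[0]):
        rated = np.flatnonzero(R[u])                 # indices of observed ratings
        n_test = int(round(test_fraction * rated.size))
        if n_test == 0:
            continue                                 # too few ratings to hold any out
        held_out = rng.choice(rated, size=n_test, replace=False)
        test[u, held_out] = R[u, held_out]
        train[u, held_out] = 0                       # hide them from training
    return train, test

# Toy matrix: 3 users x 10 movies, every user has rated 5 movies
R = np.zeros((3, 10))
R[0, :5] = [5, 4, 3, 2, 1]
R[1, 5:] = [1, 2, 3, 4, 5]
R[2, ::2] = [3, 3, 4, 4, 5]

train, test = holdout_split(R)
print(np.count_nonzero(train, axis=1), np.count_nonzero(test, axis=1))
```

Every observed rating ends up in exactly one of the two matrices, so `train + test` reproduces the original matrix.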

Let’s load the dataset with pandas:

In [None]:
# Reading the ratings data
ratings = pd.read_csv('ratings.dat',
          sep="::", header=None, engine='python')
# movies.dat contains non-UTF-8 characters in some titles, hence the encoding
movies = pd.read_csv('movies.dat',
          sep="::", header=None, engine='python', encoding='latin-1')
# Pivoting the data to get one row per user and one column per movie
ratings_pivot = pd.pivot_table(ratings[[0,1,2]],
          values=2, index=0, columns=1).fillna(0)
# YOUR CODE HERE
# ratings = (...)
# Creating train and test sets
# YOUR CODE HERE
# X_train, X_test = train_test_split(ratings, train_size=0.8)
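
To see what the pivot step produces, here is a rough NumPy equivalent on four made-up (user, movie, rating) triples. The data and variable names are assumptions for illustration; the tutorial itself relies on `pd.pivot_table`.

```python
import numpy as np

# (user_id, movie_id, rating) triples; IDs need not be contiguous
triples = np.array([
    [1, 10, 5],
    [1, 30, 3],
    [2, 10, 4],
    [3, 20, 1],
])

# Map raw IDs to consecutive row / column positions
users, user_idx = np.unique(triples[:, 0], return_inverse=True)
movies, movie_idx = np.unique(triples[:, 1], return_inverse=True)

# Dense user x movie matrix; cells without a rating stay 0
matrix = np.zeros((users.size, movies.size))
matrix[user_idx, movie_idx] = triples[:, 2]
print(matrix)
```

Only movies that appear in at least one rating become columns, so the width of the matrix depends on the ratings file rather than on the movie catalogue.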

Let's look at the first 5 rows of the training dataframe.

In [None]:
X_train.head()

Let's do the same for movies:

- <font color='red'>(##TODO: Let's use the same method as the example before to see the first 5 entries in the movie dataframe)</font> 

In [None]:
# YOUR CODE HERE

Let's look at how many movies there are in the dataframe.

## 2. Architecture

Now we have the data at user level, with each user's ratings as the features. Yay! There are a total of 3706 movies, so our dataset has 3706 features per user. Let's start building the model! Can you work out how I know there are 3706 movies in the dataset?
- <font color='red'>(##TODO: Complete with code to determine the number of movies)</font> 

In [None]:
number_movies = ##YOUR CODE HERE

# Deciding how many nodes each layer should have
n_nodes_inpl = number_movies 
n_nodes_hl1  = 256  
n_nodes_outl = number_movies 
# hidden layer: (number_movies + 1) x 256 weights; the extra row plays the role of the bias
hidden_1_layer_vals = {'weights':tf.Variable(tf.random_normal([n_nodes_inpl+1,n_nodes_hl1]))}
# output layer: (256 + 1) x number_movies weights; the extra row plays the role of the bias
output_layer_vals = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1+1,n_nodes_outl]))}

The cell above defines how many nodes each layer has and what the weight matrices look like. For this part, we are replicating the image shown at the beginning of the tutorial. We do not define a separate bias vector for the layers. Instead, we add a bias node with a constant value of one to each layer, so the extra row in each weight matrix acts as the bias.
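
The following NumPy sketch checks that a constant bias node gives the same result as an explicit bias vector. The sizes and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n_in, n_out = 4, 5, 3       # illustrative sizes

x = rng.normal(size=(batch, n_in))
W = rng.normal(size=(n_in, n_out))
b = rng.normal(size=n_out)

# Explicit bias: x @ W + b
explicit = x @ W + b

# Bias node: append a column of ones to x and the bias as an extra row of W
x_aug = np.concatenate([x, np.ones((batch, 1))], axis=1)   # shape (4, 6)
W_aug = np.vstack([W, b])                                   # shape (6, 3)
with_bias_node = x_aug @ W_aug

print(np.allclose(explicit, with_bias_node))   # True
```

This is why the weight matrices in the tutorial have `n_nodes_inpl+1` and `n_nodes_hl1+1` rows: the last row is multiplied by the constant one and therefore behaves like a bias.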

- <font color='red'>(##TODO: Fill the bias with a constant value of one. ##HINT: You may find <a href="https://www.tensorflow.org/api_docs/python/tf/fill">this</a> helpful.)</font> 


In [None]:
input_layer = tf.placeholder('float', [None, number_movies])
# Add a constant node to the first layer
input_layer_const = # YOUR CODE
input_layer_concat =  tf.concat([input_layer, input_layer_const], 1)
# Multiply the input layer (with its bias node) by the weight matrix and apply a sigmoid
layer_1 = tf.nn.sigmoid(tf.matmul(input_layer_concat,
                                  hidden_1_layer_vals['weights']))
# Adding one bias node to the hidden layer
layer1_const = # YOUR CODE
layer_concat =  tf.concat([layer_1, layer1_const], 1)

- <font color='red'>(##TODO: Multiply the output of the hidden layer with a weight matrix to get the final output. ##HINT: You may find <a href="https://www.tensorflow.org/api_docs/python/tf/matmul">this</a> helpful.)</font> 
- <font color='red'> Brownie points for you if you can tell me at the beginning of the next session, in a concise way, why we would use reduce_mean as opposed to reduce_sum </font>

In [None]:
# Multiply output of hidden with a weight matrix to get final output
output_layer = ## YOUR CODE
# Output_true shall have the original shape for error calculations
output_true = tf.placeholder('float', [None, number_movies])
# Define our cost function
meansq =    tf.reduce_mean(tf.square(output_layer - output_true))
# Define our optimizer
learn_rate = 0.1   # how fast the model should learn
optimizer = tf.train.AdagradOptimizer(learn_rate).minimize(meansq)
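
The optimizer above is Adagrad, which scales each parameter's step by the history of its squared gradients. Below is a minimal NumPy sketch of the textbook update rule on a toy problem; the function name `adagrad_step`, the `eps` term, and the toy objective are assumptions, and TensorFlow's implementation may differ in details such as the accumulator's initial value.

```python
import numpy as np

def adagrad_step(w, grad, accum, learn_rate=0.1, eps=1e-8):
    """One Adagrad update: per-parameter step sizes shrink as gradients accumulate."""
    accum = accum + grad ** 2
    w = w - learn_rate * grad / (np.sqrt(accum) + eps)
    return w, accum

# Minimise f(w) = mean((w - target)^2) for a small vector
target = np.array([1.0, -2.0, 0.5])
w = np.zeros(3)
accum = np.zeros(3)
losses = []
for _ in range(200):
    grad = 2 * (w - target) / w.size      # gradient of the mean squared error
    w, accum = adagrad_step(w, grad, accum)
    losses.append(np.mean((w - target) ** 2))

print(losses[0], losses[-1])
```

The loss after 200 steps is lower than after the first step, while the effective step size keeps shrinking as `accum` grows.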

OK! The model is built. Now, let's decide how many epochs the model runs for. We also need to start the session!
- <font color='red'>##TODO: How many users (training rows) do we have in the dataset? (##HINT: You may find <a href="https://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.shape.html">this</a> helpful.)</font> 

In [None]:
# Initialising variables and starting the session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
# Defining batch size and number of epochs
batch_size = 100
number_epochs = 20 
total_images = ##YOUR CODE HERE

Now, off to training. We run the model for 20 epochs, in batches of 100 users. 
<font color='red'>(##TODO: Try increasing the number of epochs and see how your model does.)</font> 
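
The training loop in this tutorial computes the number of batches as `int(total_images/batch_size)`. The sketch below shows which row ranges that produces; the helper name `batch_bounds` and the example sizes are assumptions for illustration.

```python
def batch_bounds(total_rows, batch_size):
    """Start/end indices for each full batch; a trailing partial batch is skipped."""
    n_batches = total_rows // batch_size
    return [(i * batch_size, (i + 1) * batch_size) for i in range(n_batches)]

# 450 rows with batches of 100 -> 4 full batches, the last 50 rows are unused
bounds = batch_bounds(450, 100)
print(bounds)   # [(0, 100), (100, 200), (200, 300), (300, 400)]
```

When the number of rows is not a multiple of the batch size, the leftover rows never contribute to a gradient step within an epoch.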

In [None]:
loss_epoch_list = []
for epoch in range(number_epochs):
    epoch_loss = 0    # initializing error as 0
    
    for i in range(int(total_images/batch_size)):
        epoch_x = X_train[ i*batch_size : (i+1)*batch_size ]
        _, c = sess.run([optimizer, meansq],\
               feed_dict={input_layer: epoch_x, \
               output_true: epoch_x})
        epoch_loss += c
        
    output_train = sess.run(output_layer,\
               feed_dict={input_layer:X_train})
    output_test = sess.run(output_layer,\
                   feed_dict={input_layer:X_test})
        
    print('MSE train', MSE(output_train, X_train),'MSE test', MSE(output_test, X_test))      
    print('Epoch', epoch + 1, '/', number_epochs, 'loss:',epoch_loss)
    loss_epoch_list.append(epoch_loss)

<font color='red'>(##TODO:Now can you plot the `loss_epoch_list`?)</font> 

In [None]:
##YOUR CODE

Now let's put this model to use. Let's get the predicted ratings for a user. All you have to do is pass that user's rating vector through the net once. You can then use the output to recommend some movies to that user.
<font color='red'>##TODO: Can you do that? (##HINT: You may find <a href="https://www.tensorflow.org/api_docs/python/tf/Session">this</a> helpful.)</font> 

In [None]:
# pick a user (a 1-D Series; the input placeholder expects shape [1, number_movies])
sample_user = X_test.iloc[95,:]
# get the predicted ratings
sample_user_pred = ## YOUR CODE
sample_user_pred
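
Once predicted ratings are available, turning them into recommendations is mostly a matter of sorting. Here is a hedged NumPy sketch with made-up numbers; the function name `recommend_top_n` and the choice to exclude already-rated movies are assumptions, not something the tutorial prescribes.

```python
import numpy as np

def recommend_top_n(predicted, already_rated, n=3):
    """Indices of the n highest predicted ratings among unrated movies."""
    scores = np.where(already_rated > 0, -np.inf, predicted)  # hide seen movies
    return np.argsort(scores)[::-1][:n]

# Made-up values for one user over six movies
already_rated = np.array([5.0, 0.0, 3.0, 0.0, 0.0, 1.0])
predicted     = np.array([4.8, 3.9, 2.7, 4.5, 1.2, 0.9])

top = recommend_top_n(predicted, already_rated, n=2)
print(top)   # [3 1]
```

The returned values are column positions; with the pivoted dataframe from this tutorial they would still need to be mapped back to MovieIDs through the dataframe's column labels.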

## References / Credit

[0] http://users.cecs.anu.edu.au/~akmenon/papers/autorec/autorec-paper.pdf

[1] https://medium.com/@connectwithghosh/recommender-system-on-the-movielens-using-an-autoencoder-using-tensorflow-in-python-f13d3e8d600d

[2] http://www.vitobellini.com/posts/2018/01/03/how-to-build-a-recommender-system-in-tensorflow.html