# Neural Collaborative Filtering

Collaborative filtering is traditionally done with matrix factorization, and I built my movie recommendation project using good ol' matrix factorization. Recently, however, I discovered that people have proposed new ways to do collaborative filtering with deep learning techniques! A 2017 paper titled [Neural Collaborative Filtering](https://www.comp.nus.edu.sg/~xiangnan/papers/ncf.pdf) describes how to perform collaborative filtering with neural networks.

> In recent years, deep neural networks have yielded immense success on speech recognition, computer vision and 
> natural language processing. However, the exploration of deep neural networks on recommender systems has received 
> relatively less scrutiny. In this work, we strive to develop techniques based on neural networks to tackle the
> key problem in recommendation — collaborative filtering — on the basis of implicit feedback.

Here's the high-level idea.

![ncf-high-level](./neural_collaborative_filtering_files/ncf-high-level.png)

We learn an embedding for each user and each item (movie). An embedding layer is equivalent to multiplying the one-hot encoding of a user or movie by a matrix of embedding weights, which simply selects one row of that matrix.

In [5]:
from keras.utils import to_categorical

# We have 10 users, each uniquely identified by an integer ID.
users = list(range(10))

# Each row is the one-hot encoding of one user ID.
one_hot_encoding_users = to_categorical(users)
print(one_hot_encoding_users)

[[ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  1.  0.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.  0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]]
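
To make the "one-hot times weights" view concrete, here is a minimal NumPy sketch. The embedding weights are made-up stand-in values (in a real model they are learned), and the names `embedding_weights`, `via_matmul` and `via_lookup` are only for illustration. It checks that multiplying a one-hot row by the weight matrix gives the same vector as indexing the matrix by the user ID.

```python
import numpy as np

num_users, latent_dim = 10, 3

# One-hot rows, one per user ID (same layout as the matrix printed above).
one_hot_users = np.eye(num_users)

# Stand-in embedding weights; in a real model these values are learned.
embedding_weights = np.arange(num_users * latent_dim, dtype=float).reshape(num_users, latent_dim)

user_id = 4

# Multiplying the one-hot vector by the weights...
via_matmul = one_hot_users[user_id] @ embedding_weights
# ...selects the same row as a plain lookup by ID.
via_lookup = embedding_weights[user_id]

print(via_matmul)
print(via_lookup)
```

This is why embedding layers are usually implemented as a table lookup: the result is identical, and the one-hot matrix never has to be built.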


The embedding vectors are then fed into a deep neural network whose objective is to predict the rating a user gives to a movie. For example, if user 1 rates movie 1 with five stars, the trained network should be able to predict that rating.
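
As a rough sketch of what feeding the embeddings into a network looks like, the NumPy snippet below concatenates one user vector and one movie vector and pushes them through a single hidden layer. The layer sizes, the random weights, and the helper name `predict_rating` are all assumptions for illustration. The weights are untrained, so the printed number is not a meaningful rating.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
num_users, num_movies, latent_dim, hidden_dim = 10, 10, 4, 8

# Embedding tables: one row per user / movie (random, i.e. untrained).
user_embeddings = rng.normal(size=(num_users, latent_dim))
movie_embeddings = rng.normal(size=(num_movies, latent_dim))

# Weights of one hidden layer and a linear output layer.
W1 = rng.normal(size=(2 * latent_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(size=hidden_dim)
b2 = 0.0

def predict_rating(user_id, movie_id):
    """Look up both embeddings, concatenate them, apply a ReLU layer, then a linear output."""
    x = np.concatenate([user_embeddings[user_id], movie_embeddings[movie_id]])
    hidden = np.maximum(0.0, x @ W1 + b1)
    return float(hidden @ w2 + b2)

prediction = predict_rating(1, 1)
print(prediction)
```

Training would adjust both the embedding tables and the layer weights so that predictions move toward the observed ratings.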

## Generalized Matrix Factorization (GMF)

In the paper, generalized matrix factorization is described by the following equation.

$$
\hat{y}_{ui} = a\left(h^{T}(p_{u} \odot q_{i})\right)
$$

where $p_u$ and $q_i$ are the user and item latent vectors, $\odot$ denotes the element-wise product, $a$ is an activation function, and $h$ is the weight vector of the output layer. Each entry of $h$ weights one latent dimension of the product.

If we use the identity function as the activation and fix $h$ to a vector of all ones, we exactly recover the standard matrix factorization model, because the prediction reduces to the dot product of the two latent vectors. However, we will do something more here to make it non-linear.
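
The reduction to plain matrix factorization can be checked numerically. The following is a minimal NumPy sketch, not the paper's implementation: it reads the product of $p_u$ and $q_i$ as element-wise, as in the NCF paper, and the function names `gmf`, `identity` and `sigmoid`, along with the random vectors, are assumptions for illustration.

```python
import numpy as np

def gmf(p_u, q_i, h, activation):
    """GMF prediction: activation(h^T (p_u * q_i)), with * the element-wise product."""
    return activation(h @ (p_u * q_i))

def identity(x):
    return x

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
latent_dim = 10
p_u = rng.normal(size=latent_dim)  # user latent vector (random stand-in)
q_i = rng.normal(size=latent_dim)  # item latent vector (random stand-in)

# Identity activation and h = all ones: GMF collapses to the dot product.
mf_score = gmf(p_u, q_i, np.ones(latent_dim), identity)
dot_score = p_u @ q_i
print(np.isclose(mf_score, dot_score))

# A non-uniform h and a sigmoid activation give a non-linear variant.
h = rng.normal(size=latent_dim)
gmf_score = gmf(p_u, q_i, h, sigmoid)
print(gmf_score)
```

With a learned $h$, each latent dimension can contribute with a different weight, and the sigmoid squashes the score into the range (0, 1).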

In [78]:
from keras.models import Model, Sequential
from keras.layers import Input, Embedding, Flatten, Dot
import keras

num_users = 5
num_movies = 5
latent_dim = 10

# Each input is a single integer user ID, so the input shape is (1,).
user_input = Input(shape=(1,), name='user-input')
# Map the ID to a latent vector, then flatten (1, latent_dim) to (latent_dim,).
user_embedding = Embedding(num_users, latent_dim, name='user-embedding')(user_input)
user_flatten = Flatten()(user_embedding)

user_embedding_model = Model(user_input, user_flatten)

# The movie branch mirrors the user branch.
movie_input = Input(shape=(1,), name='movie-input')
movie_embedding = Embedding(num_movies, latent_dim, name='movie-embedding')(movie_input)
movie_flatten = Flatten()(movie_embedding)

movie_embedding_model = Model(movie_input, movie_flatten)