# Collaborative Filtering with Deep Learning (Matrix Factorization)

A popular approach to collaborative filtering that leverages deep learning is Matrix Factorization. The idea is that the matrix holding each user's rating (or interaction score) for each item can be factored into two matrices whose dimensions satisfy (n x l) * (l x m) = (n x m). Here n and m are the number of users and items respectively, while l is the size of the latent feature space, which we are free to choose. Since this is all based on linear algebra, it can very naturally be expressed as a trainable deep learning model. What's more, additional hidden layers can be added, which takes the model from Generalized Matrix Factorization (GMF) to Neural Matrix Factorization.
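To make the shape relationship concrete, here is a minimal NumPy sketch. The sizes and the random factor matrices `P` and `Q` are illustrative assumptions, not values from the notebook; in the real model the factors are learned rather than sampled.

```python
import numpy as np

rng = np.random.default_rng(0)

# n users, m items, l latent features (illustrative sizes only)
n_users, n_items, n_latent = 6, 4, 3

# User factors (n x l) and item factors (l x m)
P = rng.normal(size=(n_users, n_latent))
Q = rng.normal(size=(n_latent, n_items))

# Their product is a full (n x m) matrix of predicted scores
R_hat = P @ Q
print(R_hat.shape)

# One predicted score is the dot product of a user row and an item column
u, i = 2, 1
score = P[u] @ Q[:, i]
print(np.isclose(score, R_hat[u, i]))
```

Training a factorization model amounts to adjusting `P` and `Q` so that `R_hat` matches the observed entries of the rating matrix as closely as possible.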

Using the matrix built in the third notebook, a deep learning model will be trained as a proof of concept. Since this will not be fully productionized, it will be built with Keras, a convenient Python framework for deep learning that can run on a TensorFlow backend.
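The model built below takes a user index and a game index as two separate inputs, so a dense interaction matrix has to be unrolled into index pairs before training. The sketch below shows one possible way to do that. The layout of the matrix from the third notebook is not shown here, so the small `interactions` array and the choice of binary labels are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical 3 x 4 interaction matrix (rows: users, columns: games).
# A 0 means the user has not interacted with that game.
interactions = np.array([
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
])

# Row and column index of every cell, flattened in the same order as the labels
user_idx, game_idx = np.indices(interactions.shape)
user_ids = user_idx.ravel()
game_ids = game_idx.ravel()
labels = interactions.ravel().astype("float32")

# Each position k is one training example: (user_ids[k], game_ids[k]) -> labels[k]
print(user_ids.shape, game_ids.shape, labels.shape)
```

Arrays shaped like `user_ids` and `game_ids` could then be passed as the two model inputs, with `labels` as the target.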

Sources

[Functional API example](https://keras.io/guides/functional_api/)

[Article example](https://towardsdatascience.com/neural-collaborative-filtering-96cef1009401)

[Second simple example with PCA Viz](https://towardsdatascience.com/building-a-book-recommendation-system-using-keras-1fba34180699)

In [20]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

print("numpy version: ", np.__version__)
print("tensorflow version: ", tf.__version__)
print("keras version: ", keras.__version__)

numpy version:  1.17.5
tensorflow version:  1.14.0
keras version:  2.2.4-tf


In [41]:
#Dummy Variables for constructing model
num_users = 1000
num_games = 500
latent_dim = 100
fc_dim = 50

#user and game embeddings must have the same size so they can be multiplied element-wise, hence the shared assignment
user_embedding_shape = game_embedding_shape = 50


In [44]:
#design input layer
user_input = keras.Input(shape = (1,), name="user")
game_input = keras.Input(shape = (1,), name="game")

#create embedding vectors representing each user and each game
user_embeddings = layers.Embedding(input_dim = num_users, output_dim = latent_dim, name = 'user_embedding',
                                  embeddings_initializer="uniform", input_length=1)(user_input)
game_embeddings = layers.Embedding(input_dim = num_games, output_dim = latent_dim, name = 'game_embedding',
                                  embeddings_initializer="uniform", input_length=1)(game_input)

#add dropout to both embedding layers
user_dropout = layers.Dropout(rate=.5, name='user_dropout')(user_embeddings)
game_dropout = layers.Dropout(rate=.5, name='game_dropout')(game_embeddings)

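#flatten each (1, latent_dim) embedding into a vector of length latent_dim
user_latent = layers.Flatten(name = "user_flatten")(user_dropout)
game_latent = layers.Flatten(name = "game_flatten")(game_dropout)

#element-wise product of the latent vectors (Multiply does not sum the entries, so this is not yet a dot product; the dense layers below learn how to combine them)
predict_vector = layers.Multiply(name = "dot_product")([user_latent, game_latent])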

#one hidden fully connected layer after dot product
dense_layer = layers.Dense(fc_dim, name = "fully_connected")(predict_vector)

#dropout after fully connected to help against overfitting
dense_dropout = layers.Dropout(rate=.5, name = "fc_dropout")(dense_layer)

#final dense layer
prediction = layers.Dense(1, activation= 'sigmoid', kernel_initializer= 'lecun_uniform', name = 'prediction')(dense_dropout)

#Build the model
model = keras.Model(inputs = [user_input, game_input], outputs = prediction, name='Matrix Factorization')

#print summary
model.summary()

Model: "Matrix Factorization"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
user (InputLayer)               [(None, 1)]          0                                            
__________________________________________________________________________________________________
game (InputLayer)               [(None, 1)]          0                                            
__________________________________________________________________________________________________
user_embedding (Embedding)      (None, 1, 100)       100000      user[0][0]                       
__________________________________________________________________________________________________
game_embedding (Embedding)      (None, 1, 100)       50000       game[0][0]                       
_______________________________________________________________________________

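A note on the layer named `dot_product` in the model above: `layers.Multiply` returns the element-wise product of the two latent vectors, which is still a vector of length `latent_dim`. The NumPy sketch below (random vectors, illustrative only) shows how a bias-free dense layer with a single output relates this to a true dot product. With all weights fixed at 1 the two coincide, and with learned weights the layer generalizes it.

```python
import numpy as np

rng = np.random.default_rng(1)
latent_dim = 100  # same value as latent_dim in the notebook

user_vec = rng.normal(size=latent_dim)
game_vec = rng.normal(size=latent_dim)

# What an element-wise multiply computes: a vector, not a scalar
elementwise = user_vec * game_vec
print(elementwise.shape)

# A single-output dense layer without bias is a weighted sum of its input.
# With every weight equal to 1 it reduces to the plain dot product.
ones_weights = np.ones(latent_dim)
dense_output = elementwise @ ones_weights
print(np.isclose(dense_output, user_vec @ game_vec))
```

This is why stacking dense layers on top of the element-wise product can be read as a generalization of classic matrix factorization rather than a replacement for it.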