In [1]:
import fastbook
fastbook.setup_book()

In [2]:
from fastbook import *
from fastai.collab import *
from fastai.tabular.all import *

In [3]:
# Downloading the data the usual way
path = untar_data(URLs.ML_100k)

# Extracting the Ratings
ratings = pd.read_csv(path/'u.data', delimiter='\t', header=None,
                      names=['user', 'movie', 'rating', 'timestamp'])

# Extracting the Movie Titles
movies = pd.read_csv(path/'u.item', delimiter='|', encoding='latin-1',
                    usecols=(0,1), names=('movie', 'title'), header=None)

# Merging the two dataframes
ratings = ratings.merge(movies)

# Creating our DataLoaders
dls = CollabDataLoaders.from_df(ratings,
                                 user_name='user',
                                 item_name='title',
                                 rating_name = 'rating',
                                 bs=64)

# Initialising our Latent Factors
n_users = len(dls.classes['user'])
n_movies = len(dls.classes['title'])
n_factors = 5

user_factors = torch.randn(n_users, n_factors)
movie_factors = torch.randn(n_movies, n_factors)

# Collaborative Filtering Deep Dive

## Deep Learning for Collaborative Filtering

Our dot product approach to collaborative filtering is known as *probabilistic matrix factorisation (PMF)*. It works quite well and is the basis of many successful real-world recommendation systems. Another approach, which generally works similarly well given the same data, is deep learning.

To turn our architecture into a deep learning model, we need to take the results of the embedding lookup and concatenate these activations together. This gives us a matrix which we can then pass through linear layers and nonlinearities in the usual way. Since we'll be concatenating the embeddings, rather than taking their dot product, the two embedding matrices can have different sizes (i.e., different numbers of latent factors).
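As a minimal sketch of that idea in plain PyTorch, two embedding tables of different widths can be looked up and joined along the feature dimension. The table sizes, embedding widths, and index batch below are made up for illustration and are not the MovieLens values.

```python
import torch
from torch import nn

# Hypothetical sizes, chosen only for illustration.
n_users, n_items = 10, 20
user_emb = nn.Embedding(n_users, 3)   # 3 latent factors per user
item_emb = nn.Embedding(n_items, 5)   # 5 latent factors per item

# A batch of (user index, item index) pairs, shaped like the model's input.
x = torch.tensor([[0, 4], [7, 19], [3, 3]])

u = user_emb(x[:, 0])                 # shape: (3, 3)
m = item_emb(x[:, 1])                 # shape: (3, 5)
combined = torch.cat([u, m], dim=1)   # shape: (3, 8)
print(combined.shape)
```

A dot product would need `u` and `m` to have the same width. Concatenation only adds the widths, so the first linear layer would take 3 + 5 = 8 inputs here.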

fastai has a function `get_emb_sz` that returns recommended sizes for embedding matrices for your data, based on a heuristic.



In [6]:
embs = get_emb_sz(dls)
embs

[(944, 74), (1665, 102)]
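The sizes above are consistent with a simple rule of thumb in which the embedding width grows sub-linearly with the number of categories. The sketch below assumes the formula `min(600, round(1.6 * n_cat**0.56))`; the exact constants are an assumption, so check the fastai source for your installed version. The helper name `emb_size_rule` is hypothetical.

```python
def emb_size_rule(n_cat):
    """Assumed rule of thumb: embedding width grows sub-linearly with cardinality."""
    return min(600, round(1.6 * n_cat ** 0.56))

# Category counts taken from the output above (944 users, 1665 titles).
sizes = [(n, emb_size_rule(n)) for n in (944, 1665)]
print(sizes)
```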

Implementing this as a class, as before, we get the following.

In [7]:
class CollabNN(Module):
    def __init__(self, user_sz, item_sz, y_range=(0,5.5), n_act=100):
        self.user_factors = Embedding(*user_sz)
        self.item_factors = Embedding(*item_sz)
        self.layers = nn.Sequential(
            nn.Linear(user_sz[1] + item_sz[1], n_act),
            nn.ReLU(),
            nn.Linear(n_act, 1))
        self.y_range = y_range
    
    def forward(self, x):
        embs = self.user_factors(x[:,0]),self.item_factors(x[:,1])
        x = self.layers(torch.cat(embs, dim=1))
        return sigmoid_range(x, *self.y_range)

And using this to create a model:

In [8]:
model = CollabNN(*embs)

`CollabNN` creates our `Embedding` layers in the same way as previous classes in this chapter, except that we now use the `embs` sizes.

`self.layers` is identical to the mini-neural net we created for MNIST.

In `forward`, we apply the embeddings, concatenate the results, and pass this through the mini-neural net.

Finally, we apply `sigmoid_range` as we have in previous models.
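To make the role of `sigmoid_range` concrete, here is a scalar sketch in plain Python. It assumes the function squashes its input with a sigmoid and rescales the result to the interval between `low` and `high`. The name `sigmoid_range_scalar` is hypothetical and stands in for the tensor version used in the model.

```python
import math

def sigmoid_range_scalar(x, low, high):
    """Map any real number into the open interval (low, high)."""
    return 1 / (1 + math.exp(-x)) * (high - low) + low

# Raw activations far below, at, and far above zero.
for raw in (-4.0, 0.0, 4.0):
    print(raw, round(sigmoid_range_scalar(raw, 0, 5.5), 3))
```

Because a sigmoid only approaches its bounds, an upper limit of 5.5 rather than 5 lets the model predict a rating of 5 with a finite activation.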

To train the model, we do the following:

In [9]:
learn = Learner(dls, model, loss_func=MSELossFlat())
learn.fit_one_cycle(5, 5e-3, wd=0.01)

epoch,train_loss,valid_loss,time
0,0.933598,0.952186,00:07
1,0.898114,0.905402,00:06
2,0.870572,0.882016,00:08
3,0.818981,0.87063,00:06
4,0.756483,0.873019,00:06


### Deep Learning using fastai's `collab_learner`

fastai provides this model in `fastai.collab`: pass `use_nn=True` in your call to `collab_learner`, which calls `get_emb_sz` for you and lets you easily create more layers.

For instance, the architecture below has two hidden layers of size 100 and 50, respectively.

In [10]:
learn = collab_learner(dls, use_nn=True, y_range=(0, 5.5), layers=[100,50])
learn.fit_one_cycle(5, 5e-3, wd=0.1)

epoch,train_loss,valid_loss,time
0,0.969892,0.97392,00:11
1,0.921386,0.914596,00:07
2,0.886867,0.889625,00:08
3,0.834015,0.868451,00:09
4,0.758782,0.873408,00:08
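As a rough sketch of what `layers=[100,50]` implies for the shape of the network, the snippet below lists the input and output size of each linear layer. It assumes the embedding widths returned by `get_emb_sz` earlier (74 and 102) and a single output, and it ignores any extra layers fastai may insert between the linear ones (such as batch normalisation or dropout). `linear_shapes` is a hypothetical helper.

```python
def linear_shapes(n_in, hidden, n_out=1):
    """Return (in_features, out_features) for each linear layer in the stack."""
    sizes = [n_in, *hidden, n_out]
    return list(zip(sizes[:-1], sizes[1:]))

# 74 user factors + 102 title factors, concatenated.
shapes = linear_shapes(74 + 102, [100, 50])
print(shapes)  # [(176, 100), (100, 50), (50, 1)]
```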


Although the results of this neural network model (fastai's `EmbeddingNN`, which is what `collab_learner` builds when `use_nn=True`) are a bit worse than the dot product approach (which shows the power of carefully constructing an architecture for a domain), it does allow us to do something very important: we can now directly incorporate other user and movie information, date and time information, or any other information that may be relevant to the recommendation.
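As a minimal sketch of what that could look like, the variant below concatenates extra numeric features (for example, a normalised timestamp) with the two embeddings before the linear layers. The class name `CollabNNWithExtras`, the `n_extra` parameter, and the toy inputs are all hypothetical, and it is written in plain PyTorch rather than with fastai's helpers.

```python
import torch
from torch import nn

class CollabNNWithExtras(nn.Module):
    def __init__(self, user_sz, item_sz, n_extra, y_range=(0, 5.5), n_act=100):
        super().__init__()
        self.user_factors = nn.Embedding(*user_sz)
        self.item_factors = nn.Embedding(*item_sz)
        # The first linear layer is widened by the number of extra features.
        self.layers = nn.Sequential(
            nn.Linear(user_sz[1] + item_sz[1] + n_extra, n_act),
            nn.ReLU(),
            nn.Linear(n_act, 1))
        self.y_range = y_range

    def forward(self, x_cat, x_extra):
        # x_cat: (batch, 2) integer indices; x_extra: (batch, n_extra) floats
        embs = self.user_factors(x_cat[:, 0]), self.item_factors(x_cat[:, 1])
        x = self.layers(torch.cat([*embs, x_extra], dim=1))
        low, high = self.y_range
        return torch.sigmoid(x) * (high - low) + low

model = CollabNNWithExtras((944, 74), (1665, 102), n_extra=1)
x_cat = torch.tensor([[1, 10], [2, 20]])
x_extra = torch.tensor([[0.25], [0.75]])   # made-up normalised timestamps
preds = model(x_cat, x_extra)
print(preds.shape)
```

Training this with a `Learner` would also require batches that supply both inputs, which is not shown here.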