# NCF Recommender

A lot of recommender systems are built on some form of matrix factorization, but in recent research the most popular approach seems to be doing matrix factorization through some kind of neural network. This keeps the usual benefits of matrix factorization (scaling, complex user/item relationships, etc.) while also capturing both linear and nonlinear relationships.

Neural Collaborative Filtering (NCF) is one of the most popular algorithms in this area. We'll build a simple NCF model on the MovieLens 20M dataset, a standard benchmark for product recommendation algorithms.

# Building Recommender Systems using Implicit Feedback

Because product recommendation only really took off after the Netflix Prize, a lot of product recommendation algorithms were originally built on "explicit" feedback (meaning 1-5 star ratings). That said, explicit feedback is pretty rare in the real world. Most feedback is, in truth, implicit (1 if the user interacted with an item, 0 otherwise), and that's the type of data we will use in this notebook.

# Data Preprocessing

Before we start building and training our model, let's do some preprocessing to get the data in the required format.

In [None]:
import pandas as pd
import numpy as np
from tqdm.notebook import tqdm

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
import pytorch_lightning as pl

np.random.seed(123)

In [None]:
ratings = pd.read_csv('movielens-20m-dataset/rating.csv', 
                      parse_dates=['timestamp'])

Since this is being run on Colaboratory, let's limit the data to a random 30% of users.

In [None]:
rand_userIds = np.random.choice(ratings['userId'].unique(), 
                                size=int(len(ratings['userId'].unique())*0.3), 
                                replace=False)

ratings = ratings.loc[ratings['userId'].isin(rand_userIds)]

print('There are {} rows of data from {} users'.format(len(ratings), len(rand_userIds)))

In [None]:
ratings.sample(5)

We now have 6,027,314 rows of data from 41,547 users.

### Train-test split

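A regular random train-test split wouldn't really make sense here, since the order of interactions is particularly relevant. Instead, we split on the timestamp using a leave-one-out scheme: each user's most recent interaction is held out for testing, and everything earlier is used for training. That last interaction is what the model needs to "predict" correctly.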

There's a lot more nuance to this in the real world, but this is good enough for a start.
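
To make the leave-one-out idea concrete, here is a minimal sketch on a toy interaction log. The data values and the names `toy`, `toy_train` and `toy_test` are made up for illustration; only the ranking approach mirrors what this notebook does.

```python
import pandas as pd

# Toy interaction log (hypothetical values, not taken from MovieLens)
toy = pd.DataFrame({
    "userId":    [1, 1, 1, 2, 2],
    "movieId":   [10, 20, 30, 10, 40],
    "timestamp": pd.to_datetime([
        "2020-01-01", "2020-01-05", "2020-01-03",
        "2020-02-01", "2020-02-02",
    ]),
})

# Rank each user's interactions from newest (rank 1) to oldest
toy["rank_latest"] = (
    toy.groupby("userId")["timestamp"].rank(method="first", ascending=False)
)

# The newest interaction per user is held out; the rest is training data
toy_test = toy[toy["rank_latest"] == 1]
toy_train = toy[toy["rank_latest"] != 1]

print(toy_test[["userId", "movieId"]])
```

Each user ends up with exactly one test row (their latest interaction), so the test set has as many rows as there are users.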

![](https://i.imgur.com/oNJnLqU.png)
> **Movie posters from themoviedb.org**





In [None]:
ratings['rank_latest'] = ratings.groupby(['userId'])['timestamp'] \
                                .rank(method='first', ascending=False)

train_ratings = ratings[ratings['rank_latest'] != 1]
test_ratings = ratings[ratings['rank_latest'] == 1]

# drop columns that we no longer need
train_ratings = train_ratings[['userId', 'movieId', 'rating']]
test_ratings = test_ratings[['userId', 'movieId', 'rating']]

### Converting the dataset into an implicit feedback dataset

Now we're going to turn this into an implicit feedback dataset. This does bring a few issues to the front, mainly that if a movie is not rated it's treated as a 0 (meaning not interested), which isn't necessarily true. That said, this approach has been shown to work in academia and industry, so we can accept it for our modeling.

There's also the problem of how many negative examples to include (there are always far more negatives than positives). For now we're going to sample 4 negatives for every positive.
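
As a rough illustration of the 4:1 negative sampling idea, the sketch below builds labelled rows for a single toy user. The helper `sample_negatives`, the 100-item catalogue and the seed are all assumptions made for this example, not part of the notebook's pipeline.

```python
import numpy as np

def sample_negatives(user, seen_items, all_items, num_negatives, rng):
    """Draw `num_negatives` items the user has not interacted with.

    Hypothetical helper for illustration: it rejects any candidate
    that appears in `seen_items` and labels the rest as 0.
    """
    negatives = []
    while len(negatives) < num_negatives:
        candidate = int(rng.choice(all_items))
        if candidate not in seen_items:
            negatives.append(candidate)
    return [(user, item, 0) for item in negatives]

rng = np.random.default_rng(0)
all_items = np.arange(100)   # toy catalogue of 100 item IDs
seen = {3, 7, 42}            # items this toy user interacted with

# One positive row per seen item, followed by 4 sampled negatives each
rows = [(1, item, 1) for item in sorted(seen)]
for _ in seen:
    rows.extend(sample_negatives(1, seen, all_items, 4, rng))

print(len(rows), "rows,", sum(label for _, _, label in rows), "positives")
```

With 3 positives and 4 negatives per positive, the toy user contributes 15 training rows in total.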






In [None]:
# Every observed interaction becomes a positive label (1), regardless of the star rating
train_ratings.loc[:, 'rating'] = 1

train_ratings.sample(5)

In [None]:
# Get a list of all movie IDs
all_movieIds = ratings['movieId'].unique()

# Placeholders that will hold the training data
users, items, labels = [], [], []

# This is the set of items that each user has interaction with
user_item_set = set(zip(train_ratings['userId'], train_ratings['movieId']))

# 4:1 ratio of negative to positive samples
num_negatives = 4

for (u, i) in tqdm(user_item_set):
    users.append(u)
    items.append(i)
    labels.append(1) # items that the user has interacted with are positive
    for _ in range(num_negatives):
        # randomly select an item
        negative_item = np.random.choice(all_movieIds) 
        # check that the user has not interacted with this item
        while (u, negative_item) in user_item_set:
            negative_item = np.random.choice(all_movieIds)
        users.append(u)
        items.append(negative_item)
        labels.append(0) # items not interacted with are negative

Now let's create a PyTorch dataset class for training. It wraps the same negative sampling logic shown above.

In [None]:
class MovieLensTrainDataset(Dataset):
    """MovieLens PyTorch Dataset for Training
    
    Args:
        ratings (pd.DataFrame): Dataframe containing the movie ratings
        all_movieIds (list): List containing all movieIds
    
    """

    def __init__(self, ratings, all_movieIds):
        self.users, self.items, self.labels = self.get_dataset(ratings, all_movieIds)

    def __len__(self):
        return len(self.users)
  
    def __getitem__(self, idx):
        return self.users[idx], self.items[idx], self.labels[idx]

    def get_dataset(self, ratings, all_movieIds):
        users, items, labels = [], [], []
        user_item_set = set(zip(ratings['userId'], ratings['movieId']))

        num_negatives = 4
        for u, i in user_item_set:
            users.append(u)
            items.append(i)
            labels.append(1)
            for _ in range(num_negatives):
                negative_item = np.random.choice(all_movieIds)
                while (u, negative_item) in user_item_set:
                    negative_item = np.random.choice(all_movieIds)
                users.append(u)
                items.append(negative_item)
                labels.append(0)

        return torch.tensor(users), torch.tensor(items), torch.tensor(labels)

# Our model - Neural Collaborative Filtering (NCF)

As mentioned before, we're going to use NCF ([He et al.](https://arxiv.org/abs/1708.05031)) to model this dataset.

<img src="https://recodatasets.z20.web.core.windows.net/images/NCF.svg?sanitize=true">

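It's a pretty popular model that builds on past research on matrix factorization and neural networks. Fundamentally, we're building 2 embedding layers, one for users and one for items, which we put through a "shallow layer" and a "deep layer" in order to make our predictions. The simplified implementation in this notebook keeps the embeddings and the "deep" (fully connected) path only.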
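
For reference, here is a minimal sketch of how a "shallow" element-wise product path and a "deep" MLP path could be combined in plain PyTorch, in the spirit of the NeuMF variant from the paper. The class name `TwoPathNCF`, the layer sizes and the toy IDs are assumptions chosen for illustration; this is not the model trained in this notebook.

```python
import torch
import torch.nn as nn

class TwoPathNCF(nn.Module):
    """Sketch of an NCF variant with a shallow (GMF) and a deep (MLP) path."""

    def __init__(self, num_users, num_items, dim=8):
        super().__init__()
        # Separate embeddings for each path (sizes are illustrative)
        self.user_gmf = nn.Embedding(num_users, dim)
        self.item_gmf = nn.Embedding(num_items, dim)
        self.user_mlp = nn.Embedding(num_users, dim)
        self.item_mlp = nn.Embedding(num_items, dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, 32), nn.ReLU(),
            nn.Linear(32, 16), nn.ReLU(),
        )
        # Final layer sees the concatenation of both paths
        self.output = nn.Linear(dim + 16, 1)

    def forward(self, users, items):
        # Shallow path: element-wise product of user and item embeddings
        gmf = self.user_gmf(users) * self.item_gmf(items)
        # Deep path: concatenated embeddings passed through an MLP
        mlp = self.mlp(
            torch.cat([self.user_mlp(users), self.item_mlp(items)], dim=-1)
        )
        return torch.sigmoid(self.output(torch.cat([gmf, mlp], dim=-1)))

model_sketch = TwoPathNCF(num_users=5, num_items=7)
probs = model_sketch(torch.tensor([0, 1, 4]), torch.tensor([2, 6, 0]))
print(probs.shape)  # torch.Size([3, 1])
```

The output is one interaction probability per (user, item) pair, which is the same shape the simpler model in this notebook produces.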


In [None]:
class NCF(pl.LightningModule):
    """ Neural Collaborative Filtering (NCF)
    
        Args:
            num_users (int): Number of unique users
            num_items (int): Number of unique items
            ratings (pd.DataFrame): Dataframe containing the movie ratings for training
            all_movieIds (list): List containing all movieIds (train + test)
    """
    
    def __init__(self, num_users, num_items, ratings, all_movieIds):
        super().__init__()
        self.user_embedding = nn.Embedding(num_embeddings=num_users, embedding_dim=8)
        self.item_embedding = nn.Embedding(num_embeddings=num_items, embedding_dim=8)
        self.fc1 = nn.Linear(in_features=16, out_features=64)
        self.fc2 = nn.Linear(in_features=64, out_features=32)
        self.output = nn.Linear(in_features=32, out_features=1)
        self.ratings = ratings
        self.all_movieIds = all_movieIds
        
    def forward(self, user_input, item_input):
        
        # Pass through embedding layers
        user_embedded = self.user_embedding(user_input)
        item_embedded = self.item_embedding(item_input)

        # Concat the two embedding layers
        vector = torch.cat([user_embedded, item_embedded], dim=-1)

        # Pass through dense layer
        vector = nn.ReLU()(self.fc1(vector))
        vector = nn.ReLU()(self.fc2(vector))

        # Output layer
        pred = nn.Sigmoid()(self.output(vector))

        return pred
    
    def training_step(self, batch, batch_idx):
        user_input, item_input, labels = batch
        predicted_labels = self(user_input, item_input)
        loss = nn.BCELoss()(predicted_labels, labels.view(-1, 1).float())
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())

    def train_dataloader(self):
        return DataLoader(MovieLensTrainDataset(self.ratings, self.all_movieIds),
                          batch_size=512, num_workers=4)

In [None]:
num_users = ratings['userId'].max()+1
num_items = ratings['movieId'].max()+1

all_movieIds = ratings['movieId'].unique()

model = NCF(num_users, num_items, train_ratings, all_movieIds)

Let's train our NCF model for 5 epochs using the GPU.

In [None]:
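# These argument names follow the PyTorch Lightning 1.x API; newer releases
# renamed or removed some of them (e.g. accelerator="gpu", devices=1 instead of gpus=1).
# Reloading the dataloader every epoch re-draws the negative samples.
trainer = pl.Trainer(max_epochs=5, gpus=1, reload_dataloaders_every_epoch=True,
                     progress_bar_refresh_rate=50, logger=False, checkpoint_callback=False)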

trainer.fit(model)

# Evaluating our Recommender System

Now let's evaluate the model. Generally, product recommendations are used in the following way in the real world:

![](https://i.imgur.com/XZZ2Ni8.png)

You get a list of recommendations (not just one), and if you click or buy an item on the list, it's a "hit". This is what we want to emulate in our metrics.

To emulate this, let's do the following:

* For each user, randomly select 99 items that the user **has not interacted with**
* Combine these 99 items with the test item (the actual item that the user interacted with). We now have 100 items.
* Run the model on these 100 items, and rank them according to their predicted probabilities
* Select the top 10 items from the list of 100 items. If the test item is present within the top 10 items, then we say that this is a hit.
* Repeat the process for all users. The Hit Ratio is then the average hits.

This model metric is usually called **Hit Ratio @ 10**.
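
The steps above can be sketched independently of the model. In the toy example below, the scores are made-up numbers standing in for model predictions, and `hit_at_k` is a hypothetical helper name; the point is only to show how a hit is decided and averaged.

```python
import numpy as np

def hit_at_k(scores, test_index, k=10):
    """Return 1 if the candidate at `test_index` is among the k highest scores, else 0."""
    top_k = np.argsort(scores)[::-1][:k]
    return int(test_index in top_k)

# Toy example: 100 candidates per user, the held-out item sits at index 99
scores_user_a = np.linspace(0.0, 0.99, 100)   # held-out item gets the highest score
scores_user_b = np.linspace(0.99, 0.0, 100)   # held-out item gets the lowest score

toy_hits = [hit_at_k(scores_user_a, 99), hit_at_k(scores_user_b, 99)]
hit_ratio = np.mean(toy_hits)
print(hit_ratio)  # 0.5
```

One of the two toy users sees their held-out item in the top 10, so the Hit Ratio @ 10 is 0.5.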

### Hit Ratio @ 10 



In [None]:
# User-item pairs for testing
test_user_item_set = set(zip(test_ratings['userId'], test_ratings['movieId']))

# Dict of all items that are interacted with by each user
user_interacted_items = ratings.groupby('userId')['movieId'].apply(list).to_dict()

# Switch to inference mode; gradients are not needed for evaluation
model.eval()

hits = []
for (u, i) in tqdm(test_user_item_set):
    interacted_items = user_interacted_items[u]
    not_interacted_items = set(all_movieIds) - set(interacted_items)
    # Sample 99 distinct negatives (replace=False avoids duplicate candidates)
    selected_not_interacted = list(np.random.choice(list(not_interacted_items), 99,
                                                    replace=False))
    test_items = selected_not_interacted + [i]

    with torch.no_grad():
        predicted_labels = np.squeeze(model(torch.tensor([u]*100),
                                            torch.tensor(test_items)).numpy())

    # Items with the 10 highest predicted probabilities
    top10_items = [test_items[j] for j in np.argsort(predicted_labels)[::-1][0:10].tolist()]

    # A hit means the held-out item made it into the top 10
    hits.append(1 if i in top10_items else 0)

print("The Hit Ratio @ 10 is {:.2f}".format(np.average(hits)))

To put this into context, a Hit Ratio @ 10 of 0.86 means that 86% of the users were recommended the actual item (among a list of 10 items) that they eventually interacted with. This is honestly better than what is likely in the real world, but not bad for a practice model.