# Recommendations Systems
## Assignment 3:  Neural Collaborative Filtering

**By:**  
Group 16

<br><br>

**The goal of this assignment is to:**
- Understand the concept of recommendations based on implicit data, which is very common in real life.
- Understand how DL components can be used to implement collaborative filtering and hybrid recommenders.
- Understand the pros and cons compared to other recommender system approaches.
- Practice recommender system training and evaluation.

**Instructions:**
- Students will form teams of two people each, and submit a single homework for each team.
- The same score for the homework will be given to each member of the team.
- Submit your solution as a Jupyter notebook file (with the extension ipynb).
- Images/Graphs/Tables should be submitted inside the notebook.
- The notebook should be runnable and properly documented. 
- Please answer all the questions and include all your code.
- English only.

**Submission:**
- Submission of the homework will be done via Moodle by uploading a link to Google Colab.
- The homework needs to be entirely in English.
- The deadline for submission is on Moodle.

**Requirements:**  
- Python 3.6+ should be used. 
- You may use Torch/Keras/TF packages.
- You should implement the recommender system by yourself using only basic Python libraries (such as numpy).

**LINKS:**
- <a href='https://github.com/hexiangnan/neural_collaborative_filtering/tree/master/Data'>Dataset</a>
- <a href='https://github.com/hexiangnan/neural_collaborative_filtering'>Repository</a>
- <a href='https://towardsdatascience.com/paper-review-neural-collaborative-filtering-explanation-implementation-ea3e031b7f96'>Blog Post Review</a>
<br>

**Google <a href='https://colab.research.google.com/'>Colaboratory</a>**  
        
    This is a great academic tool for students. Instead of installing and running "everything" on your Laptop - which probably will take you a lot of time - you can use Google Colab.  
    Basically, you can use it for all your Python needs.  

**PyTorch <a href='https://pytorch.org/tutorials/beginner/basics/intro.html'>Tutorials</a>**   
    
    Just follow steps 0-7 and you will have the basic skills to understand, build, and run DL recommender models. 

**Keras Kaggle's <a href='https://www.kaggle.com/learn/intro-to-deep-learning'>intro-to-deep-learning</a>**  
    
    This will give you a quick idea of what DL is, and how to utilize it.  
    They're using TensorFlow, while in our MLDL program we're using PyTorch.  




**Grading:**

- Q1 - 20 points - Dataset Preparation
- Q2 - 50 points - Neural Collaborative Filtering
- Q3 - 30 points - Loss Function

`Total: 100`

<br><br><br>

**Prerequisites**

In [None]:
%pip install torch torchvision --quiet

**Imports**

In [None]:
# basic
import os 
import sys
import math
import heapq
import argparse
from time import time
import multiprocessing

# general
import warnings
import numpy as np
import pandas as pd
import scipy.sparse as sp

# visual
import matplotlib
import seaborn as sns
import matplotlib.pyplot as plt

# visual 3D
from mpl_toolkits import mplot3d

# notebook
from IPython.display import display, HTML


# torch
import torch
from torch import nn
import torch.nn.functional as F
from torch.nn import Sequential
from torch.nn import Sigmoid,ReLU
from torch.nn import Embedding,Linear,Dropout
from torch.utils.data import DataLoader, Dataset
from torchvision.transforms import ToTensor,Compose
from torch.optim import SparseAdam,Adam,Adagrad,SGD

# colab
# from google.colab import drive  

**Hide Warnings**

In [None]:
warnings.filterwarnings('ignore')

**Disable Autoscrolling**

In [None]:
%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) {
    return false;
};

<br><br><br>
<br><br><br>
<br><br><br>

## Question 1: Dataset Preparation (Ingestion)

---


<br><br>

This implementation contains one file for training and two files for testing:   
- ml-1m.train.rating   
- ml-1m.test.rating  
- ml-1m.test.negative   

<br>
(feel free to use visual explanations)
<br>

a. **Explain** the role and structure of each file and how it was created from the original <a href='https://github.com/hexiangnan/neural_collaborative_filtering/tree/master/Data'>MovieLens 1M rating dataset</a>.

b. **Explain** how the training dataset is created.

- We iterate through each line of the train file.
- We record the largest user id and item id, which determine the number of users (N) and items (M).
- We create a sparse N×M matrix (users × items).
- We go through the file again and, for each user-item pair, check whether its rating is positive.
- If the rating is positive, we set the matching matrix entry to 1 as a binary indicator of implicit feedback.

c. **Explain** how the test dataset is created.

The test dataset is created in a similar manner to the train dataset, except for the negative samples, which are stored as a list of lists.
Each line of the negatives file contains 99 negative samples (the user id matches the index of the line).
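
To make the layout of the negatives file concrete, here is a minimal sketch that parses a single line. It assumes each line starts with a `(user,item)` field followed by tab-separated negative item ids; the sample line and the helper name `parse_negative_line` are hypothetical and only serve as an illustration.

```python
def parse_negative_line(line):
    """Split one tab-separated line into (user, positive_item, negatives)."""
    fields = line.rstrip("\n").split("\t")
    # Assumed layout: the first field looks like "(user,item)"
    user_str, item_str = fields[0].strip("()").split(",")
    negatives = [int(x) for x in fields[1:]]
    return int(user_str), int(item_str), negatives


# Hypothetical, shortened line (a real line would hold 99 negatives)
sample_line = "(0,25)\t1064\t174\t2791"
user, pos_item, negs = parse_negative_line(sample_line)
print(user, pos_item, negs)
```

The `load_negatives_vector` function below keeps only the negatives and relies on the line order to recover the user id.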

#### Data Preparation:

In [5]:
# urls
train_url = "https://github.com/hexiangnan/neural_collaborative_filtering/blob/master/Data/ml-1m.train.rating?raw=true"
test_url = "https://github.com/hexiangnan/neural_collaborative_filtering/blob/master/Data/ml-1m.test.rating?raw=true"
test_neg_url = "https://github.com/hexiangnan/neural_collaborative_filtering/blob/master/Data/ml-1m.test.negative?raw=true"


In [9]:
import requests

def get_num_users_and_items(data_lines):
    num_users = 0
    num_items = 0
    for line in data_lines:
        line_values = line.split("\t")
        user_id = int(line_values[0])
        item_id = int(line_values[1])
        num_users = max(num_users, user_id)
        num_items = max(num_items, item_id)

    return num_users, num_items

def generate_matrix(data_lines, num_users, num_items):
    matrix = sp.dok_matrix((num_users + 1, num_items + 1), dtype=np.float32)
    for line in data_lines:
        line_values = line.split("\t")
        user_id = int(line_values[0])
        item_id = int(line_values[1])
        rating = float(line_values[2])
        if rating > 0:
            matrix[user_id, item_id] = 1.0
    
    return matrix
    

def load_data_as_matrix(url):
    # download data with requests
    response = requests.get(url)
    # Read as a text file
    data = response.text
    # Split by lines
    data_lines = data.splitlines()
    # Get number of users and items
    num_users, num_items = get_num_users_and_items(data_lines)
    # Construct matrix
    mat = generate_matrix(data_lines, num_users, num_items)
    
    return mat

def load_negatives_vector(url):
    # download data with requests
    response = requests.get(url)
    # Read as a text file
    data = response.text
    # Split by lines
    data_lines = data.splitlines()
    # Construct vector
    vector = []
    for line in data_lines:
        line_values = line.split("\t")
        vector.append([int(x) for x in line_values[1:]])
    
    return vector
     


In [10]:
train_matrix = load_data_as_matrix(train_url)
test_matrix = load_data_as_matrix(test_url)
test_neg_vector = load_negatives_vector(test_neg_url)

num_users, num_items = train_matrix.shape

<br><br><br>
<br><br><br>
<br><br><br>

## Question 2: Neural Collaborative Filtering 
<br><br>

## a. Build the following four models using the neural collaborative filtering approach: 
- Matrix Factorization (MF)
- Multi-Layer Perceptron (MLP)
- Generalized Matrix Factorization (GMF) 
- Neural Matrix Factorization (NMF)

**For each model, use the best hyper-parameters suggested in the NeuMF paper.**

<br><br><br><br>
#### Matrix Factorization (MF)  
<br>

In [62]:
class MF(nn.Module):
    def __init__(self, num_users, num_items, embedding_size=32):
        super(MF, self).__init__()
        self.num_users = num_users
        self.num_items = num_items
        self.emb_size = embedding_size
        self.emb_item = nn.Embedding(num_embeddings=num_items, embedding_dim=self.emb_size)
        self.emb_user = nn.Embedding(num_embeddings=num_users, embedding_dim=self.emb_size)
        self._init_weights()

    def _init_weights(self):
        nn.init.normal_(self.emb_user.weight, std=0.01)
        nn.init.normal_(self.emb_item.weight, std=0.01)

    def forward(self, users, items):
        emb_user = self.emb_user(users)
        emb_item = self.emb_item(items)
        # Per-pair dot product: one score for each (user, item) pair in the batch
        return (emb_user * emb_item).sum(dim=1)

In [63]:
model_MF = MF(num_users, num_items, embedding_size=32)

Model's architecture:

In [64]:
# display/print the model architecture
print(model_MF)

MF(
  (emb_item): Embedding(3706, 32)
  (emb_user): Embedding(6040, 32)
)


<br><br><br><br><br><br>
#### Multi Layer Perceptron (MLP)

In [70]:
class MLP(nn.Module):
    def __init__(self, num_users, num_items, embedding_size=16, mlp_layers_sizes=[32, 16, 8], dropout=0.1):
        super(MLP, self).__init__()
        self.num_users = num_users
        self.num_items = num_items
        self.embedding_size = embedding_size
        self.mlp_layers_sizes = mlp_layers_sizes
        self.dropout = dropout

        self.user_embedding = nn.Embedding(num_embeddings=num_users, embedding_dim=embedding_size)
        self.item_embedding = nn.Embedding(num_embeddings=num_items, embedding_dim=embedding_size)
        # The concatenated user and item embeddings feed a tower of Linear/ReLU/Dropout blocks
        self.mlp_layers = nn.Sequential(
            nn.Linear(2 * embedding_size, mlp_layers_sizes[0]),
            nn.ReLU(),
            nn.Dropout(p=dropout),
            nn.Linear(mlp_layers_sizes[0], mlp_layers_sizes[1]),
            nn.ReLU(),
            nn.Dropout(p=dropout),
            nn.Linear(mlp_layers_sizes[1], mlp_layers_sizes[2]),
            nn.ReLU(),
            nn.Dropout(p=dropout),
            nn.Linear(mlp_layers_sizes[2], 1)
        )
        self.activation = nn.Sigmoid()

    def forward(self, user_ids, item_ids):
        user_embedding = self.user_embedding(user_ids)
        item_embedding = self.item_embedding(item_ids)
        input_vector = torch.cat([user_embedding, item_embedding], dim=1)
        output = self.mlp_layers(input_vector)
        output = self.activation(output)
        return output

In [73]:
model_MLP = MLP(num_users, num_items)

Model's architecture:

In [72]:
# display/print the model architecture
print(model_MLP)

MLP(
  (user_embedding): Embedding(6040, 16)
  (item_embedding): Embedding(3706, 16)
  (mlp_layers): Sequential(
    (0): Linear(in_features=32, out_features=32, bias=True)
    (1): ReLU()
    (2): Dropout(p=0.1, inplace=False)
    (3): Linear(in_features=32, out_features=16, bias=True)
    (4): ReLU()
    (5): Dropout(p=0.1, inplace=False)
    (6): Linear(in_features=16, out_features=8, bias=True)
    (7): ReLU()
    (8): Dropout(p=0.1, inplace=False)
    (9): Linear(in_features=8, out_features=1, bias=True)
  )
  (activation): Sigmoid()
)


<br><br><br><br><br><br>
#### Generalized Matrix Factorization (GMF)

In [74]:
class GMF(nn.Module):
    def __init__(self, num_users, num_items, embedding_size=32):
        super(GMF, self).__init__()
        self.num_users = num_users
        self.num_items = num_items
        self.emb_size = embedding_size
        self.emb_item = nn.Embedding(num_embeddings=num_items, embedding_dim=self.emb_size)
        self.emb_user = nn.Embedding(num_embeddings=num_users, embedding_dim=self.emb_size)
        self.hidden = torch.nn.Linear(self.emb_size, 1) 
        self.activation = nn.Sigmoid()
        self._init_weights()

    def _init_weights(self):
        nn.init.normal_(self.emb_user.weight, std=0.01)
        nn.init.normal_(self.emb_item.weight, std=0.01)

    def forward(self, users, items):
        emb_user = self.emb_user(users)
        emb_item = self.emb_item(items)
        element_wise = emb_user * emb_item
        output = self.hidden(element_wise)
        output = self.activation(output)
        return output


In [75]:
model_GMF = GMF(num_users, num_items,embedding_size = 32)

Model's architecture:

In [76]:
# display/print the model architecture
print(model_GMF)

GMF(
  (emb_item): Embedding(3706, 32)
  (emb_user): Embedding(6040, 32)
  (hidden): Linear(in_features=32, out_features=1, bias=True)
  (activation): Sigmoid()
)


<br><br><br><br><br><br>
#### Neural Matrix Factorization (NMF)


In [78]:
# Note: for simplicity, the MLP and GMF models are not used as pretrained backbones (as was done in the paper)
class NCF(nn.Module):
    def __init__(self, num_users, num_items, embedding_size=32, mlp_embedding_size=16, mlp_layers_sizes=[32, 16, 8], dropout=0.1):
        super(NCF, self).__init__()
        self.mlp = MLP(num_users, num_items, embedding_size=mlp_embedding_size, mlp_layers_sizes=mlp_layers_sizes, dropout=dropout)
        self.gmf = GMF(num_users, num_items, embedding_size=embedding_size)
        # Remove the sigmoid activations of mlp and gmf so that each emits a raw score
        self.mlp.activation = nn.Identity()
        self.gmf.activation = nn.Identity()
        # Each sub-model outputs one value per (user, item) pair, so the fusion layer takes 2 inputs
        self.neu_mf = nn.Linear(2, 1)
        self.activation = nn.Sigmoid()

    
    def forward(self, users, items):
        mlp_output = self.mlp(users, items)
        gmf_output = self.gmf(users, items)
        output = torch.cat([mlp_output, gmf_output], dim=1)
        output = self.neu_mf(output)
        output = self.activation(output)
        return output

In [79]:
# Build the fused model; the sigmoid activations of the MLP and GMF parts are removed inside NCF
model_NMF = NCF(num_users, num_items, embedding_size=32, mlp_embedding_size=16, mlp_layers_sizes=[32, 16, 8], dropout=0.1)

Model's architecture:

In [None]:
# display/print the model architecture
print(model_NMF)

<br><br><br><br><br><br>

## b. Train and evaluate the recommendations accuracy of the models: 
- MF
- GMF
- MLP
- NMF

Compare the `LogLoss` and the recommendation accuracy using the `NDCG` and `MRR` metrics with cutoff values of 5 and 10.   
Discuss the comparison. 

**Metrics:**
- HitRatio
- nDCG
- MRR

In [None]:
# Use your own metrics implementation OR use external packages for the metrics.
# If you are using external packages, make sure they work properly. 
# Many of the available packages do not work as you would expect.
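
As one possible starting point for a custom implementation, the sketch below computes the three metrics for a single user under a leave-one-out setup: one held-out positive item ranked among sampled negatives. The function names and the toy ranking are assumptions made for illustration.

```python
import math


def hit_ratio_at_k(ranked_items, target, k):
    """1.0 if the held-out item appears in the top-k, else 0.0."""
    return 1.0 if target in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, target, k):
    """With a single relevant item the ideal DCG is 1, so nDCG = 1 / log2(rank + 1)."""
    for idx, item in enumerate(ranked_items[:k]):
        if item == target:
            return 1.0 / math.log2(idx + 2)  # idx is 0-based, rank is idx + 1
    return 0.0


def mrr_at_k(ranked_items, target, k):
    """Reciprocal rank of the held-out item, or 0.0 if it is outside the top-k."""
    for idx, item in enumerate(ranked_items[:k]):
        if item == target:
            return 1.0 / (idx + 1)
    return 0.0


# Toy ranking (best first); the held-out item 13 sits at rank 3
ranked = [42, 7, 13, 99, 5]
print(hit_ratio_at_k(ranked, 13, 5), ndcg_at_k(ranked, 13, 5), mrr_at_k(ranked, 13, 5))
```

Averaging each value over all test users gives the reported HR@k, nDCG@k, and MRR@k.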

**Evaluation:**

**HyperParams:**

In [None]:
# the chosen hyperparams will affect your models & your grade 

<br><br>
Create train data:
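
Because the train matrix holds only positive interactions, training with a binary loss needs sampled negatives as well. The sketch below pairs every positive with a few randomly drawn unobserved items. The helper name, the default of 4 negatives per positive, and the toy pairs are assumptions; in this notebook the positive pairs could come from the non-zero entries of `train_matrix`.

```python
import numpy as np


def sample_training_instances(positive_pairs, num_items, num_negatives=4, seed=0):
    """Return (users, items, labels) with num_negatives sampled negatives per positive."""
    rng = np.random.default_rng(seed)
    positives = set(positive_pairs)
    users, items, labels = [], [], []
    for user, item in positive_pairs:
        users.append(user)
        items.append(item)
        labels.append(1.0)
        for _ in range(num_negatives):
            candidate = int(rng.integers(num_items))
            # Resample until the item is one the user has not interacted with
            while (user, candidate) in positives:
                candidate = int(rng.integers(num_items))
            users.append(user)
            items.append(candidate)
            labels.append(0.0)
    return np.array(users), np.array(items), np.array(labels, dtype=np.float32)


# Toy positives: (user, item) pairs
toy_positives = [(0, 1), (0, 2), (1, 0)]
u, i, y = sample_training_instances(toy_positives, num_items=6)
print(len(y), int(y.sum()))
```

Resampling the negatives at the start of every epoch, rather than once, is a common variation. The resampling loop assumes that no user has interacted with every item.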

<br><br>
train & eval:
- Create a training function 
- Evaluate the models trained and save the results accordingly 

In [None]:
# feel free to change the function signature
def model_train(model):
    pass
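
A minimal sketch of what one training epoch could look like with binary cross entropy is shown below. `ToyScorer` is a hypothetical stand-in for the models above, and the random tensors, batch size, and learning rate are arbitrary choices for the demo rather than recommended settings.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, loader, optimizer, loss_fn):
    """Run one pass over the loader and return the mean training loss."""
    model.train()
    total_loss, num_samples = 0.0, 0
    for users, items, labels in loader:
        optimizer.zero_grad()
        preds = model(users, items).view(-1)  # flatten (batch, 1) to (batch,)
        loss = loss_fn(preds, labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * labels.size(0)
        num_samples += labels.size(0)
    return total_loss / num_samples


class ToyScorer(nn.Module):
    """Hypothetical stand-in model, used only for this demo."""

    def __init__(self, num_users, num_items, dim=4):
        super().__init__()
        self.user = nn.Embedding(num_users, dim)
        self.item = nn.Embedding(num_items, dim)
        self.out = nn.Linear(dim, 1)

    def forward(self, users, items):
        return torch.sigmoid(self.out(self.user(users) * self.item(items)))


torch.manual_seed(0)
users = torch.randint(0, 10, (64,))
items = torch.randint(0, 20, (64,))
labels = torch.randint(0, 2, (64,)).float()
loader = DataLoader(TensorDataset(users, items, labels), batch_size=16, shuffle=True)

toy_model = ToyScorer(10, 20)
optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.001)
epoch_loss = train_one_epoch(toy_model, loader, optimizer, nn.BCELoss())
print(f"mean BCE loss: {epoch_loss:.4f}")
```

Calling `train_one_epoch` in a loop and recording the returned loss and the elapsed time per epoch provides the data for the loss and training-time plots.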




<br><br><br><br>
<br><br><br><br>
<br><br><br><br>
All Results:

In [None]:
# df_results
# each hyperparam will add a column to the dataframe
# this is an example for a df that would allow you to create plots easily

<br><br><br><br>
**Train & Validation Loss:**

Make sure you did not overfit.  
In case you did, fix that by adding early-stopping, regularization, etc.
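
Early stopping can be sketched as a small helper that tracks the best validation loss and signals when it has stopped improving. The class name, the `patience` and `min_delta` parameters, and the loss values are assumptions chosen for illustration.

```python
class EarlyStopping:
    """Signal a stop once the validation loss fails to improve for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses: improvement stalls after the third epoch
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.69]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_loss in enumerate(val_losses):
    if stopper.step(val_loss):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

In practice the model weights from the best epoch would also be saved and restored after stopping.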

In [None]:
# plot
# make sure that you did not overfit!

**Training Time:**

In [None]:
# plot

**Metric Evaluation:**

In [None]:
# plot suggestion/example - you may create your own plot (you should achieve higher results)

<br><br><br><br>
<br><br><br><br>

**c. How do the values of MRR and NDCG differ between your current model and the results you got in the previous exercises which implemented the explicit recommendation approach? What are the differences in preparing the dataset for evaluation?**

**d. How will you measure item similarity using the NeuMF model?**
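
One possible approach, sketched below, is to compare learned item embedding vectors with cosine similarity. The embedding matrix here is a toy array; with a trained model it could be taken from an item embedding table (for example `emb_item.weight` of the GMF part, possibly concatenated with the MLP item embeddings), which is an assumption about how the model is used and not something prescribed by the assignment.

```python
import numpy as np


def most_similar_items(item_embeddings, item_id, top_n=3):
    """Return the top_n (item, cosine similarity) pairs for the query item."""
    norms = np.linalg.norm(item_embeddings, axis=1, keepdims=True)
    unit = item_embeddings / np.clip(norms, 1e-12, None)  # guard against zero vectors
    sims = unit @ unit[item_id]
    sims[item_id] = -np.inf  # exclude the query item itself
    top = np.argsort(-sims)[:top_n]
    return [(int(idx), float(sims[idx])) for idx in top]


# Toy embeddings: item 1 points almost the same way as item 0, item 3 the opposite way
toy_embeddings = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
])
neighbours = most_similar_items(toy_embeddings, item_id=0, top_n=2)
print(neighbours)
```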

<br><br><br>
<br><br><br>
<br><br><br>

## Question 3: Loss Function 
<br><br>

#### a. One of the enhancements presented in the Neural Collaborative Filtering paper is the use of a probabilistic activation function (the sigmoid) and a binary cross entropy loss function.    

Select one of the models you implemented in question 2 and change the loss function to `Mean Squared Error` and the activation function of the last layer to `ReLU`.   

Train the model and evaluate it in a similar way to what you did in question 2. 
Compare the results and discuss.
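
The change can be sketched on a GMF-style model as below: the final sigmoid is replaced by a ReLU and the loss by `nn.MSELoss`. The class name `GMFRegression`, the small embedding size, and the toy batch are assumptions for illustration only.

```python
import torch
from torch import nn


class GMFRegression(nn.Module):
    """GMF-style scorer with a ReLU output instead of a sigmoid (hypothetical name)."""

    def __init__(self, num_users, num_items, embedding_size=8):
        super().__init__()
        self.emb_user = nn.Embedding(num_users, embedding_size)
        self.emb_item = nn.Embedding(num_items, embedding_size)
        self.hidden = nn.Linear(embedding_size, 1)
        self.activation = nn.ReLU()

    def forward(self, users, items):
        element_wise = self.emb_user(users) * self.emb_item(items)
        return self.activation(self.hidden(element_wise)).view(-1)


torch.manual_seed(0)
reg_model = GMFRegression(num_users=10, num_items=20)
mse_loss = nn.MSELoss()

# Toy batch of (user, item, label) triples
users = torch.tensor([0, 1, 2])
items = torch.tensor([3, 4, 5])
labels = torch.tensor([1.0, 0.0, 1.0])

preds = reg_model(users, items)
loss = mse_loss(preds, labels)
loss.backward()
print(preds.shape, loss.item())
```

Unlike the sigmoid, the ReLU output is not bounded above by 1, and a unit whose pre-activation is negative outputs 0 and passes no gradient, which is worth keeping in mind when comparing the results.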

<br><br><br><br>
<br><br><br><br>
NMF Results:

In [None]:
# example: df_results[df_results.model.str.startswith('NMF')]

<br><br><br>
<br><br><br>

Train & Validation Loss:

In [None]:
# plot

<br><br><br>
Training Time:

In [None]:
# plot

<br><br><br>
Metric Evaluation:

In [None]:
# plot

<br><br><br><br>
**Conclusions:**

    - In
    - Your
    - Own
    - Words
    

<br><br>
<br><br>


Good Luck :)