## Neural Collaborative Filtering

Neural Collaborative Filtering (NCF) is a deep-neural-network approach to collaborative filtering, the core problem in recommendation, that learns from implicit feedback. Because a neural network learns the relationship between users and items, the approach scales readily to large datasets, which makes it a better fit here than item-based collaborative filtering.

NCF works by first representing users and items as vectors in a latent space. These vectors are combined to compute a score for each user-item pair, and the score is used to predict whether the user will interact with the item. NCF is useful because it can learn non-linear relationships between users and items, which makes it more expressive than traditional matrix factorization, where a pair is scored with a plain dot product.
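To make the scoring step concrete, here is a minimal NumPy sketch of a NeuMF-style forward pass for one user-item pair. It is an illustration under assumptions, not the `recommenders` implementation: the toy sizes, the random (untrained) weights, the single hidden layer, and the helper name `score` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_items = 5, 8   # toy sizes (hypothetical)
n_factors = 4             # latent dimension used by both branches here

# Randomly initialised embedding tables; a trained model would learn these.
P_gmf = rng.normal(scale=0.1, size=(n_users, n_factors))
Q_gmf = rng.normal(scale=0.1, size=(n_items, n_factors))
P_mlp = rng.normal(scale=0.1, size=(n_users, n_factors))
Q_mlp = rng.normal(scale=0.1, size=(n_items, n_factors))

# One hidden layer for the MLP branch, plus the final output weights.
W1 = rng.normal(scale=0.1, size=(2 * n_factors, n_factors))
b1 = np.zeros(n_factors)
h = rng.normal(scale=0.1, size=2 * n_factors)


def score(user: int, item: int) -> float:
    """Return a score in (0, 1) for one user-item pair."""
    gmf = P_gmf[user] * Q_gmf[item]                       # element-wise product (GMF branch)
    mlp_in = np.concatenate([P_mlp[user], Q_mlp[item]])   # concatenated embeddings (MLP branch)
    mlp = np.maximum(0.0, mlp_in @ W1 + b1)               # ReLU hidden layer adds non-linearity
    logit = np.concatenate([gmf, mlp]) @ h                # fuse both branches
    return float(1.0 / (1.0 + np.exp(-logit)))            # sigmoid squashes to (0, 1)


print(score(user=2, item=5))
```

With untrained weights the score sits near 0.5; training pushes it towards 1 for observed interactions and towards 0 for sampled negatives.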

### Setting up the environment


In [1]:
# This is only necessary for colab since it only supports python 3.10, but the library we are using only supports <= 3.9.
# Comment this section if you are running it on your local machine

!sudo rm -rf /usr/local/lib/python3.8/dist-packages/OpenSSL
!sudo rm -rf /usr/local/lib/python3.8/dist-packages/pyOpenSSL-22.1.0.dist-info/

!wget https://repo.anaconda.com/miniconda/Miniconda3-py39_23.5.2-0-Linux-x86_64.sh
!chmod +x Miniconda3-py39_23.5.2-0-Linux-x86_64.sh

!bash ./Miniconda3-py39_23.5.2-0-Linux-x86_64.sh -b -f -p /usr/local
import sys
sys.path.append('/usr/local/lib/python3.9/site-packages/')
!pip3 install pyOpenSSL==22.0.0

# Installing the recommenders library.
# Ensure that you have python version <=3.9 when installing this.
!pip install recommenders[examples]

--2024-04-02 09:03:15--  https://repo.anaconda.com/miniconda/Miniconda3-py39_23.5.2-0-Linux-x86_64.sh
Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.130.3, 104.16.131.3, 2606:4700::6810:8203, ...
Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.130.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 93409434 (89M) [application/x-sh]
Saving to: ‘Miniconda3-py39_23.5.2-0-Linux-x86_64.sh.6’


2024-04-02 09:03:16 (168 MB/s) - ‘Miniconda3-py39_23.5.2-0-Linux-x86_64.sh.6’ saved [93409434/93409434]

PREFIX=/usr/local
Unpacking payload ...
                                                                                                   
Installing base environment...


Downloading and Extracting Packages

Preparing transaction: - done
Executing transaction: | done
installation finished.
    You currently have a PYTHONPATH environment variable set. This may cause
    unexpected behavior when running the Python interpreter in Miniconda3.
    For 

In [1]:
import sys
import os
import shutil

import pandas as pd
import numpy as np

from recommenders.utils.timer import Timer
from recommenders.datasets.python_splitters import python_chrono_split, python_stratified_split

from recommenders.models.ncf.dataset import Dataset as NCFDataset

# Importing the NCF model class from the recommenders library
from recommenders.models.ncf.ncf_singlenode import NCF

# importing the evaluation metrics
from recommenders.evaluation.python_evaluation import (rmse, mae, rsquared, exp_var, map_at_k, ndcg_at_k, precision_at_k,
                                                     recall_at_k, get_top_k_items)
from recommenders.utils.constants import SEED as DEFAULT_SEED


print("System version: {}".format(sys.version))
print("Pandas version: {}".format(pd.__version__))




System version: 3.9.12 (main, Apr  4 2022, 05:22:27) [MSC v.1916 64 bit (AMD64)]
Pandas version: 1.4.2


### Loading the Dataset

In [3]:
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
df_full = pd.read_csv('user_songs_filtered.csv')[['Username', 'track_name', 'artist_name',
                                                    'rank',
                                                    # 'playcount'
                                                    ]]
df_full['track'] = df_full['track_name'] + ' ' + df_full['artist_name']
df_full['itemID'] = df_full.groupby('track').ngroup() + 1
df_full['userID'] = df_full.groupby('Username').ngroup() + 1

df = df_full.copy()
df.rename(columns={'rank': 'rating'}, inplace=True)
# df.rename(columns={'playcount': 'rating'}, inplace=True)
df = df.drop(['track', 'track_name', 'artist_name', 'Username'], axis = 1)

# using a subset of data to reduce runtime to manageable duration, select users who have more than 48 top songs
threshold = 48
df = df[df.groupby('userID')['userID'].transform('size') > threshold]
df = df[['userID', 'itemID', 'rating']]
df.info()

# df.to_csv('df_NCF.csv', index = False)


<class 'pandas.core.frame.DataFrame'>
Int64Index: 51548 entries, 0 to 393030
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   userID  51548 non-null  int64
 1   itemID  51548 non-null  int64
 2   rating  51548 non-null  int64
dtypes: int64(3)
memory usage: 1.6 MB
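The ID assignment and the per-user size filter in the cell above can be checked on a toy frame. This is a small sketch with made-up rows; the names `toy`, `min_rows`, and `kept` are hypothetical and exist only for illustration.

```python
import pandas as pd

# Toy interactions (made-up data).
toy = pd.DataFrame({
    "Username": ["ann", "ann", "ann", "bob", "cat", "cat"],
    "track": ["a", "b", "c", "a", "b", "d"],
})

# ngroup() numbers each group from 0 in sorted key order; +1 makes IDs start at 1.
toy["userID"] = toy.groupby("Username").ngroup() + 1
toy["itemID"] = toy.groupby("track").ngroup() + 1

# transform("size") broadcasts each user's row count back onto that user's rows,
# so the boolean mask keeps only users with more than `min_rows` interactions.
min_rows = 1
kept = toy[toy.groupby("userID")["userID"].transform("size") > min_rows]

print(kept)
```

Here `bob` has a single row and is dropped, while `ann` and `cat` are kept.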


In [4]:
# df = pd.read_csv('/content/drive/MyDrive/df_NCF.csv')
# df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 94748 entries, 0 to 94747
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   rating  94748 non-null  int64
 1   itemID  94748 non-null  int64
 2   userID  94748 non-null  int64
dtypes: int64(3)
memory usage: 2.2 MB


In [3]:
#Split the dataset into 75% train and 25% test

# header = {
#     "col_user": "userID",
#     "col_item": "itemID",
#     "col_rank": "rank",
#     # "col_rank": 'playcount',
#     "col_prediction": "Prediction",
# }

train, test = python_stratified_split(
    df, ratio=0.75,
    #   col_user="userID", col_item="itemID", seed=42
)

# Filtering out users and items in the test set that do not appear in the training set.
# This is done so that we can see if our model has learnt user's previous item interactions and can recommend relevant items.
test = test[test["userID"].isin(train["userID"].unique())]
test = test[test["itemID"].isin(train["itemID"].unique())]

# Creating a leave-one-out test set that keeps only the last row of the test split for each user.
# The user's other test rows are left out of this set; the training set is unchanged.
leave_one_out_test = test.groupby("userID").last().reset_index()

test.head()

Unnamed: 0,userID,itemID,rating
307882,6,69610,36
119014,6,92294,33
153328,10,121378,17
219164,10,83514,10
270950,10,129097,32
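The split, the filtering of unseen items, and the leave-one-out selection can be traced on a toy frame. The sketch below uses pandas' own `groupby(...).sample` as a stand-in for a per-user split; it is not how `python_stratified_split` is implemented, and all names ending in `_toy` plus `held_out` are hypothetical.

```python
import pandas as pd

# Two users with four interactions each (made-up data).
toy = pd.DataFrame({
    "userID": [1] * 4 + [2] * 4,
    "itemID": [10, 11, 12, 13, 10, 11, 12, 14],
    "rating": [1, 2, 3, 4, 1, 2, 3, 4],
})

# Per-user 75/25 split: sample 75% of each user's rows for training.
train_toy = toy.groupby("userID").sample(frac=0.75, random_state=42)
test_toy = toy.drop(train_toy.index)

# Keep only test rows whose item also occurs in the training split.
test_toy = test_toy[test_toy["itemID"].isin(train_toy["itemID"].unique())]

# Leave-one-out set: at most one row per user (the last one in row order).
held_out = test_toy.groupby("userID").last().reset_index()

print(train_toy, test_toy, held_out, sep="\n\n")
```

Note that `last()` follows row order, so it only means "most recent interaction" if the frame is sorted by time.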


In [4]:
# top k items to recommend
TOP_K = 10

# Model parameters
# Number of iterations during the training process
EPOCHS = 100
# Batch size means how many user-item pairs you want to predict at once
BATCH_SIZE = 256

# Setting seed to remove any stochasticity and reproduce results
SEED = DEFAULT_SEED  # Set None for non-deterministic results

In [5]:
# Writing the data into csv files
train_file = "train.csv"
test_file = "test.csv"
leave_one_out_test_file = "leave_one_out_test.csv"

train.to_csv(train_file, index=False)
test.to_csv(test_file, index=False)
leave_one_out_test.to_csv(leave_one_out_test_file, index=False)

In [6]:
data = NCFDataset(train_file=train_file, test_file=leave_one_out_test_file, seed=SEED, overwrite_test_file_full=True)

INFO:recommenders.models.ncf.dataset:Indexing train.csv ...
INFO:recommenders.models.ncf.dataset:Indexing leave_one_out_test.csv ...
INFO:recommenders.models.ncf.dataset:Creating full leave-one-out test file leave_one_out_test_full.csv ...
100%|██████████| 999/999 [00:13<00:00, 76.04it/s]
INFO:recommenders.models.ncf.dataset:Indexing leave_one_out_test_full.csv ...


### Training the NCF Model

NCF parameters:

`n_users`, number of users. We one-hot encode the user data, so the input size of the model equals the number of users.

`n_items`, number of items. Same logic as `n_users`.

`batch_size`, number of examples the model processes at a time. Higher values consume more memory.

`learning_rate`, which can be thought of as how much the model changes after one iteration. Large values lead to instability, while very small values take longer to converge.

`n_factors`, which controls the dimension of the latent space. Usually, the quality of the training-set predictions improves as `n_factors` increases.

`layer_sizes`, sizes of the input layer and hidden layers of the MLP, given as a list. We set it to [64,32,16,8,4] because, in our training and testing, larger values gave better results.

`n_epochs`, which defines the number of passes over the training data. Note that `n_factors` and `n_epochs` both affect the training time.

`model_type`, which selects the model to train: a single "MLP", a single "GMF", or the combined model "NeuMF".

[Reference: https://github.com/recommenders-team/recommenders/blob/main/examples/02_model_collaborative_filtering/ncf_deep_dive.ipynb]

In [7]:
model = NCF (
    n_users=data.n_users,
    n_items=data.n_items,
    model_type="NeuMF",
    n_factors=4,
    layer_sizes=[64,32,16,8,4],
    n_epochs=EPOCHS,
    batch_size=BATCH_SIZE,
    learning_rate=1e-3,
    verbose=10,
    seed=SEED
)





In [8]:
with Timer() as train_time:
    model.fit(data)

print("Took {} seconds for training.".format(train_time.interval))

INFO:recommenders.models.ncf.ncf_singlenode:Epoch 10 [7.69s]: train_loss = 0.122677 
INFO:recommenders.models.ncf.ncf_singlenode:Epoch 20 [7.67s]: train_loss = 0.045938 
INFO:recommenders.models.ncf.ncf_singlenode:Epoch 30 [8.34s]: train_loss = 0.028823 
INFO:recommenders.models.ncf.ncf_singlenode:Epoch 40 [8.46s]: train_loss = 0.020940 
INFO:recommenders.models.ncf.ncf_singlenode:Epoch 50 [8.53s]: train_loss = 0.016842 
INFO:recommenders.models.ncf.ncf_singlenode:Epoch 60 [8.60s]: train_loss = 0.014509 
INFO:recommenders.models.ncf.ncf_singlenode:Epoch 70 [8.08s]: train_loss = 0.011180 
INFO:recommenders.models.ncf.ncf_singlenode:Epoch 80 [7.73s]: train_loss = 0.010230 
INFO:recommenders.models.ncf.ncf_singlenode:Epoch 90 [7.73s]: train_loss = 0.009298 
INFO:recommenders.models.ncf.ncf_singlenode:Epoch 100 [7.71s]: train_loss = 0.008830 


Took 792.3021583000001 seconds for training.


### Prediction

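After fitting the model, we can call `predict` to score user-item pairs. For a single pair, `predict` returns one score; with `is_list=True` it takes lists of users and items and returns a list of scores. Here we score every row of the test set and collect the results in a dataframe.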

In [9]:
predictions = [[row.userID, row.itemID, model.predict(row.userID, row.itemID)]
               for (_, row) in test.iterrows()]

predictions = pd.DataFrame(predictions, columns=['userID', 'itemID', 'prediction'])
predictions.head()

Unnamed: 0,userID,itemID,prediction
0,6,69610,4.548416e-28
1,6,92294,3.2561480000000004e-17
2,10,121378,1.326578e-18
3,10,83514,1.529767e-11
4,10,129097,1.34721e-09


### Generic Evaluation

To compute ranking metrics, we need predictions for every user-item pair. We then drop the pairs that already appear in the training set, so that songs already among a user's top songs are not recommended to that user again in the top-k list.

MAP (Mean Average Precision) - the average precision computed for each user, then averaged over all users.

NDCG (Normalized Discounted Cumulative Gain) - evaluates how well the recommended items for a user are ranked by relevance, rewarding hits near the top of the list.

Precision - the proportion of recommended items that are relevant.

Recall - the proportion of relevant items that are recommended.
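The four metrics above can be worked through by hand for a single user. The following is a plain-Python sketch using one common set of definitions with binary relevance; the `toy_*` helpers are hypothetical, and the library functions used in this notebook may normalise slightly differently (for example in how average precision is divided).

```python
import math


def toy_precision_recall(recommended, relevant, k):
    """Precision@k and recall@k for one user."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k, hits / len(relevant)


def toy_average_precision(recommended, relevant, k):
    """Average of precision@rank taken at each rank where a hit occurs."""
    hits, total = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / min(len(relevant), k)


def toy_ndcg(recommended, relevant, k):
    """DCG of the list divided by the DCG of an ideal ordering."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(recommended[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg


recommended = ["a", "b", "c", "d"]   # ranked recommendations for one user
relevant = {"a", "c", "x"}           # ground-truth items for that user

precision, recall = toy_precision_recall(recommended, relevant, k=4)
ap = toy_average_precision(recommended, relevant, k=4)
ndcg = toy_ndcg(recommended, relevant, k=4)
print(precision, recall, ap, ndcg)
```

MAP is then the mean of the per-user average precision, and the reported precision, recall, and NDCG are likewise means over users.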


In [10]:
with Timer() as test_time:

    users, items, preds = [], [], []
    item = list(train.itemID.unique())
    for user in train.userID.unique():
        user = [user] * len(item)
        users.extend(user)
        items.extend(item)
        preds.extend(list(model.predict(user, item, is_list=True)))

    all_predictions = pd.DataFrame(data={"userID": users, "itemID":items, "prediction":preds})

    merged = pd.merge(train, all_predictions, on=["userID", "itemID"], how="outer")
    all_predictions = merged[merged.rating.isnull()].drop('rating', axis=1)

print("Took {} seconds for prediction.".format(test_time.interval))

Took 32.72318150000001 seconds for prediction.
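The step that removes already-seen pairs is an anti-join. As an alternative sketch, pandas' `indicator=True` option on `merge` expresses it directly and does not depend on `rating` being null for unseen pairs. The toy frames and the names `train_toy`, `scores`, and `unseen` are hypothetical.

```python
import pandas as pd

# Made-up training interactions and scores for every user-item pair.
train_toy = pd.DataFrame(
    {"userID": [1, 1, 2], "itemID": [10, 11, 10], "rating": [3, 1, 2]}
)
scores = pd.DataFrame({
    "userID": [1, 1, 1, 2, 2, 2],
    "itemID": [10, 11, 12, 10, 11, 12],
    "prediction": [0.9, 0.8, 0.1, 0.7, 0.2, 0.3],
})

# The left merge marks each scored pair as "both" (seen in train) or "left_only".
merged = scores.merge(
    train_toy[["userID", "itemID"]],
    on=["userID", "itemID"],
    how="left",
    indicator=True,
)

# Keep only pairs the user has not interacted with in the training data.
unseen = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
print(unseen)
```

Only the pairs (1, 12), (2, 11), and (2, 12) remain as recommendation candidates.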


In [3]:
# # all for the RAM limitations...
# all_predictions = pd.read_csv("/content/drive/MyDrive/all_predictions_1.csv")
# train = pd.read_csv("/content/drive/MyDrive/train.csv")

# merged = pd.merge(train, all_predictions, on=["userID", "itemID"], how="outer")
# all_predictions = merged[merged.rating.isnull()].drop('rating', axis=1)

# all_predictions.to_csv("/content/drive/MyDrive/all_predictions_2.csv", index = False)

In [1]:
# import pandas as pd
# # importing the evaluation metrics
# from recommenders.evaluation.python_evaluation import (rmse, mae, rsquared, exp_var, map_at_k, ndcg_at_k, precision_at_k,
#                                                      recall_at_k, get_top_k_items)



In [2]:
# all_predictions = pd.read_csv("all_predictions_2.csv")
# test = pd.read_csv("test.csv")
# TOP_K = 10

In [11]:
eval_precision = precision_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
eval_recall = recall_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
print(f"Precision: {eval_precision} \n Recall: {eval_recall}")

Precision: 0.028828828828828833 
 Recall: 0.036776603443270106


In [12]:
eval_map = map_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
print(f"MAP@K: {eval_map}")

MAP@K: 0.016139370706169647


In [13]:
eval_ndcg = ndcg_at_k(test, all_predictions, col_prediction='prediction', k=TOP_K)
print(f"NDCG@K: {eval_ndcg}")

NDCG@K: 0.03807577652405223


### Summary of Ranking Metrics

<center>

|Metric|Range|Selection criteria|Limitation|
|------|-------------------------------|---------|----------|
|Precision|$\geq 0$ and $\leq 1$|Higher the better.|Only for hits in recommendations.|
|Recall|$\geq 0$ and $\leq 1$|Higher the better.|Only for hits in the ground truth.|
|NDCG|$\geq 0$ and $\leq 1$|Higher the better.|Does not penalize for bad/missing items, and does not handle several equally good items well.|
|MAP|$\geq 0$ and $\leq 1$|Higher the better.|Depends on variable distributions.|

</center>

### "Leave-one-out" Evaluation

For each item in the test data, we randomly sample 100 items that the user has not interacted with and rank the test item among the resulting 101 items (1 positive item and 100 negative items). The quality of each ranked list is judged by Hit Ratio (HR) and Normalized Discounted Cumulative Gain (NDCG). Finally, we average the values over those ranked lists to obtain the overall HR and NDCG on the test data.

We truncated the ranked list at 10 for both metrics. As such, the HR intuitively measures whether the test item is present on the top-10 list, and the NDCG accounts for the position of the hit by assigning higher scores to hits at top ranks.
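The leave-one-out protocol can be sketched in plain Python with random scores standing in for model output. This is an illustration under assumptions: the positive item is taken to sit at index 0 of each list, the discount uses the conventional base-2 logarithm (so a hit at rank 1 scores exactly 1), and the helper `hit_and_ndcg` is hypothetical.

```python
import math
import random


def hit_and_ndcg(scores, k):
    """scores[0] is the held-out positive; the rest are sampled negatives."""
    # 1-based rank of the positive; ties count against it.
    rank = sum(1 for s in scores if s >= scores[0])
    if rank <= k:
        return 1, 1.0 / math.log2(rank + 1)
    return 0, 0.0


random.seed(0)

# 200 ranked lists of 101 random scores each (1 positive + 100 negatives).
lists = [[random.random() for _ in range(101)] for _ in range(200)]
results = [hit_and_ndcg(scores, k=10) for scores in lists]

hr = sum(hit for hit, _ in results) / len(results)
ndcg = sum(gain for _, gain in results) / len(results)
print(f"HR@10: {hr:.3f}  NDCG@10: {ndcg:.3f}")
```

With purely random scores the positive is expected to land in the top 10 roughly 10 times out of 101, which gives a rough baseline against which a trained model's HR can be read.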

In [14]:
k = TOP_K

ndcgs = []
hit_ratio = []

for b in data.test_loader():
    user_input, item_input, labels = b
    output = model.predict(user_input, item_input, is_list=True)

    output = np.squeeze(output)
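    # Index 0 is treated as the positive item; its 1-based rank counts the scores at least as high.
    rank = sum(output >= output[0])
    if rank <= k:
        # Note: np.log is the natural log, whereas the standard NDCG discount uses np.log2.
        ndcgs.append(1 / np.log(rank + 1))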
        hit_ratio.append(1)
    else:
        ndcgs.append(0)
        hit_ratio.append(0)

eval_ndcg = np.mean(ndcgs)
eval_hr = np.mean(hit_ratio)

print("HR:\t%f" % eval_hr)
print("NDCG:\t%f" % eval_ndcg)

HR:	0.506507
NDCG:	0.444740
