# MLP and NeuMF Test

In this notebook, we test the MLP and NeuMF methods for predicting the top recommendations for each user. First, we build the test set with a leave-one-out split: each user's most recent interaction is held out. Then, for each user, we evaluate the predictions on 100 sampled items plus the held-out last item bought.
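The text above does not name the ranking metric. As a minimal sketch, assuming the common choice of Hit Ratio and NDCG at a cutoff `k`, the per-user evaluation could look like the following. The function names, the toy candidate list and the scores are hypothetical and only illustrate how the held-out item is ranked among the sampled items.

```python
import math

def hit_ratio_at_k(ranked_items, positive_item, k=10):
    """Return 1.0 if the held-out item appears in the top-k items, else 0.0."""
    return float(positive_item in ranked_items[:k])

def ndcg_at_k(ranked_items, positive_item, k=10):
    """Return 1 / log2(rank + 2) for the held-out item's 0-based rank, or 0.0 if it is outside the top-k."""
    top_k = list(ranked_items[:k])
    if positive_item not in top_k:
        return 0.0
    rank = top_k.index(positive_item)
    return 1.0 / math.log2(rank + 2)

# Toy example: three sampled negatives plus the held-out item "pos"
candidates = ["neg_a", "neg_b", "pos", "neg_c"]
scores = [0.1, 0.7, 0.5, 0.2]  # hypothetical model scores, one per candidate

# Rank the candidates by decreasing score
ranked = [item for _, item in sorted(zip(scores, candidates), reverse=True)]

hr = hit_ratio_at_k(ranked, "pos", k=2)
ndcg = ndcg_at_k(ranked, "pos", k=2)
print(ranked, hr, ndcg)
```

Averaging these two values over all users would give one score per metric for each model.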

In [8]:
# Requirements

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [9]:
df_recommendation = pd.read_csv('base de donnée_20_20.csv')
df_recommendation.head()

,user,id,rating,timestamp
0,AE23JYHGEN3D35CHE5OQQYJOW5RA,B000EEHKVY,5.0,1427926325000
1,AE23JYHGEN3D35CHE5OQQYJOW5RA,B000TGSM6E,5.0,1480348230000
2,AE23JYHGEN3D35CHE5OQQYJOW5RA,B008FDSWJ0,5.0,1528832546194
3,AE23JYHGEN3D35CHE5OQQYJOW5RA,B012VQ5A7S,5.0,1528829957330
4,AE23JYHGEN3D35CHE5OQQYJOW5RA,B076ZSHQ47,3.0,1583185764147


## Train / Test Split

We begin with a train/test split for leave-one-out evaluation of the recommendations. We also create a file containing negative samples.

In [10]:
# Sort dataframe by user and timestamp
df_recommendation = df_recommendation.sort_values(by=['user', 'timestamp'])

# Test set: the most recent interaction of each user
df_test = df_recommendation.groupby('user').tail(1)

# Train set: all remaining interactions
df_train = df_recommendation.drop(df_test.index)

df_train.shape, df_test.shape

((37519, 4), (5107, 4))
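To check that the split behaves as intended, the same steps can be run on a small hand-made DataFrame. This is only a sketch: the `toy` data is invented for illustration and reuses the column names of the dataset above.

```python
import pandas as pd

# Toy interactions (hypothetical data, same columns as the real dataset)
toy = pd.DataFrame({
    "user": ["u1", "u1", "u1", "u2", "u2"],
    "id": ["a", "b", "c", "a", "d"],
    "rating": [5.0, 4.0, 3.0, 5.0, 2.0],
    "timestamp": [30, 10, 20, 5, 15],
})

# Same leave-one-out logic as above
toy = toy.sort_values(by=["user", "timestamp"])
toy_test = toy.groupby("user").tail(1)
toy_train = toy.drop(toy_test.index)

# Each user appears exactly once in the test set
assert toy_test["user"].is_unique

# No interaction is lost or duplicated by the split
assert len(toy_train) + len(toy_test) == len(toy)

# The held-out row is each user's latest interaction
latest = toy.groupby("user")["timestamp"].max()
assert (toy_test.set_index("user")["timestamp"] == latest).all()
```

The same three assertions can be applied to `df_train` and `df_test` without modification.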

Next, we sample 5 negative items (items the user has never interacted with) for each user and merge them with the test set as extra columns.

In [15]:
# List of all users and items
all_users = df_recommendation['user'].unique()
all_items = df_recommendation['id'].unique()

# All existing interactions set
interactions = set(zip(df_recommendation['user'], df_recommendation['id']))

# Negative items list
negative_samples = []
num_negatives = 5

for user in df_test['user'].unique():
    # Candidate negatives: every item this user has never interacted with
    negative_items = [item for item in all_items if (user, item) not in interactions]

    # Draw num_negatives distinct items from the candidates
    sampled_negatives = np.random.choice(negative_items, size=num_negatives, replace=False)

    # Store the sampled items as one row per user
    negative_samples.append({'user': user, 
                             'negative_1': sampled_negatives[0], 
                             'negative_2': sampled_negatives[1], 
                             'negative_3': sampled_negatives[2], 
                             'negative_4': sampled_negatives[3], 
                             'negative_5': sampled_negatives[4]})

negative_samples_df = pd.DataFrame(negative_samples)

df_test_negative = pd.merge(df_test, negative_samples_df, on='user', how='left')

df_test_negative.shape
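The sampling above uses NumPy's global random state, so the negatives change from run to run. As a minimal sketch, assuming reproducible samples are wanted, the same step can be written with a seeded generator. The helper `sample_negatives`, the seed value and the toy item list are hypothetical.

```python
import numpy as np

def sample_negatives(user_items, all_items, num_negatives, rng):
    """Draw num_negatives distinct items that the user never interacted with."""
    # Sorting makes the candidate order, and so the draw, deterministic
    candidates = np.array(sorted(set(all_items) - set(user_items)))
    if len(candidates) < num_negatives:
        raise ValueError("not enough unseen items to sample from")
    return rng.choice(candidates, size=num_negatives, replace=False)

# Toy catalogue and one user's interaction history (hypothetical data)
all_items = [f"item_{i}" for i in range(20)]
seen = {"item_0", "item_3", "item_7"}

rng = np.random.default_rng(42)  # fixed seed, so reruns give the same sample
negatives = sample_negatives(seen, all_items, 5, rng)
print(negatives)
```

Taking the set difference once per user also avoids testing every `(user, item)` pair against the interaction set.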

In [19]:
df_test_negative.head()

,user,id,rating,timestamp,negative_1,negative_2,negative_3,negative_4,negative_5
0,AE23JYHGEN3D35CHE5OQQYJOW5RA,B09YDBKT7M,5.0,1642464772969,B081592T7F,B0098DFNY8,B01DN7OC00,B0BR2VHR6T,B07MZYNZ1C
1,AE23LDQTB7L76AP6E6WPBFVYL5DA,B09198262S,5.0,1664305424891,B00HTWQWFO,B00B1ZONFG,B079P9LDHN,B0BHG58G2F,B00IPHITHG
2,AE23WLBRYKEC67DM43M6E2MF7GPQ,B07JL1KFT3,5.0,1631833014190,B01CS87OLO,B07HGRFG5J,B0B176CWVM,B0791VK6D9,B0B8Z4YV4T
3,AE23ZFVUOMPKR57BVSWXV34QLMVA,B0719KM5Y8,4.0,1549635394981,B095QBH5ZB,B015WD6FIK,B00K17YFC6,B0C6HXYY3R,B0BT84L827
4,AE24I2EU3AJAAKBXF367XSV37U6Q,B00GG20DJ4,5.0,1647063756658,B076K32CH1,B0C4GR5C3L,B09VP6F1DV,B0BXT6TM6N,B09JJ784V5


In [20]:
# Save the pandas DataFrames to CSV
df_train.to_csv('train.csv', index=False)
df_test.to_csv('test.csv', index=False)
df_test_negative.to_csv('test_negative.csv', index=False)