# Product Recommender on RetailRocket Dataset

- We use the Cornac package and Microsoft's Recommenders module to train a Bayesian Personalised Ranking (BPR) model on the RetailRocket e-commerce dataset.

- The model ranks items based on user-product interactions and recommends the top K items.

- The full dataset is huge, so we train the model on a subsample in Google Colab.

In [None]:
NAME = "Archit Kaila"
COLLABORATORS = "Shrey Gupta, Shen Juin Lee"

In [None]:
## Install required libraries (only needed on Google Colab)
!pip install cornac
!pip install recommenders

In [None]:
## Import standard libraries
import pandas as pd
import numpy as np

In [None]:
## Mount the Google Drive folder
from google.colab import drive
drive.mount('/content/gdrive/')

## Add the folder containing our codebase to the Python import path
import sys
sys.path.append('/content/gdrive/MyDrive/recommenders_aipi590')

## Import the Python script that trains and evaluates the BPR model
from Non_DRL_Recommenders.bpr_model import run_bpr_model

### Read dataset

In [None]:
## Reading the e-commerce dataset
df = pd.read_csv('/content/gdrive/MyDrive/Retail_Rocket_Dataset/events.csv')
df.head()

In [None]:
## The implicit feedback for user-item pairs can be derived from the `event` column
df.event.value_counts()

In [None]:
## We take a subsample of the original dataset to train our model
df = df.sample(n=5000, random_state=0)

### Prepare dataset

- The BPR implementation in the Cornac module works on ratings (implicit feedback) for each user-item pair.

- We use the Negative Sampling method to prepare our data. It assumes that if an interaction between a user and an item was observed, the rating is set to 1; otherwise it is set to 0.

- The positive interactions are present in our dataset; we construct the negative interactions manually.
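
The labelling idea described above can be sketched on a toy example. This is only an illustration under assumptions: the user and item IDs below are made up (they are not taken from RetailRocket), and the approach of building the full user-item grid with `pd.MultiIndex.from_product` and flagging observed pairs through the merge `indicator` is one possible way to do it, not necessarily the one used in the cells that follow.

```python
import pandas as pd

# Toy positive interactions (hypothetical IDs, for illustration only)
positives = pd.DataFrame({"userID": [1, 1, 2], "itemID": [10, 11, 10]})

# Build every possible user-item pair from the unique users and items
all_pairs = pd.MultiIndex.from_product(
    [positives["userID"].unique(), positives["itemID"].unique()],
    names=["userID", "itemID"],
).to_frame(index=False)

# Left-join the observed pairs; the indicator column marks pairs seen in the data
labeled = all_pairs.merge(
    positives, on=["userID", "itemID"], how="left", indicator=True
)

# Observed pairs get rating 1, all remaining pairs get rating 0
labeled["rating"] = (labeled["_merge"] == "both").astype(int)
labeled = labeled.drop(columns="_merge")

print(labeled)
```

Note that the full grid has (number of users) x (number of items) rows, so its size grows quickly with the subsample size.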

In [None]:
## Set the rating (implicit feedback) to 1 for observed interactions between user and item
df = df[['visitorid', 'itemid']].copy()
df['FEEDBACK'] = 1

# Remove duplicates from our samples
df = df.drop_duplicates()

# Rename the columns for clarity
df.rename(columns = {'visitorid': 'userID', 'itemid': 'itemID', 'FEEDBACK': 'rating'}, inplace = True)

df.head()

In [None]:
## Obtain the unique items and users present in our dataset to generate negative interactions
item_ids = df['itemID'].unique()
user_ids = df['userID'].unique()

In [None]:
## Add negative feedback (rating 0) for every user-item pair; observed pairs are overwritten with 1 in the merge step
absent_interactions_feedback = [[user, item, 0] for item in item_ids for user in user_ids] 

# Convert prepared data into a dataframe
negative_feedback_df = pd.DataFrame(data=absent_interactions_feedback, columns=["userID", "itemID", "rating"])

negative_feedback_df.head()

In [None]:
## Merge the positive and negative feedback into one single master dataframe
prepared_dataset = pd.merge(negative_feedback_df, df, on=['userID', 'itemID'], how='outer').fillna(0).drop('rating_x', axis = 1)

# Cleaning up the column names
prepared_dataset.rename(columns = {'rating_y': 'rating'}, inplace = True)

prepared_dataset.head()

In [None]:
## Check number of positive and negative feedback samples
prepared_dataset['rating'].value_counts()

### Run and Evaluate Product Ranking Model

- We use the Cornac module to train and evaluate a Bayesian Personalised Ranking model
- We set the value of top K to 5 and train our model for 20 epochs
- We set the learning rate to 0.001
- We utilize 80% of our dataset for training and 20% for testing
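
To make the roles of the epochs, learning rate and top K settings concrete, here is a minimal NumPy sketch of BPR matrix factorisation trained with stochastic gradient ascent. It is not Cornac's implementation and not the contents of `run_bpr_model`; the function names `train_bpr` and `top_k`, the `n_factors` and `reg` parameters, and the toy interactions are all assumptions made for illustration. The sketch also assumes every user has at least one item they have not interacted with, otherwise the negative-item sampling loop would not terminate.

```python
import numpy as np

def train_bpr(positives, n_users, n_items, n_factors=8, epochs=20,
              learning_rate=0.001, reg=0.01, seed=0):
    """Illustrative BPR-MF trainer; positives is a list of (user, item) index pairs."""
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_users, n_factors))  # user factors
    V = rng.normal(scale=0.1, size=(n_items, n_factors))  # item factors

    # Items each user has interacted with, used to sample unobserved items
    seen = {}
    for u, i in positives:
        seen.setdefault(u, set()).add(i)

    for _ in range(epochs):
        for idx in rng.permutation(len(positives)):
            u, i = positives[idx]
            # Sample an item j the user has NOT interacted with
            j = int(rng.integers(n_items))
            while j in seen[u]:
                j = int(rng.integers(n_items))

            u_f, i_f, j_f = U[u].copy(), V[i].copy(), V[j].copy()
            x_uij = u_f @ (i_f - j_f)          # score difference: positive minus negative
            g = 1.0 / (1.0 + np.exp(x_uij))    # sigmoid(-x_uij), gradient of ln sigmoid(x_uij)

            # Gradient ascent on ln sigmoid(x_uij) with L2 regularisation
            U[u] += learning_rate * (g * (i_f - j_f) - reg * u_f)
            V[i] += learning_rate * (g * u_f - reg * i_f)
            V[j] += learning_rate * (-g * u_f - reg * j_f)
    return U, V

def top_k(U, V, user, k=5):
    """Return the indices of the k highest-scoring items for one user."""
    scores = V @ U[user]
    return np.argsort(-scores)[:k]

# Toy usage with made-up interactions: 3 users, 6 items
toy_positives = [(0, 0), (0, 1), (1, 2), (2, 3)]
U, V = train_bpr(toy_positives, n_users=3, n_items=6, epochs=20, learning_rate=0.001)
print(top_k(U, V, user=0, k=5))
```

In practice one would usually also exclude items the user has already interacted with before taking the top K; the sketch leaves that out for brevity.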

In [None]:
## Call our BPR model training and evaluation script on the prepared dataset
result = run_bpr_model(data=prepared_dataset, k=5, epochs=20, learning_rate=0.001, train_size=0.8)

In [None]:
## Print the model's evaluation metrics on the test data
print(result)

# **References**

1. Data Preparation for Collaborative Filtering | Microsoft
https://github.com/microsoft/recommenders/blob/main/examples/01_prepare_data/data_transform.ipynb

2. Cornac Movie Recommendation using BPR | Microsoft
https://github.com/microsoft/recommenders/blob/main/examples/02_model_collaborative_filtering/cornac_bpr_deep_dive.ipynb

3. Bayesian Personalised Ranking (BPR) Evaluation Example | PreferredAI, Cornac
https://github.com/PreferredAI/cornac/blob/master/examples/bpr_netflix.py
https://cornac.preferred.ai/

4. BPR: Bayesian personalized ranking from implicit feedback | Rendle, S., Freudenthaler, C., Gantner, Z., & Schmidt-Thieme, L. (2009, June).
https://arxiv.org/ftp/arxiv/papers/1205/1205.2618.pdf