[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/meeslindhout/Master-Thesis-Batch-Reinforcement-Learning-in-Recommender-Systems/blob/main/2.1%20Retailrocket%20Training%20Agent.ipynb)



Warning! <font color="red">Google Colab has a maximum session length of 1 hour; after 1 hour of running, a Jupyter notebook is shut down automatically. If you train multiple agents in one session, you will lose all progress once that limit is reached. It is therefore recommended to train one agent at a time and to save the model weights immediately after training!</font>  

On average it takes about 10 minutes to train an agent for 3000 episodes.

If you would like to use a pre-trained agent right away, you can download one from the following link: [Pre-trained agents](https://icthva-my.sharepoint.com/:f:/g/personal/mees_lindhout_hva_nl/EjZqaU7dDDhNlYVhK2PJT18BKCpxamzrmgyMT9tIHuk-kQ?e=JgkLKM)

## Accessing the datasets from Google Drive & cloning the GitHub repo

In [None]:
# connect to google drive to retrieve our preprocessed datasets.
from google.colab import drive
drive.mount('/content/drive')
import os

In [None]:
# the processed datasets folder needs to be moved to the github data folder
!ls "drive/MyDrive/Thesis research project"

In [None]:
# Clone the GitHub repo that contains the code of the training agent
!git clone https://github.com/meeslindhout/Master-Thesis-Batch-Reinforcement-Learning-in-Recommender-Systems.git
# you can ignore the lfs error.

In [None]:
# At this point the data folder in the cloned repo does not yet contain the preprocessed data stored on Google Drive, so it has to be copied over.
!ls Master-Thesis-Batch-Reinforcement-Learning-in-Recommender-Systems/data

In [None]:
# Copy the processed data folder to the data folder in the github repo
!cp -r "/content/drive/MyDrive/Thesis research project/processed datasets" Master-Thesis-Batch-Reinforcement-Learning-in-Recommender-Systems/data

In [None]:
# Now, our data folder does contain the preprocessed data folder and we are ready to go
!ls Master-Thesis-Batch-Reinforcement-Learning-in-Recommender-Systems/data

In [None]:
# set the working directory to be our github repo
os.chdir('Master-Thesis-Batch-Reinforcement-Learning-in-Recommender-Systems')
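
Before moving on, it can help to confirm that the copied data sits where the later cells expect it. The sketch below is a minimal check, assuming the relative path used by the `pd.read_csv` call in this notebook; the helper name `missing_files` is hypothetical and not part of the repo.

```python
from pathlib import Path

def missing_files(root, relative_paths):
    """Return the relative paths that do not exist as files under root."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).is_file()]

# Path taken from the read_csv call later in this notebook.
expected = ["data/processed datasets/retailrocket/events_train.csv"]

missing = missing_files(".", expected)
if missing:
    print("Missing files:", missing)
else:
    print("All expected data files are present.")
```

If a file is reported missing, rerunning the `cp -r` cell above is the first thing to try.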

In [None]:
# install required packages
!pip install wandb -qU

In [1]:
# load all required libraries
import pandas as pd
from src.recsys_rl import rl_recommender
import wandb
from google.colab import runtime

In [None]:
# login to wandb to monitor agent training
wandb.login()

## Load the dataset

In [2]:
train = pd.read_csv(r'data/processed datasets/retailrocket/events_train.csv',
                    sep='\t')
display(train.head())
display(train.shape)

|   | Time | UserId | Type | ItemId | SessionId |
|---|---|---|---|---|---|
| 0 | 1438969904 | 2 | view | 325215 | 3 |
| 1 | 1438970013 | 2 | view | 325215 | 3 |
| 2 | 1438970212 | 2 | view | 259884 | 3 |
| 3 | 1438970468 | 2 | view | 216305 | 3 |
| 4 | 1438970905 | 2 | view | 342816 | 3 |


(1079830, 5)

In [3]:
# how many sessions are in the training set?
train['SessionId'].nunique()

322114
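
Beyond the session count, the number of events per session is often worth a look. The following sketch shows one way to get it with `groupby`; it runs on a small toy frame (hypothetical data), and on the real data `toy` would simply be replaced by `train`.

```python
import pandas as pd

# Toy events (hypothetical data) using the same column names as the dataset.
toy = pd.DataFrame({
    "SessionId": [3, 3, 3, 7, 7, 9],
    "ItemId": [325215, 325215, 259884, 216305, 342816, 100001],
})

# Number of distinct sessions, as in the cell above.
n_sessions = toy["SessionId"].nunique()

# Number of events in each session, indexed by SessionId.
session_lengths = toy.groupby("SessionId").size()

print(n_sessions)
print(session_lengths.describe())
```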

In [4]:
# get unique values from type
train['Type'].unique()

array(['view', 'addtocart', 'transaction'], dtype=object)

In [11]:
# encode event types as integers: view -> 0, addtocart -> 1, transaction -> 2
train['Type'] = train['Type'].replace({'view': 0, 'addtocart':1, 'transaction': 2})
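
As a small illustration of this encoding step, the sketch below applies the same mapping to a toy DataFrame and then looks up a reward per event, using the `reward_dict` values passed to the agent in this notebook. The toy data and the `Reward` column are hypothetical. How `rl_recommender` applies `reward_dict` internally is not shown here, so the lookup is an assumption made for illustration.

```python
import pandas as pd

# Toy events (hypothetical data, same column name as the real dataset).
toy = pd.DataFrame({"Type": ["view", "addtocart", "view", "transaction"]})

# Same encoding as in the cell above.
event_codes = {"view": 0, "addtocart": 1, "transaction": 2}
toy["Type"] = toy["Type"].map(event_codes)

# Reward values copied from the reward_dict used for the agent in this notebook.
reward_dict = {0: 3, 1: 8, 2: 10}

# Assumed lookup, for illustration only: one reward per encoded event.
toy["Reward"] = toy["Type"].map(reward_dict)

print(toy)
```

This sketch uses `map` rather than `replace`: with `map`, any event type missing from the dictionary becomes `NaN`, which makes unexpected values easy to spot.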

In [None]:
# initialize the model / agent
model = rl_recommender(
    n_history=5,
    reward_dict={0:3, 1:8, 2:10},
    event_key='Type',
    mode='training',
    num_episodes=3000,
    batch_size=64,
    target_update_freq=1000,
    memory=100_000,
    learning_rate=0.0003,
    gamma=0.99,
    dataset_name='retailrocket',
    custom_wandb_note=''
    )
model.fit(train=train)
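
To make `gamma` and `reward_dict` a bit more concrete: under the standard discounted-return definition from reinforcement learning, a session consisting of a view, an add-to-cart and a transaction would be valued as sketched below. This only illustrates the arithmetic behind the two settings; it is not taken from the `rl_recommender` source, and the function name `discounted_return` is hypothetical.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the reward at step t is weighted by gamma**t."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

# Rewards for view (3), addtocart (8), transaction (10), as in reward_dict.
rewards = [3, 8, 10]
g = discounted_return(rewards, gamma=0.99)
print(round(g, 3))  # 3 + 0.99 * 8 + 0.99**2 * 10 = 20.721
```

With `gamma` this close to 1, later events in a session count almost as much as the immediate one.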

In [None]:
# copy all trained agents to google drive
!cp -r "/content/Master-Thesis-Batch-Reinforcement-Learning-in-Recommender-Systems/trained agents/" "/content/drive/MyDrive/Thesis research project/"

In [None]:
# Unassign the runtime from Google Colab once training and copying are done.
# This releases the session instead of leaving it idle until Colab disconnects it after 12 hours.
runtime.unassign()