## Overview

This notebook is a variation on the second notebook in the project, which implements the candidates model.

The difference lies in how the user's complex embedding is created.\
Instead of averaging the embeddings of the movies in the viewing history, an RNN unit is applied to the history.
Several modifications had to be made to accommodate the RNN:
1. The user history still has a constant size of 50, with padding when necessary, but the 0 padding is now placed to the left of the history. We verified that this change has no effect on the nominal results, so the padding is handled correctly.
2. The handling of the 0 (padding) embedding had to change. Previously we could simply exclude the 0 embedding from the average; now we force the 0 embedding to always be a vector of 0s, so that it does not interfere with the RNN result.
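
As an illustration of the left padding described in item 1, here is a minimal, framework-free sketch. The helper name `left_pad` is hypothetical and not part of the notebook; it also assumes that item id 0 is reserved for padding and that histories longer than the limit are truncated to the most recent items, which the notebook does not state explicitly.

```python
def left_pad(history, max_len=50, pad_id=0):
    """Keep the most recent `max_len` items and pad on the left with `pad_id`.

    Assumptions (not confirmed by the notebook): `pad_id` 0 is never a real
    item id, `max_len` is positive, and long histories are truncated to their
    most recent items.
    """
    recent = list(history)[-max_len:]
    return [pad_id] * (max_len - len(recent)) + recent


padded = left_pad([17, 42, 5], max_len=6)
print(padded)  # [0, 0, 0, 17, 42, 5]
```

With the padding on the left, the last position of the sequence always holds the most recent real item, which is what the RNN sees at its final time step.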

### RNN variations and outputs
We tried both GRU and LSTM as the RNN unit.\
The configuration was always unidirectional with 2 layers.

For the final RNN representation, we tried both the last hidden state and the average of the outputs over all time steps.
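
The two pooling options can be sketched without any framework as follows. This is only an illustration on plain Python lists, not the notebook's PyTorch code; the function names `last_state` and `masked_mean` and the toy numbers are hypothetical. The averaging variant skips padded positions and falls back to a zero vector when the history is all padding.

```python
def last_state(outputs):
    """Return the RNN output at the final time step."""
    return outputs[-1]


def masked_mean(outputs, item_ids, pad_id=0):
    """Average the outputs over the time steps that hold a real item."""
    kept = [out for out, item in zip(outputs, item_ids) if item != pad_id]
    if not kept:  # history is all padding: fall back to a zero vector
        return [0.0] * len(outputs[0])
    return [sum(dim) / len(kept) for dim in zip(*kept)]


item_ids = [0, 0, 7, 3]  # left-padded history (toy values)
outputs = [[9.0, 9.0], [9.0, 9.0], [1.0, 2.0], [3.0, 4.0]]  # toy per-step outputs

print(last_state(outputs))             # [3.0, 4.0]
print(masked_mean(outputs, item_ids))  # [2.0, 3.0]
```

Because the history is padded on the left, the last time step corresponds to the most recent watched movie, so the last-state option needs no mask.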

### Results

1. We were not able to improve on the results of the basic averaging model.
2. LSTM performed much better than GRU and came close to our best results, with 40.3% HR@200.



**Imports and administration**

In [1]:
# basic
from google.colab import runtime
import os 
import sys
import math
from time import time
import zipfile
import requests
import pickle
!pip install --upgrade --no-cache-dir gdown
import gdown

# general
import warnings
import numpy as np
import pandas as pd
# !pip install scikit-learn
# from sklearn.neighbors import LSHForest
# from sklearn import neighbors.LSHForest
# !apt install libomp-dev
# !python -m pip install --upgrade faiss faiss-gpu
# !pip install faiss-gpu
# import faiss
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import average_precision_score

# visual
import matplotlib
import seaborn as sns
import matplotlib.pyplot as plt

# notebook
from IPython.display import display, HTML
from tqdm import tqdm
import copy


# torch
import torch
from torch import nn
import torch.nn.functional as F
from torch.nn import Sequential
from torch.nn import Sigmoid,ReLU
from torch.nn import Embedding,Linear,Dropout
from torch.utils.data import DataLoader, Dataset
from torchvision.transforms import ToTensor,Compose
from torch.optim import SparseAdam,Adam,Adagrad,SGD

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
device

In [None]:
warnings.filterwarnings('ignore')

In [None]:
# from google.colab import drive
# drive.mount('/content/drive')

Mounted at /content/drive


**Loading the datasets**

In [None]:
# path = './drive/My Drive/Colab Notebooks/recsys_final_project/train_10_gru'
# with open(path , 'rb') as f:
#   (train, test, val, n_items, max_watches, num_neg_samples) = pickle.load(f)
#   print('loaded training')

loaded training


In [2]:
url = "https://drive.google.com/file/d/1-0YxUZUm3Vw3HzTCB4RHrH7upK9gcnXb/view?usp=sharing"
datasets_path = 'gru_datasets'
gdown.download(url, datasets_path, quiet=False,fuzzy=True)

# path = './drive/My Drive/Colab Notebooks/recsys_final_project/train_10_gru'
with open(datasets_path , 'rb') as f:
  (train, test, val, n_items, max_watches, num_neg_samples) = pickle.load(f)
  print('loaded training')

Downloading...
From: https://drive.google.com/uc?id=1-0YxUZUm3Vw3HzTCB4RHrH7upK9gcnXb
To: /content/gru_datasets
100%|██████████| 402M/402M [00:02<00:00, 193MB/s]


loaded training


In [None]:
n_users = 6040

In [None]:
class Candidates(torch.nn.Module):
    def __init__(self, config):
        super(Candidates, self).__init__()

        self.num_items = config['num_items']
        self.n_neg_samples = config['n_neg_samples']
        self.max_watches = config['max_watches']
        self.latent_dim = config['latent_dim']
        self.features_dim = config['features_dim']
        self.layers = config['layers']

        self.embed_items = torch.nn.Embedding(num_embeddings=self.num_items, embedding_dim=self.latent_dim) #log 4000 ~12
        # with torch.no_grad():
        #   self.embed_items.weight[0,:] = torch.zeros(self.latent_dim)#.to(device)
        self.embed_occ = torch.nn.Embedding(num_embeddings=21, embedding_dim=5) # 21 occupations log~5
        self.embed_age = torch.nn.Embedding(num_embeddings=7, embedding_dim=3) # 7 age groups log~3

        # self.rnn = nn.GRU(input_size = self.latent_dim, hidden_size = self.latent_dim, \
        #             num_layers = 2, batch_first = True, bidirectional=False) 
        
        self.rnn = nn.LSTM(input_size = self.latent_dim, hidden_size = self.latent_dim, \
                    num_layers = 2, batch_first = True, bidirectional=False) 

        self.fc_layers = torch.nn.ModuleList()
        self.fc_layers.append(torch.nn.Linear(self.features_dim, self.layers[0])) #linking features to MLP
        if len(self.layers) > 1:
          for (in_size, out_size) in zip(self.layers[:-1], self.layers[1:]):
              self.fc_layers.append(torch.nn.Linear(in_size, out_size))
        
        # self.LSHIndex = LSHIndex()
        self.N_candidates = config['N_candidates']
        self.knn = KNeighborsClassifier(n_neighbors=self.N_candidates) #initiating the KNN model with maximal N_candidates

    def forward(self, features, mode):
        # previous watches embedding, fed through the RNN
        # We must take into account the possible padding with 0's of the user history (padding is on the left)
        n = self.max_watches
        previous_watches = features[:, :n].int()
        with torch.no_grad():
          self.embed_items.weight[0,:] = torch.zeros(self.latent_dim)#.to(device)
        embedded_items = self.embed_items(previous_watches)
        embedded_items_rnn, ht = self.rnn(embedded_items)

        last_hidden_rnn = embedded_items_rnn[:,-1, :] # Taking the last hidden state

        # # optional averaging of RNN outputs
        # mask = torch.unsqueeze((previous_watches > 0),2)
        # sum = mask.sum(dim=1)
        # sum[sum == 0] = 1 #so we don't divide by 0 and get nan (even though overwritten)
        # embedded_items_mean = (embedded_items_rnn*mask).sum(dim=1)/sum
        # n_indices = torch.count_nonzero(previous_watches, dim=1)
        # embedded_items_mean[(n_indices == 0)] = torch.zeros(self.latent_dim).to(device)

        # other embeddings
        user_age = features[:, n].int()
        emb_age = self.embed_age(user_age)
        user_occupation = features[:, n+1].int()
        emb_occ = self.embed_occ(user_occupation)
        
        # other features
        other_features = features[:, n+2:n+5]
        
        # all
        # vector = torch.hstack((embedded_items_mean, emb_age, emb_occ, other_features)).to(torch.float32)
        vector = torch.hstack((last_hidden_rnn, emb_age, emb_occ, other_features)).to(torch.float32)

        for layer in self.fc_layers[:-1]:
            vector = F.relu(layer(vector))
        vector = self.fc_layers[-1](vector) #last layer without RELU

        if mode == 'training':
          item_id_label = features[:, -self.n_neg_samples-1].int()
          negative_samples = features[:, -self.n_neg_samples:].int() #negative samples are precalculated

          sample_indices = torch.hstack([torch.unsqueeze(item_id_label, 1), negative_samples])
          sample_embeddings = self.embed_items(sample_indices)
          dot_products = torch.matmul(sample_embeddings, torch.unsqueeze(vector, dim = 2))
          return dot_products.squeeze()
    
        elif mode == 'serving':
          # candidates = self.lshf.kneighbors(vector, n_neighbors=self.N_candidates, return_distance=False)
          # candidates = self.LSHIndex.query(vector, k = self.N_candidates)
          candidates = self.knn.kneighbors(vector.detach().cpu().numpy(), return_distance=False)
          return candidates #only indices

In [None]:
class Training(object):

  def __init__(self, model, config):
    self.config = config
    self.model = model.to(self.config['device'])
    self.n_neg_samples = config['n_neg_samples']
    self.labels = torch.hstack((torch.tensor(1), torch.zeros(self.n_neg_samples))).repeat(config['batch_size'], 1).to(device)
    #importance weights with ratio of ~3000/100 = 30
    self.importance_weights = torch.hstack((torch.tensor(1), 30*torch.ones(self.n_neg_samples))).to(device)
    self.optimizer = config['optimizer_type'](model.parameters(), **config['optimizer_parameter'])
    self.criterion = config['criterion'](weight = self.importance_weights) #here we add importance weights to the loss function
    self.dl_train = DataLoader(train, batch_size=config['batch_size'], shuffle=True) # create dataloader with given batch size
    self.dl_val = DataLoader(val, batch_size=config['batch_size'], shuffle=False) # create dataloader with given batch size
    self.knn_labels = np.ones(self.model.num_items)
    self.AP_labels = torch.tensor([1, 0]).repeat(1, n_users).squeeze()

  def train(self):
    self.train_loss_history = []
    self.eval_loss_history = []
    self.eval_map_history = []
    self.eval_HR100_history = []
    self.eval_HR200_history = []
    self.eval_MRR100_history = []
    self.eval_MRR200_history = []
    self.eval_NDCG100_history = []
    self.eval_NDCG200_history = []

    epochs_without_improvement = 0
    best_HR = None 
    train_start = time()
    for epoch in range(self.config['n_epochs']):
      self.train_epoch() #train
      self.train_loss_history.append(self.epoch_train_loss/len(self.dl_train))
      # extract all embeddings
      all_embeddings = self.model.embed_items(torch.arange(0, n_items).to(device)).detach().cpu().numpy()
      self.model.knn.fit(all_embeddings, self.knn_labels)
      # self.model.LSHIndex.build(all_embeddings)
      # self.model.lshf.kneighbors.fit(all_embeddings)
      if epoch%1 == 0:
      # if epoch%10 == 0:
        self.evaluate_epoch(self.dl_val) #evaluate
        # aggregate metrics: note len(val) = len(test)
        self.eval_loss_history.append(self.epoch_eval_loss/len(self.dl_val))
        average_precision = average_precision_score(self.AP_labels, self.epoch_pred)
        self.eval_map_history.append(average_precision)
        self.eval_HR100_history.append(self.epoch_HR100/len(val))
        self.eval_HR200_history.append(self.epoch_HR200/len(val))
        self.eval_MRR100_history.append(self.epoch_MRR100/len(val))
        self.eval_MRR200_history.append(self.epoch_MRR200/len(val))
        self.eval_NDCG100_history.append(self.epoch_NDCG100/len(val))
        self.eval_NDCG200_history.append(self.epoch_NDCG200/len(val))
        print(f'epoch {epoch}: loss = {self.train_loss_history[-1]}, HR@200 = {self.eval_HR200_history[-1]}, AP = {self.eval_map_history[-1]}')
        # print(f'epoch {epoch}: loss = {self.train_loss_history[-1]}, HR@200 = {self.eval_HR200_history[-1]}')
        #check for early stopping
        if not best_HR or self.eval_HR200_history[-1] > best_HR:
          best_HR = self.eval_HR200_history[-1]
          # best_MRR200 = self.eval_MRR200_history[-1]
          # best_NDCG200 = self.eval_NDCG200_history[-1]
          # best_loss = self.eval_loss_history[-1]
          epochs_without_improvement = 0
          #print ("Achieved lower validation loss, save model at epoch number {} ".format(epoch + 1) )
          best_model = copy.deepcopy(self.model.state_dict())
        else:
          epochs_without_improvement += 1

        if epochs_without_improvement == self.config['early_stopping']:
          if self.config['verbose']:
              print('\nEarly stopping after {} epochs. Validation HR@200 did not improve for {} epochs'.format(epoch, self.config['early_stopping']))
          break
    self.training_time = time() - train_start

    # load best model and best performance
    self.model.load_state_dict(best_model)
    if self.config['verbose']:
        print('\nFinished Training:')
        print('Best metrics are:')
        print(f'Hit Ratio eval = {best_HR}')
    
  def train_epoch(self):
    self.epoch_train_loss   = 0
    self.model.train() # train mode
    for batch in tqdm(self.dl_train, disable=(not self.config['verbose'])):
      self.train_batch(batch)

  def train_batch(self, batch):
     
    batch = batch.to(device)  
    pred = self.model(batch, mode = 'training')
    labels = self.labels
    if pred.shape[0] < self.config['batch_size']:
      labels = torch.hstack((torch.tensor(1), torch.zeros(self.n_neg_samples))).repeat(pred.shape[0], 1).to(device)
    loss = self.criterion(pred, labels)
              
    self.optimizer.zero_grad()
    loss.backward()
    self.optimizer.step()               
    self.epoch_train_loss += loss.item()

  def evaluate_epoch(self, dl_eval):
    self.epoch_eval_loss = 0
    self.epoch_pred = torch.empty(0) #Aggregated for AP calculation
    self.epoch_HR100 = 0
    self.epoch_HR200 = 0
    self.epoch_MRR100 = 0
    self.epoch_MRR200 = 0
    self.epoch_NDCG100 = 0
    self.epoch_NDCG200 = 0

    self.model.eval() #evaluation mode
    with torch.no_grad():
      for batch in tqdm(dl_eval, disable=(not self.config['verbose'])):
        self.eval_batch(batch) # each row of a batch corresponds to one user


  def eval_batch(self, batch):
    target_items = batch[:, -num_neg_samples-1]
    # Send tensor to GPU    
    batch = batch.to(device)

    #evaluations of candidates retrieval
    candidates_batch = self.model(batch, mode = 'serving') #already returned sorted from high to low
    for candidates, target_item in zip(candidates_batch, target_items):
      self.epoch_HR100 += self.HitRatio(candidates, target_item, 100)
      self.epoch_HR200 += self.HitRatio(candidates, target_item, 200)
      self.epoch_MRR100 += self.MRR(candidates, target_item, 100)
      self.epoch_MRR200 += self.MRR(candidates, target_item, 200)
      self.epoch_NDCG100 += self.NDCG(candidates, target_item, 100)
      self.epoch_NDCG200 += self.NDCG(candidates, target_item, 200)
  
    #AP evaluation using negative samples
    pred = self.model(batch, mode = 'training') # not really training - extracts negative samples prediction for AP calculation
    labels = self.labels
    if pred.shape[0] < self.config['batch_size']:
      labels = torch.hstack((torch.tensor(1), torch.zeros(self.n_neg_samples))).repeat(pred.shape[0], 1).to(device)
    loss = self.criterion(pred, labels)        
    self.epoch_eval_loss += loss.item()

    probabilities = F.softmax(pred.detach().cpu(), dim = 1)
    for prob in probabilities: #concatenating predictions for average precision calculation at the end of the epoch
      self.epoch_pred = torch.hstack((self.epoch_pred, prob[:2])) #first two here (positive and negative)


  def extract_candidates(self, dl_eval):
    self.model.eval() #evaluation mode
    all_candidates = np.empty((0, self.model.N_candidates))
    with torch.no_grad():
      for batch in tqdm(dl_eval, disable=(not self.config['verbose'])):
        target_items = batch[:, -num_neg_samples-1]
        # Send tensor to GPU    
        batch = batch.to(device)
        candidates_batch = self.model(batch, mode = 'serving')
        all_candidates = np.vstack([all_candidates, candidates_batch])
        # previous_watches = batch[:, :max_watches].int()
        # for candidates, prev_watches in zip(candidates_batch, previous_watches):
        #   #remove previous watches from candidates, since there are no duplicates in this dataset
        #   candidates = [x for x in candidates if x not in prev_watches]
      return all_candidates

  def HitRatio(self, ranked_items, target_item, k):
    for item in ranked_items[:k]:
      if item == target_item:
        return 1
    return 0

  def MRR(self, ranked_items, target_item, k):
    for i, item in enumerate(ranked_items[:k]):
      if item == target_item:
        return 1/(i + 1)
    return 0

  def NDCG(self, ranked_items, target_item, k):
    for i, item in enumerate(ranked_items[:k]):
      if item == target_item:
        return np.log(2)/np.log(i + 2)
    return 0


### Experiments

Below are the training logs for the different configurations (alternative settings are left in the code as comments).\
The best configuration was a 2-layer LSTM with an MLP of layers = [1024, 512, 256].\
Its test HR@200 was 40.3%, slightly below the averaging model.\
We ran only 61 epochs in all cases, which is not enough to converge completely, but the results are still consistently lower than the nominal.
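
For reference, the ranking metrics quoted here are computed per user against a single held-out target item and then averaged over users. The following standalone sketch mirrors the `HitRatio`, `MRR` and `NDCG` methods of the `Training` class; the snake_case function names and the toy candidate list are hypothetical.

```python
import math


def hit_ratio(ranked, target, k):
    """1 if the target appears among the top-k candidates, else 0."""
    return 1 if target in list(ranked[:k]) else 0


def mrr(ranked, target, k):
    """Reciprocal of the target's 1-based rank within the top k, else 0."""
    top = list(ranked[:k])
    return 1 / (top.index(target) + 1) if target in top else 0


def ndcg(ranked, target, k):
    """1 / log2(rank + 1) for a single relevant item within the top k, else 0."""
    top = list(ranked[:k])
    return math.log(2) / math.log(top.index(target) + 2) if target in top else 0


ranked = [30, 11, 52, 8]  # toy candidate ids, best first
target = 52               # the held-out item sits at rank 3

print(hit_ratio(ranked, target, 3))  # 1
print(hit_ratio(ranked, target, 2))  # 0
print(mrr(ranked, target, 3))        # one third
print(ndcg(ranked, target, 3))       # about 0.5, i.e. 1 / log2(4)
```

An HR@200 of 40.3% therefore means that for roughly 40% of the test users the held-out movie appears among the 200 nearest candidates.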

In [None]:
# best_results = pd.DataFrame(columns=['maximal width', 'Batch Size', 'Learning Rate', 'Topk', 'Metric', 'Score']) #uncomment to run

In [None]:
def add_results(training_model, results_df, max_width, batch_size, lr):

  MAP = average_precision_score(training_model.AP_labels, training_model.epoch_pred)
  loss = training_model.epoch_eval_loss/len(val)
  hr100 = training_model.epoch_HR100/len(val)
  hr200 = training_model.epoch_HR200/len(val)
  mrr100 = training_model.epoch_MRR100/len(val)
  mrr200 = training_model.epoch_MRR200/len(val)
  ndcg100 = training_model.epoch_NDCG100/len(val)
  ndcg200 = training_model.epoch_NDCG200/len(val)
  tr_time = training_model.training_time

  results_df.loc[len(results_df)] = max_width, batch_size, lr, 0, 'LOSS', loss
  results_df.loc[len(results_df)] = max_width, batch_size, lr, 0, 'MAP', MAP
  results_df.loc[len(results_df)] = max_width, batch_size, lr, 0, 'TIME', tr_time
  results_df.loc[len(results_df)] = max_width, batch_size, lr, 100, 'HR' , hr100
  results_df.loc[len(results_df)] = max_width, batch_size, lr, 200, 'HR', hr200
  results_df.loc[len(results_df)] = max_width, batch_size, lr, 100, 'MRR', mrr100
  results_df.loc[len(results_df)] = max_width, batch_size, lr, 200, 'MRR', mrr200
  results_df.loc[len(results_df)] = max_width, batch_size, lr, 100, 'NDCG', ndcg100
  results_df.loc[len(results_df)] = max_width, batch_size, lr, 200, 'NDCG', ndcg200

In [None]:
batch_size = 50 # with 2 layers LSTM and 0 for 0 embedding - reverse padding
dl_test = DataLoader(test, batch_size=batch_size, shuffle=False)
lr = 0.001
layers = [1024, 512, 256] 

torch.manual_seed(42)
np.random.seed(42) 
maximal_width = layers[0]
Candidates_config = {'num_items': n_items, 'n_neg_samples': num_neg_samples, 'max_watches': max_watches, 'latent_dim': 256, 'features_dim': 267, 'layers':layers, 'N_candidates':250}
model_Candidates = Candidates(Candidates_config)

training_config = {'n_neg_samples': num_neg_samples, 'batch_size': batch_size, 'optimizer_type': Adam, 'optimizer_parameter': {'lr': lr}, \
              'criterion' : torch.nn.CrossEntropyLoss, 'n_epochs' : 61, 'early_stopping' : 8, 'verbose' : True, 'device' : device}


training_candidates = Training(model_Candidates, training_config)
training_candidates.train()

100%|██████████| 6040/6040 [01:09<00:00, 87.26it/s]
100%|██████████| 121/121 [00:55<00:00,  2.18it/s]


epoch 0: loss = 2.846972787735478, HR@200 = 0.13990066225165562, AP = 0.9178394396176548


100%|██████████| 6040/6040 [01:07<00:00, 88.91it/s]
100%|██████████| 121/121 [00:54<00:00,  2.22it/s]


epoch 1: loss = 2.1036979544439065, HR@200 = 0.1607615894039735, AP = 0.9350982177866177


100%|██████████| 6040/6040 [01:07<00:00, 89.00it/s]
100%|██████████| 121/121 [00:54<00:00,  2.23it/s]


epoch 2: loss = 1.7959864473895522, HR@200 = 0.1804635761589404, AP = 0.9402215113224162


100%|██████████| 6040/6040 [01:08<00:00, 88.79it/s]
100%|██████████| 121/121 [00:53<00:00,  2.26it/s]


epoch 3: loss = 1.5865162094105159, HR@200 = 0.19983443708609272, AP = 0.9404166902192743


100%|██████████| 6040/6040 [01:08<00:00, 88.35it/s]
100%|██████████| 121/121 [00:53<00:00,  2.26it/s]


epoch 4: loss = 1.4302396861823978, HR@200 = 0.2208609271523179, AP = 0.9381992735547047


100%|██████████| 6040/6040 [01:08<00:00, 88.57it/s]
100%|██████████| 121/121 [00:52<00:00,  2.29it/s]


epoch 5: loss = 1.3102589648587024, HR@200 = 0.23576158940397351, AP = 0.9364530807938034


100%|██████████| 6040/6040 [01:08<00:00, 88.80it/s]
100%|██████████| 121/121 [00:52<00:00,  2.31it/s]


epoch 6: loss = 1.2145880748351283, HR@200 = 0.26026490066225166, AP = 0.9344839984610211


100%|██████████| 6040/6040 [01:08<00:00, 88.26it/s]
100%|██████████| 121/121 [00:52<00:00,  2.30it/s]


epoch 7: loss = 1.1454918512593437, HR@200 = 0.2589403973509934, AP = 0.9319684955535823


100%|██████████| 6040/6040 [01:08<00:00, 87.79it/s]
100%|██████████| 121/121 [00:51<00:00,  2.36it/s]


epoch 8: loss = 1.088430686931539, HR@200 = 0.27549668874172184, AP = 0.9313438811809623


100%|██████████| 6040/6040 [01:08<00:00, 87.61it/s]
100%|██████████| 121/121 [00:51<00:00,  2.36it/s]


epoch 9: loss = 1.042808849670437, HR@200 = 0.28824503311258276, AP = 0.9296121102177879


100%|██████████| 6040/6040 [01:08<00:00, 88.53it/s]
100%|██████████| 121/121 [00:51<00:00,  2.35it/s]


epoch 10: loss = 1.0046956836535836, HR@200 = 0.3, AP = 0.9297834593647021


100%|██████████| 6040/6040 [01:08<00:00, 88.50it/s]
100%|██████████| 121/121 [00:50<00:00,  2.40it/s]


epoch 11: loss = 0.9709330872737414, HR@200 = 0.3172185430463576, AP = 0.9303070973613194


100%|██████████| 6040/6040 [01:07<00:00, 89.15it/s]
100%|██████████| 121/121 [00:50<00:00,  2.42it/s]


epoch 12: loss = 0.9409051820931845, HR@200 = 0.316887417218543, AP = 0.9279148074375285


100%|██████████| 6040/6040 [01:08<00:00, 88.48it/s]
100%|██████████| 121/121 [00:50<00:00,  2.41it/s]


epoch 13: loss = 0.9215787431558237, HR@200 = 0.3293046357615894, AP = 0.9271868464918587


100%|██████████| 6040/6040 [01:08<00:00, 88.42it/s]
100%|██████████| 121/121 [00:49<00:00,  2.46it/s]


epoch 14: loss = 0.8975180417842028, HR@200 = 0.33427152317880793, AP = 0.9287457442215659


100%|██████████| 6040/6040 [01:08<00:00, 88.80it/s]
100%|██████████| 121/121 [00:49<00:00,  2.44it/s]


epoch 15: loss = 0.8803148831974789, HR@200 = 0.34139072847682117, AP = 0.9236642656069736


100%|██████████| 6040/6040 [01:08<00:00, 88.62it/s]
100%|██████████| 121/121 [00:49<00:00,  2.43it/s]


epoch 16: loss = 0.8594140406673317, HR@200 = 0.3485099337748344, AP = 0.9244465079475914


100%|██████████| 6040/6040 [01:08<00:00, 88.46it/s]
100%|██████████| 121/121 [00:48<00:00,  2.47it/s]


epoch 17: loss = 0.8426348207086719, HR@200 = 0.35016556291390727, AP = 0.925687818197404


100%|██████████| 6040/6040 [01:09<00:00, 87.44it/s]
100%|██████████| 121/121 [00:49<00:00,  2.43it/s]


epoch 18: loss = 0.8290972479649924, HR@200 = 0.35844370860927155, AP = 0.9247208733642707


100%|██████████| 6040/6040 [01:08<00:00, 87.83it/s]
100%|██████████| 121/121 [00:49<00:00,  2.44it/s]


epoch 19: loss = 0.8151336828653781, HR@200 = 0.3665562913907285, AP = 0.923544366376249


100%|██████████| 6040/6040 [01:08<00:00, 88.66it/s]
100%|██████████| 121/121 [00:48<00:00,  2.51it/s]


epoch 20: loss = 0.8008341050651294, HR@200 = 0.35894039735099337, AP = 0.9207100182150614


100%|██████████| 6040/6040 [01:08<00:00, 88.67it/s]
100%|██████████| 121/121 [00:49<00:00,  2.47it/s]


epoch 21: loss = 0.7885462208546156, HR@200 = 0.36324503311258277, AP = 0.9243865969178315


100%|██████████| 6040/6040 [01:07<00:00, 89.52it/s]
100%|██████████| 121/121 [00:47<00:00,  2.55it/s]


epoch 22: loss = 0.7798473315991905, HR@200 = 0.3610927152317881, AP = 0.9233171189383631


100%|██████████| 6040/6040 [01:07<00:00, 89.71it/s]
100%|██████████| 121/121 [00:48<00:00,  2.50it/s]


epoch 23: loss = 0.7729722452923556, HR@200 = 0.36771523178807947, AP = 0.9228724774380639


100%|██████████| 6040/6040 [01:07<00:00, 89.92it/s]
100%|██████████| 121/121 [00:47<00:00,  2.54it/s]


epoch 24: loss = 0.7593267614231599, HR@200 = 0.36821192052980134, AP = 0.9222520136839889


100%|██████████| 6040/6040 [01:07<00:00, 89.75it/s]
100%|██████████| 121/121 [00:48<00:00,  2.52it/s]


epoch 25: loss = 0.7516782520301887, HR@200 = 0.3793046357615894, AP = 0.92052401978271


100%|██████████| 6040/6040 [01:07<00:00, 90.03it/s]
100%|██████████| 121/121 [00:47<00:00,  2.54it/s]


epoch 26: loss = 0.7441569937602771, HR@200 = 0.37996688741721857, AP = 0.9205609158484374


100%|██████████| 6040/6040 [01:07<00:00, 89.30it/s]
100%|██████████| 121/121 [00:47<00:00,  2.55it/s]


epoch 27: loss = 0.7331545509052593, HR@200 = 0.38079470198675497, AP = 0.9251819814218192


100%|██████████| 6040/6040 [01:07<00:00, 89.24it/s]
100%|██████████| 121/121 [00:48<00:00,  2.51it/s]


epoch 28: loss = 0.729058694605105, HR@200 = 0.38294701986754964, AP = 0.9210623255364224


100%|██████████| 6040/6040 [01:08<00:00, 88.48it/s]
100%|██████████| 121/121 [00:47<00:00,  2.54it/s]


epoch 29: loss = 0.724587277831226, HR@200 = 0.38145695364238413, AP = 0.9205795095180008


100%|██████████| 6040/6040 [01:07<00:00, 90.12it/s]
100%|██████████| 121/121 [00:47<00:00,  2.53it/s]


epoch 30: loss = 0.7149526580206014, HR@200 = 0.38559602649006625, AP = 0.9206263918645128


100%|██████████| 6040/6040 [01:07<00:00, 89.90it/s]
100%|██████████| 121/121 [00:46<00:00,  2.59it/s]


epoch 31: loss = 0.7093078122653117, HR@200 = 0.38725165562913905, AP = 0.9187457222993642


100%|██████████| 6040/6040 [01:07<00:00, 89.81it/s]
100%|██████████| 121/121 [00:48<00:00,  2.52it/s]


epoch 32: loss = 0.7058619813413809, HR@200 = 0.38973509933774836, AP = 0.9200051069110966


100%|██████████| 6040/6040 [01:07<00:00, 89.99it/s]
100%|██████████| 121/121 [00:46<00:00,  2.58it/s]


epoch 33: loss = 0.6944860411450168, HR@200 = 0.38211920529801324, AP = 0.9211150709642852


100%|██████████| 6040/6040 [01:07<00:00, 89.58it/s]
100%|██████████| 121/121 [00:47<00:00,  2.53it/s]


epoch 34: loss = 0.688203668426599, HR@200 = 0.3968543046357616, AP = 0.9204981554599883


100%|██████████| 6040/6040 [01:07<00:00, 89.57it/s]
100%|██████████| 121/121 [00:47<00:00,  2.56it/s]


epoch 35: loss = 0.6894891142351738, HR@200 = 0.38658940397350994, AP = 0.9209942032317981


100%|██████████| 6040/6040 [01:07<00:00, 89.00it/s]
100%|██████████| 121/121 [00:48<00:00,  2.50it/s]


epoch 36: loss = 0.6815407834552376, HR@200 = 0.39072847682119205, AP = 0.9204891474661094


100%|██████████| 6040/6040 [01:07<00:00, 89.20it/s]
100%|██████████| 121/121 [00:46<00:00,  2.59it/s]


epoch 37: loss = 0.674886307297953, HR@200 = 0.396523178807947, AP = 0.920636115345346


100%|██████████| 6040/6040 [01:07<00:00, 89.21it/s]
100%|██████████| 121/121 [00:48<00:00,  2.52it/s]


epoch 38: loss = 0.6722188457624604, HR@200 = 0.3890728476821192, AP = 0.919456094960711


100%|██████████| 6040/6040 [01:07<00:00, 89.26it/s]
100%|██████████| 121/121 [00:47<00:00,  2.56it/s]


epoch 39: loss = 0.6699337416359328, HR@200 = 0.4, AP = 0.9198211068099148


100%|██████████| 6040/6040 [01:07<00:00, 89.52it/s]
100%|██████████| 121/121 [00:47<00:00,  2.56it/s]


epoch 40: loss = 0.6623378868208618, HR@200 = 0.39917218543046357, AP = 0.9211837946001006


100%|██████████| 6040/6040 [01:07<00:00, 89.16it/s]
100%|██████████| 121/121 [00:47<00:00,  2.54it/s]


epoch 41: loss = 0.6548900258351142, HR@200 = 0.3968543046357616, AP = 0.9185321637602801


100%|██████████| 6040/6040 [01:07<00:00, 89.38it/s]
100%|██████████| 121/121 [00:47<00:00,  2.57it/s]


epoch 42: loss = 0.6556126787974839, HR@200 = 0.3945364238410596, AP = 0.9189081788646889


100%|██████████| 6040/6040 [01:07<00:00, 89.16it/s]
100%|██████████| 121/121 [00:48<00:00,  2.51it/s]


epoch 43: loss = 0.6487714224723199, HR@200 = 0.3963576158940397, AP = 0.9189202886374934


100%|██████████| 6040/6040 [01:07<00:00, 89.43it/s]
100%|██████████| 121/121 [00:47<00:00,  2.57it/s]


epoch 44: loss = 0.6454594080145193, HR@200 = 0.398841059602649, AP = 0.9198063048316601


100%|██████████| 6040/6040 [01:07<00:00, 89.08it/s]
100%|██████████| 121/121 [00:47<00:00,  2.53it/s]


epoch 45: loss = 0.6391169866830703, HR@200 = 0.4019867549668874, AP = 0.9177443246264623


100%|██████████| 6040/6040 [01:07<00:00, 89.64it/s]
100%|██████████| 121/121 [00:46<00:00,  2.61it/s]


epoch 46: loss = 0.6345482752812619, HR@200 = 0.4016556291390728, AP = 0.9209205628406869


100%|██████████| 6040/6040 [01:07<00:00, 89.21it/s]
100%|██████████| 121/121 [00:47<00:00,  2.54it/s]


epoch 47: loss = 0.6314739529117448, HR@200 = 0.40463576158940395, AP = 0.9176973486549358


100%|██████████| 6040/6040 [01:07<00:00, 89.34it/s]
100%|██████████| 121/121 [00:47<00:00,  2.56it/s]


epoch 48: loss = 0.6300273629016434, HR@200 = 0.40049668874172184, AP = 0.918184345553479


100%|██████████| 6040/6040 [01:07<00:00, 89.32it/s]
100%|██████████| 121/121 [00:47<00:00,  2.57it/s]


epoch 49: loss = 0.6231502461635711, HR@200 = 0.398841059602649, AP = 0.9174763060936544


100%|██████████| 6040/6040 [01:07<00:00, 89.30it/s]
100%|██████████| 121/121 [00:48<00:00,  2.51it/s]


epoch 50: loss = 0.6219466546538057, HR@200 = 0.4014900662251656, AP = 0.9201368293479654


100%|██████████| 6040/6040 [01:07<00:00, 89.05it/s]
100%|██████████| 121/121 [00:46<00:00,  2.59it/s]


epoch 51: loss = 0.6235654497334104, HR@200 = 0.405794701986755, AP = 0.91797041217722


100%|██████████| 6040/6040 [01:07<00:00, 89.33it/s]
100%|██████████| 121/121 [00:48<00:00,  2.52it/s]


epoch 52: loss = 0.6181305206090902, HR@200 = 0.40397350993377484, AP = 0.9208235315510835


100%|██████████| 6040/6040 [01:07<00:00, 89.14it/s]
100%|██████████| 121/121 [00:46<00:00,  2.60it/s]


epoch 53: loss = 0.6131944217910337, HR@200 = 0.40860927152317883, AP = 0.9188719964638021


100%|██████████| 6040/6040 [01:07<00:00, 89.18it/s]
100%|██████████| 121/121 [00:47<00:00,  2.52it/s]


epoch 54: loss = 0.6094181127833905, HR@200 = 0.40049668874172184, AP = 0.9206571062724154


100%|██████████| 6040/6040 [01:11<00:00, 84.42it/s]
100%|██████████| 121/121 [00:48<00:00,  2.51it/s]


epoch 55: loss = 0.6108525274531139, HR@200 = 0.406953642384106, AP = 0.9182739870166229


100%|██████████| 6040/6040 [01:08<00:00, 87.71it/s]
100%|██████████| 121/121 [00:47<00:00,  2.53it/s]


epoch 56: loss = 0.6038545562854863, HR@200 = 0.4051324503311258, AP = 0.9163870259983018


100%|██████████| 6040/6040 [01:07<00:00, 88.99it/s]
100%|██████████| 121/121 [00:46<00:00,  2.58it/s]


epoch 57: loss = 0.6040749022922176, HR@200 = 0.40447019867549666, AP = 0.9160959301338494


100%|██████████| 6040/6040 [01:07<00:00, 89.07it/s]
100%|██████████| 121/121 [00:48<00:00,  2.52it/s]


epoch 58: loss = 0.5977184067326073, HR@200 = 0.4076158940397351, AP = 0.917754591969393


100%|██████████| 6040/6040 [01:07<00:00, 89.22it/s]
100%|██████████| 121/121 [00:46<00:00,  2.58it/s]


epoch 59: loss = 0.5938634243713605, HR@200 = 0.4067880794701987, AP = 0.9168994776312034


100%|██████████| 6040/6040 [01:07<00:00, 89.10it/s]
100%|██████████| 121/121 [00:47<00:00,  2.53it/s]

epoch 60: loss = 0.5910417292045047, HR@200 = 0.4100993377483444, AP = 0.9177285071465753

Finished Training:
Best metrics are:
Hit Ratio eval = 0.4100993377483444





In [None]:
results = pd.DataFrame(columns=['maximal width', 'Batch Size', 'Learning Rate', 'Topk', 'Metric', 'Score'])

In [None]:
training_candidates.evaluate_epoch(dl_test)

100%|██████████| 121/121 [00:48<00:00,  2.49it/s]


In [None]:
add_results(training_candidates, results, maximal_width, batch_size, lr)

In [None]:
results

| | maximal width | Batch Size | Learning Rate | Topk | Metric | Score |
|---|---|---|---|---|---|---|
| 0 | 1024 | 50 | 0.001 | 0 | LOSS | 0.102132 |
| 1 | 1024 | 50 | 0.001 | 0 | MAP | 0.907252 |
| 2 | 1024 | 50 | 0.001 | 0 | TIME | 7128.634667 |
| 3 | 1024 | 50 | 0.001 | 100 | HR | 0.265894 |
| 4 | 1024 | 50 | 0.001 | 200 | HR | 0.403146 |
| 5 | 1024 | 50 | 0.001 | 100 | MRR | 0.014933 |
| 6 | 1024 | 50 | 0.001 | 200 | MRR | 0.015907 |
| 7 | 1024 | 50 | 0.001 | 100 | NDCG | 0.058599 |
| 8 | 1024 | 50 | 0.001 | 200 | NDCG | 0.077752 |


In [None]:
path = './drive/My Drive/Colab Notebooks/recsys_final_candidates_LSTM'
with open(path, 'wb') as f:
  pickle.dump((results), f)

In [None]:
batch_size = 50 # with 2 layers LSTM and 0 for 0 embedding - reverse padding and averaging lstm output
dl_test = DataLoader(test, batch_size=batch_size, shuffle=False)
lr = 0.001
layers = [1024, 512, 256] 

torch.manual_seed(42)
np.random.seed(42) 
maximal_width = layers[0]
Candidates_config = {'num_items': n_items, 'n_neg_samples': num_neg_samples, 'max_watches': max_watches, 'latent_dim': 256, 'features_dim': 267, 'layers':layers, 'N_candidates':250}
model_Candidates = Candidates(Candidates_config)

training_config = {'n_neg_samples': num_neg_samples, 'batch_size': batch_size, 'optimizer_type': Adam, 'optimizer_parameter': {'lr': lr},
                   'criterion': torch.nn.CrossEntropyLoss, 'n_epochs': 61, 'early_stopping': 8, 'verbose': True, 'device': device}


training_candidates = Training(model_Candidates, training_config)
training_candidates.train()

100%|██████████| 6040/6040 [01:14<00:00, 81.03it/s]
100%|██████████| 121/121 [00:55<00:00,  2.19it/s]


epoch 0: loss = 3.063849717141777, HR@200 = 0.1316225165562914, AP = 0.8986300806349167


100%|██████████| 6040/6040 [01:12<00:00, 83.53it/s]
100%|██████████| 121/121 [00:54<00:00,  2.21it/s]


epoch 1: loss = 2.2974756523276008, HR@200 = 0.1597682119205298, AP = 0.9168927263545821


100%|██████████| 6040/6040 [01:12<00:00, 83.61it/s]
100%|██████████| 121/121 [00:54<00:00,  2.24it/s]


epoch 2: loss = 2.003398383315036, HR@200 = 0.1793046357615894, AP = 0.92731791770016


100%|██████████| 6040/6040 [01:12<00:00, 83.09it/s]
100%|██████████| 121/121 [00:53<00:00,  2.25it/s]


epoch 3: loss = 1.813417441945597, HR@200 = 0.19950331125827814, AP = 0.9324450226063776


100%|██████████| 6040/6040 [01:12<00:00, 83.07it/s]
100%|██████████| 121/121 [00:52<00:00,  2.28it/s]


epoch 4: loss = 1.6749928373374687, HR@200 = 0.21142384105960266, AP = 0.9327682089556482


100%|██████████| 6040/6040 [01:13<00:00, 82.72it/s]
100%|██████████| 121/121 [00:53<00:00,  2.28it/s]


epoch 5: loss = 1.5704158214149095, HR@200 = 0.225, AP = 0.9309173865372029


100%|██████████| 6040/6040 [01:12<00:00, 83.62it/s]
100%|██████████| 121/121 [00:52<00:00,  2.30it/s]


epoch 6: loss = 1.4852838309099343, HR@200 = 0.24668874172185432, AP = 0.933235646757263


100%|██████████| 6040/6040 [01:12<00:00, 83.33it/s]
100%|██████████| 121/121 [00:52<00:00,  2.32it/s]


epoch 7: loss = 1.413505632800377, HR@200 = 0.25298013245033113, AP = 0.9350301892739495


100%|██████████| 6040/6040 [01:12<00:00, 83.12it/s]
100%|██████████| 121/121 [00:52<00:00,  2.32it/s]


epoch 8: loss = 1.3521878390418773, HR@200 = 0.26175496688741723, AP = 0.9325390451544663


100%|██████████| 6040/6040 [01:13<00:00, 82.61it/s]
100%|██████████| 121/121 [00:50<00:00,  2.39it/s]


epoch 9: loss = 1.3086873215357988, HR@200 = 0.2817880794701987, AP = 0.9323976349387806


100%|██████████| 6040/6040 [01:13<00:00, 81.79it/s]
100%|██████████| 121/121 [00:52<00:00,  2.32it/s]


epoch 10: loss = 1.2677859916789642, HR@200 = 0.28211920529801326, AP = 0.9337146340368321


100%|██████████| 6040/6040 [01:12<00:00, 82.83it/s]
100%|██████████| 121/121 [00:51<00:00,  2.36it/s]


epoch 11: loss = 1.230819908524586, HR@200 = 0.2955298013245033, AP = 0.9325130135120723


 14%|█▍        | 873/6040 [00:10<00:59, 86.25it/s]


KeyboardInterrupt: ignored
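
The run above is labelled "reverse padding": as described in the overview, each history is kept at a fixed length of 50 with the zeros placed to the left, so the RNN reads the real items last. A minimal sketch of that padding step follows. The helper name `left_pad` is hypothetical, and truncating to the most recent `max_len` items is an assumption, since the notebook's preprocessing is not shown in this section.

```python
def left_pad(history, max_len=50, pad_id=0):
    """Return `history` as a list of exactly `max_len` ids, zero-padded on the left.

    Assumptions: id 0 is reserved for padding, and histories longer than
    `max_len` keep only their most recent items.
    """
    recent = list(history)[-max_len:]
    return [pad_id] * (max_len - len(recent)) + recent

padded = left_pad([7, 8, 9], max_len=5)
```

With left padding the last time step always holds a real item, so the last hidden state of the RNN is not dominated by padding steps.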

In [None]:
batch_size = 50  # GRU, padding (0) embedding fixed to zeros, left (reversed) padding
dl_test = DataLoader(test, batch_size=batch_size, shuffle=False)
lr = 0.001
layers = [1024, 512, 256] 

torch.manual_seed(42)
np.random.seed(42) 
maximal_width = layers[0]
Candidates_config = {'num_items': n_items, 'n_neg_samples': num_neg_samples, 'max_watches': max_watches, 'latent_dim': 256, 'features_dim': 267, 'layers':layers, 'N_candidates':250}
model_Candidates = Candidates(Candidates_config)

training_config = {'n_neg_samples': num_neg_samples, 'batch_size': batch_size, 'optimizer_type': Adam, 'optimizer_parameter': {'lr': lr},
                   'criterion': torch.nn.CrossEntropyLoss, 'n_epochs': 61, 'early_stopping': 8, 'verbose': True, 'device': device}


training_candidates = Training(model_Candidates, training_config)
training_candidates.train()
# training_candidates.evaluate_epoch(dl_test)
# add_results(training_candidates, best_results, maximal_width, batch_size, lr)

100%|██████████| 6040/6040 [01:09<00:00, 87.00it/s]
100%|██████████| 121/121 [00:54<00:00,  2.23it/s]


epoch 0: loss = 2.7405855264490016, HR@200 = 0.14288079470198675, AP = 0.9204618166815888


100%|██████████| 6040/6040 [01:07<00:00, 89.87it/s]
100%|██████████| 121/121 [00:54<00:00,  2.22it/s]


epoch 1: loss = 2.067728934303814, HR@200 = 0.16655629139072847, AP = 0.9321732206347175


100%|██████████| 6040/6040 [01:08<00:00, 88.61it/s]
100%|██████████| 121/121 [00:54<00:00,  2.23it/s]


epoch 2: loss = 1.8699579397475483, HR@200 = 0.17963576158940397, AP = 0.9354707384359697


100%|██████████| 6040/6040 [01:08<00:00, 88.58it/s]
100%|██████████| 121/121 [00:53<00:00,  2.25it/s]


epoch 3: loss = 1.7626951211434327, HR@200 = 0.19751655629139073, AP = 0.9372226958875383


100%|██████████| 6040/6040 [01:07<00:00, 89.73it/s]
100%|██████████| 121/121 [00:53<00:00,  2.27it/s]


epoch 4: loss = 1.699770888834205, HR@200 = 0.20596026490066224, AP = 0.9382224538628983


100%|██████████| 6040/6040 [01:07<00:00, 89.11it/s]
100%|██████████| 121/121 [00:53<00:00,  2.27it/s]


epoch 5: loss = 1.6646960847030412, HR@200 = 0.2140728476821192, AP = 0.938994099199524


100%|██████████| 6040/6040 [01:07<00:00, 89.23it/s]
100%|██████████| 121/121 [00:51<00:00,  2.33it/s]


epoch 6: loss = 1.6437886978518095, HR@200 = 0.226158940397351, AP = 0.9377189139610822


100%|██████████| 6040/6040 [01:07<00:00, 90.07it/s]
100%|██████████| 121/121 [00:51<00:00,  2.33it/s]


epoch 7: loss = 1.6325153003662627, HR@200 = 0.23526490066225167, AP = 0.9386152529198153


100%|██████████| 6040/6040 [01:06<00:00, 91.44it/s] 
100%|██████████| 121/121 [00:50<00:00,  2.39it/s]


epoch 8: loss = 1.628298818719703, HR@200 = 0.2380794701986755, AP = 0.9409624944462736


100%|██████████| 6040/6040 [01:05<00:00, 92.59it/s]
100%|██████████| 121/121 [00:50<00:00,  2.38it/s]


epoch 9: loss = 1.6269594749670155, HR@200 = 0.2347682119205298, AP = 0.9392531911032249


100%|██████████| 6040/6040 [01:05<00:00, 92.59it/s]
100%|██████████| 121/121 [00:49<00:00,  2.45it/s]


epoch 10: loss = 1.6307036990361499, HR@200 = 0.2380794701986755, AP = 0.9404924150463214


100%|██████████| 6040/6040 [01:05<00:00, 92.67it/s]
100%|██████████| 121/121 [00:50<00:00,  2.37it/s]


epoch 11: loss = 1.6379609764806482, HR@200 = 0.2435430463576159, AP = 0.9424832807503245


100%|██████████| 6040/6040 [01:06<00:00, 90.94it/s]
100%|██████████| 121/121 [00:50<00:00,  2.38it/s]


epoch 12: loss = 1.6450330188534907, HR@200 = 0.2435430463576159, AP = 0.9416164993419122


100%|██████████| 6040/6040 [01:07<00:00, 89.49it/s]
100%|██████████| 121/121 [00:51<00:00,  2.37it/s]


epoch 13: loss = 1.6575398274506166, HR@200 = 0.24420529801324503, AP = 0.9392743076699758


100%|██████████| 6040/6040 [01:06<00:00, 90.40it/s]
100%|██████████| 121/121 [00:51<00:00,  2.36it/s]


epoch 14: loss = 1.6753103645826808, HR@200 = 0.24254966887417218, AP = 0.9394640612386311


 22%|██▏       | 1335/6040 [00:14<00:51, 90.52it/s]


KeyboardInterrupt: ignored

In [None]:
batch_size = 50
dl_test = DataLoader(test, batch_size=batch_size, shuffle=False)
lr = 0.001
layers = [1024, 512, 256] 

torch.manual_seed(42)
np.random.seed(42) 
maximal_width = layers[0]
Candidates_config = {'num_items': n_items, 'n_neg_samples': num_neg_samples, 'max_watches': max_watches, 'latent_dim': 256, 'features_dim': 267, 'layers':layers, 'N_candidates':250}
model_Candidates = Candidates(Candidates_config)

training_config = {'n_neg_samples': num_neg_samples, 'batch_size': batch_size, 'optimizer_type': Adam, 'optimizer_parameter': {'lr': lr},
                   'criterion': torch.nn.CrossEntropyLoss, 'n_epochs': 61, 'early_stopping': 8, 'verbose': True, 'device': device}


training_candidates = Training(model_Candidates, training_config)
training_candidates.train()

100%|██████████| 6040/6040 [00:30<00:00, 199.38it/s]
100%|██████████| 121/121 [00:57<00:00,  2.09it/s]


epoch 0: loss = 2.9936776732372135, HR@200 = 0.12019867549668874, AP = 0.8903327626110267


100%|██████████| 6040/6040 [00:30<00:00, 199.37it/s]
100%|██████████| 121/121 [00:55<00:00,  2.19it/s]


epoch 1: loss = 2.3623286915733326, HR@200 = 0.148841059602649, AP = 0.9012660051509083


100%|██████████| 6040/6040 [00:29<00:00, 202.49it/s]
100%|██████████| 121/121 [00:53<00:00,  2.24it/s]


epoch 2: loss = 2.108319604179717, HR@200 = 0.16705298013245032, AP = 0.9089055196364073


100%|██████████| 6040/6040 [00:29<00:00, 207.79it/s]
100%|██████████| 121/121 [00:53<00:00,  2.28it/s]


epoch 3: loss = 1.935941887849214, HR@200 = 0.17963576158940397, AP = 0.9101693249498848


100%|██████████| 6040/6040 [00:29<00:00, 207.86it/s]
100%|██████████| 121/121 [00:52<00:00,  2.29it/s]


epoch 4: loss = 1.8092913146840026, HR@200 = 0.20281456953642385, AP = 0.9092156429198874


100%|██████████| 6040/6040 [00:29<00:00, 207.23it/s]
100%|██████████| 121/121 [00:52<00:00,  2.32it/s]


epoch 5: loss = 1.6985827646310756, HR@200 = 0.21291390728476822, AP = 0.9049987625881426


100%|██████████| 6040/6040 [00:29<00:00, 203.37it/s]
100%|██████████| 121/121 [00:51<00:00,  2.34it/s]


epoch 6: loss = 1.6103372751087541, HR@200 = 0.22466887417218542, AP = 0.9081969862461365


100%|██████████| 6040/6040 [00:29<00:00, 203.98it/s]
100%|██████████| 121/121 [00:50<00:00,  2.39it/s]


epoch 7: loss = 1.527582406790446, HR@200 = 0.24420529801324503, AP = 0.9049813099231602


100%|██████████| 6040/6040 [00:28<00:00, 208.91it/s]
100%|██████████| 121/121 [00:51<00:00,  2.35it/s]


epoch 8: loss = 1.4559065719511335, HR@200 = 0.2576158940397351, AP = 0.9028328807929136


100%|██████████| 6040/6040 [00:28<00:00, 208.89it/s]
100%|██████████| 121/121 [00:51<00:00,  2.35it/s]


epoch 9: loss = 1.38821471378898, HR@200 = 0.2630794701986755, AP = 0.9012265274463876


100%|██████████| 6040/6040 [00:28<00:00, 209.40it/s]
100%|██████████| 121/121 [00:51<00:00,  2.36it/s]


epoch 10: loss = 1.3293064298317923, HR@200 = 0.2774834437086093, AP = 0.8976115543175787


100%|██████████| 6040/6040 [00:28<00:00, 209.73it/s]
100%|██████████| 121/121 [00:49<00:00,  2.44it/s]


epoch 11: loss = 1.278058978096144, HR@200 = 0.2894039735099338, AP = 0.8958426019977416


100%|██████████| 6040/6040 [00:29<00:00, 206.00it/s]
100%|██████████| 121/121 [00:49<00:00,  2.42it/s]


epoch 12: loss = 1.2263056888485586, HR@200 = 0.298841059602649, AP = 0.8913500722439394


100%|██████████| 6040/6040 [00:28<00:00, 209.76it/s]
100%|██████████| 121/121 [00:50<00:00,  2.41it/s]


epoch 13: loss = 1.1844409492434256, HR@200 = 0.3109271523178808, AP = 0.8905889548341444


100%|██████████| 6040/6040 [00:28<00:00, 208.94it/s]
 36%|███▋      | 44/121 [00:17<00:31,  2.47it/s]


KeyboardInterrupt: ignored

In [None]:
batch_size = 50  # GRU, no constraint on the padding (0) embedding, left (reversed) padding
dl_test = DataLoader(test, batch_size=batch_size, shuffle=False)
lr = 0.001
layers = [512, 256] 

torch.manual_seed(42)
np.random.seed(42) 
maximal_width = layers[0]
Candidates_config = {'num_items': n_items, 'n_neg_samples': num_neg_samples, 'max_watches': max_watches, 'latent_dim': 256, 'features_dim': 267, 'layers':layers, 'N_candidates':250}
model_Candidates = Candidates(Candidates_config)

training_config = {'n_neg_samples': num_neg_samples, 'batch_size': batch_size, 'optimizer_type': Adam, 'optimizer_parameter': {'lr': lr},
                   'criterion': torch.nn.CrossEntropyLoss, 'n_epochs': 61, 'early_stopping': 8, 'verbose': True, 'device': device}


training_candidates = Training(model_Candidates, training_config)
training_candidates.train()

100%|██████████| 6040/6040 [01:04<00:00, 94.02it/s] 
100%|██████████| 121/121 [00:53<00:00,  2.27it/s]


epoch 0: loss = 2.735156523293217, HR@200 = 0.15529801324503312, AP = 0.9194635095057576


100%|██████████| 6040/6040 [01:03<00:00, 95.87it/s] 
100%|██████████| 121/121 [00:50<00:00,  2.39it/s]


epoch 1: loss = 2.077304863475806, HR@200 = 0.1759933774834437, AP = 0.9304382272676134


100%|██████████| 6040/6040 [01:02<00:00, 96.66it/s]
100%|██████████| 121/121 [00:51<00:00,  2.35it/s]


epoch 2: loss = 1.8822249774111817, HR@200 = 0.19503311258278147, AP = 0.932448197561973


100%|██████████| 6040/6040 [01:01<00:00, 98.25it/s] 
100%|██████████| 121/121 [00:51<00:00,  2.36it/s]


epoch 3: loss = 1.7802218862914092, HR@200 = 0.20645695364238412, AP = 0.9341272691211457


100%|██████████| 6040/6040 [01:01<00:00, 98.25it/s] 
100%|██████████| 121/121 [00:51<00:00,  2.36it/s]


epoch 4: loss = 1.7172516916070553, HR@200 = 0.22350993377483444, AP = 0.9336291024854078


100%|██████████| 6040/6040 [01:03<00:00, 94.47it/s]
100%|██████████| 121/121 [00:50<00:00,  2.39it/s]


epoch 5: loss = 1.6844377179907648, HR@200 = 0.226158940397351, AP = 0.9350264249903713


100%|██████████| 6040/6040 [01:03<00:00, 95.82it/s]
100%|██████████| 121/121 [00:49<00:00,  2.44it/s]


epoch 6: loss = 1.6626047118708787, HR@200 = 0.23609271523178807, AP = 0.9365826190519915


100%|██████████| 6040/6040 [01:04<00:00, 93.13it/s] 
100%|██████████| 121/121 [00:49<00:00,  2.43it/s]


epoch 7: loss = 1.6495155298058561, HR@200 = 0.24602649006622518, AP = 0.935987357815245


100%|██████████| 6040/6040 [01:03<00:00, 94.91it/s] 
100%|██████████| 121/121 [00:50<00:00,  2.39it/s]


epoch 8: loss = 1.640501073742939, HR@200 = 0.2596026490066225, AP = 0.9375939566182476


100%|██████████| 6040/6040 [01:03<00:00, 95.37it/s] 
100%|██████████| 121/121 [00:50<00:00,  2.39it/s]


epoch 9: loss = 1.6392300001832822, HR@200 = 0.26572847682119205, AP = 0.9368373240523965


100%|██████████| 6040/6040 [01:02<00:00, 95.98it/s]
100%|██████████| 121/121 [00:50<00:00,  2.39it/s]


epoch 10: loss = 1.642552589284663, HR@200 = 0.26672185430463574, AP = 0.9379586621602323


100%|██████████| 6040/6040 [01:02<00:00, 96.95it/s]
100%|██████████| 121/121 [00:49<00:00,  2.45it/s]


epoch 11: loss = 1.6460272478050744, HR@200 = 0.272682119205298, AP = 0.9366546688026685


100%|██████████| 6040/6040 [01:02<00:00, 96.25it/s]
100%|██████████| 121/121 [00:48<00:00,  2.48it/s]


epoch 12: loss = 1.6605202651576492, HR@200 = 0.26473509933774836, AP = 0.9363118087052813


100%|██████████| 6040/6040 [01:02<00:00, 97.07it/s]
100%|██████████| 121/121 [00:49<00:00,  2.44it/s]


epoch 13: loss = 1.6726352265339024, HR@200 = 0.2640728476821192, AP = 0.9376962431777436


100%|██████████| 6040/6040 [01:02<00:00, 96.75it/s]
100%|██████████| 121/121 [00:48<00:00,  2.48it/s]


epoch 14: loss = 1.6904558524193354, HR@200 = 0.2683774834437086, AP = 0.9361114313682839


100%|██████████| 6040/6040 [01:02<00:00, 96.45it/s] 
100%|██████████| 121/121 [00:48<00:00,  2.48it/s]


epoch 15: loss = 1.7136366772631935, HR@200 = 0.26589403973509934, AP = 0.9368789186898461


100%|██████████| 6040/6040 [01:02<00:00, 96.91it/s] 
 15%|█▍        | 18/121 [00:06<00:38,  2.65it/s]


KeyboardInterrupt: ignored

In [None]:
batch_size = 50  # GRU, no constraint on the padding (0) embedding
dl_test = DataLoader(test, batch_size=batch_size, shuffle=False)
lr = 0.001
layers = [512, 256] 

torch.manual_seed(42)
np.random.seed(42) 
maximal_width = layers[0]
Candidates_config = {'num_items': n_items, 'n_neg_samples': num_neg_samples, 'max_watches': max_watches, 'latent_dim': 256, 'features_dim': 267, 'layers':layers, 'N_candidates':250}
model_Candidates = Candidates(Candidates_config)

training_config = {'n_neg_samples': num_neg_samples, 'batch_size': batch_size, 'optimizer_type': Adam, 'optimizer_parameter': {'lr': lr},
                   'criterion': torch.nn.CrossEntropyLoss, 'n_epochs': 61, 'early_stopping': 8, 'verbose': True, 'device': device}


training_candidates = Training(model_Candidates, training_config)
training_candidates.train()
# training_candidates.evaluate_epoch(dl_test)
# add_results(training_candidates, best_results, maximal_width, batch_size, lr)

100%|██████████| 6040/6040 [01:01<00:00, 97.56it/s] 
100%|██████████| 121/121 [00:53<00:00,  2.27it/s]


epoch 0: loss = 3.0310574258400114, HR@200 = 0.13956953642384107, AP = 0.9079621037539802


100%|██████████| 6040/6040 [01:02<00:00, 96.96it/s]
100%|██████████| 121/121 [00:51<00:00,  2.33it/s]


epoch 1: loss = 2.347568300682188, HR@200 = 0.15529801324503312, AP = 0.9213751951612015


100%|██████████| 6040/6040 [01:02<00:00, 96.17it/s] 
100%|██████████| 121/121 [00:52<00:00,  2.32it/s]


epoch 2: loss = 2.1164299966483715, HR@200 = 0.17102649006622517, AP = 0.923968758047196


100%|██████████| 6040/6040 [01:03<00:00, 94.86it/s] 
100%|██████████| 121/121 [00:51<00:00,  2.33it/s]


epoch 3: loss = 1.985582875021246, HR@200 = 0.18990066225165564, AP = 0.9268856073764835


100%|██████████| 6040/6040 [01:01<00:00, 97.57it/s] 
100%|██████████| 121/121 [00:51<00:00,  2.34it/s]


epoch 4: loss = 1.9076872683913502, HR@200 = 0.2, AP = 0.924688939462459


100%|██████████| 6040/6040 [01:02<00:00, 96.15it/s]
100%|██████████| 121/121 [00:50<00:00,  2.41it/s]


epoch 5: loss = 1.8636636211185267, HR@200 = 0.2120860927152318, AP = 0.9252965414191302


100%|██████████| 6040/6040 [01:02<00:00, 96.05it/s] 
100%|██████████| 121/121 [00:50<00:00,  2.41it/s]


epoch 6: loss = 1.8371288467716698, HR@200 = 0.22003311258278146, AP = 0.9249025079731861


100%|██████████| 6040/6040 [01:02<00:00, 96.37it/s] 
100%|██████████| 121/121 [00:51<00:00,  2.35it/s]


epoch 7: loss = 1.8184340212814856, HR@200 = 0.22698675496688742, AP = 0.927039253958571


100%|██████████| 6040/6040 [01:01<00:00, 97.68it/s] 
100%|██████████| 121/121 [00:51<00:00,  2.37it/s]


epoch 8: loss = 1.8143110066257566, HR@200 = 0.2349337748344371, AP = 0.9264078602514825


100%|██████████| 6040/6040 [01:01<00:00, 97.61it/s] 
100%|██████████| 121/121 [00:51<00:00,  2.37it/s]


epoch 9: loss = 1.816523803859357, HR@200 = 0.2445364238410596, AP = 0.925217528778424


100%|██████████| 6040/6040 [01:02<00:00, 97.09it/s]
100%|██████████| 121/121 [00:50<00:00,  2.38it/s]


epoch 10: loss = 1.8214989062571367, HR@200 = 0.24370860927152319, AP = 0.9235706784438558


100%|██████████| 6040/6040 [01:02<00:00, 96.19it/s]
100%|██████████| 121/121 [00:49<00:00,  2.43it/s]


epoch 11: loss = 1.8333972231145725, HR@200 = 0.24619205298013244, AP = 0.9258423244199808


100%|██████████| 6040/6040 [01:02<00:00, 96.08it/s] 
100%|██████████| 121/121 [00:49<00:00,  2.43it/s]


epoch 12: loss = 1.849354935600268, HR@200 = 0.2456953642384106, AP = 0.9266780946599538


100%|██████████| 6040/6040 [01:03<00:00, 95.72it/s] 
100%|██████████| 121/121 [00:50<00:00,  2.41it/s]


epoch 13: loss = 1.8632454869960318, HR@200 = 0.24867549668874173, AP = 0.9262024919286147


 29%|██▉       | 1767/6040 [00:18<00:44, 96.05it/s] 


KeyboardInterrupt: ignored

In [None]:
batch_size = 50  # GRU, no constraint on the padding (0) embedding
dl_test = DataLoader(test, batch_size=batch_size, shuffle=False)
lr = 0.001
layers = [1024, 512, 256] 

torch.manual_seed(42)
np.random.seed(42) 
maximal_width = layers[0]
Candidates_config = {'num_items': n_items, 'n_neg_samples': num_neg_samples, 'max_watches': max_watches, 'latent_dim': 256, 'features_dim': 267, 'layers':layers, 'N_candidates':250}
model_Candidates = Candidates(Candidates_config)

training_config = {'n_neg_samples': num_neg_samples, 'batch_size': batch_size, 'optimizer_type': Adam, 'optimizer_parameter': {'lr': lr},
                   'criterion': torch.nn.CrossEntropyLoss, 'n_epochs': 61, 'early_stopping': 8, 'verbose': True, 'device': device}


training_candidates = Training(model_Candidates, training_config)
training_candidates.train()
# training_candidates.evaluate_epoch(dl_test)
# add_results(training_candidates, best_results, maximal_width, batch_size, lr)

100%|██████████| 6040/6040 [01:05<00:00, 92.60it/s] 
100%|██████████| 121/121 [00:52<00:00,  2.30it/s]


epoch 0: loss = 2.8968589526730657, HR@200 = 0.1369205298013245, AP = 0.9157254486963474


100%|██████████| 6040/6040 [01:04<00:00, 93.25it/s] 
100%|██████████| 121/121 [00:52<00:00,  2.30it/s]


epoch 1: loss = 2.166946370613496, HR@200 = 0.16605960264900663, AP = 0.9299226641349213


100%|██████████| 6040/6040 [01:04<00:00, 92.94it/s]
100%|██████████| 121/121 [00:50<00:00,  2.38it/s]


epoch 2: loss = 1.9476828149612375, HR@200 = 0.1783112582781457, AP = 0.933915614581752


100%|██████████| 6040/6040 [01:04<00:00, 93.03it/s]
100%|██████████| 121/121 [00:51<00:00,  2.33it/s]


epoch 3: loss = 1.8369194282107795, HR@200 = 0.1923841059602649, AP = 0.9370031791564856


100%|██████████| 6040/6040 [01:04<00:00, 93.31it/s]
100%|██████████| 121/121 [00:50<00:00,  2.38it/s]


epoch 4: loss = 1.7688230554098325, HR@200 = 0.19966887417218543, AP = 0.937671213770614


100%|██████████| 6040/6040 [01:06<00:00, 91.12it/s] 
100%|██████████| 121/121 [00:51<00:00,  2.33it/s]


epoch 5: loss = 1.7302411244997125, HR@200 = 0.20430463576158941, AP = 0.938534014537501


100%|██████████| 6040/6040 [01:04<00:00, 93.59it/s]
100%|██████████| 121/121 [00:51<00:00,  2.36it/s]


epoch 6: loss = 1.7061784093743129, HR@200 = 0.2195364238410596, AP = 0.9392484708992683


100%|██████████| 6040/6040 [01:05<00:00, 92.72it/s] 
100%|██████████| 121/121 [00:50<00:00,  2.42it/s]


epoch 7: loss = 1.6903927170282957, HR@200 = 0.22019867549668873, AP = 0.9382674170701447


100%|██████████| 6040/6040 [01:04<00:00, 93.38it/s] 
100%|██████████| 121/121 [00:50<00:00,  2.38it/s]


epoch 8: loss = 1.6856329815375883, HR@200 = 0.2283112582781457, AP = 0.9389109077809293


100%|██████████| 6040/6040 [01:04<00:00, 94.03it/s]
100%|██████████| 121/121 [00:50<00:00,  2.40it/s]


epoch 9: loss = 1.6843026893719142, HR@200 = 0.22466887417218542, AP = 0.9418763602009321


100%|██████████| 6040/6040 [01:04<00:00, 93.38it/s] 
100%|██████████| 121/121 [00:49<00:00,  2.43it/s]


epoch 10: loss = 1.6857761831867775, HR@200 = 0.22483443708609271, AP = 0.9406372738722542


100%|██████████| 6040/6040 [01:04<00:00, 93.43it/s] 
100%|██████████| 121/121 [00:50<00:00,  2.39it/s]


epoch 11: loss = 1.6874957265344677, HR@200 = 0.2314569536423841, AP = 0.9407796578910759


100%|██████████| 6040/6040 [01:03<00:00, 94.63it/s] 
100%|██████████| 121/121 [00:50<00:00,  2.39it/s]


epoch 12: loss = 1.6967989081183805, HR@200 = 0.23410596026490066, AP = 0.9393458947767118


100%|██████████| 6040/6040 [01:06<00:00, 91.25it/s] 
100%|██████████| 121/121 [00:49<00:00,  2.44it/s]


epoch 13: loss = 1.709724249626627, HR@200 = 0.2316225165562914, AP = 0.9379264624746944


100%|██████████| 6040/6040 [01:04<00:00, 92.95it/s] 
100%|██████████| 121/121 [00:51<00:00,  2.37it/s]


epoch 14: loss = 1.7268253048030746, HR@200 = 0.22582781456953643, AP = 0.9400903669612881


100%|██████████| 6040/6040 [01:04<00:00, 94.19it/s] 
100%|██████████| 121/121 [00:51<00:00,  2.37it/s]


epoch 15: loss = 1.749784088332132, HR@200 = 0.22201986754966888, AP = 0.9384861767786492


 75%|███████▍  | 4503/6040 [00:48<00:16, 92.04it/s]


KeyboardInterrupt: ignored