# Matrix Factorization-based Collaborative Filtering

In this tutorial, you will see an example of using a strong Matrix Factorization (MF) method for recommendation, namely [SVD++](https://dl.acm.org/doi/pdf/10.1145/1401890.1401944). Please follow the instructions step by step; they explain the different stages of developing an MF-based recommender.

# Step 1. Import the necessary packages and load the data

First things first: we begin by importing the packages used in this notebook. The main ones are numpy and torch, with which you are probably familiar. Major MF methods are already implemented in recommendation packages such as [Surprise](https://surpriselib.com/) and [LightFM](https://making.lyst.com/lightfm/docs/home.html), but here you will see a from-scratch implementation. There are two main reasons for this. The first is to build a deeper understanding of the under-the-hood details of how an MF-based recommender works. The second is that most of these excellent packages are not optimized for training on GPUs, whereas gradient-descent-based real-world applications on massive datasets need the acceleration provided by GPUs and autodiff tools.

In [2]:
import os, sys, re, pickle, time
from collections import defaultdict

import numpy as np
import pandas as pd
import torch
from numpy.random import default_rng
from torch import nn
from tqdm import tqdm

In [3]:
##Importing the data file from google drive

from google.colab import drive
drive.mount('/content/drive/')          # this will direct you to a link where you can get an authorization key
import sys
sys.path.append('/content/drive/My Drive/')
##Changing the working directory

%cd '/content/drive/My Drive/'


Mounted at /content/drive/
/content/drive/My Drive


In [4]:
%cd VectorProject/

/content/drive/My Drive/VectorProject


In this folder, you are provided with three data files. "u.data" is the ratings file of [MovieLens 100K](https://grouplens.org/datasets/movielens/100k/), "ML_ratings.txt" holds a larger version of the same dataset, [MovieLens 20M](https://grouplens.org/datasets/movielens/20m/), and "LFM_ratings.txt" holds the ratings from [LastFM](https://grouplens.org/datasets/hetrec-2011/). You are more than welcome to explore the two larger datasets, but because of their longer training time, we use only MovieLens 100K for this showcase tutorial.

In [5]:
ls

LFM_ratings.txt  ML_ratings.txt  u.data


In [6]:

def u_i_dict_maker(rec):
  """
  Builds the user-to-item mapping dictionary. Keys of this dict are
  user ids and values are the lists of items each user interacted with
  (without duplicates, in order of first appearance).
  """

  u_i_dict = defaultdict(list)
  for row in rec:
    user, item = int(row[0]), int(row[1])
    if item not in u_i_dict[user]:
      u_i_dict[user].append(item)

  return u_i_dict


def load_rec_data(rec_path):
  """
    Reads the ratings data from the text files and converts them to numpy arrays.
    Next, it randomly splits the data to train, validation, and test sets.
    Finally, it forms the user to item mapping dicts for each split.
  """
  rec = np.genfromtxt(rec_path, delimiter='\t', dtype=np.int32)
  rec = rec[:,:3]
  rec_mapped = 1*rec
  unique_users = np.unique(rec[:,0])
  unique_items = np.unique(rec[:,1])
  users_map = {val: i for i, val in enumerate(unique_users)}
  items_map = {val: i for i, val in enumerate(unique_items)}

  ufunc = np.vectorize(lambda x: users_map[x])
  ifunc = np.vectorize(lambda x: items_map[x])

  # the mapped rating matrix, where both user and item ids are 0,1,2,...
  rec_mapped[:,0] = ufunc(rec[:,0])
  rec_mapped[:,1] = ifunc(rec[:,1])

  # split to train, val, and test
  train_ratio = 0.7 # 70% of data for training
  val_ratio = 0.15 # 15% of data for validation
  test_ratio = 0.15 # 15% of data for testing (the remainder after train and val)
  indices = np.arange(rec_mapped.shape[0])
  np.random.seed(42) # fix the seed so that the split is reproducible
  np.random.shuffle(indices)
  # split the data array into train, validation, and test sets
  train_indices = indices[:int(train_ratio*len(indices))]
  val_indices = indices[int(train_ratio*len(indices)):int((train_ratio+val_ratio)*len(indices))]
  test_indices = indices[int((train_ratio+val_ratio)*len(indices)):]

  train_data = rec_mapped[train_indices]
  val_data = rec_mapped[val_indices]
  test_data = rec_mapped[test_indices]
  # remove users from test and val that are not present in train
  #(we can't test on users for which we didn't have any training data. Although SVD++ is useful for the cold-start problem,
  # you often want to study that experiment separately)
  train_users = train_data[:,0]
  val_data = val_data[np.isin(val_data[:,0], train_users)]
  test_data = test_data[np.isin(test_data[:,0], train_users)]


  #maps each user to the list of items she interacted with

  u_i_dict_train, u_i_dict_val, u_i_dict_test = u_i_dict_maker(train_data), u_i_dict_maker(val_data), u_i_dict_maker(test_data)


  return rec_mapped, train_data, val_data, test_data, u_i_dict_train, u_i_dict_val, u_i_dict_test
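
To make the id remapping and the user-to-item dictionary concrete, here is a minimal sketch on a made-up ratings array. The toy values and variable names are illustrative assumptions. `np.unique(..., return_inverse=True)` is used as a compact alternative to the dictionary-based remapping in `load_rec_data`; it is not the notebook's own method.

```python
import numpy as np
from collections import defaultdict

# toy ratings (hypothetical values): columns are raw user id, raw item id, rating
toy = np.array([[10, 500, 4],
                [10, 700, 3],
                [42, 500, 5],
                [99, 900, 1]], dtype=np.int32)

# return_inverse gives, for every entry, its index among the sorted unique values,
# i.e. a contiguous id in 0, 1, 2, ...
_, user_idx = np.unique(toy[:, 0], return_inverse=True)
_, item_idx = np.unique(toy[:, 1], return_inverse=True)
toy_mapped = np.column_stack([user_idx, item_idx, toy[:, 2]])

# user -> list of items, mirroring what u_i_dict_maker builds
user_to_items = defaultdict(list)
for u, i in zip(user_idx.tolist(), item_idx.tolist()):
    if i not in user_to_items[u]:
        user_to_items[u].append(i)

print(toy_mapped)
print(dict(user_to_items))
```

Contiguous ids matter because they are later used directly as row indices into the embedding tables.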

In [7]:

ML_mapped, train_ML, val_ML, test_ML, u_i_dict_train_ML, u_i_dict_val_ML, u_i_dict_test_ML = load_rec_data('u.data')


Let's have a look at the data statistics. As you can see, MovieLens 100K is a small dataset, typically used for tutorials rather than research.

In [8]:

print('number of users in MovieLens is: {}'.format(np.unique(ML_mapped[:,0]).shape[0]))
print('number of items in MovieLens is: {}'.format(np.unique(ML_mapped[:,1]).shape[0]))


number of users in MovieLens is: 943
number of items in MovieLens is: 1682
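
Another statistic worth checking is how sparse the rating matrix is. The sketch below reuses the user and item counts printed above and assumes that "u.data" contains the full 100,000 ratings of MovieLens 100K.

```python
num_users, num_items = 943, 1682   # counts printed above
num_ratings = 100_000              # assumption: u.data holds all MovieLens 100K ratings

# fraction of user-item pairs for which a rating is observed
density = num_ratings / (num_users * num_items)
print(f"density: {density:.2%}, sparsity: {1 - density:.2%}")
```

Under that assumption only about 6% of all user-item pairs are observed, which is why the model is fitted on the known entries only.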


# Step 2. Define the model structure


<br>
SVD++ estimates the rating that user $u$ gives to item $i$ as:

- Model definition
$$
\hat{\mathbf{R}}_{ui} = \mu + b_u + b_i + \mathbf{q}_{i}^{T} \left( \mathbf{p}_{u} + |N(u)|^{-1/2} \sum_{j\in N(u)}\mathbf{y}_{j} \right) $$

In which:
  - $\hat{\mathbf{R}}_{ui}\in \mathbb{R}$: predicted rating user $u$ gives to item $i$
  - $\mu \in \mathbb{R}$: global bias
  - $b_{u} \in \mathbb{R}$: user bias
  - $b_{i} \in \mathbb{R}$: item bias
  - $\mathbf{q}_{i} \in \mathbb{R}^{k}$: $i^{\text{th}}$ row of $\mathbf{Q}$, written as a column vector
  - $\mathbf{p}_{u} \in \mathbb{R}^{k}$: $u^{\text{th}}$ row of $\mathbf{P}$, written as a column vector
  - $N(u) = \{i \mid \mathbf{R}_{ui} \text{ is known} \}$: all items for which user $u$ provided a rating
  - $\mathbf{y}_{j} \in \mathbb{R}^{k}$: another item latent vector for implicit feedback


  - where, 
    - $m$, $n$: number of users and number of items, respectively
    - $k \ll m, n$: latent factor size
    - $\hat{\mathbf{R}} \in \mathbb{R}^{m \times n}$: predicted rating matrix 
    - $\mathbf{P} \in \mathbb{R}^{m \times k}$: user latent matrix
    - $\mathbf{Q} \in \mathbb{R}^{n \times k}$: item latent matrix
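
As a worked instance of the prediction rule, the sketch below evaluates $\hat{\mathbf{R}}_{ui}$ with NumPy for a single user-item pair. All numbers are made up for illustration ($k = 2$ and a user who rated two items); none of them come from the dataset.

```python
import numpy as np

mu, b_u, b_i = 3.5, 0.1, -0.2            # global, user, and item biases (made up)
p_u = np.array([0.5, -0.3])              # user latent vector
q_i = np.array([0.2, 0.4])               # item latent vector
y_N = np.array([[0.1, 0.0],              # implicit-feedback vectors y_j,
                [0.3, 0.2]])             # one row per item j in N(u)

# |N(u)|^{-1/2} * sum_j y_j
implicit = y_N.sum(axis=0) / np.sqrt(len(y_N))

# mu + b_u + b_i + q_i^T (p_u + implicit term)
r_hat = mu + b_u + b_i + q_i @ (p_u + implicit)
print(round(r_hat, 4))
```

The implicit term shifts the user vector according to which items the user rated, regardless of the rating values.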

<br>

- The model parameters are found by minimizing the following regularized objective function:
$$
\underset{\mathbf{P}, \mathbf{Q}, \mathbf{y}_{*}, b_{*}}{\mathrm{min}} \sum_{(u, i) \in \mathcal{K}} \left[ \left( \mathbf{R}_{ui} -
\hat{\mathbf{R}}_{ui} \right)^2 + \lambda \left( \| \mathbf{p}_{u} \|^2 + \| \mathbf{q}_{i}
\|^2 + b_u^2 + b_i^2 +\sum_{j\in N(u)} \|\mathbf{y}_{j}\|^2 \right) \right]
$$
Where:
  - $\lambda$: regularization rate (hyperparameter)
  - $\mathcal{K}=\{(u, i) \mid \mathbf{R}_{ui} \text{ is known}\}$: the set of $(u,i)$ pairs for which $\mathbf{R}_{ui}$ is known
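
For reference, taking the gradient of this objective for a single observed rating gives the stochastic gradient descent updates of the SVD++ paper. They are sketched here with one learning rate $\gamma$ and one regularization rate $\lambda$ for all parameters, which are simplifying assumptions; constant factors are absorbed into $\gamma$. The notebook does not apply these updates by hand. It relies on autodiff and a PyTorch optimizer instead.

$$
\begin{aligned}
e_{ui} &= \mathbf{R}_{ui} - \hat{\mathbf{R}}_{ui} \\
b_u &\leftarrow b_u + \gamma \, (e_{ui} - \lambda b_u) \\
b_i &\leftarrow b_i + \gamma \, (e_{ui} - \lambda b_i) \\
\mathbf{q}_i &\leftarrow \mathbf{q}_i + \gamma \left( e_{ui} \Big( \mathbf{p}_u + |N(u)|^{-1/2} \sum_{j \in N(u)} \mathbf{y}_j \Big) - \lambda \mathbf{q}_i \right) \\
\mathbf{p}_u &\leftarrow \mathbf{p}_u + \gamma \, (e_{ui} \mathbf{q}_i - \lambda \mathbf{p}_u) \\
\mathbf{y}_j &\leftarrow \mathbf{y}_j + \gamma \left( e_{ui} |N(u)|^{-1/2} \mathbf{q}_i - \lambda \mathbf{y}_j \right) \quad \forall j \in N(u)
\end{aligned}
$$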


In [9]:
from torch import nn

class SVDpp(nn.Module):
    def __init__(self, num_factors, num_users, num_items, device, **kwargs):
        super(SVDpp, self).__init__(**kwargs)
        self.device = device
        # plain MF params
        self.P = nn.Embedding(num_users, num_factors).to(self.device)
        self.Q = nn.Embedding(num_items, num_factors).to(self.device)
        self.user_bias = nn.Embedding(num_users, 1).to(self.device)
        self.item_bias = nn.Embedding(num_items, 1).to(self.device)
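        # create the parameter directly on the target device: calling .to() on an nn.Parameter
        # can return a plain tensor, which is then neither registered nor trained
        self.global_bias = nn.Parameter(torch.zeros(1, device=self.device))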
        # implicit feedback params
        self.y_j = nn.Embedding(num_items, num_factors).to(self.device) 

    def forward(self, user_id, item_id, u_i_dict):
        
        P_u = self.P(user_id)
        Q_i = self.Q(item_id)
        b_u = self.user_bias(user_id)
        b_i = self.item_bias(item_id)
        mu = self.global_bias

        # incorporating implicit feedback: |N(u)|^{-1/2} * sum of y_j over j in N(u)
        u_impl_list = []
        for u in user_id:
          u_i = u_i_dict[u.item()]
          u_impl = self.y_j(torch.tensor(u_i, device=self.device)).sum(dim=0)
          u_impl_list.append(u_impl / len(u_i) ** 0.5)
        u_impl_vec = torch.stack(u_impl_list)

        # out-of-place addition, so the embedding output is not modified in place
        P_u = P_u + u_impl_vec

        # squeezing only the last dimension handles a batch of size 1 like any other batch
        outputs = mu + b_u.squeeze(-1) + b_i.squeeze(-1) + (P_u * Q_i).sum(dim=1)
        return outputs.flatten()
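
The per-user Python loop in `forward` is easy to read but runs once per sample in every batch. One possible way to vectorize the implicit-feedback term is `nn.EmbeddingBag` with `mode="sum"`, sketched below on toy data. This is an alternative to the notebook's implementation, not part of it; the toy dictionary, sizes, and variable names are assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)
num_items, k = 6, 3
u_i_dict = {0: [1, 2, 5], 1: [0], 2: [3, 4]}   # toy user -> rated items mapping
user_id = torch.tensor([2, 0, 1])              # one batch of user ids

# EmbeddingBag sums the embeddings of each "bag" of indices in a single call
y = nn.EmbeddingBag(num_items, k, mode="sum")

bags = [u_i_dict[u] for u in user_id.tolist()]
flat = torch.tensor([j for bag in bags for j in bag])   # all item ids, concatenated
sizes = torch.tensor([len(bag) for bag in bags])
offsets = torch.cumsum(sizes, dim=0) - sizes            # start position of each bag in `flat`

# sum_j y_j for each user, scaled by |N(u)|^{-1/2}
implicit = y(flat, offsets) / sizes.float().sqrt().unsqueeze(1)

# reference: the same quantity computed with an explicit loop
reference = torch.stack([y.weight[bag].sum(dim=0) / len(bag) ** 0.5 for bag in bags])
print(torch.allclose(implicit, reference, atol=1e-6))
```

Building `flat` and `offsets` still uses Python lists here; precomputing them once per user would remove that cost as well.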


After defining the model structure, we convert our data to Torch tensors and wrap them in PyTorch DataLoaders, which handle shuffling and batching efficiently for the training and testing procedures.

In [10]:
def data_forming(train_ML, val_ML, test_ML, batch_size=256):
  """
  Takes the dataset splits, converts them to torch tensors, and wraps them in DataLoaders.
  """
  train_u, train_i, train_r = train_ML[:,0], train_ML[:,1], train_ML[:,2]
  val_u, val_i, val_r = val_ML[:,0], val_ML[:,1], val_ML[:,2]
  test_u, test_i, test_r = test_ML[:,0], test_ML[:,1], test_ML[:,2]
  # Wrap in TensorDataset (ratings are cast to float32 so that they match the model output in the loss)
  train_set = torch.utils.data.TensorDataset(
    torch.tensor(train_u), torch.tensor(train_i), torch.tensor(train_r, dtype=torch.float32))
  val_set = torch.utils.data.TensorDataset(
    torch.tensor(val_u), torch.tensor(val_i), torch.tensor(val_r, dtype=torch.float32))
  test_set = torch.utils.data.TensorDataset(
    torch.tensor(test_u), torch.tensor(test_i), torch.tensor(test_r, dtype=torch.float32))
  
  # Wrap in DataLoader (only the training split needs shuffling)
  train_iter = torch.utils.data.DataLoader(
      train_set, shuffle=True, batch_size=batch_size)
  val_iter = torch.utils.data.DataLoader(
      val_set, shuffle=False, batch_size=batch_size)
  test_iter = torch.utils.data.DataLoader(
      test_set, shuffle=False, batch_size=batch_size)
  
  return train_iter, val_iter, test_iter
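
To see what such a loader yields, here is a small sketch with five made-up (user, item, rating) triples and a batch size of 2. The tensors and names are illustrative assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# five made-up (user, item, rating) triples
users = torch.tensor([0, 0, 1, 2, 2])
items = torch.tensor([3, 1, 0, 2, 3])
ratings = torch.tensor([4, 3, 5, 1, 2], dtype=torch.float32)

loader = DataLoader(TensorDataset(users, items, ratings), batch_size=2, shuffle=False)

# each batch is a (users, items, ratings) tuple; the last batch may be smaller
batch_sizes = [len(r) for _, _, r in loader]
first_u, first_i, first_r = next(iter(loader))
print(batch_sizes, first_u.tolist(), first_r.dtype)
```

Because the last batch can be smaller than the others, metrics averaged per batch weight its samples more heavily than the rest.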


Finally, we can train our model with the hyperparameters of our choice. For real-world problems, you would generally use a hyperparameter tuning package such as [Ray](https://docs.ray.io/en/latest/tune/index.html) or [Weights and Biases](https://wandb.ai/site), but since that is not our focus here, we define the hyperparameters manually. You are strongly encouraged to repeat training and testing with other hyperparameter settings to draw conclusions about the influence of each hyperparameter.

In [17]:
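# model hyperparameters 
config={
    "num_factors":5,        # latent factor size k
    "batch_size":2048,
    'model_type': 'SVDpp',
    "optimizer": 'Adam',
    "wd":  1e-5 ,           # weight decay (L2 regularization strength)
    "lr": 0.01,             # learning rate
    "num_epochs":100,
    "save_every":100        # save a checkpoint every this many epochs
}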



# get the formed data splits
train_loader, val_loader, test_loader = data_forming(train_ML, val_ML, test_ML, config["batch_size"])


def train_eval_model(data, u_i_dict_train, u_i_dict_val, config, device):
  """
  Performs training and validation using the defined model and hyperparameters.
  Note that it reads the global `train_loader` and `val_loader` defined above.
  """

  # we need these for model configuration
  num_users = np.unique(data[:,0]).shape[0]
  num_items = np.unique(data[:,1]).shape[0]
  

  # make an instance of the model. Here, the only model type we implemented is SVD++,
  # but using the same skeleton, you can easily define other MF-based models as well
  if config['model_type'] == 'SVDpp':
    model = SVDpp(config['num_factors'], num_users, num_items, device)

  # Define a loss function (MSE is the go-to choice for explicit rating data), but you
  # can define other metrics as well
  loss_fn = nn.MSELoss(reduction='mean')

  # Define an optimizer
  if config["optimizer"] == "Adam":
    optimizer = torch.optim.Adam((param for param in model.parameters()
                            if param.requires_grad), 
                         weight_decay=config["wd"], lr=config["lr"])
    
  else:
    optimizer = torch.optim.SGD((param for param in model.parameters()
                            if param.requires_grad), 
                         weight_decay=config["wd"], lr=config["lr"])
    

  ######################################################################
  # Train & Eval
  ######################################################################

  # Train
  for epoch in tqdm(range(config['num_epochs'])):
    tr_rmse = 0
    model.train()
    for u, i, r in train_loader:
      u, i, r = u.to(device), i.to(device), r.to(device)
      optimizer.zero_grad()
      output = model(u, i, u_i_dict_train)
      l = loss_fn(output, r)
      l.backward()
      optimizer.step()
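      # accumulate the per-batch RMSE; the value printed below is therefore a sum
      # over all batches (divide by len(train_loader) for a per-batch average)
      tr_rmse += np.sqrt(l.item())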
      
    print("\n training RMSE Loss: {}".format(tr_rmse))
    
    # Evaluate on Valid-set
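    val_rmse = 0
    model.eval()
    with torch.no_grad():  # no gradients are needed for evaluation
      for u, i, r in val_loader:
        u, i, r = u.to(device), i.to(device), r.to(device)
        r_hat = model(u, i, u_i_dict_val)
        # as in training, this sums the per-batch RMSE over all batches
        val_rmse += np.sqrt(loss_fn(r_hat, r.float()).item())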

    print("\t validation RMSE Loss: {}".format(val_rmse))

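    if (epoch + 1) % config["save_every"] == 0:
      # torch.save needs a destination; the file name below is an illustrative choice
      ckpt_path = os.path.join(os.getcwd(), "{}_epoch{}.pt".format(config["model_type"], epoch + 1))
      torch.save((model.state_dict(), optimizer.state_dict()), ckpt_path)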


def try_gpu(i=0): 
    return f'cuda:{i}' if torch.cuda.device_count() >= i + 1 else 'cpu'
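
A note on the reported metric: the loop above adds up one RMSE value per batch, so the printed number grows with the number of batches and is not the RMSE over the whole split. The sketch below contrasts the two quantities on made-up errors. Accumulating squared errors and sample counts is a common way to obtain the overall RMSE; the numbers and names are illustrative assumptions.

```python
import numpy as np

# made-up prediction errors (r_hat - r) for two batches of unequal size
batch_errors = [np.array([1.0, 1.0, 1.0, 1.0]), np.array([3.0])]

# (a) sum of per-batch RMSEs, which is what the training loop accumulates
sum_of_batch_rmse = sum(np.sqrt(np.mean(e ** 2)) for e in batch_errors)

# (b) RMSE over all samples: accumulate squared errors and counts, take the root once
total_sq_err = sum(np.sum(e ** 2) for e in batch_errors)
total_count = sum(e.size for e in batch_errors)
epoch_rmse = np.sqrt(total_sq_err / total_count)

print(sum_of_batch_rmse, epoch_rmse)
```

Variant (b) is comparable across batch sizes and dataset splits, which makes it the safer choice for model selection.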

Check if you are connected to a GPU runtime

In [18]:
device = try_gpu()
print(device)



cuda:0


Finally run the training loop

In [19]:
train_eval_model(ML_mapped, u_i_dict_train_ML, u_i_dict_val_ML, config, device)

  0%|          | 0/100 [00:00<?, ?it/s]


 training RMSE Loss: 156.27580451965332


  1%|          | 1/100 [00:27<44:48, 27.16s/it]

	 validation RMSE Loss: 34.672574043273926

 training RMSE Loss: 124.96423649787903


  2%|▏         | 2/100 [00:54<44:24, 27.19s/it]

	 validation RMSE Loss: 30.067250728607178

 training RMSE Loss: 99.02428960800171


  3%|▎         | 3/100 [01:21<44:14, 27.36s/it]

	 validation RMSE Loss: 26.294906616210938

 training RMSE Loss: 77.32036781311035


  4%|▍         | 4/100 [01:49<43:56, 27.47s/it]

	 validation RMSE Loss: 23.830009698867798

 training RMSE Loss: 61.61072874069214


  5%|▌         | 5/100 [02:16<43:24, 27.41s/it]

	 validation RMSE Loss: 22.1870276927948

 training RMSE Loss: 51.607699155807495


  6%|▌         | 6/100 [02:44<43:02, 27.47s/it]

	 validation RMSE Loss: 21.347156763076782

 training RMSE Loss: 45.451340317726135


  7%|▋         | 7/100 [03:11<42:32, 27.45s/it]

	 validation RMSE Loss: 20.566476106643677

 training RMSE Loss: 41.58953392505646


  8%|▊         | 8/100 [03:39<42:09, 27.49s/it]

	 validation RMSE Loss: 20.104827880859375

 training RMSE Loss: 39.14206540584564


  9%|▉         | 9/100 [04:07<41:47, 27.55s/it]

	 validation RMSE Loss: 19.833343505859375

 training RMSE Loss: 37.38560092449188


 10%|█         | 10/100 [04:34<41:21, 27.57s/it]

	 validation RMSE Loss: 19.653231620788574

 training RMSE Loss: 36.22630190849304


 11%|█         | 11/100 [05:01<40:40, 27.42s/it]

	 validation RMSE Loss: 19.3358211517334

 training RMSE Loss: 35.27573734521866


 12%|█▏        | 12/100 [05:29<40:11, 27.40s/it]

	 validation RMSE Loss: 19.149911880493164

 training RMSE Loss: 34.62777602672577


 13%|█▎        | 13/100 [05:56<39:40, 27.36s/it]

	 validation RMSE Loss: 19.078425884246826

 training RMSE Loss: 34.061663925647736


 14%|█▍        | 14/100 [06:24<39:18, 27.42s/it]

	 validation RMSE Loss: 18.87735104560852

 training RMSE Loss: 33.592869222164154


 15%|█▌        | 15/100 [06:51<38:53, 27.45s/it]

	 validation RMSE Loss: 18.89169430732727

 training RMSE Loss: 33.26659852266312


 16%|█▌        | 16/100 [07:18<38:11, 27.28s/it]

	 validation RMSE Loss: 18.773109674453735

 training RMSE Loss: 32.997050642967224


 17%|█▋        | 17/100 [07:46<37:52, 27.38s/it]

	 validation RMSE Loss: 18.773961305618286

 training RMSE Loss: 32.77343046665192


 18%|█▊        | 18/100 [08:13<37:22, 27.34s/it]

	 validation RMSE Loss: 18.630336046218872

 training RMSE Loss: 32.51907479763031


 19%|█▉        | 19/100 [08:40<36:51, 27.30s/it]

	 validation RMSE Loss: 18.602782726287842

 training RMSE Loss: 32.35643172264099


 20%|██        | 20/100 [09:07<36:19, 27.25s/it]

	 validation RMSE Loss: 18.61349606513977

 training RMSE Loss: 32.16586619615555


 21%|██        | 21/100 [09:34<35:53, 27.26s/it]

	 validation RMSE Loss: 18.426133632659912

 training RMSE Loss: 32.02150785923004


 22%|██▏       | 22/100 [10:01<35:14, 27.11s/it]

	 validation RMSE Loss: 18.41005802154541

 training RMSE Loss: 31.93505448102951


 23%|██▎       | 23/100 [10:28<34:42, 27.04s/it]

	 validation RMSE Loss: 18.35312247276306

 training RMSE Loss: 31.79121422767639


 24%|██▍       | 24/100 [10:55<34:17, 27.07s/it]

	 validation RMSE Loss: 18.329846382141113

 training RMSE Loss: 31.72230154275894


 25%|██▌       | 25/100 [11:22<33:50, 27.08s/it]

	 validation RMSE Loss: 18.197348833084106

 training RMSE Loss: 31.580394089221954


 26%|██▌       | 26/100 [11:50<33:28, 27.15s/it]

	 validation RMSE Loss: 18.191727876663208

 training RMSE Loss: 31.522745430469513


 27%|██▋       | 27/100 [12:17<33:01, 27.14s/it]

	 validation RMSE Loss: 18.223432302474976

 training RMSE Loss: 31.435004889965057


 28%|██▊       | 28/100 [12:44<32:34, 27.15s/it]

	 validation RMSE Loss: 18.134655952453613

 training RMSE Loss: 31.361010909080505


 29%|██▉       | 29/100 [13:11<32:07, 27.15s/it]

	 validation RMSE Loss: 18.129488945007324

 training RMSE Loss: 31.28182876110077


 30%|███       | 30/100 [13:38<31:39, 27.14s/it]

	 validation RMSE Loss: 18.149807929992676

 training RMSE Loss: 31.295157253742218


 31%|███       | 31/100 [14:05<31:12, 27.14s/it]

	 validation RMSE Loss: 17.99822974205017

 training RMSE Loss: 31.172215938568115


 32%|███▏      | 32/100 [14:33<30:47, 27.17s/it]

	 validation RMSE Loss: 17.92353844642639

 training RMSE Loss: 31.140181303024292


 33%|███▎      | 33/100 [15:00<30:18, 27.14s/it]

	 validation RMSE Loss: 17.86622643470764

 training RMSE Loss: 31.14545238018036


 34%|███▍      | 34/100 [15:26<29:41, 26.99s/it]

	 validation RMSE Loss: 17.845893144607544

 training RMSE Loss: 30.97445160150528


 35%|███▌      | 35/100 [15:53<29:16, 27.02s/it]

	 validation RMSE Loss: 17.916208744049072

 training RMSE Loss: 30.94557911157608


 36%|███▌      | 36/100 [16:20<28:50, 27.04s/it]

	 validation RMSE Loss: 17.979087591171265

 training RMSE Loss: 30.9353329539299


 37%|███▋      | 37/100 [16:48<28:27, 27.11s/it]

	 validation RMSE Loss: 17.814913749694824

 training RMSE Loss: 30.856707274913788


 38%|███▊      | 38/100 [17:15<28:01, 27.12s/it]

	 validation RMSE Loss: 17.768842935562134

 training RMSE Loss: 30.82245832681656


 39%|███▉      | 39/100 [17:42<27:40, 27.22s/it]

	 validation RMSE Loss: 17.725149631500244

 training RMSE Loss: 30.76525527238846


 40%|████      | 40/100 [18:10<27:17, 27.29s/it]

	 validation RMSE Loss: 17.803792238235474

 training RMSE Loss: 30.71822738647461


 41%|████      | 41/100 [18:37<26:52, 27.32s/it]

	 validation RMSE Loss: 17.790006160736084

 training RMSE Loss: 30.712328791618347


 42%|████▏     | 42/100 [19:05<26:31, 27.44s/it]

	 validation RMSE Loss: 17.71829080581665

 training RMSE Loss: 30.642335057258606


 43%|████▎     | 43/100 [19:33<26:07, 27.50s/it]

	 validation RMSE Loss: 17.6511447429657

 training RMSE Loss: 30.585197746753693


 44%|████▍     | 44/100 [20:00<25:42, 27.55s/it]

	 validation RMSE Loss: 17.628788709640503

 training RMSE Loss: 30.52093482017517


 45%|████▌     | 45/100 [20:28<25:21, 27.66s/it]

	 validation RMSE Loss: 17.62494158744812

 training RMSE Loss: 30.48096215724945


 46%|████▌     | 46/100 [20:55<24:42, 27.45s/it]

	 validation RMSE Loss: 17.686169147491455

 training RMSE Loss: 30.451267302036285


 47%|████▋     | 47/100 [21:22<24:08, 27.33s/it]

	 validation RMSE Loss: 17.538233041763306

 training RMSE Loss: 30.420143604278564


 48%|████▊     | 48/100 [21:49<23:40, 27.31s/it]

	 validation RMSE Loss: 17.55417037010193

 training RMSE Loss: 30.385132133960724


 49%|████▉     | 49/100 [22:16<23:05, 27.17s/it]

	 validation RMSE Loss: 17.566104888916016

 training RMSE Loss: 30.307355225086212


 50%|█████     | 50/100 [22:43<22:34, 27.10s/it]

	 validation RMSE Loss: 17.48247742652893

 training RMSE Loss: 30.31489896774292


 51%|█████     | 51/100 [23:11<22:14, 27.24s/it]

	 validation RMSE Loss: 17.452073574066162

 training RMSE Loss: 30.304022431373596


 52%|█████▏    | 52/100 [23:38<21:46, 27.21s/it]

	 validation RMSE Loss: 17.405268669128418

 training RMSE Loss: 30.249697864055634


 53%|█████▎    | 53/100 [24:05<21:15, 27.15s/it]

	 validation RMSE Loss: 17.394226789474487

 training RMSE Loss: 30.175486266613007


 54%|█████▍    | 54/100 [24:32<20:50, 27.17s/it]

	 validation RMSE Loss: 17.3487446308136

 training RMSE Loss: 30.12313461303711


 55%|█████▌    | 55/100 [24:59<20:21, 27.15s/it]

	 validation RMSE Loss: 17.364465713500977

 training RMSE Loss: 30.084636688232422


 56%|█████▌    | 56/100 [25:26<19:54, 27.16s/it]

	 validation RMSE Loss: 17.27960228919983

 training RMSE Loss: 30.007656395435333


 57%|█████▋    | 57/100 [25:54<19:29, 27.21s/it]

	 validation RMSE Loss: 17.19178295135498

 training RMSE Loss: 30.024890780448914


 58%|█████▊    | 58/100 [26:21<18:58, 27.10s/it]

	 validation RMSE Loss: 17.17542600631714

 training RMSE Loss: 29.992263317108154


 59%|█████▉    | 59/100 [26:47<18:29, 27.05s/it]

	 validation RMSE Loss: 17.164183139801025

 training RMSE Loss: 29.92122358083725


 60%|██████    | 60/100 [27:15<18:05, 27.14s/it]

	 validation RMSE Loss: 17.214885473251343

 training RMSE Loss: 29.941559672355652


 61%|██████    | 61/100 [27:42<17:39, 27.18s/it]

	 validation RMSE Loss: 17.18499207496643

 training RMSE Loss: 29.888533234596252


 62%|██████▏   | 62/100 [28:09<17:15, 27.25s/it]

	 validation RMSE Loss: 17.088382959365845

 training RMSE Loss: 29.80159640312195


 63%|██████▎   | 63/100 [28:37<16:48, 27.26s/it]

	 validation RMSE Loss: 17.07491946220398

 training RMSE Loss: 29.770936191082


 64%|██████▍   | 64/100 [29:04<16:20, 27.23s/it]

	 validation RMSE Loss: 17.09996008872986

 training RMSE Loss: 29.696403563022614


 65%|██████▌   | 65/100 [29:31<15:53, 27.23s/it]

	 validation RMSE Loss: 16.94617748260498

 training RMSE Loss: 29.61395251750946


 66%|██████▌   | 66/100 [29:58<15:21, 27.11s/it]

	 validation RMSE Loss: 16.918849229812622

 training RMSE Loss: 29.63653075695038


 67%|██████▋   | 67/100 [30:25<14:53, 27.06s/it]

	 validation RMSE Loss: 17.033680200576782

 training RMSE Loss: 29.633040606975555


 68%|██████▊   | 68/100 [30:52<14:26, 27.06s/it]

	 validation RMSE Loss: 16.93767547607422

 training RMSE Loss: 29.52157437801361


 69%|██████▉   | 69/100 [31:19<13:55, 26.97s/it]

	 validation RMSE Loss: 16.803272247314453

 training RMSE Loss: 29.50383973121643


 70%|███████   | 70/100 [31:46<13:28, 26.96s/it]

	 validation RMSE Loss: 16.779048919677734

 training RMSE Loss: 29.454331755638123


 71%|███████   | 71/100 [32:12<13:00, 26.92s/it]

	 validation RMSE Loss: 16.852941036224365

 training RMSE Loss: 29.393131256103516


 72%|███████▏  | 72/100 [32:39<12:33, 26.91s/it]

	 validation RMSE Loss: 16.771471738815308

 training RMSE Loss: 29.397045373916626


 73%|███████▎  | 73/100 [33:06<12:04, 26.84s/it]

	 validation RMSE Loss: 16.818716287612915

 training RMSE Loss: 29.32710701227188


 74%|███████▍  | 74/100 [33:33<11:38, 26.88s/it]

	 validation RMSE Loss: 16.670186519622803

 training RMSE Loss: 29.340539395809174


 75%|███████▌  | 75/100 [34:00<11:11, 26.86s/it]

	 validation RMSE Loss: 16.708282947540283

 training RMSE Loss: 29.262453973293304


 76%|███████▌  | 76/100 [34:26<10:42, 26.78s/it]

	 validation RMSE Loss: 16.419232606887817

 training RMSE Loss: 29.23619192838669


 77%|███████▋  | 77/100 [34:53<10:15, 26.76s/it]

	 validation RMSE Loss: 16.526241064071655

 training RMSE Loss: 29.17182457447052


 78%|███████▊  | 78/100 [35:20<09:48, 26.73s/it]

	 validation RMSE Loss: 16.43036198616028

 training RMSE Loss: 29.048288583755493


 79%|███████▉  | 79/100 [35:47<09:21, 26.75s/it]

	 validation RMSE Loss: 16.508248567581177

 training RMSE Loss: 29.103055953979492


 80%|████████  | 80/100 [36:14<08:55, 26.79s/it]

	 validation RMSE Loss: 16.449820041656494

 training RMSE Loss: 29.00503569841385


 81%|████████  | 81/100 [36:40<08:29, 26.79s/it]

	 validation RMSE Loss: 16.345178604125977

 training RMSE Loss: 29.034447729587555


 82%|████████▏ | 82/100 [37:07<08:01, 26.73s/it]

	 validation RMSE Loss: 16.357969999313354

 training RMSE Loss: 28.89510703086853


 83%|████████▎ | 83/100 [37:34<07:34, 26.76s/it]

	 validation RMSE Loss: 16.200063109397888

 training RMSE Loss: 28.86481887102127


 84%|████████▍ | 84/100 [38:00<07:07, 26.74s/it]

	 validation RMSE Loss: 16.394872069358826

 training RMSE Loss: 28.86031264066696


 85%|████████▌ | 85/100 [38:27<06:41, 26.75s/it]

	 validation RMSE Loss: 16.316006302833557

 training RMSE Loss: 28.82439684867859


 86%|████████▌ | 86/100 [38:54<06:14, 26.76s/it]

	 validation RMSE Loss: 16.237873673439026

 training RMSE Loss: 28.748775720596313


 87%|████████▋ | 87/100 [39:21<05:47, 26.76s/it]

	 validation RMSE Loss: 16.177836775779724

 training RMSE Loss: 28.739245235919952


 88%|████████▊ | 88/100 [39:47<05:20, 26.71s/it]

	 validation RMSE Loss: 16.180784106254578

 training RMSE Loss: 28.67156857252121


 89%|████████▉ | 89/100 [40:14<04:54, 26.80s/it]

	 validation RMSE Loss: 16.124260783195496

 training RMSE Loss: 28.683080732822418


 90%|█████████ | 90/100 [40:41<04:28, 26.88s/it]

	 validation RMSE Loss: 16.247516751289368

 training RMSE Loss: 28.58275157213211


 91%|█████████ | 91/100 [41:08<04:01, 26.87s/it]

	 validation RMSE Loss: 16.058409333229065

 training RMSE Loss: 28.539158523082733


 92%|█████████▏| 92/100 [41:35<03:34, 26.77s/it]

	 validation RMSE Loss: 16.094558715820312

 training RMSE Loss: 28.44906187057495


 93%|█████████▎| 93/100 [42:02<03:07, 26.81s/it]

	 validation RMSE Loss: 16.03903830051422

 training RMSE Loss: 28.450534880161285


 94%|█████████▍| 94/100 [42:28<02:40, 26.70s/it]

	 validation RMSE Loss: 15.826301097869873

 training RMSE Loss: 28.40821725130081


 95%|█████████▌| 95/100 [42:55<02:13, 26.68s/it]

	 validation RMSE Loss: 15.841913938522339

 training RMSE Loss: 28.40402388572693


 96%|█████████▌| 96/100 [43:21<01:46, 26.59s/it]

	 validation RMSE Loss: 15.86782681941986

 training RMSE Loss: 28.33410221338272


 97%|█████████▋| 97/100 [43:47<01:19, 26.51s/it]

	 validation RMSE Loss: 15.883020877838135

 training RMSE Loss: 28.255636870861053


 98%|█████████▊| 98/100 [44:14<00:53, 26.55s/it]

	 validation RMSE Loss: 15.839410662651062

 training RMSE Loss: 28.198251605033875


 99%|█████████▉| 99/100 [44:41<00:26, 26.58s/it]

	 validation RMSE Loss: 15.763977289199829

 training RMSE Loss: 28.21927011013031


100%|██████████| 100/100 [45:07<00:00, 27.08s/it]

	 validation RMSE Loss: 15.731103777885437
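
When reading this log, keep in mind that each printed value is a sum of per-batch RMSEs, so training and validation numbers are not directly comparable. A rough conversion to a per-batch average is sketched below. It assumes split sizes of 70,000 training and 15,000 validation rows, that is, a 70/15/15 split of 100,000 ratings with no validation rows filtered out; these sizes are assumptions, not values printed by the notebook.

```python
import math

batch_size = 2048                  # from config
n_train, n_val = 70_000, 15_000    # assumed split sizes (see the note above)

train_batches = math.ceil(n_train / batch_size)
val_batches = math.ceil(n_val / batch_size)

# final-epoch values copied from the log above
final_train_sum = 28.21927011013031
final_val_sum = 15.731103777885437

train_avg = final_train_sum / train_batches
val_avg = final_val_sum / val_batches
print(train_batches, val_batches, round(train_avg, 3), round(val_avg, 3))
```

Under these assumptions the final epoch corresponds to a mean per-batch RMSE of roughly 0.81 on the training set and 1.97 on the validation set.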





TODO: Now that you have selected your best model based on the validation loss, load the best saved model and evaluate it on the test set.
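
As a starting point, the general save, load, and evaluate pattern can be sketched as follows. To stay self-contained, the sketch uses a hypothetical stand-in model (`TinyModel`), a temporary checkpoint path, and random data; none of these come from the notebook. For the exercise, swap in `SVDpp`, your own checkpoint, and `test_loader`.

```python
import os
import tempfile

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


class TinyModel(nn.Module):
    """Hypothetical stand-in for the trained recommender."""
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)

    def forward(self, x):
        return self.linear(x).flatten()


torch.manual_seed(0)
model = TinyModel()

# save a checkpoint (the path is a temporary, illustrative location)
ckpt_path = os.path.join(tempfile.mkdtemp(), "best_model.pt")
torch.save({"model": model.state_dict()}, ckpt_path)

# restore the weights into a fresh instance and switch to evaluation mode
restored = TinyModel()
restored.load_state_dict(torch.load(ckpt_path, map_location="cpu")["model"])
restored.eval()

# evaluate: accumulate squared errors over all batches, then take the root once
x, r = torch.randn(10, 2), torch.randn(10)
loader = DataLoader(TensorDataset(x, r), batch_size=4)
sq_err, count = 0.0, 0
with torch.no_grad():
    for xb, rb in loader:
        sq_err += ((restored(xb) - rb) ** 2).sum().item()
        count += rb.numel()
test_rmse = (sq_err / count) ** 0.5
print(test_rmse)
```

The test set should be touched only once, after model selection on the validation set is finished.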

In [None]:
##your code goes here: