<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/ShopRunner/collie_recs/blob/main/tutorials/05_hybrid_model.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" /> Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/ShopRunner/collie_recs/blob/main/tutorials/05_hybrid_model.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" /> View source on GitHub</a>
  </td>
  <td>
    <a target="_blank" href="https://raw.githubusercontent.com/ShopRunner/collie_recs/main/tutorials/05_hybrid_model.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" /> Download notebook</a>
  </td>
</table>

In [None]:
# for Collab notebooks, we will start by installing the ``collie_recs`` library
!pip install collie_recs --quiet

In [2]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

%env DATA_PATH data/

env: DATA_PATH=data/


In [3]:
import os

import numpy as np
import pandas as pd
from pytorch_lightning.utilities.seed import seed_everything
from IPython.display import HTML
import joblib
import torch

from collie_recs.metrics import mapk, mrr, auc, evaluate_in_batches
from collie_recs.model import CollieTrainer, HybridPretrainedModel, MatrixFactorizationModel
from collie_recs.movielens import get_movielens_user_metadata, get_movielens_item_metadata, get_recommendation_visualizations

## Load Data From ``01_prepare_data`` Notebook 
If you're running this locally on Jupyter, you should be able to run the next cell quickly without a problem! If you are running this on Colab, you'll need to regenerate the data by running the cell below that, which should only take a few extra seconds to complete. 

In [4]:
try:
    # let's grab the ``Interactions`` objects we saved in the last notebook
    train_interactions = joblib.load(os.path.join(os.environ.get('DATA_PATH', 'data/'),
                                                  'train_interactions.pkl'))
    val_interactions = joblib.load(os.path.join(os.environ.get('DATA_PATH', 'data/'),
                                                'val_interactions.pkl'))
except FileNotFoundError:
    # we're running this notebook on Colab where results from the first notebook are not saved
    # regenerate this data below
    from collie_recs.cross_validation import stratified_split
    from collie_recs.interactions import Interactions
    from collie_recs.movielens import read_movielens_df
    from collie_recs.utils import convert_to_implicit, remove_users_with_fewer_than_n_interactions


    df = read_movielens_df(decrement_ids=True)
    implicit_df = convert_to_implicit(df, min_rating_to_keep=4)
    implicit_df = remove_users_with_fewer_than_n_interactions(implicit_df, min_num_of_interactions=3)

    interactions = Interactions(
        users=implicit_df['user_id'],
        items=implicit_df['item_id'],
        ratings=implicit_df['rating'],
        allow_missing_ids=True,
    )

    train_interactions, val_interactions = stratified_split(interactions, test_p=0.1, seed=42)


print('Train:', train_interactions)
print('Val:  ', val_interactions)

Checking for and removing duplicate user, item ID pairs...
Checking ``num_negative_samples`` is valid...
Maximum number of items a user has interacted with: 378
Generating positive items set...
Generating positive items set...
Generating positive items set...
Train: Interactions object with 49426 interactions between 943 users and 1674 items, returning 10 negative samples per interaction.
Val:   Interactions object with 5949 interactions between 943 users and 1674 items, returning 10 negative samples per interaction.


# Hybrid Collie Model 
In this notebook, we will use this same metadata and incorporate it directly into the model architecture with a hybrid Collie model. 

## Read in Data

In [5]:
# read in the same metadata used in notebooks ``03`` and ``04``
metadata_item_df = get_movielens_item_metadata()

metadata_user_df = get_movielens_user_metadata()


metadata_user_df.head()

Unnamed: 0,age,gender,occupation,zip code
0,24,M,technician,85711
1,53,F,other,94043
2,23,M,writer,32067
3,24,M,technician,43537
4,33,F,other,15213


In [6]:
# and, as always, set our random seed
seed_everything(22)

Global seed set to 22


22

## Train a ``MatrixFactorizationModel`` 

The first step towards training a Collie Hybrid model is to train a regular ``MatrixFactorizationModel`` to generate rich user and item embeddings. We'll use these embeddings in a ``HybridPretrainedModel`` a bit later. 

In [18]:
model = MatrixFactorizationModel(
    train=train_interactions,
    val=val_interactions,
    embedding_dim=30,
    lr=1e-2,
)

In [19]:
trainer = CollieTrainer(model=model, max_epochs=10, deterministic=True)

trainer.fit(model)

GPU available: False, used: False
TPU available: False, using: 0 TPU cores

  | Name            | Type            | Params
----------------------------------------------------
0 | user_biases     | ZeroEmbedding   | 943   
1 | item_biases     | ZeroEmbedding   | 1.7 K 
2 | user_embeddings | ScaledEmbedding | 28.3 K
3 | item_embeddings | ScaledEmbedding | 50.2 K
4 | dropout         | Dropout         | 0     
----------------------------------------------------
81.1 K    Trainable params
0         Non-trainable params
81.1 K    Total params
0.325     Total estimated model params size (MB)


Validation sanity check:   0%|          | 0/2 [00:00<?, ?it/s]



Validation sanity check:  50%|█████     | 1/2 [00:00<00:00,  1.29it/s]



                                                                      

Global seed set to 22


Epoch 0:   0%|          | 0/55 [00:00<?, ?it/s] 



Epoch 0:  89%|████████▉ | 49/55 [00:03<00:00, 13.85it/s, loss=1.94, v_num=3]



Epoch 1:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.94, v_num=3]



Epoch 1:  89%|████████▉ | 49/55 [00:03<00:00, 14.98it/s, loss=1.74, v_num=3]



Epoch 2:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.74, v_num=3]



Epoch 2:  89%|████████▉ | 49/55 [00:02<00:00, 17.88it/s, loss=1.59, v_num=3]



Epoch 2:  93%|█████████▎| 51/55 [00:02<00:00, 17.71it/s, loss=1.59, v_num=3]Epoch     3: reducing learning rate of group 0 to 1.0000e-03.
Epoch     3: reducing learning rate of group 0 to 1.0000e-03.
Epoch 3:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.59, v_num=3]



Epoch 3:  89%|████████▉ | 49/55 [00:02<00:00, 18.36it/s, loss=1.45, v_num=3]



Epoch 4:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.45, v_num=3]



Epoch 4:  89%|████████▉ | 49/55 [00:02<00:00, 18.62it/s, loss=1.43, v_num=3]



Epoch 4:  93%|█████████▎| 51/55 [00:02<00:00, 18.48it/s, loss=1.43, v_num=3]Epoch     5: reducing learning rate of group 0 to 1.0000e-04.
Epoch     5: reducing learning rate of group 0 to 1.0000e-04.
Epoch 5:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.43, v_num=3]



Epoch 5:  89%|████████▉ | 49/55 [00:02<00:00, 18.98it/s, loss=1.43, v_num=3]



Epoch 6:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.43, v_num=3]



Epoch 6:  89%|████████▉ | 49/55 [00:02<00:00, 19.59it/s, loss=1.43, v_num=3]



Epoch 6:  93%|█████████▎| 51/55 [00:02<00:00, 19.39it/s, loss=1.43, v_num=3]Epoch     7: reducing learning rate of group 0 to 1.0000e-05.
Epoch     7: reducing learning rate of group 0 to 1.0000e-05.
Epoch 7:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.43, v_num=3]



Epoch 7:  89%|████████▉ | 49/55 [00:02<00:00, 19.39it/s, loss=1.39, v_num=3]



Epoch 8:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.39, v_num=3]



Epoch 8:  89%|████████▉ | 49/55 [00:02<00:00, 19.26it/s, loss=1.42, v_num=3]



Epoch 8:  93%|█████████▎| 51/55 [00:02<00:00, 19.03it/s, loss=1.42, v_num=3]Epoch     9: reducing learning rate of group 0 to 1.0000e-06.
Epoch     9: reducing learning rate of group 0 to 1.0000e-06.
Epoch 9:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.42, v_num=3]



Epoch 9:  89%|████████▉ | 49/55 [00:02<00:00, 18.31it/s, loss=1.42, v_num=3]



Epoch 9: 100%|██████████| 55/55 [00:02<00:00, 19.31it/s, loss=1.42, v_num=3]




In [9]:
mapk_score, mrr_score, auc_score = evaluate_in_batches([mapk, mrr, auc], val_interactions, model)

print(f'Standard MAP@10 Score: {mapk_score}')
print(f'Standard MRR Score:    {mrr_score}')
print(f'Standard AUC Score:    {auc_score}')

100%|██████████| 48/48 [00:00<00:00, 60.83it/s]

Standard MAP@10 Score: 0.040716741048861166
Standard MRR Score:    0.14165364825928034
Standard AUC Score:    0.9064525723963534





## Train a ``HybridPretrainedModel`` 

With our trained ``model`` above, we can now use these embeddings and additional side data directly in a hybrid model. The architecture essentially takes our user embedding, item embedding, and item metadata for each user-item interaction, concatenates them, and sends it through a simple feedforward network to output a recommendation score. 

We can initially freeze the user and item embeddings from our previously-trained ``model``, train for a few epochs only optimizing our newly-added linear layers, and then train a model with everything unfrozen at a lower learning rate. We will show this process below. 

In [35]:
from collections import OrderedDict
import copy
from functools import partial
import os
from pathlib import Path
from typing import Callable, Dict, List, Optional, Union

import joblib
import numpy as np
import pandas as pd
import torch
from torch import nn
import torch.nn.functional as F
from torch.optim.lr_scheduler import ReduceLROnPlateau

from collie_recs.config import DATA_PATH
from collie_recs.model.base import (BasePipeline,
                                    INTERACTIONS_LIKE_INPUT,
                                    ScaledEmbedding)
from collie_recs.model.matrix_factorization import MatrixFactorizationModel
from collie_recs.utils import get_init_arguments, merge_docstrings


class HybridPretrainedModel(BasePipeline):
    # NOTE: the full docstring is merged in with ``BasePipeline``'s using ``merge_docstrings``.
    # Only the description of new or changed parameters are included in this docstring
    """
    Training pipeline for a hybrid recommendation model.

    ``HybridPretrainedModel`` models contain dense layers that process item metadata, concatenate
    this embedding with the user and item embeddings copied from a trained
    ``MatrixFactorizationModel``, and send this concatenated embedding through more dense layers to
    output a single float ranking / rating.

    All ``HybridPretrainedModel`` instances are subclasses of the ``LightningModule`` class
    provided by PyTorch Lightning. This means to train a model, you will need a
    ``collie_recs.model.CollieTrainer`` object, but the model can be saved and loaded without this
    ``Trainer`` instance. Example usage may look like:

    .. code-block:: python

        from collie_recs.model import CollieTrainer, HybridPretrainedModel, MatrixFactorizationModel


        # instantiate and fit a ``MatrixFactorizationModel`` as expected
        mf_model = MatrixFactorizationModel(train=train)
        mf_trainer = CollieTrainer(mf_model)
        mf_trainer.fit(mf_model)

        hybrid_model = HybridPretrainedModel(train=train,
                                             item_metadata=item_metadata,
                                             trained_model=mf_model)
        hybrid_trainer = CollieTrainer(hybrid_model)
        hybrid_trainer.fit(hybrid_model)
        hybrid_model.eval()

        # do evaluation as normal with ``hybrid_model``

        hybrid_model.save_model(path='model')
        new_hybrid_model = HybridPretrainedModel(load_model_path='model')

        # do evaluation as normal with ``new_hybrid_model``

    Parameters
    ----------
    item_metadata: torch.tensor, pd.DataFrame, or np.array, 2-dimensional
        The shape of the item metadata should be (num_items x metadata_features), and each item's
        metadata should be available when indexing a row by an item ID
    trained_model: ``collie_recs.model.MatrixFactorizationModel``
        Previously trained ``MatrixFactorizationModel`` model to extract embeddings from
    metadata_layers_dims: list
        List of linear layer dimensions to apply to the metadata only, starting with the dimension
        directly following ``metadata_features`` and ending with the dimension to concatenate with
        the item embeddings
    combined_layers_dims: list
        List of linear layer dimensions to apply to the concatenated item embeddings and item
        metadata, starting with the dimension directly following the shape of
        ``item_embeddings + metadata_features`` and ending with the dimension before the final
        linear layer to dimension 1
    freeze_embeddings: bool
        When initializing the model, whether or not to freeze ``trained_model``'s embeddings
    dropout_p: float
        Probability of dropout
    optimizer: torch.optim or str
        If a string, one of the following supported optimizers:

        * ``'sgd'`` (for ``torch.optim.SGD``)

        * ``'adam'`` (for ``torch.optim.Adam``)

    """
    def __init__(self,
                 train: INTERACTIONS_LIKE_INPUT = None,
                 val: INTERACTIONS_LIKE_INPUT = None,
                 item_metadata: Union[torch.tensor, pd.DataFrame, np.array] = None,
                 user_metadata: Union[torch.tensor, pd.DataFrame, np.array] = None,
                 trained_model: MatrixFactorizationModel = None,
                 item_metadata_layers_dims: Optional[List[int]] = None,
                 user_metadata_layers_dims: Optional[List[int]] = None,
                 combined_layers_dims: List[int] = [128, 64, 32],
                 freeze_embeddings: bool = True,
                 dropout_p: float = 0.0,
                 lr: float = 1e-3,
                 lr_scheduler_func: Optional[Callable] = partial(ReduceLROnPlateau,
                                                                 patience=1,
                                                                 verbose=True),
                 weight_decay: float = 0.0,
                 optimizer: Union[str, Callable] = 'adam',
                 loss: Union[str, Callable] = 'hinge',
                 metadata_for_loss: Optional[Dict[str, torch.tensor]] = None,
                 metadata_for_loss_weights: Optional[Dict[str, float]] = None,
                 # y_range: Optional[Tuple[float, float]] = None,
                 load_model_path: Optional[str] = None,
                 map_location: Optional[str] = None):
        item_metadata_num_cols = None
        if load_model_path is None:
            if trained_model is None:
                raise ValueError('Must provide ``trained_model`` for ``HybridPretrainedModel``.')

            if item_metadata is None:
                raise ValueError('Must provide item metadata for ``HybridPretrainedModel``.')
            elif isinstance(item_metadata, pd.DataFrame):
                item_metadata = torch.from_numpy(item_metadata.to_numpy())
            elif isinstance(item_metadata, np.ndarray):
                item_metadata = torch.from_numpy(item_metadata)

            item_metadata = item_metadata.float()

            item_metadata_num_cols = item_metadata.shape[1]

            if user_metadata is None:
                raise ValueError('Must provide user metadata for ``HybridPretrainedModel``.')
            elif isinstance(user_metadata, pd.DataFrame):
                user_metadata = torch.from_numpy(user_metadata.to_numpy())
            elif isinstance(item_metadata, np.ndarray):
                user_metadata = torch.from_numpy(user_metadata)

            user_metadata = user_metadata.float()

            user_metadata_num_cols = user_metadata.shape[1]

        super().__init__(**get_init_arguments(),
                         item_metadata_num_cols=item_metadata_num_cols,
                         user_metadata_num_cols=user_metadata_num_cols
                         )

    __doc__ = merge_docstrings(BasePipeline, __doc__, __init__)

    def _load_model_init_helper(self, load_model_path: str, map_location: str, **kwargs) -> None:
        self.item_metadata = joblib.load(os.path.join(load_model_path, 'item_metadata.pkl'))
        self.user_metadata = joblib.load(os.path.join(load_model_path, 'user_metadata.pkl'))
        super()._load_model_init_helper(load_model_path=os.path.join(load_model_path, 'model.pth'),
                                        map_location=map_location)

    def _setup_model(self, **kwargs) -> None:
        """
        Method for building model internals that rely on the data passed in.

        This method will be called after ``prepare_data``.

        """
        if self.hparams.load_model_path is None:
            if not hasattr(self, '_trained_model'):
                self._trained_model = kwargs.pop('trained_model')
            if not hasattr(self, 'item_metadata'):
                self.item_metadata = kwargs.pop('item_metadata')
            if not hasattr(self, 'user_metadata'):
                self.item_metadata = kwargs.pop('user_metadata')

            # we are not loading in a model, so we will create a new model from scratch
            # we don't want to modify the ``trained_model``'s weights, so we deep copy
            self.embeddings = nn.Sequential(
                copy.deepcopy(self._trained_model.user_embeddings),
                copy.deepcopy(self._trained_model.item_embeddings)
            )

            if self.hparams.freeze_embeddings:
                self.freeze_embeddings()
            else:
                self.unfreeze_embeddings()

            # save hyperparameters that we need to be able to rebuilt the embedding layers on load
            self.hparams.user_num_embeddings = self.embeddings[0].num_embeddings
            self.hparams.user_embeddings_dim = self.embeddings[0].embedding_dim
            self.hparams.item_num_embeddings = self.embeddings[1].num_embeddings
            self.hparams.item_embeddings_dim = self.embeddings[1].embedding_dim
        else:
            # assume we are loading in a previously-saved model
            # set up dummy embeddings with the correct dimensions so we can load weights in
            self.embeddings = nn.Sequential(
                ScaledEmbedding(self.hparams.user_num_embeddings, self.hparams.user_embeddings_dim),
                ScaledEmbedding(self.hparams.item_num_embeddings, self.hparams.item_embeddings_dim)
            )

        self.dropout = nn.Dropout(p=self.hparams.dropout_p)

        # set up metadata-only layers
        item_metadata_output_dim = self.hparams.item_metadata_num_cols
        self.item_metadata_layers = None
        if self.hparams.item_metadata_layers_dims is not None:
            item_metadata_layers_dims = (
                [self.hparams.item_metadata_num_cols] + self.hparams.item_metadata_layers_dims
            )
            self.item_metadata_layers = [
                nn.Linear(item_metadata_layers_dims[idx - 1], item_metadata_layers_dims[idx])
                for idx in range(1, len(item_metadata_layers_dims))
            ]
            for i, layer in enumerate(self.item_metadata_layers):
                nn.init.xavier_normal_(self.item_metadata_layers[i].weight)
                self.add_module('item_metadata_layer_{}'.format(i), layer)

            item_metadata_output_dim = item_metadata_layers_dims[-1]

        # set up metadata-only layers
        user_metadata_output_dim = self.hparams.user_metadata_num_cols
        self.user_metadata_layers = None
        if self.hparams.user_metadata_layers_dims is not None:
            user_metadata_layers_dims = (
                [self.hparams.user_metadata_num_cols] + self.hparams.user_metadata_layers_dims
            )
            self.user_metadata_layers = [
                nn.Linear(user_metadata_layers_dims[idx - 1], user_metadata_layers_dims[idx])
                for idx in range(1, len(user_metadata_layers_dims))
            ]
            for i, layer in enumerate(self.user_metadata_layers):
                nn.init.xavier_normal_(self.user_metadata_layers[i].weight)
                self.add_module('user_metadata_layer_{}'.format(i), layer)

            user_metadata_output_dim = user_metadata_layers_dims[-1]

        # set up combined layers
        combined_dimension_input = (
            self.hparams.user_embeddings_dim
            + self.hparams.item_embeddings_dim
            + item_metadata_output_dim
            + user_metadata_output_dim
        )
        combined_layers_dims = [combined_dimension_input] + self.hparams.combined_layers_dims + [1]
        self.combined_layers = [
            nn.Linear(combined_layers_dims[idx - 1], combined_layers_dims[idx])
            for idx in range(1, len(combined_layers_dims))
        ]
        for i, layer in enumerate(self.combined_layers):
            nn.init.xavier_normal_(self.combined_layers[i].weight)
            self.add_module('combined_layer_{}'.format(i), layer)

    def forward(self, users: torch.tensor, items: torch.tensor) -> torch.tensor:
        """
        Forward pass through the model.

        Parameters
        ----------
        users: tensor, 1-d
            Array of user indices
        items: tensor, 1-d
            Array of item indices

        Returns
        -------
        preds: tensor, 1-d
            Predicted ratings or rankings

        """
        # TODO: remove self.device and let lightning do it
        item_metadata_output = self.item_metadata[items, :].to(self.device)
        if self.item_metadata_layers is not None:
            for item_metadata_nn_layer in self.item_metadata_layers:
                item_metadata_output = self.dropout(
                    F.leaky_relu(
                        item_metadata_nn_layer(item_metadata_output)
                    )
                )
        user_metadata_output = self.user_metadata[users, :].to(self.device)
        if self.user_metadata_layers is not None:
            for user_metadata_nn_layer in self.user_metadata_layers:
                user_metadata_output = self.dropout(
                    F.leaky_relu(
                        user_metadata_nn_layer(user_metadata_output)
                    )
                )

        combined_output = torch.cat((self.embeddings[0](users),
                                     self.embeddings[1](items),
                                     item_metadata_output,
                                     user_metadata_output
                                     ), 1)
        for combined_nn_layer in self.combined_layers[:-1]:
            combined_output = self.dropout(
                F.leaky_relu(
                    combined_nn_layer(combined_output)
                )
            )

        pred_scores = self.combined_layers[-1](combined_output)

        return pred_scores.squeeze()

    def _get_item_embeddings(self) -> np.array:
        """Get item embeddings."""
        # TODO: update this to get the embeddings post-MLP
        return self.embeddings[1](
            torch.arange(self.hparams.num_items, device=self.device)
        ).detach().cpu()

    def freeze_embeddings(self) -> None:
        """Remove gradient requirement from the embeddings."""
        self.embeddings[0].weight.requires_grad = False
        self.embeddings[1].weight.requires_grad = False

    def unfreeze_embeddings(self) -> None:
        """Require gradients for the embeddings."""
        self.embeddings[0].weight.requires_grad = True
        self.embeddings[1].weight.requires_grad = True

    def save_model(self,
                   path: Union[str, Path] = os.path.join(DATA_PATH / 'model'),
                   overwrite: bool = False) -> None:
        """
        Save the model's state dictionary, hyperparameters, and item metadata.

        While PyTorch Lightning offers a way to save and load models, there are two main reasons
        for overriding these:

        1) To properly save and load a model requires the ``Trainer`` object, meaning that all
           deployed models will require Lightning to run the model, which is not actually needed
           for inference.

        2) In the v0.8.4 release, loading a model back in leads to a ``RuntimeError`` unable to load
           in weights.

        Parameters
        ----------
        path: str or Path
            Directory path to save model and data files
        overwrite: bool
            Whether or not to overwrite existing data

        """
        path = str(path)

        if os.path.exists(path):
            if os.listdir(path) and overwrite is False:
                raise ValueError(f'Data exists in ``path`` at {path} and ``overwrite`` is False.')

        Path(path).mkdir(parents=True, exist_ok=True)
        joblib.dump(self.item_metadata, os.path.join(path, 'item_metadata.pkl'))
        joblib.dump(self.user_metadata, os.path.join(path, 'user_metadata.pkl'))

        # preserve ordering while extracting the state dictionary without the ``_trained_model``
        # component
        state_dict_keys_to_save = [
            k for k, _ in self.state_dict().items() if '_trained_model' not in k
        ]
        state_dict_vals_to_save = [
            v for k, v in self.state_dict().items() if '_trained_model' not in k
        ]
        state_dict_to_save = OrderedDict(zip(state_dict_keys_to_save, state_dict_vals_to_save))

        dict_to_save = {'state_dict': state_dict_to_save, 'hparams': self.hparams}
        torch.save(dict_to_save, os.path.join(path, 'model.pth'))

    def load_from_hybrid_model(self, hybrid_model) -> None:
        """
        Copy hyperparameters and state dictionary from an existing ``HybridPretrainedModel``
        instance.

        This is particularly useful for creating another PyTorch Lightning trainer object to
        fine-tune copied-over embeddings from a ``MatrixFactorizationModel`` instance.

        Parameters
        ----------
        hybrid_model: ``collie_recs.model.HybridPretrainedModel``
            HybridPretrainedModel containing hyperparameters and state dictionary to copy over

        """
        for key, value in hybrid_model.hparams.items():
            self.hparams[key] = value

        self._setup_model()
        self.load_state_dict(state_dict=hybrid_model.state_dict())
        self.eval()


In [36]:
# we will apply a linear layer to the metadata with ``metadata_layers_dims`` and
# a linear layer to the combined embeddings and metadata data with ``combined_layers_dims``
hybrid_model = HybridPretrainedModel(
    train=train_interactions,
    val=val_interactions,
    item_metadata=metadata_item_df,
    user_metadata=metadata_user_df[['age']],
    trained_model=model,
    item_metadata_layers_dims=[8],
    combined_layers_dims=[16],
    lr=1e-2,
    freeze_embeddings=True,
)

In [37]:
hybrid_trainer = CollieTrainer(model=hybrid_model, max_epochs=10, deterministic=True)

hybrid_trainer.fit(hybrid_model)

GPU available: False, used: False
TPU available: False, using: 0 TPU cores

  | Name                  | Type                     | Params
-------------------------------------------------------------------
0 | _trained_model        | MatrixFactorizationModel | 81.1 K
1 | embeddings            | Sequential               | 78.5 K
2 | dropout               | Dropout                  | 0     
3 | item_metadata_layer_0 | Linear                   | 232   
4 | combined_layer_0      | Linear                   | 1.1 K 
5 | combined_layer_1      | Linear                   | 17    
-------------------------------------------------------------------
82.5 K    Trainable params
78.5 K    Non-trainable params
161 K     Total params
0.644     Total estimated model params size (MB)


Validation sanity check:   0%|          | 0/2 [00:00<?, ?it/s]



IndexError: index 1011 is out of bounds for dimension 0 with size 943

In [None]:
#### TO DO TOMORROW - think this has arisen as dataset is deduplicated, maybe need to reindex of deduped interactions set

In [12]:
mapk_score, mrr_score, auc_score = evaluate_in_batches([mapk, mrr, auc], val_interactions, hybrid_model)

print(f'Hybrid MAP@10 Score: {mapk_score}')
print(f'Hybrid MRR Score:    {mrr_score}')
print(f'Hybrid AUC Score:    {auc_score}')

100%|██████████| 48/48 [00:00<00:00, 50.65it/s]

Hybrid MAP@10 Score: 0.045323794238148196
Hybrid MRR Score:    0.1632128057174875
Hybrid AUC Score:    0.9050095811517882





In [14]:
hybrid_model_unfrozen = HybridPretrainedModel(
    train=train_interactions,
    val=val_interactions,
    item_metadata=metadata_item_df,
    trained_model=model,
    metadata_layers_dims=[8],
    combined_layers_dims=[16],
    lr=1e-4,
    freeze_embeddings=False,
)

hybrid_model.unfreeze_embeddings()
hybrid_model_unfrozen.load_from_hybrid_model(hybrid_model)

In [15]:
hybrid_trainer_unfrozen = CollieTrainer(model=hybrid_model_unfrozen, max_epochs=10, deterministic=True)

hybrid_trainer_unfrozen.fit(hybrid_model_unfrozen)

GPU available: False, used: False
TPU available: False, using: 0 TPU cores

  | Name             | Type                     | Params
--------------------------------------------------------------
0 | _trained_model   | MatrixFactorizationModel | 81.1 K
1 | embeddings       | Sequential               | 78.5 K
2 | dropout          | Dropout                  | 0     
3 | metadata_layer_0 | Linear                   | 232   
4 | combined_layer_0 | Linear                   | 1.1 K 
5 | combined_layer_1 | Linear                   | 17    
--------------------------------------------------------------
82.5 K    Trainable params
78.5 K    Non-trainable params
160 K     Total params
0.644     Total estimated model params size (MB)


Validation sanity check:   0%|          | 0/2 [00:00<?, ?it/s]



                                                                      

Global seed set to 22


Epoch 0:   0%|          | 0/55 [00:00<?, ?it/s] 



Epoch 0:  87%|████████▋ | 48/55 [00:00<00:00, 48.51it/s, loss=1.63, v_num=2]



Epoch 0:  89%|████████▉ | 49/55 [00:01<00:00, 48.22it/s, loss=1.62, v_num=2]







Epoch 1:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.62, v_num=2]



Epoch 1:  87%|████████▋ | 48/55 [00:00<00:00, 57.28it/s, loss=1.6, v_num=2]



Epoch 1:  89%|████████▉ | 49/55 [00:00<00:00, 57.18it/s, loss=1.58, v_num=2]



Epoch 2:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.58, v_num=2]



Epoch 2:  91%|█████████ | 50/55 [00:00<00:00, 61.14it/s, loss=1.53, v_num=2]



Epoch     3: reducing learning rate of group 0 to 1.0000e-03.
Epoch 2: 100%|██████████| 55/55 [00:01<00:00, 53.10it/s, loss=1.53, v_num=2]



Epoch 3:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.53, v_num=2]



Epoch 3:   5%|▌         | 3/55 [00:00<00:04, 12.94it/s, loss=1.55, v_num=2]



Epoch 3:  91%|█████████ | 50/55 [00:00<00:00, 57.87it/s, loss=1.58, v_num=2]



Epoch 4:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.58, v_num=2]



Epoch 4:  91%|█████████ | 50/55 [00:00<00:00, 66.09it/s, loss=1.59, v_num=2]



Epoch     5: reducing learning rate of group 0 to 1.0000e-04.
Epoch 5:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.59, v_num=2]



Epoch 5:   4%|▎         | 2/55 [00:00<00:05,  9.40it/s, loss=1.59, v_num=2]



Epoch 5:  87%|████████▋ | 48/55 [00:00<00:00, 57.61it/s, loss=1.6, v_num=2]



Epoch 5:  91%|█████████ | 50/55 [00:00<00:00, 58.08it/s, loss=1.6, v_num=2]



Epoch 6:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.6, v_num=2]



Epoch 6:  91%|█████████ | 50/55 [00:00<00:00, 63.12it/s, loss=1.61, v_num=2]



Epoch     7: reducing learning rate of group 0 to 1.0000e-05.
Epoch 7:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.61, v_num=2]



Epoch 7:  91%|█████████ | 50/55 [00:00<00:00, 60.05it/s, loss=1.58, v_num=2]



Epoch 8:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.58, v_num=2]



Epoch 8:  89%|████████▉ | 49/55 [00:00<00:00, 60.64it/s, loss=1.61, v_num=2]



Epoch 8:  93%|█████████▎| 51/55 [00:00<00:00, 53.94it/s, loss=1.61, v_num=2]Epoch     9: reducing learning rate of group 0 to 1.0000e-06.
Epoch 9:   0%|          | 0/55 [00:00<?, ?it/s, loss=1.61, v_num=2]



Epoch 9:  89%|████████▉ | 49/55 [00:00<00:00, 61.07it/s, loss=1.6, v_num=2]



Epoch 9: 100%|██████████| 55/55 [00:00<00:00, 56.15it/s, loss=1.6, v_num=2]




In [None]:
mapk_score, mrr_score, auc_score = evaluate_in_batches([mapk, mrr, auc],
                                                       val_interactions,
                                                       hybrid_model_unfrozen)

print(f'Hybrid Unfrozen MAP@10 Score: {mapk_score}')
print(f'Hybrid Unfrozen MRR Score:    {mrr_score}')
print(f'Hybrid Unfrozen AUC Score:    {auc_score}')

  0%|          | 0/48 [00:00<?, ?it/s]

Hybrid Unfrozen MAP@10 Score: 0.048911285792920596
Hybrid Unfrozen MRR Score:    0.16522896186213332
Hybrid Unfrozen AUC Score:    0.9048254064693573


Note here that while our ``MAP@10`` and ``MRR`` scores went down slightly from the frozen version of the model above, our ``AUC`` score increased. For implicit recommendation models, each evaluation metric is nuanced in what it represents for real world recommendations. 

You can read more about each evaluation metric by checking out the [Mean Average Precision at K (MAP@K)](https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Mean_average_precision), [Mean Reciprocal Rank](https://en.wikipedia.org/wiki/Mean_reciprocal_rank), and [Area Under the Curve (AUC)](https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve) Wikipedia pages. 

In [16]:
user_id = np.random.randint(0, train_interactions.num_users)

display(
    HTML(
        get_recommendation_visualizations(
            model=hybrid_model_unfrozen,
            user_id=user_id,
            filter_films=True,
            shuffle=True,
            detailed=True,
        )
    )
)

URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1091)>

The metrics and results look great, and we should only see a larger difference compared to a standard model as our data becomes more nuanced and complex (such as with MovieLens 10M data). 

If we're happy with this model, we can go ahead and save it for later! 

## Save and Load a Hybrid Model 

In [None]:
# we can save the model with...
os.makedirs('models', exist_ok=True)
hybrid_model_unfrozen.save_model('models/hybrid_model_unfrozen')

In [None]:
# ... and if we wanted to load that model back in, we can do that easily...
hybrid_model_loaded_in = HybridPretrainedModel(load_model_path='models/hybrid_model_unfrozen')


hybrid_model_loaded_in

HybridPretrainedModel(
  (embeddings): Sequential(
    (0): ScaledEmbedding(943, 30)
    (1): ScaledEmbedding(1674, 30)
  )
  (dropout): Dropout(p=0.0, inplace=False)
  (metadata_layer_0): Linear(in_features=28, out_features=8, bias=True)
  (combined_layer_0): Linear(in_features=68, out_features=16, bias=True)
  (combined_layer_1): Linear(in_features=16, out_features=1, bias=True)
)

That's the end of our tutorials, but it's not the end of the awesome features available in Collie. Check out all the different available architectures in the documentation [here](https://collie.readthedocs.io/en/latest/index.html)! 

----- 