# NGCF with MSD Dataset and StrongGeneralization Scenario

This notebook presents an implementation of NGCF (Neural Graph Collaborative Filtering) in RecPack, together with the experiments used to generate the algorithm's results.
The notebook contains:
1. The implementation of NGCF in RecPack.
2. A 10% subset of the Million Song Dataset (MSD) from RecPack, selected by ranking interactions on combined user activity and song popularity.
3. The StrongGeneralization scenario, used to split the data.
4. The RecPack PipelineBuilder, used to run the experiments on the split dataset with the chosen algorithm and metrics. Hyperparameter optimisation is performed inside the pipeline.

Make sure the latest versions of the required libraries are installed in your Python environment so that the code runs successfully.
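
Before running the cells, it can help to confirm which library versions are actually installed. The snippet below is a minimal sketch using only the standard library; the distribution names in `REQUIRED` are assumptions derived from the imports used in this notebook and may need adjusting for your environment.

```python
from importlib import metadata

# Assumed distribution names, based on the imports used in this notebook.
REQUIRED = ["recpack", "torch", "torch-sparse", "scipy", "pandas"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name:>12}: {version or 'NOT INSTALLED'}")
```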

## NGCF implementation in RecPack

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch_sparse import SparseTensor, matmul
from typing import List, Tuple, Optional
from recpack.algorithms.base import TorchMLAlgorithm
from recpack.matrix.interaction_matrix import InteractionMatrix
from recpack.algorithms.loss_functions import bpr_loss
from recpack.algorithms.samplers import PositiveNegativeSampler
from scipy.sparse import csr_matrix
import logging

logger = logging.getLogger(__name__)

# Neural Graph Collaborative Filtering (NGCF) model implementation
class NGCF(nn.Module):
    def __init__(self, num_users, num_items, embedding_dim=64, n_layers=3, dropout=0.0, node_dropout=0.0, message_dropout=0.0):
        """
        Initialize the NGCF model with user and item embeddings.

        Args:
            num_users (int): Number of users.
            num_items (int): Number of items.
            embedding_dim (int): Dimension of the embedding vectors.
            n_layers (int): Number of hidden layers.
            dropout (float): Dropout rate for layers.
            node_dropout (float): Dropout rate applied to node embeddings.
            message_dropout (float): Dropout rate applied during message passing.
        """
        super(NGCF, self).__init__()
        self.num_users = num_users
        self.num_items = num_items
        self.embedding_dim = embedding_dim
        self.n_layers = n_layers
        self.dropout = dropout
        self.node_dropout = node_dropout
        self.message_dropout = message_dropout

        # Initialize user and item embeddings
        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)
        
        # Define a list of linear layers for each propagation layer
        self.layers = nn.ModuleList([nn.Linear(embedding_dim, embedding_dim) for _ in range(n_layers)])
        
        # Dropout layer for regularization
        self.dropout_layer = nn.Dropout(dropout)
        
        # Initialize the parameters of the model
        self.reset_parameters()

    def reset_parameters(self):
        """
        Initialize model parameters using Xavier uniform initialization.
        """
        nn.init.xavier_uniform_(self.user_embedding.weight)
        nn.init.xavier_uniform_(self.item_embedding.weight)
        for layer in self.layers:
            nn.init.xavier_uniform_(layer.weight)

    def message_dropout_func(self, graph):
        """
        Apply message dropout during graph convolution.

        Args:
            graph (SparseTensor): The graph's sparse adjacency matrix.

        Returns:
            SparseTensor: The adjacency matrix after applying message dropout.
        """
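        # Only drop edges while training; evaluation should use the full graph
        if self.training and self.message_dropout > 0:
            row, col, value = graph.coo()
            # Build the mask on the graph's device so indexing also works on GPU
            mask = torch.rand(row.size(0), device=row.device) > self.message_dropout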
            row, col, value = row[mask], col[mask], value[mask]
            graph = SparseTensor(row=row, col=col, value=value, sparse_sizes=graph.sparse_sizes())
        return graph

    def node_dropout_func(self, embeddings):
        """
        Apply node dropout to the embeddings.

        Args:
            embeddings (torch.Tensor): The node embeddings.

        Returns:
            torch.Tensor: The embeddings after applying node dropout.
        """
        # Only drop nodes while training; evaluation should use all embeddings
        if self.training and self.node_dropout > 0:
            mask = (torch.rand(embeddings.size(0), device=embeddings.device) > self.node_dropout).float()
            embeddings = embeddings * mask.unsqueeze(1)
        return embeddings

    def forward(self, graph):
        """
        Forward pass for the NGCF model.

        Args:
            graph (SparseTensor): The graph's sparse adjacency matrix.

        Returns:
            Tuple[torch.Tensor, torch.Tensor]: Final user and item embeddings.
        """
        user_emb = self.user_embedding.weight
        item_emb = self.item_embedding.weight
        
        # Apply node dropout to user and item embeddings
        user_emb = self.node_dropout_func(user_emb)
        item_emb = self.node_dropout_func(item_emb)
        
        # Concatenate user and item embeddings
        all_emb = torch.cat([user_emb, item_emb], dim=0)
        embs = [all_emb]

        # Perform message passing and propagate embeddings through layers
        for layer in self.layers:
            graph = self.message_dropout_func(graph)
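            # A failure here usually points to a CUDA / torch_sparse version mismatch.
            # Log it and re-raise instead of silently skipping the remaining layers,
            # which would otherwise return only partially propagated embeddings.
            try:
                all_emb = matmul(graph, all_emb)
            except RuntimeError:
                logger.exception("Sparse propagation failed in NGCF.forward")
                raise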
            all_emb = layer(all_emb)
            all_emb = torch.relu(all_emb)
            all_emb = self.dropout_layer(all_emb)
            embs.append(all_emb)

        # Compute the final embeddings by averaging the embeddings across layers
        final_embedding = torch.mean(torch.stack(embs, dim=1), dim=1)
        
        # Split the final embeddings back into user and item embeddings
        user_emb_final, item_emb_final = torch.split(final_embedding, [self.num_users, self.num_items])

        return user_emb_final, item_emb_final
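
To make the propagation step in `forward` more concrete, the sketch below runs one neighbourhood-aggregation step on a tiny joint user+item graph in plain Python. It assumes the symmetric normalisation D^-1/2 A D^-1/2 commonly used with NGCF, and it deliberately leaves out the learned linear layer, the ReLU and dropout; the helper names `normalized_adjacency` and `propagate` are hypothetical and not part of RecPack or the class above.

```python
import math


def normalized_adjacency(interactions, num_users, num_items):
    """Build D^-1/2 A D^-1/2 for the joint user+item graph as nested dicts."""
    n = num_users + num_items
    neighbors = {node: set() for node in range(n)}
    for user, item in interactions:
        item_node = num_users + item    # items are indexed after the users
        neighbors[user].add(item_node)  # user -> item edge
        neighbors[item_node].add(user)  # item -> user edge (symmetric graph)
    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}
    return {
        node: {nbr: 1.0 / math.sqrt(degree[node] * degree[nbr]) for nbr in nbrs}
        for node, nbrs in neighbors.items()
    }


def propagate(adjacency, embeddings):
    """One aggregation step: each node sums its neighbours' weighted embeddings."""
    dim = len(embeddings[0])
    out = []
    for node in range(len(embeddings)):
        agg = [0.0] * dim
        for nbr, weight in adjacency[node].items():
            for d in range(dim):
                agg[d] += weight * embeddings[nbr][d]
        out.append(agg)
    return out


# Toy data: 2 users, 2 items; user 0 interacted with both items, user 1 with item 1.
adj = normalized_adjacency([(0, 0), (0, 1), (1, 1)], num_users=2, num_items=2)
emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]  # users 0-1, then items 0-1
layer1 = propagate(adj, emb)

# Average the layer-0 and layer-1 embeddings, mirroring the mean over layers in forward().
final = [[(a + b) / 2 for a, b in zip(e0, e1)] for e0, e1 in zip(emb, layer1)]
```

Note that item nodes are offset by `num_users` and every interaction appears in both directions, so user embeddings are updated from item embeddings and vice versa.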

In [2]:
from recpack.algorithms.base import TorchMLAlgorithm
from recpack.matrix import Matrix
from recpack.matrix.interaction_matrix import InteractionMatrix
from recpack.algorithms.loss_functions import bpr_loss
from recpack.algorithms.samplers import PositiveNegativeSampler
from recpack.algorithms.stopping_criterion import (
    EarlyStoppingException,
    StoppingCriterion,
)
from typing import List, Tuple, Optional
import numpy as np
from scipy.sparse import csr_matrix, lil_matrix, coo_matrix
import torch
import torch.optim as optim
import tempfile
import time
import logging

logger = logging.getLogger(__name__)

# NGCFAlgorithm: An implementation of the NGCF algorithm using TorchMLAlgorithm as a base class
class NGCFAlgorithm(TorchMLAlgorithm):
    def __init__(
        self,
        batch_size: int = 256,
        max_epochs: int = 100,
        learning_rate: float = 0.001,
        embedding_dim: int = 64,
        n_layers: int = 3,
        dropout: float = 0.1,
        node_dropout: float = 0.0,
        message_dropout: float = 0.0,
        stopping_criterion: str = "bpr",
        stop_early: bool = True,
        max_iter_no_change: int = 5,
        min_improvement: float = 0.01,
        seed: Optional[int] = None,
        save_best_to_file: bool = False,
        keep_last: bool = False,
        predict_topK: Optional[int] = None,
        validation_sample_size: Optional[int] = None,
        grad_clip: float = 1.0,  # Gradient clipping value
    ):
        """
        Initialize the NGCFAlgorithm with various hyperparameters.

        Args:
            batch_size (int): Number of samples per batch.
            max_epochs (int): Maximum number of training epochs.
            learning_rate (float): Learning rate for the optimizer.
            embedding_dim (int): Dimension of the embedding vectors.
            n_layers (int): Number of hidden layers in the NGCF model.
            dropout (float): Dropout rate for regularization.
            node_dropout (float): Dropout rate applied to node embeddings.
            message_dropout (float): Dropout rate applied during message passing.
            stopping_criterion (str): Criterion to stop training early.
            stop_early (bool): Whether to enable early stopping.
            max_iter_no_change (int): Maximum iterations with no improvement for early stopping.
            min_improvement (float): Minimum improvement required for early stopping.
            seed (Optional[int]): Random seed for reproducibility.
            save_best_to_file (bool): Whether to save the best model to a file.
            keep_last (bool): Whether to keep the last model.
            predict_topK (Optional[int]): Number of top-K predictions to consider.
            validation_sample_size (Optional[int]): Size of the validation sample.
            grad_clip (float): Maximum gradient norm for clipping.
        """
        self.embedding_dim = embedding_dim
        self.n_layers = n_layers
        self.dropout = dropout
        self.node_dropout = node_dropout
        self.message_dropout = message_dropout
        self.grad_clip = grad_clip
        super().__init__(
            batch_size=batch_size,
            max_epochs=max_epochs,
            learning_rate=learning_rate,
            stopping_criterion=stopping_criterion,
            stop_early=stop_early,
            max_iter_no_change=max_iter_no_change,
            min_improvement=min_improvement,
            seed=seed,
            save_best_to_file=save_best_to_file,
            keep_last=keep_last,
            predict_topK=predict_topK,
            validation_sample_size=validation_sample_size,
        )
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    def _init_model(self, train: InteractionMatrix) -> None:
        """
        Initialize the NGCF model and optimizer.

        Args:
            train (InteractionMatrix): The training interaction matrix.
        """
        num_users, num_items = train.shape
        self.model_ = NGCF(num_users, num_items, self.embedding_dim, self.n_layers, self.dropout, self.node_dropout, self.message_dropout).to(self.device)
        self.optimizer = optim.Adam(self.model_.parameters(), lr=self.learning_rate)

    def _create_sparse_graph(self, interaction_matrix: csr_matrix, num_users: int, num_items: int) -> SparseTensor:
        """
        Create a sparse graph from the interaction matrix.

        Args:
            interaction_matrix (csr_matrix): The interaction matrix in CSR format.
            num_users (int): Number of users.
            num_items (int): Number of items.

        Returns:
            SparseTensor: A sparse tensor representing the graph.
        """
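        coo = interaction_matrix.tocoo()
        user = torch.tensor(coo.row, dtype=torch.long)
        # Item nodes come after the user nodes in the joint (users + items) index space
        item = torch.tensor(coo.col, dtype=torch.long) + num_users
        value = torch.tensor(coo.data, dtype=torch.float32)
        shape = (num_users + num_items, num_users + num_items)
        # Store every interaction in both directions so the bipartite graph is symmetric
        row, col = torch.cat([user, item]), torch.cat([item, user])
        graph = SparseTensor(row=row, col=col, value=torch.cat([value, value]), sparse_sizes=shape).to(self.device)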
        return graph

    def _train_epoch(self, train: InteractionMatrix) -> List[float]:
        """
        Train the model for one epoch.

        Args:
            train (InteractionMatrix): The training interaction matrix.

        Returns:
            List[float]: A list of losses for each batch.
        """
        self.model_.train()
        interaction_matrix = train  # Get the sparse matrix directly
        graph = self._create_sparse_graph(interaction_matrix, train.shape[0], train.shape[1])
        total_loss = 0
        losses = []

        sampler = PositiveNegativeSampler(num_negatives=1, batch_size=self.batch_size)

        # Iterate over samples generated by the PositiveNegativeSampler
        for user_indices, pos_item_indices, neg_item_indices in sampler.sample(interaction_matrix):
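            # as_tensor accepts arrays and tensors alike and avoids the copy-construct warning
            user_indices = torch.as_tensor(user_indices, device=self.device)
            pos_item_indices = torch.as_tensor(pos_item_indices, device=self.device)
            neg_item_indices = torch.as_tensor(neg_item_indices, device=self.device).squeeze(-1)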

            self.optimizer.zero_grad()
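            user_emb_final, item_emb_final = self.model_(graph)  # Full-graph propagation for this batch
            batch_user_emb = user_emb_final[user_indices]
            # Row-wise dot products: one score per (user, item) pair, not a batch x batch matrix
            pos_scores = (batch_user_emb * item_emb_final[pos_item_indices]).sum(dim=-1)
            neg_scores = (batch_user_emb * item_emb_final[neg_item_indices]).sum(dim=-1)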

            loss = bpr_loss(pos_scores, neg_scores)

            if torch.isnan(loss).any() or torch.isinf(loss).any():
                continue

            loss.backward()
            torch.nn.utils.clip_grad_norm_(self.model_.parameters(), max_norm=self.grad_clip)  # Gradient clipping
            self.optimizer.step()

            total_loss += loss.item()
            losses.append(loss.item())

        if len(losses) == 0:
            return [float('nan')]

        return losses

    def _batch_predict(self, X: InteractionMatrix, users: List[int]) -> csr_matrix:
        """
        Make batch predictions for a list of users.

        Args:
            X (InteractionMatrix): The interaction matrix.
            users (List[int]): List of user indices to make predictions for.

        Returns:
            csr_matrix: A sparse matrix with the prediction scores.
        """
        self.model_.eval()
        graph = self._create_sparse_graph(X, X.shape[0], X.shape[1])
        user_indices = torch.tensor(users).to(self.device)
        item_indices = torch.arange(X.shape[1]).to(self.device)
        
        with torch.no_grad():
            user_emb_final, item_emb_final = self.model_(graph)
            scores = user_emb_final[user_indices] @ item_emb_final.t()
            scores = scores.cpu().numpy()
        
        result = lil_matrix((X.shape[0], X.shape[1]))
        for i, user in enumerate(users):
            result[user] = scores[i]
        
        return result.tocsr()

In [3]:
from recpack.datasets import Netflix, DummyDataset
from recpack.pipelines import PipelineBuilder
from recpack.scenarios import StrongGeneralization
from recpack.pipelines import ALGORITHM_REGISTRY
import pandas as pd

In [4]:
ALGORITHM_REGISTRY.register("NGCFAlgorithm", NGCFAlgorithm)

## RecPack Dataset Importing

In [13]:
from recpack.datasets import MillionSongDataset
dataset = MillionSongDataset()

In [14]:
dataset.fetch_dataset()

In [15]:
dataset

<recpack.datasets.million_song_dataset.MillionSongDataset at 0x7fd431e55250>

In [16]:
df = dataset._load_dataframe()

## Sampling for Datasets without Timestamps

In [17]:
# Count interactions per user and per song
user_interactions = df['userId'].value_counts().reset_index()
user_interactions.columns = ['userId', 'user_interactions']

song_interactions = df['songId'].value_counts().reset_index()
song_interactions.columns = ['songId', 'song_interactions']

# Merge the interaction counts back to the original dataframe
df = df.merge(user_interactions, on='userId')
df = df.merge(song_interactions, on='songId')

# Calculate a combined interaction score
df['interaction_score'] = df['user_interactions'] + df['song_interactions']

# Rank based on the interaction score
df['rank'] = df['interaction_score'].rank(method='first', ascending=False)

# Select the top 10%
filtered_df = df[df['rank'] <= len(df) * 0.1]

# Drop helper columns
filtered_df = filtered_df.drop(columns=['user_interactions', 'song_interactions', 'interaction_score', 'rank'])
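
The selection above is not a uniform random sample: it keeps the rows whose user and song have the highest combined interaction counts. The following plain-Python sketch reproduces the same idea on a toy list of `(user, item)` rows so the effect is easy to see; the function name `top_fraction_by_popularity` and the toy data are hypothetical.

```python
from collections import Counter


def top_fraction_by_popularity(interactions, fraction):
    """Keep the `fraction` of (user, item) rows with the highest combined count."""
    user_counts = Counter(user for user, _ in interactions)
    item_counts = Counter(item for _, item in interactions)
    # Score each row by user activity plus item popularity.
    scored = [
        (user_counts[user] + item_counts[item], idx)
        for idx, (user, item) in enumerate(interactions)
    ]
    # Highest score first; ties keep their original order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    keep = int(len(interactions) * fraction)
    kept_rows = sorted(idx for _, idx in scored[:keep])
    return [interactions[idx] for idx in kept_rows]


toy_rows = [
    ("u1", "a"), ("u1", "b"), ("u1", "c"), ("u2", "a"), ("u3", "a"),
    ("u3", "d"), ("u4", "e"), ("u5", "a"), ("u2", "b"), ("u6", "f"),
]
kept = top_fraction_by_popularity(toy_rows, 0.5)
print(kept)
```

On this toy data only rows involving the most active user or the most popular items survive, which illustrates the popularity bias such a subset carries into the later evaluation.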

In [18]:
df

| | userId | songId | user_interactions | song_interactions | interaction_score | rank |
|---|---|---|---|---|---|---|
| 0 | b80344d063b5ccb3212f76538f3d9e43d87dca9e | SOAKIMP12A8C130995 | 142 | 6698 | 6840 | 55439149.0 |
| 1 | b80344d063b5ccb3212f76538f3d9e43d87dca9e | SOAPDEY12A81C210A9 | 142 | 2012 | 2154 | 91117622.0 |
| 2 | b80344d063b5ccb3212f76538f3d9e43d87dca9e | SOBBMDR12A8C13253B | 142 | 6383 | 6525 | 56840187.0 |
| 3 | b80344d063b5ccb3212f76538f3d9e43d87dca9e | SOBBMDR12A8C13253B | 142 | 6383 | 6525 | 56840188.0 |
| 4 | b80344d063b5ccb3212f76538f3d9e43d87dca9e | SOBFNSP12AF72A0E22 | 142 | 687 | 829 | 117735248.0 |
| ... | ... | ... | ... | ... | ... | ... |
| 138680238 | b7815dbb206eb2831ce0fe040d0aa537e2e800f7 | SOUSMXX12AB0185C24 | 56 | 155529 | 155585 | 6266409.0 |
| 138680239 | b7815dbb206eb2831ce0fe040d0aa537e2e800f7 | SOWYSKH12AF72A303A | 56 | 3306 | 3362 | 77443084.0 |
| 138680240 | b7815dbb206eb2831ce0fe040d0aa537e2e800f7 | SOWYSKH12AF72A303A | 56 | 3306 | 3362 | 77443085.0 |
| 138680241 | b7815dbb206eb2831ce0fe040d0aa537e2e800f7 | SOWYSKH12AF72A303A | 56 | 3306 | 3362 | 77443086.0 |


In [19]:
filtered_df

| | userId | songId |
|---|---|---|
| 28 | b80344d063b5ccb3212f76538f3d9e43d87dca9e | SOFRQTD12A81C233C0 |
| 59 | b80344d063b5ccb3212f76538f3d9e43d87dca9e | SOMGIYR12AB0187973 |
| 60 | b80344d063b5ccb3212f76538f3d9e43d87dca9e | SOMGIYR12AB0187973 |
| 61 | b80344d063b5ccb3212f76538f3d9e43d87dca9e | SOMGIYR12AB0187973 |
| 62 | b80344d063b5ccb3212f76538f3d9e43d87dca9e | SOMGIYR12AB0187973 |
| ... | ... | ... |
| 138680210 | b7815dbb206eb2831ce0fe040d0aa537e2e800f7 | SOOFYTN12A6D4F9B35 |
| 138680211 | b7815dbb206eb2831ce0fe040d0aa537e2e800f7 | SOOFYTN12A6D4F9B35 |
| 138680212 | b7815dbb206eb2831ce0fe040d0aa537e2e800f7 | SOOFYTN12A6D4F9B35 |
| 138680237 | b7815dbb206eb2831ce0fe040d0aa537e2e800f7 | SOUJVIT12A8C1451C1 |


## Dataset Preprocessing to Interaction Matrix

In [20]:
from recpack.matrix import InteractionMatrix
from recpack.preprocessing.preprocessors import DataFramePreprocessor
item_ix = 'songId'
user_ix = 'userId'

preprocessor = DataFramePreprocessor(item_ix=item_ix, user_ix=user_ix)
interaction_matrix = preprocessor.process(filtered_df)

  0%|          | 0/13868024 [00:00<?, ?it/s]

## StrongGeneralization Scenario Splitting of Data

In [21]:
scenario = StrongGeneralization(frac_users_train=0.7, frac_interactions_in=0.8, validation=True)
scenario.split(interaction_matrix)
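
StrongGeneralization separates users rather than individual interactions: a share of the users (`frac_users_train`) is used for training, and for each remaining user a share of their interactions (`frac_interactions_in`) is given to the model as history while the rest is held out as prediction targets. The plain-Python sketch below illustrates that idea only. It is not RecPack's implementation, it omits the extra validation split enabled by `validation=True`, and the function name and toy data are hypothetical.

```python
import random


def strong_generalization_split(user_items, frac_users_train, frac_interactions_in, seed=0):
    """Split users into train/test, then split each test user's items into in/out."""
    rng = random.Random(seed)
    users = sorted(user_items)
    rng.shuffle(users)
    n_train = int(len(users) * frac_users_train)
    train_users, test_users = users[:n_train], users[n_train:]

    # Training users keep all of their interactions.
    train = {user: list(user_items[user]) for user in train_users}

    # Test users: part of the history is visible, the rest is the target.
    fold_in, hold_out = {}, {}
    for user in test_users:
        items = list(user_items[user])
        rng.shuffle(items)
        n_in = int(len(items) * frac_interactions_in)
        fold_in[user], hold_out[user] = items[:n_in], items[n_in:]
    return train, fold_in, hold_out


toy_data = {f"user{u}": [f"song{s}" for s in range(5)] for u in range(10)}
toy_train, toy_in, toy_out = strong_generalization_split(toy_data, 0.7, 0.8)
print(len(toy_train), "training users;", len(toy_in), "test users")
```

Because test users never appear in the training data, a model evaluated under this scenario has to produce recommendations for users it did not see during training, using only their fold-in history.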

## Experimental RecPack Pipeline

In [22]:
pipeline_builder = PipelineBuilder()
pipeline_builder.set_data_from_scenario(scenario)


# Add the baseline algorithms
#pipeline_builder.add_algorithm('ItemKNN', grid={'K': [100, 200, 400, 800]})
#pipeline_builder.add_algorithm('EASE', grid={'l2': [10, 100, 1000], 'alpha': [0, 0.1, 0.5]})

# Add our NGCF algorithm
pipeline_builder.add_algorithm(
    'NGCFAlgorithm',
    grid={
        'learning_rate': [0.1, 0.01, 0.001],
        'dropout': [0.0, 0.1, 0.2]
    },
    params={
        'max_epochs': 5,
        'batch_size': 1024,
        'n_layers': 3,
        'stop_early': True,
        'max_iter_no_change': 5,
        'min_improvement': 0.01,
        'save_best_to_file': True,
        'keep_last': True
    }
)

# Add NDCG, Recall, and HR metrics to be evaluated at 10, 20, and 50
pipeline_builder.add_metric('NDCGK', [10, 20, 50])
pipeline_builder.add_metric('RecallK', [10, 20, 50])
pipeline_builder.add_metric('HitK', [10, 20, 50])

# Set the optimisation metric
pipeline_builder.set_optimisation_metric('RecallK', 20)

# Construct pipeline
pipeline = pipeline_builder.build()

# Debugging: Output the shape of the training data
#print(f"Training data shape: {im.shape}")

# Run pipeline, will first do optimisation, and then evaluation
pipeline.run()



  0%|          | 0/1 [00:00<?, ?it/s]

  user_indices = torch.tensor(user_indices).to(self.device)
  pos_item_indices = torch.tensor(pos_item_indices).to(self.device)
  neg_item_indices = torch.tensor(neg_item_indices).to(self.device).squeeze()


2024-07-31 19:34:49,846 - base - recpack - INFO - Processed epoch 0 in 20.36 s.Batch Training Loss = 0.6367
2024-07-31 19:37:43,609 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931591869917388, which is better than previous iterations.
2024-07-31 19:37:43,612 - base - recpack - INFO - Evaluation at end of 0 took 173.76 s.
2024-07-31 19:37:50,432 - base - recpack - INFO - Processed epoch 1 in 6.82 s.Batch Training Loss = 0.9321
2024-07-31 19:40:52,663 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931874539860765, which is worse than previous iterations.
2024-07-31 19:40:52,666 - base - recpack - INFO - Evaluation at end of 1 took 182.23 s.
2024-07-31 19:40:59,759 - base - recpack - INFO - Processed epoch 2 in 7.09 s.Batch Training Loss = 0.7992
2024-07-31 19:44:03,072 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931576623125174, which is worse than previous iterations.
2024-07-31 19:44:03,074 - base - recpack -

2024-07-31 19:53:59,205 - base - recpack - INFO - Processed epoch 0 in 20.78 s.Batch Training Loss = 0.5572
2024-07-31 19:57:06,346 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931514638177617, which is better than previous iterations.
2024-07-31 19:57:06,349 - base - recpack - INFO - Evaluation at end of 0 took 187.14 s.
2024-07-31 19:57:27,712 - base - recpack - INFO - Processed epoch 1 in 21.36 s.Batch Training Loss = 0.5309
2024-07-31 20:00:34,255 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931341606943303, which is worse than previous iterations.
2024-07-31 20:00:34,257 - base - recpack - INFO - Evaluation at end of 1 took 186.54 s.
2024-07-31 20:00:55,931 - base - recpack - INFO - Processed epoch 2 in 21.67 s.Batch Training Loss = 0.5281
2024-07-31 20:04:04,748 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931276807848411, which is worse than previous iterations.
2024-07-31 20:04:04,750 - base - recpack

2024-07-31 20:14:40,246 - base - recpack - INFO - Processed epoch 0 in 20.88 s.Batch Training Loss = 0.5731
2024-07-31 20:17:45,747 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931688405041898, which is better than previous iterations.
2024-07-31 20:17:45,751 - base - recpack - INFO - Evaluation at end of 0 took 185.50 s.
2024-07-31 20:18:06,209 - base - recpack - INFO - Processed epoch 1 in 20.46 s.Batch Training Loss = 0.5275
2024-07-31 20:21:14,022 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931765630724335, which is worse than previous iterations.
2024-07-31 20:21:14,025 - base - recpack - INFO - Evaluation at end of 1 took 187.81 s.
2024-07-31 20:21:34,326 - base - recpack - INFO - Processed epoch 2 in 20.30 s.Batch Training Loss = 0.5247
2024-07-31 20:24:40,946 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931704202228989, which is worse than previous iterations.
2024-07-31 20:24:40,949 - base - recpack

2024-07-31 20:35:09,198 - base - recpack - INFO - Processed epoch 0 in 21.30 s.Batch Training Loss = 0.6325
2024-07-31 20:38:15,028 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931999929876389, which is better than previous iterations.
2024-07-31 20:38:15,031 - base - recpack - INFO - Evaluation at end of 0 took 185.83 s.
2024-07-31 20:38:22,349 - base - recpack - INFO - Processed epoch 1 in 7.32 s.Batch Training Loss = 0.7512
2024-07-31 20:41:28,326 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931906824336803, which is worse than previous iterations.
2024-07-31 20:41:28,328 - base - recpack - INFO - Evaluation at end of 1 took 185.98 s.
2024-07-31 20:41:35,555 - base - recpack - INFO - Processed epoch 2 in 7.23 s.Batch Training Loss = nan
2024-07-31 20:44:42,391 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931914418503827, which is worse than previous iterations.
2024-07-31 20:44:42,393 - base - recpack - IN

2024-07-31 20:54:40,331 - base - recpack - INFO - Processed epoch 0 in 20.89 s.Batch Training Loss = 0.5576
2024-07-31 20:57:47,175 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931292418260323, which is better than previous iterations.
2024-07-31 20:57:47,177 - base - recpack - INFO - Evaluation at end of 0 took 186.84 s.
2024-07-31 20:58:08,128 - base - recpack - INFO - Processed epoch 1 in 20.95 s.Batch Training Loss = 0.5311
2024-07-31 21:01:13,510 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931056038968073, which is worse than previous iterations.
2024-07-31 21:01:13,513 - base - recpack - INFO - Evaluation at end of 1 took 185.38 s.
2024-07-31 21:01:34,018 - base - recpack - INFO - Processed epoch 2 in 20.50 s.Batch Training Loss = 0.5279
2024-07-31 21:04:40,576 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.693120298879582, which is worse than previous iterations.
2024-07-31 21:04:40,579 - base - recpack 

2024-07-31 21:15:10,068 - base - recpack - INFO - Processed epoch 0 in 20.87 s.Batch Training Loss = 0.5735
2024-07-31 21:18:14,744 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931524677199823, which is better than previous iterations.
2024-07-31 21:18:14,747 - base - recpack - INFO - Evaluation at end of 0 took 184.68 s.
2024-07-31 21:18:36,046 - base - recpack - INFO - Processed epoch 1 in 21.30 s.Batch Training Loss = 0.5275
2024-07-31 21:21:43,644 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.693139799592038, which is worse than previous iterations.
2024-07-31 21:21:43,646 - base - recpack - INFO - Evaluation at end of 1 took 187.60 s.
2024-07-31 21:22:05,261 - base - recpack - INFO - Processed epoch 2 in 21.61 s.Batch Training Loss = 0.5239
2024-07-31 21:25:12,388 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931604858544702, which is worse than previous iterations.
2024-07-31 21:25:12,391 - base - recpack 

2024-07-31 21:35:41,013 - base - recpack - INFO - Processed epoch 0 in 21.13 s.Batch Training Loss = 0.6392
2024-07-31 21:38:47,296 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931647181427589, which is better than previous iterations.
2024-07-31 21:38:47,299 - base - recpack - INFO - Evaluation at end of 0 took 186.28 s.
2024-07-31 21:38:54,605 - base - recpack - INFO - Processed epoch 1 in 7.31 s.Batch Training Loss = 0.9943
2024-07-31 21:42:02,473 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931770050752495, which is worse than previous iterations.
2024-07-31 21:42:02,475 - base - recpack - INFO - Evaluation at end of 1 took 187.87 s.
2024-07-31 21:42:09,315 - base - recpack - INFO - Processed epoch 2 in 6.84 s.Batch Training Loss = 5.3003
2024-07-31 21:45:16,838 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.693171817214475, which is worse than previous iterations.
2024-07-31 21:45:16,840 - base - recpack - 

2024-07-31 21:55:20,478 - base - recpack - INFO - Processed epoch 0 in 20.87 s.Batch Training Loss = 0.5574
2024-07-31 21:58:26,602 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931645683413428, which is better than previous iterations.
2024-07-31 21:58:26,605 - base - recpack - INFO - Evaluation at end of 0 took 186.12 s.
2024-07-31 21:58:46,937 - base - recpack - INFO - Processed epoch 1 in 20.33 s.Batch Training Loss = 0.5309
2024-07-31 22:01:53,224 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931382097047518, which is worse than previous iterations.
2024-07-31 22:01:53,227 - base - recpack - INFO - Evaluation at end of 1 took 186.29 s.
2024-07-31 22:02:14,019 - base - recpack - INFO - Processed epoch 2 in 20.79 s.Batch Training Loss = 0.5284
2024-07-31 22:05:20,329 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931323749614252, which is worse than previous iterations.
2024-07-31 22:05:20,332 - base - recpack

2024-07-31 22:15:48,088 - base - recpack - INFO - Processed epoch 0 in 20.60 s.Batch Training Loss = 0.5736
2024-07-31 22:18:54,167 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931282562395807, which is better than previous iterations.
2024-07-31 22:18:54,171 - base - recpack - INFO - Evaluation at end of 0 took 186.08 s.
2024-07-31 22:19:15,168 - base - recpack - INFO - Processed epoch 1 in 21.00 s.Batch Training Loss = 0.5271
2024-07-31 22:22:21,066 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931282122325959, which is worse than previous iterations.
2024-07-31 22:22:21,070 - base - recpack - INFO - Evaluation at end of 1 took 185.90 s.
2024-07-31 22:22:41,831 - base - recpack - INFO - Processed epoch 2 in 20.76 s.Batch Training Loss = 0.5240
2024-07-31 22:25:47,075 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931483843051541, which is worse than previous iterations.
2024-07-31 22:25:47,081 - base - recpack

2024-07-31 22:36:14,717 - base - recpack - INFO - Processed epoch 0 in 21.05 s.Batch Training Loss = 0.5579
2024-07-31 22:39:21,114 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931708111444905, which is better than previous iterations.
2024-07-31 22:39:21,117 - base - recpack - INFO - Evaluation at end of 0 took 186.40 s.
2024-07-31 22:39:41,727 - base - recpack - INFO - Processed epoch 1 in 20.61 s.Batch Training Loss = 0.5309
2024-07-31 22:42:48,701 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931616392749403, which is worse than previous iterations.
2024-07-31 22:42:48,704 - base - recpack - INFO - Evaluation at end of 1 took 186.97 s.
2024-07-31 22:43:09,666 - base - recpack - INFO - Processed epoch 2 in 20.96 s.Batch Training Loss = 0.5284
2024-07-31 22:46:15,694 - stopping_criterion - recpack - INFO - StoppingCriterion has value 0.6931613852865922, which is worse than previous iterations.
2024-07-31 22:46:15,696 - base - recpack
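
The StoppingCriterion values in the log above all sit close to 0.6931, which is ln 2. Assuming the `bpr` criterion is a mean-reduced BPR loss (an assumption about RecPack's internals), ln 2 is exactly the value obtained when positive and negative items receive identical scores, so these numbers may indicate that the model is not yet separating them on the validation data. The sketch below computes the loss in plain Python with one score per (user, positive, negative) triple; the function names are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bpr_loss_mean(user_embs, pos_item_embs, neg_item_embs):
    """Mean of -log(sigmoid(pos_score - neg_score)) over all triples."""
    losses = []
    for u, p, n in zip(user_embs, pos_item_embs, neg_item_embs):
        diff = dot(u, p) - dot(u, n)                # score difference for one triple
        losses.append(math.log1p(math.exp(-diff)))  # equals -log(sigmoid(diff))
    return sum(losses) / len(losses)


# Identical positive and negative embeddings: the model cannot tell them apart.
uninformative = bpr_loss_mean([[1.0, 0.0]], [[0.5, 0.5]], [[0.5, 0.5]])

# The positive item scores 2.0 higher than the negative item.
informative = bpr_loss_mean([[1.0, 0.0]], [[2.0, 0.0]], [[0.0, 1.0]])

print(f"uninformative: {uninformative:.4f}, informative: {informative:.4f}")
```

A loss that stays at ln 2 across epochs and hyperparameter settings is therefore worth investigating before drawing conclusions from the final metrics.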

## Results

In [23]:
pipeline.get_metrics()

Algorithm: `NGCFAlgorithm(batch_size=1024,dropout=0.1,embedding_dim=64,grad_clip=1.0,keep_last=True,learning_rate=0.01,max_epochs=5,max_iter_no_change=5,message_dropout=0.0,min_improvement=0.01,n_layers=3,node_dropout=0.0,predict_topK=None,save_best_to_file=True,seed=1009965771,stop_early=True,stopping_criterion=<recpack.algorithms.stopping_criterion.StoppingCriterion object at 0x7fd44b231090>,validation_sample_size=None)`

| NDCGK_10 | NDCGK_20 | NDCGK_50 | RecallK_10 | RecallK_20 | RecallK_50 | HitK_10 | HitK_20 | HitK_50 |
|---|---|---|---|---|---|---|---|---|
| 0.01494 | 0.019542 | 0.032195 | 0.021574 | 0.036233 | 0.084923 | 0.067487 | 0.116263 | 0.276654 |

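For readers unfamiliar with the reported metrics, the sketch below computes textbook versions of Recall@K, Hit@K and NDCG@K for a single user in plain Python. These are common definitions and only an approximation of what the table reports: RecPack's `RecallK`, `HitK` and `NDCGK` may differ in details (for example, whether hits are counted or treated as binary), so the function names and formulas here should be read as assumptions.

```python
import math


def recall_at_k(ranked_items, relevant, k):
    """Fraction of the user's relevant items that appear in the top K."""
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)


def hit_at_k(ranked_items, relevant, k):
    """1.0 if at least one relevant item appears in the top K, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked_items[:k]) else 0.0


def ndcg_at_k(ranked_items, relevant, k):
    """Discounted gain of the hits, normalised by the best achievable ranking."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg


# Toy example: one relevant item ("b") is ranked second, two others are missed.
ranking = ["a", "b", "c", "d", "e"]
held_out = {"b", "e", "z"}
user_recall = recall_at_k(ranking, held_out, 3)
user_hit = hit_at_k(ranking, held_out, 3)
user_ndcg = ndcg_at_k(ranking, held_out, 3)
```

The values in the results table are averages of such per-user scores over the test users.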