# 🚀 Introduction to Hopwise Library

In this notebook, you'll explore how the Hopwise library works, its core components, and how to use it effectively in your own projects.

### ⚙️ Setup Workspace

Before starting, let's prepare our workspace.

1. Import the module needed to access Google Drive from Colab and mount your Google Drive in the Colab environment. This lets you access the files and folders stored in your Google Drive.

In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


2. Install the `hopwise` library

In [None]:
%%capture
!uv pip install hopwise[cli]
!uv pip install "dgl>=2.4.0" -f https://data.dgl.ai/wheels/torch-2.4/cu124/repo.html --no-deps  # for KGAT

3. Verify that the hopwise library has been installed correctly by running the following command:

In [None]:
%%capture
!uv run hopwise train --epochs=1

4. To view the installed libraries in the right sidebar, run the following command, which creates a symbolic link to the package directory:

In [None]:
!ln -s /usr/local/lib/python3.11/dist-packages /content/dist-packages

5. To check if you are using the GPU, run the following code:

In [None]:
import torch
if torch.cuda.is_available():
    device_id = torch.cuda.current_device()
    device_name = torch.cuda.get_device_name(device_id)
    print(f"CUDA Device ID: {device_id}")
    print(f"CUDA Device Name: {device_name}")
else:
    print("No CUDA device is available.")

No CUDA device is available.



## 🏗️ Framework Architecture

<div style="background-color: #fff8e1; border-left: 5px solid #ffc107; padding: 16px 20px; margin: 15px 0; border-radius: 8px; box-shadow: 0 2px 5px rgba(0,0,0,0.05); font-family: sans-serif; line-height: 1.6;">
  <p style="margin: 0;">
    <b>🚀</b>
    <b>Hopwise</b> is an advanced extension of the <a href="https://recbole.io/index.html" target="_blank">RecBole</a> library, designed to enhance recommendation systems with the power of <b>knowledge graphs</b>.
    By integrating <b>knowledge embedding models</b>, <b>path-based reasoning methods</b>, and <b>path language modeling approaches</b>, hopwise supports both <b>recommendation</b> and <b>link prediction</b> tasks with a focus on <b>explainability</b>.
  </p>
</div>

The hopwise library follows a modular design in which each module, called a layer, is dedicated to a specific task to ensure clarity, reusability, and ease of maintenance. The library consists of **four** modules:
* **Data and Configuration Management**: Manages the full lifecycle of data processing and experiment setup, including loading, preprocessing, sampling, and configuration management.
* **Model**: Defines and implements various recommender algorithms.
* **Evaluation**: Includes different evaluation protocols.
* **Utilities**: Defines auxiliary functions and helpers for managing the pipeline.


<img src="https://raw.githubusercontent.com/mallociFrancesca/XAIKGRLGM/a77f9ea5633475efe43038ef2a11e1341342e0ef/hands-on-session/hopwise-architecture.png" alt="Hopwise Architecture" width="600">



---

## Model Execution Interfaces

<img src="https://raw.githubusercontent.com/mallociFrancesca/XAIKGRLGM/a77f9ea5633475efe43038ef2a11e1341342e0ef/hands-on-session/hopwise-interfaces.png" alt="Hopwise Interfaces" width="600" height="350">


Hopwise supports two main interfaces for running recommendation models:

* 🖥️ **Command-Line Interface (CLI)**: the entire pipeline can be executed directly from the command line by running `uv run run_hopwise.py` with the desired arguments (e.g., `--parameter_name=[parameter_value]`).

* 🐍 **Python API Interface**: the entire pipeline can be programmatically executed by invoking functions and classes directly in Python code.

## The Entry Point

The entry point of Hopwise is the function `run_hopwise()`, located in [hopwise/quick_start/quick_start.py](https://github.com/tail-unica/hopwise/blob/main/hopwise/quick_start/quick_start.py).


The `run_hopwise()` method manages the entire recommendation pipeline with the following workflow:

```yaml
Configuration → Dataset → Model → Training → Evaluation
```

In more detail, `run_hopwise()` breaks this workflow into seven steps:

<img src="https://raw.githubusercontent.com/mallociFrancesca/XAIKGRLGM/a77f9ea5633475efe43038ef2a11e1341342e0ef/hands-on-session/quick-start.png" alt="Quick Start Diagram">


1. **Configuration**  
   Initializes a unified configuration object by combining model and dataset settings. This object governs the entire pipeline.

2. **Dataset Creation**  
   Loads and preprocesses the dataset based on the configuration, applying necessary filtering and formatting.

3. **Dataset Preparation**  
   Splits the dataset into training, validation, and test sets according to the configuration specified in the Config object.

4. **Model Initialization**  
   Selects the model class specified in the configuration (e.g., a general recommender) and instantiates it.

5. **Trainer Setup**  
   Selects and initializes the appropriate trainer class based on the task type and model.

6. **Training**  
   Runs the training process using the training and validation datasets, optionally displaying progress and saving intermediate results.

7. **Evaluation**  
   Evaluates the trained model on the test set and returns performance metrics.


## 1️⃣  Configuration Management

The Hopwise pipeline is controlled via a set of **configurations** grouped into several categories:

- **Environment settings**
- **Data settings**
- **Training settings**
- **Evaluation settings**

The configurations are assigned via the Configuration module.



<img src="https://raw.githubusercontent.com/mallociFrancesca/XAIKGRLGM/a77f9ea5633475efe43038ef2a11e1341342e0ef/hands-on-session/configuration-management.png" alt="Configuration Management" width="600" height="350">

### Ways to Provide Configuration Settings

Hopwise supports three methods for defining these configurations:

1. **Configuration File**  
   You can define all settings in a `.yaml` file and pass it to `run_hopwise.py` using the `--config_files` argument.

2. **Command-Line Arguments**  
   Configuration parameters can be passed directly as command-line arguments (e.g., `--parameter_name=[parameter_value]`) when executing the script, overriding defaults or YAML values.

3. **Parameter Dictionary**  
   When using the API (i.e., running models programmatically in Python), you can provide configurations as a Python dictionary (e.g., `parameter_dict = {'parameter_name': parameter_value}`).

The way you provide configuration settings in Hopwise depends on the interface you're using:

`🖥️ Command-Line` or  `🐍 Python API`.

<div style="background-color:#fff1d7; padding:15px;">
<b>📌 </b> <b>Note</b> <br>
Hopwise supports the combination of three types of parameter configurations.<br><br>
The priority of the configuration methods is: <br>
<code>
Command Line > Parameter Dicts > Config Files > Default Settings
</code>
</div>
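
The priority order in the note above can be pictured as a sequence of dictionary updates, where each higher-priority source overwrites the keys it shares with the lower-priority ones. The sketch below is only an illustration in plain Python: `merge_settings` and all the values are hypothetical, and this is not how Hopwise implements its `Config` class.

```python
def merge_settings(defaults, config_file, param_dict, cmd_line):
    """Merge four setting sources; later sources override earlier ones."""
    merged = {}
    # Lowest priority first, highest priority last
    for source in (defaults, config_file, param_dict, cmd_line):
        merged.update(source)
    return merged


# Hypothetical values, for illustration only
defaults = {"learning_rate": 0.001, "epochs": 300, "embedding_size": 64}
config_file = {"learning_rate": 0.01, "epochs": 50}   # e.g. read from config.yaml
param_dict = {"epochs": 10}                           # passed through the Python API
cmd_line = {"learning_rate": 0.05}                    # e.g. --learning_rate=0.05

final = merge_settings(defaults, config_file, param_dict, cmd_line)
print(final)
# → {'learning_rate': 0.05, 'epochs': 10, 'embedding_size': 64}
```

Here `learning_rate` comes from the command line, `epochs` from the parameter dictionary, and `embedding_size` falls back to the default because no other source sets it.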


The options available for each of the two interfaces are described below.

#### **🖥️ Command-Line Interface (CLI)**:


When running hopwise from the terminal (e.g., using `run_hopwise.py`), you can configure the system in two ways:

1. **Command-line Arguments**  
  Example:

  ```bash
  uv run run_hopwise.py --model=BPR --dataset=ml-100k --learning_rate=0.01
  ```

2. **Config file(s)**
  
  Example:
  
  * Create a file named `config.yaml` and include the settings:
    ```yaml
    model: BPR
    dataset: ml-100k
    learning_rate: 0.01
    ```
  * Run the following command:
    ```bash
    uv run run_hopwise.py --config_files=config.yaml
    ```

---
#### **🐍 Python API Interface**

When running hopwise in a Python script, you can configure the system in two ways:


1. **Parameter dictionary**:
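    ```python
    # Import the entry point from the module linked in "The Entry Point" section
    from hopwise.quick_start.quick_start import run_hopwise

    # Define a dictionary with configuration settings
    config = {
        'model': 'BPR',
        'dataset': 'ml-100k',
        'learning_rate': 0.01
    }
    # Execute the recommendation pipeline using the specified configuration
    run_hopwise(config_dict=config)
    ```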

2. **Config file(s)**:

  * Create a file named `config.yaml` and include the settings:
    ```yaml
    model: BPR
    dataset: ml-100k
    learning_rate: 0.01
    ```
  * Execute the recommendation pipeline using the specified configuration:
    ```python
    run_hopwise(config_file_list=['config.yaml'])
    ```



---

## 2️⃣ Dataset Creation

`dataset = create_dataset(config)`

<div style="background-color:#f0f4f8; border-left: 5px solid #4a90e2; padding:15px; margin:10px 0; border-radius:8px;">
  <p><b>Dataset Module:</b><br><br>
   is responsible for loading data and processing it into a format suitable for recommendation tasks. Hopwise supports various data types, including interaction data, knowledge graphs, and contextual information.
  </p>
</div>

After configuring our experiment, the next step is creating the dataset. This involves:

1. Loading raw data files
2. Preprocessing and filtering data
3. Creating internal data structures for efficient access

These steps are performed as soon as the required dataset class is initialized, which the quick-start example triggers with the following line:
```python
dataset = create_dataset(config)
```
Depending on the dataset and model type, the appropriate dataset class is selected to ensure the data is processed correctly. For instance, the BPR model uses `hopwise.data.dataset.Dataset`, while the KGAT model uses `hopwise.data.dataset.KnowledgeBasedDataset`.

Let's see which information will be provided by the created dataset:

In [None]:
# --------------------------------------------
# Import necessary libraries
# --------------------------------------------

import os
import tempfile

from hopwise.config import Config
from hopwise.data import create_dataset
from hopwise.data.dataset import KnowledgeBasedDataset

In [None]:
# --------------------------------------------
# Specify the model and dataset
# --------------------------------------------
model_name = "KGAT"  # A knowledge-graph-based recommendation model
dataset_name = "ml-100k"  # MovieLens 100K dataset

In [None]:
# --------------------------------------------
# Define the configuration
# --------------------------------------------


# Example YAML configuration content
yaml_file_content = """
epochs: 1                     # Number of training epochs (use more for full training)
learning_rate: 0.01           # Learning rate for optimization
embedding_size: 64            # Size of latent embeddings for users/items
train_batch_size: 2048        # Batch size during training
eval_batch_size: 8192         # Batch size during evaluation
stopping_step: 5              # Early stopping patience (stop if no improvement after 5 validations)

load_col:                     # Specify which columns to load from dataset files
  inter: ["user_id", "item_id", "rating", "timestamp"]    # Interaction data: who rated what and when
  item: ["item_id", "movie_title"]                        # Item data: ID and movie title

data_path: /content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/dataset/      # Path to input dataset
checkpoint_dir: /content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/  # Where to save model checkpoints

metrics:                     # Metrics used for evaluation
  - Recall                   # Recall measures hit rate in top-k recommendations
  - NDCG                     # NDCG considers the ranking of relevant items

valid_metric: Recall@10      # Metric used for validation and early stopping (Recall at 10)
topk: [5, 10, 20]            # Top-k values for evaluation metrics
"""



# Create a temporary YAML file to simulate the config file
# In a real-world scenario, you would have this file saved on disk
with tempfile.NamedTemporaryFile(mode="w", encoding="utf-8", suffix=".yaml", delete=False) as temp_file:
    temp_file.write(yaml_file_content)
    temp_file.flush()
    temp_file_path = temp_file.name

# Optional: config file list for custom settings
config_file_list = [temp_file_path]


# Create the configuration object
config = Config(
    model=model_name,
    dataset=dataset_name,
    config_file_list=config_file_list
)


In [None]:
# --------------------------------------------
# Load the dataset
# --------------------------------------------

# Loading the dataset using the defined configuration
dataset = create_dataset(config)

Let's see the outputs:

In [None]:

# --------------------------------------------
# Display Dataset & KG statistics
# --------------------------------------------


# Display dataset statistics
print(f"\n\n  📊 Dataset: {dataset.dataset_name}")
print(f"Number of users: {dataset.user_num}")
print(f"Number of items: {dataset.item_num}")
print(f"Number of interactions: {dataset.inter_num}")



# For knowledge graph datasets, show additional information
if isinstance(dataset, KnowledgeBasedDataset):
    print("\n 🌐Knowledge Graph:")
    print(f"Number of entities: {dataset.entity_num}")
    print(f"Number of relations: {dataset.relation_num}")
    print(f"Number of KG triplets: {len(dataset.kg_feat)}")



  📊 Dataset: ml-100k
Number of users: 944
Number of items: 1683
Number of interactions: 97554

 🌐Knowledge Graph:
Number of entities: 34713
Number of relations: 26
Number of KG triplets: 91631


**🔄 ID Mapping in Hopwise**

As part of preprocessing, **Hopwise maps the original IDs** (called _tokens_) to a new set of **internal IDs** ranging from `1` to `N`.  
This remapping simplifies algorithm design by **reserving `0` as a special padding ID**.

Padding is commonly used in various models to handle sequences of **variable length**, ensuring consistent input shapes.

To support this mechanism, Hopwise provides **conversion functions** to map between tokens and internal IDs — **in both directions**.
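
To make the idea concrete, here is a minimal plain-Python sketch of such a remapping. It does not use Hopwise: `build_id_mapping` and `pad_sequence` are hypothetical helpers, and assigning IDs in order of first appearance is an assumption made for the example.

```python
def build_id_mapping(tokens):
    """Return (id2token, token2id), with internal ID 0 reserved for padding."""
    id2token = ["[PAD]"]        # list index = internal ID
    token2id = {"[PAD]": 0}     # original token -> internal ID
    for token in tokens:
        if token not in token2id:
            token2id[token] = len(id2token)
            id2token.append(token)
    return id2token, token2id


def pad_sequence(ids, length, pad_id=0):
    """Right-pad a list of internal IDs with the padding ID up to `length`."""
    return ids + [pad_id] * (length - len(ids))


# Original user IDs (tokens) as they might appear in the raw interaction file
raw_user_ids = ["196", "186", "22", "196", "244"]
id2token, token2id = build_id_mapping(raw_user_ids)

print(id2token)        # → ['[PAD]', '196', '186', '22', '244']
print(token2id["22"])  # → 3   (token -> internal ID)
print(id2token[3])     # → 22  (internal ID -> token)

# A short sequence padded to a fixed length of 4
print(pad_sequence([token2id["22"], token2id["196"]], 4))  # → [3, 1, 0, 0]
```

Because `0` never refers to a real user or item, a model can safely ignore every position that holds the padding ID.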




In [None]:
# field2id_token: array indexed by internal ID, holding the original ID (token)
print(f"Mapping of user internal IDs to original IDs: {dataset.field2id_token[dataset.uid_field].tolist()}")
#print(f"Mapping of item internal IDs to original IDs: {dataset.field2id_token[dataset.iid_field].tolist()}")

# field2token_id: dictionary from original ID (token) to internal ID
print(f"\nMapping of user original IDs to internal IDs: {dataset.field2token_id[dataset.uid_field]}")
#print(f"Mapping of item original IDs to internal IDs: {dataset.field2token_id[dataset.iid_field]}")

# Example usage of the mapping
print("\n\nExample of mapping user IDs:")
# Internal user ID to look up (0 is reserved for padding)
internal_user_id = 30
# Convert the internal user ID to the original ID (token)
orig_user_id = dataset.field2id_token[dataset.uid_field][internal_user_id]

# Print the round trip: internal ID -> original ID -> internal ID
print(f"\nInternal ID:   '{internal_user_id}'  -->    Original ID: '{orig_user_id}'")
print(f"Original ID:   '{orig_user_id}'  -->    Internal ID: '{dataset.field2token_id[dataset.uid_field][orig_user_id]}'")

Mapping of user internal IDs to original IDs: ['[PAD]', '196', '186', '22', '244', '166', '298', '115', '253', '305', '6', '62', '286', '200', '210', '224', '303', '122', '194', '291', '234', '119', '167', '299', '308', '95', '38', '102', '63', '160', '50', '301', '225', '290', '97', '157', '181', '278', '276', '7', '10', '284', '201', '287', '246', '242', '249', '99', '178', '251', '81', '260', '25', '59', '72', '87', '42', '292', '20', '13', '138', '60', '57', '223', '189', '243', '92', '241', '254', '293', '127', '222', '267', '11', '8', '162', '279', '145', '28', '135', '32', '90', '216', '250', '271', '265', '198', '168', '110', '58', '237', '94', '128', '44', '264', '41', '82', '262', '174', '43', '84', '269', '259', '85', '213', '121', '49', '68', '172', '19', '268', '5', '80', '66', '18', '26', '130', '256', '1', '56', '15', '207', '232', '52', '161', '148', '125', '83', '272', '151', '54', '16', '294', '229', '36', '70', '14', '295', '233', '214', '192', '100', '307', '297', ...

## 3️⃣ Data Preparation

`train_data, valid_data, test_data = data_preparation(config, dataset)`

<div style="background-color:#f0f4f8; border-left: 5px solid #4a90e2; padding:15px; margin:10px 0; border-radius:8px;">
  <p><b>Dataset Preparation:</b><br><br>
    Prepare the data for training and evaluating the models by splitting the dataset, creating <em> samplers </em> for negative sampling, and building <em> dataloaders </em> to efficiently feed data into the models.
  </p>
</div>

Once we have our dataset, we need to prepare it for training and evaluation. This includes:

1. Splitting the data into train/validation/test sets
2. Setting up samplers that draw negative samples for training
3. Masking training interactions for evaluation
4. Building batches with dataloaders and auxiliary functions

<div style="background-color:#fff1d7; padding:15px;">
<b>📌 </b> <b>Remember: What is <em> Negative Sampling </em>? </b> <br>
  <p>
    In recommender systems, we usually know what users <strong>liked</strong> — for example, a movie they watched or rated.
  </p>

  <p>
    But we don’t know what they <strong>didn’t like</strong>, because people don’t usually say what they ignored or skipped.
  </p>

  <p>
    To train a model, we need both positive and negative examples. So we create <em>negative samples</em> by picking random items that the user
    <strong>has not interacted with</strong>.
  </p>

  <p>
    These items are used as "negatives" during training.
  </p>

  <p>
    This helps the model learn to recommend only the items the user is likely to enjoy — and <strong>not</strong> recommend the others.
  </p>
</div>
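
The idea in the note above can be sketched in a few lines of plain Python. This is a simplified illustration with uniform random sampling, not the sampler that Hopwise uses: `sample_negatives`, the toy interactions, and the item count are all hypothetical.

```python
import random


def sample_negatives(interactions, n_items, rng):
    """For each (user, positive item) pair, draw one item the user never interacted with.

    Internal item IDs are assumed to run from 1 to n_items (0 is the padding ID).
    Assumes every user has at least one unseen item, otherwise the loop never ends.
    """
    seen = {}
    for user, item in interactions:
        seen.setdefault(user, set()).add(item)

    triples = []
    for user, pos_item in interactions:
        neg_item = rng.randint(1, n_items)
        while neg_item in seen[user]:      # resample until the item is unseen
            neg_item = rng.randint(1, n_items)
        triples.append((user, pos_item, neg_item))
    return triples


# Toy data: (user, item) pairs with internal IDs
interactions = [(1, 2), (1, 3), (2, 1)]
rng = random.Random(0)                     # fixed seed for reproducibility
for user, pos_item, neg_item in sample_negatives(interactions, n_items=5, rng=rng):
    print(f"user {user}: positive item {pos_item}, sampled negative item {neg_item}")
```

Each resulting `(user, positive, negative)` triple is the kind of input that a pairwise loss such as BPR consumes during training.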

We’ll now inspect how the dataset is split and how the dataloader is built.

In [None]:
from hopwise.data import KGSampler, KnowledgeBasedDataLoader, create_samplers, data_preparation, get_dataloader

train_data, valid_data, test_data = data_preparation(config, dataset)

# Check if the training data is using a knowledge-based dataloader
# (i.e., for models that use a knowledge graph)
if isinstance(train_data, KnowledgeBasedDataLoader):
    # If true, get the batch size from the internal general_dataloader
    train_batch_size = train_data.general_dataloader.batch_size
    # Also get the total number of batches from the internal dataloader
    len_train_data = len(train_data.general_dataloader)
else:
    # If it's a regular dataloader, access batch size directly
    train_batch_size = train_data.batch_size
    # And get the number of batches directly
    len_train_data = len(train_data)


print("\n\n  📦 Training DataLoader:")
print(f"  Training batch size: {train_batch_size}")
print(f"  Evaluation batch size: {test_data.batch_size}")
print(f"  Number of training batches: {len_train_data}")
print(f"  Number of validation batches: {len(valid_data)}")
print(f"  Number of test batches: {len(test_data)}")



  📦 Training DataLoader:
  Training batch size: 2048
  Evaluation batch size: 4
  Number of training batches: 39
  Number of validation batches: 236
  Number of test batches: 236


The evaluation batch size differs from the value set in the configuration because of the way recommender systems are evaluated.  
Evaluation requires a predicted score for each <user, item> pair, and the `eval_batch_size` set in the configuration is the number of <user, item> pairs per batch.  

> So, what is the value returned by the print above?

For a fixed user _u_, we need to predict the score between _u_ and all items. The printed evaluation batch size is therefore the largest number of users whose scores against all items fit in one batch, that is, for which the number of <user, item> pairs does not exceed the value in the configuration.  

Recall that MovieLens-100k has 943 users and 1682 items.  
With `eval_batch_size = 8192`, we can compute the prediction scores for at most 4 users per batch, since 4 * 1682 = 6728 <user, item> pairs.

Selecting 5 users would result in 5 * 1682 = 8410 <user, item> pairs, which is larger than the `eval_batch_size` we set.
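
The arithmetic can be checked with a few lines of plain Python. The variable names are chosen for this example only, and the batch count assumes that all 943 users appear in the evaluation split.

```python
import math

eval_batch_size = 8192   # maximum number of <user, item> pairs per batch (from the config)
n_items = 1682           # items in MovieLens-100k
n_users = 943            # users in MovieLens-100k

# Largest number of users whose full item list fits in one batch
users_per_batch = eval_batch_size // n_items
print(users_per_batch)                   # → 4
print(users_per_batch * n_items)         # → 6728 pairs, within the limit
print((users_per_batch + 1) * n_items)   # → 8410 pairs, over the limit

# Number of evaluation batches if every user is evaluated
n_batches = math.ceil(n_users / users_per_batch)
print(n_batches)                         # → 236
```

The result of 236 batches is consistent with the number of validation and test batches printed above.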

<div style="background-color:#fff1d7; padding:15px;">
  <b>📌 </b> <b>What does the <code>batch_size</code> really mean?</b> <br>
  <p>
    The <code>eval_batch_size</code> in the config does not mean the number of users. It is the maximum number of <code>&lt;user,item&gt;</code> pairs per batch.
  </p>
</div>

## 4️⃣ Model Initialization

In this step, Hopwise selects and builds the model specified in the configuration.

In [None]:
from hopwise.utils import get_model

model = get_model(config["model"])(config, train_data.dataset).to(config["device"])
model

DGL backend not selected or invalid.  Assuming PyTorch for now.


Setting the default backend to "pytorch". You can change it in the ~/.dgl/config.json file or export the DGLBACKEND environment variable.  Valid options are: pytorch, mxnet, tensorflow (all lowercase)


  d_inv = np.power(rowsum, -1).flatten()
  indices = torch.LongTensor([final_adj_matrix.row, final_adj_matrix.col])
  adj_matrix_tensor = torch.sparse.FloatTensor(indices, values, self.matrix_size)


KGAT(
  (user_embedding): Embedding(944, 64)
  (entity_embedding): Embedding(34713, 64)
  (relation_embedding): Embedding(26, 64)
  (trans_w): Embedding(26, 4096)
  (aggregator_layers): ModuleList(
    (0): Aggregator(
      (message_dropout): Dropout(p=0.1, inplace=False)
      (W1): Linear(in_features=64, out_features=64, bias=True)
      (W2): Linear(in_features=64, out_features=64, bias=True)
      (activation): LeakyReLU(negative_slope=0.01)
    )
  )
  (tanh): Tanh()
  (mf_loss): BPRLoss()
  (reg_loss): EmbLoss()
)

Models in Hopwise are standard PyTorch modules, so they can be inspected and used largely independently of the rest of the framework.

In [None]:
print("\n\n  📦 Model:")
print(f"  Model name: {model.__class__.__name__}")
print(f"  Model parameters: {model.parameters()}")
print(f"  Model state: {model.state_dict()}")
print(f"  Model device: {model.device}")



  📦 Model:
  Model name: KGAT
  Model parameters: <generator object Module.parameters at 0x7e91e066e180>
  Model state: OrderedDict([('user_embedding.weight', tensor([[ 0.0013,  0.0673, -0.0327,  ...,  0.0147,  0.0554, -0.0553],
        [-0.0768, -0.0809,  0.0495,  ...,  0.0072,  0.0053,  0.0589],
        [-0.0683,  0.0373,  0.0707,  ..., -0.0598, -0.0191,  0.0719],
        ...,
        [ 0.0052,  0.0282, -0.0067,  ..., -0.0116,  0.0348, -0.0223],
        [ 0.0199, -0.0180, -0.0350,  ..., -0.0206, -0.1051,  0.0087],
        [ 0.0555,  0.1143,  0.0100,  ...,  0.0231, -0.0745, -0.0534]])), ('entity_embedding.weight', tensor([[-0.0058,  0.0156,  0.0077,  ...,  0.0007, -0.0063,  0.0029],
        [ 0.0021, -0.0169,  0.0020,  ...,  0.0037,  0.0057,  0.0014],
        [ 0.0102, -0.0098, -0.0103,  ...,  0.0065,  0.0221,  0.0062],
        ...,
        [-0.0074,  0.0039,  0.0113,  ...,  0.0054,  0.0021,  0.0071],
        [ 0.0129, -0.0036,  0.0149,  ..., -0.0037, -0.0039, -0.0005],
        [ 0.

Most of the models include the following methods:
- `forward`: Processes input data through the model's core architecture. In BPR, for example, it retrieves user and item embeddings and returns them for further processing. This function defines how data flows through the model.

- `calculate_loss`: Computes the training objective to optimize. For BPR, it calculates the Bayesian Personalized Ranking loss between positive and negative item pairs. This loss guides the model to rank preferred items higher than non-preferred ones.

- `predict`: Calculates a recommendation score for a specific user-item pair. In BPR, it computes the dot product between user and item embeddings. This function is used during inference for individual predictions.
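
To illustrate how `predict` and `calculate_loss` relate in the BPR case, the sketch below computes a dot-product score and the pairwise BPR loss for a single (user, positive item, negative item) triple in plain Python. The embedding values are made up, and the code is a simplified illustration rather than the Hopwise implementation, which works on batched tensors.

```python
import math


def predict(user_emb, item_emb):
    """Score of a user-item pair: dot product of the two embeddings."""
    return sum(u * i for u, i in zip(user_emb, item_emb))


def bpr_loss(pos_score, neg_score):
    """BPR loss for one triple: -log(sigmoid(pos_score - neg_score))."""
    # -log(sigmoid(x)) equals log(1 + exp(-x))
    return math.log1p(math.exp(-(pos_score - neg_score)))


# Hypothetical 2-dimensional embeddings
user = [0.5, 1.0]
pos_item = [1.0, 1.0]   # item the user interacted with
neg_item = [0.0, 0.5]   # sampled negative item

pos_score = predict(user, pos_item)   # 1.5
neg_score = predict(user, neg_item)   # 0.5

print(f"{bpr_loss(pos_score, neg_score):.4f}")  # → 0.3133
print(f"{bpr_loss(neg_score, pos_score):.4f}")  # → 1.3133
```

The loss is small when the positive item scores higher than the negative one and grows when the order is reversed, which is what pushes preferred items up the ranking.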


## 5️⃣ - 6️⃣ : Trainer Setup & Training

5. `trainer = get_trainer(config["MODEL_TYPE"], config["model"])(config, model)`


6. `trainer.fit(train_data, valid_data, saved=True, show_progress=config["show_progress"])`

The training phase includes two main steps:
- **Instantiation**: creates the Trainer, which internally sets up the supporting objects for the iterative training process and for model evaluation.
- **Fit**: runs the actual training on the training data. The validation data is optional; when provided, it is used to check the model's generalization ability, which is needed to monitor performance and to enable early stopping.
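
The early stopping mentioned above can be sketched as a loop over validation scores with a patience counter. This is a simplified illustration in plain Python: `run_with_early_stopping` and the scores are hypothetical, and the exact stopping threshold used by the Hopwise Trainer may differ.

```python
def run_with_early_stopping(valid_scores, stopping_step, bigger_is_better=True):
    """Return (best_epoch, best_score, last_epoch_run) for a list of validation scores.

    Training stops once `stopping_step` consecutive validations bring no improvement.
    """
    best_score, best_epoch = None, -1
    steps_without_improvement = 0
    last_epoch = -1
    for epoch, score in enumerate(valid_scores):
        last_epoch = epoch
        if best_score is None:
            improved = True
        elif bigger_is_better:
            improved = score > best_score
        else:
            improved = score < best_score

        if improved:
            best_score, best_epoch = score, epoch   # this is where a checkpoint would be saved
            steps_without_improvement = 0
        else:
            steps_without_improvement += 1
            if steps_without_improvement >= stopping_step:
                break                                # patience exhausted
    return best_epoch, best_score, last_epoch


# Hypothetical Recall@10 values, one per validation round
scores = [0.05, 0.08, 0.07, 0.075, 0.09, 0.085, 0.08, 0.07, 0.10]
print(run_with_early_stopping(scores, stopping_step=3))
# → (4, 0.09, 7)
```

In this example the best score is reached at epoch 4, training stops at epoch 7 after three validations without improvement, and the later value 0.10 is never seen.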

#### **5. Trainer Setup**

In [None]:
from hopwise.utils import get_trainer

trainer = get_trainer(config["MODEL_TYPE"], config["model"])(config, model)
print("\n\n  📦 Trainer:")
print(f"  Trainer name: {trainer.__class__.__name__}")
print(f"  Optimizer: {trainer.optimizer}")
print(f"  Learning rate: {trainer.learning_rate}")
print(f"  Epochs: {trainer.epochs}")
print(f"  Model file path: {trainer.saved_model_file}")
print(f"  Stopping step: {trainer.stopping_step}")
print(f"  Tensorboard: {trainer.tensorboard}")
print(f"  WandbLogger: {trainer.wandblogger}")



  📦 Trainer:
  Trainer name: KGATTrainer
  Optimizer: Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.01
    maximize: False
    weight_decay: 0.0
)
  Learning rate: 0.01
  Epochs: 1
  Model file path: /content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/KGAT-Jul-08-2025_18-39-43.pth
  Stopping step: 5
  Tensorboard: <torch.utils.tensorboard.writer.SummaryWriter object at 0x7e91e362dc10>
  WandbLogger: <hopwise.utils.wandblogger.WandbLogger object at 0x7e91e06611d0>


#### **6. Training**

In [None]:
trainer.fit(train_data, valid_data, saved=True, show_progress=config["show_progress"])

(np.float64(0.0837),
 OrderedDict([('recall@5', np.float64(0.0454)),
              ('recall@10', np.float64(0.0837)),
              ('recall@20', np.float64(0.1364)),
              ('ndcg@5', np.float64(0.0941)),
              ('ndcg@10', np.float64(0.0991)),
              ('ndcg@20', np.float64(0.1106))]))

## 7️⃣ Evaluation

`test_result = trainer.evaluate(test_data, load_best_model=True, model_file=None, show_progress=config["show_progress"])`

The evaluation phase again uses the Trainer: it computes the prediction scores for the users in the test data and then computes the metrics from them.

In [None]:
print(f"  Evaluation step: {trainer.eval_step}")
print(f"  Evaluation metric: {trainer.valid_metric}")
print(f"  Evaluation metric (bigger): {trainer.valid_metric_bigger}")
print(f"  Evaluation batch size: {trainer.test_batch_size}")
print(f"  Evaluator: {trainer.evaluator}")
print(f"  Collector: {trainer.eval_collector}")

  Evaluation step: 1
  Evaluation metric: recall@10
  Evaluation metric (bigger): True
  Evaluation batch size: 8192
  Evaluator: <hopwise.evaluator.evaluator.Evaluator object at 0x7e91c8406b90>
  Collector: <hopwise.evaluator.collector.Collector object at 0x7e91c8429d10>


In [None]:
test_result = trainer.evaluate(test_data, load_best_model=True, model_file=None, show_progress=config["show_progress"])
test_result


OrderedDict([('recall@5', np.float64(0.039)),
             ('recall@10', np.float64(0.0813)),
             ('recall@20', np.float64(0.1352)),
             ('ndcg@5', np.float64(0.0991)),
             ('ndcg@10', np.float64(0.1006)),
             ('ndcg@20', np.float64(0.1094))])
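
To give a feel for what the Recall@k and NDCG@k values reported above measure, the sketch below computes both for a single user in plain Python, using their common binary-relevance definitions. The ranked list and the relevant items are made up, and the functions are an illustration rather than the Hopwise evaluator, which reports the average over all test users.

```python
import math


def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of the user's relevant items that appear in the top-k list."""
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)


def ndcg_at_k(ranked_items, relevant_items, k):
    """DCG of the top-k list divided by the DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(position + 2)              # position 0 gets discount log2(2) = 1
        for position, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    ideal_hits = min(k, len(relevant_items))       # best case: all relevant items ranked first
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg


# Hypothetical top-5 recommendation list and held-out test items for one user
ranked = [10, 4, 7, 2, 9]
relevant = {4, 9, 30}

print(f"Recall@5: {recall_at_k(ranked, relevant, 5):.4f}")  # → Recall@5: 0.6667
print(f"NDCG@5:   {ndcg_at_k(ranked, relevant, 5):.4f}")    # → NDCG@5:   0.4776
```

Recall only counts how many relevant items were retrieved, while NDCG also rewards placing them near the top of the list. Both functions assume the user has at least one relevant item.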