# PGPR: Policy-Guided Path Reasoning


In this notebook, you will learn how to train and evaluate the PGPR model [[1]](#r1).


What you’ll do:
* 1️⃣ Load pretrained TransE embeddings computed in the previous notebook
* 2️⃣ Map TransE embeddings to a PGPR readable format
* 3️⃣ Train the model
* 4️⃣ Evaluate the model
* 5️⃣ Generate Explanations

### ⚙️ Setup Workspace

1. Import the necessary module to access Google Drive from Colab and mount your Google Drive to the Colab environment. This allows you to access files and folders stored in your Google Drive.

In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


2. Install the `hopwise` library

In [None]:
%%capture
!uv pip install hopwise[cli]

3.  To view the installed libraries in the right sidebar, run the following command:

In [None]:
!ln -s /usr/local/lib/python3.11/dist-packages /content/dist-packages

4. To check if you are using the GPU, run the following code:

In [None]:
import torch
if torch.cuda.is_available():
    device_id = torch.cuda.current_device()
    device_name = torch.cuda.get_device_name(device_id)
    print(f"CUDA Device ID: {device_id}")
    print(f"CUDA Device Name: {device_name}")
else:
    print("No CUDA device is available.")

CUDA Device ID: 0
CUDA Device Name: Tesla T4


## 📝 Introduction

<div style="background-color:#f0f4f8; border-left: 5px solid #4a90e2; padding:15px; margin:10px 0; border-radius:8px;">
  <p><b>📖 Paper Excerpt:</b><br><br>
  The algorithm aims to learn a policy that navigates from a user to potential items of interest by interacting with the knowledge graph environment. The trained policy is then adopted for the path reasoning phase to make recommendations to the user [1].
  </p>
</div>

<img src="https://raw.githubusercontent.com/mallociFrancesca/XAIKGRLGM/a77f9ea5633475efe43038ef2a11e1341342e0ef/hands-on-session/0_pgpr_architecture.png" alt="PGPR Architecture" width="1100" height="350">


The PGPR **pipeline** is composed of **three** main components:

- **KG Environment**: the underlying *Knowledge Graph*, which contains entities and their relations.
- **Policy/Value Network Training**: a neural network learns a *policy* and a *value function* that guide the agent in choosing which paths (sequences of actions) to follow.
- **Policy-Guided Path Reasoning**: using *the policy* learned in the previous step, a *beam search algorithm* produces the final paths and the associated recommendations.
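
To make the path reasoning component more concrete, here is a minimal sketch of beam search over a toy graph. The graph, the hand-written preference weights, and the names `TOY_KG`, `toy_policy`, and `beam_search` are all made up for illustration. In PGPR the action probabilities come from the trained policy network, and this is not the hopwise implementation.

```python
import math

# Toy graph: node -> list of (relation, next_node, preference weight).
# Every node, relation, and weight here is hypothetical.
TOY_KG = {
    "user_1": [("watched", "movie_A", 3.0), ("watched", "movie_B", 1.0)],
    "movie_A": [("film.film.actor", "actor_X", 2.0), ("film.film.genre", "comedy", 2.0)],
    "movie_B": [("film.film.actor", "actor_Y", 1.0)],
    "actor_X": [("film.actor.film", "movie_C", 1.0)],
    "actor_Y": [("film.actor.film", "movie_D", 1.0)],
    "comedy": [("film.film_genre.films_in_this_genre", "movie_E", 1.0)],
}

def toy_policy(node):
    """Stand-in for a learned policy: normalize the weights into probabilities."""
    actions = TOY_KG.get(node, [])
    total = sum(weight for _, _, weight in actions)
    return [(rel, nxt, weight / total) for rel, nxt, weight in actions]

def beam_search(start, hops, beam_width):
    """Keep the `beam_width` most probable partial paths after every hop."""
    beams = [(0.0, [start])]  # (log-probability, [node, relation, node, ...])
    for _ in range(hops):
        candidates = []
        for logp, path in beams:
            for rel, nxt, prob in toy_policy(path[-1]):
                candidates.append((logp + math.log(prob), path + [rel, nxt]))
        # Best first; Python's sort is stable, so ties keep their insertion order
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams

paths = beam_search("user_1", hops=3, beam_width=2)
for logp, path in paths:
    print(f"{math.exp(logp):.3f}", " -> ".join(path))
```

Each surviving path ends at a candidate item, and its probability can serve as a ranking score. A wider beam explores more paths at a higher cost.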

## 📦 Packages

In [None]:
import os
import pandas as pd
import torch
from hopwise.data import create_dataset
from hopwise.quick_start import run_hopwise

## 1️⃣ Load TransE Embeddings

The PGPR model requires as input the embeddings previously computed using the TransE model. TransE embeddings represent users, entities, and relations as vectors in a continuous space, and they are essential for PGPR to function correctly.


We have already computed the TransE embeddings in a previous notebook and saved them to a checkpoint file. Now we load them using the `torch.load()` function:

In [None]:
# Path to the saved checkpoint (update with your actual filename)
checkpoint_name = "/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/TransE-Jul-08-2025_19-42-10.pth"

# Load the checkpoint
checkpoint = torch.load(checkpoint_name,  weights_only=False)

# if you don't have the gpu, you can load the weights with the following code
#checkpoint = torch.load(checkpoint_name, weights_only=False, map_location=torch.device('cpu'))


# Remember: the learned parameters are stored in the `state_dict`
# Display which weights (learned embeddings) were saved in the 'state_dict'
checkpoint["state_dict"].keys()

odict_keys(['user_embedding.weight', 'entity_embedding.weight', 'relation_embedding.weight'])

The three `.weight` entries in the `state_dict` (`user_embedding.weight`, `entity_embedding.weight`, and `relation_embedding.weight`) are the **embedding matrices** learned during training of the TransE model.

Each matrix contains a dense vector representation for a specific type of node in the knowledge graph:

- `user_embedding.weight`: embeddings for all users  
- `entity_embedding.weight`: embeddings for all entities (items, tags, etc.)  
- `relation_embedding.weight`: embeddings for all relation types (e.g., "watched", "belongs_to")

---
The typical shape of each embedding matrix is `(num_rows, embedding_dim)`, with one row per user, entity, or relation ID.

For example:
- If you have 10,000 entities and the embedding size is 100, then:  
  `entity_embedding.weight` → shape = `(10000, 100)`

---

The `.weight` tensors are indexed only by numeric IDs: each row is the vector of one ID, with no name attached.

Example:

```bash
tensor([
    [0.11, -0.22, 0.33],  # entity ID 0
    [-0.44, 0.55, -0.66], # entity ID 1
])
```
These embeddings are numeric representations used internally by the machine learning model.


<p style="background-color:#fff6ff; padding:15px; border-width:3px; border-color:#efe6ef; border-style:solid; border-radius:6px">
    <b>📌⚠️ </b> <b>However, we need to map IDs back to names. WHY?</b> <br>
Machine learning models operate using numeric IDs. But once training is done, we need to <strong>translate those IDs back to real names</strong> so that we can understand the results.<br>
We do that using a <strong>mapping dictionary</strong>, like this:

<code>
id2entity = {
    0: "Movie_A",
    1: "Movie_B"
}
</code>
</p>

> 👉  PGPR expects embeddings to be stored in a **specific format**.
>
> **This helps it correctly link the numeric IDs to their real-world meanings using the mapping file**.
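
As a small illustration of this mapping step, the sketch below attaches tokens to the rows of a toy embedding matrix. The values, the tokens, and the `token2id` dictionary are made up. The only assumption carried over from the dataset is that ID 0 is reserved for a padding token.

```python
# Toy embedding matrix: the row index is the internal ID (values are made up)
embedding_rows = [
    [0.0, 0.0, 0.0],       # ID 0: padding row
    [0.11, -0.22, 0.33],   # ID 1
    [-0.44, 0.55, -0.66],  # ID 2
]

# Hypothetical token -> ID mapping, in the direction a dataset usually stores it
token2id = {"[PAD]": 0, "Movie_A": 1, "Movie_B": 2}

# Invert it to go from the internal ID back to the original token
id2token = {idx: token for token, idx in token2id.items()}

# Attach a name to every non-padding row
named = {id2token[idx]: row for idx, row in enumerate(embedding_rows) if idx != 0}
print(named)
```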

## 2️⃣ Map TransE Embeddings

#### Requirements

To function properly, the PGPR model requires the embeddings to be saved in three separate files:

- `.useremb`: contains the user embeddings.  
- `.entityemb`: contains the entity embeddings.  
- `.relationemb`: contains the relation embeddings.  

These files must follow a specific structure to be compatible with the PGPR model.


#### Required File Structure

Each file must be a tab-delimited file with two columns:

- **First column**: contains the original identifier (token) of the user, entity, or relation.  
- **Second column**: contains the associated embedding, represented as a sequence of float numbers.


**Example File Structure**

- File **`.useremb`**:
    - Structure:
      - First Column: `user_embedding_id:token` - The original identifier (token) of each user.
      - Second Column: `user_embedding:float_seq` - The embedding vector for each user, represented as a sequence of float numbers.
```bash
user_embedding_id:token    user_embedding:float_seq
user1                      0.1 0.2 0.3
user2                      0.4 0.5 0.6
```

---
-  File **`.entityemb`**:
    - Structure:
      - First Column: `entity_embedding_id:token` - The original identifier (token) of each entity.
      - Second Column: `entity_embedding:float_seq` - The embedding vector for each entity, represented as a sequence of float numbers.
```bash
entity_embedding_id:token    entity_embedding:float_seq
movie1                       0.1 0.2 0.3
movie2                       0.4 0.5 0.6
```

---
- File **`.relationemb`**:
    - Structure:
      - First Column: `relation_embedding_id:token` - The original identifier (token) of each relation.
      - Second Column: `relation_embedding:float_seq` - The embedding vector for each relation, represented as a sequence of float numbers.
```bash
relation_embedding_id:token    relation_embedding:float_seq
likes                          0.1 0.2 0.3
dislikes                       0.4 0.5 0.6
```
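
The layout above can be tried out with the standard library alone. The following sketch writes a toy `.useremb`-style table to an in-memory buffer and reads it back. The tokens and vectors are made up, and the buffer stands in for a real file.

```python
import csv
import io

# Made-up tokens and vectors, only to illustrate the layout
user_vectors = {"user1": [0.1, 0.2, 0.3], "user2": [0.4, 0.5, 0.6]}

buffer = io.StringIO()  # stands in for a real `.useremb` file
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(["user_embedding_id:token", "user_embedding:float_seq"])
for token, vector in user_vectors.items():
    # The float sequence is a single column: values joined by spaces
    writer.writerow([token, " ".join(str(v) for v in vector)])

# Read the table back and rebuild the vectors
buffer.seek(0)
reader = csv.reader(buffer, delimiter="\t")
header = next(reader)
parsed = {token: [float(v) for v in seq.split()] for token, seq in reader}
print(header)
print(parsed)
```

The tab separates the two columns, while spaces separate the floats inside the second column, so the two delimiters never clash.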

#### Creating Files

To generate these files, we define the `format_embedding()` function, which:

- Maps the numerical indices of the embeddings back to their original tokens (e.g., `user1`, `movie1`, `likes`).
- Saves the embeddings to files that follow the required structure.


In [None]:
def format_embedding(weight, columns, emb_type, data_path):
    # Relies on the globals `dataset_name`, `eid2token`, `rid2token`, and `uid2token`,
    # which are defined in the next cell before this function is called.
    weight = weight.detach().cpu().numpy()

    # Pick the ID -> token mapping that matches the embedding type
    if emb_type == "entity":
        mapping = eid2token
    elif emb_type == "relation":
        mapping = rid2token
    elif emb_type == "user":
        mapping = uid2token
    else:
        mapping = None  # unknown type: fall back to the raw numeric IDs

    # Create index: row 0 is skipped (it is the [PAD] row), so start from ID 1
    tokens = [mapping[idx] if mapping is not None else idx for idx in range(1, weight.shape[0])]

    # Create embedding: one space-separated string of floats per row
    vectors = [" ".join(f"{x}" for x in row) for row in weight[1:]]

    filename = f"{dataset_name}.{emb_type}emb"
    df = pd.DataFrame({columns[0]: tokens, columns[1]: vectors})
    print(f"[+] Saving the new {dataset_name} {columns[0]} embedding in {data_path}/{filename}!")
    df.to_csv(os.path.join(data_path, filename), sep="\t", index=False)

The following code iterates over the embeddings saved in the checkpoint and formats them correctly:


In [None]:
from hopwise.data import create_dataset


# Extract the dataset name from the checkpoint configuration
dataset_name = checkpoint["config"]["dataset"]
print("Dataset:", dataset_name)

# Define the path where the formatted embeddings will be saved
data_path = checkpoint["config"]["data_path"]
print("Folder Path:",data_path)

# Load dataset
dataset = create_dataset(checkpoint["config"])

uid2token = {id: token for token, id in dataset.field2token_id["user_id"].items()}
print(uid2token)
eid2token = {id: token for token, id in dataset.field2token_id["tail_id"].items()}
print(eid2token)
rid2token = {id: token for token, id in dataset.field2token_id["relation_id"].items()}
print(rid2token)
# List of embedding names to exclude from processing
excluded = ["relation_bias_embedding.weight", "norm_vec.weight", "proj_mat_e.weight"]

# Iterate over all embeddings in the checkpoint's state dictionary
for emb_name, emb in checkpoint["state_dict"].items():
    # Skip processing for embeddings listed in the excluded list
    if emb_name in excluded:
        continue

    # Determine the type of embedding (e.g., entity, user, relation) based on the embedding name
    emb_type = emb_name.split("_")[0]

    # Define the column names for the output file
    # The first column contains the token (identifier), and the second column contains the embedding vector
    columns = [f"{emb_type}_embedding_id:token", f"{emb_type}_embedding:float_seq"]

    # Print the embedding name and the corresponding columns for debugging
    print(f"[+] Formatting {emb_name} with columns {columns}")

    # Format and save the embedding using the `format_embedding` function
    format_embedding(emb, columns, emb_type, data_path)


Dataset: ml-100k
Folder Path: /content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/dataset/ml-100k
{0: np.str_('[PAD]'), 1: np.str_('196'), 2: np.str_('186'), 3: np.str_('22'), 4: np.str_('244'), 5: np.str_('166'), 6: np.str_('298'), 7: np.str_('115'), 8: np.str_('253'), 9: np.str_('305'), 10: np.str_('6'), 11: np.str_('62'), 12: np.str_('286'), 13: np.str_('200'), 14: np.str_('210'), 15: np.str_('224'), 16: np.str_('303'), 17: np.str_('122'), 18: np.str_('194'), 19: np.str_('291'), 20: np.str_('234'), 21: np.str_('119'), 22: np.str_('167'), 23: np.str_('299'), 24: np.str_('308'), 25: np.str_('95'), 26: np.str_('38'), 27: np.str_('102'), 28: np.str_('63'), 29: np.str_('160'), 30: np.str_('50'), 31: np.str_('301'), 32: np.str_('225'), 33: np.str_('290'), 34: np.str_('97'), 35: np.str_('157'), 36: np.str_('181'), 37: np.str_('278'), 38: np.str_('276'), 39: np.str_('7'), 40: np.str_('10'), 41: np.str_('284'), 42: np.str_('201'), 43: np.str_('287'), 44: np.str_('246'), 45: np.str_('242')

In [None]:
# Check the saved embeddings
os.listdir(data_path)

['ml-100k.item',
 'ml-100k.link',
 'ml-100k.user',
 'ml-100k.kg',
 'ml-100k.inter',
 'ml-100k.useremb',
 'ml-100k.entityemb',
 'ml-100k.relationemb']

The folder contains the dataset files and the embedding files:

**Dataset Files**:

* `.inter`: user-item interactions
* `.user`: user features
* `.item`: item features
* `.kg`: triplets in the Knowledge Graph
* `.link`: item-entity linkage data

**Embedding Files**:
* `.useremb`: contains the user embeddings.  
* `.entityemb`: contains the entity embeddings.  
* `.relationemb`: contains the relation embeddings.  


The final result is shown below:


Sample of **Knowledge Graph**

```
| head_id | relation_id              | tail_id  |
|---------|--------------------------|----------|
| 196     | film.producer.film       | m.028r88 |
| 186     | film.film.actor          | m.05xss5 |
| 244     | film.film.genre          | m.04j34g3|
| 166     | film.writer.film         | m.071v0j |
```


File **`.useremb`**:
```
user_embedding_id:token    user_embedding:float_seq
196                        -0.0489 0.0863 0.0459 ... -0.0352 -0.0176 -0.0604
186                        -0.0801 0.0724 0.0398 ... -0.0287 -0.0143 -0.0521
```

File **`.entityemb`**:
```
entity_embedding_id:token    entity_embedding:float_seq
m.028r88                     -0.0078 -0.0095 -0.0038 ... -0.0061 0.0023 -0.0020
m.05xss5                     0.2331 -0.1123 0.0456 ... 0.0472 0.0174 0.0032
```

File **`.relationemb`**:

```
relation_embedding_id:token    relation_embedding:float_seq
film.producer.film             -0.0761 -0.0710 -0.0138 ... -0.1096 0.1310 0.0904
film.film.actor                0.3541 -0.2279 -0.0942 ... 0.2381 -0.8148 -0.2886
```


✅ Done! Now, the TransE embeddings are ready!

## 3️⃣ Training

### Introduction

<img src="https://raw.githubusercontent.com/mallociFrancesca/XAIKGRLGM/a77f9ea5633475efe43038ef2a11e1341342e0ef/hands-on-session/train-pgpr.png" alt="Train PGPR" width="800" height="350">



We are now going to learn the policy $\pi$. The goal is to find a **policy that maximizes the cumulative reward**. This is done mainly through these steps:

- In each episode, the **agent** **starts** from a randomly sampled **user state** (node).

- At each step, we evaluate all the **valid actions (outgoing edges)** from the current state, sort them by **multi-hop scoring**, and keep the ones with the **highest scores**. As a result, multiple paths are generated at the same time.

- We repeat the previous step until the path reaches the **desired length** and the **current state (node) is an item**. At that point we compute the reward for the path and pass it to the **policy**.

The policy $\pi$ assigns a probability to taking an action $a_t$ given the current state $s_t$ at step $t$ and the action space $A_{{s_t},{s_t+1}}$.
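
One common way to obtain such action probabilities from raw scores is a softmax. The sketch below is only an illustration under that assumption: the actions and scores are made up, whereas in PGPR a policy network would compute the scores from the state and the candidate actions.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical actions available in state s_t, as (relation, next node) pairs
actions = [("watched", "movie_A"), ("watched", "movie_B"), ("watched", "movie_C")]
scores = [2.0, 1.0, 0.1]  # made-up scores; a policy network would produce these

probs = softmax(scores)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
chosen = rng.choices(actions, weights=probs, k=1)[0]

for action, p in zip(actions, probs):
    print(action, round(p, 3))
print("sampled action:", chosen)
```

Sampling from the probabilities, instead of always taking the best action, lets the agent explore different paths during training.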

### Configuration

The hopwise library provides the `run_hopwise()` function to train a model. To train the PGPR model, you need to provide two inputs: a configuration dictionary that specifies the TransE embedding settings, and a meta-paths file.

- The **TransE embeddings**, which include user, entity, and relation embeddings, are specified in the *configuration dictionary* under the `additional_feat_suffix` parameter. These embeddings are essential for initializing the model and guiding the agent's reasoning process.  

- The **Meta-paths**, which define sequences of relations for the agent to follow in the Knowledge Graph, are specified in a separate configuration file. This file, located in the [hopwise/properties/quick_start_config/](https://github.com/tail-unica/hopwise/blob/main/hopwise/properties/quick_start_config/knowledge_base_on_ml-100k.yaml) directory, is passed as an argument (`config_file_list`) to the `run_hopwise()` function.

#### Configuration Python Dictionary

In the configuration Python dictionary, we specify the environment, detailing how to load data (particularly embeddings) and defining the evaluation metrics for the models.

In [None]:
config_dict = {

   #--- General Settings ---#

   #'gpu_id': 0,  # ID of the GPU to use for training (if available)
  'epochs': 1,  # Total number of epochs for training the model
  'show_progress': True,  # Whether to display the progress bar during training and evaluation
  'data_path': '/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/dataset/',  # Path to the dataset,
  'checkpoint_dir':'/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/',  # Directory to save model checkpoints,
  'eval_step': 50,  # Frequency (in steps) to evaluate the model during training



  #--- Dataset Settings ---#

  # Configuration for loading pre-trained embeddings
  # This configuration allows the system to know
  # which columns to read from the .useremb, .entityemb, and .relationemb files.

  'additional_feat_suffix': ['useremb', 'entityemb', 'relationemb'],  # File extensions for pre-trained embeddings
  'load_col': {  # Specifies the columns to load from the embedding files
    'useremb': ['user_embedding_id', 'user_embedding'],  # Columns for user embeddings
    'entityemb': ['entity_embedding_id', 'entity_embedding'],  # Columns for entity embeddings
    'relationemb': ['relation_embedding_id', 'relation_embedding']  # Columns for relation embeddings
  },

  # Mapping configurations for embeddings
  # This configuration is necessary to create a mapping between the identifiers (IDs)
  # and their corresponding embedding vectors, so that the model can correctly access the data.

  'alias_of_user_id': ['user_embedding_id'],  # Mapping between user IDs and user embeddings
  'alias_of_entity_id': ['entity_embedding_id'],  # Mapping between entity IDs and entity embeddings
  'alias_of_relation_id': ['relation_embedding_id'],  # Mapping between relation IDs and relation embeddings



  # Preloading configurations for embeddings
  # This configuration allows the embedding vectors to be preloaded into their respective memory spaces,
  # so that the model can use them directly during training or inference.

  'preload_weight': {
    'user_embedding_id': 'user_embedding',  # Preload user embeddings
    'entity_embedding_id': 'entity_embedding',  # Preload entity embeddings
    'relation_embedding_id': 'relation_embedding'  # Preload relation embeddings
  },


  #--- Metrics Settings ---#

  # Metrics to calculate during evaluation
  'metrics': [
    'NDCG',  # Normalized Discounted Cumulative Gain
    'MRR',  # Mean Reciprocal Rank
    'Hit',  # Hit rate
    'Precision',  # Precision metric
    'Recall',  # Recall metric
  ]
}



📌Note that the *dataset settings* are specified for educational purposes only. In practice, these settings are already defined in the default configuration file [PGPR.yaml](https://github.com/tail-unica/hopwise/blob/main/hopwise/properties/model/PGPR.yaml)
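
To see how `load_col` relates to the embedding files, compare it with their header row: each header column has the form `name:type`, and `load_col` lists the names without the type suffix. The sketch below splits a header in this way. It only illustrates the naming convention visible in this notebook and is not the loader used by hopwise.

```python
header_line = "user_embedding_id:token\tuser_embedding:float_seq"

# Split each column into (field name, field type)
fields = [tuple(col.split(":", 1)) for col in header_line.split("\t")]
print(fields)

# The names listed under 'load_col' are the part before the colon
load_col = {"useremb": ["user_embedding_id", "user_embedding"]}
field_names = [name for name, _ in fields]
print(field_names == load_col["useremb"])
```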

#### Meta-Path

<p style="background-color:#fff1d7; padding:15px;">
The <strong> Meta-paths </strong> guide the agent's reasoning process by defining sequences of relations to follow in the Knowledge Graph.
</p>

The meta-paths are defined in the configuration files located in the [hopwise/properties/quick_start_config/](https://github.com/tail-unica/hopwise/blob/main/hopwise/properties/quick_start_config/knowledge_base_on_ml-100k.yaml) directory. Specifically, for the MovieLens 100k dataset, the meta-paths are specified in the `knowledge_base_on_ml-100k.yaml` file. This file is passed to the `run_hopwise()` function via the `config_file_list` parameter.

The contents of the file are provided as follows:

**File**: `knowledge_base_on_ml-100k.yaml`
```bash
path_constraint: [
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.actor","entity"],["film.actor.film","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.directed_by","entity"],["film.director.film","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.genre","entity"],["film.film_genre.films_in_this_genre","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["[UI-Relation]","entity"], ["[UI-Relation]","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.prequel","entity"], ["film.film.prequel","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.sequel","entity"], ["film.film.sequel","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.actor.film","entity"],["film.actor.film","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film_subject.films","entity"],["film.film_subject.films","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.subjects","entity"],["film.film.subjects","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.rating","entity"],["film.content_rating.film","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.written_by","entity"],["film.writer.film","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.cinematography","entity"],["film.cinematographer.film","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.produced_by","entity"],["film.producer.film","entity"]],
  [[null, 'user'], ["[UI-Relation]","entity"], ["film.film.production_companies","entity"],["film.production_company.films","entity"]],
]
```

**General structure of the meta-path**:

- Each meta-path in this file is of **length 4**, meaning:
  - It involves **4 nodes** (entities)
  - Connected by **3 relations** (edges)

- **Node 1**: Always a **`user`**, representing the starting point of the recommendation reasoning.


- **Relation 1**: A **`[UI-Relation]`**, denoting a **user-item interaction** such as:
  - `"watched"`, `"liked"`, `"rated"`, or the general `"interacted_with"`
  - This relation originates from the **user-item interaction matrix** (e.g., `ml-100k.inter`)


- **Node 2**: An **`entity`**, typically a **movie** (the item that the user interacted with).


- **Relation 2** and **Relation 3**: Domain-specific semantic relations that connect the movie to other related entities, such as:
  - `film.film.actor` → links to an actor in the film
  - `film.film.genre` → links to the genre
  - `film.film.directed_by` → links to the director
  - and other relations like `written_by`, `produced_by`, `rating`, `subject`, etc.


- **Nodes 3 and 4**: Depend on the specific relations used, and may represent:
  - Other movies (e.g., prequels, sequels)
  - Actors, genres, writers, directors, production companies, etc.

**Example**

<img src="https://raw.githubusercontent.com/mallociFrancesca/XAIKGRLGM/a77f9ea5633475efe43038ef2a11e1341342e0ef/hands-on-session/meta-paths.png" alt="Meta-Paths Diagram" width="800" height="350">



**Step-by-step Interpretation:**
- The user interacted with a movie (**Movie_A**).
- From **Movie_A**, the path leads to one of its actors (**Actor_A**), via the `film.film.actor` relation.
- From **Actor_A**, it then leads to another movie (**Movie_B**) that the same actor starred in, via the `film.actor.film` relation.

**Explanation / Reasoning:**
>*"You have seen Movie_A, in which Actor_A starred. I suggest another movie, Movie_B, in which Actor_A also starred."*

This meta-path captures **actor-based collaborative filtering**: recommending a movie based on shared actors with movies the user has already interacted with.
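
A meta-path can be read as a template that a concrete path either follows or does not. The sketch below checks a path against two of the constraints listed in the YAML file. The function `matches_meta_path` and the node names are hypothetical, and how hopwise enforces `path_constraint` internally is not shown here.

```python
# Two of the constraints from the YAML file, written as Python lists
path_constraints = [
    [(None, "user"), ("[UI-Relation]", "entity"),
     ("film.film.actor", "entity"), ("film.actor.film", "entity")],
    [(None, "user"), ("[UI-Relation]", "entity"),
     ("film.film.genre", "entity"), ("film.film_genre.films_in_this_genre", "entity")],
]

def matches_meta_path(path, constraints):
    """Return True if the (relation, node type) sequence of `path` equals one constraint."""
    signature = [(relation, node_type) for relation, node_type, _ in path]
    return any(signature == list(constraint) for constraint in constraints)

# Hypothetical concrete path: (relation, node type, node name)
actor_path = [
    (None, "user", "user_196"),
    ("[UI-Relation]", "entity", "Movie_A"),
    ("film.film.actor", "entity", "Actor_A"),
    ("film.actor.film", "entity", "Movie_B"),
]
# Same nodes, but the relations appear in an order that no constraint allows
invalid_path = [
    (None, "user", "user_196"),
    ("[UI-Relation]", "entity", "Movie_A"),
    ("film.actor.film", "entity", "Actor_A"),
    ("film.film.actor", "entity", "Movie_B"),
]

print(matches_meta_path(actor_path, path_constraints))
print(matches_meta_path(invalid_path, path_constraints))
```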


### Train

Now we are ready to train the model using the following command.

> ⚠️ We already **trained** it. So you **don't need** to run this command now.



In [None]:
run_hopwise(model='PGPR',
            dataset='ml-100k',
            config_dict=config_dict,
            config_file_list=['/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/config/knowledge_base_on_ml-100k.yaml'])

{'best_valid_score': np.float64(0.0983),
 'valid_score_bigger': True,
 'best_valid_result': OrderedDict([('ndcg@10', np.float64(0.0983)),
              ('mrr@10', np.float64(0.1985)),
              ('hit@10', np.float64(0.4666)),
              ('precision@10', np.float64(0.0721)),
              ('recall@10', np.float64(0.0837))]),
 'test_result': OrderedDict([('ndcg@10', np.float64(0.0951)),
              ('mrr@10', np.float64(0.2067)),
              ('hit@10', np.float64(0.4454)),
              ('precision@10', np.float64(0.0681)),
              ('recall@10', np.float64(0.0761))])}

At the end of the training, you should get an output like this:

```bash
{'best_valid_score': 0.0951,
 'valid_score_bigger': True,
 'best_valid_result': OrderedDict([('ndcg@10', 0.0951),
              ('mrr@10', 0.1926),
              ('hit@10', 0.4507),
              ('precision@10', 0.0689),
              ('recall@10', 0.0809)]),
 'test_result': OrderedDict([('ndcg@10', 0.0994),
              ('mrr@10', 0.2096),
              ('hit@10', 0.4571),
              ('precision@10', 0.0679),
              ('recall@10', 0.0839)])}
```


💾 **Saving Model Outputs**

By default, the trained model checkpoint will be saved in the `hopwise/saved` directory as a `.pth` file. The filename includes the model name and a timestamp, following this format:

`PGPR-Month-day-year_timestamp.pth`
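
If you need to sort or filter checkpoints by date, the timestamp can be parsed from the filename. The sketch below assumes the pattern seen in the checkpoint names used in this notebook (for example `PGPR-Jul-08-2025_19-52-14.pth`). The format string and the helper `parse_checkpoint_name` are inferred for illustration and are not part of hopwise.

```python
from datetime import datetime

# Pattern inferred from the filenames in this notebook; treat it as an assumption
TIMESTAMP_FORMAT = "%b-%d-%Y_%H-%M-%S"

def parse_checkpoint_name(filename):
    """Split '<model>-<timestamp>.pth' into the model name and a datetime."""
    stem = filename.removesuffix(".pth")  # requires Python 3.9+
    model_name, timestamp = stem.split("-", 1)
    return model_name, datetime.strptime(timestamp, TIMESTAMP_FORMAT)

model_name, saved_at = parse_checkpoint_name("PGPR-Jul-08-2025_19-52-14.pth")
print(model_name, saved_at.isoformat())
```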




If you need to change the output directory, you have two options:

1. **Override it dynamically** using the `checkpoint_dir` parameter in your configuration Python dictionary.
2. **Modify the default path** in the global config file:  
   📄 [`hopwise/properties/overall.yaml`](https://github.com/tail-unica/hopwise/blob/main/hopwise/properties/overall.yaml)


## 4️⃣ Evaluation

To evaluate the model with different metrics, execute the `run_hopwise()` function and specify the saved checkpoint.

In [None]:
model = 'PGPR'
dataset = 'ml-100k'
pgpr_checkpoint = '/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/PGPR-Jul-08-2025_19-52-14.pth'

In [None]:
# Configuration
config_dict = {

    #--- General Settings ---#
     #'gpu_id': 0,
    'show_progress': True,
    'data_path': '/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/dataset/',  # Path to the dataset,
    'checkpoint_dir':'/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/',  # Directory to save model checkpoints,

    #--- Metrics Settings ---#
    'metrics': [
        'NDCG',
        'MRR',
        'Hit',
        'Precision',
        'Recall',
    ]
}

Run the evaluation

In [None]:
run_hopwise(model=model,
            run='evaluate',
            checkpoint=pgpr_checkpoint,
            dataset=dataset,
            config_dict=config_dict)

{'best_valid_score': np.float64(0.0898),
 'valid_score_bigger': True,
 'best_valid_result': OrderedDict([('ndcg@10', np.float64(0.0898)),
              ('mrr@10', np.float64(0.1847)),
              ('hit@10', np.float64(0.4284)),
              ('precision@10', np.float64(0.0648)),
              ('recall@10', np.float64(0.0816))]),
 'test_result': OrderedDict([('ndcg@10', np.float64(0.1032)),
              ('mrr@10', np.float64(0.2091)),
              ('hit@10', np.float64(0.4401)),
              ('precision@10', np.float64(0.0701)),
              ('recall@10', np.float64(0.0846))])}

From the output you should obtain the evaluation metrics like this:
```bash
{'best_valid_score': 0.0987,
 'valid_score_bigger': True,
 'best_valid_result': OrderedDict([('ndcg@10', 0.0987),
              ('mrr@10', 0.2062),
              ('hit@10', 0.456),
              ('precision@10', 0.0701),
              ('recall@10', 0.0804)]),
 'test_result': OrderedDict([('ndcg@10', 0.0996),
              ('mrr@10', 0.2046),
              ('hit@10', 0.4433),
              ('precision@10', 0.0685),
              ('recall@10', 0.0817)])}
```

If you're interested in learning more about metrics, you can review how they work in [this PDF](https://drive.google.com/file/d/1LYGGCEhtGd0uWOShA1bFKqY_j-YOo0vf/view?usp=sharing).
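
For a quick feel of what these numbers measure, the sketch below computes the five metrics for a single user with binary relevance, using their standard textbook definitions. The recommendation list and the held-out items are made up. The reported scores are averages over all users, and hopwise's implementation may differ in details.

```python
import math

def ranking_metrics(ranked_items, relevant_items, k=10):
    """Compute standard top-k metrics for a single user (binary relevance)."""
    top_k = ranked_items[:k]
    hits = [1 if item in relevant_items else 0 for item in top_k]
    n_hits = sum(hits)

    # Reciprocal rank of the first relevant item (0 if none is in the top k)
    first_hit = next((rank for rank, h in enumerate(hits, start=1) if h), None)
    rr = 1.0 / first_hit if first_hit else 0.0

    # DCG with binary relevance; the ideal DCG puts all relevant items first
    dcg = sum(h / math.log2(rank + 1) for rank, h in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))

    return {
        "hit": 1.0 if n_hits > 0 else 0.0,
        "precision": n_hits / k,
        "recall": n_hits / len(relevant_items) if relevant_items else 0.0,
        "mrr": rr,
        "ndcg": dcg / idcg if idcg > 0 else 0.0,
    }

# Made-up example: 5 recommended items, 2 of the user's 3 held-out items appear
metrics = ranking_metrics(["A", "B", "C", "D", "E"], {"B", "E", "Z"}, k=5)
print(metrics)
```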


## 5️⃣ Generate Explanations

This section generates explanations for the recommended items. The explanations are **template-based** and are derived from the meta-paths used by the model during inference.

Steps:

* 1️⃣ Load the trained model
* 2️⃣ Define target users
* 3️⃣ Generate recommendations
* 4️⃣ Generate textual explanations

#### 1️⃣ Load the trained model

In [None]:
from hopwise.quick_start import load_data_and_model
from hopwise.utils.case_study import full_sort_explanations
import pandas as pd
pd.set_option('display.max_colwidth', None) # visualize full pandas dataframe output
import os

In [None]:
# Define model checkpoint
checkpoint = '/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/PGPR-Jul-08-2025_19-52-14.pth'

In [None]:
# Load model checkpoint with its configuration
config, model, dataset, train_data, valid_data, test_data = load_data_and_model(model_file=checkpoint)



#### 2️⃣ Define target users

We will generate recommendations for 5 users.

In [None]:
# Map user tokens (as represented in the original dataset) to the internal user IDs used by the model

# Example: convert only two specific user tokens to internal IDs
# uid_series = dataset.token2id(dataset.uid_field, ["196", "186"])

# Example: convert the first 5 internal user IDs (1 to 5) to their original tokens
user_tokens = [dataset.id2token(dataset.uid_field, i) for i in range(1, 6)]
print(user_tokens)  # Display the list of user tokens

# Convert the list of user tokens back into the internal user ID format required by the model
# Remember: Hopwise maps the original IDs (called tokens) to a new set of internal IDs, which range from 1 to N.

uid_series = dataset.token2id(dataset.uid_field, user_tokens)
uid_series  # Display the resulting internal user IDs


[np.str_('196'), np.str_('186'), np.str_('22'), np.str_('244'), np.str_('166')]


array([1, 2, 3, 4, 5])

#### 3️⃣ Generate recommendations

In [None]:
# Run inference to get top-N item recommendations and their meta-path for each user
# Setting explain=True returns the reasoning paths used by the model

#  Output:
#  A pandas DataFrame with columns:
#     'user'      → internal user ID
#     'item'      → recommended item ID
#     'score'     → relevance score for the recommendation
#     'path'      → list representing the reasoning path in the KG (e.g., [(self_loop, user, 1), (25, entity, 514), ...])

explanations = full_sort_explanations(
    uid_series,
    model,
    test_data,
    device=config["device"]
)


In [None]:
explanations.head(5)

Unnamed: 0,user,item,score,path
0,1,24,0.480103,"[(self_loop, user, 1), (25, entity, 377), (25, user, 504), (25, entity, 24)]"
1,1,216,0.481138,"[(self_loop, user, 1), (25, entity, 916), (25, user, 838), (25, entity, 216)]"
2,1,175,0.483854,"[(self_loop, user, 1), (25, entity, 374), (25, user, 428), (25, entity, 175)]"
3,1,345,0.487998,"[(self_loop, user, 1), (25, entity, 220), (25, user, 375), (25, entity, 345)]"
4,1,253,0.491865,"[(self_loop, user, 1), (25, entity, 93), (25, user, 342), (25, entity, 253)]"


The explanations follow the format adopted by PGPR authors, consisting of a list of _hops_ made by the agent to traverse the KG.  
Each _hop_ is a triple `(relation, entity type, entity ID)`, where:
- `relation` represents the relation incident to the node in the current hop. As the first hop always starts from a user, `relation = "self_loop"`, otherwise `relation` is the ID of the corresponding KG relation
- `entity type` denotes the type of entity reached by the current hop. PGPR adopts only `"user"` and `"entity"` as entity types to distinguish between user nodes and entity nodes (which include both items and other entities)
- `entity ID` is the ID of the entity node reached by the current hop. Based on `entity type` it denotes the ID of a user or entity node

Example:

```python
example_explanation = [
    ("self_loop", "user", 1),  # hop 0 (fake hop): denotes that the path starts from a user with ID 1
    (25, "entity", 427),       # hop 1: user 1 (from hop 0) is connected to the entity with ID 427 by the relation with ID 25
    (25, "user", 803),         # hop 2: entity 427 (from hop 1) is connected to the user with ID 803 by the relation with ID 25
    (25, "entity", 246)        # hop 3: user 803 (from hop 2) is connected to the entity with ID 246 by the relation with ID 25
]

# user_1 --relation_25--> entity_427 --relation_25--> user_803 --relation_25--> entity_246
```
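
The arrow notation above can be produced from a list of hops with a few lines of code. The helper `render_path` below is a hypothetical illustration and not a hopwise function. It assumes hops are `(relation, entity type, entity ID)` tuples as described in this section.

```python
def render_path(hops):
    """Render a list of (relation, entity type, entity ID) hops as an arrow string."""
    # Hop 0 only names the starting node; its relation ("self_loop") is not drawn
    _, start_type, start_id = hops[0]
    text = f"{start_type}_{start_id}"
    for relation, entity_type, entity_id in hops[1:]:
        text += f" --relation_{relation}--> {entity_type}_{entity_id}"
    return text

example_hops = [
    ("self_loop", "user", 1),
    (25, "entity", 427),
    (25, "user", 803),
    (25, "entity", 246),
]
print(render_path(example_hops))
```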

**Visualize top-N recommended items for each user**

In this case `N=10`

In [None]:
# Load items data such as movie name, type of film, release year, etc.
items_data = pd.read_csv(os.path.join(config['data_path'],f"{config['dataset']}.item"), sep="\t")

# we map back the entities and users to their original values in the dataset
# we put the type before the index to better visualize the explanation
def eid2entity(x): return f'entity {dataset.id2token(dataset.entity_field, x)}'
def uid2user(x): return f'user {dataset.id2token(dataset.uid_field, x)}'
def rid2relation(x): return 'watched' if dataset.id2token(dataset.relation_field,
                                                          x) == dataset.ui_relation else dataset.id2token(dataset.relation_field, x)
def iid2movie_name(x): return items_data[items_data['item_id:token']== x]['movie_title:token_seq'].iloc[0]



# Converts a list of internal item IDs to their corresponding movie names
def get_movie_names(ids):
    names = []
    for internal_id in ids:
        name = iid2movie_name(internal_id)
        names.append(name)
    return names


# Groups the explanations DataFrame by user and collects the list of recommended item IDs for each user
topk = explanations.groupby('user')['item'].apply(list).reset_index()
# Converts the internal item IDs to movie names for each user
topk['item_names'] = topk['item'].apply(get_movie_names)


# Displays the resulting table with users and their recommended item names
display(topk)

Unnamed: 0,user,item,item_names
0,1,"[24, 216, 175, 345, 253, 208, 640, 364, 50, 354]","[Rumble in the Bronx, When Harry Met Sally..., Brazil, Deconstructing Harry, Pillow Book, The, Young Frankenstein, Cook the Thief His Wife & Her Lover, The, Ace Ventura: When Nature Calls, Star Wars, Wedding Singer, The]"
1,2,"[255, 611, 700, 361, 345, 166, 640, 354, 157, 288]","[My Best Friend's Wedding, Laura, Miami Rhapsody, Incognito, Deconstructing Harry, Manon of the Spring (Manon des sources), Cook the Thief His Wife & Her Lover, The, Wedding Singer, The, Platoon, Scream]"
2,3,"[174, 238, 2, 635, 25, 215, 700, 11, 640, 161]","[Raiders of the Lost Ark, Raising Arizona, GoldenEye, Fog, The, Birdcage, The, Field of Dreams, Miami Rhapsody, Seven (Se7en), Cook the Thief His Wife & Her Lover, The, Top Gun]"
3,4,"[450, 220, 191, 2, 156, 345, 253, 640, 700, 288]","[Star Trek V: The Final Frontier, Mirror Has Two Faces, The, Amadeus, GoldenEye, Reservoir Dogs, Deconstructing Harry, Pillow Book, The, Cook the Thief His Wife & Her Lover, The, Miami Rhapsody, Scream]"
4,5,"[174, 182, 44, 2, 352, 246, 319, 216, 700, 11]","[Raiders of the Lost Ark, GoodFellas, Dolores Claiborne, GoldenEye, Spice World, Chasing Amy, Everyone Says I Love You, When Harry Met Sally..., Miami Rhapsody, Seven (Se7en)]"
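
For readers who want to see the grouping logic without pandas, here is a rough plain-Python equivalent of the `groupby(...).apply(list)` call, using the five rows for user 1 shown in the explanations table. The `rows` list and the `top_n_per_user` helper are illustrative names, and ranking by descending score is an assumption about how one might order the items, not something the notebook itself does.

```python
from collections import defaultdict

# (user, item, score) rows copied from the explanations table for user 1
rows = [
    (1, 24, 0.480103),
    (1, 216, 0.481138),
    (1, 175, 0.483854),
    (1, 345, 0.487998),
    (1, 253, 0.491865),
]


def top_n_per_user(rows, n=10):
    """Group rows by user and keep the n highest-scoring items for each user."""
    by_user = defaultdict(list)
    for user, item, score in rows:
        by_user[user].append((score, item))
    return {
        user: [item for _, item in sorted(scored, reverse=True)[:n]]
        for user, scored in by_user.items()
    }


print(top_n_per_user(rows, n=3))  # {1: [253, 345, 175]}
```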


#### 4️⃣ Generate textual explanations

We define the `generate_explanation()` function to map meta-paths to their corresponding explanation templates.

In [None]:

# Mapping node types ('user', 'entity', 'item') and 'relation' to decoding functions.
# These convert internal dataset IDs into human-readable values.
e_type2mapping = {
    'user': uid2user,           # user ID to user name
    'entity': eid2entity,       # entity ID to entity name
    'item': iid2movie_name,     # item ID to movie title
    'relation': rid2relation    # relation ID to readable relation name
}


# Default explanation template for the meta-path [[null, 'user'], ["[UI-Relation]","entity"], ["[UI-Relation]","entity"], ["[UI-Relation]","entity"]]
default_template = "{item} is recommended to you because you {relation1} {entity1} also {relation2} by {entity2}"

# Mapping each meta-paths to custom explanation templates
relation2template = {
    "film.film.prequel__film.film.prequel": "I recommend you {item} because it is a prequel of a prequel of a film you liked ({ref}).",
    "film.film.sequel__film.film.sequel": "I recommend you {item} because it is a sequel of a sequel of a film you liked ({ref}).",
    "film.film.actor__film.actor.film": "I recommend you {item} because it stars an actor from a film you liked ({ref}).",
    "film.actor.film__film.actor.film": "I recommend you {item} because it features an actor who worked in multiple films you've seen ({ref}).",
    "film.film_subject.films__film.film_subject.films": "I recommend you {item} because it shares the same topic/subject with a film you liked ({ref}).",
    "film.film.subjects__film.film.subjects": "I recommend you {item} because it covers similar themes to a film you liked ({ref}).",
    "film.film.rating__film.content_rating.film": "I recommend you {item} because it has the same content rating as a film you liked ({ref}).",
    "film.film.genre__film.film_genre.films_in_this_genre": "I recommend you {item} because it belongs to the same genre as a film you liked ({ref}).",
    "film.film.written_by__film.writer.film": "I recommend you {item} because it was written by the same writer as a film you liked ({ref}).",
    "film.film.directed_by__film.director.film": "I recommend you {item} because it was directed by the same person as a film you liked ({ref}).",
    "film.film.cinematography__film.cinematographer.film": "I recommend you {item} because it was shot by the same cinematographer as a film you liked ({ref}).",
    "film.film.produced_by__film.producer.film": "I recommend you {item} because it was produced by the same producer as a film you liked ({ref}).",
    "film.film.production_companies__film.production_company.films": "I recommend you {item} because it was made by the same production company as a film you liked ({ref})."
}



# Function to generate a natural language explanation for a recommendation
def generate_explanation(row):
    path = row["path"]
    readable = []

    # Decode all elements of the path (users, entities, relations) to readable names
    for hop in path:
        relation_id, entity_type, entity_id = hop

        # Decode relation
        if relation_id != "self_loop":
            decode_fn = e_type2mapping.get("relation")
            decoded = decode_fn(relation_id) if decode_fn else f"relation{relation_id}"
            readable.append(decoded)

        # Decode other entity (user, item, other entity)
        if entity_type == "entity" and dataset.id2token(dataset.entity_field, entity_id) in dataset.entity2item:
            entity_type = "item"
        decode_fn = e_type2mapping.get(entity_type)
        decoded = decode_fn(entity_id) if decode_fn else f"{entity_type}{entity_id}"
        readable.append(decoded)

    # Extract the relations from the path
    relation_tokens = [dataset.id2token(dataset.relation_field, hop[0]) for hop in path if hop[0] != "self_loop"]

    # Use the default template if every relation in the path is the user-item relation
    if all(r == dataset.ui_relation for r in relation_tokens):
        return default_template.format(
            item=readable[-1],
            relation1="watched",
            entity1=readable[2],
            relation2="watched",
            entity2=readable[4]
        )

    # Build the lookup key from the KG relations, skipping the user-item relation
    key = "__".join([r for r in relation_tokens if r != dataset.ui_relation])

    # Get the corresponding template from the mapping
    template = relation2template.get(key)

    # If a custom template is found, use it;
    # otherwise, fall back to the default template
    if template:

        # Debug output for paths that match a custom (non-default) template
        print("NON STANDARD TEMPLATE")
        print(f"User: {row['user']}")
        print(f"Item: {row['item']}")
        print(f"Path: {row['path']}")
        print(f"Relations: {relation_tokens}")
        print(f"Matched key: {key}")
        print("Full row:")
        print(row)
        print("-" * 60)

        return template.format(
            item=readable[-1],
            ref=readable[2]
        )
    else:
        # Fall back to default template if no specific template matches
        return default_template.format(
            item=readable[-1],
            relation1="watched",
            entity1=readable[2],
            relation2=relation_tokens[-1],
            entity2=readable[4]
        )



We apply the `generate_explanation()` function to the meta-paths stored in the `explanations` variable to generate and visualize corresponding explanations in natural language format.

In [None]:
# Construct the final DataFrame with readable values and explanations
user_ids = []
user_names = []
item_ids = []
item_names = []
scores = []
paths = []
readable_paths = []
explanations_col = []

# Iterate over each explanation and populate the final DataFrame rows
for idx, path in enumerate(explanations['path']):
    self_loop, hop1, hop2, hop3 = path
    _, _, user = self_loop
    relation1, type1, entity1 = hop1
    relation2, type2, entity2 = hop2
    relation3, _, item = hop3

    # Raw IDs
    user_id = user
    item_id = item

    # Decode readable values.
    # A node is decoded as an item only if its hop type is "entity" and it maps to an item;
    # otherwise its own hop type ("user" or "entity") selects the decoder.
    user_val = e_type2mapping["user"](user)
    relation1_val = e_type2mapping["relation"](relation1)
    is_item = type1 == "entity" and dataset.id2token(dataset.entity_field, entity1) in dataset.entity2item
    entity1_val = e_type2mapping["item" if is_item else type1](entity1)
    relation2_val = e_type2mapping["relation"](relation2)
    is_item = type2 == "entity" and dataset.id2token(dataset.entity_field, entity2) in dataset.entity2item
    entity2_val = e_type2mapping["item" if is_item else type2](entity2)
    relation3_val = e_type2mapping["relation"](relation3)
    item_val = e_type2mapping["item"](item)

    # Construct readable path string
    readable_path = f"{user_val} --{relation1_val}--> {entity1_val} --{relation2_val}--> {entity2_val} --{relation3_val}--> {item_val}"

    # Create data row for explanation generation
    row_data = {
        "user": user,
        "item": item,
        "path": path
    }

    # Generate natural language explanation
    explanation = generate_explanation(row_data)

    # Get recommendation score
    score = explanations.loc[idx, 'score']

    # Append all fields to their respective lists
    user_ids.append(user_id)
    user_names.append(user_val)
    item_ids.append(item_id)
    item_names.append(item_val)
    scores.append(score)
    paths.append(path)
    readable_paths.append(readable_path)
    explanations_col.append(explanation)

# Create the final enriched DataFrame
final_explanations = pd.DataFrame({
    'user name': user_names,
    'Recommended item (name)': item_names,
    'user ID': user_ids,
    'Recommended item (ID)': item_ids,
    'score': scores,
    'path': paths,
    'readable path': readable_paths,
    'explanation': explanations_col
})

# Uncomment this line to save the DataFrame to a CSV file
#final_explanations.to_csv('finalexplanations.csv', sep=";", index=False)


NON STANDARD TEMPLATE
User: 2
Item: 640
Path: [('self_loop', 'user', 2), (np.int64(25), 'entity', np.int64(437)), (np.int64(13), 'entity', np.int64(3596)), (np.int64(5), 'entity', np.int64(640))]
Relations: [np.str_('[UI-Relation]'), np.str_('film.film.written_by'), np.str_('film.writer.film')]
Key not found: film.film.written_by__film.writer.film
Full row:
{'user': 2, 'item': np.int64(640), 'path': [('self_loop', 'user', 2), (np.int64(25), 'entity', np.int64(437)), (np.int64(13), 'entity', np.int64(3596)), (np.int64(5), 'entity', np.int64(640))]}
------------------------------------------------------------
NON STANDARD TEMPLATE
User: 4
Item: 2
Path: [('self_loop', 'user', 4), (np.int64(25), 'entity', np.int64(454)), (np.int64(10), 'entity', np.int64(3464)), (np.int64(10), 'entity', np.int64(2))]
Relations: [np.str_('[UI-Relation]'), np.str_('film.film.subjects'), np.str_('film.film.subjects')]
Key not found: film.film.subjects__film.film.subjects
Full row:
{'user': 4, 'item': np.int64

<div style="background-color:#fff6ff; padding:15px; border-width:3px; border-color:#efe6ef; border-style:solid; border-radius:6px">
  <b>📌⚠️ </b>  <strong>Remember:</strong><br><br>

  <strong>The PGPR model</strong> uses the <strong>meta-paths provided during training</strong> as guidance to learn how to navigate the knowledge graph and generate recommendations.

  <ul>
    <li>The <strong>meta-paths</strong> are not strict constraints; rather, they serve as <strong>examples of semantically meaningful paths</strong> that connect users and items in the graph.</li>
    <li>During training, the model learns to <strong>recognize and prefer path structures</strong> that resemble those provided in the training data.</li>
    <li>However, the model is <strong>not required</strong> to follow the exact meta-paths during inference.</li>
  </ul>

  <blockquote>
    This means that the model can <strong>explore and generate alternative reasoning paths</strong> that were not explicitly seen during training.
  </blockquote>

  <p>In practice:</p>
  <ul>
    <li>Paths that <strong>align with the structure of the training meta-paths</strong> are more likely to be selected as recommendation candidates.</li>
    <li>At the same time, the model keeps the ability to <strong>generalize and propose novel paths</strong> that are still valid within the context of the graph.</li>
  </ul>

  <blockquote>
    Under the title <code>NON STANDARD TEMPLATE</code> in the output, you can see the paths that <strong>do not follow the default user-item-user-item meta-path</strong>. Their explanations are generated with the <strong>matching custom template</strong> from <code>relation2template</code> instead of the default one.
  </blockquote>
</div>
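
The template lookup used by `generate_explanation()` boils down to building a key from the KG relations along a path and looking it up in a dictionary. Below is a stripped-down sketch of that idea using plain strings; `UI_RELATION`, `TEMPLATES`, and `pick_template` are illustrative stand-ins (not hopwise APIs), and only one custom template is included.

```python
UI_RELATION = "[UI-Relation]"  # stand-in for dataset.ui_relation

DEFAULT_TEMPLATE = "default"
TEMPLATES = {
    "film.film.written_by__film.writer.film": "same writer",
}


def pick_template(relations):
    """Return (key, template) for the relation tokens of one path."""
    # The user-item relation is dropped; the remaining relations form the key
    key = "__".join(r for r in relations if r != UI_RELATION)
    return key, TEMPLATES.get(key, DEFAULT_TEMPLATE)


# Collaborative path: only user-item relations, so the key is empty
print(pick_template([UI_RELATION, UI_RELATION, UI_RELATION]))
# Path through a writer, as in the debug output for user 2
print(pick_template([UI_RELATION, "film.film.written_by", "film.writer.film"]))
# A relation pair without a custom template falls back to the default
print(pick_template([UI_RELATION, "film.film.genre", "film.film.genre"]))
```

Using `dict.get` with a default keeps the fallback in one place, so a path the policy discovers on its own still receives an explanation even when no custom template was written for it.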


**We are finally ready to read the explanations 😊**

In [None]:
final_explanations.head(5)

Unnamed: 0,user name,Recommended item (name),user ID,Recommended item (ID),score,path,readable path,explanation
0,user 196,Rumble in the Bronx,1,24,0.480103,"[(self_loop, user, 1), (25, entity, 377), (25, user, 504), (25, entity, 24)]",user 196 --watched--> Heavyweights --watched--> Bonnie and Clyde --watched--> Rumble in the Bronx,Rumble in the Bronx is recommend to you because you watched Heavyweights also watched by user 507
1,user 196,When Harry Met Sally...,1,216,0.481138,"[(self_loop, user, 1), (25, entity, 916), (25, user, 838), (25, entity, 216)]",user 196 --watched--> Lost in Space --watched--> In the Line of Duty 2 --watched--> When Harry Met Sally...,When Harry Met Sally... is recommend to you because you watched Lost in Space also watched by user 846
2,user 196,Brazil,1,175,0.483854,"[(self_loop, user, 1), (25, entity, 374), (25, user, 428), (25, entity, 175)]",user 196 --watched--> Mighty Morphin Power Rangers: The Movie --watched--> Harold and Maude --watched--> Brazil,Brazil is recommend to you because you watched Mighty Morphin Power Rangers: The Movie also watched by user 435
3,user 196,Deconstructing Harry,1,345,0.487998,"[(self_loop, user, 1), (25, entity, 220), (25, user, 375), (25, entity, 345)]","user 196 --watched--> Mirror Has Two Faces, The --watched--> Showgirls --watched--> Deconstructing Harry","Deconstructing Harry is recommend to you because you watched Mirror Has Two Faces, The also watched by user 379"
4,user 196,"Pillow Book, The",1,253,0.491865,"[(self_loop, user, 1), (25, entity, 93), (25, user, 342), (25, entity, 253)]","user 196 --watched--> Welcome to the Dollhouse --watched--> Man Who Knew Too Little, The --watched--> Pillow Book, The","Pillow Book, The is recommend to you because you watched Welcome to the Dollhouse also watched by user 345"


**Visualizing Explanations for Specific Users**

This section filters the final recommendations to show only those related to a specific user, identified by their user ID.

First, it displays the filtered recommendations as a **pandas DataFrame** for a quick overview.

Then, it iterates through each recommendation and prints them as a **list**.

Both views present the same reasoning behind each suggestion; they differ only in reading format.

**Pandas Dataframe Visualization**

In [None]:
# Filter the final_explanations DataFrame to retrieve only the rows where the user ID is 2
user_rows = final_explanations[final_explanations["user ID"] == 2]

# Display the filtered rows for user 2
user_rows

Unnamed: 0,user name,Recommended item (name),user ID,Recommended item (ID),score,path,readable path,explanation
10,user 186,My Best Friend's Wedding,2,255,0.460147,"[(self_loop, user, 2), (25, entity, 395), (25, user, 480), (25, entity, 255)]",user 186 --watched--> Robin Hood: Men in Tights --watched--> North by Northwest --watched--> My Best Friend's Wedding,My Best Friend's Wedding is recommend to you because you watched Robin Hood: Men in Tights also watched by user 487
11,user 186,Laura,2,611,0.460849,"[(self_loop, user, 2), (25, entity, 469), (25, user, 252), (25, entity, 611)]","user 186 --watched--> Short Cuts --watched--> Lost World: Jurassic Park, The --watched--> Laura",Laura is recommend to you because you watched Short Cuts also watched by user 106
12,user 186,Miami Rhapsody,2,700,0.469396,"[(self_loop, user, 2), (25, entity, 469), (25, user, 793), (25, entity, 700)]",user 186 --watched--> Short Cuts --watched--> Crooklyn --watched--> Miami Rhapsody,Miami Rhapsody is recommend to you because you watched Short Cuts also watched by user 796
13,user 186,Incognito,2,361,0.471615,"[(self_loop, user, 2), (25, entity, 575), (25, user, 492), (25, entity, 361)]",user 186 --watched--> City Slickers II: The Legend of Curly's Gold --watched--> East of Eden --watched--> Incognito,Incognito is recommend to you because you watched City Slickers II: The Legend of Curly's Gold also watched by user 488
14,user 186,Deconstructing Harry,2,345,0.474656,"[(self_loop, user, 2), (25, entity, 145), (25, user, 729), (25, entity, 345)]","user 186 --watched--> Lawnmower Man, The --watched--> Nell --watched--> Deconstructing Harry","Deconstructing Harry is recommend to you because you watched Lawnmower Man, The also watched by user 738"
15,user 186,Manon of the Spring (Manon des sources),2,166,0.474782,"[(self_loop, user, 2), (25, entity, 728), (25, user, 651), (25, entity, 166)]",user 186 --watched--> Junior --watched--> Glory --watched--> Manon of the Spring (Manon des sources),Manon of the Spring (Manon des sources) is recommend to you because you watched Junior also watched by user 655
16,user 186,"Cook the Thief His Wife & Her Lover, The",2,640,0.479939,"[(self_loop, user, 2), (25, entity, 437), (13, entity, 3596), (5, entity, 640)]","user 186 --watched--> Amityville 1992: It's About Time --film.film.written_by--> entity m.06nwyw --film.writer.film--> Cook the Thief His Wife & Her Lover, The","I recommend you Cook the Thief His Wife & Her Lover, The because it was written by the same writer as a film you liked (Amityville 1992: It's About Time)."
17,user 186,"Wedding Singer, The",2,354,0.490425,"[(self_loop, user, 2), (25, entity, 395), (25, user, 352), (25, entity, 354)]","user 186 --watched--> Robin Hood: Men in Tights --watched--> Spice World --watched--> Wedding Singer, The","Wedding Singer, The is recommend to you because you watched Robin Hood: Men in Tights also watched by user 352"
18,user 186,Platoon,2,157,0.490767,"[(self_loop, user, 2), (25, entity, 201), (25, user, 248), (25, entity, 157)]",user 186 --watched--> Evil Dead II --watched--> Grosse Pointe Blank --watched--> Platoon,Platoon is recommend to you because you watched Evil Dead II also watched by user 53
19,user 186,Scream,2,288,0.507487,"[(self_loop, user, 2), (25, entity, 871), (25, user, 826), (25, entity, 288)]","user 186 --watched--> Vegas Vacation --watched--> Phantom, The --watched--> Scream",Scream is recommend to you because you watched Vegas Vacation also watched by user 835


**List Visualization**

In [None]:
# Set the internal ID of the user whose recommendations you want to inspect
target_user_id = 2   # Must be an integer from the 'user ID' column, e.g., 1

# Filter the DataFrame to include only the rows corresponding to the selected user
user_recs = final_explanations[final_explanations['user ID'] == target_user_id]

# Determine how many recommendations to print (currently prints all for the user)
# You can limit the output by using: n = min(3, len(user_recs))
n = len(user_recs)

# Loop through the selected user's recommendations and print details for each
for i in range(n):
    row = user_recs.iloc[i]

    print(f"--- {i+1}# recommendation for user {row['user name']} ---\n")
    print(f"user ID: {row['user ID']}")
    print(f"user name: {row['user name']} \n")
    print(f"Recommended item (ID): {row['Recommended item (ID)']}")
    print(f"Recommended item (name): {row['Recommended item (name)']}\n")
    print(f"score: {row['score']} \n")
    print(f"path: {row['path']}")
    print("readable path:")
    print(row['readable path'])
    print("explanation:")
    print(row['explanation'])
    print("\n----------\n")


--- 1# recommendation for user user 186 ---

user ID: 2
user name: user 186 

Recommended item (ID): 255
Recommended item (name): My Best Friend's Wedding

score: 0.4601468005226948 

path: [('self_loop', 'user', 2), (np.int64(25), 'entity', np.int64(395)), (np.int64(25), 'user', np.int64(480)), (np.int64(25), 'entity', np.int64(255))]
readable path:
user 186 --watched--> Robin Hood: Men in Tights --watched--> North by Northwest --watched--> My Best Friend's Wedding
explanation:
My Best Friend's Wedding is recommend to you because you watched Robin Hood: Men in Tights also watched by user 487

----------

--- 2# recommendation for user user 186 ---

user ID: 2
user name: user 186 

Recommended item (ID): 611
Recommended item (name): Laura

score: 0.4608488254782428 

path: [('self_loop', 'user', 2), (np.int64(25), 'entity', np.int64(469)), (np.int64(25), 'user', np.int64(252)), (np.int64(25), 'entity', np.int64(611))]
readable path:
user 186 --watched--> Short Cuts --watched--> Lost Wo

# 📄 Copyright Notice

This notebook was authored by [**Francesca Maridina Malloci**](https://www.linkedin.com/in/francescamalloci/) for the course  
**"Explainable Artificial Intelligence over Knowledge Graphs: from Reinforcement Learning to Generative Modeling"**,  
held at **Boise State University**, from **July 21 to July 25, 2025**.

The material is released exclusively as part of this course to support student learning and study.  
All content, including text and original figures, is protected by copyright and may not be reproduced,  
distributed, or used without the explicit written permission of the author.

Some images were created by the author. Others, sourced from academic literature or online,  
are properly cited and credited within the notebook.

If you find this material useful for your research, you are kindly invited to cite a related publication:

> Balloccu, G., Boratto, L., Fenu, G., Malloci, F. M., & Marras, M. (2024, March).  
> *Explainable recommender systems with knowledge graphs and language models*.  
> In *European Conference on Information Retrieval* (pp. 352–357). Cham: Springer Nature Switzerland.


# References

<a name="r1">[1]</a> Yikun Xian, Zuohui Fu, S. Muthukrishnan, Gerard de Melo, Yongfeng Zhang: Reinforcement Knowledge Graph Reasoning for Explainable Recommendation. SIGIR 2019: 285-294.