# PEARLM: Overall Explainable Pipeline

In this notebook, you will learn how to train and evaluate the PEARLM model [[1]](#r1).

You will:
* 1️⃣ Train the model;
* 2️⃣ Evaluate the model;
* 3️⃣ Generate recommendations and explanations.

## ⚙️ Setup Workspace

1. Import the module needed to access Google Drive from Colab and mount your Google Drive in the Colab environment. This lets you access the files and folders stored in your Google Drive.

In [15]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


2. Install the `hopwise` library:

In [16]:
%%capture
!uv pip install hopwise[pathlm,cli]
# !uv pip install transformers==4.53.2

3. To view the installed libraries in the right sidebar, run the following command:

In [None]:
!ln -s /usr/local/lib/python3.11/dist-packages /content/dist-packages

4. To check if you are using the GPU, run the following code:

In [None]:
import torch
if torch.cuda.is_available():
    device_id = torch.cuda.current_device()
    device_name = torch.cuda.get_device_name(device_id)
    print(f"CUDA Device ID: {device_id}")
    print(f"CUDA Device Name: {device_name}")
else:
    print("No CUDA device is available.")

CUDA Device ID: 0
CUDA Device Name: Tesla T4


## 📝 Introduction

<div style="background-color:#f0f4f8; border-left: 5px solid #4a90e2; padding:15px; margin:10px 0; border-radius:8px;">
<strong>PEARLM</strong> is a path-language-modeling recommender. It learns sequences of entity-relation triplets from a knowledge graph by treating path generation as a next-token prediction task.</div>

<img src="https://raw.githubusercontent.com/mallociFrancesca/XAIKGRLGM/a77f9ea5633475efe43038ef2a11e1341342e0ef/hands-on-session/pearlm-arch.png" alt="PEARLM Architecture" width="900" height="200">

## 📦 Packages

In [None]:
import os
import pandas as pd
import torch
from hopwise.quick_start import run_hopwise

## 1️⃣ Training

To train the model, we first need to define the configuration parameters for both the model architecture and the training pipeline.

Using Hopwise, you can configure the model in one of the following ways:

- **Python Dictionary configuration**: provide the settings directly using a Python dictionary.
- **YAML configuration**: load settings from a `.yaml` file.
- **Default configuration**: use the predefined configuration available at: [hopwise/properties/model/PEARLM.yaml](https://github.com/tail-unica/hopwise/blob/main/hopwise/properties/model/PEARLM.yaml)

**Configuration**

In [None]:
# Python Dictionary
config_dict = {

    #--- General Settings ---#
    'epochs': 1,  # Total number of epochs for training the model
    'show_progress': True,  # Whether to display the progress bar during training and evaluation
    'data_path': '/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/dataset/',  # Path to the dataset
    'checkpoint_dir':'/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/',  # Directory to save model checkpoints

    #--- Architecture Model Settings ---#
    'embedding_size': 4,               # (int) Size of the embeddings.
    'num_heads': 2,                    # (int) Number of heads in the multi-head attention (alternatives: 8, 12, 16).
    'num_layers': 2,                   # (int) Number of layers in the transformer.
    'use_kg_token_types': True,        # (bool) Whether to use token types for the knowledge graph.

    #--- Path Settings ---#
    'path_sample_args': {
        'strategy': 'constrained-rw',  # Strategy for sampling paths, e.g., constrained random walk.
        'parallel_max_workers': 0      # Maximum number of workers for parallel processing (0 means sequential execution).
    },

    'path_generation_args': {
        'paths_per_user': 5,           # (int) Number of paths generated per user.
        'num_beams': 8,                # (int) Number of beams for beam search.
        'num_beam_groups': 2           # (int) Number of groups for diverse beam search.
    },

    #--- Metrics Settings ---#
    'metrics': [
        'NDCG',
        'MRR',
        'Hit',
        'Precision',
        'Recall',
    ]

}
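If you want to try several variants of this configuration, it can help to keep one base dictionary and apply small overrides to it. The following is a minimal sketch in plain Python; `merge_config` is a hypothetical helper introduced here for illustration (it is not part of hopwise), and the override values are arbitrary examples.

```python
import copy


def merge_config(base, overrides):
    """Return a new dict: `base` updated with `overrides`, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections (e.g., 'path_generation_args') key by key
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Trimmed copy of the configuration above, so the sketch runs on its own
base_config = {
    'epochs': 1,
    'embedding_size': 4,
    'path_generation_args': {'paths_per_user': 5, 'num_beams': 8, 'num_beam_groups': 2},
}

# Hypothetical variant: train longer and return more paths per user
longer_run = merge_config(base_config, {
    'epochs': 5,
    'path_generation_args': {'paths_per_user': 8},
})
print(longer_run)
```

Because the helper works on deep copies, `base_config` is left untouched and can be reused for the next variant.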

🔁 **Train**

Now, we are ready to train the model using the `run_hopwise()` function.

⚠️ We already **trained** it. So you **don't need** to run this command now.

In [None]:
model = 'PEARLM'
dataset = 'ml-100k'

In [None]:
run_hopwise(model= model,
            dataset= dataset,
            config_dict=config_dict)


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


config.json:   0%|          | 0.00/762 [00:00<?, ?B/s]

The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`

Group Beam Search is scheduled to be moved to a `custom_generate` repository in v4.55.0. To prevent loss of backward compatibility, add `trust_remote_code=True` to your `generate` call.


There were missing keys in the checkpoint model loaded: ['lm_head.weight'].

{'best_valid_score': np.float64(0.0103),
 'valid_score_bigger': True,
 'best_valid_result': OrderedDict([('recall@10', np.float64(0.0113)),
              ('mrr@10', np.float64(0.0193)),
              ('ndcg@10', np.float64(0.0103)),
              ('hit@10', np.float64(0.0923)),
              ('precision@10', np.float64(0.0101))]),
 'test_result': OrderedDict([('recall@10', np.float64(0.0081)),
              ('mrr@10', np.float64(0.0163)),
              ('ndcg@10', np.float64(0.0081)),
              ('hit@10', np.float64(0.0817)),
              ('precision@10', np.float64(0.0083))])}
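
A quick way to read the dictionary above is to put the validation and test scores side by side. The sketch below copies the values from the output (without the `np.float64` wrappers); `compare_splits` is a helper name introduced here for illustration and is not part of hopwise.

```python
from collections import OrderedDict

# Values copied from the training output above
results = {
    'best_valid_result': OrderedDict([
        ('recall@10', 0.0113), ('mrr@10', 0.0193), ('ndcg@10', 0.0103),
        ('hit@10', 0.0923), ('precision@10', 0.0101)]),
    'test_result': OrderedDict([
        ('recall@10', 0.0081), ('mrr@10', 0.0163), ('ndcg@10', 0.0081),
        ('hit@10', 0.0817), ('precision@10', 0.0083)]),
}


def compare_splits(results):
    """Return (metric, valid, test, test - valid) rows for every metric."""
    rows = []
    for metric, valid in results['best_valid_result'].items():
        test = results['test_result'][metric]
        rows.append((metric, valid, test, round(test - valid, 4)))
    return rows


print(f"{'metric':<14}{'valid':>8}{'test':>8}{'gap':>9}")
for metric, valid, test, gap in compare_splits(results):
    print(f"{metric:<14}{valid:>8.4f}{test:>8.4f}{gap:>9.4f}")
```

In this run every test score is slightly below its validation counterpart, which is expected after a single epoch with a very small model.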

💾 **Saving Model Outputs**

By default, the trained model checkpoint will be saved in the `hopwise/saved` directory as a `.pth` file. The filename includes the model name and a timestamp, following this format:

`PEARLM-Month-day-year_timestamp.pth`

If you need to change the output directory, you have two options:

1. **Override it dynamically** using the `checkpoint_dir` parameter in your configuration dictionary.
2. **Modify the default path** in the global config file:  
   📄 [hopwise/properties/overall.yaml](https://github.com/tail-unica/hopwise/blob/main/hopwise/properties/overall.yaml)
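
Since the timestamp is part of the filename, you can recover when a checkpoint was saved without opening it. The sketch below parses the checkpoint name used later in this notebook; `checkpoint_time` is a hypothetical helper, and it assumes the name always ends with a `Mon-DD-YYYY_HH-MM-SS` stamp as in that example.

```python
from datetime import datetime
from pathlib import PurePosixPath


def checkpoint_time(path):
    """Extract the save time from a '...-Mon-DD-YYYY_HH-MM-SS.pth' file name."""
    stem = PurePosixPath(path).stem          # file name without the '.pth' suffix
    stamp = "-".join(stem.split("-")[-5:])   # e.g., 'Jul-15-2025_06-30-29'
    return datetime.strptime(stamp, "%b-%d-%Y_%H-%M-%S")


saved_at = checkpoint_time(
    "/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/"
    "hopwise-distilgpt2-PEARLM-Jul-15-2025_06-30-29.pth"
)
print(saved_at.isoformat())
```

This can be handy for picking the most recent checkpoint when a directory contains several runs.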

## 2️⃣ Evaluation

To evaluate the model with different [metrics](https://github.com/tail-unica/hopwise/blob/main/hopwise/evaluator/metrics.py) (the full list is in `hopwise/evaluator/metrics.py`), call the `run_hopwise` function with `run='evaluate'` and pass the saved checkpoint.

In [None]:
# import libraries
from hopwise.quick_start import run_hopwise

# saved checkpoint
hopwise_checkpoint = '/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/hopwise-distilgpt2-PEARLM-Jul-15-2025_06-30-29.pth'

In [None]:
run_hopwise(model=model,
            run='evaluate',
            checkpoint= hopwise_checkpoint,
            dataset= dataset,
            config_dict=config_dict)


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


config.json:   0%|          | 0.00/762 [00:00<?, ?B/s]

The new embeddings will be initialized from a multivariate normal distribution that has old embeddings' mean and covariance. As described in this article: https://nlp.stanford.edu/~johnhew/vocab-expansion.html. To disable this, use `mean_resizing=False`

Group Beam Search is scheduled to be moved to a `custom_generate` repository in v4.55.0. To prevent loss of backward compatibility, add `trust_remote_code=True` to your `generate` call.



{'best_valid_score': np.float64(0.0103),
 'valid_score_bigger': True,
 'best_valid_result': OrderedDict([('ndcg@10', np.float64(0.0103)),
              ('mrr@10', np.float64(0.0193)),
              ('hit@10', np.float64(0.0923)),
              ('precision@10', np.float64(0.0101)),
              ('recall@10', np.float64(0.0113))]),
 'test_result': OrderedDict([('ndcg@10', np.float64(0.0081)),
              ('mrr@10', np.float64(0.0163)),
              ('hit@10', np.float64(0.0817)),
              ('precision@10', np.float64(0.0083)),
              ('recall@10', np.float64(0.0081))])}
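
To make the reported numbers more concrete, the following sketch computes the same five metrics for a single user's ranked list. It uses the usual textbook definitions with binary relevance; this is an illustration, not the hopwise implementation, which may differ in details such as how scores are averaged across users. The function name and the toy item IDs are made up for the example.

```python
import math


def ranking_metrics(ranked_items, relevant_items, k=10):
    """Compute Hit, Precision, Recall, MRR and NDCG at cutoff k for one user."""
    top_k = ranked_items[:k]
    hits = [1 if item in relevant_items else 0 for item in top_k]
    n_hits = sum(hits)

    # Rank (1-based) of the first relevant item, if any
    first_hit_rank = next((rank for rank, h in enumerate(hits, start=1) if h), None)

    # Discounted cumulative gain and its ideal value (all relevant items on top)
    dcg = sum(h / math.log2(rank + 1) for rank, h in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))

    return {
        'hit': 1.0 if n_hits else 0.0,
        'precision': n_hits / k,
        'recall': n_hits / len(relevant_items) if relevant_items else 0.0,
        'mrr': 1 / first_hit_rank if first_hit_rank else 0.0,
        'ndcg': dcg / idcg if idcg else 0.0,
    }


# Toy example: 4 recommended items, 3 held-out relevant items
metrics = ranking_metrics([10, 20, 30, 40], {20, 40, 50}, k=4)
for name, value in metrics.items():
    print(f"{name}@4: {value:.4f}")
```

In the toy example, two of the four recommended items are relevant and the first relevant item appears at rank 2, so precision and MRR are both 0.5.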

## 3️⃣ Generate Explanations for Recommendations

In this section, we produce human-readable explanations for the recommended items.

*  1️⃣ **Load the Trained Model**: Load the trained PEARLM model.

*  2️⃣ **Generate Recommendations**: Use the model to generate recommendations (reasoning paths), guided by the KG constraints.

*  3️⃣ **Generate Textual Explanations**: Transform the reasoning paths into natural language explanations using predefined templates.

----


Import Libraries

In [None]:
#--- Load Packages ---#
import os
import pandas as pd
from safetensors.torch import load_file
from transformers import AutoTokenizer

from hopwise.quick_start import load_data_and_model

from hopwise.model.sequence_postprocessor import BeamSearchSequenceScorePostProcessor
from hopwise.model.logits_processor import ConstrainedLogitsProcessorWordLevel, LogitsProcessorList

from hopwise.utils.enum_type import KnowledgeEvaluationType, PathLanguageModelingTokenType

Define some auxiliary functions

In [None]:
def pre_process_input(users, tokenizer):
    r"""
    Builds the prompt strings for the PEARLM model from raw user IDs.

    Relies on the global `dataset` object returned by `load_data_and_model`
    (loaded in the next section) to look up the user-item relation ID.

    Args:
        users (list of str): List of user IDs (e.g., ["1", "2"]).
        tokenizer (transformers.PreTrainedTokenizer): HuggingFace tokenizer with BOS token.

    Returns:
        list of str: Prompt strings with BOS, user, and relation tokens,
                     e.g., ["[BOS] U1 R25", "[BOS] U2 R25"].
    """

    # Create a special token representing the user-item interaction relation
    # e.g., "R25" if 'buy' has token ID 25
    ui_relation_token = f'{PathLanguageModelingTokenType.RELATION.token}{dataset.field2token_id[dataset.relation_field][dataset.ui_relation]}'

    # Add prefix "U" to each user ID to match model token format (e.g., "1" → "U1")
    users_token = [f'{PathLanguageModelingTokenType.USER.token}{user}' for user in users]

    # Compose the prompt string for each user, e.g., "[BOS] U1 R25"
    users = [f'{tokenizer.bos_token} {user} {ui_relation_token}' for user in users_token]
    return users
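
The prompt format is easier to see without the hopwise objects. The sketch below rebuilds the same strings in plain Python; `build_prompts` is a hypothetical stand-in for `pre_process_input`, and the `[BOS]` token, the `U`/`R` prefixes, and the relation ID `25` are hard-coded assumptions taken from the examples in this notebook rather than read from the dataset.

```python
def build_prompts(user_ids, bos_token="[BOS]", ui_relation_id=25,
                  user_prefix="U", relation_prefix="R"):
    """Build one '<bos> U<user> R<relation>' prompt per user ID."""
    relation_token = f"{relation_prefix}{ui_relation_id}"
    return [f"{bos_token} {user_prefix}{uid} {relation_token}" for uid in user_ids]


prompts = build_prompts(["1", "2"])
print(prompts)  # ['[BOS] U1 R25', '[BOS] U2 R25']
```

Each prompt ends with the user-item relation token, so the first token the model generates is an item the user interacted with, which becomes the first real hop of the path.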

#### 1️⃣ Load the Trained Model

Define the checkpoint paths

In [17]:
weights_checkpoint = "/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/huggingface-distilgpt2-PEARLM-Jul-15-2025_06-30-29.pth"  # model weights saved during training
checkpoint = "/content/drive/MyDrive/XAIKGRLGM/hands-on-session/data/checkpoint/hopwise-distilgpt2-PEARLM-Jul-15-2025_06-30-29.pth" # hopwise metadata

Load model checkpoint with its configuration

In [18]:
config, model, dataset, train_data, valid_data, test_data = load_data_and_model(model_file=checkpoint)

  return self.progress_bar(*args, **kwargs)


Output()

----

#### 2️⃣ Generate Recommendations

To generate recommendations, the hopwise library provides the `explain()` method.

Before calling `model.explain()`, several steps are required to prepare the model and the input data properly:

* 1️⃣ Set the model to evaluation mode
* 2️⃣ Define target users
* 3️⃣ Pre-process user-item input
* 4️⃣ Tokenize user-item input



In [19]:
#Set the model to evaluation mode
model.eval()

#Define target user
users = ['1', '2']

#Pre-process the input  --> e.g., ["[BOS] U1 R25", "[BOS] U2 R25"].
users = pre_process_input(users, dataset.tokenizer)
print(users)

#Tokenize the input --> tensor data
input_ids = dataset.tokenizer(users, return_tensors="pt", add_special_tokens=False).to(model.device)
print(input_ids)



['[BOS] U1 R25', '[BOS] U2 R25']
{'input_ids': tensor([[    4, 34663, 34654],
        [    4, 34774, 34654]], device='cuda:0'), 'token_type_ids': tensor([[0, 0, 0],
        [0, 0, 0]], device='cuda:0'), 'attention_mask': tensor([[1, 1, 1],
        [1, 1, 1]], device='cuda:0')}


----
Now we are ready to call `model.explain()` to get the top-N item recommendations and the corresponding path for each user.

This method internally calls the Hugging Face generation API, which expects a set of parameters that control how the model generates sequences. In the configuration above, we lowered a few of them to reduce the computational burden; _hopwise_ provides the following default values:
```yaml
path_generation_args:
    paths_per_user: 10          # (int) Number of paths generated per user (mapped to the Hugging Face parameter `num_return_sequences`).
    num_beams: 20               # (int) Number of beams for beam search.
    num_beam_groups: 5          # (int) Number of groups for diverse beam search.
    diversity_penalty: 0.3      # (float) Diversity penalty for beam search.
    length_penalty: 0.0         # (float) Length penalty for beam search.
    top_k: ~                    # (int) The number of highest-probability vocabulary tokens to keep for top-k filtering.
    top_p: ~                    # (float) The cumulative probability for top-p filtering.
    do_sample: False            # (bool) Whether to use sampling; use greedy decoding otherwise.
```
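
When you change these values, two consistency rules are worth checking before launching generation: the beams must split evenly into groups, and you cannot ask for more returned paths than there are beams. The sketch below encodes these two rules as we understand them from the Hugging Face beam-search documentation; `check_generation_args` is a hypothetical helper, and the library may enforce additional constraints that are not covered here.

```python
def check_generation_args(paths_per_user, num_beams, num_beam_groups=1, **_other):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if num_beams % num_beam_groups != 0:
        problems.append(
            f"num_beams ({num_beams}) must be divisible by "
            f"num_beam_groups ({num_beam_groups})"
        )
    if paths_per_user > num_beams:
        # paths_per_user is passed on as `num_return_sequences`
        problems.append(
            f"paths_per_user ({paths_per_user}) cannot exceed num_beams ({num_beams})"
        )
    return problems


defaults = {'paths_per_user': 10, 'num_beams': 20, 'num_beam_groups': 5,
            'diversity_penalty': 0.3}
notebook = {'paths_per_user': 5, 'num_beams': 8, 'num_beam_groups': 2}

for name, args in [('defaults', defaults), ('notebook', notebook)]:
    print(name, check_generation_args(**args) or 'ok')
```

Both the default values and the reduced values used in this notebook pass the two checks.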

In [20]:
# Run inference to get top-N item recommendations for each user

#  Output:
#  - scores: A 2D tensor (n_users X n_items) of prediction scores
#  - explanations: A list of quadruples, where each includes:
#     'user'      → internal user ID
#     'item'      → recommended item ID
#     'score'     → relevance score for the recommendation
#     'path'      → list representing the reasoning path in the KG (e.g., [U1, R25, I432, ...])

scores, explanations = model.explain(
    input_ids,
    **config["path_generation_args"],
    return_dict_in_generate=True,            # HuggingFace output as a dictionary including `input_ids`, `sequences`. If also `output_scores=True`, then it also includes `scores` and `sequence_scores`
    output_scores=True,                      # Includes `scores` in the returned dictionary
)

In [21]:
explanations

[[1,
  1290,
  0.0,
  [('self_loop', 'user', 1),
   (25, 'item', 377),
   (13, 'entity', 3921),
   (13, 'item', 1290)]],
 [1,
  105,
  0.0,
  [('self_loop', 'user', 1),
   (25, 'item', 377),
   (13, 'entity', 3921),
   (13, 'item', 105)]],
 [2,
  423,
  0.0,
  [('self_loop', 'user', 2),
   (25, 'item', 575),
   (12, 'entity', 3937),
   (18, 'item', 423)]]]
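
Each entry in this list is `[user, item, score, path]`, so simple summaries can be built with plain Python. As a small sketch, the code below collects the recommended items per user from the output shown above; `items_per_user` is a helper name introduced here for illustration.

```python
from collections import defaultdict

# Copied from the output above
raw_explanations = [
    [1, 1290, 0.0, [('self_loop', 'user', 1), (25, 'item', 377),
                    (13, 'entity', 3921), (13, 'item', 1290)]],
    [1, 105, 0.0, [('self_loop', 'user', 1), (25, 'item', 377),
                   (13, 'entity', 3921), (13, 'item', 105)]],
    [2, 423, 0.0, [('self_loop', 'user', 2), (25, 'item', 575),
                   (12, 'entity', 3937), (18, 'item', 423)]],
]


def items_per_user(explanations):
    """Map each user ID to the list of item IDs recommended to that user."""
    grouped = defaultdict(list)
    for user, item, _score, _path in explanations:
        grouped[user].append(item)
    return dict(grouped)


print(items_per_user(raw_explanations))  # {1: [1290, 105], 2: [423]}
```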

----
For convenience, we store the explanations in a pandas DataFrame.

In [22]:
explanations = pd.DataFrame(explanations, columns=['user', 'item', 'score', 'path'])
explanations

| | user | item | score | path |
|---|---|---|---|---|
| 0 | 1 | 1290 | 0.0 | [(self_loop, user, 1), (25, item, 377), (13, e... |
| 1 | 1 | 105 | 0.0 | [(self_loop, user, 1), (25, item, 377), (13, e... |
| 2 | 2 | 423 | 0.0 | [(self_loop, user, 2), (25, item, 575), (12, e... |


Each explanation consists of a list of _hops_.  
Each _hop_ is a triple `(relation, entity type, entity ID)`, where:
- `relation` is the relation incident to the node in the current hop. Because every path starts from a user, the first hop has `relation = "self_loop"`; for all other hops, `relation` is the ID of the corresponding KG relation.
- `entity type` denotes the type of entity reached by the current hop.
- `entity ID` is the ID of the entity node reached by the current hop.

Example:

```python
example_explanation = [
    ('self_loop', 'user', 2),   # hop 0 (placeholder hop): denotes that the path starts from the user with ID 2
    (25, 'item', 575),          # hop 1: user 2 (from hop 0) is connected to the item with ID 575 by the relation with ID 25
    (12, 'entity', 3937),       # hop 2: item 575 (from hop 1) is connected to the entity with ID 3937 by the relation with ID 12
    (18, 'item', 423)           # hop 3: entity 3937 (from hop 2) is connected to the item with ID 423 by the relation with ID 18
]

# Resulting path:
# user_2 --relation_25--> item_575 --relation_12--> entity_3937 --relation_18--> item_423
```
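
The arrow notation can be produced directly from the list of hops. The following is a minimal sketch; `format_path` is a hypothetical helper, not a hopwise function.

```python
def format_path(path):
    """Render a list of (relation, entity type, entity ID) hops as an arrow string."""
    parts = []
    for relation, node_type, node_id in path:
        if relation != 'self_loop':
            # Every hop except the starting one is reached through a relation
            parts.append(f"--relation_{relation}-->")
        parts.append(f"{node_type}_{node_id}")
    return " ".join(parts)


hops = [
    ('self_loop', 'user', 2),
    (25, 'item', 575),
    (12, 'entity', 3937),
    (18, 'item', 423),
]
print(format_path(hops))
# user_2 --relation_25--> item_575 --relation_12--> entity_3937 --relation_18--> item_423
```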

----

#### 3️⃣ Generate Textual Explanations

We define the `generate_explanation()` function to map the paths to their corresponding explanation templates.

In [35]:
import os

import pandas as pd

pd.set_option('display.max_colwidth', None)  # show full cell contents instead of truncating them

# Mapping functions to get original tokens
def eid2entity(x): return dataset.id2token(dataset.entity_field, x)
def uid2user(x): return dataset.id2token(dataset.uid_field, x)
def rid2relation(x): return dataset.id2token(dataset.relation_field, x)

items_data = pd.read_csv(os.path.join(config['data_path'], f"{config['dataset']}.item"), sep="\t")

def iid2movie_name(x):
    try:
        return items_data[items_data['item_id:token'] == x]['movie_title:token_seq'].iloc[0]
    except IndexError:
        return f"item{x}"

# Mapping entity type prefixes (U, E, I, R) to decoding functions
e_type2mapping = {
    'U': uid2user,
    'E': eid2entity,
    'I': iid2movie_name,
    'R': rid2relation,
    'user': uid2user,
    'entity': eid2entity,
    'item': iid2movie_name,
    'relation': rid2relation
}

# Default explanation template
default_template = "{item} is recommended to you because you {relation1} {entity1} also {relation2} by {entity2}"

# Explanation generator
def generate_explanation(row):
    path = row["path"]
    readable = []

    for hop in path:
        relation_id, entity_type, entity_id = hop

        # Decode relation
        if relation_id != "self_loop":
            decode_fn = e_type2mapping.get('relation')
            decoded_relation = decode_fn(relation_id) if decode_fn else f"relation{relation_id}"
            readable.append(decoded_relation)

        # Decode entity
        # Check if it's actually an item entity
        if entity_type == "entity" and dataset.id2token(dataset.entity_field, entity_id) in dataset.entity2item:
            entity_type = "item"
        decode_fn = e_type2mapping.get(entity_type)
        decoded_entity = decode_fn(entity_id) if decode_fn else f"{entity_type}{entity_id}"
        readable.append(decoded_entity)

    # Extract the relations from the path
    relation_tokens = [dataset.id2token(dataset.relation_field, hop[0]) for hop in path if hop[0] != "self_loop"]

    return default_template.format(
        item=readable[-1] if len(readable) > 0 else "unknown",
        relation1=relation_tokens[0] if len(relation_tokens) > 0 and relation_tokens[0] != dataset.ui_relation else "interacted with",
        entity1=readable[2] if len(readable) > 2 else "unknown",
        relation2=relation_tokens[1] if len(relation_tokens) > 1 else "interacted with",
        entity2=readable[4] if len(readable) > 4 else "unknown"
    )

# Main function: add explanation, user name, item name
def get_table_explanations(explanations_df: pd.DataFrame) -> pd.DataFrame:
    df = explanations_df.copy()

    # Generate explanations
    df["explanation"] = df.apply(generate_explanation, axis=1)

    # Add readable user and item names
    # df["user name"] = df["user"].apply(lambda x: uid2user(int(x)))
    df["Recommended item (name)"] = df["item"].apply(lambda x: iid2movie_name(int(x)))

    # Rename ID columns
    df.rename(columns={
        "user": "User ID",
        "item": "Recommended item (ID)"
    }, inplace=True)

    # Reorder final columns
    final_columns = [
        "User ID",
        "Recommended item (ID)",
        "Recommended item (name)",
        "path",
        "explanation"
    ]
    df = df[final_columns]

    return df
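
To see how the template is filled without loading the dataset, here is a simplified, standalone sketch of the same idea. It takes hops that are already decoded into names and keeps relations and nodes in separate lists, so nodes are indexed by hop. `fill_template`, the `UI_RELATION` placeholder, and the `relation_18` label are assumptions made for this example; the real relation names come from the dataset.

```python
TEMPLATE = ("{item} is recommended to you because you {relation1} {entity1} "
            "also {relation2} by {entity2}")

UI_RELATION = "UI_RELATION"  # placeholder for the user-item interaction relation


def fill_template(decoded_hops, ui_relation=UI_RELATION):
    """decoded_hops: list of (relation name or None, node name), one per hop."""
    relations = [rel for rel, _ in decoded_hops if rel is not None]
    nodes = [name for _, name in decoded_hops]
    # The user-item relation is verbalized as "interacted with"
    first = relations[0] if relations and relations[0] != ui_relation else "interacted with"
    return TEMPLATE.format(
        item=nodes[-1] if nodes else "unknown",
        relation1=first,
        entity1=nodes[1] if len(nodes) > 1 else "unknown",
        relation2=relations[1] if len(relations) > 1 else "interacted with",
        entity2=nodes[2] if len(nodes) > 2 else "unknown",
    )


decoded = [
    (None, "2"),                                                  # hop 0: the user
    (UI_RELATION, "City Slickers II: The Legend of Curly's Gold"),
    ("film.production_company.films", "m.0283xx2"),
    ("relation_18", "E.T. the Extra-Terrestrial"),               # relation name unknown here
]
print(fill_template(decoded))
```

Note that the template only verbalizes the first two relations, so the relation of the last hop does not appear in the text.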


**We are finally ready to read the explanations 😊**

In [36]:
final_explanations = get_table_explanations(explanations)
final_explanations.head()

| | User ID | Recommended item (ID) | Recommended item (name) | path | explanation |
|---|---|---|---|---|---|
| 0 | 1 | 1290 | Country Life | [(self_loop, user, 1), (25, item, 377), (13, entity, 3921), (13, item, 1290)] | Country Life is recommended to you because you interacted with Heavyweights also film.film.written_by by m.0368ny |
| 1 | 1 | 105 | Sgt. Bilko | [(self_loop, user, 1), (25, item, 377), (13, entity, 3921), (13, item, 105)] | Sgt. Bilko is recommended to you because you interacted with Heavyweights also film.film.written_by by m.0368ny |
| 2 | 2 | 423 | E.T. the Extra-Terrestrial | [(self_loop, user, 2), (25, item, 575), (12, entity, 3937), (18, item, 423)] | E.T. the Extra-Terrestrial is recommended to you because you interacted with City Slickers II: The Legend of Curly's Gold also film.production_company.films by m.0283xx2 |


**Visualizing Explanations for Specific Users**

This section filters the final recommendations to show only those related to a specific user, identified by their user ID.

First, it displays the filtered recommendations as a pandas **DataFrame** for a quick overview.

Then, it iterates through each recommendation and prints them as a **List**.

Both approaches give a clearer understanding of the reasoning behind each suggestion, in two different reading formats.

**Pandas Dataframe Visualization**

In [40]:
# Filter the final_explanations DataFrame to keep only the rows for user ID 2
user_rows = final_explanations[final_explanations["User ID"] == 2]

# Display the filtered rows for user 2
user_rows.head(10)

| | User ID | Recommended item (ID) | Recommended item (name) | path | explanation |
|---|---|---|---|---|---|
| 2 | 2 | 423 | E.T. the Extra-Terrestrial | [(self_loop, user, 2), (25, item, 575), (12, entity, 3937), (18, item, 423)] | E.T. the Extra-Terrestrial is recommended to you because you interacted with City Slickers II: The Legend of Curly's Gold also film.production_company.films by m.0283xx2 |


**List Visualization**

In [39]:
# Set the ID of the user whose recommendations you want to inspect
target_user_id = 2

# Filter the DataFrame to include only the rows corresponding to the selected user
user_recs = final_explanations[final_explanations['User ID'] == target_user_id]

# Print all recommendations for the user, up to a maximum of 10
n = min(len(user_recs), 10)

# Loop through the selected user's recommendations and print details for each
for i in range(n):
    row = user_recs.iloc[i]

    print(f"--- {i+1}# recommendation for user {row['User ID']} ---\n")
    print(f"User ID: {row['User ID']}")
    #print(f"user name: {row['user name']}")
    print(f"item ID: {row['Recommended item (ID)']}")
    print(f"item name: {row['Recommended item (name)']}")
    # print(f"score: {row['score']} \n")
    print(f"path: {row['path']}")
    print("explanation:")
    print(row['explanation'])
    print("\n----------\n")


--- 1# recommendation for user 2 ---

User ID: 2
item ID: 423
item name: E.T. the Extra-Terrestrial
path: [('self_loop', 'user', 2), (25, 'item', 575), (12, 'entity', 3937), (18, 'item', 423)]
explanation:
E.T. the Extra-Terrestrial is recommended to you because you interacted with City Slickers II: The Legend of Curly's Gold also film.production_company.films by m.0283xx2

----------



# 📄 Copyright Notice

This notebook was authored by [**Francesca Maridina Malloci**](https://www.linkedin.com/in/francescamalloci/) for the course  
**"Explainable Artificial Intelligence over Knowledge Graphs: from Reinforcement Learning to Generative Modeling"**,  
held at **Boise State University**, from **July 21 to July 25, 2025**.

The material is released exclusively as part of this course to support student learning and study.  
All content, including text and original figures, is protected by copyright and may not be reproduced,  
distributed, or used without the explicit written permission of the author.

Some images were created by the author. Others, sourced from academic literature or online,  
are properly cited and credited within the notebook.

If you find this material useful for your research, you are kindly invited to cite a related publication:

> Balloccu, G., Boratto, L., Fenu, G., Malloci, F. M., & Marras, M. (2024, March).  
> *Explainable recommender systems with knowledge graphs and language models*.  
> In *European Conference on Information Retrieval* (pp. 352–357). Cham: Springer Nature Switzerland.


# References

<a name= "r1">[1] </a> Balloccu, Giacomo, et al. "Faithful Path Language Modeling for Explainable Recommendation over Knowledge Graph." arXiv preprint arXiv:2310.16452 (2023).