# Train a Hypergraph Message Passing Neural Network (HMPNN)

In this notebook, we create and train a Hypergraph Message Passing Neural Network (HMPNN) on a hypergraph. The method was introduced in the paper [Message Passing Neural Networks for Hypergraphs](https://arxiv.org/abs/2203.16995) by Heydari and Livi (2022). We use the benchmark dataset Cora, a collection of 2708 academic papers and 5429 citation relations, for the task of node classification. There are 7 category labels: `Case_Based`, `Genetic_Algorithms`, `Neural_Networks`, `Probabilistic_Methods`, `Reinforcement_Learning`, `Rule_Learning` and `Theory`.

Each document is initially represented as a binary vector of length 1433, with one entry per word in a fixed vocabulary; a value of 1 means that the corresponding word occurs in the paper.
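
To make this encoding concrete, here is a minimal sketch of a binary bag-of-words vector. The five-word vocabulary and the sample text are made up for illustration; Cora ships already encoded, so none of this is needed to run the notebook.

```python
# Hypothetical toy vocabulary (Cora's real one has 1433 words).
vocabulary = ["graph", "neural", "network", "genetic", "theory"]


def binary_bag_of_words(text, vocabulary):
    """Return a 0/1 vector marking which vocabulary words occur in `text`."""
    words = set(text.lower().split())
    return [1 if word in words else 0 for word in vocabulary]


features = binary_bag_of_words("A neural network on a graph", vocabulary)
print(features)  # [1, 1, 1, 0, 0]
```

The vector records only whether a word is present, not how often it occurs or where.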

In [1]:
import torch
import numpy as np
from torch_geometric.utils import to_undirected
import torch_geometric.datasets as geom_datasets

from topomodelx.nn.hypergraph.hmpnn import HMPNN

If GPUs are available, we will make use of them. Otherwise, this will run on CPU.

In [2]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cpu


# Pre-processing

Here we download the dataset. It contains the initial node representations, the adjacency information, the category labels and the train/validation/test masks.

In [3]:
cora = geom_datasets.Planetoid(root="/TopoModelX/data/cora", name="Cora")
data = cora[0]  # the dataset holds a single graph

x_0s = data.x
y = data.y
edge_index = data.edge_index

train_mask = data.train_mask
val_mask = data.val_mask
test_mask = data.test_mask



## Define neighborhood structures and lift into the hypergraph domain

Now we retrieve the neighborhood structure (i.e., its representative matrix) that we will use to send messages from nodes to hyperedges. For this architecture, we need the incidence matrix (or boundary matrix) $B_1$ with shape $n_\text{nodes} \times n_\text{edges}$, where $n_\text{edges}$ is the number of hyperedges.

For the Cora citation dataset, we lift the graph structure to the hypergraph domain by creating one hyperedge from the 1-hop graph neighborhood of each node.
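
As a small illustration of this lifting, the sketch below applies it to a made-up four-node graph in plain Python. The names `toy_edges`, `toy_hyperedges` and `toy_incidence` are hypothetical and are not used elsewhere in the notebook.

```python
# Hypothetical undirected toy graph: node 2 is connected to nodes 0, 1 and 3.
toy_edges = [(0, 2), (1, 2), (2, 3)]
num_toy_nodes = 4

# Collect the 1-hop neighborhood of every node.
toy_neighbors = {node: set() for node in range(num_toy_nodes)}
for u, v in toy_edges:
    toy_neighbors[u].add(v)
    toy_neighbors[v].add(u)

# One hyperedge per neighborhood, skipping duplicates.
toy_hyperedges = []
for node in range(num_toy_nodes):
    hyperedge = tuple(sorted(toy_neighbors[node]))
    if hyperedge not in toy_hyperedges:
        toy_hyperedges.append(hyperedge)

# Incidence matrix: rows are nodes, columns are hyperedges.
toy_incidence = [
    [1 if node in hyperedge else 0 for hyperedge in toy_hyperedges]
    for node in range(num_toy_nodes)
]

print(toy_hyperedges)  # [(2,), (0, 1, 3)]
print(toy_incidence)  # [[0, 1], [0, 1], [1, 0], [0, 1]]
```

Nodes 0, 1 and 3 all have the same neighborhood `{2}`, so they contribute a single hyperedge. This is why the number of hyperedges can be smaller than the number of nodes.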


In [4]:
# Ensure the graph is undirected (optional but often useful for one-hop neighborhoods).
edge_index = to_undirected(edge_index)

# Create a list of one-hop neighborhoods for each node.
one_hop_neighborhoods = []
for node in range(data.num_nodes):
    # Get the one-hop neighbors of the current node from the undirected edge list.
    neighbors = edge_index[1, edge_index[0] == node]

    # Append the neighbors to the list of one-hop neighborhoods.
    one_hop_neighborhoods.append(neighbors.numpy())

# Detect and eliminate duplicate hyperedges.
unique_hyperedges = set()
hyperedges = []
for neighborhood in one_hop_neighborhoods:
    # Sort the neighborhood to ensure consistent comparison.
    neighborhood = tuple(sorted(neighborhood))
    if neighborhood not in unique_hyperedges:
        hyperedges.append(list(neighborhood))
        unique_hyperedges.add(neighborhood)    

Additionally, we print statistics on the sizes of the resulting hyperedges.

In [5]:

# Calculate hyperedge statistics.
hyperedge_sizes = [len(he) for he in hyperedges]
min_size = min(hyperedge_sizes)
max_size = max(hyperedge_sizes)
mean_size = np.mean(hyperedge_sizes)
median_size = np.median(hyperedge_sizes)
std_size = np.std(hyperedge_sizes)
num_single_node_hyperedges = sum(np.array(hyperedge_sizes) == 1)

# Print the hyperedge statistics.
print(f'Hyperedge statistics: ')
print('Number of hyperedges without duplicated hyperedges', len(hyperedges))
print(f'min = {min_size}, ')
print(f'max = {max_size}, ')
print(f'mean = {mean_size}, ')
print(f'median = {median_size}, ')
print(f'std = {std_size}, ')
print(f'Number of hyperedges with size equal to one = {num_single_node_hyperedges}')


Hyperedge statistics: 
Number of hyperedges without duplicated hyperedges 2581
min = 1, 
max = 168, 
mean = 4.003099573808601, 
median = 3.0, 
std = 5.327622607829558, 
Number of hyperedges with size equal to one = 412


Next, we construct the incidence matrix from the list of hyperedges.

In [6]:
num_hyperedges = len(hyperedges)
incidence_1 = np.zeros((x_0s.shape[0], num_hyperedges))
for col, neighborhood in enumerate(hyperedges):
    for row in neighborhood:
        incidence_1[row, col] = 1

assert (incidence_1.sum(0) > 0).all(), "Some hyperedges are empty"
assert (incidence_1.sum(1) > 0).all(), "Some nodes are not in any hyperedges"
incidence_1 = torch.Tensor(incidence_1).to_sparse_coo()
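
In its simplest form, sending messages from nodes to hyperedges with an incidence matrix $B_1$ amounts to summing the features of the nodes in each hyperedge, that is, computing $B_1^\top X$. The sketch below shows this on made-up data in plain Python. It is a generic sum aggregation given for intuition only, not the exact HMPNN update, which applies further learned transformations; all names in it are hypothetical.

```python
# Hypothetical incidence matrix: 4 nodes (rows) x 2 hyperedges (columns).
toy_incidence = [[0, 1], [0, 1], [1, 0], [0, 1]]
# Hypothetical 2-dimensional node features, one row per node.
toy_node_features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 1.0]]


def aggregate_nodes_to_hyperedges(incidence, node_features):
    """Sum the features of the nodes in each hyperedge, i.e. compute B^T X."""
    num_nodes = len(incidence)
    num_hyperedges = len(incidence[0])
    num_features = len(node_features[0])
    messages = [[0.0] * num_features for _ in range(num_hyperedges)]
    for node in range(num_nodes):
        for edge in range(num_hyperedges):
            if incidence[node][edge]:
                for k in range(num_features):
                    messages[edge][k] += node_features[node][k]
    return messages


hyperedge_messages = aggregate_nodes_to_hyperedges(toy_incidence, toy_node_features)
print(hyperedge_messages)  # [[2.0, 2.0], [4.0, 2.0]]
```

The result has one row per hyperedge rather than one per node, so any hyperedge-level feature matrix has $n_\text{edges}$ rows.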

# Train the Neural Network

We then specify the hyperparameters and construct the model, the loss and optimizer.

In [7]:
torch.manual_seed(0)

in_features = 1433  # length of the bag-of-words vectors
hidden_features = 8
num_classes = 7  # number of category labels
n_layers = 2

model = HMPNN(in_features, (256, hidden_features), num_classes, n_layers).to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Categorical cross-entropy loss
loss_fn = torch.nn.CrossEntropyLoss()


# Accuracy: fraction of predictions that match the labels
def acc_fn(y, y_hat):
    return (y == y_hat).float().mean()
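
For intuition about these two quantities, here is a minimal plain-Python sketch of the mean cross-entropy and the accuracy computed only on the nodes selected by a mask. The logits, labels and mask are made up, and all names are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Negative log-softmax probability of the true class for one sample."""
    max_logit = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]


# Hypothetical logits for three nodes and three classes.
toy_logits = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
toy_labels = [0, 1, 0]
toy_train_mask = [True, True, False]  # only the first two nodes are used

selected = [i for i, keep in enumerate(toy_train_mask) if keep]

# Mean loss over the selected nodes.
loss = sum(cross_entropy(toy_logits[i], toy_labels[i]) for i in selected) / len(selected)

# Predicted class = index of the largest logit.
predictions = [max(range(3), key=lambda c: toy_logits[i][c]) for i in selected]
accuracy = sum(p == toy_labels[i] for p, i in zip(predictions, selected)) / len(selected)

print(f"loss = {loss:.4f}, accuracy = {accuracy:.2f}")
```

The third node does not influence either value, which mirrors how the train, validation and test masks restrict the loss and the accuracy to their own subsets of nodes.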

Now it's time to train the model. We loop over the data for a small number of epochs, keeping training minimal for the purpose of rapid testing.

In [11]:

# Move the inputs to the same device as the model.
x_0s, y, incidence_1 = x_0s.to(device), y.to(device), incidence_1.to(device)
train_mask, val_mask = train_mask.to(device), val_mask.to(device)

# Initial hyperedge features: one row per hyperedge (a column of incidence_1),
# not one per node; otherwise the forward pass fails with a size mismatch.
initial_x_1 = torch.zeros(incidence_1.shape[1], x_0s.shape[1], device=device)

for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    y_hat = model(x_0s, initial_x_1, incidence_1)
    loss = loss_fn(y_hat[train_mask], y[train_mask])
    loss.backward()
    optimizer.step()

    train_loss = loss.item()
    train_acc = acc_fn(y_hat[train_mask].argmax(1), y[train_mask]).item()

    # Evaluate on the validation nodes without tracking gradients.
    model.eval()
    with torch.no_grad():
        y_pred_logits = model(x_0s, initial_x_1, incidence_1)
    val_loss = loss_fn(y_pred_logits[val_mask], y[val_mask]).item()
    val_acc = acc_fn(y_pred_logits[val_mask].argmax(1), y[val_mask]).item()
    print(
        f"Epoch: {epoch + 1} train loss: {train_loss:.4f} train acc: {train_acc:.2f} "
        f"val loss: {val_loss:.4f} val acc: {val_acc:.2f}"
    )

Finally, we evaluate the model against the test data.

In [7]:
# Move the inputs to the same device as the model (a no-op if already there).
x_0s, y, incidence_1 = x_0s.to(device), y.to(device), incidence_1.to(device)
test_mask = test_mask.to(device)

# Initial hyperedge features: one row per hyperedge (a column of incidence_1).
initial_x_1 = torch.zeros(incidence_1.shape[1], x_0s.shape[1], device=device)

model.eval()
with torch.no_grad():
    y_pred_logits = model(x_0s, initial_x_1, incidence_1)
test_loss = loss_fn(y_pred_logits[test_mask], y[test_mask]).item()
test_acc = acc_fn(y_pred_logits[test_mask].argmax(1), y[test_mask]).item()
print(f"Test loss: {test_loss:.4f} test acc: {test_acc:.2f} ")

Test loss: 1.3152 test acc: 0.55 
