# Train a Hypergraph Message Passing Neural Network (HMPNN)

In this notebook, we create and train a Hypergraph Message Passing Neural Network (HMPNN). The method was introduced in the paper [Message Passing Neural Networks for Hypergraphs](https://arxiv.org/abs/2203.16995) by Heydari and Livi (2022). We use the Cora benchmark dataset, a collection of 2708 academic papers and 5429 citation relations, for the task of node classification. There are 7 category labels: `Case_Based`, `Genetic_Algorithms`, `Neural_Networks`, `Probabilistic_Methods`, `Reinforcement_Learning`, `Rule_Learning` and `Theory`.

Each paper is initially represented as a binary vector of length 1433, with one entry per word in a vocabulary of 1433 unique words; a value of 1 means that the corresponding word occurs in the paper.
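To make the binary representation concrete, here is a minimal sketch of how a paper could be turned into such a vector. The four-word vocabulary and the helper `to_binary_vector` are hypothetical and used only for illustration; Cora ships the 1433-dimensional vectors precomputed.

```python
# Hypothetical four-word vocabulary; Cora's real vocabulary has 1433 entries.
vocabulary = ["graph", "neural", "genetic", "bayesian"]


def to_binary_vector(words, vocabulary):
    """Return 1 for each vocabulary word present in `words`, else 0."""
    present = set(words)
    return [1 if term in present else 0 for term in vocabulary]


# Repeated words and out-of-vocabulary words do not change the result.
paper_words = ["neural", "graph", "neural", "network"]
features = to_binary_vector(paper_words, vocabulary)
print(features)  # [1, 1, 0, 0]
```

Note that the vector records only presence, not word counts.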

In [29]:
import torch
import numpy as np
from torch_geometric.data import Data
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from topomodelx.nn.hypergraph.HMPNN_layer import HMPNNLayer
from typing import List

If GPUs are available, we will make use of them. Otherwise, the notebook runs on the CPU.

In [2]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cpu


# Pre-processing

Here we download the dataset. It contains the initial node representations, the adjacency information, and the category labels.

In [4]:
! wget https://linqs-data.soe.ucsc.edu/public/lbc/cora.tgz
! tar -xf cora.tgz

In [30]:
node_ids: List[str] = []
node_labels = []
node_features: List[torch.Tensor] = []
with open("cora/cora.content") as f:
    for line in f:
        entries = line.split()
        node_ids.append(entries[0])
        node_labels.append(entries[-1])
        node_features.append(
            torch.tensor(list(map(int, entries[1:-1])), dtype=torch.float)
        )
node_labels = np.array(node_labels)
node_features = torch.stack(node_features)

label_encoder = LabelEncoder()
y = label_encoder.fit_transform(node_labels)
node_indices = np.arange(len(node_ids))
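`LabelEncoder` assigns integer codes to the sorted unique labels. The plain-Python sketch below reproduces that mapping on a few hypothetical labels, which may help when reading the integer targets in `y`.

```python
# Hypothetical sample of category labels.
labels = ["Theory", "Neural_Networks", "Theory", "Case_Based"]

# Sort the unique labels and number them in that order.
classes = sorted(set(labels))
class_to_index = {name: i for i, name in enumerate(classes)}
encoded = [class_to_index[name] for name in labels]

print(classes)  # ['Case_Based', 'Neural_Networks', 'Theory']
print(encoded)  # [2, 1, 2, 0]
```

In the notebook itself, `label_encoder.inverse_transform` can map predicted integers back to the category names.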

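Below, we construct the incidence matrix ($B_1$), which has shape $n_\text{nodes} \times n_\text{edges}$. In this notebook every citation pair is stored in both directions in a node-by-node matrix, so each node doubles as a hyperedge containing its citation neighbors, and $n_\text{edges} = n_\text{nodes}$.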

In [31]:
# Map each paper ID to its row index in the feature matrix.
nodeid_to_index_map = dict(zip(node_ids, node_indices))
# Collect (row, column) index pairs, storing every citation in both directions.
node_hyperedge_list = []
with open("cora/cora.cites") as f:
    for line in f:
        cited_id, citing_id = line.split()
        cited_index = nodeid_to_index_map[cited_id]
        citing_index = nodeid_to_index_map[citing_id]
        node_hyperedge_list.append([citing_index, cited_index])
        node_hyperedge_list.append([cited_index, citing_index])
# Build the sparse matrix; the explicit size keeps the shape fixed even if
# the highest-indexed node never appears in a citation.
incidence_1 = torch.sparse_coo_tensor(
    torch.LongTensor(node_hyperedge_list).T,
    torch.ones(len(node_hyperedge_list)),
    size=(len(node_ids), len(node_ids)),
    dtype=torch.long,
)
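The same construction can be checked on a toy example. The sketch below uses a hypothetical citation list over four papers and mirrors the cell above; reading column `j` as the hyperedge attached to node `j` is one possible interpretation of the resulting matrix, not something the dataset prescribes.

```python
import torch

# Hypothetical toy citation list over 4 papers: (cited, citing) index pairs.
citations = [(0, 1), (0, 2), (2, 3)]

# Add each pair in both directions, as the cell above does.
pairs = []
for cited, citing in citations:
    pairs.append([citing, cited])
    pairs.append([cited, citing])

n_nodes = 4
toy_incidence = torch.sparse_coo_tensor(
    torch.tensor(pairs).T,
    torch.ones(len(pairs)),
    size=(n_nodes, n_nodes),
)

dense = toy_incidence.to_dense()
print(dense)
# Column 0 marks the nodes connected to node 0 by a citation.
print(dense[:, 0].nonzero().view(-1).tolist())  # [1, 2]
```

Because every pair is inserted twice, the dense matrix is symmetric.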

In [32]:
dataset = Data(x=node_features, incidence_1=incidence_1, y=torch.from_numpy(y))

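Here we split the data into training, validation, and test sets following the paper: 20 nodes per class for training, 70 nodes per class for validation, and all remaining nodes for testing.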

In [33]:
# Adapted from torch_geometric.transforms.RandomNodeSplit, with modifications.
num_classes = int(dataset["y"].max().item()) + 1
train_mask = torch.zeros(len(node_ids), dtype=torch.bool)
val_mask = torch.zeros(len(node_ids), dtype=torch.bool)
test_mask = torch.zeros(len(node_ids), dtype=torch.bool)
for c in range(num_classes):
    idx = (dataset["y"] == c).nonzero(as_tuple=False).view(-1)
    idx = idx[torch.randperm(idx.size(0))]

    train_idx = idx[:20]
    train_mask[train_idx] = True

    val_idx = idx[20:90]
    val_mask[val_idx] = True

    test_idx = idx[90:]
    test_mask[test_idx] = True

dataset.node_stores[0].train_mask = train_mask
dataset.node_stores[0].val_mask = val_mask
dataset.node_stores[0].test_mask = test_mask

dataset = dataset.to(device)
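As a sanity check on the splitting logic, the following sketch repeats the per-class split on hypothetical labels (3 classes with 10 nodes each, and smaller split sizes) and verifies that every node lands in exactly one split.

```python
import torch

torch.manual_seed(0)
# Hypothetical labels: 3 classes with 10 nodes each.
y = torch.arange(3).repeat_interleave(10)
n_train, n_val = 2, 3  # the notebook uses 20 and 70 per class

train_mask = torch.zeros(y.numel(), dtype=torch.bool)
val_mask = torch.zeros_like(train_mask)
test_mask = torch.zeros_like(train_mask)
for c in range(3):
    # Shuffle the node indices of class c, then slice them into three parts.
    idx = (y == c).nonzero(as_tuple=False).view(-1)
    idx = idx[torch.randperm(idx.size(0))]
    train_mask[idx[:n_train]] = True
    val_mask[idx[n_train : n_train + n_val]] = True
    test_mask[idx[n_train + n_val :]] = True

# Each node should fall in exactly one split.
coverage = train_mask.int() + val_mask.int() + test_mask.int()
print(train_mask.sum().item(), val_mask.sum().item(), test_mask.sum().item())  # 6 9 15
print(bool((coverage == 1).all()))  # True
```

The same `coverage` check can be run on the real masks before moving the dataset to the device.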

# Create the Neural Network

Using the `HMPNNLayer` class, we create a neural network with stacked layers.

In [34]:
class HMPNN(torch.nn.Module):
    """Neural network implementation of HMPNN.

    Parameters
    ----------
    in_features : int
        Dimension of input features.
    hidden_features : Tuple[int, ...]
        Hidden feature dimensions. Node and hyperedge representations are projected
        step by step from in_features down to the last item in the tuple.
    num_classes : int
        Number of classes.
    n_layer : int, default=2
        Number of HMPNNLayer layers.
    adjacency_dropout_rate : float, default=0.7
        Adjacency dropout rate.
    regular_dropout_rate : float, default=0.5
        Dropout rate applied to features.
    """

    def __init__(
        self,
        in_features,
        hidden_features,
        num_classes,
        n_layer=2,
        adjacency_dropout_rate=0.7,
        regular_dropout_rate=0.5,
    ):
        super().__init__()
        hidden_features = (in_features,) + hidden_features
        self.to_hidden_linear = torch.nn.Sequential(
            *[
                torch.nn.Linear(hidden_features[i], hidden_features[i + 1])
                for i in range(len(hidden_features) - 1)
            ]
        )
        self.layers = torch.nn.ModuleList(
            [
                HMPNNLayer(
                    hidden_features[-1],
                    adjacency_dropout=adjacency_dropout_rate,
                    updating_dropout=regular_dropout_rate,
                )
                for _ in range(n_layer)
            ]
        )
        self.to_categories_linear = torch.nn.Linear(hidden_features[-1], num_classes)

    def forward(self, x_0, x_1, incidence_1):
        """Forward computation through layers.

        Parameters
        ----------
        x_0 : torch.Tensor
            Node features with shape [n_nodes, in_features].
        x_1 : torch.Tensor
            Hyperedge features with shape [n_hyperedges, in_features].
        incidence_1 : torch.sparse.Tensor
            Incidence matrix (B1) of shape [n_nodes, n_hyperedges].

        Returns
        -------
        y_pred : torch.Tensor
            Predicted logits with shape [n_nodes, num_classes].
        """
        x_0 = self.to_hidden_linear(x_0)
        x_1 = self.to_hidden_linear(x_1)
        for layer in self.layers:
            x_0, x_1 = layer(x_0, x_1, incidence_1)

        return self.to_categories_linear(x_0)
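Each `layer(x_0, x_1, incidence_1)` call exchanges information between nodes and hyperedges through the incidence matrix. The sketch below is a deliberately simplified illustration of that two-stage pattern using plain sum aggregation on a hypothetical toy hypergraph. It is not the computation performed inside `HMPNNLayer`, which has its own message functions and dropout.

```python
import torch

# Hypothetical toy hypergraph: 4 nodes, 2 hyperedges.
# incidence[i, j] = 1 when node i belongs to hyperedge j.
incidence = torch.tensor(
    [
        [1.0, 0.0],
        [1.0, 1.0],
        [0.0, 1.0],
        [0.0, 1.0],
    ]
)
x_0 = torch.tensor([[1.0], [2.0], [3.0], [4.0]])  # one feature per node

# Stage 1: each hyperedge sums the features of its member nodes.
x_1 = incidence.T @ x_0
# Stage 2: each node sums the features of the hyperedges it belongs to.
x_0_new = incidence @ x_1

print(x_1.view(-1).tolist())  # [3.0, 9.0]
print(x_0_new.view(-1).tolist())  # [3.0, 12.0, 9.0, 9.0]
```

Node 1 belongs to both hyperedges, so after one round it has received information from every other node.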

# Train the Neural Network

We then specify the hyperparameters and construct the model, the loss function, and the optimizer.

In [35]:
torch.manual_seed(41)

in_features = 1433
hidden_features = 8
num_classes = 7
n_layers = 2

model = HMPNN(in_features, (256, hidden_features), num_classes, n_layers).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()
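`torch.nn.CrossEntropyLoss` expects raw logits and integer class indices, and the notebook applies it only to the rows selected by a boolean mask. The sketch below shows that masking on hypothetical logits for four nodes and three classes.

```python
import torch

loss_fn = torch.nn.CrossEntropyLoss()

# Hypothetical logits for 4 nodes and 3 classes, plus integer labels.
logits = torch.tensor(
    [
        [2.0, 0.0, 0.0],
        [0.0, 2.0, 0.0],
        [0.0, 0.0, 2.0],
        [2.0, 0.0, 0.0],
    ]
)
labels = torch.tensor([0, 1, 2, 1])
train_mask = torch.tensor([True, True, False, False])

# Only the masked (training) rows contribute to the loss.
masked_loss = loss_fn(logits[train_mask], labels[train_mask])
full_loss = loss_fn(logits, labels)

# The misclassified last node is outside the mask, so the masked loss is lower.
print(masked_loss.item() < full_loss.item())  # True
```

Nodes outside the mask still take part in message passing; they are simply excluded from the loss.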

Now it's time to train the model for a small number of epochs. We keep training short for the purpose of rapid testing.

In [36]:
train_y_true = dataset["y"][dataset["train_mask"]]
val_y_true = dataset["y"][dataset["val_mask"]]
initial_x_1 = torch.zeros_like(dataset["x"])
for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    y_pred_logits = model(dataset["x"], initial_x_1, dataset["incidence_1"])
    loss = loss_fn(y_pred_logits[dataset["train_mask"]], train_y_true)
    loss.backward()
    optimizer.step()

    train_loss = loss.item()
    y_pred = y_pred_logits.argmax(dim=-1)
    # scikit-learn works on CPU data, so move the tensors off the GPU first.
    train_acc = accuracy_score(
        train_y_true.cpu(), y_pred[dataset["train_mask"]].cpu()
    )

    model.eval()
    with torch.no_grad():
        y_pred_logits = model(dataset["x"], initial_x_1, dataset["incidence_1"])
    val_loss = loss_fn(y_pred_logits[dataset["val_mask"]], val_y_true).item()
    y_pred = y_pred_logits.argmax(dim=-1)
    val_acc = accuracy_score(val_y_true.cpu(), y_pred[dataset["val_mask"]].cpu())
    print(
        f"Epoch: {epoch + 1} train loss: {train_loss:.4f} train acc: {train_acc:.2f} "
        f"val loss: {val_loss:.4f} val acc: {val_acc:.2f}"
    )

Epoch: 1 train loss: 2.1276 train acc: 0.14 val loss: 2.1094 val acc: 0.14
Epoch: 2 train loss: 2.0316 train acc: 0.14 val loss: 2.0689 val acc: 0.14
Epoch: 3 train loss: 1.9684 train acc: 0.15 val loss: 2.0353 val acc: 0.14
Epoch: 4 train loss: 1.9527 train acc: 0.15 val loss: 2.0091 val acc: 0.14
Epoch: 5 train loss: 1.9251 train acc: 0.19 val loss: 1.9861 val acc: 0.14
Epoch: 6 train loss: 1.9151 train acc: 0.17 val loss: 1.9669 val acc: 0.14
Epoch: 7 train loss: 1.8849 train acc: 0.29 val loss: 1.9513 val acc: 0.14
Epoch: 8 train loss: 1.8785 train acc: 0.29 val loss: 1.9383 val acc: 0.15
Epoch: 9 train loss: 1.8600 train acc: 0.30 val loss: 1.9272 val acc: 0.15
Epoch: 10 train loss: 1.8551 train acc: 0.25 val loss: 1.9179 val acc: 0.19
Epoch: 11 train loss: 1.8443 train acc: 0.36 val loss: 1.9096 val acc: 0.26
Epoch: 12 train loss: 1.8385 train acc: 0.34 val loss: 1.9020 val acc: 0.30
Epoch: 13 train loss: 1.8267 train acc: 0.36 val loss: 1.8951 val acc: 0.28
Epoch: 14 train loss:

Finally, we evaluate the model against the test data.

In [37]:
test_y_true = dataset["y"][dataset["test_mask"]]
initial_x_1 = torch.zeros_like(dataset["x"])
model.eval()
with torch.no_grad():
    y_pred_logits = model(dataset["x"], initial_x_1, dataset["incidence_1"])
test_loss = loss_fn(y_pred_logits[dataset["test_mask"]], test_y_true).item()
y_pred = y_pred_logits.argmax(dim=-1)
test_acc = accuracy_score(test_y_true.cpu(), y_pred[dataset["test_mask"]].cpu())
print(f"Test loss: {test_loss:.4f} test acc: {test_acc:.2f} ")

Test loss: 1.3764 test acc: 0.54 
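The accuracy reported above is the fraction of nodes whose highest-scoring class matches the true label. A minimal sketch of that computation in pure PyTorch, on hypothetical logits and labels:

```python
import torch

# Hypothetical logits for 4 nodes and 2 classes, plus true labels.
logits = torch.tensor([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]])
labels = torch.tensor([1, 0, 0, 0])

# Predicted class = index of the largest logit in each row.
predictions = logits.argmax(dim=-1)
accuracy = (predictions == labels).float().mean().item()

print(predictions.tolist())  # [1, 0, 1, 0]
print(accuracy)  # 0.75
```

This gives the same value as `accuracy_score` for single-label classification and avoids moving tensors off the device.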
