In [1]:
import logging

from ppnp.pytorch import PPNP
from ppnp.pytorch.training import train_model
from ppnp.pytorch.earlystopping import stopping_args
from ppnp.pytorch.propagation import PPRExact, PPRPowerIteration
from ppnp.data.io import load_dataset

In [2]:
logging.basicConfig(
    format='%(asctime)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
    level=logging.INFO)

# Load dataset

First, we load the dataset we want to train on. The datasets are stored in the `SparseGraph` format, a class that provides the adjacency, attribute and label matrices in dense (`np.ndarray`) or sparse (`scipy.sparse.csr_matrix`) form, plus some convenience functions that are not strictly necessary. To use external datasets, you can, for example, convert NetworkX graphs to the `SparseGraph` format with `networkx_to_sparsegraph` from `ppnp.data.io`.
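To make the matrix layout concrete, here is a minimal sketch of the three pieces of data described above, built by hand for a toy graph. It does not use the `SparseGraph` class itself; the edge list, attribute values and variable names are made up for illustration.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical toy graph: 4 nodes on a path 0 - 1 - 2 - 3.
edges = np.array([[0, 1], [1, 2], [2, 3]])
n_nodes = 4

# Store every undirected edge in both directions so the matrix is symmetric.
rows = np.concatenate([edges[:, 0], edges[:, 1]])
cols = np.concatenate([edges[:, 1], edges[:, 0]])
data = np.ones(len(rows), dtype=np.float32)
adj = sp.csr_matrix((data, (rows, cols)), shape=(n_nodes, n_nodes))

# One row of (made-up) attributes per node, and one class label per node.
attr = sp.csr_matrix(np.array([[1, 0, 0],
                               [0, 1, 0],
                               [0, 1, 1],
                               [0, 0, 1]], dtype=np.float32))
labels = np.array([0, 0, 1, 1])

# The same adjacency in dense form.
adj_dense = adj.toarray()
```

The sparse form stores only the nonzero entries, which matters for graphs of realistic size where the adjacency matrix is almost entirely zeros.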

The four datasets from the paper (Cora-ML, Citeseer, PubMed and MS Academic) can be found in the directory `data`.

For this example we choose the Cora-ML graph.

In [3]:
graph_name = 'cora_ml'
graph = load_dataset(graph_name)
graph.standardize(select_lcc=True)

<Undirected, unweighted and connected SparseGraph with 15962 edges (no self-loops). Data: adj_matrix (2810x2810), attr_matrix (2810x2879), labels (2810), node_names (2810), attr_names (2879), class_names (7)>

# Set up propagation

Next, we set up the propagation scheme. The paper introduces two: the exact personalized PageRank (PPR) propagation used in PPNP and the PPR power iteration used in APPNP.

Here we use the hyperparameters from the paper. Note that MS Academic requires a different teleport probability, `alpha = 0.2`.

In [4]:
prop_ppnp = PPRExact(graph.adj_matrix, alpha=0.1)
prop_appnp = PPRPowerIteration(graph.adj_matrix, alpha=0.1, niter=10)
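The relationship between the two propagation schemes can be illustrated with a small NumPy sketch. It is not the library's implementation; the helper names (`normalized_adjacency`, `ppr_exact`, `ppr_power_iteration`) and the toy graph are assumptions made for illustration. The exact scheme solves a linear system once, while the power iteration repeatedly mixes propagated predictions with the original ones and approaches the exact result as the number of iterations grows.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetrically normalized adjacency matrix with added self-loops."""
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return d_inv_sqrt[:, None] * a_tilde * d_inv_sqrt[None, :]

def ppr_exact(a_hat, h, alpha):
    """Exact PPR propagation: alpha * (I - (1 - alpha) * A_hat)^-1 @ H."""
    n = a_hat.shape[0]
    return alpha * np.linalg.solve(np.eye(n) - (1 - alpha) * a_hat, h)

def ppr_power_iteration(a_hat, h, alpha, niter):
    """Approximate PPR propagation: Z <- (1 - alpha) * A_hat @ Z + alpha * H."""
    z = h.copy()
    for _ in range(niter):
        z = (1 - alpha) * a_hat @ z + alpha * h
    return z

# Hypothetical toy inputs: a 4-node path graph and random per-node predictions.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 2))

a_hat = normalized_adjacency(adj)
z_exact = ppr_exact(a_hat, h, alpha=0.1)
z_10 = ppr_power_iteration(a_hat, h, alpha=0.1, niter=10)
z_200 = ppr_power_iteration(a_hat, h, alpha=0.1, niter=200)

# Largest absolute deviation from the exact solution.
err_10 = np.abs(z_10 - z_exact).max()
err_200 = np.abs(z_200 - z_exact).max()
```

The exact variant needs a dense solve (or inverse) over all nodes, which is why the power iteration is the more practical choice on larger graphs.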

# Choose model hyperparameters

Now we choose the model hyperparameters. These are the ones used in the paper for all datasets.

Note that we pass the APPNP propagation (`prop_appnp`) here. To train PPNP instead, pass `prop_ppnp`.

In [5]:
model_args = {
    'hiddenunits': [64],
    'drop_prob': 0.5,
    'propagation': prop_appnp}

# Train model

Now we can train the model.

In [6]:
idx_split_args = {'ntrain_per_class': 20, 'nstopping': 500, 'nknown': 1500, 'seed': 2413340114}
reg_lambda = 5e-3
learning_rate = 0.01

test = False
device = 'cuda'
print_interval = 20
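The following sketch shows one way arguments like those in `idx_split_args` could define a node split. It is hypothetical and not the library's split logic, which may differ: it assumes that `nknown` is the total number of labeled nodes available, that `ntrain_per_class` training nodes are drawn per class from them, that `nstopping` further nodes are used for early stopping, and that the remaining known nodes form the validation set. The function name `sketch_split` and the synthetic labels are made up for illustration.

```python
import numpy as np

def sketch_split(labels, ntrain_per_class, nstopping, nknown, seed):
    """Hypothetical split into training, early-stopping and validation indices."""
    rng = np.random.RandomState(seed)
    # Assumption: only `nknown` randomly chosen nodes have usable labels.
    known = rng.choice(len(labels), nknown, replace=False)

    # Draw a fixed number of training nodes from every class.
    train = []
    for c in np.unique(labels):
        candidates = known[labels[known] == c]
        train.extend(rng.choice(candidates, ntrain_per_class, replace=False))
    train = np.array(train)

    # Early-stopping nodes come from the known nodes not used for training.
    rest = np.setdiff1d(known, train)
    stopping = rng.choice(rest, nstopping, replace=False)

    # Whatever is left of the known nodes serves as the validation set.
    validation = np.setdiff1d(rest, stopping)
    return train, stopping, validation

# Synthetic labels: 2810 nodes spread evenly over 7 classes.
labels = np.arange(2810) % 7
train_idx, stopping_idx, val_idx = sketch_split(
    labels, ntrain_per_class=20, nstopping=500, nknown=1500, seed=2413340114)
```

Fixing the seed makes the split reproducible, so runs with different model settings can be compared on the same nodes.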

In [7]:
result = train_model(
        graph_name, PPNP, graph, model_args, learning_rate, reg_lambda,
        idx_split_args, stopping_args, test, device, None, print_interval)

2019-11-10 18:30:13: PPNP: {'hiddenunits': [64], 'drop_prob': 0.5, 'propagation': PPRPowerIteration()}
2019-11-10 18:30:13: PyTorch seed: 1419657858
2019-11-10 18:30:16: Epoch 0: Train loss = 2.00, train acc = 15.7, early stopping loss = 1.96, early stopping acc = 33.6 (0.587 sec)
2019-11-10 18:30:17: Epoch 20: Train loss = 1.94, train acc = 62.1, early stopping loss = 1.95, early stopping acc = 46.8 (0.583 sec)
2019-11-10 18:30:17: Epoch 40: Train loss = 1.90, train acc = 70.0, early stopping loss = 1.95, early stopping acc = 52.4 (0.554 sec)
2019-11-10 18:30:18: Epoch 60: Train loss = 1.83, train acc = 86.4, early stopping loss = 1.93, early stopping acc = 65.4 (0.559 sec)
2019-11-10 18:30:18: Epoch 80: Train loss = 1.76, train acc = 87.1, early stopping loss = 1.89, early stopping acc = 70.4 (0.559 sec)
2019-11-10 18:30:19: Epoch 100: Train loss = 1.68, train acc = 89.3, early stopping loss = 1.83, early stopping acc = 74.2 (0.558 sec)
2019-11-10 18:30:19: Epoch 120: Train loss = 1.

2019-11-10 18:30:49: Epoch 1200: Train loss = 0.51, train acc = 100.0, early stopping loss = 0.92, early stopping acc = 81.8 (0.563 sec)
2019-11-10 18:30:50: Epoch 1220: Train loss = 0.50, train acc = 100.0, early stopping loss = 0.92, early stopping acc = 81.0 (0.553 sec)
2019-11-10 18:30:50: Epoch 1240: Train loss = 0.51, train acc = 100.0, early stopping loss = 0.92, early stopping acc = 81.0 (0.551 sec)
2019-11-10 18:30:51: Epoch 1260: Train loss = 0.49, train acc = 100.0, early stopping loss = 0.90, early stopping acc = 82.6 (0.551 sec)
2019-11-10 18:30:52: Epoch 1280: Train loss = 0.48, train acc = 99.3, early stopping loss = 0.91, early stopping acc = 81.4 (0.551 sec)
2019-11-10 18:30:52: Epoch 1300: Train loss = 0.51, train acc = 100.0, early stopping loss = 0.90, early stopping acc = 81.2 (0.551 sec)
2019-11-10 18:30:53: Epoch 1320: Train loss = 0.50, train acc = 98.6, early stopping loss = 0.93, early stopping acc = 81.0 (0.552 sec)
2019-11-10 18:30:53: Epoch 1340: Train loss

RuntimeError: Error(s) in loading state_dict for PPNP:
	While copying the parameter named "propagation.A_hat", whose dimensions in the model are torch.Size([2810, 2810]) and whose dimensions in the checkpoint are torch.Size([2810, 2810]).