In [1]:
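import logging
import os

# Suppress TensorFlow's C++ log messages. The variable is set before
# TensorFlow is imported so that it takes effect.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf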

from ppnp.tensorflow import PPNP
from ppnp.tensorflow.training import train_model
from ppnp.tensorflow.earlystopping import stopping_args
from ppnp.tensorflow.propagation import PPRExact, PPRPowerIteration
from ppnp.data.io import load_dataset





In [2]:
tf.logging.set_verbosity(tf.logging.INFO)
logging.basicConfig(
        format='%(asctime)s: %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S',
        level=logging.INFO)





# Load dataset

First, we load the dataset we want to train on. The datasets are stored in the `SparseGraph` format, a class that provides the adjacency, attribute, and label matrices in dense (`np.ndarray`) or sparse (`scipy.sparse.csr_matrix`) form, along with some convenience functions that are not strictly necessary. To use external datasets, you can, for example, convert NetworkX graphs to the `SparseGraph` format with the `networkx_to_sparsegraph` function in `ppnp.data.io`.

The four datasets from the paper (Cora-ML, Citeseer, PubMed and MS Academic) can be found in the directory `data`.

For this example we choose the Cora-ML graph.

In [3]:
graph_name = 'cora_ml'
graph = load_dataset(graph_name)
graph.standardize(select_lcc=True)

<Undirected, unweighted and connected SparseGraph with 15962 edges (no self-loops). Data: adj_matrix (2810x2810), attr_matrix (2810x2879), labels (2810), node_names (2810), attr_names (2879), class_names (7)>
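
To make the summary above concrete, here is a minimal sketch of what an undirected, unweighted, connected graph without self-loops looks like as a `scipy.sparse.csr_matrix`. It does not call `SparseGraph` itself. The toy edge list, attributes, and labels are made up, and the cleaning steps are an assumption about what `standardize(select_lcc=True)` does, inferred only from the printed summary.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import connected_components

# Hypothetical toy edge list: a triangle (0, 1, 2) with a self-loop on node 2
# and a separate component made of nodes 3 and 4.
edges = np.array([[0, 1], [1, 2], [2, 0], [2, 2], [3, 4]])
n_nodes = 5

adj = sp.csr_matrix(
    (np.ones(len(edges)), (edges[:, 0], edges[:, 1])),
    shape=(n_nodes, n_nodes))

# Symmetrize (undirected), drop self-loops, and reset all weights to 1.
adj = (adj + adj.T).tolil()
adj.setdiag(0)
adj = adj.tocsr()
adj.eliminate_zeros()
adj.data[:] = 1.0

# Keep only the largest connected component (assumed meaning of select_lcc).
_, component = connected_components(adj, directed=False)
lcc = np.flatnonzero(component == np.bincount(component).argmax())
adj_lcc = adj[lcc][:, lcc]

# Attribute and label arrays are restricted to the same nodes.
attr_matrix = np.eye(n_nodes)[lcc]          # toy one-hot attributes
labels = np.array([0, 1, 0, 1, 1])[lcc]     # toy class labels

print(adj_lcc.shape, attr_matrix.shape, labels.shape)
```

The three arrays stay aligned row by row, which mirrors the matching dimensions of `adj_matrix`, `attr_matrix`, and `labels` in the summary above.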

# Set up propagation

Next, we set up the propagation scheme. In the paper we introduced the exact PPR propagation used in PPNP and the PPR power iteration propagation used in APPNP.

Here we use the hyperparameters from the paper. Note that MS Academic calls for a different value, `alpha = 0.2`.
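
The relationship between the two schemes can be sketched in plain NumPy. This is not the library's implementation: it only follows the formulas from the paper, where exact PPR uses `alpha * (I - (1 - alpha) * A_hat)^-1` and the power iteration repeats `Z <- (1 - alpha) * A_hat @ Z + alpha * H`, with `A_hat` the symmetrically normalized adjacency matrix with self-loops. The function names, the toy graph, and the toy predictions `h` are assumptions made for illustration.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def ppr_exact(adj, alpha):
    """Dense PPR matrix alpha * (I - (1 - alpha) * A_hat)^{-1}."""
    n = adj.shape[0]
    a_hat = normalized_adjacency(adj)
    return alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * a_hat)

def ppr_power_iteration(adj, h, alpha, niter):
    """Approximate the propagated predictions with niter power iteration steps."""
    a_hat = normalized_adjacency(adj)
    z = h
    for _ in range(niter):
        z = (1 - alpha) * a_hat @ z + alpha * h
    return z

# Toy graph: a triangle (0, 1, 2) with node 3 attached to node 2.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
h = np.eye(4)[:, :2]  # toy per-node predictions for two classes

z_exact = ppr_exact(adj, alpha=0.1) @ h
z_10 = ppr_power_iteration(adj, h, alpha=0.1, niter=10)
z_200 = ppr_power_iteration(adj, h, alpha=0.1, niter=200)

err_10 = np.abs(z_10 - z_exact).max()
err_200 = np.abs(z_200 - z_exact).max()
print(err_10, err_200)
```

The power iteration never forms the dense inverse, which is why it is the cheaper of the two on larger graphs. Its result approaches the exact one as the number of iterations grows.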

In [4]:
prop_ppnp = PPRExact(graph.adj_matrix, alpha=0.1)
prop_appnp = PPRPowerIteration(graph.adj_matrix, alpha=0.1, niter=10)

# Choose model hyperparameters

Now we choose the model hyperparameters. These are the values used in the paper for all datasets.

Note that we pass the APPNP propagation (`prop_appnp`) here.

In [5]:
model_args = {
    'hiddenunits': [64],
    'reg_lambda': 5e-3,
    'learning_rate': 0.01,
    'keep_prob': 0.5,
    'propagation': prop_appnp}

# Train model

Now we can train the model.

In [6]:
idx_split_args = {'ntrain_per_class': 20, 'nstopping': 500, 'nknown': 1500, 'seed': 2413340114}
test = False
save_result = False
print_interval = 20
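
The split arguments above can be read as follows: `nknown` nodes are treated as labeled, `ntrain_per_class` of them per class are used for training, `nstopping` further ones for early stopping, and the remaining known nodes for validation. The sketch below is one plausible reading of these parameters and not the library's own splitting code. The function `sketch_split` and the synthetic labels are hypothetical.

```python
import numpy as np

def sketch_split(labels, ntrain_per_class, nstopping, nknown, seed):
    """Hypothetical index split; the library's actual procedure may differ."""
    rnd = np.random.RandomState(seed)
    all_idx = np.arange(len(labels))

    # Nodes whose labels are assumed to be available during development.
    known_idx = rnd.choice(all_idx, nknown, replace=False)
    unknown_idx = np.setdiff1d(all_idx, known_idx)

    # Sample the same number of training nodes from every class.
    train_idx = np.concatenate([
        rnd.choice(known_idx[labels[known_idx] == c],
                   ntrain_per_class, replace=False)
        for c in np.unique(labels[known_idx])])

    # Early stopping set, then everything left over is used for validation.
    rest_idx = np.setdiff1d(known_idx, train_idx)
    stopping_idx = rnd.choice(rest_idx, nstopping, replace=False)
    val_idx = np.setdiff1d(rest_idx, stopping_idx)
    return train_idx, stopping_idx, val_idx, unknown_idx

# Synthetic labels with the same node and class counts as the graph above.
toy_labels = np.arange(2810) % 7
train_idx, stopping_idx, val_idx, unknown_idx = sketch_split(
    toy_labels, ntrain_per_class=20, nstopping=500, nknown=1500,
    seed=2413340114)
print(len(train_idx), len(stopping_idx), len(val_idx), len(unknown_idx))
```

Under this reading, the nodes outside the known set are held out, and with `test = False` they are presumably not used for evaluation.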

In [7]:
result = train_model(
        graph_name, PPNP, graph, model_args, idx_split_args,
        stopping_args, test, save_result, None, print_interval)

2021-10-20 03:02:06: PPNP: {'hiddenunits': [64], 'reg_lambda': 0.005, 'learning_rate': 0.01, 'keep_prob': 0.5, 'propagation': <ppnp.tensorflow.propagation.PPRPowerIteration object at 0x7fa9f46f9890>}





2021-10-20 03:02:06: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/training.py:24: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.






2021-10-20 03:02:06: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/training.py:27: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

2021-10-20 03:02:06: Tensorflow seed: 1809758399





2021-10-20 03:02:06: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/training.py:30: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.






2021-10-20 03:02:06: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/training.py:30: The name tf.GPUOptions is deprecated. Please use tf.compat.v1.GPUOptions instead.






2021-10-20 03:02:06.916067: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-10-20 03:02:06.926649: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2600000000 Hz
2021-10-20 03:02:06.929568: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x559e749a3f30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-10-20 03:02:06.929599: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-10-20 03:02:06: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/model.py:22: The name tf.train.create_global_step is deprecated. Please use tf.compat.v1.train.create_global_step instead.






2021-10-20 03:02:06: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/ppnp.py:38: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.






2021-10-20 03:02:06: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/ppnp.py:48: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.






2021-10-20 03:02:06: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/ppnp.py:17: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.





2021-10-20 03:02:06: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/utils.py:25: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.





2021-10-20 03:02:07: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/model.py:31: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.






2021-10-20 03:02:07: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/model.py:81: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.






2021-10-20 03:02:07: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/model.py:84: The name tf.train.get_global_step is deprecated. Please use tf.compat.v1.train.get_global_step instead.






2021-10-20 03:02:07: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/model.py:91: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.






2021-10-20 03:02:07: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/training.py:47: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

2021-10-20 03:02:08: Step 0: Train loss = 2.26, train acc = 19.3, early stopping loss = 2.10, early stopping acc = 5.4 (0.976 sec)





2021-10-20 03:02:08: From /data/code/gnn-lx/ppnp/ppnp/tensorflow/model.py:94: The name tf.trainable_variables is deprecated. Please use tf.compat.v1.trainable_variables instead.

2021-10-20 03:02:09: Step 20: Train loss = 1.95, train acc = 65.7, early stopping loss = 1.96, early stopping acc = 48.8 (1.013 sec)
2021-10-20 03:02:10: Step 40: Train loss = 1.90, train acc = 68.6, early stopping loss = 1.94, early stopping acc = 53.0 (0.837 sec)
2021-10-20 03:02:11: Step 60: Train loss = 1.83, train acc = 82.9, early stopping loss = 1.92, early stopping acc = 62.2 (0.829 sec)
2021-10-20 03:02:12: Step 80: Train loss = 1.77, train acc = 89.3, early stopping loss = 1.86, early stopping acc = 75.2 (0.831 sec)
2021-10-20 03:02:12: Step 100: Train loss = 1.68, train acc = 92.1, early stopping loss = 1.81, early stopping acc = 75.2 (0.825 sec)
2021-10-20 03:02:13: Step 120: Train loss = 1.64, train acc = 97.1, early stopping loss = 1.74, early stopping acc = 80.8 (0.822 sec)
2021-10-20 03:02:14: 