# Evaluating Jumping Knowledge Networks on Citeseer and Cora
Here I try to replicate the evaluation of JK Networks described in [Xu et al.](https://arxiv.org/abs/1806.03536). Xu and colleagues first test GCNs and GATs on the Citeseer and Cora datasets. They also test adding Jumping Knowledge aggregation to the GCN, using LSTM, max-pooling, and concatenation as the aggregation methods. I will also test a simple MLP as a baseline. Xu and colleagues vary the number of layers from 1 to 6 (using a hidden layer size of 16 or 32), choose the best-performing model on the validation set, and then compare the best models on the test set. When testing, I will use 3 different splits to report the mean and standard deviation of test accuracy.
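To make the aggregation step concrete, here is a minimal, framework-free sketch of the concatenation and max-pooling aggregators on plain Python lists. The function names `jk_concat` and `jk_max` and the toy numbers are my own illustrations, not part of the paper's code or of `jk_networks`. The LSTM variant is left out because it needs learned attention weights.

```python
# Hypothetical illustration of two Jumping Knowledge aggregators.
# layer_outputs[l][v] is the feature vector of node v after layer l.

def jk_concat(layer_outputs):
    """Concatenate each node's representations from every layer."""
    num_nodes = len(layer_outputs[0])
    return [
        [x for layer in layer_outputs for x in layer[v]]
        for v in range(num_nodes)
    ]

def jk_max(layer_outputs):
    """Element-wise max over layers for each node (all layers need equal width)."""
    num_nodes = len(layer_outputs[0])
    return [
        [max(values) for values in zip(*(layer[v] for layer in layer_outputs))]
        for v in range(num_nodes)
    ]

# Toy input: two layers, two nodes, two features per node.
layer_outputs = [
    [[0.1, 0.9], [0.5, 0.2]],  # node features after layer 1
    [[0.4, 0.3], [0.1, 0.8]],  # node features after layer 2
]
concat_out = jk_concat(layer_outputs)  # each node now has 4 features
max_out = jk_max(layer_outputs)        # each node keeps 2 features
```

Concatenation grows the feature width with depth, while max-pooling keeps it fixed, so only the former changes the input size of the final classifier.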

In [1]:
from jk_networks import utils, models
import torch
from itertools import product
from collections import defaultdict

In [2]:
# import the CiteSeer dataset
from torch_geometric.datasets import Planetoid
citeseer = Planetoid(root='/tmp/CiteSeer', name='CiteSeer')

## Multi-Layer Perceptron

I will now train an MLP on the Citeseer dataset. This should perform worse than the GCN model because it has no access to the graph structure, but it should provide a good baseline for how well a model can do with just the bag-of-words features.

In [3]:
def get_mlp_accuracies(data):
  layer_counts = range(1, 7)
  hidden_sizes = [16, 32]
  val_accuracies = defaultdict(dict)
  # Grid search over depth and width; the loop variables no longer shadow the grids
  for num_layers, hidden_layer_size in product(layer_counts, hidden_sizes):
    mlp_model = models.MLP(data.num_features, [hidden_layer_size] * num_layers, data.num_classes)
    graph = data[0]
    utils.split_data_node_classification(graph, train_ratio=0.6, val_ratio=0.2, manual_seed=42)
    utils.train(mlp_model, graph)
    # Score each configuration on the validation nodes only
    model_acc = utils.test(mlp_model, graph, graph.val_mask)
    val_accuracies[num_layers][hidden_layer_size] = model_acc
  return val_accuracies

In [4]:
get_mlp_accuracies(citeseer)

AttributeError: 'MLP' object has no attribute 'lin'

## Testing GCN
Following [Xu et al.](https://arxiv.org/abs/1806.03536), I will train a series of GCNs without any Jumping Knowledge on the Citeseer dataset. I will test 12 different models, with the number of layers going from 1 to 6 and the number of hidden features in {16, 32}. Note that Xu and colleagues use a 60/20/20 train/validation/test split, which is different from the built-in split of Citeseer.
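As a rough picture of what a 60/20/20 random split involves, the sketch below shuffles node indices with a fixed seed and cuts them into three disjoint lists. The helper `split_indices` is hypothetical and is not the implementation of `utils.split_data_node_classification`. It only mirrors that function's `train_ratio`, `val_ratio`, and seed arguments.

```python
import random

def split_indices(num_nodes, train_ratio=0.6, val_ratio=0.2, seed=42):
    """Shuffle node indices and cut them into train/val/test lists."""
    rng = random.Random(seed)  # local RNG so the split is reproducible
    indices = list(range(num_nodes))
    rng.shuffle(indices)
    n_train = int(train_ratio * num_nodes)
    n_val = int(val_ratio * num_nodes)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

# Toy example with 1000 nodes (not the real Citeseer node count).
train_idx, val_idx, test_idx = split_indices(1000)
```

Changing `seed` gives a different split, which is one way to obtain several splits for reporting a mean and standard deviation.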

In [22]:
def get_gcn_accuracies(data):
  layer_counts = range(1, 7)
  hidden_sizes = [16, 32]
  val_accuracies = defaultdict(dict)
  # Grid search over depth and width; the loop variables no longer shadow the grids
  for num_layers, hidden_layer_size in product(layer_counts, hidden_sizes):
    gcn_model = models.GCN(data.num_features, [hidden_layer_size] * num_layers, data.num_classes)
    graph = data[0]
    utils.split_data_node_classification(graph, train_ratio=0.6, val_ratio=0.2, manual_seed=42)
    utils.train(gcn_model, graph)
    # Score each configuration on the validation nodes only
    model_acc = utils.test(gcn_model, graph, graph.val_mask)
    val_accuracies[num_layers][hidden_layer_size] = model_acc
  return val_accuracies

In [23]:
import pandas as pd
val_accuracies = get_gcn_accuracies(citeseer)

In [11]:
val_accuracies = pd.DataFrame(val_accuracies)
val_accuracies.index.name = 'Hidden Layer Size'
val_accuracies.columns.name = 'Number of Layers'
val_accuracies.round(3)

| Hidden Layer Size / Number of Layers | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 32 | 0.714 | 0.565 | 0.585 | 0.517 | 0.226 | 0.217 |
| 16 | 0.668 | 0.582 | 0.606 | 0.356 | 0.197 | 0.206 |


### Results
We can see that the 1-layer network with 32 hidden features performs best on the validation set. I will retrain that model on the combined training and validation sets.
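The same choice can be made programmatically rather than by reading the table. The sketch below copies the validation accuracies from the table above into a nested dict shaped like the `defaultdict` returned by `get_gcn_accuracies` and takes the maximum. The variable names are illustrative.

```python
# Validation accuracies copied from the table above,
# keyed as {num_layers: {hidden_layer_size: accuracy}}.
val_acc = {
    1: {32: 0.714, 16: 0.668},
    2: {32: 0.565, 16: 0.582},
    3: {32: 0.585, 16: 0.606},
    4: {32: 0.517, 16: 0.356},
    5: {32: 0.226, 16: 0.197},
    6: {32: 0.217, 16: 0.206},
}

# Flatten to (layers, hidden, accuracy) triples and keep the most accurate one.
best_layers, best_hidden, best_acc = max(
    (
        (layers, hidden, acc)
        for layers, by_hidden in val_acc.items()
        for hidden, acc in by_hidden.items()
    ),
    key=lambda triple: triple[2],
)
print(best_layers, best_hidden, best_acc)
```

Selecting on validation accuracy alone keeps the test set untouched until the final comparison.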

In [16]:
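# Best configuration from the validation sweep: 1 layer, 32 hidden features
best_gcn_model = models.GCN(citeseer.num_features, [32], citeseer.num_classes)
def train_best_model(model, data):
  """Retrain on the train and validation nodes, then return test accuracy."""
  graph = data[0]
  graph.train_mask = torch.logical_or(graph.train_mask, graph.val_mask) # train on train+val
  utils.train(model, graph)
  return utils.test(model, graph, graph.test_mask)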

In [19]:
best_gcn_model_acc = train_best_model(best_gcn_model, citeseer)
best_gcn_model_acc

0.688

### Results Continued
As we can see, the plain GCN reaches only about 69% accuracy on the Citeseer dataset. This differs from the paper, which reports a GCN accuracy of 77.3% on Citeseer. The gap could be due to some of the [preprocessing](https://github.com/pyg-team/pytorch_geometric/issues/2018) that torch_geometric applies to Citeseer.