# Benchmarking study

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/samuelbarrett1234/atml-group-11/blob/dfalck-experiments/damon/notebooks/experiments.ipynb)


## Colab admin

To run in Colab, switch to a GPU runtime, fill in your Git username and access token in the first cell, execute the two cells below it, then restart the runtime.

In [16]:
USER = ...
TOKEN = ...

In [None]:
import torch
TORCH = torch.__version__.split("+")[0]
CUDA = "cu"+torch.version.cuda.replace(".","")

%pip install torch-scatter \
             torch-sparse \
             torch-cluster \
             torch-spline-conv \
             -f https://pytorch-geometric.com/whl/torch-{TORCH}+{CUDA}.html
%pip install torch-geometric 
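
The install cell above builds the wheel index URL from the running torch build. The following is a minimal sketch of that string handling in plain Python. The helper name `wheel_index_url` is hypothetical, and the version strings in the example are illustrative values, not ones recorded in this notebook.

```python
def wheel_index_url(torch_version: str, cuda_version: str) -> str:
    """Mirror the install cell: drop the local build tag, compact the CUDA version."""
    torch_tag = torch_version.split("+")[0]          # "1.10.0+cu111" -> "1.10.0"
    cuda_tag = "cu" + cuda_version.replace(".", "")  # "11.1" -> "cu111"
    return f"https://pytorch-geometric.com/whl/torch-{torch_tag}+{cuda_tag}.html"


# Illustrative inputs only; in the notebook these come from torch itself.
print(wheel_index_url("1.10.0+cu111", "11.1"))
```

Note that `torch.version.cuda` is `None` on CPU-only builds, so the install cell raises an `AttributeError` unless a GPU runtime is selected first.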

In [None]:
# %pip install --no-cache cloud-tpu-client==0.10 torch==1.10 https://storage.googleapis.com/tpu-pytorch/wheels/colab/1.10/torch_xla-1.10-cp37-cp37m-linux_x86_64.whl
!git clone -b dfalck-experiments https://{USER}:{TOKEN}@github.com/samuelbarrett1234/atml-group-11.git
%pip install atml-group-11/damon/src

### To reload `oxgat`

If the repo has changed, run the following to update the package, then restart the runtime so the new version is imported:

In [None]:
%cd atml-group-11
!git pull origin dfalck-experiments
%cd ..
%pip uninstall -y oxgat
%pip install atml-group-11/damon/src

## Setup

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [1]:
import oxgat
import torch
from torch_geometric.datasets import Planetoid, Amazon, CoraFull, Coauthor
import torch_geometric.utils
from torch_geometric.transforms import NormalizeFeatures
from ogb.graphproppred import PygGraphPropPredDataset
import json

**At any point run the following cell to visualise logged training progress:**

In [None]:
%reload_ext tensorboard
%tensorboard --logdir=lightning_logs/

## Load data

In [2]:
torch.manual_seed(0)

# Transductive node classification
cora = Planetoid("./data", name="Cora", transform=NormalizeFeatures())
citeseer = Planetoid("./data", name="CiteSeer", transform=NormalizeFeatures())
pubmed = Planetoid("./data", name="PubMed", transform=NormalizeFeatures())
corafull = CoraFull("./data", transform=NormalizeFeatures())
photo = Amazon("./data", name="Photo", transform=NormalizeFeatures())
computers = Amazon("./data", name="Computers", transform=NormalizeFeatures())
compsci = Coauthor("./data", name="CS", transform=NormalizeFeatures())
physics = Coauthor("./data", name="Physics", transform=NormalizeFeatures())

# # Graph classification
# molhiv = PygGraphPropPredDataset(name="ogbg-molhiv")
# split_idx = molhiv.get_idx_split()
# molhiv_train = molhiv[split_idx["train"]]
# molhiv_val = molhiv[split_idx["valid"]]
# molhiv_test = molhiv[split_idx["test"]]

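Add train-val-test splits to datasets that don't already have them (up to 20 training nodes per class, then 500 validation and 1000 test nodes drawn from the remainder):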

In [3]:
def add_masks(data):
    num_nodes = data.x.size(0)
    num_classes = data.y.unique().size(0)

    train_mask = torch.zeros(num_nodes, dtype=torch.bool)
    for c in range(num_classes):
        relevant_nodes = (data.y == c).nonzero().flatten()
        n = min(relevant_nodes.size(0), 20)
        to_include = relevant_nodes[torch.randperm(relevant_nodes.size(0))[:n]]
        mask = torch_geometric.utils.index_to_mask(to_include, size=num_nodes)
        train_mask = torch.logical_or(train_mask, mask)

    unused_nodes = torch.logical_not(train_mask).nonzero().flatten()
    val_nodes = unused_nodes[torch.randperm(unused_nodes.size(0))[:500]]
    val_mask = torch_geometric.utils.index_to_mask(val_nodes, size=num_nodes)

    unused_nodes = torch.logical_not(torch.logical_or(train_mask, val_mask)).nonzero().flatten()
    test_nodes = unused_nodes[torch.randperm(unused_nodes.size(0))[:1000]]
    test_mask = torch_geometric.utils.index_to_mask(test_nodes, size=num_nodes)

    data.train_mask = train_mask
    data.val_mask = val_mask
    data.test_mask = test_mask
    return data

corafull = [add_masks(corafull[0])]
photo = [add_masks(photo[0])]
computers = [add_masks(computers[0])]
compsci = [add_masks(compsci[0])]
physics = [add_masks(physics[0])]
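
The split logic can also be sketched without torch, which makes it easy to check that the three sets never overlap. This is an illustration under assumptions, not the notebook's implementation: `make_split` is a hypothetical name, labels are a plain Python list, and a single shuffle replaces the two separate `torch.randperm` draws.

```python
import random


def make_split(labels, per_class=20, num_val=500, num_test=1000, seed=0):
    """Return disjoint (train, val, test) index sets for a list of class labels."""
    rng = random.Random(seed)

    # Training set: up to `per_class` randomly chosen nodes from every class.
    train = set()
    for c in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == c]
        train.update(rng.sample(members, min(per_class, len(members))))

    # Validation and test sets: drawn without replacement from what is left.
    remaining = [i for i in range(len(labels)) if i not in train]
    rng.shuffle(remaining)
    val = set(remaining[:num_val])
    test = set(remaining[num_val:num_val + num_test])
    return train, val, test


labels = [i % 4 for i in range(2000)]  # toy labels: 4 balanced classes
train, val, test = make_split(labels)
assert not (train & val) and not (train & test) and not (val & test)
print(len(train), len(val), len(test))
```

The same disjointness check is worth running on the boolean masks themselves, for example by confirming that no node is set in both `val_mask` and `test_mask`.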

## GAT

In [None]:
gat_results = {}

In [None]:
# Cora
cora_model = oxgat.models.CustomNodeClassifier(in_features=1433,
                                               num_classes=7)
cora_model.standard_train(cora, use_gpu=True)
cora_model.standard_test(cora)
gat_results["cora"] = cora_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Missing logger folder: /content/lightning_logs


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_0/checkpoints/epoch=637-step=637.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_0/checkpoints/epoch=637-step=637.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [None]:
# CiteSeer
citeseer_model = oxgat.models.CustomNodeClassifier(in_features=3703,
                                                   num_classes=6)
citeseer_model.standard_train(citeseer, use_gpu=True)
citeseer_model.standard_test(citeseer)
gat_results["citeseer"] = citeseer_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_2/checkpoints/epoch=760-step=760.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_2/checkpoints/epoch=760-step=760.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [None]:
# PubMed
pubmed_model = oxgat.models.CustomNodeClassifier(in_features=500,
                                                 num_classes=3)
pubmed_model.standard_train(pubmed, use_gpu=True)
pubmed_model.standard_test(pubmed)
gat_results["pubmed"] = pubmed_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_4/checkpoints/epoch=288-step=288.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_4/checkpoints/epoch=288-step=288.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [None]:
# CoraFull
corafull_model = oxgat.models.CustomNodeClassifier(in_features=8710,
                                                   num_classes=70)
corafull_model.standard_train(corafull, use_gpu=True)
corafull_model.standard_test(corafull)
gat_results["corafull"] = corafull_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_6/checkpoints/epoch=566-step=566.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_6/checkpoints/epoch=566-step=566.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [None]:
# Photo
photo_model = oxgat.models.CustomNodeClassifier(in_features=745,
                                                num_classes=8)
photo_model.standard_train(photo, use_gpu=True)
photo_model.standard_test(photo)
gat_results["photo"] = photo_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_8/checkpoints/epoch=880-step=880.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_8/checkpoints/epoch=880-step=880.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [None]:
# Computers
computers_model = oxgat.models.CustomNodeClassifier(in_features=767,
                                                    num_classes=10)
computers_model.standard_train(computers, use_gpu=True)
computers_model.standard_test(computers)
gat_results["computers"] = computers_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_10/checkpoints/epoch=1216-step=1216.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_10/checkpoints/epoch=1216-step=1216.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [None]:
# Compsci
compsci_model = oxgat.models.CustomNodeClassifier(in_features=6805,
                                                  num_classes=15)
compsci_model.standard_train(compsci, use_gpu=True)
compsci_model.standard_test(compsci)
gat_results["compsci"] = compsci_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_12/checkpoints/epoch=803-step=803.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_12/checkpoints/epoch=803-step=803.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [None]:
# Physics
physics_model = oxgat.models.CustomNodeClassifier(in_features=8415,
                                                  num_classes=5)
physics_model.standard_train(physics, use_gpu=True)
physics_model.standard_test(physics)
gat_results["physics"] = physics_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_14/checkpoints/epoch=653-step=653.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_14/checkpoints/epoch=653-step=653.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [None]:
# Molhiv
# molhiv_model = oxgat.models.CustomGraphClassifier(in_features=9,
#                                                   num_classes=2,
#                                                   cast_to_float=True)
# molhiv_model.standard_train(molhiv_train, molhiv_val, use_gpu=True)
# molhiv_model.standard_test(molhiv_test)

In [None]:
with open("/content/drive/MyDrive/Basket/ATML/gat_results.json", "w") as f:
    json.dump(gat_results, f)

## GAT with 2-hop neighbourhoods

In [5]:
gatd2_results = {}

In [8]:
# Cora
cora_model = oxgat.models.CustomNodeClassifier(in_features=1433,
                                               num_classes=7,
                                               neighbourhood_depth=2,
                                               sparse=False)
cora_model.standard_train(cora, use_gpu=True)
cora_model.standard_test(cora)
gatd2_results["cora"] = cora_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_16/checkpoints/epoch=409-step=409.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_16/checkpoints/epoch=409-step=409.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [9]:
# CiteSeer
citeseer_model = oxgat.models.CustomNodeClassifier(in_features=3703,
                                                   num_classes=6,
                                                   neighbourhood_depth=2,
                                                   sparse=False)
citeseer_model.standard_train(citeseer, use_gpu=True)
citeseer_model.standard_test(citeseer)
gatd2_results["citeseer"] = citeseer_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_18/checkpoints/epoch=592-step=592.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_18/checkpoints/epoch=592-step=592.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [None]:
# PubMed
pubmed_model = oxgat.models.CustomNodeClassifier(in_features=500,
                                                 num_classes=3,
                                                 neighbourhood_depth=2,
                                                 sparse=False)
pubmed_model.standard_train(pubmed, use_gpu=True)
pubmed_model.standard_test(pubmed)
gatd2_results["pubmed"] = pubmed_model.final_metrics

In [None]:
# CoraFull
corafull_model = oxgat.models.CustomNodeClassifier(in_features=8710,
                                                   num_classes=70,
                                                   neighbourhood_depth=2,
                                                   sparse=False)
corafull_model.standard_train(corafull, use_gpu=True)
corafull_model.standard_test(corafull)
gatd2_results["corafull"] = corafull_model.final_metrics

In [None]:
# Photo
photo_model = oxgat.models.CustomNodeClassifier(in_features=745,
                                                num_classes=8,
                                                neighbourhood_depth=2,
                                                sparse=False)
photo_model.standard_train(photo, use_gpu=True)
photo_model.standard_test(photo)
gatd2_results["photo"] = photo_model.final_metrics

In [None]:
# Computers
computers_model = oxgat.models.CustomNodeClassifier(in_features=767,
                                                    num_classes=10,
                                                    neighbourhood_depth=2,
                                                    sparse=False)
computers_model.standard_train(computers, use_gpu=True)
computers_model.standard_test(computers)
gatd2_results["computers"] = computers_model.final_metrics

In [None]:
# Compsci
compsci_model = oxgat.models.CustomNodeClassifier(in_features=6805,
                                                  num_classes=15,
                                                  neighbourhood_depth=2,
                                                  sparse=False)
compsci_model.standard_train(compsci, use_gpu=True)
compsci_model.standard_test(compsci)
gatd2_results["compsci"] = compsci_model.final_metrics

In [None]:
# Physics
physics_model = oxgat.models.CustomNodeClassifier(in_features=8415,
                                                  num_classes=5,
                                                  neighbourhood_depth=2,
                                                  sparse=False)
physics_model.standard_train(physics, use_gpu=True)
physics_model.standard_test(physics)
gatd2_results["physics"] = physics_model.final_metrics

In [None]:
# Molhiv
# molhiv_model = oxgat.models.CustomGraphClassifier(in_features=9,
#                                                   num_classes=2,
#                                                   cast_to_float=True,
#                                                   neighbourhood_depth=2)
# molhiv_model.standard_train(molhiv_train, molhiv_val, use_gpu=True)
# molhiv_model.standard_test(molhiv_test)

In [17]:
with open("/content/drive/MyDrive/Basket/ATML/gatd2_results.json", "w") as f:
    json.dump(gatd2_results, f)
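
The per-dataset cells above all follow the same train, test, store pattern, so they could be driven by one loop. The sketch below assumes only the three members used in those cells (`standard_train`, `standard_test`, `final_metrics`). `run_benchmarks` and `_StubModel` are hypothetical names, and the stub stands in for a real model so the example runs without `oxgat` or a GPU.

```python
def run_benchmarks(datasets, make_model):
    """Train and test one model per dataset; return {name: final_metrics}."""
    results = {}
    for name, (dataset, in_features, num_classes) in datasets.items():
        model = make_model(in_features=in_features, num_classes=num_classes)
        model.standard_train(dataset, use_gpu=True)
        model.standard_test(dataset)
        results[name] = model.final_metrics
    return results


class _StubModel:
    """Stand-in exposing the same three members as the cells above (illustration only)."""

    def __init__(self, in_features, num_classes):
        self.shape = (in_features, num_classes)
        self.final_metrics = None

    def standard_train(self, dataset, use_gpu=False):
        pass  # a real model would fit here

    def standard_test(self, dataset):
        # Placeholder metrics; a real model would report its evaluation results.
        self.final_metrics = {"in_features": self.shape[0], "num_classes": self.shape[1]}


demo = run_benchmarks({"cora": (None, 1433, 7), "citeseer": (None, 3703, 6)}, _StubModel)
print(sorted(demo))
```

In the notebook, `make_model` could be something like `functools.partial(oxgat.models.CustomNodeClassifier, neighbourhood_depth=2, sparse=False)`, so each variant differs only in the keyword arguments it fixes.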

## GAT without self-loops

In [5]:
gatnsl_results = {}

In [6]:
# Cora
cora_model = oxgat.models.CustomNodeClassifier(in_features=1433,
                                               num_classes=7,
                                               strict_neighbourhoods=True)
cora_model.standard_train(cora, use_gpu=True)
cora_model.standard_test(cora)
gatnsl_results["cora"] = cora_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_25/checkpoints/epoch=723-step=723.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_25/checkpoints/epoch=723-step=723.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [7]:
# CiteSeer
citeseer_model = oxgat.models.CustomNodeClassifier(in_features=3703,
                                                   num_classes=6,
                                                   strict_neighbourhoods=True)
citeseer_model.standard_train(citeseer, use_gpu=True)
citeseer_model.standard_test(citeseer)
gatnsl_results["citeseer"] = citeseer_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_27/checkpoints/epoch=694-step=694.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_27/checkpoints/epoch=694-step=694.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [8]:
# PubMed
pubmed_model = oxgat.models.CustomNodeClassifier(in_features=500,
                                                 num_classes=3,
                                                 strict_neighbourhoods=True)
pubmed_model.standard_train(pubmed, use_gpu=True)
pubmed_model.standard_test(pubmed)
gatnsl_results["pubmed"] = pubmed_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_29/checkpoints/epoch=399-step=399.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_29/checkpoints/epoch=399-step=399.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [9]:
# CoraFull
corafull_model = oxgat.models.CustomNodeClassifier(in_features=8710,
                                                   num_classes=70,
                                                   strict_neighbourhoods=True)
corafull_model.standard_train(corafull, use_gpu=True)
corafull_model.standard_test(corafull)
gatnsl_results["corafull"] = corafull_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_31/checkpoints/epoch=694-step=694.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_31/checkpoints/epoch=694-step=694.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [10]:
# Photo
photo_model = oxgat.models.CustomNodeClassifier(in_features=745,
                                                num_classes=8,
                                                strict_neighbourhoods=True)
photo_model.standard_train(photo, use_gpu=True)
photo_model.standard_test(photo)
gatnsl_results["photo"] = photo_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_33/checkpoints/epoch=991-step=991.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_33/checkpoints/epoch=991-step=991.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [11]:
# Computers
computers_model = oxgat.models.CustomNodeClassifier(in_features=767,
                                                    num_classes=10,
                                                    strict_neighbourhoods=True)
computers_model.standard_train(computers, use_gpu=True)
computers_model.standard_test(computers)
gatnsl_results["computers"] = computers_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_35/checkpoints/epoch=716-step=716.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_35/checkpoints/epoch=716-step=716.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [12]:
# Compsci
compsci_model = oxgat.models.CustomNodeClassifier(in_features=6805,
                                                  num_classes=15,
                                                  strict_neighbourhoods=True)
compsci_model.standard_train(compsci, use_gpu=True)
compsci_model.standard_test(compsci)
gatnsl_results["compsci"] = compsci_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_37/checkpoints/epoch=639-step=639.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_37/checkpoints/epoch=639-step=639.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [13]:
# Physics
physics_model = oxgat.models.CustomNodeClassifier(in_features=8415,
                                                  num_classes=5,
                                                  strict_neighbourhoods=True)
physics_model.standard_train(physics, use_gpu=True)
physics_model.standard_test(physics)
gatnsl_results["physics"] = physics_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_39/checkpoints/epoch=521-step=521.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_39/checkpoints/epoch=521-step=521.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [14]:
# Molhiv
# molhiv_model = oxgat.models.CustomGraphClassifier(in_features=9,
#                                                   num_classes=2,
#                                                   cast_to_float=True)
# molhiv_model.standard_train(molhiv_train, molhiv_val, use_gpu=True)
# molhiv_model.standard_test(molhiv_test)

In [15]:
with open("/content/drive/MyDrive/Basket/ATML/gatnsl_results.json", "w") as f:
    json.dump(gatnsl_results, f)
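
Once several variants have been saved, the JSON files can be read back and lined up per dataset. The sketch below assumes each file holds a `{dataset: {metric: value}}` mapping. The notebook does not show the structure of `final_metrics`, so the metric key `"test_acc"`, the helper names, and the demo numbers are all made up for illustration.

```python
import json
import os
import tempfile


def load_results(paths):
    """Read {variant: path} into {variant: {dataset: metrics}}."""
    loaded = {}
    for variant, path in paths.items():
        with open(path) as f:
            loaded[variant] = json.load(f)
    return loaded


def comparison_rows(loaded, metric):
    """One row per dataset: [dataset, value for each variant (None if absent)]."""
    variants = list(loaded)
    datasets = sorted({d for res in loaded.values() for d in res})
    return [[d] + [loaded[v].get(d, {}).get(metric) for v in variants] for d in datasets]


# Demo with fabricated placeholder numbers written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    fake = {
        "gat": {"cora": {"test_acc": 0.5}},
        "gatnsl": {"cora": {"test_acc": 0.25}, "pubmed": {"test_acc": 0.75}},
    }
    paths = {}
    for variant, res in fake.items():
        paths[variant] = os.path.join(tmp, f"{variant}_results.json")
        with open(paths[variant], "w") as f:
            json.dump(res, f)
    rows = comparison_rows(load_results(paths), "test_acc")

print(rows)
```

A dataset that is missing from one variant's file shows up as `None` in that column, which makes gaps in a run easy to spot.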

## GAT with degree encoding

In [4]:
gatde_results = {}

In [5]:
# Cora
cora_model = oxgat.models.CustomNodeClassifier(in_features=1433,
                                               num_classes=7,
                                               layer_type=oxgat.components.MultiHeadAttentionLayerWithDegrees,
                                               max_degree=oxgat.utils.get_max_degree(cora))
cora_model.standard_train(cora, use_gpu=True)
cora_model.standard_test(cora)
gatde_results["cora"] = cora_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_41/checkpoints/epoch=623-step=623.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_41/checkpoints/epoch=623-step=623.ckpt


Output()

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Output()

  f"The dataloader, {name}, does not have many workers which may be a bottleneck."


In [6]:
# CiteSeer
citeseer_model = oxgat.models.CustomNodeClassifier(in_features=3703,
                                                   num_classes=6,
                                                   layer_type=oxgat.components.MultiHeadAttentionLayerWithDegrees,
                                                   max_degree=oxgat.utils.get_max_degree(citeseer))
citeseer_model.standard_train(citeseer, use_gpu=True)
citeseer_model.standard_test(citeseer)
gatde_results["citeseer"] = citeseer_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_43/checkpoints/epoch=928-step=928.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_43/checkpoints/epoch=928-step=928.ckpt

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

In [7]:
# PubMed
pubmed_model = oxgat.models.CustomNodeClassifier(in_features=500,
                                                 num_classes=3,
                                                 layer_type=oxgat.components.MultiHeadAttentionLayerWithDegrees,
                                                 max_degree=oxgat.utils.get_max_degree(pubmed))
pubmed_model.standard_train(pubmed, use_gpu=True)
pubmed_model.standard_test(pubmed)
gatde_results["pubmed"] = pubmed_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_45/checkpoints/epoch=89-step=89.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_45/checkpoints/epoch=89-step=89.ckpt

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

In [8]:
# CoraFull
corafull_model = oxgat.models.CustomNodeClassifier(in_features=8710,
                                                   num_classes=70,
                                                   layer_type=oxgat.components.MultiHeadAttentionLayerWithDegrees,
                                                   max_degree=oxgat.utils.get_max_degree(corafull))
corafull_model.standard_train(corafull, use_gpu=True)
corafull_model.standard_test(corafull)
gatde_results["corafull"] = corafull_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_47/checkpoints/epoch=1977-step=1977.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_47/checkpoints/epoch=1977-step=1977.ckpt

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

In [9]:
# Photo
photo_model = oxgat.models.CustomNodeClassifier(in_features=745,
                                                num_classes=8,
                                                layer_type=oxgat.components.MultiHeadAttentionLayerWithDegrees,
                                                max_degree=oxgat.utils.get_max_degree(photo))
photo_model.standard_train(photo, use_gpu=True)
photo_model.standard_test(photo)
gatde_results["photo"] = photo_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_49/checkpoints/epoch=640-step=640.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_49/checkpoints/epoch=640-step=640.ckpt

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

In [10]:
# Computers
computers_model = oxgat.models.CustomNodeClassifier(in_features=767,
                                                    num_classes=10,
                                                    layer_type=oxgat.components.MultiHeadAttentionLayerWithDegrees,
                                                    max_degree=oxgat.utils.get_max_degree(computers))
computers_model.standard_train(computers, use_gpu=True)
computers_model.standard_test(computers)
gatde_results["computers"] = computers_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_51/checkpoints/epoch=338-step=338.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_51/checkpoints/epoch=338-step=338.ckpt

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

In [11]:
# Compsci
compsci_model = oxgat.models.CustomNodeClassifier(in_features=6805,
                                                  num_classes=15,
                                                  layer_type=oxgat.components.MultiHeadAttentionLayerWithDegrees,
                                                  max_degree=oxgat.utils.get_max_degree(compsci))
compsci_model.standard_train(compsci, use_gpu=True)
compsci_model.standard_test(compsci)
gatde_results["compsci"] = compsci_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.rich_model_summary.RichModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Restoring states from the checkpoint path at /content/lightning_logs/version_53/checkpoints/epoch=689-step=689.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Loaded model weights from checkpoint at /content/lightning_logs/version_53/checkpoints/epoch=689-step=689.ckpt

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

In [12]:
# Physics
physics_model = oxgat.models.CustomNodeClassifier(in_features=8415,
                                                  num_classes=5,
                                                  layer_type=oxgat.components.MultiHeadAttentionLayerWithDegrees,
                                                  max_degree=oxgat.utils.get_max_degree(physics))
physics_model.standard_train(physics, use_gpu=True)
physics_model.standard_test(physics)
gatde_results["physics"] = physics_model.final_metrics

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

RuntimeError: ignored
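
The Physics run above stopped with a `RuntimeError`, so the `gatde_results["physics"]` assignment at the end of that cell presumably never took effect and the saved results would have no Physics entry. One way to keep such failures visible is to run each dataset through a small wrapper that records the error message instead of leaving the key out. This is only a sketch under assumptions: `run_all` and `train_and_test` are hypothetical names, and `train_and_test` stands in for the model construction, `standard_train` and `standard_test` steps used in the cells above.

```python
def run_all(datasets, train_and_test):
    """Run train_and_test on every dataset, recording failures.

    datasets:       dict mapping a name to a dataset object.
    train_and_test: callable(name, dataset) -> metrics dict (hypothetical).
    """
    results = {}
    for name, dataset in datasets.items():
        try:
            results[name] = train_and_test(name, dataset)
        except RuntimeError as err:
            # Store the failure so it still appears in the saved results.
            results[name] = {"error": str(err)}
    return results


def _demo_train_and_test(name, dataset):
    # Stand-in for real training; fails on one dataset to show the error path.
    if name == "failing":
        raise RuntimeError("simulated failure")
    return {"num_items": len(dataset)}


demo_results = run_all({"small": [1, 2, 3], "failing": [0] * 10},
                       _demo_train_and_test)
```

Because only `RuntimeError` is caught, other exception types (for example a typo raising `NameError`) would still stop the loop immediately.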

In [None]:
# Molhiv
# molhiv_model = oxgat.models.CustomGraphClassifier(in_features=9,
#                                                   num_classes=2,
#                                                   cast_to_float=True)
# molhiv_model.standard_train(molhiv_train, molhiv_val, use_gpu=True)
# molhiv_model.standard_test(molhiv_test)

In [13]:
with open("/content/drive/MyDrive/Basket/ATML/gatde_results.json", "w") as f:
    json.dump(gatde_results, f)
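
The cell above writes `gatde_results` to Drive with `json.dump`. The structure of `final_metrics` is not shown in this notebook; if it ever held values that `json` cannot encode directly (for example scalar tensor-like objects), passing `default=float` would be one way to convert them. The sketch below uses a temporary file, a hypothetical `FakeScalar` class and made-up metric names and values in place of the real path and results, then reads the file back to check the round trip.

```python
import json
import os
import tempfile


class FakeScalar:
    """Hypothetical stand-in for a scalar object that json cannot encode."""

    def __init__(self, value):
        self.value = value

    def __float__(self):
        return float(self.value)


# Made-up key and value, for illustration only.
example_results = {"cora": {"test_accuracy": FakeScalar(0.5)}}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_results.json")
    with open(path, "w") as f:
        # default=float is called only for objects json cannot serialise itself.
        json.dump(example_results, f, default=float)
    with open(path) as f:
        reloaded = json.load(f)
```

Reading the file back straight after writing is a cheap check that the results were stored in a form that later analysis can load.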

## GATv2 normal (weight-sharing, bias)

## GATv2 (weight-sharing, no bias)

## GATv2 (source weights, bias)

## GATv2 (all separate weights, bias)