# 1. Example of training GNNs for Graph Classification

GRB provides easy-to-use APIs for training GNNs that cover the entire process, from loading graph data and building GNN models to inference and evaluation. This example walks through the task of graph classification.

Contents
- [Load Dataset](#Load-Dataset)
- [Build Model](#Build-Model)
- [Training](#Training)
- [Inference](#Inference)
- [Evaluation](#Evaluation)

In [1]:
import os
import torch
import grb.utils as utils

## 1.1. Load Dataset

In [3]:
from grb.dataset import CogDLDataset

dataset_name = "mutag"
dataset = CogDLDataset(name=dataset_name,
                       data_dir="../../data/")

Dataset 'mutag' loaded.
    Number of graphs: 188
    Number of nodes (maximum): 28
    Number of edges (maximum): 66
    Number of features: 7
    Number of classes: 2
    Number of train samples: 150
    Number of val samples: 18
    Number of test samples: 20


## 1.2. Build Model

GRB supports models based on pure PyTorch, CogDL, or DGL. The following is an example of GCNGC (GCN for Graph Classification) implemented in pure PyTorch. Other models can be found in ``grb/model/torch``, ``grb/model/cogdl``, or ``grb/model/dgl``.

### 1.2.1. GCNGC (Graph Convolutional Network for Graph Classification)

In [4]:
from grb.model.torch import GCNGC

model_name = "gcngc"
model = GCNGC(in_features=dataset.num_features,
              out_features=dataset.num_classes,
              hidden_features=64,
              n_layers=3,  # printed below as 2 GCNConv layers plus 1 final Linear layer
              residual=False,
              dropout=0.5)
print("Number of parameters: {}.".format(utils.get_num_params(model)))
print(model)

Number of parameters: 4802.
GCNGC(
  (layers): ModuleList(
    (0): GCNConv(
      (linear): Linear(in_features=7, out_features=64, bias=True)
      (dropout): Dropout(p=0.5, inplace=False)
    )
    (1): GCNConv(
      (linear): Linear(in_features=64, out_features=64, bias=True)
      (dropout): Dropout(p=0.5, inplace=False)
    )
  )
  (linear): Linear(in_features=64, out_features=2, bias=True)
  (dropout): Dropout(p=0.5, inplace=False)
)
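
The reported parameter count can be checked by hand from the layer shapes in the printout above. The following is a minimal sketch in plain Python; `linear_params` is a hypothetical helper written for this illustration, not part of GRB, and it assumes that only the three `Linear` modules carry trainable parameters, which is consistent with the printed total.

```python
def linear_params(in_features, out_features, bias=True):
    """Trainable values in a dense layer: one weight per input/output pair, plus biases."""
    return in_features * out_features + (out_features if bias else 0)


# (in_features, out_features) of each Linear module shown in the printout
layer_shapes = [(7, 64), (64, 64), (64, 2)]

per_layer = [linear_params(i, o) for i, o in layer_shapes]
total = sum(per_layer)

print(per_layer)  # [512, 4160, 130]
print(total)      # 4802
```

The total matches the value returned by `utils.get_num_params(model)`.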


## 1.3. Training

GRB provides ``grb.trainer.trainer`` to facilitate GNN training. For the graph classification task, mini-batch training over graphs is applied: multiple graphs are merged into one large graph, and the results are then pooled to predict a label for each graph.
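
The merge-then-pool idea can be sketched without any framework. The example below is an illustration under stated assumptions, not GRB's implementation: `merge_graphs` and `mean_pool` are hypothetical helpers, graphs are given as `(num_nodes, edges)` tuples, and mean pooling is assumed because the text does not say which pooling operator GRB uses.

```python
def merge_graphs(graphs):
    """Merge several graphs into one disjoint graph.

    Each graph is (num_nodes, edges), where edges holds (src, dst) pairs
    in that graph's local node numbering.
    """
    merged_edges = []
    graph_index = []  # graph_index[i] is the id of the graph that node i came from
    offset = 0
    for graph_id, (num_nodes, edges) in enumerate(graphs):
        # Shift local node ids so they stay unique in the merged graph
        merged_edges.extend((src + offset, dst + offset) for src, dst in edges)
        graph_index.extend([graph_id] * num_nodes)
        offset += num_nodes
    return merged_edges, graph_index


def mean_pool(node_values, graph_index, num_graphs):
    """Average the node vectors of each graph (assumes every graph has at least one node)."""
    dim = len(node_values[0])
    sums = [[0.0] * dim for _ in range(num_graphs)]
    counts = [0] * num_graphs
    for vector, graph_id in zip(node_values, graph_index):
        counts[graph_id] += 1
        for k, value in enumerate(vector):
            sums[graph_id][k] += value
    return [[s / counts[g] for s in sums[g]] for g in range(num_graphs)]


# Two toy graphs: a 3-node path and a 2-node edge
graphs = [(3, [(0, 1), (1, 2)]), (2, [(0, 1)])]
edges, graph_index = merge_graphs(graphs)

# Made-up node outputs, one 2-dimensional vector per node of the merged graph
node_values = [[1.0, 0.0], [3.0, 0.0], [5.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
pooled = mean_pool(node_values, graph_index, num_graphs=len(graphs))

print(edges)        # [(0, 1), (1, 2), (3, 4)]
print(graph_index)  # [0, 0, 0, 1, 1]
print(pooled)       # [[3.0, 0.0], [0.0, 3.0]]
```

Because the merged graph has no edges between the original graphs, message passing never mixes information across samples, and the `graph_index` vector is all that is needed to recover one prediction per graph.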

In [5]:
save_dir = "./saved_models/{}/{}".format(dataset_name, model_name)
save_name = "model.pt"
device = "cuda:0" if torch.cuda.is_available() else "cpu"  # fall back to CPU when no GPU is available
batch_size = 20

In [6]:
from grb.trainer.trainer import GraphTrainer

trainer = GraphTrainer(dataset=dataset, 
                       batch_size=batch_size,
                       optimizer=torch.optim.Adam(model.parameters(), lr=0.01),
                       loss=torch.nn.functional.cross_entropy,
                       lr_scheduler=False,
                       early_stop=True,
                       early_stop_patience=50,
                       device=device)

In [7]:
trainer.train(model=model,
              n_epoch=200,
              eval_every=1,
              save_after=0,
              save_dir=save_dir,
              save_name=save_name,
              verbose=False)

Training finished. Best validation score: 1.0000
Training runtime: 12.8183.
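
The `early_stop` and `early_stop_patience` arguments suggest patience-based early stopping. The source does not show GRB's exact stopping criterion, so the following is a generic sketch of the technique: `early_stopping_epoch` is a hypothetical function, the validation scores are made up, and a higher score is assumed to be better.

```python
def early_stopping_epoch(val_scores, patience):
    """Return (stop_epoch, best_score) for patience-based early stopping.

    Training stops at the first epoch where the validation score has not
    improved on the best value for `patience` consecutive evaluations.
    """
    best_score = float("-inf")
    epochs_without_improvement = 0
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            best_score = score
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_score
    # Patience never ran out: training used every epoch
    return len(val_scores) - 1, best_score


# Made-up validation scores, one per epoch
scores = [0.6, 0.7, 0.9, 0.8, 0.85, 0.9, 0.7]
stop_epoch, best = early_stopping_epoch(scores, patience=3)

print(stop_epoch, best)  # 5 0.9
```

With a patience of 50 and `n_epoch=200`, as configured above, a rule of this kind stops a run only after 50 consecutive evaluations without a new best validation score.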


## 1.4. Inference

In [8]:
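# Load the full model object saved during training
# (on PyTorch >= 2.6, pass weights_only=False to torch.load to unpickle a full model)
model = torch.load(os.path.join(save_dir, save_name))
model = model.to(device)
model.eval()  # switch to evaluation mode, which disables dropout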

GCNGC(
  (layers): ModuleList(
    (0): GCNConv(
      (linear): Linear(in_features=7, out_features=64, bias=True)
      (dropout): Dropout(p=0.5, inplace=False)
    )
    (1): GCNConv(
      (linear): Linear(in_features=64, out_features=64, bias=True)
      (dropout): Dropout(p=0.5, inplace=False)
    )
  )
  (linear): Linear(in_features=64, out_features=2, bias=True)
  (dropout): Dropout(p=0.5, inplace=False)
)

In [9]:
# Run inference through the trainer
pred = trainer.inference(model)
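
The source does not show what `trainer.inference` returns. Assuming it yields one row of class scores per graph, which is an assumption rather than a documented fact, predicted labels can be obtained by taking the index of the largest score in each row. The sketch below uses made-up scores and a hypothetical helper, `predict_labels`.

```python
def predict_labels(class_scores):
    """Return the index of the highest score in each row (ties go to the first index)."""
    return [max(range(len(row)), key=row.__getitem__) for row in class_scores]


# Made-up scores for three graphs and two classes
example_scores = [[2.1, -0.3], [0.2, 1.7], [-1.0, -0.5]]
labels = predict_labels(example_scores)

print(labels)  # [0, 1, 1]
```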

## 1.5. Evaluation

In [10]:
# Evaluate on the test split through the trainer
test_score = trainer.evaluate(model, dataset.index_test)
print("Test score: {:.4f}".format(test_score))

Test score: 0.9000
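
The source does not name the metric behind the test score. If it is classification accuracy, which is an assumption, then 0.9000 on the 20 test samples corresponds to 18 correct predictions. The sketch below shows the computation with made-up labels; `accuracy` is a hypothetical helper, not a GRB function.

```python
def accuracy(pred_labels, true_labels):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(p == t for p, t in zip(pred_labels, true_labels))
    return correct / len(true_labels)


# Made-up labels for 20 test graphs; the last two predictions are wrong
true_labels = [0] * 10 + [1] * 10
pred_labels = [0] * 10 + [1] * 8 + [0] * 2

score = accuracy(pred_labels, true_labels)
print("Test score: {:.4f}".format(score))  # Test score: 0.9000
```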
