<a href="https://colab.research.google.com/github/priba/tutorial_notebook/blob/main/aerfai_summer_school_2021/AERFAI_summer_school_2021.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Graph Neural Networks for Pattern Recognition
## AERFAI - Online Summer School on Pattern Recognition and Machine Learning

[![View on Github](https://img.shields.io/static/v1.svg?logo=github&label=Repo&message=View%20On%20Github&color=lightgrey)](https://github.com/priba/tutorial_notebook/blob/main/aerfai_summer_school_2021/AERFAI_summer_school_2021.ipynb)

---

Graph Neural Networks (GNNs) have recently gained popularity due to their novel uses in domains such as Computer Vision, Natural Language Processing, Chemistry, Social Networks and Knowledge Graphs. In this tutorial, we cover the basic usage of GNNs for node and graph classification. In particular, we build a GNN from scratch to understand the notation and concepts explained in the session. Later, we use the provided implementations of several GNNs to compare their performance.

We will use [PyTorch](https://pytorch.org/) as our Deep Learning framework and the [Deep Graph Library (DGL)](https://www.dgl.ai/) as our GNN library.

## Prepare environment

Let's start by preparing the environment we will use throughout the session. First we change the runtime to use a GPU; then we install the required libraries.

### Change runtime of notebook to GPU


```
  Select Runtime -> Change runtime type -> select runtime Python 3 and hardware accelerator GPU
```

### Install requirements

The basic libraries that will be used are:

*   [NetworkX](https://networkx.github.io/)
*   [PyTorch](https://pytorch.org/)
*   [DGL](https://www.dgl.ai/)



In [None]:
!pip3 install torch
!pip3 install dgl-cu110

## Get the Data

DGL provides several dataset classes for loading graphs.

See: https://docs.dgl.ai/api/python/dgl.data.html

In the following sections we will see three examples: two for graph classification and one for node classification.

### Mini Graph Classification Dataset (MiniGCDataset)

This dataset is synthetic, so we must define the number of graphs and the range of graph sizes for each set (train, validation and test). Each graph belongs to one of eight classes:

*   class 0 : cycle graph
*   class 1 : star graph
*   class 2 : wheel graph
*   class 3 : lollipop graph
*   class 4 : hypercube graph
*   class 5 : grid graph
*   class 6 : clique graph
*   class 7 : circular ladder graph
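
To get a feel for these eight shapes, here is a minimal sketch that builds one small graph per class with NetworkX. It assumes the classes correspond to the standard NetworkX generators of the same names; the sizes are arbitrary illustration values, not the ones `MiniGCDataset` samples internally.

```python
import networkx as nx

# One small example per class. Sizes are arbitrary illustration values.
examples = {
    0: nx.cycle_graph(6),
    1: nx.star_graph(5),             # 1 hub + 5 leaves
    2: nx.wheel_graph(6),            # 1 hub + a 5-cycle
    3: nx.lollipop_graph(4, 3),      # 4-clique with a 3-node tail
    4: nx.hypercube_graph(3),        # 2**3 nodes
    5: nx.grid_graph([2, 3]),        # 2 x 3 lattice
    6: nx.complete_graph(5),         # clique
    7: nx.circular_ladder_graph(4),  # two 4-cycles joined by rungs
}

# Print class id, number of nodes and number of edges.
for cls, graph in examples.items():
    print(cls, graph.number_of_nodes(), graph.number_of_edges())
```

Classes that differ mainly in edge density (for example cycle versus clique) are easy to tell apart from node degrees alone, which is why the models later in this notebook can use the degree as the initial node feature.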

In [None]:
import dgl
import networkx as nx
import matplotlib.pyplot as plt 

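# DGL uses the PyTorch backend by default. To select another backend, set the
# DGLBACKEND environment variable before importing dgl.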

# 100 graphs; 23 and 50 bound the number of nodes per graph
dataset = dgl.data.MiniGCDataset(100, 23, 50)
# Each sample is a (DGLGraph, label) pair
g, label = dataset[50]

# Data structure
print(g)

# Node Information (empty in the case of MiniGCDataset)
print(g.ndata)

# Edge information (empty for graphs without edge attributes)
print(g.edata)

# Plot
fig, ax = plt.subplots()
G = g.to_networkx()
nx.draw(G, ax=ax, pos=nx.circular_layout(G))
ax.set_title('Class: {:d}'.format(label))
plt.show()

### Graph kernel dataset

`TUDataset` loads graph kernel benchmark datasets such as ENZYMES, DD, COLLAB and MUTAG.

In [None]:
dataset = dgl.data.TUDataset('MUTAG')
# Each sample is a (DGLGraph, label) pair
g, label = dataset[50]

# Data structure
print(g)

# Node information (node attributes, if the dataset provides any)
print(g.ndata)

# Edge information (edge attributes, if the dataset provides any)
print(g.edata)

# Label
print(label)

# Plot
fig, ax = plt.subplots()
G = g.to_networkx()
nx.draw(G, ax=ax, pos=nx.circular_layout(G))
ax.set_title('Class: {:d}'.format(label.item()))
plt.show()

### Cora Graph Dataset

Cora is a citation network graph.

Nodes represent papers and edges represent citation relationships. Each node has a predefined 1433-dimensional feature vector. The dataset is designed for node classification: the task is to predict the category of each paper.

In [None]:
dataset = dgl.data.CoraGraphDataset()

# The dataset contains a single graph
g = dataset[0]

# Data structure
print(g)

# Node information (features, labels and the train/val/test masks)
print(g.ndata)

# Edge information
print(g.edata)

# Node classification dataset divides the nodes into training, validation and test sets
print(g.ndata['train_mask'])
print(g.ndata['val_mask'])
print(g.ndata['test_mask'])
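
The masks printed above are boolean tensors with one entry per node. As a minimal sketch of how such a mask is typically used, the toy tensors below (hypothetical values, not the Cora data) show that indexing with a mask keeps only the selected nodes.

```python
import torch

# Toy stand-ins for node features, labels and a training mask (6 nodes).
features = torch.arange(12, dtype=torch.float32).view(6, 2)
labels = torch.tensor([0, 1, 2, 0, 1, 2])
train_mask = torch.tensor([True, True, False, False, True, False])

# Boolean indexing keeps the rows where the mask is True.
train_feats = features[train_mask]
train_labels = labels[train_mask]

print(train_feats.shape)      # torch.Size([3, 2])
print(train_labels.tolist())  # [0, 1, 1]
```

In node classification the model still sees the whole graph; the mask only restricts which nodes contribute to the loss or to the reported accuracy.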

### Defining a New Dataset Class
PyTorch provides an abstract class representing a dataset, ```torch.utils.data.Dataset```. To create a custom dataset, we need to override two methods:

*   ```__len__``` so that ```len(dataset)``` returns the size of the dataset.
*   ```__getitem__``` to support indexing, so that ```dataset[i]``` returns the i-th sample.
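
The two methods above can be sketched as follows. `ToyCycleDataset` is a hypothetical class made up for illustration: each sample is a directed cycle stored as a plain edge-list tensor, with a toy label. In a real DGL pipeline `__getitem__` would more likely return a `DGLGraph`.

```python
import torch
from torch.utils.data import Dataset

class ToyCycleDataset(Dataset):
    """Hypothetical dataset: sample i is a directed cycle with i + 3 nodes."""

    def __init__(self, num_graphs):
        self.num_graphs = num_graphs

    def __len__(self):
        # Size of the dataset, used by len(dataset) and by DataLoader.
        return self.num_graphs

    def __getitem__(self, idx):
        if not 0 <= idx < self.num_graphs:
            raise IndexError(idx)
        n = idx + 3
        src = torch.arange(n)
        dst = (src + 1) % n              # node k points to node k + 1 (mod n)
        edges = torch.stack([src, dst])  # shape (2, n)
        label = n % 2                    # toy label: parity of the node count
        return edges, label

toy_dataset = ToyCycleDataset(5)
print(len(toy_dataset))       # 5
edges, toy_label = toy_dataset[1]
print(edges.shape, toy_label)  # torch.Size([2, 4]) 0
```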




## Prepare DataLoader

```torch.utils.data.DataLoader``` is an iterable that provides:


*   Data batching
*   Shuffling the data
*   Parallel data loading

In our specific case, we need to deal with graphs of different sizes. Hence, we define a new collate function that uses ```dgl.batch```.

In this example we will work with the MiniGCDataset.

In [None]:
import torch
from torch.utils.data import DataLoader

def collate(samples):
    # The input `samples` is a list of pairs
    #  (graph, label).
    graphs, labels = map(list, zip(*samples))
    batched_graph = dgl.batch(graphs)
    return batched_graph, torch.tensor(labels)

# Define the corresponding subsets for train, validation and test.
trainset = dgl.data.MiniGCDataset(1000, 23, 50)
validset = dgl.data.MiniGCDataset(100, 23, 100)
testset = dgl.data.MiniGCDataset(500, 23, 100)

# Define the three dataloaders. Train data will be shuffled at each epoch
train_loader = DataLoader(trainset, batch_size=32, shuffle=True,
                         collate_fn=collate)
valid_loader = DataLoader(validset, batch_size=32, collate_fn=collate)
test_loader = DataLoader(testset, batch_size=32, collate_fn=collate)

## Define GNN model

We have two options.

*   To define our own GNN layers (see next section)
*   To use predefined GNN layers (skip to NN Modules)




### Define our own Model

To define a Graph Convolution, three functions have to be defined:




*   Message: Decide which information is sent by a node
*   Reduce: Combine the messages and the current data
*   NodeApply: Update the node features using the output of the reduce function
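
The three steps above can be sketched without DGL, using plain PyTorch tensors on a toy graph. This is only an illustration under assumptions: the edge list, feature values and layer sizes are made up, the reduce step is a sum, and the update concatenates each node's own feature with its aggregated message.

```python
import torch

# Toy directed graph with 3 nodes and edges 0->1, 0->2, 1->2, 2->0.
src = torch.tensor([0, 0, 1, 2])
dst = torch.tensor([1, 2, 2, 0])
h = torch.tensor([[1.0], [2.0], [3.0]])  # one feature per node

# Message: every edge carries the feature of its source node.
messages = h[src]  # shape (num_edges, 1)

# Reduce: sum the messages arriving at each destination node.
m = torch.zeros_like(h).index_add_(0, dst, messages)

# NodeApply: concatenate own feature and aggregated message,
# then apply a linear layer followed by a non-linearity.
linear = torch.nn.Linear(2, 4)
h_new = torch.relu(linear(torch.cat([h, m], dim=1)))

print(m.squeeze(1).tolist())  # [3.0, 1.0, 3.0]
print(h_new.shape)            # torch.Size([3, 4])
```

Node 2 receives messages from nodes 0 and 1, so its aggregated value is 1 + 2 = 3, while node 1 only hears from node 0.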

In [None]:
import torch
import dgl.function as fn
import torch.nn as nn
import torch.nn.functional as F

# Built-in message function: send the source node feature 'h' as message 'm'.
msg = fn.copy_u(u='h', out='m')

# Equivalent user-defined message function, shown for reference.
def message_func(edges):
    return {'m': edges.src['h']}

def reduce(nodes):
  """Sum the incoming neighbor messages and store the result in 'm'."""
  accum = torch.sum(nodes.mailbox['m'], 1)
  return {'m': accum}

class NodeApplyModule(nn.Module):
  """Update the node feature hv with activation(W[hv, mv] + b), where mv is the aggregated message."""
  def __init__(self, in_feats, out_feats, activation):
    super(NodeApplyModule, self).__init__()
    self.linear = nn.Linear(in_feats, out_feats)
    self.activation = activation

  def forward(self, node):
    h = torch.cat([node.data['h'], node.data['m']],1)
    h = self.linear(h)
    h = self.activation(h)
    return {'h' : h}  
  
class GCN(nn.Module):
  """Define a GCN layer"""
  def __init__(self, in_feats, out_feats, activation):
    super(GCN, self).__init__()
    self.apply_mod = NodeApplyModule(2*in_feats, out_feats, activation)
    
  def forward(self, g, feature):
    # Initialize the node features with h.
    g.ndata['h'] = feature
    g.update_all(msg, reduce)
    g.apply_nodes(func=self.apply_mod)
    return g.ndata.pop('h')
  
class Net(nn.Module):
  def __init__(self, in_dim, hidden_dim, n_classes):
    super(Net, self).__init__()
    self.layers = nn.ModuleList([
        GCN(in_dim, hidden_dim, F.relu),
        GCN(hidden_dim, hidden_dim, F.relu)])
    self.classify = nn.Linear(hidden_dim, n_classes)
    
  def forward(self, g):
    # For undirected graphs, in_degree is the same as
    # out_degree.
    h = g.in_degrees().view(-1, 1).float()
    
    if torch.cuda.is_available():
      h = h.cuda() 
    for conv in self.layers:
      h = conv(g, h)
    g.ndata['h'] = h
    hg = dgl.mean_nodes(g, 'h')
    return self.classify(hg)

### NN Modules

Several GNN layers are provided by default as [NN Modules](https://docs.dgl.ai/api/python/nn.pytorch.html#graphconv).

**Choose the network to use, either with NN Modules or defining it as it has been shown before.**

In [None]:
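# Repeated imports so this cell also works when the previous section is skipped
import torch
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn.pytorch import GraphConv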

class Net(nn.Module):
  def __init__(self, in_dim, hidden_dim, n_classes):
    super(Net, self).__init__()
    self.layers = nn.ModuleList([
        GraphConv(in_dim, hidden_dim, activation=F.relu),
        GraphConv(hidden_dim, hidden_dim, activation=F.relu)])
    self.classify = nn.Linear(hidden_dim, n_classes)
    
  def forward(self, g):
    # Use node degree as the initial node feature. For undirected graphs, the in-degree
    # is the same as the out_degree.
    h = g.in_degrees().view(-1, 1).float()
    
    if torch.cuda.is_available():
      h = h.cuda() 
      
    for conv in self.layers:
      h = conv(g, h)
      
    g.ndata['h'] = h
    hg = dgl.mean_nodes(g, 'h')
    return self.classify(hg)

## Training setup

In [None]:
import torch
import torch.optim as optim

model = Net(1, 256, trainset.num_classes)
if torch.cuda.is_available():
  model = model.cuda()
loss_func = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)
model.train()

epoch_losses = []
for epoch in range(80):
  epoch_loss = 0
  # `step` counts the batches (avoids shadowing the built-in `iter`)
  for step, (bg, label) in enumerate(train_loader):
    if torch.cuda.is_available():
      label = label.cuda()
      bg = bg.to(label.device)
    prediction = model(bg)
    loss = loss_func(prediction, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    epoch_loss += loss.item()
  # Average loss over the batches of this epoch
  epoch_loss /= (step + 1)
  print('Epoch {}, loss {:.4f}'.format(epoch, epoch_loss))
  epoch_losses.append(epoch_loss)

## Evaluation

In [None]:
def accuracy(output, target):
  """Accuracy (in percent) given a batch of logits and the target classes
  """
  # argmax over the class dimension also works for a batch with one sample
  pred = output.argmax(dim=1)
  correct = (pred == target).float()
  return correct.sum() * 100.0 / correct.shape[0]


model.eval()
acc = 0
with torch.no_grad():
  for bg, label in test_loader:
    if torch.cuda.is_available():
        label = label.cuda()
        bg = bg.to(label.device)
    prediction = model(bg)
    acc += accuracy(prediction, label) * label.shape[0]
acc = acc/len(testset)

print('Test accuracy {:.4f}'.format(acc))

Test accuracy 87.6000
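
The evaluation loop weights each batch accuracy by its batch size before dividing by the dataset size. The following sketch, with made-up logits and a hypothetical helper name `batch_accuracy`, illustrates why: when the last batch is smaller, a plain mean of per-batch accuracies differs from the true overall accuracy.

```python
import torch

def batch_accuracy(output, target):
    """Accuracy in percent for one batch of logits (illustrative helper)."""
    pred = output.argmax(dim=1)
    return (pred == target).float().mean().item() * 100.0

# Batch 1: 4 samples, 3 predicted correctly.
logits_1 = torch.tensor([[2.0, 0.1, 0.3],
                         [0.2, 1.5, 0.1],
                         [0.9, 0.8, 0.1],
                         [0.1, 0.2, 3.0]])
target_1 = torch.tensor([0, 1, 1, 2])

# Batch 2: 2 samples, both predicted correctly.
logits_2 = torch.tensor([[0.1, 0.2, 0.9],
                         [1.2, 0.3, 0.1]])
target_2 = torch.tensor([2, 0])

batches = [(logits_1, target_1), (logits_2, target_2)]
total = sum(len(t) for _, t in batches)

# Weighted by batch size, as in the evaluation loop.
weighted = sum(batch_accuracy(o, t) * len(t) for o, t in batches) / total
# Plain mean of the per-batch accuracies.
unweighted = sum(batch_accuracy(o, t) for o, t in batches) / len(batches)

print('{:.2f} {:.2f}'.format(weighted, unweighted))  # 83.33 87.50
```

The weighted value matches 5 correct predictions out of 6 samples, which is the quantity we actually want to report.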


## Plot results

In [None]:
from random import randrange
import matplotlib.pyplot as plt

# Run on whichever device the model lives on (GPU if available, otherwise CPU)
device = next(model.parameters()).device
model.eval()
for i in range(10):
  index = randrange(len(testset))
  g, label = testset[index]
  with torch.no_grad():
    pred = model(g.to(device))
  _, pred = pred.topk(1)
  G = g.to_networkx()
  plt.figure(i)
  nx.draw(G, pos=nx.circular_layout(G), arrows=False)
  plt.show()
  print('Label {}; Prediction {}'.format(int(label), pred.item()))