# PCS5024 - Aprendizado Estatístico - Statistical Learning - 2023/1
### Professors: 
### Anna Helena Reali Costa (anna.reali@usp.br)
### Fabio G. Cozman (fgcozman@usp.br)

In [1]:
#!pip install --quiet torch_geometric torch

In [2]:
import torch
import torch_geometric


# Graph Neural Networks

Graph Neural Networks (GNNs) are a class of deep learning models designed to work with graph-structured data. They have gained significant attention in recent years because they can model complex relationships and patterns in domains such as social networks, molecular structures, and recommendation systems.

GNNs can be seen as a generalization of CNNs, RNNs and Transformers, which operate on a grid graph, a line graph and a fully connected graph, respectively.

<img src='https://miro.medium.com/v2/resize:fit:720/format:webp/1*e8xtqXuqNCBWhzdbF7krtA.png'  width="60%" height="60%">

The central concept in GNNs is the message passing mechanism: each node updates its representation using messages received from its neighbours.

<img src='https://drive.google.com/uc?id=14GojFf-UdTkBtlmKpLmBaHO5FpIwadq1'  width="40%" height="40%">
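
For reference, one common way to write a message passing layer is shown below. The notation is an assumption borrowed from the PyTorch Geometric documentation and may differ from the figure above: $\mathbf{x}_i$ is the representation of node $i$, $\mathcal{N}(i)$ is its set of neighbours, and $\mathbf{e}_{j,i}$ denotes optional features of the edge from $j$ to $i$.

$$\mathbf{x}_i' = \gamma\left(\mathbf{x}_i,\ \bigoplus_{j \in \mathcal{N}(i)} \phi\left(\mathbf{x}_i, \mathbf{x}_j, \mathbf{e}_{j,i}\right)\right)$$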

Let's break it down into its components!

### The message function $\phi$
<img src='https://drive.google.com/uc?id=1OOytQ9n9yf81-lQHXk0Zqnm_NczvxFc7'  width="20%" height="20%">

The message function creates, for each edge, a vector that represents the **message** sent by node $j$ to node $i$.

$\phi$ can be any differentiable function, so it can be trained through SGD. A common choice is an MLP.
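
As a minimal sketch of this idea, the cell below computes one message per edge with a small MLP. The toy graph, the feature sizes, and the decision to feed $\phi$ the concatenation of the target and source features are assumptions made for illustration, not a prescribed design.

```python
import torch

torch.manual_seed(0)

# Toy graph (hypothetical): 4 nodes with 3 features each.
x = torch.randn(4, 3)
# Row 0 holds the source nodes j, row 1 holds the target nodes i.
edge_index = torch.tensor([[0, 1, 2, 3, 0],
                           [1, 0, 1, 2, 3]])
src, dst = edge_index

# phi: a small MLP applied to the concatenation [x_i, x_j] of each edge.
phi = torch.nn.Sequential(
    torch.nn.Linear(2 * 3, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 4),
)

# One 4-dimensional message per edge.
messages = phi(torch.cat([x[dst], x[src]], dim=-1))
print(messages.shape)  # torch.Size([5, 4])
```

Because every operation in `phi` is differentiable, gradients flow from the messages back to its weights.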

### The permutation invariant aggregation $\bigoplus$
<img src='https://drive.google.com/uc?id=1OnO1UQqfa8wbm1ozi9oYuck1w8F-MQIg'  width="6%" height="6%">

Aggregates the messages from all neighbours of a node into a single vector representation.

$\bigoplus$ can be any permutation-invariant operation. Invariance is required because the neighbours of a node have no natural order: permuting them produces an isomorphic graph, so the result must not change. Examples are sum, mean, max and min.
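
The short check below illustrates permutation invariance in plain Python. The message values and the helper names `aggregate` and `concatenate` are hypothetical, chosen only to make the idea visible.

```python
# Hypothetical messages arriving at one node from its three neighbours.
messages = [[1.0, 4.0], [3.0, -2.0], [0.0, 5.0]]
reordered = list(reversed(messages))  # same messages, different order


def aggregate(msgs, op):
    """Combine a list of message vectors feature by feature."""
    columns = list(zip(*msgs))  # group the values of each feature
    if op == "sum":
        return [sum(c) for c in columns]
    if op == "mean":
        return [sum(c) / len(c) for c in columns]
    if op == "max":
        return [max(c) for c in columns]
    if op == "min":
        return [min(c) for c in columns]
    raise ValueError(f"unknown op: {op}")


def concatenate(msgs):
    """Flatten the messages in the order given (NOT permutation invariant)."""
    return [value for msg in msgs for value in msg]


for op in ("sum", "mean", "max", "min"):
    # The neighbour order does not change the result.
    assert aggregate(messages, op) == aggregate(reordered, op)

# Concatenation depends on the order, so it is not a valid aggregation.
assert concatenate(messages) != concatenate(reordered)

print(aggregate(messages, "sum"))  # [4.0, 7.0]
```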

### The update function $\gamma$
<img src='https://drive.google.com/uc?id=1k7KQl4CNvqVpZ0K1dMIXfJJpYfdnU5aU'  width="40%" height="40%">

Updates the target node's representation based on the aggregated message from neighbours.
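
Putting the three components together, a single message passing step might look like the sketch below. The toy graph, the dimensions, the use of single linear layers for $\phi$ and $\gamma$, and the choice of sum for $\bigoplus$ are all assumptions made for illustration.

```python
import torch

torch.manual_seed(0)

num_nodes, in_dim, msg_dim, out_dim = 4, 3, 4, 2
x = torch.randn(num_nodes, in_dim)
edge_index = torch.tensor([[0, 1, 2, 3, 0],   # source nodes j
                           [1, 0, 1, 2, 3]])  # target nodes i
src, dst = edge_index

phi = torch.nn.Linear(2 * in_dim, msg_dim)          # message function
gamma = torch.nn.Linear(in_dim + msg_dim, out_dim)  # update function

# 1. One message per edge, built from [x_i, x_j].
messages = phi(torch.cat([x[dst], x[src]], dim=-1))
# 2. Sum the messages that arrive at each target node.
aggregated = torch.zeros(num_nodes, msg_dim).index_add(0, dst, messages)
# 3. Update each node from its old features and its aggregated message.
x_new = gamma(torch.cat([x, aggregated], dim=-1))
print(x_new.shape)  # torch.Size([4, 2])
```

Here `gamma` sees both the node's previous features and the aggregated message, so the node's own information is kept separate from what its neighbours sent.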

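Many GNN architectures drop the separate update function $\gamma$ and instead add self-loops to the graph, so that each node also sends a message to itself through $\phi$. This reduces the representational power, but also reduces training complexity.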
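
A minimal sketch of the self-loop variant follows, again on a hypothetical toy graph. The message function here is assumed to be a single linear layer that depends only on the sender's features, and sum is assumed as the aggregation.

```python
import torch

torch.manual_seed(0)

num_nodes, in_dim, out_dim = 4, 3, 2
x = torch.randn(num_nodes, in_dim)
edge_index = torch.tensor([[0, 1, 2, 3, 0],   # source nodes j
                           [1, 0, 1, 2, 3]])  # target nodes i

# Add one self-loop (i, i) per node.
loops = torch.arange(num_nodes).repeat(2, 1)
edge_index = torch.cat([edge_index, loops], dim=1)
src, dst = edge_index

# phi processes a node's own features exactly like its neighbours' features.
phi = torch.nn.Linear(in_dim, out_dim)
messages = phi(x[src])

# No separate gamma: the aggregated messages are the new node features.
x_new = torch.zeros(num_nodes, out_dim).index_add(0, dst, messages)
print(edge_index.shape, x_new.shape)  # torch.Size([2, 9]) torch.Size([4, 2])
```

PyTorch Geometric offers `torch_geometric.utils.add_self_loops` for the same purpose, and `GCNConv` adds self-loops by default.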

Let's see an example of Node Classification on the Cora Dataset using PyTorch Geometric.

The Cora dataset consists of 2708 scientific publications classified into one of seven classes. The citation network consists of 5429 links. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 1433 unique words.

https://relational.fit.cvut.cz/dataset/CORA
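
To make the 0/1-valued word vector concrete, the sketch below builds one for a made-up sentence. The five-word dictionary and the helper `to_binary_vector` are hypothetical; the real Cora dictionary has 1433 words.

```python
# Hypothetical toy dictionary; the real Cora dictionary has 1433 words.
dictionary = ["graph", "neural", "network", "bayesian", "genetic"]


def to_binary_vector(text, vocabulary):
    """Return 1 for each vocabulary word present in the text, else 0."""
    words = set(text.lower().split())
    return [1 if word in words else 0 for word in vocabulary]


features = to_binary_vector("A neural network for graph data", dictionary)
print(features)  # [1, 1, 1, 0, 0]
```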

In [3]:
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root="/tmp/Cora", name="Cora")


In [4]:
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv


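class GCN(torch.nn.Module):
    """Two-layer graph convolutional network for node classification."""

    def __init__(self, hidden_size):
        super().__init__()
        # Input and output sizes come from the global `dataset` loaded above.
        self.conv1 = GCNConv(dataset.num_node_features, hidden_size)
        self.conv2 = GCNConv(hidden_size, dataset.num_classes)

    def forward(self, data):
        x, edge_index = data.x, data.edge_index

        x = self.conv1(x, edge_index)
        x = F.relu(x)
        # Dropout is only active in training mode (model.train()).
        x = F.dropout(x, training=self.training)
        x = self.conv2(x, edge_index)

        # Log-probabilities over classes, to be paired with F.nll_loss.
        return F.log_softmax(x, dim=1)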


In [5]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
hidden_size = 100
model = GCN(hidden_size).to(device)
data = dataset[0].to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

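model.train()  # enable dropout
for epoch in range(200):
    optimizer.zero_grad()
    out = model(data)  # full-batch forward pass over the whole graph
    # The loss uses only the labelled training nodes.
    loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()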


In [6]:
model.eval()
pred = model(data).argmax(dim=1)
correct = (pred[data.test_mask] == data.y[data.test_mask]).sum()
acc = int(correct) / int(data.test_mask.sum())
print(f"Accuracy: {acc:.4f}")


Accuracy: 0.8120


More information about GNNs and other geometric generalizations in Deep Learning can be found in Bronstein et al., "Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges": https://arxiv.org/abs/2104.13478