# Tutorial

Inspired by the PyTorch Geometric [series of tutorials](https://pytorch-geometric.readthedocs.io/en/latest/get_started/introduction.html).

In [8]:
# Run the following lines if pytorch_geometric is not installed in your Python environment
#pip install -q torch-scatter -f https://data.pyg.org/whl/torch-1.10.0+cu113.html
#pip install -q torch-sparse -f https://data.pyg.org/whl/torch-1.10.0+cu113.html
#pip install -q git+https://github.com/pyg-team/pytorch_geometric.git

In [9]:
import numpy as np
import torch
import torch_geometric.datasets as datasets
import torch_geometric.data as data
import torch_geometric.transforms as transforms
import networkx as nx
from torch_geometric.utils.convert import to_networkx

## Part 2: Simple GNN in PyG

### Load a dataset

PyG ships with many built-in datasets. <br>
See the online [documentation](https://pytorch-geometric.readthedocs.io/en/2.5.3/cheatsheet/data_cheatsheet.html) for the full list and their statistics.

In [10]:
from torch_geometric.transforms import NormalizeFeatures

In [11]:
name = 'Cora'
dataset = datasets.Planetoid('./data', name, transform=NormalizeFeatures())



We make use of **[data transformations](https://pytorch-geometric.readthedocs.io/en/latest/notes/introduction.html#data-transforms) via `transform=NormalizeFeatures()`**.
Transforms modify the input data before it is fed to a neural network, *e.g.*, for normalization or data augmentation.
Here, we [row-normalize](https://pytorch-geometric.readthedocs.io/en/latest/modules/transforms.html#torch_geometric.transforms.NormalizeFeatures) the bag-of-words input feature vectors.
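To make the effect of row-normalization concrete, here is a minimal sketch in plain PyTorch. The helper `row_normalize` is a hypothetical name introduced for illustration, not PyG's API, and it assumes non-negative count features (such as bag-of-words vectors), so a row sum is either 0 or at least 1.

```python
import torch


def row_normalize(x: torch.Tensor) -> torch.Tensor:
    """Divide each row by its sum so that every non-empty row sums to 1.

    Hypothetical helper for illustration only. Assumes non-negative
    count features, so clamping the row sum to 1 only affects all-zero rows.
    """
    row_sum = x.sum(dim=-1, keepdim=True).clamp(min=1.0)  # avoid division by zero
    return x / row_sum


# Toy bag-of-words matrix: 3 nodes, 4 vocabulary entries.
x = torch.tensor([[1.0, 0.0, 1.0, 2.0],
                  [0.0, 0.0, 0.0, 0.0],   # a node without any word
                  [0.0, 3.0, 0.0, 0.0]])
x_norm = row_normalize(x)
print(x_norm)
```

After normalization, every non-empty row sums to 1, while the all-zero row is left unchanged.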


### Implement a two-layer GNN (with one hidden layer)

We use the [`GCNConv`](https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.GCNConv.html#torch_geometric.nn.conv.GCNConv) graph convolutional layer.
This graph convolutional operator is from the “Semi-supervised Classification with Graph Convolutional Networks” [paper](https://openreview.net/forum?id=SJU4ayYgl).
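The propagation rule from that paper, $X' = \hat{D}^{-1/2}(A + I)\hat{D}^{-1/2} X W$, can be sketched with dense matrices as follows. This is an illustrative sketch, not how `GCNConv` is implemented (the library works on sparse edge lists and also adds a bias). The function name `gcn_layer_dense` is hypothetical, and the sketch assumes an undirected graph whose `edge_index` contains no self-loops.

```python
import torch


def gcn_layer_dense(x: torch.Tensor, edge_index: torch.Tensor,
                    weight: torch.Tensor) -> torch.Tensor:
    """Dense sketch of one GCN layer (no bias, no non-linearity).

    Hypothetical helper; assumes edge_index has no self-loops.
    """
    num_nodes = x.size(0)
    adj = torch.zeros(num_nodes, num_nodes)
    adj[edge_index[0], edge_index[1]] = 1.0        # adjacency matrix A
    adj = adj + torch.eye(num_nodes)               # add self-loops: A + I
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)        # diagonal of D^{-1/2}
    adj_norm = deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)
    return adj_norm @ x @ weight


# Path graph 0 - 1 - 2, each undirected edge stored in both directions.
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])
x = torch.eye(3)        # one-hot node features
weight = torch.eye(3)   # identity weights, so the output is the normalized adjacency
out = gcn_layer_dense(x, edge_index, weight)
print(out)
```

With identity features and weights, the output is the normalized adjacency matrix itself, which shows how each node averages over itself and its neighbors with degree-dependent coefficients.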

In [12]:
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_node_features, 16)  # as for any layer (e.g., Conv2d), {in,out}_channels must be specified
        # reminder: the number of input channels is available through the num_node_features attribute
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, data):
        x, edge_index = data.x, data.edge_index

        x = self.conv1(x, edge_index) #the forward pass takes as arguments the node embeddings `x` and edge list `edge_index`
        x = F.relu(x)
        #the modules can be combined with any other modules/functions from PyTorch
        x = F.dropout(x, training=self.training)  # default p=0.5; `training` indicates the mode (train or eval)
        x = self.conv2(x, edge_index)

        return F.log_softmax(x, dim=1)

Training loop: <br>
*N.B.: similar to other NNs. Note that mini-batching is not used for this node classification task, for the reasons explained in the previous tutorial.*

In [13]:
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = GCN().to(device)
data = dataset[0].to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

model.train()
for epoch in range(200):
    optimizer.zero_grad()
    out = model(data)
    loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])  # use the training split predefined through the train_mask attribute
    loss.backward()
    optimizer.step()
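The model returns log-probabilities (`F.log_softmax`), which is why the loop uses `F.nll_loss`. As a quick sanity check on toy tensors (the values below are arbitrary and not taken from Cora), this combination gives the same loss as applying `F.cross_entropy` directly to the raw scores:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 3)                # raw scores for 5 nodes and 3 classes
target = torch.tensor([0, 2, 1, 1, 0])    # arbitrary toy labels

loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)
loss_ce = F.cross_entropy(logits, target)
print(torch.allclose(loss_nll, loss_ce))
```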

Evaluation: <br>
*N.B: same as for other NNs.*

In [14]:
model.eval()  # evaluate on the test split predefined through the test_mask attribute
pred = model(data).argmax(dim=1)
correct = (pred[data.test_mask] == data.y[data.test_mask]).sum()
acc = int(correct) / int(data.test_mask.sum())
print(f'Accuracy: {acc:.4f}')

Accuracy: 0.8080
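The same masking pattern works for any split; Planetoid datasets also expose a `val_mask`, for instance. Below is a small sketch on toy tensors, where `masked_accuracy` is a hypothetical helper name and the predictions, labels, and masks are made up for illustration.

```python
import torch


def masked_accuracy(pred: torch.Tensor, y: torch.Tensor,
                    mask: torch.Tensor) -> float:
    """Fraction of correct predictions among the nodes selected by a boolean mask."""
    correct = (pred[mask] == y[mask]).sum()
    return int(correct) / int(mask.sum())


# Toy predictions and labels for 6 nodes (illustrative values only).
pred = torch.tensor([0, 1, 2, 1, 0, 2])
y = torch.tensor([0, 1, 1, 1, 2, 2])
val_mask = torch.tensor([True, True, True, False, False, False])
test_mask = torch.tensor([False, False, False, True, True, True])

val_acc = masked_accuracy(pred, y, val_mask)
test_acc = masked_accuracy(pred, y, test_mask)
print(f'Val accuracy: {val_acc:.4f}, test accuracy: {test_acc:.4f}')
```

With the trained model, one would call it as `masked_accuracy(pred, data.y, data.val_mask)`.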
