First, we need a dataset for training, validating, and testing the Graph Neural Network (GNN). Here we load the Cora dataset, which is available through the PyTorch Geometric framework.

In [2]:
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='/tmp/Cora', name='Cora')

Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.x
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.tx
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.allx
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.y
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.ty
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.ally
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.graph
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.test.index
Processing...
Done!


Next, we import the libraries required for implementing a two-layer GNN.

In [4]:
import torch
from torch_geometric.nn import GCNConv

In the following cell, we implement a two-layer GNN.

In [19]:
import torch.nn.functional as F

class GCN(torch.nn.Module):  # torch.nn.Module is the base class for all neural network modules in PyTorch
    def __init__(self):
        super().__init__()
        # GCNConv computes messages, aggregates them over each node's neighbours,
        # and updates the node embeddings. Its first argument is the number of
        # input features per node; the second is the number of output features per node.
        self.conv1 = GCNConv(dataset.num_node_features, 16)
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, data):
        # 'x' is the node feature matrix of shape [num_nodes, num_node_features].
        # 'edge_index' stores the graph connectivity as a [2, num_edges] tensor of
        # (source, target) node indices (COO format), not as a dense adjacency matrix.
        x, edge_index = data.x, data.edge_index

        x = self.conv1(x, edge_index)  # Invokes conv1.forward() with 'x' and 'edge_index'.
        x = F.relu(x)  # Apply ReLU to the output of the first graph convolution.

        # Randomly zero elements of 'x' with probability p (default: 0.5), sampled from a
        # Bernoulli distribution. Passing training=self.training keeps dropout active only
        # in training mode; in evaluation mode 'x' passes through unchanged.
        x = F.dropout(x, training=self.training)

        x = self.conv2(x, edge_index)  # Invokes conv2.forward() with 'x' and 'edge_index'.
        return F.log_softmax(x, dim=1)  # Log-softmax over the class dimension gives per-node log-probabilities.
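
To make the message-passing step more concrete, here is a minimal sketch in plain Python of the propagation a GCN layer performs: add self-loops, then average neighbour features with the symmetric normalisation 1 / sqrt(deg(i) * deg(j)). It is not PyTorch Geometric's implementation, and it leaves out the learnable weight matrix and bias that GCNConv also applies. The function name `gcn_propagate` and the toy path graph are illustrative assumptions.

```python
import math

def gcn_propagate(x, edge_index, num_nodes):
    """One GCN propagation step, without the learnable weights.

    x          : per-node feature lists, shape [num_nodes][num_features]
    edge_index : pair of lists (sources, targets) in COO format
    """
    src, dst = edge_index
    # Add a self-loop for every node before normalising.
    src = list(src) + list(range(num_nodes))
    dst = list(dst) + list(range(num_nodes))

    # Degree of each node, counted over incoming edges (self-loop included).
    deg = [0.0] * num_nodes
    for j in dst:
        deg[j] += 1.0

    num_features = len(x[0])
    out = [[0.0] * num_features for _ in range(num_nodes)]
    for i, j in zip(src, dst):
        # Symmetric normalisation coefficient for the edge i -> j.
        norm = 1.0 / math.sqrt(deg[i] * deg[j])
        for k in range(num_features):
            out[j][k] += norm * x[i][k]
    return out

# Toy path graph 0 - 1 - 2; each undirected edge is stored in both directions.
edge_index = ([0, 1, 1, 2], [1, 0, 2, 1])
x = [[1.0], [2.0], [3.0]]
h = gcn_propagate(x, edge_index, num_nodes=3)
print(h)
```

Each row of `h` is a degree-weighted mix of the node's own feature and those of its neighbours, which is the quantity a GCN layer then multiplies by its weight matrix.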

Then, we choose the device on which to place the GNN and the dataset.

A **torch.device** is an object representing the device on which a torch.Tensor is or will be allocated. It contains a device type (such as 'cpu', 'cuda', or 'mps') and an optional device ordinal for that type.

In [6]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Next, we move the GNN parameters to the chosen device.

In [7]:
model = GCN().to(device)

Also, we move the dataset to the chosen device.

In [8]:
data = dataset[0].to(device)

Then, we choose the optimization algorithm (from the *torch.optim* package) for training the parameters of our GNN. As input, we pass *model.parameters()* to specify which parameters (tensors) to optimize. We also set the learning rate (lr) and the weight decay coefficient (an L2 penalty).

In [11]:
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)
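
As a rough illustration of what a single optimizer step does with these settings, the sketch below applies one Adam update to a single scalar parameter in plain Python. It assumes the commonly documented defaults (beta1=0.9, beta2=0.999, eps=1e-8) and L2-style weight decay added to the gradient. The helper `adam_step` and its `state` dictionary are hypothetical names, not part of torch.optim.

```python
import math

def adam_step(param, grad, state, lr=0.01, weight_decay=5e-4,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a single scalar parameter."""
    # L2-style weight decay: add weight_decay * param to the gradient.
    grad = grad + weight_decay * param
    state["t"] += 1
    # Exponential moving averages of the gradient and the squared gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Bias correction compensates for the zero initialisation of m and v.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)

state = {"t": 0, "m": 0.0, "v": 0.0}
w = adam_step(1.0, 0.5, state)
print(w)
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly `lr` regardless of the gradient's magnitude.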

Next, we set the GCN module (a torch.nn.Module) to training mode. '*model.train()*' simply sets the '*self.training*' flag to True on the module and, recursively, on all of its submodules.

**Note**: A newly constructed module is already in training mode, which is why many examples omit the '*model.train()*' call.

In [9]:
model.train()

GCN(
  (conv1): GCNConv(1433, 16)
  (conv2): GCNConv(16, 7)
)
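
To see why the mode flag matters, consider dropout. The following is a minimal sketch in plain Python of inverted dropout, assuming p < 1: in training mode each element is zeroed with probability p and the survivors are scaled by 1 / (1 - p), while in evaluation mode the input passes through unchanged. The function `dropout` here is an illustrative stand-in, not the PyTorch implementation.

```python
import random

def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout over a list of floats (assumes 0 <= p < 1)."""
    if not training or p == 0.0:
        # Evaluation mode: return the input unchanged.
        return list(values)
    scale = 1.0 / (1.0 - p)
    # Training mode: drop with probability p, rescale the survivors.
    return [0.0 if rng.random() < p else v * scale for v in values]

rng = random.Random(0)  # Seeded generator so the run is repeatable.
x = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(x, p=0.5, training=True, rng=rng)
eval_out = dropout(x, p=0.5, training=False)
print(train_out, eval_out)
```

The rescaling keeps the expected value of each element the same in both modes, so no extra correction is needed at evaluation time.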

Then, we train the GCN module for 200 epochs. 

In [24]:
for epoch in range(200):
    optimizer.zero_grad()  # Reset the gradients accumulated in the previous iteration.
    out = model(data)  # Calls GCN.forward() and stores the per-node log-probabilities.
    loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])  # Negative log-likelihood loss between the predictions ('out') and the true labels ('data.y'), restricted to the training nodes.
    loss.backward()  # Backward pass: compute the gradients of the loss w.r.t. the learnable parameters.
    optimizer.step()  # Update the parameters using the computed gradients.
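
The loss in the loop above combines log-softmax outputs with a negative log-likelihood averaged over the training nodes only. A small plain-Python sketch of that arithmetic follows; the helper names `log_softmax` and `masked_nll_loss` and the toy logits are assumptions made for illustration.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax of one row of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def masked_nll_loss(log_probs, targets, mask):
    """Mean negative log-likelihood over the nodes selected by the mask."""
    picked = [-log_probs[i][targets[i]] for i in range(len(targets)) if mask[i]]
    return sum(picked) / len(picked)

# Three nodes, three classes; only the first two nodes are training nodes.
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
targets = [0, 2, 1]
train_mask = [True, True, False]

log_probs = [log_softmax(row) for row in logits]
loss = masked_nll_loss(log_probs, targets, train_mask)
print(loss)
```

The third node is confidently and correctly classified, yet it contributes nothing to the loss because the mask excludes it. This mirrors how the unlabeled nodes take part in message passing but not in the training objective.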

Next, we set the GCN module (torch.nn.Module) to evaluation mode. This is equivalent to calling ***model.train(mode=False)***.

In [15]:
model.eval()

GCN(
  (conv1): GCNConv(1433, 16)
  (conv2): GCNConv(16, 7)
)

Obtain the predicted class for each node by taking the class with the highest log-probability.

In [16]:
pred = model(data).argmax(dim=1)

Compute the total number of correct predictions on the test nodes.

In [17]:
correct = (pred[data.test_mask] == data.y[data.test_mask]).sum()

Finally, we compute and print the test accuracy of the GNN.

In [18]:
acc = int(correct) / int(data.test_mask.sum())
print(f'Accuracy: {acc:.4f}')

Accuracy: 0.7880
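
The last three cells (argmax, masked comparison, division by the number of test nodes) can be traced by hand on a toy example. The sketch below uses plain Python lists; the helper names and the four-node data are made up for illustration and are unrelated to the Cora result above.

```python
def argmax(row):
    """Index of the largest entry in a row of scores."""
    return max(range(len(row)), key=row.__getitem__)

def masked_accuracy(scores, labels, mask):
    """Fraction of masked nodes whose top-scoring class matches the label."""
    preds = [argmax(row) for row in scores]
    hits = sum(1 for p, y, m in zip(preds, labels, mask) if m and p == y)
    total = sum(mask)  # Number of nodes selected by the mask.
    return hits / total

scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
labels = [1, 1, 1, 0]
test_mask = [False, True, True, True]

acc = masked_accuracy(scores, labels, test_mask)
print(f"Accuracy: {acc:.4f}")
```

Only the three masked nodes count toward the denominator, just as `data.test_mask.sum()` counts only the test nodes.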
