# General Framework of GNNs
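- **Essential idea**: iteratively update each node's representation by combining its own representation with the representations of its neighbors.
- Each layer has two important functions:
  - **AGGREGATE**: collects the representations of a node's neighbors
  - **COMBINE**: merges the aggregated neighbor information with the node's own representation
  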
- **Obtain node representation** with multiple layers:
  - Initialization: $H^0=X$
  - For $k=1,2,\cdots, K$, where $K$ is the total number of GNN layers:
     $$a_v^k = \textbf{AGGREGATE}^k\{H_u^{k-1}: u\in N(v)\}$$
     $$H_v^k = \textbf{COMBINE}^k\{ H_v^{k-1}, a_v^k \}$$
  - $H^K$: the output of the last layer, which can be treated as the final node representation
- **Apply node representation** for downstream tasks (e.g., node classification):
  - The label of node $v$ (denoted as $\hat{y}_v$) can be predicted via:
     $$\hat{y}_v = \text{Softmax}(WH_v^\top)$$
     - $W\in\mathbb{R}^{\vert \mathcal{L}\vert\times F}$, where $\vert\mathcal{L}\vert$ is the number of labels in the output space and $F$ is the dimension of the final node representation $H_v$
- **Train the model** by minimizing the loss function:
  - $$O = \frac{1}{n_l}\sum_{i=1}^{n_l}\text{loss}(\hat{y}_i, y_i)$$
    - $n_l$: number of labeled nodes
    - $\text{loss}(\cdot,\cdot)$: a loss function (e.g., cross-entropy loss function)
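
The framework above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the notes leave AGGREGATE and COMBINE abstract, so the mean aggregator, the ReLU-of-linear-maps combiner, the toy path graph, the random weights, and the helper names (`aggregate_mean`, `combine`, `gnn_forward`) are all hypothetical choices made for this example.

```python
import numpy as np

def aggregate_mean(H, neighbors, v):
    """AGGREGATE (assumed here to be a mean) over the neighbors of node v."""
    nbrs = neighbors[v]
    if not nbrs:
        return np.zeros(H.shape[1])
    return H[nbrs].mean(axis=0)

def combine(h_v, a_v, W_self, W_neigh):
    """COMBINE (assumed): ReLU of linear maps of the node state and the aggregate."""
    return np.maximum(0.0, h_v @ W_self + a_v @ W_neigh)

def gnn_forward(X, neighbors, weights):
    """Run K layers, starting from H^0 = X, and return H^K."""
    H = X
    for W_self, W_neigh in weights:  # one (W_self, W_neigh) pair per layer
        H = np.stack([
            combine(H[v], aggregate_mean(H, neighbors, v), W_self, W_neigh)
            for v in range(H.shape[0])
        ])
    return H

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy path graph 0 - 1 - 2 with 2-dimensional input features (hypothetical data).
neighbors = {0: [1], 1: [0, 2], 2: [1]}
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

rng = np.random.default_rng(0)
weights = [
    (rng.normal(size=(2, 4)), rng.normal(size=(2, 4))),  # layer 1: 2 -> 4
    (rng.normal(size=(4, 4)), rng.normal(size=(4, 4))),  # layer 2: 4 -> 4
]
H_K = gnn_forward(X, neighbors, weights)  # final representations, shape (3, 4)

# Downstream node classification: row v of y_hat is Softmax(W H_v^T).
W = rng.normal(size=(3, 4))  # |L| = 3 labels, F = 4
y_hat = softmax(H_K @ W.T)

# Mean cross-entropy over the labeled nodes only (here nodes 0 and 2).
labeled = np.array([0, 2])
y = np.array([1, 0])
loss = -np.log(y_hat[labeled, y]).mean()
```

Nothing is trained in this sketch; it only shows how $H^0=X$ flows through $K=2$ layers into a prediction and a loss value.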
  
# Graph Convolutional Networks (GCN)
- The node representation in each layer is updated as:
  - $$H^{k+1}=\sigma(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^k W^k)$$
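    - $\tilde{A}=A+\mathbf{I}$: the adjacency matrix $A$ with self-loops added
    - $\tilde{D}$: the diagonal degree matrix of $\tilde{A}$, with $\tilde{D}_{ii}=\sum_j\tilde{A}_{ij}$
    - $W^k$: the trainable weight matrix of layer $k$
    - $\sigma(\cdot)$: an activation function such as `ReLU` or `Tanh`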
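
To make the update rule concrete, here is a minimal dense NumPy sketch of a single propagation step. It is an illustration only, not how `GCNConv` is implemented: the function name `gcn_layer`, the 3-node path graph, and the all-ones weight matrix are hypothetical, and ReLU is assumed for $\sigma$.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN step: ReLU(D~^{-1/2} A~ D~^{-1/2} H W), using dense matrices."""
    A_tilde = A + np.eye(A.shape[0])   # A~ = A + I (add self-loops)
    deg = A_tilde.sum(axis=1)          # D~_ii = sum_j A~_ij
    d_inv_sqrt = 1.0 / np.sqrt(deg)    # deg >= 1 because of the self-loops
    # Symmetric normalization: entry (i, j) becomes A~_ij / sqrt(d_i * d_j).
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return np.maximum(0.0, A_hat @ H @ W)

# Toy path graph 0 - 1 - 2 (hypothetical data).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H0 = np.eye(3)          # one-hot input features
W0 = np.ones((3, 2))    # fixed weights so the result is easy to check by hand
H1 = gcn_layer(A, H0, W0)
```

With these inputs each row of `H1` equals the corresponding row sum of the normalized adjacency matrix, so the end nodes get $\frac{1}{2}+\frac{1}{\sqrt{6}}$ and the middle node gets $\frac{1}{3}+\frac{2}{\sqrt{6}}$.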

- Download and load the `Cora` dataset:

In [2]:
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='./dataset/Cora', name='Cora')

- Implement GCN:
  - define two `GCNConv` layers, which are called in the forward pass of the network
  - reference: https://pytorch-geometric.readthedocs.io/en/latest/get_started/introduction.html#learning-methods-on-graphs

In [6]:
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

def accuracy(pred_y, y):
    """Return the fraction of predictions in `pred_y` that match the labels `y`."""
    return ((pred_y == y).sum() / len(y)).item()

class GCN(torch.nn.Module):
    def __init__(self, dim_in, dim_hid, dim_out):
        super().__init__()
        self.conv1 = GCNConv(dim_in, dim_hid)
        self.conv2 = GCNConv(dim_hid, dim_out)

    def forward(self, data):
        x, edge_index = data.x, data.edge_index

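        x = self.conv1(x, edge_index)             # first graph convolution
        x = F.relu(x)
        x = F.dropout(x, training=self.training)  # active only in training mode
        x = self.conv2(x, edge_index)             # second graph convolution -> class scores

        # Log-probabilities, to be paired with F.nll_loss
        return F.log_softmax(x, dim=1)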

device = torch.device('cpu')
model = GCN(dataset.num_features, 20, dataset.num_classes).to(device)
data = dataset[0].to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

model.train()
for epoch in range(200):
    optimizer.zero_grad()
    out = model(data)
    loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
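    # Note: this accuracy is computed on the test mask while the model is
    # still in training mode (dropout active), although the log below
    # labels it "Train Acc".
    pred = model(data).argmax(dim=1)
    correct = (pred[data.test_mask] == data.y[data.test_mask]).sum()
    acc = int(correct) / int(data.test_mask.sum())
    if epoch % 20 == 0:
        # Validation metrics reuse `out` from the training-mode forward pass
        val_loss = F.nll_loss(out[data.val_mask], data.y[data.val_mask])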
        val_acc = accuracy(out[data.val_mask].argmax(dim=1),
                            data.y[data.val_mask])
        print(f'Epoch {epoch:>3} | Train Loss: {loss:.3f} | Train Acc:'
                f' {acc*100:>5.2f}% | Val Loss: {val_loss:.2f} | '
                f'Val Acc: {val_acc*100:.2f}%')

# Final evaluation on the test set with dropout disabled
model.eval()
pred = model(data).argmax(dim=1)
correct = (pred[data.test_mask] == data.y[data.test_mask]).sum()
acc = int(correct) / int(data.test_mask.sum())
print(f'Accuracy: {acc:.4f}')

Epoch   0 | Train Loss: 1.925 | Train Acc: 39.00% | Val Loss: 1.94 | Val Acc: 18.20%
Epoch  20 | Train Loss: 0.120 | Train Acc: 71.30% | Val Loss: 0.95 | Val Acc: 70.00%
Epoch  40 | Train Loss: 0.057 | Train Acc: 75.90% | Val Loss: 1.02 | Val Acc: 72.40%
Epoch  60 | Train Loss: 0.032 | Train Acc: 75.00% | Val Loss: 0.89 | Val Acc: 74.80%
Epoch  80 | Train Loss: 0.040 | Train Acc: 76.20% | Val Loss: 0.99 | Val Acc: 73.00%
Epoch 100 | Train Loss: 0.033 | Train Acc: 74.40% | Val Loss: 0.92 | Val Acc: 76.20%
Epoch 120 | Train Loss: 0.026 | Train Acc: 74.60% | Val Loss: 0.95 | Val Acc: 74.20%
Epoch 140 | Train Loss: 0.016 | Train Acc: 75.70% | Val Loss: 0.97 | Val Acc: 74.00%
Epoch 160 | Train Loss: 0.025 | Train Acc: 75.80% | Val Loss: 0.90 | Val Acc: 76.20%
Epoch 180 | Train Loss: 0.036 | Train Acc: 76.30% | Val Loss: 0.90 | Val Acc: 73.60%
Accuracy: 0.7940
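
The per-epoch accuracy in the training loop above is measured while the model is in training mode, so dropout is still active. One possible way to get cleaner per-split numbers is a small helper that switches to evaluation mode, disables gradient tracking, and restores the previous mode afterwards. The sketch below is an assumption, not part of the original notebook: `evaluate`, `TinyModel`, and the toy data are hypothetical stand-ins so the example runs without PyTorch Geometric.

```python
from types import SimpleNamespace

import torch

@torch.no_grad()
def evaluate(model, data, mask):
    """Return accuracy on the nodes selected by `mask`, with dropout disabled."""
    was_training = model.training
    model.eval()                      # turn off dropout for the measurement
    pred = model(data).argmax(dim=1)
    acc = (pred[mask] == data.y[mask]).float().mean().item()
    model.train(was_training)         # restore the mode the caller was in
    return acc

class TinyModel(torch.nn.Module):
    """Hypothetical stand-in with the same forward(data) interface as GCN."""
    def __init__(self):
        super().__init__()
        self.lin = torch.nn.Linear(2, 2)

    def forward(self, data):
        return self.lin(data.x)

toy_model = TinyModel()
with torch.no_grad():
    # Identity weights so the prediction is simply the argmax of the features.
    toy_model.lin.weight.copy_(torch.eye(2))
    toy_model.lin.bias.zero_()

toy_data = SimpleNamespace(
    x=torch.tensor([[2.0, 0.0], [0.0, 3.0], [1.0, 0.0]]),
    y=torch.tensor([0, 1, 1]),
)
mask = torch.tensor([True, True, False])

toy_model.train()
acc = evaluate(toy_model, toy_data, mask)
```

In the notebook's loop, the same helper could be called as `evaluate(model, data, data.train_mask)` or `evaluate(model, data, data.val_mask)` to report each split separately.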
