# Data Handling of Graphs

A graph is used to model pairwise relations (edges) between objects (nodes). A single graph in PyG is described by an instance of `torch_geometric.data.Data`, which holds the following attributes by default:

- `data.x`: Node feature matrix with shape `[num_nodes, num_node_features]`
- `data.edge_index`: Graph connectivity in COO format with shape `[2, num_edges]` and type `torch.long`
- `data.edge_attr`: Edge feature matrix with shape `[num_edges, num_edge_features]`
- `data.y`: Target to train against (may have arbitrary shape), e.g., node-level targets of shape `[num_nodes, *]` or graph-level targets of shape `[1, *]`
- `data.pos`: Node position matrix with shape `[num_nodes, num_dimensions]`


# Fetching Data

In [30]:
from torch_geometric.datasets import Planetoid
dataset = Planetoid(root='./data', name='Cora')
print(dataset.num_classes)
print(dataset.num_node_features)

7
1433


Here, the dataset contains only a single, undirected citation graph:

In [31]:
data = dataset[0]

print(data)
print(data.is_undirected())
print(data.train_mask.sum().item())
print(data.val_mask.sum().item())
print(data.test_mask.sum().item())

Data(x=[2708, 1433], edge_index=[2, 10556], y=[2708], train_mask=[2708], val_mask=[2708], test_mask=[2708])
True
140
500
1000


This time, the `Data` object holds a label for each node and three additional node-level attributes, `train_mask`, `val_mask` and `test_mask`, where
- `train_mask` denotes against which nodes to train (140 nodes),
- `val_mask` denotes which nodes to use for validation, e.g., to perform early stopping (500 nodes),
- `test_mask` denotes against which nodes to test (1000 nodes).
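
Each mask is a boolean vector with one entry per node, and indexing a tensor with a mask keeps only the entries where the mask is `True`. A minimal sketch on a made-up six-node example (the labels and the split are hypothetical, not Cora's):

```python
import torch

# Hypothetical labels for six nodes.
y = torch.tensor([0, 1, 2, 1, 0, 2])

# Boolean masks, one entry per node; each node is in at most one split here.
train_mask = torch.tensor([True, True, False, False, False, False])
val_mask = torch.tensor([False, False, True, False, False, False])
test_mask = torch.tensor([False, False, False, True, True, False])

train_labels = y[train_mask]                     # labels of the training nodes only
num_train = int(train_mask.sum())                # number of training nodes
overlap = bool((train_mask & test_mask).any())   # True if a node is in both splits
unused = int((~(train_mask | val_mask | test_mask)).sum())  # nodes in no split

print(train_labels, num_train, overlap, unused)
```

On Cora, the three masks cover 140 + 500 + 1000 = 1640 of the 2708 nodes, so, assuming the masks do not overlap, the remaining nodes carry labels but are not part of any split.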


# Learning Methods on Graphs
After learning about data handling, datasets, loaders, and transforms in PyG, it’s time to implement our first graph neural network!

We will use a simple GCN layer and replicate the experiments on the Cora citation dataset. We first need to load the Cora dataset:

In [32]:
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='./data', name='Cora')

Note that we do not need to use transforms or a dataloader. Now let’s implement a two-layer GCN:

In [33]:
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_node_features, 16)
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, data):
        x, edge_index = data.x, data.edge_index

        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, training=self.training)
        x = self.conv2(x, edge_index)

        return F.log_softmax(x, dim=1)
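
To make the layer less of a black box: the propagation rule from the GCN paper, on which `GCNConv` is based, adds self-loops to the adjacency matrix, normalizes it symmetrically by node degree, and applies a linear map to the features. The dense version below is a sketch for intuition only. The function name `dense_gcn_layer` is hypothetical, it omits the bias term, it assumes the input graph has no self-loops, and it is not how PyG implements the layer (PyG operates on the sparse `edge_index`).

```python
import torch

def dense_gcn_layer(x, edge_index, weight):
    """Dense sketch of D^{-1/2} (A + I) D^{-1/2} X W (no bias)."""
    num_nodes = x.size(0)
    # Build a dense adjacency matrix from the COO edge list.
    adj = torch.zeros(num_nodes, num_nodes)
    adj[edge_index[0], edge_index[1]] = 1.0
    # Add self-loops so that each node also keeps its own features.
    adj = adj + torch.eye(num_nodes)
    # Symmetric normalization by node degree.
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
    adj_norm = deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)
    # Transform the features, then aggregate over each node's neighborhood.
    return adj_norm @ (x @ weight)

# Toy path graph 0-1-2 with one feature per node (illustrative values).
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])
x = torch.ones(3, 1)
weight = torch.ones(1, 1)

out = dense_gcn_layer(x, edge_index, weight)
print(out)
```

The two end nodes of the path receive the same output because they have the same degree and the same neighborhood structure, which is a quick way to check the normalization.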

The constructor defines two `GCNConv` layers, which are called in the forward pass of our network. Note that the non-linearity is not integrated into the conv calls and hence needs to be applied afterwards (something that is consistent across all operators in PyG). Here, we use `ReLU` as the intermediate non-linearity and output log-probabilities over the classes via `log_softmax`. Let’s train this model on the training nodes for 200 epochs:

In [34]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = GCN().to(device)
data = dataset[0].to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

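model.train()  # enable training mode so that dropout is active
for epoch in range(200):
    optimizer.zero_grad()  # clear gradients from the previous step
    out = model(data)  # full-batch forward pass over all nodes
    loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])  # loss on training nodes only
    loss.backward()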
    optimizer.step()
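
The model returns log-probabilities, which is why the loop uses `F.nll_loss`. As a quick sanity check on made-up logits, this pairing gives the same value as applying `F.cross_entropy` to the raw scores:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 7)               # made-up raw scores: 5 nodes, 7 classes
target = torch.tensor([0, 3, 6, 1, 2])   # made-up class labels

# Log-probabilities followed by negative log-likelihood ...
loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)
# ... versus cross-entropy applied directly to the raw scores.
loss_ce = F.cross_entropy(logits, target)

print(torch.allclose(loss_nll, loss_ce))
```

A model that returns raw scores from its last layer could therefore be trained with `F.cross_entropy` instead, without changing the objective.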

Finally, we can evaluate our model on the test nodes:

In [35]:
model.eval()
pred = model(data).argmax(dim=1)
correct = (pred[data.test_mask] == data.y[data.test_mask]).sum()
acc = int(correct) / int(data.test_mask.sum())
print(f'Accuracy: {acc:.4f}')

Accuracy: 0.8080
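
The accuracy arithmetic can be sketched as a small helper that works for any of the three masks. The name `masked_accuracy` is hypothetical and the tensors below are made up:

```python
import torch

def masked_accuracy(pred, y, mask):
    """Fraction of correct predictions among the nodes selected by mask."""
    correct = (pred[mask] == y[mask]).sum()
    return int(correct) / int(mask.sum())

# Made-up predictions and labels for five nodes.
pred = torch.tensor([0, 1, 1, 2, 0])
y = torch.tensor([0, 1, 2, 2, 1])
test_mask = torch.tensor([False, True, True, True, False])

# Nodes 1, 2 and 3 are selected; two of the three predictions match.
acc = masked_accuracy(pred, y, test_mask)
print(f'Accuracy: {acc:.4f}')
```

Since gradients are not needed at test time, the forward pass during evaluation can additionally be wrapped in `with torch.no_grad():` to avoid building the autograd graph.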


This is all it takes to implement your first graph neural network. The easiest way to learn more about Graph Neural Networks is to study the examples in the `examples/` directory and to browse `torch_geometric.nn`. Happy hacking!