# Data Handling of Graphs

A single graph is represented by `torch_geometric.data.Data`, which holds the following attributes by default:
- `data.x` (`[num_nodes, num_node_features]`): Node feature matrix
- `data.edge_index` (`[2, num_edges]`): Graph connectivity in [COO format](https://pytorch.org/docs/stable/sparse.html#sparse-coo-docs) with type `torch.long`
- `data.edge_attr` (`[num_edges, num_edge_features]`): Edge feature matrix
- `data.y` (`[num_nodes, *]` for node-level targets or `[1, *]` for graph-level targets): Target to train against
- `data.pos` (`[num_nodes, num_dimensions]`): Node position matrix

In [1]:
import torch
from torch_geometric.data import Data

In [2]:
# Row 0 holds the source node and row 1 the target node of each edge
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)

# One feature per node, shape [3, 1]
x = torch.tensor([[-1], [0], [1]], dtype=torch.float)

data = Data(x=x, edge_index=edge_index)
data

Data(x=[3, 1], edge_index=[2, 4])

If you prefer to write the edges as a list of (source, target) pairs, transpose the tensor and call `contiguous` on it before passing it to the `Data` constructor:

In [3]:
x = torch.tensor([[-1], [0], [1]], dtype=torch.float)  # [3, 1]

edge_index = torch.tensor([[0, 1],  # node 0->1
                           [1, 0],  # node 1->0
                           [1, 2],  # node 1->2
                           [2, 1],  # node 2->1
                           ], dtype=torch.long)

data = Data(x=x, edge_index=edge_index.t().contiguous())
data

Data(x=[3, 1], edge_index=[2, 4])
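
To make the effect of the transpose concrete, here is a minimal sketch in plain Python, without PyTorch. The helper name `pairs_to_coo` is hypothetical and exists only for this illustration; it is not part of PyTorch Geometric.

```python
# Illustrative only: `pairs_to_coo` is a hypothetical helper, not a PyG function.
def pairs_to_coo(pairs):
    """Turn a list of (source, target) pairs into two rows: sources and targets."""
    if not pairs:
        return [[], []]
    sources, targets = zip(*pairs)  # transpose the list of pairs
    return [list(sources), list(targets)]

pairs = [(0, 1), (1, 0), (1, 2), (2, 1)]
coo = pairs_to_coo(pairs)
print(coo)  # [[0, 1, 1, 2], [1, 0, 2, 1]]
```

The result has the same `[2, num_edges]` layout as the `edge_index` tensor written out directly in the earlier cell.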

In [4]:
# Check your final Data object
data.validate(raise_on_error=True)

True

Besides holding node-level, edge-level, or graph-level attributes, `Data` provides a number of useful utility functions, e.g.:

In [5]:
print(data.keys())

['edge_index', 'x']


In [6]:
print(data['x'])

tensor([[-1.],
        [ 0.],
        [ 1.]])


In [7]:
for key, item in data:
    print(f'{key}: found in data')

x: found in data
edge_index: found in data


In [8]:
'edge_attr' in data

False

`Data` also exposes properties and methods for analyzing the graph structure:

In [9]:
data.num_nodes

3

In [10]:
data.num_edges

4

In [11]:
data.num_node_features

1

In [12]:
data.has_isolated_nodes()

False

In [13]:
data.has_self_loops()

False
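
To clarify what the two structural checks above test for, here is a minimal sketch in plain Python operating on the same edge list. The helper functions are hypothetical and only mirror the idea; they are not PyG's implementation. The sketch assumes that a node whose only edge is a self-loop counts as isolated.

```python
# Illustrative only: hypothetical helpers, not PyG's implementation.
def has_self_loops(edge_index):
    """True if any edge starts and ends at the same node."""
    sources, targets = edge_index
    return any(s == t for s, t in zip(sources, targets))

def has_isolated_nodes(edge_index, num_nodes):
    """True if some node is not connected to any other node."""
    sources, targets = edge_index
    connected = set()
    for s, t in zip(sources, targets):
        if s != t:  # assumption: self-loops do not connect a node to the graph
            connected.update((s, t))
    return len(connected) < num_nodes

edge_index = [[0, 1, 1, 2],
              [1, 0, 2, 1]]

print(has_self_loops(edge_index))                   # False
print(has_isolated_nodes(edge_index, num_nodes=3))  # False
print(has_isolated_nodes(edge_index, num_nodes=4))  # True: node 3 has no edges
```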

In [14]:
# Transfer the data object to the GPU if one is available, otherwise keep it on the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Transfer to:',device)

data = data.to(device)

Transfer to: cuda


# Common Benchmark Datasets

In [15]:
from torch_geometric.datasets import TUDataset

In [22]:
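# https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.datasets.TUDataset.html#torch_geometric.datasets.TUDataset
# `root` (the directory where the dataset is stored) is a required argument.
# Calling TUDataset(name='MUTAG') without it raises:
#   TypeError: __init__() missing 1 required positional argument: 'root'
# The path below is only an example location.
dataset = TUDataset(root='/tmp/MUTAG', name='MUTAG')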