In [1]:
from torch_geometric.datasets import OGB_MAG

dataset = OGB_MAG(root='./data', preprocess='metapath2vec')
data = dataset[0]

Downloading http://snap.stanford.edu/ogb/data/nodeproppred/mag.zip
Extracting data/mag/raw/mag.zip
Downloading https://data.pyg.org/datasets/mag_metapath2vec_emb.zip
Extracting data/mag/raw/mag_metapath2vec_emb.zip
Processing...
Done!


# Utility Functions

If an edge type can be uniquely identified by its pair of source and destination node types alone, or by its relation name alone, the shorthand forms in the following cell work as well as the full triplet:

In [None]:
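paper_node_data = data['paper']                    # storage for the 'paper' node type
cites_edge_data = data['paper', 'cites', 'paper']  # full (source, relation, destination) key

# Equivalent shorthand, valid only if the key matches exactly one edge type:
cites_edge_data = data['paper', 'paper']
cites_edge_data = data['cites']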
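
To make the uniqueness condition concrete, here is a minimal sketch in plain Python of how a partial key could be resolved to a full edge-type triplet. The function `resolve_edge_type` is hypothetical and written only for illustration; it is not PyG's implementation. The edge-type list is assumed to mirror the OGB_MAG edge types referenced in this notebook.

```python
def resolve_edge_type(edge_types, key):
    """Return the single full (src, rel, dst) triplet matching a partial key.

    Hypothetical helper for illustration only; not part of PyG.
    """
    if isinstance(key, str):
        # Only the relation name was given, e.g. 'cites'.
        matches = [t for t in edge_types if t[1] == key]
    elif len(key) == 2:
        # Only the (source, destination) node types were given.
        matches = [t for t in edge_types if (t[0], t[2]) == tuple(key)]
    else:
        # A full triplet was given.
        matches = [t for t in edge_types if t == tuple(key)]
    if len(matches) != 1:
        raise KeyError(f"{key!r} matches {len(matches)} edge types")
    return matches[0]


# Assumed edge types, mirroring the ones used in this notebook.
edge_types = [
    ('paper', 'cites', 'paper'),
    ('author', 'writes', 'paper'),
    ('author', 'affiliated_with', 'institution'),
    ('paper', 'has_topic', 'field_of_study'),
]

print(resolve_edge_type(edge_types, 'cites'))             # ('paper', 'cites', 'paper')
print(resolve_edge_type(edge_types, ('paper', 'paper')))  # ('paper', 'cites', 'paper')
```

If a second relation between the same pair of node types existed, the two-element key would become ambiguous and the sketch would raise a `KeyError`; in that situation the full triplet is the safe way to index.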

New node types or tensors can be added, and existing ones removed:

In [None]:
data['paper'].year = ...    # Setting a new paper attribute
del data['field_of_study']  # Deleting 'field_of_study' node type
del data['has_topic']       # Deleting 'has_topic' edge type

The metadata of the data object, which lists all node and edge types currently present, can be accessed via metadata():

In [None]:
node_types, edge_types = data.metadata()

print(node_types)
# ['paper', 'author', 'institution']

print(edge_types)
# [('paper', 'cites', 'paper'),
# ('author', 'writes', 'paper'),
# ('author', 'affiliated_with', 'institution')]

The `data` object can be transferred between devices as usual:

In [None]:
data = data.to('cuda:0')
data = data.cpu()

Additional helper functions are available to analyze the given graph:

In [None]:
data.has_isolated_nodes()
data.has_self_loops()
data.is_undirected()
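
As a rough intuition for what these checks test, the sketch below implements simplified versions for a single node type in plain Python. The function bodies are assumptions written for illustration: they operate on a list of `(source, destination)` pairs with nodes numbered `0 .. num_nodes - 1`, and PyG's own definitions may differ in detail (for example in how self-loops are treated when looking for isolated nodes).

```python
def has_self_loops(edges):
    """True if any edge starts and ends at the same node."""
    return any(src == dst for src, dst in edges)


def is_undirected(edges):
    """True if every edge (i, j) is accompanied by its reverse (j, i)."""
    edge_set = set(edges)
    return all((dst, src) in edge_set for src, dst in edge_set)


def has_isolated_nodes(edges, num_nodes):
    """True if at least one node does not appear in any edge (simplified)."""
    touched = {node for edge in edges for node in edge}
    return len(touched) < num_nodes


# Toy graph with 4 nodes: 0 <-> 1, a self-loop on 2, and node 3 untouched.
edges = [(0, 1), (1, 0), (2, 2)]

print(has_self_loops(edges))         # True
print(is_undirected(edges))          # True
print(has_isolated_nodes(edges, 4))  # True
```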

The graph can also be converted to a homogeneous “typed” graph via to_homogeneous(), which preserves features as long as their dimensionalities match across the different types.

In the result, homogeneous_data.edge_type is an edge-level vector that holds the edge type of each edge as an integer.

In [None]:
homogeneous_data = data.to_homogeneous()
print(homogeneous_data)
Data(x=[1879778, 128], edge_index=[2, 13605929], edge_type=[13605929])
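
The idea behind such a conversion can be sketched in plain Python: nodes of all types are placed in one shared index space by giving each node type an offset, and every edge records the integer id of the edge type it came from. The function `to_homogeneous_sketch` and the toy graph are hypothetical and only illustrate the bookkeeping; they are not PyG's implementation and ignore features.

```python
def to_homogeneous_sketch(num_nodes, edges):
    """Merge typed nodes and edges into one index space.

    num_nodes: dict mapping node type -> number of nodes of that type
    edges:     dict mapping (src, rel, dst) -> list of (i, j) pairs,
               indexed locally within the source / destination node type
    """
    # Assign each node type a contiguous block of global indices.
    offsets, total = {}, 0
    for node_type, count in num_nodes.items():
        offsets[node_type] = total
        total += count

    edge_index, edge_type = [], []
    for type_id, ((src, _, dst), pairs) in enumerate(edges.items()):
        for i, j in pairs:
            # Shift local indices into the global index space.
            edge_index.append((offsets[src] + i, offsets[dst] + j))
            # Remember which edge type this edge came from.
            edge_type.append(type_id)
    return total, edge_index, edge_type


# Toy graph: 3 papers, 2 authors.
num_nodes = {'paper': 3, 'author': 2}
edges = {
    ('paper', 'cites', 'paper'): [(0, 1), (2, 0)],
    ('author', 'writes', 'paper'): [(0, 2), (1, 1)],
}

total, edge_index, edge_type = to_homogeneous_sketch(num_nodes, edges)
print(total)       # 5
print(edge_index)  # [(0, 1), (2, 0), (3, 2), (4, 1)]
print(edge_type)   # [0, 0, 1, 1]
```

In the toy output, authors occupy global indices 3 and 4, and the last two edges carry type id 1 because they come from the second edge type.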

# Heterogeneous Graph Transformations

ToUndirected() transforms a directed graph into (the PyG representation of) an undirected graph by adding reverse edges for all edges in the graph. Subsequent message passing is therefore performed in both directions along every edge. Where necessary, the transform adds reverse edge types to the heterogeneous graph.

For all nodes of type 'node_type' and all existing edge types of the form ('node_type', 'edge_type', 'node_type'), the function AddSelfLoops() will add self-loop edges. As a result, each node might receive one or more (one per appropriate edge type) messages from itself during message passing.

The NormalizeFeatures() transform works as in the homogeneous case: it normalizes all specified features (of all types) so that they sum to one.

In [None]:
import torch_geometric.transforms as T

data = T.ToUndirected()(data)
data = T.AddSelfLoops()(data)
data = T.NormalizeFeatures()(data)
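
Why reverse edge types may be needed can be seen in a small plain-Python sketch. When source and destination node types are the same, the reversed pairs fit into the existing edge type; when they differ, the reversed pairs run from the destination type to the source type and therefore need an edge type of their own. The function `to_undirected_sketch` is hypothetical, and the `rev_` prefix used for the new relation name is a naming assumption made for this illustration.

```python
def to_undirected_sketch(edges):
    """Add reverse edges to a dict of (src, rel, dst) -> list of (i, j) pairs."""
    out = {}
    for (src, rel, dst), pairs in edges.items():
        if src == dst:
            # Same node type on both ends: merge the reversed pairs
            # into the same edge type and drop duplicates.
            merged = set(pairs) | {(j, i) for i, j in pairs}
            out[(src, rel, dst)] = sorted(merged)
        else:
            # Different node types: keep the original edges and store
            # the reversed pairs under a new edge type.
            out[(src, rel, dst)] = list(pairs)
            out[(dst, 'rev_' + rel, src)] = [(j, i) for i, j in pairs]
    return out


edges = {
    ('paper', 'cites', 'paper'): [(0, 1)],
    ('author', 'writes', 'paper'): [(0, 1), (1, 0)],
}

undirected = to_undirected_sketch(edges)
for edge_type, pairs in undirected.items():
    print(edge_type, pairs)
# ('paper', 'cites', 'paper') [(0, 1), (1, 0)]
# ('author', 'writes', 'paper') [(0, 1), (1, 0)]
# ('paper', 'rev_writes', 'author') [(1, 0), (0, 1)]
```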