# Saving and Loading Graphs using Dataset and Export

## Exporting graphs to text files with the torchUtils export functions

In [1]:
# Let's build some random graphs to save to text.
# We will build 10 fully connected graphs, each with 5 node features and 2 edge features.

import numpy as np

num_node_features = 5
num_edge_features = 2
num_graphs = 10

nodes = [
    np.random.uniform(
        0,1,size=(np.random.randint(5,8),num_node_features) # create node features for graphs with random number of nodes
    )
    for _ in range(num_graphs)
]


edge_indexes = [
    np.array([ [i,j]  for i in range(node.shape[0]) for j in range(i+1,node.shape[0]) ]).T
    for node in nodes
]

edges = [
    np.random.uniform(
        0,1,size=(edge_index.shape[1],num_edge_features)
    )
    for edge_index in edge_indexes
]
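
As a quick sanity check on the construction above: a fully connected graph with n nodes, with each pair of nodes listed once, has n(n-1)/2 edges. The sketch below rebuilds the same three lists and verifies their shapes. The seeded `rng` generator is an addition for reproducibility and is not part of the original cell.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, added here for reproducibility

num_node_features = 5
num_edge_features = 2
num_graphs = 10

# node features: each graph gets between 5 and 7 nodes
nodes = [
    rng.uniform(0, 1, size=(rng.integers(5, 8), num_node_features))
    for _ in range(num_graphs)
]

# edge indexes: every pair (i, j) with i < j, stored as a 2 x num_edges array
edge_indexes = [
    np.array([[i, j] for i in range(x.shape[0]) for j in range(i + 1, x.shape[0])]).T
    for x in nodes
]

# edge features: one row per edge
edges = [
    rng.uniform(0, 1, size=(edge_index.shape[1], num_edge_features))
    for edge_index in edge_indexes
]

for x, edge_index, edge_attr in zip(nodes, edge_indexes, edges):
    n = x.shape[0]
    assert edge_index.shape == (2, n * (n - 1) // 2)
    assert edge_attr.shape == (edge_index.shape[1], num_edge_features)
```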

Now that the graphs are built, we can export them directly from these lists.

In [2]:
from torchUtils import export_nodes,export_edges 

# Saves the node features to text files:
# outdir/node_shape.txt - shape of each graph's node features
# outdir/node_x.txt - flattened list of node features
export_nodes(nodes,outdir='test_data/') 

# Saves the edge features to text files:
# outdir/edge_shape.txt - shape of each graph's edge features
# outdir/edge_attr.txt - flattened list of edge features
# If you also have edge indexes, pass them as the index kwarg to save
# outdir/edge_index.txt - flattened list of edge indexes
export_edges(edges,outdir='test_data/',index=edge_indexes)

In [4]:
# Looking at the tree structure of the data directory
!tree test_data

test_data
├── edge_attr.txt
├── edge_index.txt
├── edge_shape.txt
├── node_shape.txt
└── node_x.txt

0 directories, 5 files
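
The exact on-disk format written by `export_nodes` and `export_edges` is not shown in this notebook, so the following is only a sketch of the general idea described in the comments above: store the shape of each array in one file and the flattened values in another, then use the shapes to split the values back into per-graph arrays. The helpers `save_ragged` and `load_ragged` are hypothetical and are not part of torchUtils.

```python
import os
import tempfile

import numpy as np


def save_ragged(arrays, outdir, name):
    """Hypothetical helper: write shapes and flattened values to two text files."""
    os.makedirs(outdir, exist_ok=True)
    shapes = np.array([a.shape for a in arrays], dtype=int)
    flat = np.concatenate([a.ravel() for a in arrays])
    np.savetxt(os.path.join(outdir, f"{name}_shape.txt"), shapes, fmt="%d")
    np.savetxt(os.path.join(outdir, f"{name}_x.txt"), flat)


def load_ragged(outdir, name):
    """Hypothetical helper: rebuild the list of arrays from the two files."""
    shapes = np.loadtxt(os.path.join(outdir, f"{name}_shape.txt"), dtype=int, ndmin=2)
    flat = np.loadtxt(os.path.join(outdir, f"{name}_x.txt"))
    sizes = shapes.prod(axis=1)                      # number of values per graph
    chunks = np.split(flat, np.cumsum(sizes)[:-1])   # cut the flat list at graph boundaries
    return [chunk.reshape(tuple(shape)) for chunk, shape in zip(chunks, shapes)]


# round trip three graphs with different numbers of nodes
arrays = [np.random.uniform(0, 1, size=(n, 5)) for n in (5, 7, 6)]
with tempfile.TemporaryDirectory() as tmp:
    save_ragged(arrays, tmp, "node")
    restored = load_ragged(tmp, "node")
```

Keeping the shapes in a separate file is what makes it possible to store graphs of different sizes in a single flat list of values.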


You can also export PyTorch Geometric graph data.

In [7]:
from torch_geometric.data import Data 

graphs = [ Data(x=x,edge_index=edge_index,edge_attr=edge_attr) for x,edge_index,edge_attr in zip(nodes,edge_indexes,edges) ]

from torchUtils import export_graphs 

export_graphs(graphs,outdir='test_data')

Let's say you also want to save different features for the same graphs.

You can use tags to differentiate them in the same directory.

Let's create some new features for our graphs.

In [9]:
# make 2 new features for each node
new_nodes = [
    np.random.uniform(
        -1,0,size=(node.shape[0],2)
    )   
    for node in nodes 
]

# make 1 new feature for each edge
new_edges = [
    np.random.uniform(
        -1,0,size=(edge.shape[0],1)
    )   
    for edge in edges
]

In [10]:
# with tag='new', this will save these nodes as
# outdir/new_node_x.txt
# outdir/new_node_shape.txt
export_nodes(new_nodes,outdir='test_data',tag='new')

# similarly for the edges:
# outdir/new_edge_attr.txt
# outdir/new_edge_shape.txt
export_edges(new_edges,outdir='test_data',tag='new')

In [11]:
!tree test_data

test_data
├── edge_attr.txt
├── edge_index.txt
├── edge_shape.txt
├── new_edge_attr.txt
├── new_edge_shape.txt
├── new_node_shape.txt
├── new_node_x.txt
├── node_shape.txt
└── node_x.txt

0 directories, 9 files
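
Judging from the listing above, the tag is prepended to the default file name with an underscore. A minimal sketch of that naming rule follows; `tagged_path` is a hypothetical helper written for illustration, not a torchUtils function.

```python
import os


def tagged_path(outdir, filename, tag=None):
    """Hypothetical helper mirroring the file naming seen in the listing above."""
    prefix = f"{tag}_" if tag else ""
    return os.path.join(outdir, prefix + filename)


default_path = tagged_path("test_data", "node_x.txt")
new_path = tagged_path("test_data", "node_x.txt", tag="new")
print(default_path)
print(new_path)
```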


Now that we have features saved to text files, we can load them using torchUtils.Dataset.

## Loading graphs with torchUtils.Dataset

The Dataset class takes the path to the root directory that contains all the feature text files.

In [12]:
from torchUtils import Dataset 

dataset = Dataset('test_data/')

It automatically loads graphs from node_x.txt, node_shape.txt, edge_attr.txt, edge_shape.txt, and edge_index.txt, and collects them into PyTorch Geometric Data objects.

In [14]:
dataset[0]

Data(x=[6, 5], edge_index=[2, 15], edge_attr=[30, 1])

The Dataset class inherits from list, so it can be used just like a Python list of graphs. You can load alternative node and edge features with the load_extra method.
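
Since the class is a list subclass, the usual list idioms (len, indexing, slicing, iteration) should apply. The sketch below uses a stand-in `GraphList` class with placeholder dictionaries instead of real graphs so that it runs without torchUtils. It is an assumption-laden illustration of a shuffled train/test split, not the library's implementation.

```python
import random


class GraphList(list):
    """Stand-in for a list-based dataset; NOT the torchUtils implementation."""


# placeholder objects standing in for the 10 loaded graphs
graphs = GraphList({"graph_id": i} for i in range(10))

print(len(graphs))         # number of graphs
first_three = graphs[:3]   # slicing a plain list subclass returns a regular list

# shuffled 80/20 split using ordinary list indexing
indices = list(range(len(graphs)))
random.Random(0).shuffle(indices)
split = int(0.8 * len(indices))
train = [graphs[i] for i in indices[:split]]
test = [graphs[i] for i in indices[split:]]
```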

In [15]:
dataset.load_extra('new')

In [16]:
dataset[0]

Data(x=[6, 5], edge_index=[2, 15], edge_attr=[30, 1], new_x=[[-0.266, -0.265], [-0.291, -0.416], ... [-0.673, -0.159], [-0.84, -0.416]], new_edge_attr=[-0.915, -0.633, -0.23, -0.823, -0.785, ... -0.781, -0.22, -0.377, -0.186, -0.281])

These extra features are added as new attributes on each graph in the list, prefixed with the tag (here new_x and new_edge_attr).
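
One possible use of the extra attributes is to join them with the original features into a single feature matrix. The sketch below uses NumPy arrays with the node shapes from the example (6 nodes, 5 original features, 2 extra features) as stand-ins for `x` and `new_x`. With real graphs the attributes may be tensors, in which case the equivalent tensor concatenation would be needed.

```python
import numpy as np

x = np.random.uniform(0, 1, size=(6, 5))       # stands in for the graph's x attribute
new_x = np.random.uniform(-1, 0, size=(6, 2))  # stands in for the graph's new_x attribute

# join along the feature axis: one row per node, 5 + 2 columns
combined = np.concatenate([x, new_x], axis=1)
print(combined.shape)  # (6, 7)
```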