# How Does DGL Represent A Graph?

By the end of this tutorial you will be able to:

* Construct a graph in DGL from scratch.
* Assign node and edge features to a graph.
* Query properties of a DGL graph such as node degrees and connectivity.
* Transform a DGL graph into another graph with DGL functions.
* Load and save DGL graph objects.

## DGL Graph Construction

DGL represents a directed graph as a `DGLGraph` object.  You can construct a graph by specifying the number of nodes in the graph as well as the lists of source and destination nodes of its edges.

For instance, the following code constructs a directed star graph with 5 leaves.  The internal node is labeled 0, and the edges go from the internal node to the leaves.

In [1]:
import dgl
import numpy as np
import torch

g = dgl.graph(([0, 0, 0, 0, 0], [1, 2, 3, 4, 5]), num_nodes=6)
# Equivalently, you can pass PyTorch LongTensors instead of Python lists.
g = dgl.graph((torch.LongTensor([0, 0, 0, 0, 0]), torch.LongTensor([1, 2, 3, 4, 5])), num_nodes=6)

# You can omit the number of nodes argument if it can be inferred from the edge list alone.
g = dgl.graph(([0, 0, 0, 0, 0], [1, 2, 3, 4, 5]))

Using backend: pytorch
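
As a rough illustration of the last comment in the cell above, the sketch below mimics in plain Python how a node count can be derived from an edge list: the largest node ID plus one. The helper `infer_num_nodes` is hypothetical and is not a DGL function; it only shows why nodes that never appear in an edge need an explicit `num_nodes`.

```python
def infer_num_nodes(src, dst):
    """Smallest node count that covers every node ID in the edge list."""
    # Hypothetical helper for illustration only, not part of DGL.
    return max(src + dst) + 1 if src else 0

# The star graph: node IDs 0..5 all appear in some edge.
star_count = infer_num_nodes([0, 0, 0, 0, 0], [1, 2, 3, 4, 5])
print(star_count)  # 6

# If node 3 existed but had no edges, the edge list alone would miss it.
partial_count = infer_num_nodes([0, 0], [1, 2])
print(partial_count)  # 3
```

In the second case, a graph meant to contain an isolated node 3 would need `num_nodes=4` passed explicitly.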


Edges in the graph have consecutive identifiers starting from 0, and they follow the order of the source and destination node lists given at creation.

In [2]:
# Print the source and destination nodes of every edge.
print(g.edges())

(tensor([0, 0, 0, 0, 0]), tensor([1, 2, 3, 4, 5]))


In [3]:
# DGL graphs can be multigraphs.  That is, multiple edges can exist between the same pair of nodes.
mg = dgl.graph(([0, 0, 0], [1, 1, 1]))
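
To make the edge ID convention concrete, here is a minimal plain-Python sketch. It does not use DGL and is not meant to reflect how DGL stores graphs internally; it only assumes, as described above, that an edge ID is the position of the edge in the source and destination lists. The helper name `enumerate_edges` is hypothetical.

```python
def enumerate_edges(src, dst):
    """Pair every edge with its ID, i.e. its position in the input lists."""
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same length")
    return [(eid, u, v) for eid, (u, v) in enumerate(zip(src, dst))]

# The star graph from the first cell.
star_edges = enumerate_edges([0, 0, 0, 0, 0], [1, 2, 3, 4, 5])
print(star_edges[2])  # (2, 0, 3)

# The multigraph: three parallel edges from node 0 to node 1.
multi_edges = enumerate_edges([0, 0, 0], [1, 1, 1])
print(multi_edges)  # [(0, 0, 1), (1, 0, 1), (2, 0, 1)]
```

Under this convention the parallel edges of a multigraph stay distinguishable because each one has its own ID.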

<div class="alert alert-info">
    
**Note:**  Graphs in DGL are always directed, since the messages sent between two nodes often differ depending on the direction.  If you want to handle an undirected graph, consider treating it as a bidirectional graph.  See [Graph Transformations](#Graph-Transformations) for an example of making a bidirectional graph.
    
</div>

## Assigning Node and Edge Features to Graph

In many graph datasets, nodes and edges have attributes.  The nodes and edges in DGL graphs can have multiple attributes.

Although node and edge attributes can be of arbitrary types in the real world, a DGL graph only accepts attributes stored in tensors (with numerical contents).  Consequently, a given attribute must have the same shape for every node (or for every edge).  In the context of deep learning, those attributes are often called *features*.

You can assign and retrieve node and edge features via the `ndata` and `edata` properties.

In [4]:
# Assign a 3-dimensional node feature vector for each node.
g.ndata['x'] = torch.randn(6, 3)
# Assign a 4-dimensional edge feature vector for each edge.
g.edata['a'] = torch.randn(5, 4)
# Assign a 5x4 node feature matrix for each node.  Node and edge features in DGL can be multi-dimensional.
g.ndata['y'] = torch.randn(6, 5, 4)

In [5]:
print(g.edata['a'])

tensor([[-0.4139,  0.1066, -1.2615, -0.9166],
        [ 1.0759, -1.0922, -1.6612, -0.2515],
        [-1.1481, -1.3787, -0.9325, -1.3221],
        [-1.3788,  1.9841, -0.6983, -1.3810],
        [-1.4653, -1.6310, -0.3887, -1.1083]])


<div class="alert alert-info">

**Note:** The rapid development of deep learning has produced many ways to encode various types of attributes into numerical features. Here are some general suggestions:

* For categorical attributes (e.g. gender, occupation), consider converting them to integers or one-hot encoding.
* For variable-length text (e.g. news articles, quotes), consider applying a language model.
* For images, consider applying a vision model such as CNNs.
    
You can find plenty of materials on how to encode such attributes into a tensor in the [PyTorch Deep Learning Tutorials](https://pytorch.org/tutorials/).
    
</div>
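
As a small illustration of the first suggestion in the note, the sketch below turns a categorical attribute into integer indices and one-hot rows using only plain Python. The attribute values and the helper name `encode_categories` are made up for this example.

```python
def encode_categories(values):
    """Map each distinct category to an integer index and a one-hot row."""
    # Sort the categories so the index assignment is deterministic.
    vocab = {v: i for i, v in enumerate(sorted(set(values)))}
    indices = [vocab[v] for v in values]
    one_hot = [[1.0 if j == i else 0.0 for j in range(len(vocab))]
               for i in indices]
    return vocab, indices, one_hot

# Hypothetical "occupation" attribute, one value per node of the star graph.
occupations = ["nurse", "chef", "nurse", "pilot", "chef", "nurse"]
vocab, indices, one_hot = encode_categories(occupations)
print(vocab)       # {'chef': 0, 'nurse': 1, 'pilot': 2}
print(indices)     # [1, 0, 1, 2, 0, 1]
print(one_hot[3])  # [0.0, 0.0, 1.0]
```

The nested list `one_hot` has one row per node, so it could be wrapped with `torch.tensor(one_hot)` and assigned to `g.ndata` under a key of your choice.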

## Graph Queries

A `DGLGraph` object provides various methods to query the graph structure.

In [6]:
print(g.num_nodes())
print(g.num_edges())
# Out degrees of the internal node
print(g.out_degrees(0))
# In degrees of the internal node - note that the graph is directed so the in degree should be 0.
print(g.in_degrees(0))
# Find if edges exist between the nodes:
print(g.has_edges_between(0, 1))
# Find the edge ID of the edge between two nodes:
print(g.edge_ids(0, 1))

6
5
5
0
True
0
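
To clarify what these queries compute, the following plain-Python sketch reproduces them on the star graph's edge list. It uses simple linear scans and hypothetical helper names; it is only a reference for the semantics, not a description of how DGL implements them.

```python
src = [0, 0, 0, 0, 0]
dst = [1, 2, 3, 4, 5]

def out_degree(node):
    """Number of edges that leave `node`."""
    return sum(1 for u in src if u == node)

def in_degree(node):
    """Number of edges that enter `node`."""
    return sum(1 for v in dst if v == node)

def edge_ids_between(u, v):
    """IDs of all edges going from `u` to `v` (empty if there are none)."""
    return [eid for eid, (s, d) in enumerate(zip(src, dst)) if (s, d) == (u, v)]

print(out_degree(0))                 # 5
print(in_degree(0))                  # 0
print(bool(edge_ids_between(0, 1)))  # True
print(edge_ids_between(0, 1))        # [0]
print(edge_ids_between(1, 0))        # []
```

The last line shows the effect of direction: there is an edge from node 0 to node 1, but none from node 1 to node 0.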


## Graph Transformations

You can also transform a graph into another graph.

A common transformation is to create a subgraph from the original graph:

In [7]:
# Induce a subgraph from node 0, node 1 and node 3 from the original graph.
sg1 = g.subgraph([0, 1, 3])
# Induce a subgraph from edge 0, edge 1 and edge 3 from the original graph.
sg2 = g.edge_subgraph([0, 1, 3])

You can obtain the node/edge mapping from the subgraph to the original graph by looking into the node feature `dgl.NID` or edge feature `dgl.EID` in the new graph.

In [8]:
# The original IDs of each node in sg1
print(sg1.ndata[dgl.NID])
# The original IDs of each edge in sg1
print(sg1.edata[dgl.EID])
# The original IDs of each node in sg2
print(sg2.ndata[dgl.NID])
# The original IDs of each edge in sg2
print(sg2.edata[dgl.EID])

tensor([0, 1, 3])
tensor([0, 2])
tensor([0, 1, 2, 4])
tensor([0, 1, 3])
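
The mappings printed above can be reproduced with a short plain-Python sketch, which may help clarify the difference between a node-induced and an edge-induced subgraph. The helper names are hypothetical, and listing the nodes of the edge-induced subgraph in sorted order is an assumption chosen to match the output shown above.

```python
src = [0, 0, 0, 0, 0]
dst = [1, 2, 3, 4, 5]

def node_induced(nodes):
    """Keep the given nodes and every edge whose endpoints are both kept."""
    keep = set(nodes)
    eids = [e for e, (u, v) in enumerate(zip(src, dst))
            if u in keep and v in keep]
    return list(nodes), eids

def edge_induced(eids):
    """Keep the given edges and every node that one of them touches."""
    nodes = sorted({n for e in eids for n in (src[e], dst[e])})
    return nodes, list(eids)

# Each result is (original node IDs, original edge IDs).
print(node_induced([0, 1, 3]))  # ([0, 1, 3], [0, 2])
print(edge_induced([0, 1, 3]))  # ([0, 1, 2, 4], [0, 1, 3])
```

The two lists in each result correspond to what `dgl.NID` and `dgl.EID` hold for `sg1` and `sg2`.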


Another common transformation is to add a reverse edge for each edge in the original graph with `dgl.add_reverse_edges`.

<div class="alert alert-info">

**Note:** If you have an undirected graph, it is better to first convert it into a bidirectional graph by adding reverse edges.
    
</div>

In [9]:
newg = dgl.add_reverse_edges(g)
newg.edges()

(tensor([0, 0, 0, 0, 0, 1, 2, 3, 4, 5]),
 tensor([1, 2, 3, 4, 5, 0, 0, 0, 0, 0]))
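
The edge layout in this output can be mimicked in plain Python by appending the swapped endpoint lists to the originals. This is only a sketch of the result shown above, with a hypothetical helper name, and not DGL's implementation.

```python
def add_reverse(src, dst):
    """Append a reversed copy of every edge after the original edges."""
    return src + dst, dst + src

new_src, new_dst = add_reverse([0, 0, 0, 0, 0], [1, 2, 3, 4, 5])
print(new_src)  # [0, 0, 0, 0, 0, 1, 2, 3, 4, 5]
print(new_dst)  # [1, 2, 3, 4, 5, 0, 0, 0, 0, 0]
```

With this layout, the reverse of edge `i` sits at position `i + 5`, where 5 is the number of edges in the original graph.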

## Loading and Saving Graphs

You can save a graph or a list of graphs via `dgl.save_graphs` and load them back with `dgl.load_graphs`.

In [10]:
# Save graphs
dgl.save_graphs('graph.dgl', g)
dgl.save_graphs('graphs.dgl', [g, sg1, sg2])

# Load graphs
(g,), _ = dgl.load_graphs('graph.dgl')
print(g)
(g, sg1, sg2), _ = dgl.load_graphs('graphs.dgl')
print(g)
print(sg1)
print(sg2)

Graph(num_nodes=6, num_edges=5,
      ndata_schemes={'y': Scheme(shape=(5, 4), dtype=torch.float32), 'x': Scheme(shape=(3,), dtype=torch.float32)}
      edata_schemes={'a': Scheme(shape=(4,), dtype=torch.float32)})
Graph(num_nodes=6, num_edges=5,
      ndata_schemes={'y': Scheme(shape=(5, 4), dtype=torch.float32), 'x': Scheme(shape=(3,), dtype=torch.float32)}
      edata_schemes={'a': Scheme(shape=(4,), dtype=torch.float32)})
Graph(num_nodes=3, num_edges=2,
      ndata_schemes={'_ID': Scheme(shape=(), dtype=torch.int64), 'x': Scheme(shape=(3,), dtype=torch.float32), 'y': Scheme(shape=(5, 4), dtype=torch.float32)}
      edata_schemes={'_ID': Scheme(shape=(), dtype=torch.int64), 'a': Scheme(shape=(4,), dtype=torch.float32)})
Graph(num_nodes=4, num_edges=3,
      ndata_schemes={'_ID': Scheme(shape=(), dtype=torch.int64), 'x': Scheme(shape=(3,), dtype=torch.float32), 'y': Scheme(shape=(5, 4), dtype=torch.float32)}
      edata_schemes={'_ID': Scheme(shape=(), dtype=torch.int64), 'a': Scheme(shape=(4,), dtype=torch.float32)})

## What's next?

* See [here](https://docs.dgl.ai/api/python/dgl.DGLGraph.html#querying-graph-structure) for a list of graph structure query APIs.
* See [here](https://docs.dgl.ai/en/latest/api/python/dgl.html#subgraph-extraction-ops) for a list of subgraph extraction routines.
* See [here](https://docs.dgl.ai/en/latest/api/python/dgl.html#graph-transform-ops) for a list of graph transformation routines.
* API reference of [`dgl.save_graphs`](https://docs.dgl.ai/en/latest/generated/dgl.save_graphs.html) and [`dgl.load_graphs`](https://docs.dgl.ai/en/latest/generated/dgl.load_graphs.html).