<a href="https://colab.research.google.com/github/yurahuna/graphillion_tutorial/blob/master/ja/09_practical_guide.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Practical Guide for Graphillion

This chapter introduces practical know-how for efficiently solving graph-related problems using Graphillion.

The efficiency of a computation using Graphillion depends on the size of the ZDD representing the graph set. If a large ZDD is created in the process of manipulating a `GraphSet` object, it may become a bottleneck in the process.


Unfortunately, it is notoriously difficult to predict in advance how large a ZDD representing a given graph set will be. On the other hand, there are some rules of thumb for when ZDDs become large.

In this chapter, we explain empirical rules for ZDD size and countermeasures to take when ZDDs become large.


## Properties of ZDDs

### The number of elements in a graph set is not proportional to the size of the ZDD.

The number of graphs in a graph set does not determine the size of its ZDD: a set containing more graphs can have a smaller ZDD.

In [3]:
!pip install graphillion
!git clone https://github.com/nsnmsak/graphillion_tutorial
!cp graphillion_tutorial/ja/tutorial_util.py .



In [4]:
from graphillion import GraphSet, tutorial
from tutorial_util import zdd_size, draw_zdd
import networkx as nx

In [5]:
GraphSet.set_universe(tutorial.grid(4, 4))
paths = GraphSet.paths(1, 25)
all_graphs = GraphSet({})

len(paths), zdd_size(paths), len(all_graphs), zdd_size(all_graphs)

(8512, 605, 1099511627776, 40)

The code above creates `all_graphs`, the set of all subgraphs of the grid graph, and `paths`, the set of paths between its diagonal vertices, and compares the sizes of the corresponding ZDDs. Although `all_graphs` contains far more graphs (about 1.1 trillion versus 8512), its ZDD is much smaller (40 nodes versus 605).
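To see why a larger set can have a smaller ZDD, the following sketch builds a reduced ZDD for an explicit family of sets in plain Python and counts its non-terminal nodes. `zdd_node_count` is a hypothetical helper written for this illustration; it is not part of Graphillion or `tutorial_util`, and it is only practical for tiny families.

```python
from functools import lru_cache
from itertools import combinations


def zdd_node_count(family, order):
    """Count the non-terminal nodes of the reduced ZDD of `family`.

    `family` is an iterable of sets; `order` lists every item once and
    fixes the variable order. Illustration only, not Graphillion's API.
    """
    unique = {}  # (item, lo_id, hi_id) -> node id; ids 0 and 1 are terminals

    @lru_cache(maxsize=None)
    def build(sets, i):
        if not sets:
            return 0  # empty family -> 0-terminal
        if sets == frozenset([frozenset()]):
            return 1  # family holding only the empty set -> 1-terminal
        item = order[i]
        lo = frozenset(s for s in sets if item not in s)
        hi = frozenset(s - {item} for s in sets if item in s)
        lo_id, hi_id = build(lo, i + 1), build(hi, i + 1)
        if hi_id == 0:
            return lo_id  # zero-suppression: drop nodes whose 1-edge is empty
        return unique.setdefault((item, lo_id, hi_id), len(unique) + 2)

    build(frozenset(frozenset(s) for s in family), 0)
    return len(unique)


items = [1, 2, 3, 4, 5, 6]
# All 64 subsets of six items, analogous to GraphSet({}).
power_set = [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]
# Five hand-picked sets with little shared structure.
few_sets = [{1, 2}, {3, 4}, {5, 6}, {1, 3, 5}, {2, 4, 6}]

power_nodes = zdd_node_count(power_set, items)
few_nodes = zdd_node_count(few_sets, items)
print(len(power_set), power_nodes)  # 64 sets, one node per item
print(len(few_sets), few_nodes)     # far fewer sets, yet more nodes
```

The power set needs only one node per item because every item is free to be present or absent, whereas the five irregular sets force the ZDD to keep separate branches.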

### As the `universe` grows, the ZDD grows exponentially.

ZDDs can be used to compress and reduce the size of graph sets that cause combinatorial explosion. However, even after compression, the ZDD size tends to increase exponentially as the `universe` grows.

In [6]:
zdd_sizes = []
num_paths = []
for i in range(2, 8):
    GraphSet.set_universe(tutorial.grid(i, i))
    paths = GraphSet.paths(1, (i+1)**2)
    num_paths.append(len(paths))
    zdd_sizes.append(zdd_size(paths))

zdd_sizes, num_paths

([27, 131, 605, 5635, 11429, 545208],
 [12, 184, 8512, 1262816, 575780564, 789360053252])

As a rule of thumb, once the graph given as `universe` has more than a few hundred vertices, Graphillion often cannot finish its processing in a realistic amount of time.
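The measurements above can be put side by side to see both effects at once: the number of paths stored per ZDD node rises sharply, yet the ZDD itself keeps growing. The sketch below only reuses the numbers printed above; the variable names are chosen for this illustration.

```python
# Values copied from the output above for tutorial.grid(i, i), i = 2..7.
grid_params = [2, 3, 4, 5, 6, 7]
zdd_sizes = [27, 131, 605, 5635, 11429, 545208]
num_paths = [12, 184, 8512, 1262816, 575780564, 789360053252]

for i, size, count in zip(grid_params, zdd_sizes, num_paths):
    # Paths represented per ZDD node: a rough measure of compression.
    print(f"grid({i}, {i}): {count:>16,} paths, {size:>8,} nodes, "
          f"{count / size:,.1f} paths per node")

# Overall growth of the ZDD between the smallest and the largest grid.
overall_growth = zdd_sizes[-1] / zdd_sizes[0]
print(f"ZDD size grew by a factor of about {overall_growth:,.0f}")
```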

### Dense graphs tend to increase ZDD size.

If the graph given as `universe` is dense, that is, if it has many edges relative to its number of vertices, the ZDD size tends to increase.

In [7]:
grid = tutorial.grid(3, 3)
complete = nx.complete_graph(7)

complete = [(i+1, j+1) for (i, j) in complete.edges()]
len(grid), len(complete) # the numbers of edges in graphs

(24, 21)

Let's see how the ZDD size differs between a grid graph and a complete graph with a similar number of edges (24 and 21).

First, here is an example of a grid graph.

In [8]:
grid_zdd_sizes = {}

GraphSet.set_universe(grid)

grid_zdd_sizes['cycle'] = zdd_size(GraphSet.cycles())
grid_zdd_sizes['tree'] = zdd_size(GraphSet.trees(1, is_spanning=True))
grid_zdd_sizes

{'cycle': 109, 'tree': 189}

Next is an example of a complete graph.

In [9]:
complete_zdd_sizes = {}

GraphSet.set_universe(complete)

complete_zdd_sizes['cycle'] = zdd_size(GraphSet.cycles())
complete_zdd_sizes['tree'] = zdd_size(GraphSet.trees(1, is_spanning=True))
complete_zdd_sizes

{'cycle': 479, 'tree': 1087}

The ZDDs are larger for the complete graph, which is dense: 479 versus 109 nodes for cycles, and 1087 versus 189 nodes for spanning trees.
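The difference in density between the two universes can be quantified as the fraction of possible edges that are present. The sketch below rebuilds both edge lists by hand so that it runs without Graphillion or NetworkX; the row-major vertex numbering of the grid is an assumption of this illustration, and `density` and `count_vertices` are hypothetical helper names.

```python
def density(num_vertices, num_edges):
    """Fraction of all possible undirected edges that are present."""
    return 2 * num_edges / (num_vertices * (num_vertices - 1))


def count_vertices(edges):
    """Number of distinct endpoints in an edge list."""
    return len({v for edge in edges for v in edge})


# A grid of 4 x 4 vertices, the same shape as tutorial.grid(3, 3).
# Assumption: vertices are numbered row by row, starting at 1.
side = 4
grid_edges = []
for r in range(side):
    for c in range(side):
        v = r * side + c + 1
        if c + 1 < side:
            grid_edges.append((v, v + 1))     # edge to the right neighbor
        if r + 1 < side:
            grid_edges.append((v, v + side))  # edge to the neighbor below

# The complete graph on 7 vertices, numbered 1..7.
complete_edges = [(i, j) for i in range(1, 8) for j in range(i + 1, 8)]

grid_density = density(count_vertices(grid_edges), len(grid_edges))
complete_density = density(count_vertices(complete_edges), len(complete_edges))
print(len(grid_edges), grid_density)          # 24 edges, density 0.2
print(len(complete_edges), complete_density)  # 21 edges, density 1.0
```

With a similar number of edges, the complete graph packs them onto 7 vertices instead of 16, which is what makes it dense.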

### Avoid Random Graph Sets

Compression by ZDDs works well when the subgraphs in the graph set share some regularity. If the subgraphs are selected at random, on the other hand, the resulting ZDD tends to be huge.

In [10]:
GraphSet.set_universe(tutorial.grid(5, 5))

all_graphs = GraphSet({})

random_graphs = GraphSet([]) # initialize random_graphs as an empty graph set

for i, g in enumerate(all_graphs.rand_iter()):
    if i == 1000:
        break
    random_graphs = random_graphs.union(GraphSet([g])) # add a subgraph randomly
zdd_size(random_graphs)

21128

In the example above, `random_graphs` is built by randomly selecting 1000 subgraphs from `all_graphs`, the set of all subgraphs. The ZDD compresses this set poorly: it needs more than 20,000 nodes to represent only 1000 subgraphs.
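The same effect can be reproduced on a toy scale without Graphillion. The sketch below compares a regular family (all 3-element subsets of 10 items) with a random family of the same cardinality, using a small plain-Python ZDD builder. `zdd_node_count` is a hypothetical helper for this illustration, not a Graphillion function, and the seed `0` is an arbitrary choice.

```python
import random
from functools import lru_cache
from itertools import combinations


def zdd_node_count(family, order):
    """Count the non-terminal nodes of the reduced ZDD of `family`."""
    unique = {}  # (item, lo_id, hi_id) -> node id; ids 0 and 1 are terminals

    @lru_cache(maxsize=None)
    def build(sets, i):
        if not sets:
            return 0
        if sets == frozenset([frozenset()]):
            return 1
        item = order[i]
        lo = frozenset(s for s in sets if item not in s)
        hi = frozenset(s - {item} for s in sets if item in s)
        lo_id, hi_id = build(lo, i + 1), build(hi, i + 1)
        if hi_id == 0:
            return lo_id  # zero-suppression rule
        return unique.setdefault((item, lo_id, hi_id), len(unique) + 2)

    build(frozenset(frozenset(s) for s in family), 0)
    return len(unique)


items = list(range(1, 11))
# Regular family: every 3-element subset of the 10 items (120 sets).
structured = [set(c) for c in combinations(items, 3)]
# Random family of the same cardinality, drawn from all 1024 subsets.
all_subsets = [frozenset(x for x in items if mask >> (x - 1) & 1)
               for mask in range(1 << len(items))]
random_family = random.Random(0).sample(all_subsets, len(structured))

structured_nodes = zdd_node_count(structured, items)
random_nodes = zdd_node_count(random_family, items)
print(len(structured), structured_nodes)   # 120 sets, 24 nodes
print(len(random_family), random_nodes)    # 120 sets, many more nodes
```

Both families contain 120 sets, but only the regular one lets the ZDD share sub-structures between its members.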

## Avoid increasing the size of ZDDs

When a large ZDD is built, it can take a long time to compute or consume all of your computer's memory. Here are some measures that can be taken to avoid such a situation.

### Use a computer with plenty of memory

Memory is often a bottleneck in the use of Graphillion. Depending on the size of the ZDD to be constructed, Graphillion can easily consume several GB of memory. If you are considering processing large graphs, we recommend that you use a computer with plenty of memory.

If possible, use a computer with 10 GB or more of memory; this widens the range of problems you can handle.

### Choose the order of operations carefully

Graphillion constructs a desired `GraphSet` by combining and filtering multiple `GraphSet` objects. If a large ZDD is constructed along the way, it becomes a bottleneck and slows down the computation. Rearranging the order of operations can sometimes avoid this bottleneck.

Suppose we want the set of paths with fewer than 12 edges that connect the diagonal vertices of a grid graph.

Let us see how the size of the ZDDs constructed along the way changes when we construct a `GraphSet` by first finding a set of paths and then adding the restriction on the number of edges.

In [11]:
GraphSet.set_universe(tutorial.grid(5, 5))

zdd_sizes = []
gs = GraphSet.paths(1, 36)
zdd_sizes.append(zdd_size(gs))

gs = gs.smaller(12)
zdd_sizes.append(zdd_size(gs))

zdd_sizes

[5635, 106]

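Next, let's build the same `GraphSet` object in the opposite order: first construct the `GraphSet` of subgraphs with fewer than 12 edges, then extract from it the paths between the diagonal vertices.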

In [12]:
zdd_sizes = []
gs = GraphSet({}) # make a graph set containing all subgraphs
zdd_sizes.append(zdd_size(gs))

gs = gs.smaller(12) # make a graph set containing subgraphs with less than 12 edges
zdd_sizes.append(zdd_size(gs))

gs = gs.paths(1, 36)
zdd_sizes.append(zdd_size(gs))

zdd_sizes

[60, 550, 106]

Here, `gs.paths(1, 36)` returns a `GraphSet` containing only those graphs in `gs` that are paths between vertices 1 and 36. Other methods such as `gs.cycles` and `gs.trees` can likewise be called on `gs` to select graphs of a particular kind from an existing graph set.

Changing the construction order reduced the largest ZDD created along the way from 5635 nodes to 550.

### Adjusting Variable Ordering

The labels of the nodes in a ZDD are ordered. In the following ZDD, the labels appear in the order 1, 2, 3 along every path from the root. The order in which the labels appear is called the **variable order**.

<img src="https://github.com/nsnmsak/graphillion_tutorial/blob/master/ja/img/09/sample_zdd.png?raw=1" alt="ZDD example" style="height: 300px ;"/>

The size of a ZDD can change, sometimes exponentially, when the variable order changes. Setting the variable order appropriately therefore also makes processing in Graphillion more efficient.
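A small, self-contained example makes the exponential effect concrete. Take `n` pairs of items and the family of sets that contain, for each pair, either both items or neither. The sketch below counts ZDD nodes for this family under two variable orders, using a plain-Python builder; `zdd_node_count` and the item names `a1`, `b1`, ... are hypothetical and exist only for this illustration.

```python
from functools import lru_cache
from itertools import combinations


def zdd_node_count(family, order):
    """Count the non-terminal nodes of the reduced ZDD of `family`."""
    unique = {}  # (item, lo_id, hi_id) -> node id; ids 0 and 1 are terminals

    @lru_cache(maxsize=None)
    def build(sets, i):
        if not sets:
            return 0
        if sets == frozenset([frozenset()]):
            return 1
        item = order[i]
        lo = frozenset(s for s in sets if item not in s)
        hi = frozenset(s - {item} for s in sets if item in s)
        lo_id, hi_id = build(lo, i + 1), build(hi, i + 1)
        if hi_id == 0:
            return lo_id  # zero-suppression rule
        return unique.setdefault((item, lo_id, hi_id), len(unique) + 2)

    build(frozenset(frozenset(s) for s in family), 0)
    return len(unique)


n = 4
pairs = [(f"a{i}", f"b{i}") for i in range(1, n + 1)]
# Every set is a union of whole pairs, so there are 2**n sets in total.
family = [set().union(*chosen)
          for r in range(n + 1) for chosen in combinations(pairs, r)]

interleaved = [item for pair in pairs for item in pair]    # a1 b1 a2 b2 ...
separated = [a for a, _ in pairs] + [b for _, b in pairs]  # a1 a2 ... b1 b2 ...

interleaved_nodes = zdd_node_count(family, interleaved)
separated_nodes = zdd_node_count(family, separated)
print(interleaved_nodes)  # 8: two nodes per pair
print(separated_nodes)    # 30: the ZDD must remember which a-items were chosen
```

With the interleaved order the node count grows as `2 * n`, while with the separated order it grows as `2 * (2**n - 1)`, even though both ZDDs represent exactly the same family.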

In Graphillion, the variable order is specified when the universe is initialized with `GraphSet.set_universe()`. Let's take a look at what happens when the variable order is changed.

In [13]:
grid = tutorial.grid(4,4)
GraphSet.set_universe(grid, traversal='as-is')
paths = GraphSet.paths(1, 25)
zdd_size(paths)

577

The `traversal` argument of `GraphSet.set_universe` specifies how the variables are ordered. If the argument is omitted, Graphillion chooses a variable order by traversing the `universe` graph. If `traversal='as-is'`, the edges are used in the order in which they appear in the given graph. In the example above, the edge order of `grid` is used.

Next, let's see what happens when we change the order of the edges of `grid`.

In [14]:
from random import sample

grid_shuffled = sample(list(grid), len(grid))
print(grid_shuffled)
GraphSet.set_universe(grid_shuffled, traversal='as-is')
paths = GraphSet.paths(1, 25)
zdd_size(paths)

[(7, 8), (21, 22), (6, 7), (9, 14), (3, 8), (18, 19), (16, 21), (17, 22), (14, 15), (14, 19), (3, 4), (2, 7), (9, 10), (1, 2), (8, 9), (8, 13), (11, 16), (15, 20), (5, 10), (24, 25), (20, 25), (11, 12), (19, 20), (4, 5), (13, 14), (16, 17), (12, 17), (23, 24), (12, 13), (19, 24), (17, 18), (10, 15), (7, 12), (6, 11), (18, 23), (22, 23), (1, 6), (4, 9), (2, 3), (13, 18)]


33464

`sample(list(grid), len(grid))` returns a new list containing all edges of `grid` in random order. Randomly ordering the variables increases the ZDD size sharply, from 577 to 33464 nodes in this run.

The ZDD size can be reduced by giving the `universe` a variable order that suits the graph. However, finding a variable order that minimizes the ZDD size is known to be **NP-hard**, so a good variable order cannot always be found.

Several methods for finding a good variable order are known. Here we introduce one that adjusts a parameter of Graphillion's variable order search: changing the vertex passed as the `source` argument of `GraphSet.set_universe` yields a different variable order.

In [15]:
grid = tutorial.grid(4, 6)

zdd_sizes = []
for i in range(5):
    GraphSet.set_universe(grid, source=(5 * i + 1))
    paths = GraphSet.paths(1, 35)
    zdd_sizes.append(zdd_size(paths))
zdd_sizes

[6215, 4408, 6703, 4709, 6861]

The `source` argument specifies at which vertex to start the variable order search. It can be seen that changing the value of the `source` vertex changes the size of the ZDD.
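One simple way to use this in practice is to try a handful of candidate sources and keep the one that produced the smallest ZDD. The sketch below applies that selection to the sizes measured above; it does not call Graphillion, and the variable names are chosen for this illustration.

```python
# ZDD sizes measured above for tutorial.grid(4, 6), one per candidate source.
sources = [5 * i + 1 for i in range(5)]  # 1, 6, 11, 16, 21
zdd_sizes = [6215, 4408, 6703, 4709, 6861]

# Pair each size with its source and keep the smallest.
best_size, best_source = min(zip(zdd_sizes, sources))
worst_size = max(zdd_sizes)

print(f"best source: {best_source} ({best_size} nodes)")
print(f"worst/best ratio: {worst_size / best_size:.2f}")
```

In this run the best and worst candidates differ by roughly a factor of 1.5, so a short search over `source` values can be worthwhile before starting a long computation.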

## Summary of this chapter

This chapter introduced the properties of ZDDs that you need to know in order to use Graphillion efficiently.