### CS4423 - Networks
Angela Carnevale
School of Mathematical and Statistical Sciences
University of Galway



#### 2. Tree and Graph Traversal

# Week 3, lecture 2: Paths, Trees and Algorithms

In [None]:
import networkx as nx
import numpy as np

In [None]:
nodes = 'ABCDEFGHIJKLM'
edges = [
    'AB', 'CE', 'FG', 'FH', 'GI', 'GJ', 'HJ', 'HL', 'HM', 
    'IK', 'JK', 'KL', 'LM'
]
G = nx.Graph()
G.add_nodes_from(nodes)
G.add_edges_from(edges)

In [None]:
opts = { "with_labels": True, "node_color": 'y'}
nx.draw(G, **opts)

* $(F, G, I)$ is a path in the graph above, and $(H, J, K, L, H)$ is a cycle.

* A cycle in a simple graph provides, for any two nodes on that
cycle, (at least) two different paths from one to the other.

* This can be very handy to provide alternative routes for connectivity in case one of the edges should fail (e.g. in transportation networks). 
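
The two claims above can be checked directly on the example graph. The following is a minimal sketch; the helper name `is_walk` is hypothetical (it is not part of `networkx`), and the graph is rebuilt here so that the block runs on its own.

```python
import networkx as nx

# Rebuild the example graph from the lecture.
G = nx.Graph()
G.add_nodes_from('ABCDEFGHIJKLM')
G.add_edges_from(['AB', 'CE', 'FG', 'FH', 'GI', 'GJ', 'HJ', 'HL', 'HM',
                  'IK', 'JK', 'KL', 'LM'])

def is_walk(graph, nodes):
    """Hypothetical helper: True if consecutive nodes are always joined by an edge."""
    return all(graph.has_edge(u, v) for u, v in zip(nodes, nodes[1:]))

path = ['F', 'G', 'I']
cycle = ['H', 'J', 'K', 'L', 'H']
print(is_walk(G, path), is_walk(G, cycle))

# The cycle gives two different routes from H to K that stay on the cycle.
routes = [p for p in nx.all_simple_paths(G, 'H', 'K') if set(p) <= set(cycle)]
print(sorted(routes))
```

Restricting `all_simple_paths` to the nodes of the cycle is only a convenient way to pick out the two routes around it; the full graph contains further paths from `H` to `K`.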


## Connected Components

**Definition.**
    <ul>
        <li>A simple graph is <b>connected</b> if, for
every pair of nodes, there is a path between them.
        </li>
        <li>
If a graph is not connected, it naturally breaks into pieces,
its <b>connected components</b>.
        </li>
    </ul>

In [None]:
nx.draw(G, **opts)

* The connected components of the graph above are the
node sets $\{A, B\}$, $\{C, E\}$, $\{D\}$, and $\{F,G,H,I,J,K,L,M\}$.
* Note that a component can consist of a single node only.

In [None]:
list(nx.connected_components(G))

**Note.** 

The connected components of a graph are the equivalence classes of the equivalence relation 'there is a **path** from $x$ to $y$' on the node set $X$ of the
graph. This relation, in turn, is the (reflexive) **transitive closure** of the graph relation 'there is an
**edge** between $x$ and $y$'.
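
To make the note concrete, here is a small sketch that groups the nodes into classes using only the relation 'there is a path from $x$ to $y$' (tested with `nx.has_path`) and compares the result with `nx.connected_components`. The function name `reachability_classes` is an assumption made for this illustration.

```python
import networkx as nx

G = nx.Graph()
G.add_nodes_from('ABCDEFGHIJKLM')
G.add_edges_from(['AB', 'CE', 'FG', 'FH', 'GI', 'GJ', 'HJ', 'HL', 'HM',
                  'IK', 'JK', 'KL', 'LM'])

def reachability_classes(graph):
    """Hypothetical helper: equivalence classes of 'there is a path from x to y'."""
    classes = []
    for x in graph:
        for cls in classes:
            # By transitivity, comparing x with one representative is enough.
            rep = next(iter(cls))
            if nx.has_path(graph, x, rep):
                cls.add(x)
                break
        else:
            # x is not related to any existing class: it starts a new one.
            classes.append({x})
    return classes

classes = reachability_classes(G)
components = list(nx.connected_components(G))
print(sorted(sorted(c) for c in classes) == sorted(sorted(c) for c in components))
```

This is far less efficient than the traversal used by `networkx`, but it follows the definition literally.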

##  Trees

* A graph is called **acyclic** if it does not contain any cycles.

*    A <b>tree</b> is a (simple) graph that is <b>connected</b> and <b>acyclic</b>.

In other words, between any two vertices in a tree there is **exactly one simple path**.

Trees can be characterized in many different ways.



**Theorem.**  Let $G = (X, E)$ be a (simple) graph of order $n = |X|$
and size $m = |E|$.
Then the following are equivalent:

* $G$ is a tree (i.e. acyclic and connected);

* $G$ is connected and $m = n-1$;

* $G$ is a minimally connected graph (i.e., removing any edge will disconnect $G$);

* $G$ is acyclic and $m = n-1$;

* $G$ is a maximally acyclic graph (i.e., adding any edge will introduce a cycle in $G$).
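
The five conditions of the theorem can be tested by brute force on small graphs. The sketch below does so for one small tree and one cycle; the helper names `minimally_connected` and `maximally_acyclic` are hypothetical and chosen only for this example.

```python
import networkx as nx

def minimally_connected(graph):
    """Connected, and removing any single edge disconnects the graph."""
    if not nx.is_connected(graph):
        return False
    for u, v in list(graph.edges()):
        H = graph.copy()
        H.remove_edge(u, v)
        if nx.is_connected(H):
            return False
    return True

def maximally_acyclic(graph):
    """Acyclic, and adding any missing edge creates a cycle."""
    if not nx.is_forest(graph):
        return False
    for u, v in list(nx.non_edges(graph)):
        H = graph.copy()
        H.add_edge(u, v)
        if nx.is_forest(H):
            return False
    return True

def tree_conditions(graph):
    """Evaluate the five conditions of the theorem, in the order stated."""
    n, m = graph.order(), graph.size()
    return [
        nx.is_tree(graph),
        nx.is_connected(graph) and m == n - 1,
        minimally_connected(graph),
        nx.is_forest(graph) and m == n - 1,
        maximally_acyclic(graph),
    ]

small_tree = nx.Graph([(0, 1), (0, 2), (1, 3), (1, 4), (4, 5)])
four_cycle = nx.cycle_graph(4)
print(tree_conditions(small_tree))
print(tree_conditions(four_cycle))
```

For any graph, the five entries should be either all `True` or all `False`; that is exactly what the equivalence asserts.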

## Random Trees

We can ask `networkx` to produce a **random tree** with a given number of nodes:

In [None]:
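# Note: in recent networkx releases (3.2+), `random_tree` is deprecated
# in favour of `nx.random_labeled_tree`; use that if this call fails.
T = nx.random_tree(15)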
nx.draw(T, **opts)

**Note** how the nodes are labelled and stored for a random tree in `networkx`:

In [None]:
T.nodes()

<b>Theorem (Cayley's Formula).</b>
    There are exactly $n^{n-2}$ distinct (labelled) trees on the $n$-element vertex set 
    $X = \{0, 1, 2, \dots, n-1\}$, if $n > 1$.

Also, there is one (trivial) tree on $1$ vertex.  Some values for $n > 1$:

In [None]:
domain = range(2, 10)
print(np.array([domain, [n**(n-2) for n in domain]]))
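
For small $n$, Cayley's formula can also be confirmed by exhaustive search, independently of the proof that follows. The sketch below tests every set of $n-1$ edges of the complete graph on $\{0, \dots, n-1\}$; the function name `count_labelled_trees` is an assumption of this example, and the approach is only feasible for very small $n$.

```python
from itertools import combinations
import networkx as nx

def count_labelled_trees(n):
    """Count trees on {0, ..., n-1} by testing every set of n-1 possible edges."""
    all_edges = list(combinations(range(n), 2))
    count = 0
    for edge_set in combinations(all_edges, n - 1):
        H = nx.empty_graph(n)
        H.add_edges_from(edge_set)
        if nx.is_tree(H):
            count += 1
    return count

# Compare the brute-force count with n^(n-2).
for n in range(2, 7):
    print(n, count_labelled_trees(n), n ** (n - 2))
```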

**Proof.** We will prove this formula by giving a bijective correspondence between trees on $X = \{0, 1, 2, \dots, n-1\}$ and sequences of
$n-2$ elements of $X$. The (unique) sequence of $n-2$ elements of $X$ associated with a tree via this bijection is its [**Prüfer Code**](https://en.wikipedia.org/wiki/Pr%C3%BCfer_sequence).

In [None]:
n = 8
TT = nx.random_tree(n)
nx.draw(TT, **opts)

How to determine the Prüfer code of a tree $T$ (destructively):

* Find the smallest leaf $x$
* Record the label $y$ of its unique neighbour
* Remove $x$ (and the edge $x - y$) from $T$
* Repeat until $T$ has only $2$ nodes left.

In [None]:
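def pruefer_list(tree):
    # Nodes are visited in increasing order, so the first leaf found is the smallest.
    for x in tree:
        if tree.degree(x) == 1:
            # A leaf has exactly one neighbour y.
            for y in tree[x]:
                tree.remove_node(x)  # destructive: also removes the edge x - y
                return [x, y]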

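**Note.** Here we do not need to search explicitly for the smallest leaf in step $1$ of the algorithm: `networkx` iterates over the nodes in the order in which they were added to the graph, which here is from the smallest to the largest, so the first leaf found is the smallest one.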
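
The shortcut in the note relies on the node order of the graph. As a small illustration (with a hypothetical graph `H` whose nodes are deliberately added out of order), the first leaf found by iteration need not be the smallest one in general:

```python
import networkx as nx

H = nx.Graph()
H.add_edges_from([(2, 0), (0, 1)])   # nodes are added in the order 2, 0, 1
print(list(H))                       # iteration follows insertion order

# The first leaf met while iterating versus the smallest leaf.
first_leaf = next(x for x in H if H.degree(x) == 1)
smallest_leaf = min(x for x in H if H.degree(x) == 1)
print(first_leaf, smallest_leaf)
```

For a tree whose nodes were not created in increasing order, taking the minimum over all leaves explicitly is the safer choice.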

In [None]:
T = TT.copy()
code = [pruefer_list(T) for k in range(n-2)]

In [None]:
print(np.array(code).transpose())

This process destroys the tree `T` almost completely.

In [None]:
print(T.nodes())
print(T.edges())

Let's wrap this up as a `python` function:

In [None]:
def pruefer_node(tree):
    """Remove the smallest leaf of `tree` and return its unique neighbour."""
    for x in tree:
        if tree.degree(x) == 1:
            for y in tree[x]:
                tree.remove_node(x)
                return y

In [None]:
def pruefer_code(tree):
    return [pruefer_node(tree) for k in range(tree.order() - 2)]

In [None]:
T = TT.copy()
code = pruefer_code(T)
code
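
As a cross-check, `networkx` provides its own encoder, `nx.to_prufer_sequence`. The sketch below computes the code of a fixed small tree on a copy, so that the input is left intact, and compares the result with the library function. The name `pruefer_code_copy` and the example tree `T6` are assumptions made for this illustration.

```python
import networkx as nx

def pruefer_code_copy(tree):
    """Prüfer code of `tree`, computed on a copy so the input stays intact."""
    work = tree.copy()
    code = []
    while work.order() > 2:
        # Smallest leaf and its unique neighbour.
        x = min(v for v in work if work.degree(v) == 1)
        (y,) = work[x]
        code.append(y)
        work.remove_node(x)
    return code

# A fixed tree on the nodes 0, ..., 5.
T6 = nx.Graph([(0, 3), (1, 3), (2, 3), (3, 4), (4, 5)])
code6 = pruefer_code_copy(T6)
print(code6)
print(nx.to_prufer_sequence(T6))
print(T6.order(), T6.size())   # the original tree is unchanged
```

Both calls should print the same sequence, and `T6` still has all of its nodes and edges afterwards.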

Perhaps surprisingly, the tree can be reconstructed from its Prüfer code. The reconstruction is based on the following fact,
and it shows that the map from trees to codes is a bijection.


<b>Fact:</b> The degree of node $x$ is $1$ plus the number of entries $x$ in the Prüfer code of $T$.


In [None]:
degrees = [1 for k in range(n)]
for k in code:
    degrees[k] += 1
degrees

In [None]:
[TT.degree[x] for x in TT]

How to restore the tree from its Prüfer code:

* Start with a graph with vertex set $X = \{0, 1, 2, \dots, n-1\}$ (and no edges yet).
* Compute the desired node degrees from the code.
* For each node $y$ in the code, find the smallest node $x$ whose current degree is $1$,
add the edge $x - y$, and then decrease the degrees of both $x$ and $y$ by $1$.
* Finally, connect the remaining $2$ nodes of degree $1$ by an edge.

In [None]:
T = nx.empty_graph(n)
nx.draw(T, **opts)

In [None]:
code


In [None]:
# repeat n-2 times:
for y in code:
    x = degrees.index(1)
    T.add_edge(x, y)
    degrees[x] -= 1;  degrees[y] -= 1
    print(degrees, ": adding edge", x, "--", y)


Add the final edge:

In [None]:
e = [x for x in range(n) if degrees[x] == 1]
T.add_edge(*e)
print(e)

In [None]:
nx.draw(T, **opts)

Turn the entire procedure into a `python` function:

In [None]:
def tree_pruefer(code):

    # initialize graph and degrees
    n = len(code) + 2
    tree = nx.empty_graph(n)
    degrees = [1 for x in tree]
    for y in code:
        degrees[y] += 1
        
    # add edges
    for y in code:
        for x in tree:
            if degrees[x] == 1:
                tree.add_edge(x, y)
                for z in (x, y):
                    degrees[z] -= 1
                break
                
    # final edge
    e = [x for x in tree if degrees[x] == 1]
    tree.add_edge(*e)
    
    return tree
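
The inner loop above scans all nodes for each entry of the code. As an alternative sketch (a different technique, not the function defined above), the current leaves can be kept in a min-heap, so that the smallest one is always available directly. The name `tree_from_code_heap` is hypothetical; the result is compared with the library functions `nx.from_prufer_sequence` and `nx.to_prufer_sequence`.

```python
import heapq
import networkx as nx

def tree_from_code_heap(code):
    """Rebuild a tree from its Prüfer code, using a min-heap of current leaves."""
    code = [int(y) for y in code]
    n = len(code) + 2
    degrees = [1] * n
    for y in code:
        degrees[y] += 1

    tree = nx.empty_graph(n)
    leaves = [x for x in range(n) if degrees[x] == 1]
    heapq.heapify(leaves)
    for y in code:
        x = heapq.heappop(leaves)      # smallest node of remaining degree 1
        tree.add_edge(x, y)
        degrees[y] -= 1
        if degrees[y] == 1:
            heapq.heappush(leaves, y)  # y has just become a leaf
    # Exactly two nodes of degree 1 remain; join them.
    tree.add_edge(heapq.heappop(leaves), heapq.heappop(leaves))
    return tree

example_code = [3, 3, 3, 4]
rebuilt = tree_from_code_heap(example_code)
reference = nx.from_prufer_sequence(example_code)
print(sorted(rebuilt.edges()))
print(nx.to_prufer_sequence(rebuilt) == example_code)
```

Encoding the rebuilt tree returns the code we started from, which is the round trip that underlies the bijection.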

* We can now construct a random tree on $n$ nodes from a random Prüfer code of length $n-2$.

In [None]:
code = np.random.randint(n, size=n-2)
code

In [None]:
tree = tree_pruefer(code)
nx.draw(tree, **opts)

Finally, we wrap this up into our own `python` function `random_tree`.

In [None]:
def random_tree(n):
    code = np.random.randint(n, size=n-2)
    return tree_pruefer(code)
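
A variant of this function, given here only as a sketch, makes the result reproducible by passing a seed to `np.random.default_rng`, and handles the cases $n \le 2$, where the Prüfer code is empty. The name `random_tree_seeded` and the `seed` parameter are assumptions of this example; the decoding step uses the library function `nx.from_prufer_sequence` so that the block runs on its own.

```python
import networkx as nx
import numpy as np

def random_tree_seeded(n, seed=None):
    """Random labelled tree on {0, ..., n-1} from a uniformly random Prüfer code."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n <= 2:
        return nx.path_graph(n)   # the only tree on 1 or 2 nodes
    rng = np.random.default_rng(seed)
    # Plain Python ints in the range 0, ..., n-1.
    code = rng.integers(n, size=n - 2).tolist()
    return nx.from_prufer_sequence(code)

T20 = random_tree_seeded(20, seed=0)
print(nx.is_tree(T20), T20.order(), T20.size())
```

Since the correspondence between codes and trees is a bijection, drawing the code uniformly at random yields each labelled tree on $n$ nodes with the same probability.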

In [None]:
T = random_tree(20)


In [None]:
nx.draw(T, **opts)

## Next: Depth First and Breadth First Search

[DFS](https://en.wikipedia.org/wiki/Depth-first_search)
and [BFS](https://en.wikipedia.org/wiki/Breadth-first_search)
are simple but efficient tree (and graph) traversal algorithms.
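
As a brief preview, and only as a sketch using the traversal functions that ship with `networkx` (`nx.bfs_edges` and `nx.dfs_preorder_nodes`), the two visiting orders can be compared on a small hypothetical tree rooted at node $0$:

```python
import networkx as nx

# A small tree: 0 has children 1 and 2; 1 has children 3 and 4.
small = nx.Graph([(0, 1), (0, 2), (1, 3), (1, 4)])

# Breadth first: visit all neighbours of a node before going deeper.
bfs_order = [0] + [v for u, v in nx.bfs_edges(small, 0)]
# Depth first: follow one branch as far as possible before backtracking.
dfs_order = list(nx.dfs_preorder_nodes(small, 0))

print(bfs_order)
print(dfs_order)
```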

##  Code Corner

### `python`

* `+=`, `-=`: augmented assignment statements [[doc]](https://docs.python.org/3/reference/simple_stmts.html#augmented-assignment-statements)

### `networkx`

* `connected_components` [[doc]](https://networkx.github.io/documentation/stable/reference/algorithms/component.html)

* `random_tree` [[doc]](https://networkx.github.io/documentation/stable/reference/generated/networkx.generators.trees.random_tree.html)

* `copy`: [[doc]](https://networkx.org/documentation/stable/reference/classes/generated/networkx.Graph.copy.html)

* `empty_graph` [[doc]](https://networkx.github.io/documentation/stable/reference/generated/networkx.generators.classic.empty_graph.html)

### `numpy`

* `array`: [[doc]](https://numpy.org/doc/stable/reference/generated/numpy.array.html)

* `transpose`: [[doc]](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html)

* `random.randint`: [[doc]](https://numpy.org/doc/stable/reference/random/generated/numpy.random.randint.html)

## Exercises

1.  A tree $T$ uniquely determines its Prüfer code,
and hence the two nodes that remain after (destructively)
computing the code.   What are those two nodes, in terms of
properties of $T$, or its Prüfer code?

2. What tree has Prüfer code $[0, 1, 2, \dots, n-3]$?