trees
emanuele-em committed Oct 29, 2023
1 parent b0e3650 commit 087f14a
Showing 4 changed files with 373 additions and 3 deletions.
6 changes: 3 additions & 3 deletions src/SUMMARY.md
@@ -73,9 +73,9 @@
- [Bellman–Ford algorithm](bellman_ford_algorithm.md)
- [Dijkstra’s algorithm](dijkstra_s_algorithm.md)
- [Floyd–Warshall algorithm](floyd_warshall_algorithm.md)
- <!-- - [Tree algorithms](README.md) -->
- <!-- - [Tree traversal](README.md) -->
- <!-- - [Diameter](README.md) -->
+ - [Tree algorithms](tree_algorithm.md)
+ - [Tree traversal](tree_traversal.md)
+ - [Diameter](diameter.md)
<!-- - [All longest paths](README.md) -->
<!-- - [Binary trees](README.md) -->
<!-- - [Spanning trees](README.md) -->
206 changes: 206 additions & 0 deletions src/diameter.md
@@ -0,0 +1,206 @@
# Diameter

The **diameter** of a tree
is the maximum length of a path between two nodes,
measured as the number of edges on the path.
For example, consider the following tree:

<script type="text/tikz">
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {1};
\node[draw, circle] (2) at (2,3) {4};
\node[draw, circle] (3) at (0,1) {2};
\node[draw, circle] (4) at (2,1) {3};
\node[draw, circle] (5) at (4,1) {7};
\node[draw, circle] (6) at (-2,3) {5};
\node[draw, circle] (7) at (-2,1) {6};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\end{tikzpicture}
</script>

The diameter of this tree is 4,
which corresponds to the following path:

<script type="text/tikz">
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {1};
\node[draw, circle] (2) at (2,3) {4};
\node[draw, circle] (3) at (0,1) {2};
\node[draw, circle] (4) at (2,1) {3};
\node[draw, circle] (5) at (4,1) {7};
\node[draw, circle] (6) at (-2,3) {5};
\node[draw, circle] (7) at (-2,1) {6};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);

\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
\end{tikzpicture}
</script>

Note that there may be several maximum-length paths.
In the above path, we could replace node 6 with node 5
to obtain another path with length 4.

Next we will discuss two $O(n)$ time algorithms
for calculating the diameter of a tree.
The first algorithm is based on dynamic programming,
and the second algorithm uses two depth-first searches.

## Algorithm 1

A general way to approach many tree problems
is to first root the tree arbitrarily.
After this, we can try to solve the problem
separately for each subtree.
Our first algorithm for calculating the diameter
is based on this idea.

An important observation is that every path
in a rooted tree has a _highest point_:
the highest node that belongs to the path.
Thus, we can calculate for each node the length
of the longest path whose highest point is the node.
One of those paths corresponds to the diameter of the tree.

For example, in the following tree,
node 1 is the highest point on the path
that corresponds to the diameter:

<script type="text/tikz">
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {1};
\node[draw, circle] (2) at (2,1) {4};
\node[draw, circle] (3) at (-2,1) {2};
\node[draw, circle] (4) at (0,1) {3};
\node[draw, circle] (5) at (2,-1) {7};
\node[draw, circle] (6) at (-3,-1) {5};
\node[draw, circle] (7) at (-1,-1) {6};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);

\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
\end{tikzpicture}
</script>

We calculate for each node $x$ two values:

- `to_leaf(x)`: the maximum length of a path from $x$ down to a leaf in its subtree
- `max_length(x)`: the maximum length of a path whose highest point is $x$

For example, in the above tree,
`to_leaf(1)=2`, because there is a path
$1 \rightarrow 2 \rightarrow 6$,
and `max_length(1)=4`,
because there is a path
$6 \rightarrow 2 \rightarrow 1 \rightarrow 4 \rightarrow 7$.
In this case, `max_length(1)` equals the diameter.

Dynamic programming can be used to calculate the above
values for all nodes in $O(n)$ time.
First, to calculate `to_leaf(x)`,
we go through the children of $x$,
choose a child $c$ with maximum `to_leaf(c)`
and add one to this value;
if $x$ is a leaf, `to_leaf(x)=0`.
Then, to calculate `max_length(x)`,
we choose two distinct children $a$ and $b$
such that the sum `to_leaf(a)+to_leaf(b)`
is maximum and add two to this sum.
If $x$ has only one child, `max_length(x)=to_leaf(x)`,
and if it has no children, `max_length(x)=0`.
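
The dynamic programming step can be sketched in Rust roughly as follows.
This is only an illustration under some assumptions: the tree is stored as adjacency lists indexed from 1 (index 0 is unused), it is rooted at node 1, and the names `dfs`, `diameter` and `best` are hypothetical.
Instead of storing `max_length` for every node, the sketch keeps only the largest value seen so far.

```rust
/// Returns `to_leaf(s)` and raises `best` to `max_length(s)` if that is larger.
/// `e` is the node the search came from (0 for the root).
fn dfs(s: usize, e: usize, adj: &[Vec<usize>], best: &mut usize) -> usize {
    // The two largest values of to_leaf(c) + 1 over the children c of s.
    let (mut first, mut second) = (0, 0);
    for &u in &adj[s] {
        if u == e {
            continue;
        }
        let d = dfs(u, s, adj, best) + 1;
        if d > first {
            second = first;
            first = d;
        } else if d > second {
            second = d;
        }
    }
    // max_length(s) = first + second; a missing child contributes 0.
    *best = (*best).max(first + second);
    first // this is to_leaf(s)
}

/// Diameter of the tree, rooted (arbitrarily) at node 1.
fn diameter(adj: &[Vec<usize>]) -> usize {
    let mut best = 0;
    dfs(1, 0, adj, &mut best);
    best
}
```

For the seven-node example tree above, calling `diameter(&adj)` on its adjacency lists gives 4.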

## Algorithm 2

Another efficient way to calculate the diameter
of a tree is based on two depth-first searches.
First, we choose an arbitrary node $a$ in the tree
and find the farthest node $b$ from $a$.
Then, we find the farthest node $c$ from $b$.
The diameter of the tree is the distance between $b$ and $c$.

In the following tree, one possible choice of $a$, $b$ and $c$ is:

<script type="text/tikz">
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {1};
\node[draw, circle] (2) at (2,3) {4};
\node[draw, circle] (3) at (0,1) {2};
\node[draw, circle] (4) at (2,1) {3};
\node[draw, circle] (5) at (4,1) {7};
\node[draw, circle] (6) at (-2,3) {5};
\node[draw, circle] (7) at (-2,1) {6};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\node[color=red] at (2,1.6) {a};
\node[color=red] at (-2,1.6) {b};
\node[color=red] at (4,1.6) {c};

\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
\end{tikzpicture}
</script>

This is an elegant method, but why does it work?

It helps to draw the tree differently so that
the path that corresponds to the diameter
is horizontal, and all other
nodes hang from it:

<script type="text/tikz">
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (2,1) {1};
\node[draw, circle] (2) at (4,1) {4};
\node[draw, circle] (3) at (0,1) {2};
\node[draw, circle] (4) at (2,-1) {3};
\node[draw, circle] (5) at (6,1) {7};
\node[draw, circle] (6) at (0,-1) {5};
\node[draw, circle] (7) at (-2,1) {6};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\node[color=red] at (2,-1.6) {a};
\node[color=red] at (-2,1.6) {b};
\node[color=red] at (6,1.6) {c};
\node[color=red] at (2,1.6) {x};

\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
\end{tikzpicture}
</script>

Node $x$ indicates the place where the path
from node $a$ joins the path that corresponds
to the diameter.
The farthest node from $a$
is node $b$, node $c$ or some other node
that is at least as far from node $x$.
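Thus, this node is always a valid choice for
an endpoint of a path that corresponds to the diameter.
The second search then finds a node that is farthest from this endpoint,
and the distance between the two nodes is the diameter.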
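
As a rough sketch, the two searches could be written in Rust as below.
The helper names `farthest` and `diameter` are hypothetical, the searches are written iteratively with an explicit stack rather than recursively, and the code assumes undirected adjacency lists indexed from 1 (index 0 unused), since the second search starts from a leaf and must be able to move towards the rest of the tree.

```rust
/// Returns the node farthest from `s` together with its distance from `s`.
fn farthest(s: usize, adj: &[Vec<usize>]) -> (usize, usize) {
    let mut best = (s, 0);
    // Each stack entry is (node, previous node, distance from s).
    let mut stack = vec![(s, 0, 0)];
    while let Some((v, prev, dist)) = stack.pop() {
        if dist > best.1 {
            best = (v, dist);
        }
        for &u in &adj[v] {
            // Never walk back along the edge we arrived by.
            if u != prev {
                stack.push((u, v, dist + 1));
            }
        }
    }
    best
}

/// Diameter of the tree: distance from b to the node farthest from b.
fn diameter(adj: &[Vec<usize>]) -> usize {
    let (b, _) = farthest(1, adj); // a = 1 is an arbitrary starting node
    let (_, dist) = farthest(b, adj);
    dist
}
```

Any starting node works for the first search; node 1 is used here only for convenience.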

91 changes: 91 additions & 0 deletions src/tree_algorithm.md
@@ -0,0 +1,91 @@
# Tree algorithms

A **tree** is a connected, acyclic graph
that consists of $n$ nodes and $n-1$ edges.
Removing any edge from a tree divides it
into two components,
and adding any edge to a tree creates a cycle.
Moreover, there is always a unique path between any
two nodes of a tree.
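
These properties suggest a simple test, sketched below in Rust: a graph with $n$ nodes is a tree exactly when it has $n-1$ edges and is connected.
The function name `is_tree`, the edge-list input format and the 1-based node numbering are assumptions made for this illustration; the sketch also treats the empty graph as not being a tree, which is a matter of convention.

```rust
/// Checks whether an undirected graph with nodes 1..=n is a tree.
fn is_tree(n: usize, edges: &[(usize, usize)]) -> bool {
    if n == 0 || edges.len() != n - 1 {
        return false;
    }
    // Build undirected adjacency lists (index 0 is unused).
    let mut adj = vec![Vec::new(); n + 1];
    for &(a, b) in edges {
        adj[a].push(b);
        adj[b].push(a);
    }
    // With exactly n - 1 edges, the graph is a tree iff it is connected,
    // so it is enough to count the nodes reachable from node 1.
    let mut visited = vec![false; n + 1];
    let mut stack = vec![1];
    visited[1] = true;
    let mut seen = 1;
    while let Some(v) = stack.pop() {
        for &u in &adj[v] {
            if !visited[u] {
                visited[u] = true;
                seen += 1;
                stack.push(u);
            }
        }
    }
    seen == n
}
```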

For example, the following tree consists of 8 nodes and 7 edges:

<script type="text/tikz">
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {1};
\node[draw, circle] (2) at (2,3) {4};
\node[draw, circle] (3) at (0,1) {2};
\node[draw, circle] (4) at (2,1) {3};
\node[draw, circle] (5) at (4,1) {7};
\node[draw, circle] (6) at (-2,3) {5};
\node[draw, circle] (7) at (-2,1) {6};
\node[draw, circle] (8) at (-4,1) {8};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\end{tikzpicture}
</script>

The **leaves** of a tree are the nodes
with degree 1, i.e., with only one neighbor.
For example, the leaves of the above tree
are nodes 3, 5, 7 and 8.
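
With undirected adjacency lists, the degree of a node is simply the length of its list, so the leaves can be collected as in the following sketch.
The function name `leaves` and the 1-based adjacency-list layout (index 0 unused) are assumptions for illustration.

```rust
/// Returns the nodes of degree 1, in increasing order.
fn leaves(adj: &[Vec<usize>]) -> Vec<usize> {
    // Skip index 0, which does not correspond to a node.
    (1..adj.len()).filter(|&v| adj[v].len() == 1).collect()
}
```

Note that a tree consisting of a single node has no node of degree 1, so this sketch returns an empty list for it.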

In a **rooted** tree, one of the nodes
is appointed the **root** of the tree,
and all other nodes are
placed underneath the root.
For example, in the following tree,
node 1 is the root node.

<script type="text/tikz">
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {1};
\node[draw, circle] (4) at (2,1) {4};
\node[draw, circle] (2) at (-2,1) {2};
\node[draw, circle] (3) at (0,1) {3};
\node[draw, circle] (7) at (2,-1) {7};
\node[draw, circle] (5) at (-3,-1) {5};
\node[draw, circle] (6) at (-1,-1) {6};
\node[draw, circle] (8) at (-1,-3) {8};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (6) -- (8);
\end{tikzpicture}
</script>

In a rooted tree, the **children** of a node
are its lower neighbors, and the **parent** of a node
is its upper neighbor.
Each node has exactly one parent,
except for the root, which has no parent.
For example, in the above tree,
the children of node 2 are nodes 5 and 6,
and its parent is node 1.
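
One way to make these relationships explicit is to root the tree and record the parent of every node, as in the sketch below.
The function name `parents` is hypothetical, and the code assumes undirected adjacency lists indexed from 1, with 0 used as a "no parent" marker for the root.

```rust
/// Roots the tree at `root` and returns `parent`, where `parent[v]`
/// is the parent of node v (0 for the root and for the unused index 0).
fn parents(root: usize, adj: &[Vec<usize>]) -> Vec<usize> {
    let mut parent = vec![0; adj.len()];
    let mut stack = vec![root];
    while let Some(v) = stack.pop() {
        for &u in &adj[v] {
            // Every neighbor except the parent of v is a child of v.
            if u != parent[v] {
                parent[u] = v;
                stack.push(u);
            }
        }
    }
    parent
}
```

The children of a node `v` are then the neighbors `u` in `adj[v]` with `u != parent[v]`.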

The structure of a rooted tree is _recursive_:
each node of the tree acts as the root of a **subtree**
that contains the node itself and all nodes
that are in the subtrees of its children.
For example, in the above tree, the subtree of node 2
consists of nodes 2, 5, 6 and 8:

<script type="text/tikz">
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (2) at (-2,1) {2};
\node[draw, circle] (5) at (-3,-1) {5};
\node[draw, circle] (6) at (-1,-1) {6};
\node[draw, circle] (8) at (-1,-3) {8};
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (6) -- (8);
\end{tikzpicture}
</script>
73 changes: 73 additions & 0 deletions src/tree_traversal.md
@@ -0,0 +1,73 @@
# Tree traversal

General graph traversal algorithms
can be used to traverse the nodes of a tree.
However, the traversal of a tree is easier to implement than
that of a general graph, because
there are no cycles in the tree and it is not
possible to reach a node from multiple directions.

The typical way to traverse a tree is to start
a depth-first search at an arbitrary node.
The following recursive function can be used:

```rust
# let adj = [vec![], vec![2, 3, 4], vec![5, 6], vec![], vec![7], vec![], vec![8], vec![], vec![]];
# dfs(1,0,&adj);
fn dfs(s: usize, e: usize, adj: &[Vec<usize>]) {
    // process node s
# println!("{s}");
    for &u in &adj[s] {
        // skip the node we came from
        if u != e {
            dfs(u, s, adj);
        }
    }
}
```

The function is given three parameters: the current node $s$,
the previous node $e$ and the adjacency lists `adj`.
The purpose of the parameter $e$ is to make sure
that the search only moves to nodes
that have not been visited yet.

The following function call starts the search
at node $x$:

```rust, ignore
dfs(x, 0, &adj);
```

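In the first call $e=0$, because there is no
previous node, so the search is allowed
to proceed in any direction in the tree.
This works because the nodes are numbered starting from 1,
so the value 0 never matches a real node.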
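
The sketch below shows how the traversal might be used on a tree given as an edge list.
The helper `build_adj` and the `order` parameter are hypothetical additions for this illustration: the adjacency lists are undirected and indexed from 1, and the visiting order is recorded in a vector instead of being printed.

```rust
/// Builds undirected adjacency lists (index 0 unused) from an edge list.
fn build_adj(n: usize, edges: &[(usize, usize)]) -> Vec<Vec<usize>> {
    let mut adj = vec![Vec::new(); n + 1];
    for &(a, b) in edges {
        adj[a].push(b);
        adj[b].push(a);
    }
    adj
}

/// Same traversal as above, but records the order in which nodes are visited.
fn dfs(s: usize, e: usize, adj: &[Vec<usize>], order: &mut Vec<usize>) {
    order.push(s);
    for &u in &adj[s] {
        // The previous node is the only neighbor that has already been visited.
        if u != e {
            dfs(u, s, adj, order);
        }
    }
}
```

Because every edge is stored in both directions here, the check against $e$ is what prevents the search from stepping straight back to the node it came from.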

## Dynamic programming

Dynamic programming can be used to calculate
some information during a tree traversal.
Using dynamic programming, we can, for example,
calculate in $O(n)$ time for each node of a rooted tree the
number of nodes in its subtree
or the length of the longest path from the node
to a leaf.

As an example, let us calculate for each node $s$
a value `count[s]`: the number of nodes in its subtree.
The subtree contains the node itself and
all nodes in the subtrees of its children,
so we can calculate the number of nodes
recursively using the following code:

```rust
# let adj = [vec![], vec![2, 3, 4], vec![5, 6], vec![], vec![7], vec![], vec![8], vec![], vec![]];
# let mut count = vec![0; 9];
# dfs(1,0,&adj, &mut count);
fn dfs(s: usize, e: usize, adj: &[Vec<usize>], count: &mut [usize]) {
# println!("count: {:?}", &count);
    // the subtree of s contains s itself
    count[s] = 1;
    for &u in &adj[s] {
        if u == e {
            continue;
        }
        dfs(u, s, adj, count);
        // add the size of the child's subtree
        count[s] += count[u];
    }
}
```
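
The other quantity mentioned above, the length of the longest path from each node to a leaf, can be computed with the same pattern.
The following is a sketch under the same conventions as the code above; the array name `to_leaf` is an assumption chosen for this example.

```rust
/// Fills `to_leaf[s]` with the length of the longest downward path
/// from s to a leaf, for every node s in the subtree of the start node.
fn dfs(s: usize, e: usize, adj: &[Vec<usize>], to_leaf: &mut [usize]) {
    // A leaf has distance 0 to itself.
    to_leaf[s] = 0;
    for &u in &adj[s] {
        if u == e {
            continue;
        }
        dfs(u, s, adj, to_leaf);
        // Going through child u costs one extra edge.
        to_leaf[s] = to_leaf[s].max(to_leaf[u] + 1);
    }
}
```

The only change compared to the subtree-size calculation is that the values of the children are combined with a maximum instead of a sum.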
