# Chapter 08: Tree

## 8.1 General Trees

### 8.1.1 Tree Definitions and Properties

A **tree** is an abstract data type that stores elements hierarchically.

##### Formal Tree Definition

Formally, we define a **tree** $T$ as a set of **nodes** storing elements such that the nodes have a **parent-child** relationship that satisfies the following properties:

* If $T$ is nonempty, it has a special node, called the **root** of $T$, that has no parent.
* Each node $v$ of $T$ different from the root has a unique **parent** node $w$; every node with parent $w$ is a **child** of $w$.

##### Other Node Relationships

Two nodes that are children of the same parent are **siblings**. A node $v$ is **external** if $v$ has no children. A node $v$ is **internal** if it has one or more children. External nodes are also known as **leaves**.
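
These relationships can be made concrete with a small sketch. It assumes a hypothetical tree stored as a dictionary `parent` that maps each node label to the label of its parent (`None` for the root). The labels and the helper names `children_of` and `siblings_of` are illustrative only and are not part of the tree ADT.

```python
# Hypothetical example tree, stored as child -> parent (None marks the root).
parent = {
    "A": None,
    "B": "A", "C": "A",
    "D": "B", "E": "B", "F": "C",
}

def children_of(node):
    """Return the children of node, in insertion order."""
    return [v for v, w in parent.items() if w == node]

def siblings_of(node):
    """Return the other children of node's parent (empty for the root)."""
    if parent[node] is None:
        return []
    return [v for v in children_of(parent[node]) if v != node]

root = [v for v, w in parent.items() if w is None]
leaves = [v for v in parent if not children_of(v)]    # external nodes
internal = [v for v in parent if children_of(v)]      # internal nodes

print(root)              # ['A']
print(siblings_of("D"))  # ['E']
print(leaves)            # ['D', 'E', 'F']
print(internal)          # ['A', 'B', 'C']
```

Every node is either external or internal, so the two lists together cover the whole tree.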

##### Ordered Trees

A tree is **ordered** if there is a meaningful linear order among the children of each node; that is, we purposefully identify the children of a node as being the first, second, third, and so on.

### 8.1.2 The Tree Abstract Data Type

We define a tree ADT using the concept of a **position** as an abstraction for a node of a tree. An element is stored at each position, and positions satisfy parent-child relationships that define the tree structure. A position object for a tree supports the method:

* `p.element()`: Return the element stored at position `p`.

The tree ADT itself then supports the following methods:

* `T.root()`: Return the position of the root of tree `T`, or `None` if `T` is empty.
* `T.is_root(p)`: Return `True` if position `p` is the root of tree `T`.
* `T.parent(p)`: Return the position of the parent of position `p`, or `None` if `p` is the root of `T`.
* `T.num_children(p)`: Return the number of children of position `p`.
* `T.children(p)`: Generate an iteration of the children of position `p`.
* `T.is_leaf(p)`: Return `True` if position `p` does not have any children.
* `len(T)`: Return the number of positions (and hence elements) that are contained in tree `T`.
* `T.is_empty()`: Return `True` if tree `T` does not contain any positions.
* `T.positions()`: Generate an iteration of all positions of tree `T`.
* `iter(T)`: Generate an iteration of all elements stored within tree `T`.

In [1]:
from abc import ABC, abstractmethod

class Tree(ABC):
    """Abstract base class representing a tree structure."""
    
    class Position(ABC):
        """An abstraction representing the location of a single element."""
        
        @abstractmethod
        def element(self):
            """Return the element stored at this Position."""
            pass
        
        @abstractmethod
        def __eq__(self, other):
            """Return True if other Position represents the same location."""
            pass
        
        def __ne__(self, other):
            """Return True if other does not represent the same location."""
            return not (self == other)

    @abstractmethod
    def root(self):
        """Return Position representing the tree's root (or None if empty)."""
        pass
    
    @abstractmethod
    def parent(self, p):
        """Return Position representing p's parent (or None if p is root)."""
        pass
    
    @abstractmethod
    def num_children(self, p):
        """Return the number of children that Position p has."""
        pass
    
    @abstractmethod
    def children(self, p):
        """Generate an iteration of Positions representing p's children."""
        pass
    
    @abstractmethod
    def __len__(self):
        """Return the total number of elements in the tree."""
        pass
    
    def is_root(self, p):
        """Return True if Position p represents the root of the tree."""
        return self.root() == p
    
    def is_leaf(self, p):
        """Return True if Position p does not have any children."""
        return self.num_children(p) == 0
    
    def is_empty(self):
        """Return True if the tree is empty."""
        return len(self) == 0
    
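    def depth(self, p):
        """Return the number of levels separating Position p from the root."""
        if self.is_root(p):
            return 0
        else:
            return 1 + self.depth(self.parent(p))
        
    def _height2(self, p):
        """Return the height of the subtree rooted at Position p."""
        if self.is_leaf(p):
            return 0
        else:
            return 1 + max(self._height2(c) for c in self.children(p))
    
    def height(self, p=None):
        """Return the height of the subtree rooted at Position p.

        If p is None, return the height of the entire tree.
        """
        if p is None:
            p = self.root()
        return self._height2(p)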
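
To see the accessor methods in action without the abstract machinery, the following is a minimal sketch of a concrete tree. It makes several simplifying assumptions: the class name `DictTree` and the method `add_child` are hypothetical, node labels themselves play the role of positions (so there is no separate `Position` class), and the class does not inherit from the abstract base class so that the example stays self-contained.

```python
class DictTree:
    """Hypothetical concrete tree; labels act as positions (a simplification)."""

    def __init__(self, root):
        self._root = root
        self._parent = {root: None}     # label -> parent label
        self._children = {root: []}     # label -> ordered list of child labels

    def add_child(self, p, label):
        """Attach a new node `label` as the last child of existing node p."""
        self._parent[label] = p
        self._children[label] = []
        self._children[p].append(label)
        return label

    def root(self):
        return self._root

    def parent(self, p):
        return self._parent[p]

    def num_children(self, p):
        return len(self._children[p])

    def children(self, p):
        yield from self._children[p]

    def __len__(self):
        return len(self._parent)

    def is_root(self, p):
        return p == self._root

    def is_leaf(self, p):
        return self.num_children(p) == 0


T = DictTree("A")
b = T.add_child("A", "B")
T.add_child("A", "C")
T.add_child(b, "D")

print(len(T))                          # 4
print(list(T.children("A")))           # ['B', 'C']
print(T.parent("D"))                   # B
print(T.is_leaf("C"), T.is_root("A"))  # True True
```

Using labels as positions keeps the sketch short, but it also means labels must be unique and hashable, which a real `Position` abstraction would not require.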

### 8.1.3 Computing Depth and Height

##### Depth

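The **depth** of a position $p$ is the number of ancestors of $p$, excluding $p$ itself. Equivalently, the depth of the root is 0, and the depth of any other position is one more than the depth of its parent.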

The running time of `T.depth(p)` for position `p` is $O(d_p + 1)$, where $d_p$ denotes the depth of $p$ in the tree $T$, because the algorithm performs a constant-time recursive step for each ancestor of $p$. Thus, algorithm `T.depth(p)` runs in $O(n)$ worst-case time, where $n$ is the total number of positions of $T$, because a position of $T$ may have depth $n-1$ if all nodes form a single branch.

In [2]:
def depth(self, p):
    if self.is_root(p):
        return 0
    else:
        return 1 + self.depth(self.parent(p))
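
The worst case can be illustrated with a short sketch. It assumes a hypothetical parent map (`node -> parent`, with `None` for the root) rather than the position-based ADT, and it walks up the tree iteratively instead of recursively; the function name `depth` and the variable `chain` are chosen for illustration only.

```python
def depth(parent, p):
    """Count the ancestors of p (excluding p) by walking up to the root."""
    d = 0
    while parent[p] is not None:
        p = parent[p]
        d += 1
    return d

n = 6
# Chain 0 <- 1 <- 2 <- ... <- n-1: all nodes form a single branch.
chain = {0: None, **{i: i - 1 for i in range(1, n)}}

print(depth(chain, 0))      # 0
print(depth(chain, n - 1))  # 5
```

The deepest node of the chain has depth $n - 1$, which matches the $O(n)$ worst-case bound. An iterative walk like this one also sidesteps Python's recursion limit on very deep trees.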

##### Height

The **height** of a position $p$ in a tree $T$ is also defined recursively:

* If $p$ is a leaf, then the height of $p$ is 0.
* Otherwise, the height of $p$ is one more than the maximum of the heights of $p$'s children.

*The height of a nonempty tree $T$ is equal to the maximum of the depths of its leaf positions.*

In [3]:
def _height1(self):
    """Return the height of the tree as the maximum depth over its leaves."""
    return max(self.depth(p) for p in self.positions() if self.is_leaf(p))

However, algorithm `_height1` is not very efficient. Because `_height1` calls algorithm `depth(p)` on each leaf of $T$, its running time is $O(n + \sum_{p \in L}(d_p + 1))$, where $L$ is the set of leaf positions of $T$. In the worst case, the sum $\sum_{p \in L}(d_p + 1)$ is proportional to $n^2$, so the algorithm runs in $O(n^2)$ time.
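
The quadratic growth can be observed on a small family of trees. The sketch below is an illustration under stated assumptions: the tree shape (a "spine" of $k$ nodes, each with one extra leaf child) and the helper names `build_spine_tree` and `height1_cost` are hypothetical, and the tree is stored as a plain parent map rather than through the position-based ADT. The function counts the quantity $\sum_{p \in L}(d_p + 1)$ directly.

```python
def build_spine_tree(k):
    """Return a parent map with k spine nodes, each carrying one leaf child."""
    parent = {("s", 0): None}
    for i in range(1, k):
        parent[("s", i)] = ("s", i - 1)
    for i in range(k):
        parent[("l", i)] = ("s", i)
    return parent

def height1_cost(parent):
    """Return (height, steps), where steps is the sum over leaves of d_p + 1."""
    internal = set(parent.values()) - {None}
    height, steps = 0, 0
    for p in parent:
        if p in internal:
            continue                      # only leaves contribute
        d, q = 0, p
        while parent[q] is not None:      # walk from the leaf up to the root
            q = parent[q]
            d += 1
        height = max(height, d)
        steps += d + 1
    return height, steps

for k in (4, 8, 16):
    print(2 * k, height1_cost(build_spine_tree(k)))
# 8 (4, 14)
# 16 (8, 44)
# 32 (16, 152)
```

For this shape the step count is $k(k+3)/2$ with $n = 2k$, so doubling $n$ multiplies the work by a factor that approaches four.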

The recursive definition of height leads to a more efficient algorithm:

In [4]:
def _height2(self, p):
    """Return the height of the subtree rooted at Position p."""
    if self.is_leaf(p):
        return 0
    else:
        return 1 + max(self._height2(c) for c in self.children(p))

It is important to understand why algorithm `height2` is more efficient than `height1`. The algorithm is recursive, and it progresses in a top-down fashion. If the method is initially called on the root of $T$, it will eventually be called once for each position of $T$. This is because the root eventually invokes the recursion on each of its children, which in turn invokes the recursion on each of their children, and so on.

We can determine the running time of the `height2` algorithm by summing, over all the positions, the amount of time spent on the nonrecursive part of each call. In our implementation, there is a constant amount of work per position, plus the overhead of computing the maximum over the iteration of children. Although we do not yet have a concrete implementation of `children(p)`, we assume that such an iteration is generated in $O(c_p + 1)$ time, where $c_p$ denotes the number of children of $p$. Algorithm `height2` spends $O(c_p + 1)$ time at each position $p$ to compute the maximum, and its overall running time is $O(\sum_p (c_p + 1)) = O(n + \sum_p c_p)$.

*Let $T$ be a tree with $n$ positions, and let $c_p$ denote the number of children of a position $p$ of $T$. Then, summing over the positions of $T$, $\sum_p c_p = n-1$.*

By this proposition, the running time of algorithm `height2` is $O(n)$, where $n$ is the number of positions of $T$.
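
Both facts used in this argument can be checked on a small example: the recursion visits each position exactly once, and the child counts sum to $n - 1$ because every position except the root is the child of exactly one parent. The sketch below assumes a hypothetical tree stored as a `node -> list of children` dictionary, and the `counter` argument is added purely for instrumentation.

```python
def height2(children, p, counter):
    """Return the height of the subtree rooted at p, counting visited positions."""
    counter[0] += 1
    kids = children.get(p, [])
    if not kids:
        return 0
    return 1 + max(height2(children, c, counter) for c in kids)

# Hypothetical tree stored as node -> list of children (leaves are omitted as keys).
children = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "F": ["G"]}
nodes = {"A"} | {c for kids in children.values() for c in kids}

calls = [0]
print(height2(children, "A", calls))   # 3
print(calls[0] == len(nodes))          # True
print(sum(len(kids) for kids in children.values()) == len(nodes) - 1)  # True
```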

## 8.2 Binary Trees

A **binary tree** is an ordered tree with the following properties:

1. Every node has at most two children.
2. Each child node is labeled as being either a left child or a right child.
3. A left child precedes a right child in the order of children of a node.

The subtree rooted at a left or right child of an internal node $v$ is called a **left subtree** or **right subtree**, respectively, of $v$. A binary tree is **proper** if each node has either zero or two children. Some people also refer to such trees as being **full** binary trees. A binary tree that is not proper is **improper**.
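
The definition of a proper binary tree translates directly into a recursive check. The sketch below assumes a hypothetical representation in which a node is a tuple `(element, left, right)` and a missing child is `None`; the names `is_proper` and `leaf` are illustrative.

```python
def is_proper(node):
    """Return True if every node in the subtree has zero or two children.

    A node is a tuple (element, left, right); a missing child is None.
    """
    if node is None:
        return True
    _, left, right = node
    if (left is None) != (right is None):   # exactly one child is present
        return False
    return is_proper(left) and is_proper(right)

def leaf(x):
    """Build a node with no children."""
    return (x, None, None)

proper = ("+", leaf(3), ("*", leaf(4), leaf(5)))
improper = ("-", leaf(7), None)

print(is_proper(proper))    # True
print(is_proper(improper))  # False
```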

### 8.2.1 The Binary Tree Abstract Data Type

* `T.left(p)`: Return the position that represents the left child of `p`, or `None` if `p` has no left child.
* `T.right(p)`: Return the position that represents the right child of `p`, or `None` if `p` has no right child.
* `T.sibling(p)`: Return the position that represents the sibling of `p`, or `None` if `p` has no sibling.

In [7]:
class BinaryTree(Tree):
    """Abstract base class representing a binary tree structure."""
    
    @abstractmethod
    def left(self, p):
        """Return a Position representing p's left child.
        
        Return None if p does not have a left child.
        """
        pass
    
    @abstractmethod
    def right(self, p):
        """Return a Position representing p's right child.
        
        Return None if p does not have a right child.
        """
        pass
    
    def sibling(self, p):
        """Return a Position representing p's sibling (or None if no sibling)."""
        parent = self.parent(p)
        if parent is None:
            return None
        else:
            if p == self.left(parent):
                return self.right(parent)
            else:
                return self.left(parent)
            
    def children(self, p):
        """Generate an iteration of Positions representing p's children."""
        if self.left(p) is not None:
            yield self.left(p)
        
        if self.right(p) is not None:
            yield self.right(p)
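
The logic of `sibling` and `children` can be exercised on its own with a small linked structure. This is a sketch under assumptions: the `Node` class with `parent`, `left`, and `right` fields is hypothetical and stands in for a concrete implementation, and the functions operate on nodes directly rather than on positions.

```python
class Node:
    """Hypothetical linked node for a binary tree."""

    def __init__(self, element, parent=None):
        self.element = element
        self.parent = parent
        self.left = None
        self.right = None

def sibling(node):
    """Return the other child of node's parent, or None if there is none."""
    if node.parent is None:
        return None                  # the root has no sibling
    if node is node.parent.left:
        return node.parent.right     # may be None
    return node.parent.left          # may be None

def children(node):
    """Yield the left child, then the right child, skipping missing ones."""
    if node.left is not None:
        yield node.left
    if node.right is not None:
        yield node.right

root = Node("A")
root.left = Node("B", root)
root.right = Node("C", root)
root.left.left = Node("D", root.left)

print(sibling(root.left).element)                # C
print(sibling(root.left.left))                   # None
print(sibling(root))                             # None
print([c.element for c in children(root.left)])  # ['D']
```

Yielding the left child before the right child is what makes the iteration respect the order required of a binary tree.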

### 8.2.2 Properties of Binary Trees

##### Proposition

Let $T$ be a nonempty binary tree, and let $n$, $n_E$, $n_I$, and $h$ denote the number of nodes, number of external nodes, number of internal nodes, and height of $T$, respectively. Then $T$ has the following properties:

1. $h+1 \leq n \leq 2^{h+1} -1$
2. $1 \leq n_E \leq 2^h$
3. $h \leq n_I \leq 2^h - 1$
4. $\log(n+1) -1 \leq h \leq n - 1$

Also, if $T$ is proper, then $T$ has the following properties:

1. $2h + 1 \leq n \leq 2^{h+1} - 1$
2. $h +1 \leq n_E \leq 2^h$
3. $h \leq n_I \leq 2^h -1$
4. $\log(n+1) -1 \leq h \leq (n-1)/2$
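
The general bounds can be spot-checked on a concrete tree. The sketch below assumes a hypothetical tuple representation `(element, left, right)` with `None` for a missing child, takes the logarithm to be base 2, and uses the illustrative names `stats` and `leaf`. It checks one example only and is not a proof.

```python
import math

def stats(node):
    """Return (n, n_E, n_I, h) for a nonempty tree of (element, left, right) tuples."""
    _, left, right = node
    kids = [c for c in (left, right) if c is not None]
    if not kids:
        return 1, 1, 0, 0            # a single external node
    n, n_e, n_i, h = 1, 0, 1, 0      # count this internal node
    for c in kids:
        cn, ce, ci, ch = stats(c)
        n, n_e, n_i, h = n + cn, n_e + ce, n_i + ci, max(h, ch + 1)
    return n, n_e, n_i, h

def leaf(x):
    """Build a node with no children."""
    return (x, None, None)

# An improper binary tree: node B has only a left child.
tree = ("A", ("B", leaf("D"), None), ("C", leaf("E"), leaf("F")))
n, n_e, n_i, h = stats(tree)

print(n, n_e, n_i, h)                        # 6 3 3 2
print(h + 1 <= n <= 2 ** (h + 1) - 1)        # True
print(1 <= n_e <= 2 ** h)                    # True
print(h <= n_i <= 2 ** h - 1)                # True
print(math.log2(n + 1) - 1 <= h <= n - 1)    # True
```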

##### Proposition

In a nonempty proper binary tree $T$, with $n_E$ external nodes and $n_I$ internal nodes, we have $n_E = n_I + 1$.
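
One way to see why this holds is to grow a proper binary tree step by step: starting from a single leaf, each step replaces one leaf with an internal node that has two leaf children, which increases both $n_E$ and $n_I$ by one. The sketch below follows that idea; the function name `random_proper_tree`, the integer node ids, and the `node -> children` dictionary are assumptions made for illustration.

```python
import random

def random_proper_tree(expansions, seed=0):
    """Grow a proper binary tree by repeatedly giving a random leaf two children."""
    rng = random.Random(seed)
    children = {0: []}               # node id -> list of child ids
    leaves = [0]
    for _ in range(expansions):
        v = leaves.pop(rng.randrange(len(leaves)))
        a, b = len(children), len(children) + 1   # fresh ids for the new leaves
        children[v] = [a, b]
        children[a], children[b] = [], []
        leaves += [a, b]
    return children

for k in (0, 1, 5, 50):
    t = random_proper_tree(k)
    n_e = sum(1 for kids in t.values() if not kids)
    n_i = len(t) - n_e
    print(k, n_e, n_i, n_e == n_i + 1)
# 0 1 0 True
# 1 2 1 True
# 5 6 5 True
# 50 51 50 True
```

The counts do not depend on which leaf is chosen at each step, so the output is the same for any seed.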

## 8.3 Implementing Trees