In [1]:
#include <iostream>
#include <algorithm>
#include "../src/SinglyLinkedList.h"
#include "../src/TreeTraversals.h"

# Trees

## Tree ADT

- **Tree** is a hierarchical data structure    
    - A rooted tree data structure stores information in **nodes**
        - Often stores data that is related to each other
    - Defined recursively
        - Trees are built using smaller trees

![](../img/tree.png)

- Similar to linked lists
    - There is a first node, or **root**
    - Each node has variable number of references to successors
    - Each node, other than the root, has exactly one node pointing to it

## Terminology

- All nodes will have zero or more **child nodes** or children
    - `I` has three children:  `J`, `K` and `L`	
- For all nodes other than the root node, there is one **parent node**
    - `H` is the parent of `I`
- The degree of a node is defined as the number of its children
    - `deg(I) = 3`

![](../img/branch.png)

- Nodes with the same parent are **siblings**
    - `J`, `K`, and `L` are siblings

- Nodes with degree zero are also called **leaf nodes**
	
- All other nodes are said to be *internal nodes*, that is, they are internal to the tree
    
![](../img/leafs.png)

### Leaf Nodes
![](../img/nodes1.png)

### Internal Nodes
![](../img/nodes2.png)

- These trees are equal if the order of the children is ignored (*unordered trees*)

| | |
|-|-|
|![](../img/tree1.png)|![](../img/tree2.png)|
	

- They are different if order is relevant (*ordered trees*)
    - We will usually examine ordered trees (*linear orders*)
    - In a hierarchical ordering, order is not relevant
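
The difference between the two notions of equality can be sketched in code. This is a minimal illustration, not part of the course library: `OrderedNode`, `equal_ordered`, `canonical`, and `equal_unordered` are hypothetical names, and labels are assumed to be single characters other than parentheses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical value-type node, defined only for this illustration.
struct OrderedNode {
    char label;
    std::vector<OrderedNode> children;
};

// Ordered equality: children are compared position by position.
bool equal_ordered( const OrderedNode& a, const OrderedNode& b ) {
    if ( a.label != b.label || a.children.size() != b.children.size() ) {
        return false;
    }
    for ( std::size_t i = 0; i < a.children.size(); ++i ) {
        if ( !equal_ordered( a.children[i], b.children[i] ) ) {
            return false;
        }
    }
    return true;
}

// Canonical encoding: the label followed by the sorted encodings of the children.
std::string canonical( const OrderedNode& node ) {
    std::vector<std::string> parts;
    for ( const OrderedNode& child : node.children ) {
        parts.push_back( canonical( child ) );
    }
    std::sort( parts.begin(), parts.end() );

    std::string encoding( 1, node.label );
    encoding += '(';
    for ( const std::string& part : parts ) {
        encoding += part;
    }
    encoding += ')';
    assert( encoding.size() >= 3 ); // label plus a pair of parentheses
    return encoding;
}

// Unordered equality: two trees are equal if their canonical encodings match.
bool equal_unordered( const OrderedNode& a, const OrderedNode& b ) {
    return canonical( a ) == canonical( b );
}
```

Two trees that differ only in the order of the children of some node compare equal under `equal_unordered` but not under `equal_ordered`.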

- The shape of a rooted tree gives a natural flow from the *root node*, or just **root**

![](../img/flow.png)

- A **path** is a sequence of nodes $(a_0, a_1, \ldots, a_n)$
    - where $a_{k + 1}$ is a child of $a_k$
	
- The length of this path is $n$
    - E.g., the path $(B, E, G)$ has length 2
    
![](../img/path1.png)

- Paths of length <span style="color:gold">10 (11 nodes)</span> and <span style="color:red">4 (5 nodes)</span>
![](../img/path2.png)

- For each node in a tree, there exists a unique path from the root node to that node
	
- The length of this path is the **depth** of the node, e.g.,
    - `E` has depth 2
    - `L` has depth 3
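
Because the path from the root to a node is unique, the depth can be computed by following parent links upward and counting them. The sketch below assumes a hypothetical `PathNode` type that stores only a label and a parent pointer; it is not the `SimpleTree` class used elsewhere in these notes.

```cpp
#include <cassert>

// Hypothetical parent-pointer node, defined only for this illustration.
struct PathNode {
    char label;
    const PathNode* parent; // nullptr for the root
};

// Depth of a node: the number of parent links between it and the root.
int depth_of( const PathNode* node ) {
    assert( node != nullptr );
    int depth = 0;
    while ( node->parent != nullptr ) {
        node = node->parent;
        ++depth;
    }
    return depth;
}
```

For a chain `A`, `B`, `E` in which each node is the parent of the next, `depth_of` returns 0, 1, and 2 respectively.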

![](../img/depth1.png)

- Nodes of depth up to 17

![](../img/depth2.png)

- The **height** of a tree is defined as the maximum depth of any node within the tree

- The height of a tree with one node is 0
    - Just the root node

- For convenience, we define the height of the empty tree to be  –1

- If a path exists from node $a$ to node $b$
    - $a$ is an **ancestor** of $b$
    - $b$ is a **descendant** of $a$

- Thus, a node is both an ancestor and a descendant of itself
    - We can add the adjective *strict* to exclude equality
        - $a$ is a strict descendant of $b$ if $a$ is a descendant of $b$ but $a \neq b$

- The root node is an ancestor of all nodes
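
The ancestor relation, including the convention that a node is its own ancestor, can be checked by walking parent links. This is a sketch under the assumption of a hypothetical `LinkNode` type with a parent pointer; `is_ancestor` and `is_strict_ancestor` are illustrative names.

```cpp
#include <cassert>

// Hypothetical parent-pointer node, defined only for this illustration.
struct LinkNode {
    char label;
    const LinkNode* parent; // nullptr for the root
};

// True if a path exists from a down to b; every node is an ancestor of itself.
bool is_ancestor( const LinkNode* a, const LinkNode* b ) {
    assert( a != nullptr && b != nullptr );
    for ( const LinkNode* current = b; current != nullptr; current = current->parent ) {
        if ( current == a ) {
            return true;
        }
    }
    return false;
}

// The strict version excludes equality.
bool is_strict_ancestor( const LinkNode* a, const LinkNode* b ) {
    return a != b && is_ancestor( a, b );
}
```

The run time is proportional to the depth of `b`, since at most every node on the path from `b` to the root is examined.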

- The ancestors of node `I` are `I`, `H`, and `A`:
![](../img/ancs1.png)
- The descendants of node `B` are `B`, `C`, `D`, `E`, `F`, and `G`:
![](../img/ancs2.png)

- Another approach to a tree is to define the tree recursively
    - A degree $0$ node is a tree
    - A node with degree $n$ is a tree if it has $n$ children and all of its children are disjoint trees (i.e., with no intersecting nodes)

![](../img/subtree.png)

- Given any node $a$ within a tree with root $r$, the collection of $a$ and all of its descendants is said to be a subtree of the tree, rooted at $a$

## Example: XML

- In general, any XML can be represented as a tree
    - All XML tools make use of this feature
    - Parsers convert XML into an internal tree structure
    - XML transformation languages manipulate the tree structure
        - E.g., XSLT

- MathML: $x^2 + y^2 = z^2$

```xml
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow><mrow><msup><mi>x</mi><mn>2</mn></msup><mo>+</mo>
    <msup><mi>y</mi><mn>2</mn></msup></mrow>
    <mo>=</mo><msup><mi>z</mi><mn>2</mn></msup></mrow>
    <annotation-xml encoding="MathML-Content">
      <apply><eq/>
        <apply><plus/>
          <apply><power/><ci>x</ci><cn>2</cn></apply>
          <apply><power/><ci>y</ci><cn>2</cn></apply>
        </apply>
        <apply><power/><ci>z</ci><cn>2</cn></apply>
      </apply>
    </annotation-xml>
    <annotation encoding="Maple">x^2+y^2 = z^2</annotation>
  </semantics>
</math>
```

- The tree structure for the same MathML expression is
![](../img/mathml.png)

- Why use 500 characters to describe the equation
$$x^2 + y^2 = z^2$$
which, after all, is only twelve characters (counting spaces)?

- The root contains three children, each a different coding of the same equation:
    - How it should look (presentation),
    - What it means mathematically (content), and
    - A translation to a specific language (Maple)

## Hierarchical ADT

- A hierarchical ordering of a finite number of objects may be stored in a tree data structure

- Operations on a hierarchically stored container include
    - Accessing the root
    - Given an object in the container
        - Access the parent of the current object
        - Find the degree of the current object
        - Get a reference to a child
        - Attach a new sub-tree to the current object
        - Detach this tree from its parent

## Abstract Trees

- An **abstract tree (or abstract hierarchy)** does not restrict the number of children a node may have
    - In this tree, the degrees vary
    
| Degree  | Nodes   |
|---------|---------|
| 0       | C, E, F, G, J, K |
| 1       | H |
| 2       | A, D, I |
| 3       | B |

![](../img/tree.png) 

## List-based Implementation

- We implement an abstract tree or hierarchy by using a class that:
    - Stores a value
    - Stores the children in a (linked) list

In [None]:
template <typename Type>
class SimpleTree {
private:
    Type node_value;
    SimpleTree *parent_node;
    SinglyLinkedList<SimpleTree *> children;

public:
    SimpleTree( const Type& = Type(), SimpleTree* = nullptr );
    ~SimpleTree();

    Type value() const;
    SimpleTree* parent() const;
    int degree() const;
    bool is_root() const;
    bool is_leaf() const;
    int height() const;
    int size() const;
    SimpleTree* child( int n ) const;

    void insert( const Type& );
    void attach( SimpleTree* );
    void detach();
    
    void depth_first_traversal() const;
};

- The tree with six nodes would be stored as follows:
![](../img/tree-adt.png)

- Much of the functionality is similar to that of the `SinglyLinkedNode` class

In [None]:
template <typename Type>
SimpleTree<Type>::SimpleTree( const Type &obj, SimpleTree *p) :
    node_value( obj ),
    parent_node( p ) 
{
    // Empty constructor
}

- The `children` list frees its own list nodes when it is destroyed, but it stores only pointers to the child trees, so we must explicitly delete every child ourselves.

In [None]:
template <typename Type>
SimpleTree<Type>::~SimpleTree() {
    while ( !children.empty() ) {
        auto ch = children.pop_front(); // remove child from tree node
        delete ch; // delete child node
    }
}

## Accessors

In [None]:
template <typename Type>
Type SimpleTree<Type>::value() const {
    return node_value;
}

In [None]:
template <typename Type>
SimpleTree<Type>* SimpleTree<Type>::parent() const {
    return parent_node;
}

In [None]:
template <typename Type>
bool SimpleTree<Type>::is_root() const {
    return ( parent() == nullptr );
}

In [None]:
template <typename Type>
bool SimpleTree<Type>::is_leaf() const {
    return ( degree() == 0 );
}

In [None]:
template <typename Type>
int SimpleTree<Type>::degree() const {
    return children.size();
}

- Accessing the $n^{th}$ child requires a for loop ($\Theta(n)$):

In [None]:
template <typename Type>
SimpleTree<Type>* SimpleTree<Type>::child( int n ) const {
    if ( n < 0 || n >= degree() ) {
        return nullptr;
    }

    // Skip the first n children (indices 0 to n - 1) to reach the child at index n
    SinglyLinkedNode< SimpleTree<Type>* >* child = children.begin();
    // auto child = children.begin();
    for ( int i = 1; i <= n; ++i ) {
        child = child->next();
    }

    return child->value();
}

- To detach a tree from its parent
    - If it is already a root, do nothing
    - Otherwise, erase this object from the parent's list of children and set the parent pointer to `nullptr`

In [None]:
template <typename Type>
void SimpleTree<Type>::detach() {
    if ( is_root() ) {
        return;
    }
    parent()->children.remove( this );
    parent_node = nullptr;
}

- Attaching an entirely new tree as a subtree, however, first requires us to check if the tree is not already a subtree of another node
    - If so, we must detach it first and only then can we add it

In [None]:
template <typename Type>
void SimpleTree<Type>::attach( SimpleTree<Type> *tree ) {
    if ( !tree->is_root() ) {
        tree->detach();
    }
    tree->parent_node = this;
    children.push_back( tree );
}

- Inserting a new object to become a child is similar to a linked list

In [None]:
template <typename Type>
void SimpleTree<Type>::insert( const Type& obj ) {
    // Create the node as a root; attach() sets its parent pointer and avoids a needless detach()
    attach( new SimpleTree( obj ) );
}

## Size

- Suppose we want to find the size of a tree
    - An empty tree has size `0`, a tree with no children has size `1`
    - Otherwise, the size is one plus the size of all the children

In [None]:
template <typename Type>
int SimpleTree<Type>::size() const {
    // `this` is never null in well-defined C++, so an empty tree (size 0)
    // must be handled by the caller before invoking size()

    int tree_size = 1;
    for ( auto *child = children.begin(); child != nullptr; child = child->next() ) {
        tree_size += child->value()->size();
    }
    return tree_size ;
}

## Array-based Implementation

- Implementing a tree by storing the children in an array is similar; however, we must handle the case where the array becomes full
    - A general tree using an array would have a constructor similar to

In [None]:
template <typename Type>
SimpleTree<Type>::SimpleTree( Type const &obj, SimpleTree *p ):
    node_value( obj ),
    parent_node( p ),
    child_count( 0 ),
    child_capacity( 4 ),
    children( new SimpleTree*[child_capacity] )
{
    // Empty constructor
}
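
One way to deal with a full array is to double its capacity before inserting. The class below is a minimal sketch under that assumption, not the course implementation: `ArrayTree`, `grow`, and `capacity` are hypothetical names, and only insertion and child access are shown.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical array-backed tree node, defined only for this illustration.
template <typename Type>
class ArrayTree {
public:
    explicit ArrayTree( const Type& obj, ArrayTree* p = nullptr ) :
        node_value( obj ),
        parent_node( p ),
        child_count( 0 ),
        child_capacity( 4 ),
        children( new ArrayTree*[child_capacity] ) {}

    ~ArrayTree() {
        for ( int i = 0; i < child_count; ++i ) {
            delete children[i]; // each child deletes its own subtree
        }
        delete[] children;
    }

    ArrayTree( const ArrayTree& ) = delete;
    ArrayTree& operator=( const ArrayTree& ) = delete;

    void insert( const Type& obj ) {
        if ( child_count == child_capacity ) {
            grow();
        }
        assert( child_count < child_capacity );
        children[child_count++] = new ArrayTree( obj, this );
    }

    // Array storage gives constant-time access to the n-th child.
    ArrayTree* child( int n ) const {
        return ( n < 0 || n >= child_count ) ? nullptr : children[n];
    }

    Type value() const { return node_value; }
    int degree() const { return child_count; }
    int capacity() const { return child_capacity; }

private:
    // Double the capacity and copy the existing child pointers across.
    void grow() {
        int new_capacity = 2 * child_capacity;
        ArrayTree** bigger = new ArrayTree*[new_capacity];
        std::copy( children, children + child_count, bigger );
        delete[] children;
        children = bigger;
        child_capacity = new_capacity;
    }

    Type node_value;
    ArrayTree* parent_node;
    int child_count;
    int child_capacity;
    ArrayTree** children;
};
```

The trade-off is that `child( n )` becomes $\Theta(1)$, while an insertion that triggers `grow` costs time proportional to the current degree.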

## Complexity

- We can summarize the performance of the linked-structure implementation of a tree.

| Operation | Time |
|----------:|:----:|
|is_root, is_leaf| O(1) |
|parent     | O(1) |
|child(p)   | O(d) |
|empty, degree | O(1) |
|attach, detach, insert | O(1) |
|height, size| O(n) |

- Here $d$ is the degree of node $p$ (the number of its children) and $n$ is the number of elements stored in the tree

## Summary

- We have looked at one implementation of a general tree
    - store the value of each node
    - store all the children in a linked list
    - no easy ($\Theta(1)$) way to access an arbitrary child
    - if we use an array, different problems...

# Tree Traversals

## Background

- All the objects stored in an array or linked list can be accessed sequentially

- When discussing deques, we introduced iterators in C++
    - These allow the user to step through all the objects in a container

- Question: How can we iterate through all the objects in a tree in a predictable and efficient manner?
    - Requirements: $\Theta(n)$ run time and $O(n)$ memory 

- We have already seen one traversal
    - The **breadth-first traversal** visits all nodes at depth $k$ before proceeding onto depth $k + 1$
    - Easy to implement using a queue

- Another approach is to always go as deep as possible before visiting other siblings: **depth-first traversals**

## Breadth-First Traversal

- Breadth-first traversals visit all nodes at a given depth
    - Can be implemented using a queue
    - Run time is $\Theta(n)$
    - Memory is potentially expensive: maximum nodes at a given depth
    - Order: `A B H C D G I E F J K`

![](../img/bfs02.png)

## Queue-based Implementation

- The implementation was already discussed
    - Create a queue and push the root node onto the queue
    - While the queue is not empty
        - Push all of the children of the front node onto the queue
        - Pop the front node
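
The steps above can be sketched with `std::queue`. The node type `BfsNode` and the helper `make_bfs_example` are hypothetical; the example tree is an assumed shape that is consistent with the order `A B H C D G I E F J K` listed earlier.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical value-type node, defined only for this illustration.
struct BfsNode {
    char label;
    std::vector<BfsNode> children;
};

// Returns the labels in breadth-first order.
std::string breadth_first( const BfsNode& root ) {
    std::string order;
    std::queue<const BfsNode*> queue;
    queue.push( &root );

    while ( !queue.empty() ) {
        const BfsNode* front = queue.front();
        assert( front != nullptr );
        // Push all of the children of the front node, then pop it
        for ( const BfsNode& child : front->children ) {
            queue.push( &child );
        }
        queue.pop();
        order += front->label; // "visit" the node
    }
    return order;
}

// Assumed example: A -> B, H;  B -> C, D, G;  H -> I;  D -> E, F;  I -> J, K
BfsNode make_bfs_example() {
    return BfsNode{ 'A', {
        BfsNode{ 'B', {
            BfsNode{ 'C', {} },
            BfsNode{ 'D', { BfsNode{ 'E', {} }, BfsNode{ 'F', {} } } },
            BfsNode{ 'G', {} } } },
        BfsNode{ 'H', {
            BfsNode{ 'I', { BfsNode{ 'J', {} }, BfsNode{ 'K', {} } } } } }
    } };
}
```

Each node is pushed and popped exactly once, which gives the $\Theta(n)$ run time; the queue never holds more nodes than two adjacent levels contain together.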

## Depth & Height

- The **depth** of a node is the number of its ancestors, excluding itself.

- Suppose we want to find the height of a tree
    - An empty tree has height `-1` and a tree with no children is height `0`
    - Otherwise, the height is one plus the maximum height of any sub tree

In [None]:
template <typename Type>
int SimpleTree<Type>::height() const {
    int tree_height = 0;
    for ( auto *child = children.begin(); child != nullptr; child = child->next() ) {
        tree_height = std::max( tree_height, 1 + child->value()->height() );
    }
    return tree_height;
}

- The `height` function is recursive in nature
    - Before the children are traversed, we assume that the node has no children and we set the height to zero:  $h_{current} = 0$
    - In recursively traversing the children, each child returns its height $h$ and we update the height if $1 + h > h_{current}$
    - Once all children have been traversed, we return $h_{current}$

- When the root returns a value, that is the height of the tree
![](../img/height.png)

### Example

In [3]:
// Create a tree
TreeTraversals<char> t('A');
t.insert('B'); t.insert('H');
auto ch = t.child(0); ch->insert('C'); ch->insert('E');
ch = t.child(0)->child(0); ch->insert('D');
ch = t.child(0)->child(1); ch->insert('F'); ch->insert('G');
ch = t.child(1); ch->insert('I'); ch->insert('M');
ch = t.child(1)->child(0); ch->insert('J'); ch->insert('K'); ch->insert('L');

In [7]:
{
    std::cout << t.value() << ": height = " << t.height() << std::endl;
    auto ch = (TreeTraversals<char>*)t.child(0);
    std::cout << ch->value() << ": height = " << ch->height() << std::endl;
    std::cout << ch->value() << ": depth = " << t.depth(ch) << std::endl;
    ch = (TreeTraversals<char>*)(t.child(0)->child(1)->child(1));
    std::cout << ch->value() << ": depth = " << t.depth(ch) << std::endl;
}

A: height = 3
B: height = 2
B: depth = 1
G: depth = 3


## Backtracking

- First, we will define a *backtracking* algorithm for stepping through a tree
    - At any node, we proceed to the first child that has not yet been visited
    - Or, if we have visited all the children (of which a leaf node is a special case), we *backtrack to the parent* and repeat this decision making process
    - We end once all the children of the root are visited

![](../img/backtrack.png)
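
The backtracking rule can be sketched without recursion by remembering, for every node on the current path, which child to try next. `TrailNode`, `backtrack`, and `make_trail_example` are hypothetical names; the example tree is the one built in the height example (`A` with children `B` and `H`, and so on).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical value-type node, defined only for this illustration.
struct TrailNode {
    char label;
    std::vector<TrailNode> children;
};

// Emits <X> when a node is first reached and </X> when we backtrack from it.
std::string backtrack( const TrailNode& root ) {
    std::string out;
    // Path from the root to the current node, with the next child index to try
    std::vector<std::pair<const TrailNode*, std::size_t>> path;
    path.emplace_back( &root, std::size_t{ 0 } );
    out += '<'; out += root.label; out += '>';

    while ( !path.empty() ) {
        const TrailNode* node = path.back().first;
        std::size_t& next = path.back().second;
        assert( next <= node->children.size() );

        if ( next < node->children.size() ) {
            // Proceed to the first child that has not yet been visited
            const TrailNode* child = &node->children[next];
            ++next;
            out += '<'; out += child->label; out += '>';
            path.emplace_back( child, std::size_t{ 0 } );
        } else {
            // All children visited (or a leaf): backtrack to the parent
            out += "</"; out += node->label; out += '>';
            path.pop_back();
        }
    }
    return out;
}

TrailNode make_trail_example() {
    return TrailNode{ 'A', {
        TrailNode{ 'B', {
            TrailNode{ 'C', { TrailNode{ 'D', {} } } },
            TrailNode{ 'E', { TrailNode{ 'F', {} }, TrailNode{ 'G', {} } } } } },
        TrailNode{ 'H', {
            TrailNode{ 'I', { TrailNode{ 'J', {} }, TrailNode{ 'K', {} }, TrailNode{ 'L', {} } } },
            TrailNode{ 'M', {} } } }
    } };
}
```

The path never holds more than $h + 1$ entries, where $h$ is the height of the tree.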

## Depth-First Traversal

- We define such a path as a depth-first traversal

- We note that each node could be visited twice in such a scheme
    - The first time the node is approached (before any children), **preorder traversal**
    - The last time it is approached (after all children), **postorder traversal**
    
![](../img/dfs.png)    

## Implementing depth-first traversals

- Depth-first traversals can be implemented with recursion

In [36]:
template <typename Type>
void SimpleTree<Type>::depth_first_traversal() const {
    // Perform pre-visit operations on the value
    std::cout << "<" << node_value << ">";
    // Perform a depth-first traversal on each of the children
    for ( auto *child = children.begin(); child != nullptr; child = child->next() ) {
       child->value()->depth_first_traversal();
    }
    // Perform post-visit operations on the value
    std::cout << "</" << node_value << ">";
}

- Alternatively, we can use a stack
    - Create a stack and push the root node onto the stack
    - While the stack is not empty
        - Pop the top node 
        - Push all of the children of that node to the top of the stack in *reverse order*
- Run time is $\Theta(n)$
    - The objects on the stack are all unvisited siblings from the root to the current node
        - If each node has a maximum of two children, the memory required is $\Theta(h)$
            - where $h$ is the height of the tree

- With the recursive implementation, the memory is $\Theta(h)$
    - recursion just hides the memory
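
The stack-based variant can be sketched with `std::stack`. It produces the preorder sequence only; `DfsNode`, `preorder_with_stack`, and `make_dfs_example` are hypothetical names, and the example tree is the one built in the height example.

```cpp
#include <cassert>
#include <stack>
#include <string>
#include <vector>

// Hypothetical value-type node, defined only for this illustration.
struct DfsNode {
    char label;
    std::vector<DfsNode> children;
};

// Returns the labels in preorder using an explicit stack.
std::string preorder_with_stack( const DfsNode& root ) {
    std::string order;
    std::stack<const DfsNode*> stack;
    stack.push( &root );

    while ( !stack.empty() ) {
        const DfsNode* top = stack.top();
        assert( top != nullptr );
        stack.pop();
        order += top->label; // "visit" the node

        // Push the children in reverse order so the first child is popped first
        for ( auto it = top->children.rbegin(); it != top->children.rend(); ++it ) {
            stack.push( &*it );
        }
    }
    return order;
}

DfsNode make_dfs_example() {
    return DfsNode{ 'A', {
        DfsNode{ 'B', {
            DfsNode{ 'C', { DfsNode{ 'D', {} } } },
            DfsNode{ 'E', { DfsNode{ 'F', {} }, DfsNode{ 'G', {} } } } } },
        DfsNode{ 'H', {
            DfsNode{ 'I', { DfsNode{ 'J', {} }, DfsNode{ 'K', {} }, DfsNode{ 'L', {} } } },
            DfsNode{ 'M', {} } } }
    } };
}
```

Pushing the children in forward order instead would still be a depth-first traversal, but it would visit the last child of each node first.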

### Example

In [11]:
std::cout << "Preorder:" << std::endl;
t.depth_first_traversal(TreeTraversals<char>::Order::Pre);
std::cout << "\nPostorder:" << std::endl;
t.depth_first_traversal(TreeTraversals<char>::Order::Post);
std::cout << "\nDepth First:" << std::endl;
t.depth_first_traversal(TreeTraversals<char>::Order::Both);

Preorder:
<A><B><C><D><E><F><G><H><I><J><K><L><M>
Postorder:
</D></C></F></G></E></B></J></K></L></I></M></H></A>
Depth First:
<A><B><C><D></D></C><E><F></F><G></G></E></B><H><I><J></J><K></K><L></L></I><M></M></H></A>

![](../img/dfs.png)  

## Guidelines

- Depth-first traversals are used whenever
    - The parent needs information about all its children or descendants, or
    - A child requires information about its parent or ancestors

- In designing a depth-first traversal, it is necessary to consider
    - Before the children are traversed, what initializations, operations and calculations must be performed?

- In recursively traversing the children
    - What information must be passed to the children during the recursive call?
    - What information must the children pass back, and how must this information be collated?
    - Once all children have been traversed, what operations and calculations depend on information collated during the recursive traversals?
    - What information must be passed back to the parent?
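
As a small worked instance of these questions, consider counting the leaves of a tree: nothing needs to be passed down, each child passes back its own leaf count, and the parent collates the counts by summing them. `CountNode`, `leaf_count`, and `make_count_example` are hypothetical names; the example tree is the one built in the height example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical value-type node, defined only for this illustration.
struct CountNode {
    char label;
    std::vector<CountNode> children;
};

int leaf_count( const CountNode& node ) {
    // Before the children are traversed: a node without children is one leaf
    if ( node.children.empty() ) {
        return 1;
    }

    // While traversing: each child passes back its count, collated by summing
    int leaves = 0;
    for ( const CountNode& child : node.children ) {
        leaves += leaf_count( child );
    }

    // After the children: an internal node has at least one leaf below it
    assert( leaves >= 1 );
    return leaves; // passed back to the parent
}

CountNode make_count_example() {
    return CountNode{ 'A', {
        CountNode{ 'B', {
            CountNode{ 'C', { CountNode{ 'D', {} } } },
            CountNode{ 'E', { CountNode{ 'F', {} }, CountNode{ 'G', {} } } } } },
        CountNode{ 'H', {
            CountNode{ 'I', { CountNode{ 'J', {} }, CountNode{ 'K', {} }, CountNode{ 'L', {} } } },
            CountNode{ 'M', {} } } }
    } };
}
```

For the example tree the leaves are `D`, `F`, `G`, `J`, `K`, `L`, and `M`, so `leaf_count` returns 7.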

## Printing a Hierarchy

- Consider the directory structure presented on the left 
    - How do we display this in the format on the right?

| | |
|-|-|
| ![](../img/dirs1.png) | ![](../img/dirs2.png) |


- What do we do at each step?

- For a directory, we initialize a tab level at the root to `0`
- We then do
    - Before the children are traversed, we must
        - Indent an appropriate number of tabs, and
        - Print the name of the directory followed by a `/`
    - In recursively traversing the children
        - A value of one plus the current tab level must be passed to the children, and
        - No information needs to be passed back
    - Once all children have been traversed, we are finished

- Assume the function `void print_tabs( int n )` prints `n` tabs
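
One possible definition of the assumed helper is sketched below. Only the signature `void print_tabs( int n )` comes from the text; `make_tabs` is a hypothetical helper introduced so the indentation can be inspected as a string.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>

// Builds a string of n tab characters (hypothetical helper).
std::string make_tabs( int n ) {
    assert( n >= 0 );
    return std::string( static_cast<std::size_t>( n ), '\t' );
}

// Prints n tabs to standard output, without a trailing newline.
void print_tabs( int n ) {
    std::cout << make_tabs( n );
}
```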

In [None]:
template <typename Type>
void SimpleTree<Type>::print( int depth ) const {
    print_tabs( depth );    
    std::cout << value()->directory_name() << '/' << std::endl;

    for ( auto *child = children.begin(); child != nullptr; child = child->next() ) {
        child->value()->print( depth + 1 );
    }
}

### Example

In [10]:
std::cout << t << std::endl;    

A/
	B/
		C/
			D/
		E/
			F/
			G/
	H/
		I/
			J/
			K/
			L/
		M/

