## 16.7 Summary

### 16.7.1 Rooted trees

A **rooted tree** consists of zero or more **nodes**.
Each node contains an item and is the **parent** of zero or more **child** nodes.
The root node has no parent; every other node has exactly one parent.
The **leaves** are the nodes without children.
The **size** of a tree is how many nodes (or items) it has.

Rooted trees represent hierarchical collections of items, organised by **levels**.
The **height** of a tree is how many levels it has.
Level zero, the top level, contains only the root.
If a node is at level *n*, then its children are at level, or **depth**, *n* + 1
and its parent is at level *n* − 1.

The **ancestors**
of a node are its parent, parent's parent, and so on until the root.
The **descendants** of a node are
its children, children's children and so on until reaching leaf nodes.

There are two basic ways of exhaustively searching a rooted tree.
A breadth-first search (BFS) generates the nodes level by level:
all of a node's children are tested before testing their children.
A depth-first search (DFS) generates and tests all descendants of a node's
child before doing the same for the next child of that node.
A **pre- or post-order traversal** is a DFS that tests a node respectively
before or after its children.
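The two searches can be sketched as follows. The `Node` class and the function names are assumptions made for this illustration, not part of the chapter's code; each function returns the items in the order the nodes are visited.

```python
from collections import deque


class Node:
    """Hypothetical rooted-tree node: an item and a list of child nodes."""

    def __init__(self, item, children=None):
        self.item = item
        self.children = children if children is not None else []


def bfs(root):
    """Visit the nodes level by level, using a queue of nodes to process."""
    visited = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        visited.append(node.item)
        queue.extend(node.children)  # children wait behind the current level
    return visited


def dfs_preorder(root):
    """Visit a node before all descendants of each child, child by child."""
    visited = [root.item]
    for child in root.children:
        visited.extend(dfs_preorder(child))
    return visited


def dfs_postorder(root):
    """Visit a node after all descendants of each child, child by child."""
    visited = []
    for child in root.children:
        visited.extend(dfs_postorder(child))
    visited.append(root.item)
    return visited


tree = Node("A", [Node("B", [Node("D"), Node("E")]), Node("C", [Node("F")])])
print(bfs(tree))            # ['A', 'B', 'C', 'D', 'E', 'F']
print(dfs_preorder(tree))   # ['A', 'B', 'D', 'E', 'C', 'F']
print(dfs_postorder(tree))  # ['D', 'E', 'B', 'F', 'C', 'A']
```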

### 16.7.2 Binary trees

In a **binary tree**, each node has at most two children,
called the **left child** and **right child**.
A binary tree can be recursively defined as being empty or consisting of
a root and two binary trees, called the left and right **subtrees**.
The binary tree ADT operations are directly based on this definition:

Operation | Effect
-|-
new | create a new empty binary tree
join(*i*, *l*, *r*) | create a tree with root item *i* and subtrees *l* and *r*
root(*t*) | obtain the root item of binary tree *t*
left(*t*) | obtain the left subtree of *t*
right(*t*) | obtain the right subtree of *t*
*t* is empty | check if *t* has no nodes

When writing algorithms in English, the new operation is written as:
let *t* be an empty binary tree.
Operations root, left and right assume *t* isn't empty.
All operations can be implemented in constant time.
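One possible way to implement the ADT in Python is sketched here. The class, attribute and function names are assumptions chosen to mirror the table; other representations are equally valid.

```python
class BinaryTree:
    """Hypothetical binary tree: empty, or a root item with two subtrees."""

    def __init__(self):
        # Operation 'new': an empty tree has no item and no subtrees.
        self.item = None
        self.left = None
        self.right = None


def join(item, left, right):
    """Create a tree with the given root item and subtrees."""
    tree = BinaryTree()
    tree.item = item
    tree.left = left
    tree.right = right
    return tree


def is_empty(tree):
    """A non-empty tree always has two subtree objects, possibly empty ones."""
    return tree.left is None and tree.right is None


def root(tree):
    """Return the root item; assumes the tree isn't empty."""
    return tree.item


def left(tree):
    """Return the left subtree; assumes the tree isn't empty."""
    return tree.left


def right(tree):
    """Return the right subtree; assumes the tree isn't empty."""
    return tree.right


leaf = join(3, BinaryTree(), BinaryTree())
tree = join(1, BinaryTree(), leaf)
print(root(tree), is_empty(left(tree)), root(right(tree)))  # 1 True 3
```

Each function only creates an object or reads an attribute, which is why every operation takes constant time in this sketch.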

Due to their recursive structure, binary trees can be processed with
recursive divide-and-conquer algorithms.
The base case is either an empty tree or a leaf.
**Arm's-length recursion**, which tests the base case before the recursive call,
should generally be avoided: it complicates the algorithm and
at best saves a constant amount of work per call, without reducing the complexity.
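As an illustration of the divide-and-conquer pattern with the empty tree as base case, here is a minimal sketch. It assumes, purely for brevity, that an empty tree is `None` and a non-empty tree is a tuple `(item, left, right)`, mirroring join(*i*, *l*, *r*); the function names are likewise assumptions.

```python
def size(tree):
    """Return the number of nodes in the tree."""
    if tree is None:              # base case: the empty tree
        return 0
    _, left, right = tree         # divide into the two subtrees
    return 1 + size(left) + size(right)


def height(tree):
    """Return the number of levels in the tree."""
    if tree is None:              # base case: the empty tree has no levels
        return 0
    _, left, right = tree
    return 1 + max(height(left), height(right))


tree = (4, (2, (1, None, None), (3, None, None)), (6, None, None))
print(size(tree), height(tree))  # 5 3
```

Neither function checks whether a subtree is empty before recursing into it: the base case handles that, which keeps the code free of arm's-length recursion.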

An **in-order traversal** of a binary tree is a DFS that processes the root in
between processing the left and right subtrees.
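A minimal sketch of an in-order traversal, again assuming for illustration that an empty tree is `None` and a non-empty tree is a tuple `(item, left, right)`. The example tree is a hypothetical expression tree for (1 + 2) * 3.

```python
def in_order(tree):
    """Return the items: left subtree first, then the root, then the right subtree."""
    if tree is None:
        return []
    item, left, right = tree
    return in_order(left) + [item] + in_order(right)


expression = ("*", ("+", (1, None, None), (2, None, None)), (3, None, None))
print(in_order(expression))  # [1, '+', 2, '*', 3]
```

List concatenation copies items, so this version favours clarity over efficiency.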

The **balance factor** of a node is the difference between
the left subtree's height and the right subtree's height.
A binary tree is **balanced** if every node has balance factor −1, 0 or 1.
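These two definitions translate almost directly into code. The sketch below assumes the same illustrative representation (an empty tree is `None`, a non-empty tree is a tuple `(item, left, right)`) and hypothetical function names.

```python
def height(tree):
    """Return the number of levels in the tree."""
    if tree is None:
        return 0
    _, left, right = tree
    return 1 + max(height(left), height(right))


def balance_factor(tree):
    """Left subtree's height minus right subtree's height; tree must be non-empty."""
    _, left, right = tree
    return height(left) - height(right)


def is_balanced(tree):
    """Check that every node has balance factor -1, 0 or 1."""
    if tree is None:
        return True
    _, left, right = tree
    return (abs(balance_factor(tree)) <= 1
            and is_balanced(left) and is_balanced(right))


bushy = (2, (1, None, None), (3, None, None))
chain = (1, None, (2, None, (3, None, None)))
print(balance_factor(bushy), is_balanced(bushy))  # 0 True
print(balance_factor(chain), is_balanced(chain))  # -2 False
```

This sketch recomputes heights at every node, so it is simple rather than efficient.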

A binary tree is **complete** if all levels, except possibly the last one, are full
and the leaves on the last level have no gaps from left to right.
A binary tree is **perfect** if all its levels are full.
Complete trees and perfect trees are balanced.

The height of a binary tree is at least log │*tree*│, for a complete tree, and
at most │*tree*│, for a tree that has one node per level.
For a balanced tree, the height is proportional to log │*tree*│.

### 16.7.3 Binary search trees

A **binary search tree** (**BST**) is a binary tree with an ordering property:
each item is larger than the items in the left subtree and
smaller than the items in the right subtree.
Moreover, each subtree is a BST itself.
An in-order traversal of a BST produces items in ascending order.

BSTs are used to implement the map ADT, if items are key–value pairs with
unique and comparable keys.
All map operations first require searching for the node with the given key.
This can be done with a binary search, due to the ordering of the items.

Binary search visits at most one node per level, so it takes
time linear in the height of the tree,
assuming each visited node takes constant time to process.
Binary search is Θ(1) in the best case (the key is in the root),
Θ(log │*tree*│) in the average case, and
Θ(│*tree*│) in the worst case (the tree has one node per level).
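Binary search on a BST can be sketched as follows, assuming for illustration that an empty tree is `None`, a non-empty tree is a tuple `(item, left, right)`, and each item is a `(key, value)` pair. The function name and the choice to return `None` for a missing key are assumptions of this sketch.

```python
def search(tree, key):
    """Return the value associated with key, or None if the key is absent."""
    while tree is not None:
        (node_key, value), left, right = tree
        if key == node_key:
            return value
        # The ordering property tells us which subtree may hold the key.
        tree = left if key < node_key else right
    return None


bst = ((50, "fifty"),
       ((30, "thirty"), None, ((40, "forty"), None, None)),
       ((70, "seventy"), None, None))
print(search(bst, 40))  # forty
print(search(bst, 60))  # None
```

Each iteration moves down one level, so the loop runs at most as many times as the tree has levels.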

A **self-balancing BST** automatically balances itself after
a node has been inserted or removed.

### 16.7.4 Heaps

A **min-heap** or **max-heap** is a binary tree with an additional structural property
(it's complete) and an ordering property: each item (or its key) is
respectively no larger or no smaller than its children (or their keys).
Min- and max-heaps are used to implement priority queues:
the item with the smallest or largest priority is in the root.

A heap can be efficiently stored and manipulated in an array.
A new item is added as the last node and bubbles up the tree while it's
smaller (or larger, for a max-heap) than the parent.
When the root is removed, the last item takes its place and bubbles down
while it's larger (respectively, smaller) than one of its children.
Inserting an item and removing the root take logarithmic time at worst,
when the item moves up (or the new root moves down) the height of the tree.
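The two operations can be sketched for a min-heap stored in a Python list. The sketch assumes the usual array layout, in which the node at index *i* has its children at indices 2*i* + 1 and 2*i* + 2; the function names are hypothetical.

```python
def insert(heap, item):
    """Add item as the last node, then bubble it up while smaller than its parent."""
    heap.append(item)
    index = len(heap) - 1
    while index > 0:
        parent = (index - 1) // 2
        if heap[index] >= heap[parent]:
            break
        heap[index], heap[parent] = heap[parent], heap[index]
        index = parent


def remove_root(heap):
    """Remove and return the smallest item; assumes the heap isn't empty."""
    smallest_item = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last                    # the last item takes the root's place
        index = 0
        while True:
            left, right = 2 * index + 1, 2 * index + 2
            smallest = index
            if left < len(heap) and heap[left] < heap[smallest]:
                smallest = left
            if right < len(heap) and heap[right] < heap[smallest]:
                smallest = right
            if smallest == index:         # no child is smaller: stop
                break
            heap[index], heap[smallest] = heap[smallest], heap[index]
            index = smallest
    return smallest_item


heap = []
for item in [5, 3, 8, 1]:
    insert(heap, item)
print(heap)               # [1, 3, 8, 5]
print(remove_root(heap))  # 1
print(heap)               # [3, 5, 8]
```

When bubbling down, the item is swapped with its smaller child, so that the ordering property is restored at that level.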

In Python, min-heaps can be created by repeatedly calling
`heappush(heap, item)` on an initially empty list `heap`.
The items added to the heap must be comparable.
Calling `heappop(heap)` on a non-empty heap returns and removes the root item.
Both functions are in module `heapq`.
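A short usage example of the two functions, with arbitrary integers as items:

```python
from heapq import heappush, heappop

heap = []                    # an empty list is an empty min-heap
for item in [5, 3, 8, 1]:
    heappush(heap, item)

smallest = heappop(heap)     # removes and returns the root item
print(smallest)              # 1
print(heap[0])               # 3, the new root
```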

**Heapsort** is a form of selection sort with unsorted items kept in a min-heap
instead of a sequence. Heapsort has linear best-case complexity,
which occurs when all items are equal, and
log-linear worst-case complexity.
Due to swapping items up and down across the whole array,
heapsort doesn't have good cache locality and is slower in practice than in-place quicksort.
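The idea of heapsort can be illustrated with module `heapq`. This is a simplified sketch, not an in-place implementation: it builds a separate heap and returns a new list, and the function name is an assumption.

```python
from heapq import heappush, heappop


def heapsort(items):
    """Return a new list with the items in ascending order."""
    heap = []
    for item in items:
        heappush(heap, item)            # keep the unsorted items in a min-heap
    # Repeatedly select the smallest remaining item, as selection sort does.
    return [heappop(heap) for _ in range(len(heap))]


print(heapsort([5, 3, 8, 1, 3]))  # [1, 3, 3, 5, 8]
```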
