# [CptS 215 Data Analytics Systems and Algorithms](https://github.com/gsprint23/cpts215)
[Washington State University](https://wsu.edu)

[Gina Sprint](http://eecs.wsu.edu/~gsprint/)
# Heaps

Learner objectives for this lesson:
* Understand what a priority queue is
* Learn about the heap data structure to implement a priority queue


## Acknowledgments
Content used in this lesson is based upon information in the following sources:
* [Miller and Ranum](http://interactivepython.org/runestone/static/pythonds/index.html)
* [Dr. Ananth Kalyanaraman](http://www.eecs.wsu.edu/~ananth/)'s CptS 223 notes

## Priority Queues
A priority queue is a queue that orders its items by their *priority*. The items with the highest priority are at the front of the queue and the items with the lowest priority are at the back of the queue. If a very high priority item is enqueued, it will be stored toward (or possibly at) the front of the queue. It will thus be one of the first items (or the first item) dequeued from the queue.
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap1.png" width="500">

We could adapt our queue code from several weeks ago to be an implementation of a priority queue. We can enqueue items into the queue in sorted order. Such an algorithm would be $\mathcal{O}(n)$ for inserting items and $\mathcal{O}(1)$ for removing the highest priority item. In fact, there are several data structures we could use to implement a priority queue:

|Data structure|Insert|Remove highest priority|
|-|-|-|
|Unordered linked list|$\mathcal{O}(1)$|$\mathcal{O}(n)$|
|Ordered linked list|$\mathcal{O}(n)$|$\mathcal{O}(1)$|
|Balanced BST|$\mathcal{O}(\log n)$|$\mathcal{O}(\log n)$|
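
To make the ordered-list row of the table concrete, here is a minimal sketch of a priority queue backed by a sorted Python list. The class name `SortedListPQ`, the `(priority, item)` tuple layout, and the convention that a larger number means a higher priority are all assumptions made for illustration. The list is kept in ascending order so the highest priority entry sits at the end, where `pop()` removes it in constant time.

```python
import bisect


class SortedListPQ:
    """Hypothetical priority queue kept in sorted order (larger number = higher priority)."""

    def __init__(self):
        self._items = []  # (priority, item) tuples in ascending order

    def enqueue(self, priority, item):
        # O(n): finding the slot is a binary search, but shifting items is linear
        bisect.insort(self._items, (priority, item))

    def dequeue(self):
        # O(1): the highest priority entry is the last one in the list
        return self._items.pop()

    def is_empty(self):
        return len(self._items) == 0


pq = SortedListPQ()
pq.enqueue(2, "write report")
pq.enqueue(5, "fix outage")
pq.enqueue(1, "water plants")
first = pq.dequeue()   # (5, "fix outage")
second = pq.dequeue()  # (2, "write report")
```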

In this lesson we are going to cover a new data structure that is better suited to a priority queue implementation. The new data structure, a *binary heap*, has a *special* tree-like structure that supports inserting and removing items with $\mathcal{O}(\log n)$ efficiency.

## Binary Heaps
A binary heap is a binary tree with two properties:
1. Structure property
1. Heap order property

A binary heap is going to be implemented as a balanced binary tree in order to keep enqueue and dequeue $\mathcal{O}(\log n)$. A binary heap is usually a min heap or a max heap. A *min* heap maintains the smallest items at the front of the queue. A *max* heap maintains the largest items at the front of the queue. We will derive the interface and implementation of a min heap, though the implementation of the max heap is nearly the same.

### Structure Property
We will keep the heap balanced by creating the heap as a *complete binary tree*. Each level in a complete binary tree is full, except possibly the bottom level, which is filled from left to right. The height of a complete binary tree with $N$ items is $\lfloor \log_{2} N \rfloor$.

Example of complete binary tree:
<img src="http://interactivepython.org/runestone/static/pythonds/_images/compTree.png" width="600">
(image from [http://interactivepython.org/runestone/static/pythonds/_images/compTree.png](http://interactivepython.org/runestone/static/pythonds/_images/compTree.png))

While we can store a complete binary tree in an object oriented fashion (nodes and links), we can more conveniently store it as a single list using indexes and offsets. Let a parent node $p$ be at index $i$ in the list. 
* The left child of $p$ is at index $2 * i$. 
* The right child of $p$ is at index $2 * i + 1$. 
* The parent of any non-root node $n$ at index $i$ is at index $i // 2$ (integer division).

<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap2.png" width="600">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap3.png" width="700">

This storage of a complete binary tree in a single list lends itself well to efficient traversals and an efficient binary heap implementation.

Note: The item at index 0 is a placeholder (0) that is not part of the heap. It is there so that the root node (the smallest item in the heap) sits at index 1, which lets the simple integer division $i // 2$ find parent nodes.

Note: Since the list stores a complete binary tree, all nodes past the halfway point in the list (index `size // 2`, where `size` is the number of items in the heap) are leaf nodes.
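
The index arithmetic above can be sketched with a few small helper functions. The function names and the sample values in `heap_list` are assumptions chosen for illustration. The list follows the convention of an unused placeholder at index 0.

```python
def left_child(i):
    """Index of the left child of the node at index i."""
    return 2 * i


def right_child(i):
    """Index of the right child of the node at index i."""
    return 2 * i + 1


def parent(i):
    """Index of the parent of the node at index i (integer division)."""
    return i // 2


def is_leaf(heap_list, i):
    """True if the node at index i has no children."""
    size = len(heap_list) - 1  # index 0 is a placeholder, not an item
    return i > size // 2


# illustrative values; index 0 is the unused placeholder
heap_list = [0, 5, 9, 11, 14, 18, 19, 21, 33, 17, 27]

node = 2                                     # holds 9
left_item = heap_list[left_child(node)]      # 14
right_item = heap_list[right_child(node)]    # 18
parent_item = heap_list[parent(node)]        # 5 (the root)
```

With 10 items, `size // 2` is 5, so the nodes at indexes 6 through 10 are the leaves.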

### Heap Order Property
In a min heap, for every node `x` with parent node `p`, the item in `p` is smaller than or equal to the item in `x`. A tree that upholds the heap order property will always have the smallest item in the root node.

Example of a tree with the heap order property:
<img src="http://interactivepython.org/runestone/static/pythonds/_images/heapOrder.png" width="600">
(image from [http://interactivepython.org/runestone/static/pythonds/_images/heapOrder.png](http://interactivepython.org/runestone/static/pythonds/_images/heapOrder.png))

Note: Duplicates are allowed in a binary heap. No order is implied between elements that do not share an ancestor-descendant relationship.
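
One way to check the heap order property on the list representation is to compare every non-root node with its parent. The helper name `is_min_heap` and the sample lists are assumptions for illustration. The function expects the list layout with a placeholder at index 0.

```python
def is_min_heap(heap_list):
    """Return True if every non-root item is >= the item in its parent."""
    size = len(heap_list) - 1  # index 0 is a placeholder
    for i in range(2, size + 1):
        if heap_list[i // 2] > heap_list[i]:
            return False
    return True


ordered = is_min_heap([0, 5, 9, 11, 14, 18, 19, 21, 33, 17, 27])   # True
violated = is_min_heap([0, 5, 9, 11, 14, 18, 19, 21, 33, 17, 7])   # False: 7 is below 18
with_duplicates = is_min_heap([0, 3, 3, 3])                         # True: duplicates are fine
```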

### Insertion
To insert a new item into a min heap, insert the new item into the heap at the next available slot ("hole") in the complete binary tree (maintains heap structure property). Then, "percolate" the element up the heap while the heap order property is not satisfied. Let's take a look at an example:

Insert 14 at the next available slot in the heap:
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap4.png" width="400">
Percolate 14 up to restore the heap order property:
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap5.png" width="600">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap6.png" width="600">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap7.png" width="700">
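
The insertion steps can be sketched as two short functions operating on the list representation (placeholder at index 0). The function names `perc_up` and `insert` and the starting heap values are assumptions for illustration and are not necessarily the values shown in the figures.

```python
def perc_up(heap_list, i):
    """Swap the item at index i with its parent until heap order is restored."""
    while i // 2 > 0:
        if heap_list[i] < heap_list[i // 2]:
            heap_list[i], heap_list[i // 2] = heap_list[i // 2], heap_list[i]
        else:
            break  # parent is smaller or equal, so the order property holds
        i = i // 2


def insert(heap_list, item):
    """Place item in the next available slot, then percolate it up."""
    heap_list.append(item)                  # keeps the tree complete
    perc_up(heap_list, len(heap_list) - 1)


heap_list = [0, 13, 21, 16, 24, 31, 19, 68, 65, 26, 32]
insert(heap_list, 14)
# 14 swaps with 31, then with 21, and stops below the root 13:
# [0, 13, 14, 16, 24, 21, 19, 68, 65, 26, 32, 31]
```

The loop runs at most once per level of the tree, which is where the $\mathcal{O}(\log n)$ cost of insertion comes from.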

### Delete Min
When we want to dequeue from the priority queue, we will delete the minimum item from the min heap. The minimum item is always at the root node so we will have to remove the root to delete the minimum key. To do this, we will decrease the heap size by one, move the last item in the heap to the root (maintains heap structure property) and "percolate" the element down while the heap order property is not satisfied. Let's take a look at an example:

Delete 13:
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap8.png" width="400">
Replace the root with 31 and then percolate 31 down to restore the heap order property:
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap9.png" width="600">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap10.png" width="400">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap11.png" width="600">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap12.png" width="400">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap13.png" width="600">
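
The delete min steps can be sketched in the same style. The function names (`min_child`, `perc_down`, `delete_min`) and the starting heap values are assumptions for illustration and are not necessarily the values shown in the figures.

```python
def min_child(heap_list, i, size):
    """Index of the smaller child of the node at index i."""
    left, right = 2 * i, 2 * i + 1
    if right > size or heap_list[left] <= heap_list[right]:
        return left
    return right


def perc_down(heap_list, i, size):
    """Swap the item at index i with its smaller child until heap order is restored."""
    while 2 * i <= size:  # while the node has at least one child
        mc = min_child(heap_list, i, size)
        if heap_list[i] > heap_list[mc]:
            heap_list[i], heap_list[mc] = heap_list[mc], heap_list[i]
        else:
            break
        i = mc


def delete_min(heap_list):
    """Remove and return the smallest item (the root)."""
    size = len(heap_list) - 1  # index 0 is a placeholder
    if size == 0:
        raise IndexError("delete_min from an empty heap")
    smallest = heap_list[1]
    last = heap_list.pop()     # shrink the heap by one
    if size > 1:
        heap_list[1] = last    # move the last item to the root
        perc_down(heap_list, 1, size - 1)
    return smallest


heap_list = [0, 13, 14, 16, 19, 21, 19, 68, 65, 26, 32, 31]
removed = delete_min(heap_list)  # 13
# 31 moves to the root, then swaps with 14, 19, and 26:
# [0, 14, 19, 16, 26, 21, 19, 68, 65, 31, 32]
```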

### Build Heap
Suppose we need to construct a heap from an initial set of $N$ items. Think about the following two ways to construct a heap:
1. Start with an empty heap and perform $N$ inserts
    * $\mathcal{O}(N \log_{2} N)$ worst case
1. Define a `build_heap()` method
    * Populate the initial heap with the items in arbitrary order (satisfies the structure property)
    * Perform a percolate down from each internal node (`H[size//2]` to `H[1]`) to establish the heap order property
    * $\mathcal{O}(N)$ worst case
    
Let's take a look at an example. Suppose the list of items to "heapify" is: 150, 80, 40, 10, 70, 110, 30, 120, 140, 60, 50, 130, 100, 20, 90
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap14.png" width="700">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap15.png" width="700">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap16.png" width="700">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap17.png" width="700">
<img src="https://raw.githubusercontent.com/gsprint23/cpts215/master/lessons/figures/heap18.png" width="700">
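
The second approach can be sketched as follows, using the 15 items from the example above. The function names are assumptions for illustration, and `perc_down` here is a compact variant that reads the heap size from the list itself.

```python
def perc_down(heap_list, i):
    """Swap the item at index i with its smaller child until heap order is restored."""
    size = len(heap_list) - 1  # index 0 is a placeholder
    while 2 * i <= size:
        child = 2 * i
        if child + 1 <= size and heap_list[child + 1] < heap_list[child]:
            child += 1  # the right child is the smaller one
        if heap_list[i] <= heap_list[child]:
            break
        heap_list[i], heap_list[child] = heap_list[child], heap_list[i]
        i = child


def build_heap(items):
    """Build a min heap from items by percolating down each internal node."""
    heap_list = [0] + list(items)  # any order satisfies the structure property
    size = len(items)
    for i in range(size // 2, 0, -1):  # H[size // 2] down to H[1]
        perc_down(heap_list, i)
    return heap_list


items = [150, 80, 40, 10, 70, 110, 30, 120, 140, 60, 50, 130, 100, 20, 90]
heap_list = build_heap(items)
smallest = heap_list[1]  # 10
```

Leaf nodes are skipped because a single node already satisfies the heap order property. Most of the work happens near the bottom of the tree, where percolating down is cheap.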

### Min Heap Interface
Public min heap methods include the following:
1. `BinaryHeap()`: creates a new, empty binary min heap
1. `insert(item)`: inserts the new item into the heap (like enqueue)
    * Add `item` to the end of the list (fills the complete binary tree from left to right). Bubble (percolate) the new item up to its proper position in the tree by swapping it with its parent until it is larger than or equal to its parent or it reaches the root (maintains the heap order property).
1. `find_min()`: returns the item at the front of the heap without removing it (like peek)
    * Min item is at index 1
1. `delete_min()`: returns the item at the front of the heap and removes it (like dequeue)
    * Replace the root with the last item in the list (maintains tree completeness). Restore the heap order property by bubbling (percolating) the new root item down to its proper position: swap it with its smallest child until it is smaller than or equal to both of its children or it becomes a leaf.
1. `is_empty()`: returns True if the heap is empty, False otherwise
1. `build_heap(list)`: builds a new heap from a list of items
    * Set the heap to be the new list of items. Starting with the last non-leaf node (index `size // 2`) and working up to the root node, move the nodes into their proper positions by bubbling down.
1. `decrease_key(item, priority)`: lowers the current value of the item to a new, higher priority value. 
    * Represents promoting a job (need to percolate up).
1. `increase_key(item, priority)`: increases the current value of the item to a new, lower priority value. 
    * Represents demoting a job (need to percolate down).
1. `remove(item)`: remove an item (not necessarily at the root). 
    * Represents aborting/canceling a job (need to `decrease_key(item, -inf)` then `delete_min`).
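
As a rough sketch of the last three operations, the code below implements `decrease_key` and `remove` as plain functions on the list representation. It assumes each item is its own priority value and locates the item with a linear search (`list.index`), so the lookup is $\mathcal{O}(n)$. The function signatures and sample values are assumptions for illustration.

```python
import math


def perc_up(heap_list, i):
    """Swap the item at index i with its parent while it is smaller than the parent."""
    while i // 2 > 0 and heap_list[i] < heap_list[i // 2]:
        heap_list[i], heap_list[i // 2] = heap_list[i // 2], heap_list[i]
        i //= 2


def perc_down(heap_list, i):
    """Swap the item at index i with its smaller child until heap order is restored."""
    size = len(heap_list) - 1  # index 0 is a placeholder
    while 2 * i <= size:
        child = 2 * i
        if child + 1 <= size and heap_list[child + 1] < heap_list[child]:
            child += 1
        if heap_list[i] <= heap_list[child]:
            break
        heap_list[i], heap_list[child] = heap_list[child], heap_list[i]
        i = child


def delete_min(heap_list):
    """Remove and return the root item."""
    smallest = heap_list[1]
    last = heap_list.pop()
    if len(heap_list) > 1:  # items remain, so refill the root
        heap_list[1] = last
        perc_down(heap_list, 1)
    return smallest


def decrease_key(heap_list, item, new_value):
    """Lower item to new_value (promote it), then percolate up."""
    i = heap_list.index(item, 1)  # linear search, skipping the placeholder
    if new_value > heap_list[i]:
        raise ValueError("new value must not be larger than the current value")
    heap_list[i] = new_value
    perc_up(heap_list, i)


def remove(heap_list, item):
    """Cancel an item: promote it to the root with -inf, then delete the min."""
    decrease_key(heap_list, item, -math.inf)
    delete_min(heap_list)


heap_list = [0, 5, 9, 11, 14, 18, 19, 21, 33, 17, 27]
remove(heap_list, 18)
# [0, 5, 9, 11, 14, 27, 19, 21, 33, 17]
```

An `increase_key` function would mirror `decrease_key`, raising the value and calling `perc_down` instead of `perc_up`.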

## Practice Problems
Note: the following problems are adapted from Koffman and Wolfgang.

### 1
Show the heap that would be used to store the words "this", "is", "the", "house", "that", "jack", "built", assuming they are inserted in that sequence. Exchange the order of arrival of the first and last words and build the new heap.

### 2
Draw the heaps for the previous problem as arrays.

### 3
Show the result of removing the word "house" from the heaps in the previous problem.

### 4
A max heap is a heap in which each element has a key that is smaller than or equal to its parent's key, so the largest key is at the top of the heap. Build the max heap that would result from the numbers 15, 25, 10, 33, 55, 47, 82, 90, 18.