# Heap

- A heap is
  - a complete binary tree
  - that obeys the heap order property
- A complete binary tree
  - Not a BST, just a tree where each node has 0-2 children
  - Every depth is completely filled except the last
  - The items in the last row are packed left
  - The height is $\lfloor \log_2 n \rfloor$ for a tree with $n$ nodes
  - What fraction of the tree is a leaf?
- heap order property
  - for every node $n$, the value of $n$'s parent is <= the value of $n$
- example
- insert
  - add in next open slot
  - percolate up
- big-O
  - $O(\log n)$ worst case
  - $O(1)$ average case https://stackoverflow.com/questions/39514469/argument-for-o1-average-case-complexity-of-heap-insertion
- remove
  - replace root with last item
  - percolate down
  - $O(\log n)$ average and worst case
- implementation
  - Array!
  
- Review structures
  - Stack, queue
  - vector, list
  - set, map
    - BST, hashtable
  - priority queue: heap!

A **heap** is a *complete binary tree* that obeys the *heap order property*.

<div class='big centered' style='font-size: 200pt'> 🤨 </div>

A **complete binary tree** is a binary tree (i.e. each node has at most two children) where every row but the last must be completely filled.

And the elements on the last row must be packed to the left.

In [1]:
%%file complete_binary_tree.txt
A > B
A < C
B < D
B > E
C > F

Writing complete_binary_tree.txt


In [6]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 complete_binary_tree.txt

<img src="complete_binary_tree.png" />

The **heap order property** means that the value of every node is greater than or equal to its parent's value.

<img src="complete_binary_tree.png" />

You can think of the "heavy" values sinking to the bottom of the tree.
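
The heap order property can also be checked mechanically. The sketch below is only an illustration: the `Node` class and the `obeys_heap_order` function are hypothetical names introduced here, and the check covers ordering only, not completeness.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A binary tree node (hypothetical helper for illustration)."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def obeys_heap_order(node: Optional[Node]) -> bool:
    """Return True if no child in the subtree is smaller than its parent."""
    if node is None:
        return True
    for child in (node.left, node.right):
        if child is not None and child.value < node.value:
            return False
    return obeys_heap_order(node.left) and obeys_heap_order(node.right)


# A small tree where every child is >= its parent.
good = Node(4, Node(4, Node(7), Node(5)), Node(8))
# Putting 8 at the top breaks the property.
bad = Node(8, Node(4, Node(7), Node(5)), Node(4))

print(obeys_heap_order(good))  # True
print(obeys_heap_order(bad))   # False
```

Equal values are allowed, so a parent and child may both hold 4.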

## Inserting into a heap

**How do you add a new item to the heap while maintaining the required properties?**

- Add an item to the next open slot
- Percolate up until the value is in the correct location

To *percolate up*:
- Compare the current value to the value of the parent node
- If the parent value is greater, swap and repeat from the new position
- Otherwise (or once the value reaches the root), stop

In [9]:
%%file numeric_heap.txt
n4a [label="4"]
n4b [label="4"]
n4a > n4b
n4a < 8
n4b > 7
n4b < 5

Overwriting numeric_heap.txt


In [10]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 numeric_heap.txt

<img src="numeric_heap.png" />

What is the result of inserting 8 into the heap?

In [13]:
%%file numeric_heap_with8.txt
n4a [label="4"]
n4b [label="4"]
n8a [label="8"]
n8b [label="8"]
n4a > n4b
n4a < n8a
n4b > 7
n4b < 5
n8a > n8b

Overwriting numeric_heap_with8.txt


In [14]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 numeric_heap_with8.txt

<img src="numeric_heap_with8.png" />

In [15]:
%%file numeric_heap_with2.txt
n2 [label="2"]
n4a [label="4"]
n4b [label="4"]
n2 > n4a
n4a > 7
n4a < 5
n2 < n4b
n4b > 8

Writing numeric_heap_with2.txt


In [16]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 numeric_heap_with2.txt

What is the result of inserting 2 into the heap?

<img src="numeric_heap.png" />

<img src="numeric_heap_with2.png" />

In [17]:
%%file needs4.txt
6 < 9
6 > 7
7 > 8

Writing needs4.txt


In [18]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 needs4.txt

In [19]:
%%file has4.txt
4 < 9
4 > 6
6 > 8
6 < 7

Writing has4.txt


In [20]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 has4.txt

Work with a partner.

What is the result of inserting 4 into this tree?

<img src="needs4.png" />

<img src="has4.png" />

What is the result of inserting 5, 2, 8, 3, 1, 9, 6, 4 into an empty heap?

In [24]:
%%file heap_5.dot
digraph {
  n5 [label="5"]
}

Writing heap_5.dot


In [25]:
! dot -Tpng -o heap_5.png heap_5.dot

In [26]:
%%file heap_52a.txt
5 > 2

Writing heap_52a.txt


In [27]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_52a.txt

In [28]:
%%file heap_52b.txt
2 > 5

Writing heap_52b.txt


In [29]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_52b.txt

In [30]:
%%file heap_528.txt
2 > 5
2 < 8

Writing heap_528.txt


In [31]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_528.txt

In [32]:
%%file heap_5283a.txt
2 > 5
2 < 8
5 > 3

Writing heap_5283a.txt


In [33]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_5283a.txt

In [34]:
%%file heap_5283b.txt
2 > 3
2 < 8
3 > 5

Writing heap_5283b.txt


In [35]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_5283b.txt

In [36]:
%%file heap_52831a.txt
2 > 3
2 < 8
3 > 5
3 < 1

Writing heap_52831a.txt


In [37]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_52831a.txt

In [38]:
%%file heap_52831b.txt
2 > 1
2 < 8
1 > 5
1 < 3

Writing heap_52831b.txt


In [39]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_52831b.txt

In [40]:
%%file heap_52831c.txt
1 > 2
1 < 8
2 > 5
2 < 3

Writing heap_52831c.txt


In [41]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_52831c.txt

In [42]:
%%file heap_528319.txt
1 > 2
1 < 8
2 > 5
2 < 3
8 > 9

Writing heap_528319.txt


In [43]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_528319.txt

In [44]:
%%file heap_5283196a.txt
1 > 2
1 < 8
2 > 5
2 < 3
8 > 9
8 < 6

Writing heap_5283196a.txt


In [45]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_5283196a.txt

In [46]:
%%file heap_5283196b.txt
1 > 2
1 < 6
2 > 5
2 < 3
6 > 9
6 < 8

Writing heap_5283196b.txt


In [47]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_5283196b.txt

In [48]:
%%file heap_52831964a.txt
1 > 2
1 < 6
2 > 5
2 < 3
6 > 9
6 < 8
5 > 4    

Writing heap_52831964a.txt


In [49]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_52831964a.txt

In [50]:
%%file heap_52831964b.txt
1 > 2
1 < 6
2 > 4
2 < 3
6 > 9
6 < 8
4 > 5

Writing heap_52831964b.txt


In [51]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_52831964b.txt

<img src="heap_5.png" />

<img src="heap_52a.png" />

<img src="heap_52b.png" />

<img src="heap_528.png" />

<img src="heap_5283a.png" />

<img src="heap_5283b.png" />

<img src="heap_52831a.png" />

<img src="heap_52831b.png" />

<img src="heap_52831c.png" />

<img src="heap_528319.png" />

<img src="heap_5283196a.png" />

<img src="heap_5283196b.png" />

<img src="heap_52831964a.png" />

<img src="heap_52831964b.png" />

Work with a partner.

Show the result of inserting 5, 3, 8, 1, 7, 4, 6, 2 into an empty heap.

In [52]:
%%file heap_53817462.txt
1 > 2
1 < 4
2 > 3
2 < 7
3 > 5
4 > 8
4 < 6

Writing heap_53817462.txt


In [53]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 heap_53817462.txt

<img src="heap_53817462.png" />

## Big-O

- What is the big-O for inserting into a heap?

$O(\log n)$ for worst case.

$O(1)$ for average case. 

Intuition: <https://stackoverflow.com/a/39517553/2288986>
- About half of the items are leaves. Leaves tend to hold heavy (large) values, and non-leaves tend to hold light (small) values.
- The inserted item has a $\frac{1}{2}$ chance of staying put as a leaf.
- It has a $\left(\frac{1}{2}\right)^2 = \frac{1}{4}$ chance of ending up in the $height=1$ row.
- Thus, the probability of taking another step decreases exponentially.
- The result is an average of $O(1)$ time (the actual constant is somewhere near 2).
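
One way to sanity-check this intuition is a small experiment. The sketch below is an assumption-laden illustration, not a proof: it stores the heap in a Python list with 0-based indices (parent of `i` at `(i - 1) // 2`), inserts uniformly random values, and counts swaps. The function names are hypothetical.

```python
import random


def insert_counting_swaps(heap: list, value) -> int:
    """Append value, percolate up, and return the number of swaps made."""
    heap.append(value)
    i = len(heap) - 1
    swaps = 0
    while i > 0:
        parent = (i - 1) // 2          # 0-based parent index
        if heap[parent] <= heap[i]:
            break                      # heap order holds, stop
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent
        swaps += 1
    return swaps


def average_swaps(n: int, seed: int = 0) -> float:
    """Insert n random values into an empty heap; return mean swaps per insert."""
    rng = random.Random(seed)
    heap = []
    total = sum(insert_counting_swaps(heap, rng.random()) for _ in range(n))
    return total / n


for n in (1_000, 10_000, 100_000):
    print(n, round(average_swaps(n), 2))
```

If the argument holds, the printed averages should stay roughly flat as `n` grows instead of tracking $\log n$. Inserting values in decreasing order is the worst case: every new item travels all the way to the root.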

## Removing from a heap

When working with a heap, the item we remove is almost always the top item.

**Why?**

To remove the top item:

- Replace the top with the last item
- Percolate down until the order is correct

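To percolate down:
- Compare an item with its **smaller** child
- Swap if the item is larger, and repeat from the new position
- Stop when the item is no larger than its smaller child, or has no children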

In [54]:
%%file remove_4549658.txt
n4a [label="4"]
n4b [label="4"]
n5a [label="5"]
n5b [label="5"]

n4a > n5a
n4a < n4b
n5a > 9
n5a < 6
n4b > n5b
n4b < 8
n5a > 9
n5a < 6

Writing remove_4549658.txt


In [55]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_4549658.txt

<img src="remove_4549658.png" />

In [67]:
%%file remove_549658a.txt
n8 [label="8"]
n4b [label="4"]
n5a [label="5"]
n5b [label="5"]

n8 > n5a
n8 < n4b
n5a > 9
n5a < 6
n4b > n5b
n5a > 9
n5a < 6

Overwriting remove_549658a.txt


In [68]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_549658a.txt

<img src="remove_549658a.png" />

In [73]:
%%file remove_549658b.txt
n4b [label="4"]
n8 [label="8"]
n5a [label="5"]
n5b [label="5"]

n4b > n5a
n4b < n8
n5a > 9
n5a < 6
n8 > n5b
n5a > 9
n5a < 6

Overwriting remove_549658b.txt


In [74]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_549658b.txt

<img src="remove_549658b.png" />

In [77]:
%%file remove_549658c.txt
n4b [label="4"]
n8 [label="8"]
n5a [label="5"]
n5b [label="5"]

n4b > n5a
n4b < n5b
n5a > 9
n5a < 6
n5b > n8
n5a > 9
n5a < 6

Overwriting remove_549658c.txt


In [78]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_549658c.txt

<img src="remove_549658c.png" />

In [79]:
%%file remove_455968a.txt
n8 [label="8"]
n4b [label="4"]
n5a [label="5"]
n5b [label="5"]

n8 > n5a
n8 < n5b
n5a > 9
n5a < 6
n5a > 9
n5a < 6

Writing remove_455968a.txt


In [80]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_455968a.txt

<img src="remove_455968a.png" />

In [81]:
%%file remove_455968b.txt
n5b [label="5"]
n5a [label="5"]
n8 [label="8"]

n5b > n5a
n5b < n8
n5a > 9
n5a < 6
n5a > 9
n5a < 6

Writing remove_455968b.txt


In [82]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_455968b.txt

<img src="remove_455968b.png" />

In [85]:
%%file remove_55896a.txt
n6 [label="6"]
n5a [label="5"]
n8 [label="8"]

n6 > n5a
n6 < n8
n5a > 9

Overwriting remove_55896a.txt


In [86]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_55896a.txt

<img src="remove_55896a.png" />

In [87]:
%%file remove_55896b.txt
n5a [label="5"]
n6 [label="6"]
n8 [label="8"]

n5a > n6
n5a < n8
n6 > 9

Writing remove_55896b.txt


In [88]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_55896b.txt

<img src="remove_55896b.png" />

In [91]:
%%file remove_5689a.txt
n9 [label="9"]
n6 [label="6"]
n8 [label="8"]

n9 > n6
n9 < n8

Overwriting remove_5689a.txt


In [92]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_5689a.txt

<img src="remove_5689a.png" />

In [93]:
%%file remove_5689b.txt
n6 [label="6"]
n9 [label="9"]
n8 [label="8"]

n6 > n9
n6 < n8

Writing remove_5689b.txt


In [94]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_5689b.txt

<img src="remove_5689b.png" />

In [95]:
%%file remove_698.txt
n8 [label="8"]
n9 [label="9"]

n8 > n9

Writing remove_698.txt


In [96]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_698.txt

<img src="remove_698.png" />

In [97]:
%%file remove_9.dot
digraph {
n9 [label="9"]
}


Writing remove_9.dot


In [98]:
! dot -Tpng -o remove_9.png remove_9.dot

<img src="remove_9.png" />

Work with a partner.

Show all the steps of removing an item from the heap.

**Remember**, you can only remove the top element.

In [1]:
%%file remove_class.txt
2 > 4
2 < 3
4 < 5
4 > 7

Writing remove_class.txt


In [2]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_class.txt

<img src="remove_class.png" />

In [3]:
%%file remove_classb.txt
3 > 4
3 < 5
4 > 7

Writing remove_classb.txt


In [4]:
! python /data/projects/btrees/btrees.py --w-scale 0.7 remove_classb.txt

<img src="remove_classb.png" />

When percolating down, why do we swap with the **smaller** child?

## Big-O

- What is the big-O for removing from a heap?

$O(\log n)$ for both average and worst case.

## Implementing a Heap

Observe:

In [18]:
%%file heap_numbered.txt
n1 [label=<1<br/><font point-size="10" color="red">1</font>>]
n2[label=<2<br/><font point-size="10" color="red">2</font>>]
n4 [label=<4<br/><font point-size="10" color="red">3</font>>]
n3 [label=<3<br/><font point-size="10" color="red">4</font>>]
n7 [label=<7<br/><font point-size="10" color="red">5</font>>]
n8 [label=<8<br/><font point-size="10" color="red">6</font>>]
n6 [label=<6<br/><font point-size="10" color="red">7</font>>]
n5 [label=<5<br/><font point-size="10" color="red">8</font>>]

n1 > n2
n1 < n4
n2 > n3
n2 < n7
n3 > n5
n4 > n8
n4 < n6

Overwriting heap_numbered.txt


In [19]:
! python /data/projects/btrees/btrees.py --w-scale 0.5 heap_numbered.txt

<img src="heap_numbered.png" />

Let's call the red numbers the "index" of the node.

What is the relationship between the index of the parent and the index of the **left** child?

What is the relationship between the index of the **left** child and the index of the **right** child?

What do all left children have in common? What do all right children have in common?

Because a heap is a **complete** tree, items are only added or removed **at the end**.

Each item in the tree has an index, and the index of a node can be calculated from the index of a parent or child.

So, we typically implement a heap using a vector (or an array)!

<img src="heap_numbered.png" />

```
value: 1 2 4 3 7 8 6 5
index: 1 2 3 4 5 6 7 8
```

How do you find the children of a node?

Left:  $2i$  
Right: $2i+1$

How do you find the parent of a node?

$\lfloor i/2 \rfloor$ (integer division)

So, given the heap 
```
1 2 4 3 7 8 6 5
```
what is the result of inserting a 2?

```
value:  1 2 4 3 7 8 6 5 2
index:  1 2 3 4 5 6 7 8 9

i = 9
p = 9/2 = 4
value[4] = 3
3 > 2, so swap

value:  1 2 4 2 7 8 6 5 3
index:  1 2 3 4 5 6 7 8 9

i = 4
p = 4 / 2 = 2
value[2] = 2
2 !> 2, so done.


1 2 4 2 7 8 6 5 3

```
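
The trace above can be sketched in Python. This is a minimal illustration, assuming the heap lives in a list whose slot 0 is an unused placeholder so that the root sits at index 1, as in the trace; the function name `insert` is hypothetical.

```python
def insert(heap: list, value) -> None:
    """Insert value into a 1-based min-heap stored in a list.

    heap[0] is an unused placeholder so that the root sits at index 1,
    matching the index arithmetic used in the trace.
    """
    heap.append(value)             # add in the next open slot
    i = len(heap) - 1
    while i > 1:
        p = i // 2                 # parent index (integer division)
        if heap[p] <= heap[i]:     # parent is not greater: done
            break
        heap[p], heap[i] = heap[i], heap[p]
        i = p


heap = [None, 1, 2, 4, 3, 7, 8, 6, 5]
insert(heap, 2)
print(heap[1:])  # [1, 2, 4, 2, 7, 8, 6, 5, 3]
```

The loop stops either at the root (`i == 1`) or as soon as the parent is not greater, which is why the second 2 stays below the first.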


In [20]:
%%file heap_numbered_p2.txt
n1 [label=<1<br/><font point-size="10" color="red">1</font>>]
n2 [label=<2<br/><font point-size="10" color="red">2</font>>]
n4 [label=<4<br/><font point-size="10" color="red">3</font>>]
n2b [label=<2<br/><font point-size="10" color="red">4</font>>]
n7 [label=<7<br/><font point-size="10" color="red">5</font>>]
n8 [label=<8<br/><font point-size="10" color="red">6</font>>]
n6 [label=<6<br/><font point-size="10" color="red">7</font>>]
n5 [label=<5<br/><font point-size="10" color="red">8</font>>]
n3 [label=<3<br/><font point-size="10" color="red">9</font>>]

n1 > n2
n1 < n4
n2 > n2b
n2 < n7
n2b > n5
n2b < n3
n4 > n8
n4 < n6

Writing heap_numbered_p2.txt


In [21]:
! python /data/projects/btrees/btrees.py --w-scale 0.5 heap_numbered_p2.txt

<img src="heap_numbered_p2.png" />

```
1 2 4 2 7 8 6 5 3
```

What is the result of removing an item from the following heap?

```
3 4 3 5 4 6 8 5 8
```

```
value:  3 4 3 5 4 6 8 5 8
index:  1 2 3 4 5 6 7 8 9


value:  8 4 3 5 4 6 8 5
index:  1 2 3 4 5 6 7 8

i = 1
left = 2*1 = 2
right = left + 1 = 3
value[2] = 4
value[3] = 3
3 is smaller, so use child=3
8 > 3, so swap

value:  3 4 8 5 4 6 8 5
index:  1 2 3 4 5 6 7 8

i = 3
left = 6
right = 7
value[6] = 6
value[7] = 8
6 is smaller, so use child=6
8 > 6, so swap

value:  3 4 6 5 4 8 8 5
index:  1 2 3 4 5 6 7 8

i = 6
left = 12 (no child)
right = 13 (no child)
done.
```
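
The removal trace can be sketched the same way. Again this is only an illustration under the same assumption (a list with an unused slot 0 so the root is at index 1); `remove_min` is a hypothetical name.

```python
def remove_min(heap: list):
    """Remove and return the root of a 1-based min-heap stored in a list.

    heap[0] is an unused placeholder; the root is heap[1].
    """
    if len(heap) < 2:
        raise IndexError("remove from an empty heap")
    top = heap[1]
    last = heap.pop()              # take the last item off the end
    n = len(heap) - 1              # number of items remaining
    if n == 0:
        return top                 # the root was the only item
    heap[1] = last                 # replace the root with the last item
    i = 1
    while 2 * i <= n:              # while node i has at least a left child
        child = 2 * i
        if child + 1 <= n and heap[child + 1] < heap[child]:
            child += 1             # the right child is the smaller one
        if heap[i] <= heap[child]:
            break                  # order is correct, stop
        heap[i], heap[child] = heap[child], heap[i]
        i = child
    return top


heap = [None, 3, 4, 3, 5, 4, 6, 8, 5, 8]
print(remove_min(heap))  # 3
print(heap[1:])          # [3, 4, 6, 5, 4, 8, 8, 5]
```

Removing repeatedly until the heap is empty yields the values in ascending order.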


**Note**: when using actual arrays or vectors, we need to use 0-based indexing.

#### 0-based Heap Relationships

- Root is at 0
- Left child is at $2i+1$
- Right child is at $2i+2$
- Parent is at $\left\lfloor \frac{i-1}{2} \right\rfloor$ (integer division)
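
As a quick check of the 0-based relationships, the sketch below writes them as small helper functions and uses them to verify the heap order property on an array. The helper names are hypothetical.

```python
def parent(i: int) -> int:
    return (i - 1) // 2


def left(i: int) -> int:
    return 2 * i + 1


def right(i: int) -> int:
    return 2 * i + 2


def is_min_heap(values: list) -> bool:
    """Check the heap order property for a 0-based array."""
    return all(values[parent(i)] <= values[i] for i in range(1, len(values)))


values = [1, 2, 4, 3, 7, 8, 6, 5]
print(left(0), right(0))        # 1 2
print(parent(7))                # 3
print(is_min_heap(values))      # True
```

Every non-root index has exactly one parent, so checking each item against its parent covers every parent-child pair.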

## Key Ideas

- A heap is implemented with an array, much like a vector
- The "smallest" item is always at the root
  - You can define what "smallest" means to change what value ends up on top
- You can add in $O(1)$ on average ($O(\log n)$ worst case) and remove in $O(\log n)$

What data structure only allows you to retrieve the smallest item, regardless of the order in which the items were inserted?

A **priority queue**!

Priority queues are typically implemented using heaps.
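
For a concrete feel, here is a small sketch using Python's standard `heapq` module, which maintains a min-heap inside an ordinary list. The task names and priorities are made up for illustration.

```python
import heapq

# Each entry is a (priority, task) tuple; tuples compare by priority first.
queue = []
heapq.heappush(queue, (3, "write report"))
heapq.heappush(queue, (1, "fix outage"))
heapq.heappush(queue, (2, "review code"))

order = []
while queue:
    priority, task = heapq.heappop(queue)   # always the smallest priority
    order.append(task)

print(order)  # ['fix outage', 'review code', 'write report']
```

The items come out by priority, not by insertion order. Negating the priority on the way in is one simple way to redefine "smallest" so that the largest value comes out first.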