# Heaps

Because heaps are complete binary trees:
* Lists can be used naturally to implement them. In addition, because insertion and deletion only involve swapping nodes, lists are efficient, so in many cases there is no need for an actual tree implementation.
* To add a node, you start by adding it at the end of the tree. Then the node _swims_ towards the right place. Swimming means exchanging an item with its parent whenever the parent is smaller (in a max heap).

**Compared to a BST:**
* The aim is to get the top-priority element fast, rather than to optimise search: **O(1)** to get the top-priority element.
* Like balanced BSTs, **O(logN)** to insert and delete. Insertion and deletion, however, work quite differently. For example, you add at the end of the tree (no search needed) and the added node swims; it does not stay where it was added, as it does in a BST.
* Pros: 
  * You can optimise: the client thread can get the top-priority item fast. Then the heap can take its O(logN) time to remove it asynchronously.
  * The pros of an array list: ease of implementation, better locality, no space wasted on pointers.
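
For comparison, Python's standard library ships a list-based binary heap in the `heapq` module. It is a min heap, so the sketch below stores negated values to get max-heap behaviour; the negation trick and the variable names are illustrative assumptions, not part of these notes. It shows the O(1) peek at index 0 next to the O(logN) push and pop.

```python
import heapq

# heapq keeps a min heap inside a plain list, so negate values for max-heap order.
heap = []
for value in [10, 20, 30, 40, 50]:
    heapq.heappush(heap, -value)   # O(logN) insert

top = -heap[0]                     # O(1) peek at the top-priority item
popped = -heapq.heappop(heap)      # O(logN) removal of the top item
new_top = -heap[0]                 # the next-largest value is now on top

print(top, popped, new_top)
```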

To use a heap, you must be able to navigate the structure upwards and downwards:
* The left child of the item at index `X` is at index `2X+1`.
* The right child of the item at index `X` is at index `2X+2`.
* The parent of the item at index `X` is at index `(X-1)//2` (integer division, rounding down); the item at index 0 has no parent, of course!
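
The index rules above can be sketched as three small helpers. The function names are hypothetical and chosen only for this illustration; the sample list is the final heap printed later in this notebook.

```python
def left_child(x):
    return 2 * x + 1

def right_child(x):
    return 2 * x + 2

def parent(x):
    # Floor division; only meaningful for x > 0 because the root has no parent.
    return (x - 1) // 2

heap = [70, 40, 60, 10, 30, 20, 50]
i = 2  # index of the item 60
children = (heap[left_child(i)], heap[right_child(i)])
parent_of_i = heap[parent(i)]
print(children, parent_of_i)
```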

**Swim:** To add a node, append it to the heap and, while it is larger than its parent, swap the two.<br>
<img src="Graphics/swim.png" width=70% align="left">

**Sink:** To remove the top node, replace it with the last node, then keep swapping that node (which is now at the root of the heap) with the larger of its children for as long as that child is larger than it.<br>
<img src="Graphics/sink.png" width=70% align="left">

In [11]:
class MaxHeap:
    def __init__(self):
        self.hlist = []
        
    def enqueue(self, value):
        self.hlist.append(value) 
    
    def swim(self, value, child_index):
        # Recursive swim; `value` is unused and kept only so existing calls still work.
        parent_index = (child_index-1)//2  # floor division keeps the index an integer
        if len(self.hlist)<=1 or parent_index<0 or self.hlist[parent_index]>=self.hlist[child_index]: return
        self.hlist[child_index], self.hlist[parent_index] = self.hlist[parent_index], self.hlist[child_index]
        self.swim(value, parent_index)
    
    def swim_iter(self):
        # Iterative swim, starting from the last (most recently added) item.
        child_index = len(self.hlist)-1
        parent_index = (child_index-1)//2
        # Test the index first so the root is never compared against hlist[-1].
        while child_index>0 and self.hlist[child_index]>self.hlist[parent_index]:
            self.hlist[child_index], self.hlist[parent_index] = self.hlist[parent_index], self.hlist[child_index]
            child_index = parent_index
            parent_index = (parent_index-1)//2
        
    def dequeue(self):
        # Swap the top with the last item and pop it; call sink(0) afterwards to restore heap order.
        self.hlist[0], self.hlist[-1] = self.hlist[-1], self.hlist[0]
        return self.hlist.pop(-1)
    
    def sink(self, parent_index):
        left=2*parent_index+1
        right=2*parent_index+2
        if left>=len(self.hlist): return  # no children, nothing to sink past
        # Pick the larger child; the last parent may have a left child only.
        if right<len(self.hlist) and self.hlist[right] >= self.hlist[left]: child_index=right
        else: child_index=left
        if self.hlist[parent_index]>=self.hlist[child_index]: return
        self.hlist[child_index], self.hlist[parent_index] = self.hlist[parent_index], self.hlist[child_index]
        self.sink(child_index)
        
    def print_heap(self):
        print(self.hlist)

**Notes:**

A lot of care is needed with the indices. Out of bounds is just around the corner.
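
One way to catch index mistakes early is to check the heap-order invariant after each operation. The helper below is a minimal sketch; `is_max_heap` is a hypothetical name and is not part of the `MaxHeap` class above. It is tried on the two lists this notebook prints before and after the final `sink(0)`.

```python
def is_max_heap(items):
    """Return True if no item is larger than its parent (max-heap order)."""
    for child in range(1, len(items)):   # index 0 is the root and has no parent
        parent = (child - 1) // 2
        if items[child] > items[parent]:
            return False
    return True

before_sink = [50, 40, 60, 10, 30, 20]   # state right after dequeue
after_sink = [60, 40, 50, 10, 30, 20]    # state after sink(0)
print(is_max_heap(before_sink), is_max_heap(after_sink))
```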

Both a recursive and an iterative implementation of `swim` are provided for demo purposes. Providing both for `sink` would be a waste of time; it is trivial once you have seen the insertion functions.
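
For readers who still want to see it, an iterative `sink` might look roughly like the sketch below. It is written as a standalone function on a plain list so it runs on its own; the name `sink_iter` and its signature are assumptions made for this illustration.

```python
def sink_iter(hlist, parent_index=0):
    """Iteratively sink the item at parent_index in a list-based max heap."""
    size = len(hlist)
    while True:
        left = 2 * parent_index + 1
        right = left + 1
        if left >= size:
            break                      # no children left to compare with
        child_index = left
        if right < size and hlist[right] >= hlist[left]:
            child_index = right        # the right child exists and is the larger one
        if hlist[parent_index] >= hlist[child_index]:
            break                      # heap order already holds
        hlist[child_index], hlist[parent_index] = hlist[parent_index], hlist[child_index]
        parent_index = child_index

data = [50, 40, 60, 10, 30, 20]
sink_iter(data)
print(data)
```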

It is much more convenient to separate `enqueue` and `dequeue` from `swim` and `sink` respectively, particularly for the recursive implementations. Another reason is that `dequeue` has to return the top element, so folding the sink into it would disrupt its flow. The separation also supports an asynchronous enhancement.

`dequeue` is interesting because it is quite pythonic. Check the swapping as well as the access to the last element of the list (using index: -1).
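
The two idioms can be seen in isolation in the short sketch below. The `safe_dequeue` helper is hypothetical: it only illustrates one possible guard for the empty-list case, which the `dequeue` above does not handle.

```python
items = [70, 40, 60, 10]

# Tuple assignment swaps two elements without a temporary variable;
# index -1 refers to the last element of the list.
items[0], items[-1] = items[-1], items[0]
swapped = list(items)

top = items.pop()   # pop() with no argument removes the last element, like pop(-1)

def safe_dequeue(hlist):
    # Hypothetical guard: return None instead of raising IndexError on an empty list.
    if not hlist:
        return None
    hlist[0], hlist[-1] = hlist[-1], hlist[0]
    return hlist.pop()

print(swapped, top, items, safe_dequeue([]))
```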

In [12]:
h = MaxHeap()
h.enqueue(10)
h.swim_iter()
h.print_heap()
h.enqueue(20)
h.swim_iter()
h.print_heap()
h.enqueue(30)
h.swim_iter()
h.print_heap()
h.enqueue(40)
h.swim_iter()
h.print_heap()
h.enqueue(50)
h.swim_iter()
h.print_heap()
h.enqueue(60)
h.swim_iter()
h.print_heap()
h.enqueue(70)
h.swim_iter()
h.print_heap()

[10]
[20, 10]
[30, 10, 20]
[40, 30, 20, 10]
[50, 40, 20, 10, 30]
[60, 40, 50, 10, 30, 20]
[70, 40, 60, 10, 30, 20, 50]


In [13]:
h = MaxHeap()
h.enqueue(10)
h.swim(10, len(h.hlist)-1)
h.print_heap()
h.enqueue(20)
h.swim(20, len(h.hlist)-1)
h.print_heap()
h.enqueue(30)
h.swim(30, len(h.hlist)-1)
h.print_heap()
h.enqueue(40)
h.swim(40, len(h.hlist)-1)
h.print_heap()
h.enqueue(50)
h.swim(50, len(h.hlist)-1)
h.print_heap()
h.enqueue(60)
h.swim(60, len(h.hlist)-1)
h.print_heap()
h.enqueue(70)
h.swim(70, len(h.hlist)-1)
h.print_heap()

[10]
[20, 10]
[30, 10, 20]
[40, 30, 20, 10]
[50, 40, 20, 10, 30]
[60, 40, 50, 10, 30, 20]
[70, 40, 60, 10, 30, 20, 50]


In [14]:
a=h.dequeue()
print("popped: {}".format(a))
h.print_heap()

popped: 70
[50, 40, 60, 10, 30, 20]


In [15]:
h.sink(0)
h.print_heap()

[60, 40, 50, 10, 30, 20]
