## Heap

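The two list-based implementations of a priority queue trade off the cost of the two main operations, inserting an element and removing the minimum element: an unsorted list gives O(1) insertion but O(n) removal of the minimum, while a sorted list gives O(n) insertion but O(1) removal of the minimum.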

The heap data structure offers a better balance: both operations run in logarithmic time.

A heap is a binary tree T that stores a collection of elements at its positions and satisfies two additional properties: a relational property defined in terms of the way keys are stored in T, and a structural property defined in terms of the shape of T itself.

The relational property is the following:
    
    Heap-order property: in a heap T, for every position p other than the root, the key stored at p is greater than or equal to the key stored at p's parent.

For the sake of efficiency, the height of the heap T should be as small as possible. We enforce this by requiring that T satisfy an additional structural property: it must be a complete binary tree.

    Complete binary tree property: a heap T with height h is a complete binary tree if levels 0, 1, 2, ..., h-1 of T have the maximum possible number of nodes (level i has 2^i nodes) and the remaining nodes at level h reside in the leftmost possible positions at that level.

A consequence of completeness is the following property:
    * A heap T storing n entries has height h = floor(log2(n)).
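
The height can be checked numerically. The following is a minimal sketch, not part of the original notebook: the helper name `complete_tree_height` is hypothetical, and it simply fills levels 0, 1, 2, ... until the tree has room for n nodes.

```python
import math

def complete_tree_height(n):
    """Height of a complete binary tree with n >= 1 nodes (hypothetical helper)."""
    h = 0
    capacity = 1                 # nodes available when levels 0..h are all full
    while capacity < n:
        h += 1
        capacity += 2 ** h       # level h contributes 2^h positions
    return h

for n in (1, 2, 3, 4, 7, 8, 15, 16):
    # The height matches floor(log2(n)); note that the ceiling would be wrong for n = 3.
    assert complete_tree_height(n) == math.floor(math.log2(n))
```

For example, three entries fit in a tree of height 1 (a root and two children), whereas the ceiling of log2(3) would give 2.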
    

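A heap can be implemented with an array-based representation of the tree, which avoids some of the complexities of a node-based tree structure. The root is stored at index 0, and the children of the node at index j are stored at indices 2j + 1 and 2j + 2. In particular, the add and remove_min operations of a priority queue both depend on locating the last position of a heap of size n, which in the array-based representation is simply index n - 1.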
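
To make the array layout concrete, here is a small sketch that works on a plain list of keys in level order. The function names `parent`, `children`, and `is_min_heap` are illustrative only; the index arithmetic is the same one used by the `_parent`, `_left`, and `_right` methods of the class in this notebook.

```python
def parent(j):
    """Index of the parent of the node stored at index j (j > 0)."""
    return (j - 1) // 2

def children(j, n):
    """Indices of the children of node j that exist in a heap of n entries."""
    return [c for c in (2 * j + 1, 2 * j + 2) if c < n]

def is_min_heap(keys):
    """Check the heap-order property on a level-order list of keys."""
    return all(keys[parent(j)] <= keys[j] for j in range(1, len(keys)))

keys = [4, 5, 6, 15, 9, 7, 20]   # root, then level 1, then level 2
last = len(keys) - 1             # the last position is index n - 1

assert children(0, len(keys)) == [1, 2]
assert parent(last) == 2
assert is_min_heap(keys)
assert not is_min_heap([4, 5, 3])   # 3 is smaller than its parent 4
```

Because the last position is always index n - 1, appending to or popping from the end of the list is all that is needed to grow or shrink the tree while keeping it complete.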

In [3]:
class PriorityQueueBase:
    """Abstract base class for a priority queue"""
    class _Item:
        __slots__ = 'key', 'value'
        
        def __init__(self, k , v):
            self.key = k
            self.value = v
            
        def __lt__(self, other):
            return self.key < other.key
        
    def is_empty(self):
        """Return True if the priority queue is empty (concrete subclasses must define __len__)."""
        return len(self) == 0

In [4]:
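class Empty(Exception):
    """Raised when accessing an element of an empty priority queue."""


class HeapPriorityQueue(PriorityQueueBase):
    """A min-oriented priority queue implemented with an array-based binary heap."""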
    def _parent(self, j):
        return (j-1)//2
    
    def _left(self, j):
        return j * 2 + 1
    
    def _right(self, j):
        return j * 2 + 2
    
    def _has_left(self, j):
        return self._left(j) < len(self.data)
    
    def _has_right(self, j):
        return self._right(j) < len(self.data)
    
    def _swap(self, i, j):
        self.data[j], self.data[i] = self.data[i], self.data[j]
    
    def _upheap(self, j):
        parent = self._parent(j)
        if j > 0 and self.data[j] < self.data[parent]:
            self._swap(parent, j)
            self._upheap(parent)
            
    def _downheap(self, j):
        if self._has_left(j):
            left = self._left(j)
            small_child = left              # index (not item) of the smaller child so far
            if self._has_right(j):
                right = self._right(j)
                if self.data[right] < self.data[left]:
                    small_child = right
            if self.data[small_child] < self.data[j]:
                self._swap(j, small_child)
                self._downheap(small_child)
                
    def __init__(self):
        self.data = []
        
    def __len__(self):
        return len(self.data)
    
    def add(self, key, value):
        self.data.append(self._Item(key, value))
        self._upheap(len(self.data) - 1)
        
    def min(self):
        if self.is_empty():
            raise Empty('HeapPriorityQueue is empty')
        item = self.data[0]
        return item.key, item.value
    
    def remove_min(self):
        if self.is_empty():
            raise Empty('HeapPriorityQueue is empty')
            
        self._swap(0, len(self.data) - 1)
        item = self.data.pop()
        self._downheap(0)
        return item.key, item.value
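
As a quick cross-check of the expected behavior, the sketch below uses Python's standard `heapq` module, which also maintains a min-heap in a plain list, as a stand-in for `HeapPriorityQueue`. It is not the class above: `heappush` and `heappop` merely play the roles of `add` and `remove_min`, and the sample entries are made up for illustration.

```python
import heapq

heap = []                                    # plain list used as the heap array
for key, value in [(5, 'A'), (9, 'F'), (3, 'B'), (7, 'D')]:
    heapq.heappush(heap, (key, value))       # analogous to add(key, value)

smallest = heap[0]                           # analogous to min(): the root sits at index 0

# Analogous to calling remove_min() until the queue is empty.
removed = [heapq.heappop(heap) for _ in range(len(heap))]

assert smallest == (3, 'B')
assert removed == [(3, 'B'), (5, 'A'), (7, 'D'), (9, 'F')]
```

Entries come out in nondecreasing key order, which is the behavior `HeapPriorityQueue` should reproduce for the same sequence of operations.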
    

## Analysis of a Heap-based Priority Queue

For a heap-based priority queue, whether the heap uses an array-based or a linked tree representation, every method runs in O(1) or O(log(n)) time, where n is the number of entries at the time the method is called.

|Operation|Running time|
|---------|------------|
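|len|O(1)|
|is_empty|O(1)|
|add|O(log(n))*|
|min|O(1)|
|remove_min|O(log(n))*|

\* Amortized for the array-based representation, because the underlying Python list occasionally resizes.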
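
The logarithmic bound for add comes from the fact that up-heap bubbling performs at most one swap per level of the tree. The following sketch counts swaps on a plain list of keys; the functions `upheap` and `add` are simplified, iterative stand-ins written for this illustration, not the methods of the class above.

```python
def upheap(data, j):
    """Bubble data[j] up toward the root; return the number of swaps performed."""
    swaps = 0
    while j > 0:
        p = (j - 1) // 2
        if data[p] <= data[j]:
            break                            # heap-order already holds
        data[p], data[j] = data[j], data[p]
        j = p
        swaps += 1
    return swaps

def add(data, key):
    """Append key at the last position and restore heap-order."""
    data.append(key)
    return upheap(data, len(data) - 1)

heap = []
# Decreasing keys are a worst case: every new key bubbles all the way to the root.
swaps_per_add = [add(heap, key) for key in range(1000, 0, -1)]
height = len(heap).bit_length() - 1          # floor(log2(1000)) = 9

assert max(swaps_per_add) == height
```

Even in this worst case no single add needs more swaps than the height of the heap. The same argument applies to down-heap bubbling in remove_min, which moves an entry down at most one level per swap.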

## Bottom-up heap construction

Starting from an empty heap, n consecutive add operations take O(n log(n)) time in the worst case. But if all n key-value pairs to be stored are given in advance, there is an alternative bottom-up construction method that runs in O(n) time.

For simplicity of exposition, we describe this bottom-up heap construction assuming the number of keys, n, is an integer such that n = 2^(h+1) − 1. That is, the heap is a complete binary tree with every level full, so the heap has height h = log(n + 1) − 1, where log denotes the base-2 logarithm. Viewed nonrecursively, bottom-up heap construction consists of the following h + 1 = log(n + 1) steps:
    
    * In the first step, we construct (n + 1)/2 elementary heaps storing one entry each.
    * In the second step, we form (n+1)/4 heaps, each storing three entries, by joining pairs of elementary heaps and adding a new entry. The new entry is placed at the root and may have to be swapped with the entry stored at a child to preserve the heap-order property.
    * In the third step, we form (n + 1)/8 heaps, each storing 7 entries, by joining pairs of 3-entry heaps (constructed in the previous step) and adding a new entry. The new entry is placed initially at the root, but may have to move down with a down-heap bubbling to preserve the heap-order property.
    ...
    * In the generic ith step, 2 ≤ i ≤ h, we form (n + 1)/2^i heaps, each storing 2^i − 1 entries, by joining pairs of heaps storing 2^(i−1) − 1 entries (constructed in the previous step) and adding a new entry. The new entry is placed initially at the root, but may have to move down with a down-heap bubbling to preserve the heap-order property.
    ...
    * In the last step, we form the final heap, storing all the n entries, by joining two heaps storing (n − 1)/2 entries (constructed in the previous step) and adding a new entry. The new entry is placed initially at the root, but may have to move down with a down-heap bubbling to preserve the heap-order property.
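
The step counts above can be tabulated for a concrete n. This is a small arithmetic sketch under the same assumption n = 2^(h+1) − 1; the function name `construction_steps` is hypothetical.

```python
def construction_steps(n):
    """Return (number_of_heaps, entries_per_heap) for each step, assuming n = 2^(h+1) - 1."""
    h = (n + 1).bit_length() - 2             # h = log2(n + 1) - 1 when n + 1 is a power of 2
    if 2 ** (h + 1) - 1 != n:
        raise ValueError("this sketch assumes n = 2^(h+1) - 1")
    # Step i forms (n + 1) / 2^i heaps, each storing 2^i - 1 entries.
    return [((n + 1) // 2 ** i, 2 ** i - 1) for i in range(1, h + 2)]

steps = construction_steps(15)               # h = 3, so there are h + 1 = 4 steps
assert steps == [(8, 1), (4, 3), (2, 7), (1, 15)]

# Every entry becomes the root of exactly one heap, so the heap counts add up to n.
assert sum(count for count, _ in steps) == 15
```

For n = 15 the construction goes from eight 1-entry heaps to four 3-entry heaps, then two 7-entry heaps, and finally the single 15-entry heap.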

In [None]:
%%add_to HeapPriorityQueue
def __init__(self, contents=()):
    """Create a priority queue, optionally from an iterable of (key, value) pairs."""
    self.data = [self._Item(key, value) for key, value in contents]
    if len(self.data) > 1:
        self._heapify()
        
def _heapify(self):
    start = self._parent(len(self) - 1)    # parent of the last entry, i.e. the deepest non-leaf
    for j in range(start, -1, -1):
        self._downheap(j)
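
The linear bound can be observed by counting swaps. The sketch below is a standalone version that works on plain keys rather than `_Item` objects; `downheap`, `heapify`, and `is_min_heap` are simplified stand-ins written for this illustration, and the decreasing input is just one convenient test case.

```python
def downheap(data, j):
    """Bubble data[j] down toward the leaves; return the number of swaps performed."""
    swaps = 0
    n = len(data)
    while 2 * j + 1 < n:                     # while j has a left child
        child = 2 * j + 1
        if child + 1 < n and data[child + 1] < data[child]:
            child += 1                       # the right child is the smaller one
        if data[j] <= data[child]:
            break                            # heap-order already holds
        data[j], data[child] = data[child], data[j]
        j = child
        swaps += 1
    return swaps

def heapify(data):
    """Bottom-up construction in place; return the total number of swaps."""
    total = 0
    for j in range((len(data) - 2) // 2, -1, -1):   # from the parent of the last entry up to the root
        total += downheap(data, j)
    return total

def is_min_heap(data):
    return all(data[(j - 1) // 2] <= data[j] for j in range(1, len(data)))

keys = list(range(1023, 0, -1))              # n = 2^10 - 1 keys in decreasing order
swaps = heapify(keys)

assert is_min_heap(keys)
assert swaps < len(keys)                     # fewer than n swaps in total
```

The total stays below n because each down-heap is limited by the height of its subtree, and most subtrees are short: half of the positions are leaves and need no swaps at all.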