## 1. Heap Sort

**Importance of trees**: The *tree* is arguably the most important data structure in CS because changing only the <u>storage order</u> or the branching of its nodes yields structures that solve completely different kinds of problems.

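**From *BST* to Heap**: like a BST, a heap is a binary tree, but it follows different <u>storage rules</u> for its nodes: a BST orders keys from left to right, whereas a heap only requires each parent to have priority over its children. The heap is the standard way to implement a *"priority queue"*.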

### 1.1 Priority Queue

**Usage Scenario**: ordinary queues follow the FIFO rule, *i.e.*, <u>dequeue from the head and enqueue at the tail</u>. <br/>
However, if we need to dequeue the highest-priority element of an unsorted queue, each dequeue costs $O(N)$, so dequeueing all $N$ elements costs $O(N^2)$ <br/>
((1) traverse the entire queue to find the highest priority, $O(N)$; (2) remove that element from the middle of the queue, $O(N)$).

The heap is specifically designed for this scenario: it **keeps the time complexity at $O(\log N)$** when dequeueing the highest-priority element. An ordinary queue dequeues its first element in $O(1)$, but once priorities matter the gap is $O(\log N)$ versus $O(N)$ per dequeue, or $O(N \log N)$ versus $O(N^2)$ for dequeueing all $N$ elements!
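
The two ways of dequeueing the highest-priority element can be sketched with Python's standard `heapq` module. `heapq` implements a min-heap, so the values are negated here to get max-first behaviour; the helper names `dequeue_max_naive` and `dequeue_max_heap` are made up for this illustration.

```python
import heapq
from typing import List


def dequeue_max_naive(queue: List[int]) -> int:
    """Remove and return the largest element of an unsorted list: O(N)."""
    best = 0
    for i in range(1, len(queue)):  # (1) scan for the highest priority
        if queue[i] > queue[best]:
            best = i
    return queue.pop(best)          # (2) remove it, shifting the tail


def dequeue_max_heap(heap: List[int]) -> int:
    """Remove and return the largest element of a heap of negated values: O(log N)."""
    return -heapq.heappop(heap)


data = [11, 25, 7, 37, 22, 3, 8, 26, 12, 10]

naive_queue = list(data)
naive_order = [dequeue_max_naive(naive_queue) for _ in range(len(data))]

heap = [-x for x in data]           # negate so the min-heap pops the maximum first
heapq.heapify(heap)
heap_order = [dequeue_max_heap(heap) for _ in range(len(data))]
```

Both lists come out in descending order; only the cost per dequeue differs.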

![performance gap between heap and queue](https://pic.leetcode-cn.com/1608021277-ZIMDFh-0.png)

### 1.2 Heap

Given an array, `[37, 22, 26, 12, 11, 25, 7, 3, 8, 10]`, after it is heapified, the heap will look like:

![heap example](https://pic.leetcode-cn.com/1608021348-jIZjqH-1.png)



#### 1.2.1 Characteristics of Heap

1. **(max heap) The priority of a parent node is higher than or equal to that of its child nodes**, so the value of the root node is the largest.
2. **Complete Binary Tree**. <br/>
   The heap is a complete binary tree: every level is full except possibly the bottom one, and the nodes of the bottom level are packed towards the left side, *i.e.*, nodes are always added from left to right, which makes the heap more compact than ordinary trees.<br/>
   (A complete binary tree can be stored in an array without gaps)
3. **Dynamic**.<br/>
   Like a BST, the heap has its own set of operations (*sift up*, *sift down*, *replace*, etc.) that preserve its characteristics as elements are added and removed.
4. **Fixed index relationship between parents and children** (for a zero-based array index $i$).
   - From parent to children: to find the child nodes of any given parent, we can use **$2i + 1$** to find the **left child** and **$2i + 2$** to find the **right child**.
   - From child to parent: to find the parent node of any given child, we can use **$\lfloor (i-1)/2 \rfloor$** to find the **parent** (rounding down).<img width="50%" height="50%" src="https://i.ytimg.com/vi/IjPZc9zpn7Y/maxresdefault.jpg"></img>
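
As a quick check of these index formulas, here is a minimal sketch that applies them to the example array from Section 1.2. The function names (`parent`, `left_child`, `right_child`, `is_max_heap`) are chosen for this illustration only.

```python
from typing import List


def parent(i: int) -> int:
    return (i - 1) // 2          # floor division rounds down


def left_child(i: int) -> int:
    return 2 * i + 1


def right_child(i: int) -> int:
    return 2 * i + 2


def is_max_heap(arr: List[int]) -> bool:
    """True if every non-root element is <= its parent."""
    return all(arr[parent(i)] >= arr[i] for i in range(1, len(arr)))


heap = [37, 22, 26, 12, 11, 25, 7, 3, 8, 10]   # example array from Section 1.2

i = 1                                                    # the node holding 22
children = (heap[left_child(i)], heap[right_child(i)])   # (12, 11)
up = heap[parent(i)]                                     # 37
```

Because the tree is complete, these three formulas are all that is needed to walk it; no pointers are stored.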

#### 1.2.2 Operations of Heap

##### Basic Operations: Sift Up and Sift Down

**(1) Sift Up**:
1. *Push*: push the new element $x_{\text{new}}$ to the end of the array, *i.e.*, the bottom of the heap.
2. *Move up*: compare $x_{\text{new}}$ with the value of its parent; if $x_{\text{new}} > x_{\text{parent}}$, exchange the positions of $x_{\text{new}}$ and $x_{\text{parent}}$.
3. Continue to compare $x_{\text{new}}$ with the value of its new parent, until $x_{\text{new}} \le x_{\text{parent}}$ or $x_{\text{new}}$ reaches the root node.
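
The three steps can be sketched as a pair of standalone functions. This is an iterative variant written for illustration (the names `sift_up` and `push` are assumptions of this sketch), not the class-based implementation used later in this notebook.

```python
from typing import List


def sift_up(heap: List[int], i: int) -> None:
    """Move heap[i] up until its parent is not smaller (max heap)."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] <= heap[parent]:
            break                      # heap property holds, stop
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent


def push(heap: List[int], x_new: int) -> None:
    heap.append(x_new)                 # step 1: place at the bottom
    sift_up(heap, len(heap) - 1)       # steps 2-3: move up as needed


heap = [37, 22, 26, 12, 11, 25, 7, 3, 8, 10]
push(heap, 30)
# 30 passes 11 and 22, then stops below 37:
# heap == [37, 30, 26, 12, 22, 25, 7, 3, 8, 10, 11]
```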
   
<font color=red>**(2) Sift Down**</font>
> AM: The most important reason why we use heap is that we want to dequeue the highest priority element efficiently.

Because the nature of the heap must be maintained during a dequeue, the root node cannot simply be removed. So:
1. **after the root node is dequeued, we move the last element $x_{\text{end}}$ of the heap array to the top of the heap.**
2. then compare $x_{\text{end}}$ with the values of its left and right children, and if it is smaller than either of them, exchange positions with the child with the <u>largest value</u>.
3. continue to sift $x_{\text{end}}$ down until it is greater than or equal to both of its children or it reaches a leaf position.
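
A minimal iterative sketch of these steps, assuming a non-empty max heap stored in a Python list; `sift_down` and `pop_max` are names chosen for this illustration.

```python
from typing import List


def sift_down(heap: List[int], i: int) -> None:
    """Move heap[i] down until it is >= both of its children (max heap)."""
    n = len(heap)
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and heap[left] > heap[largest]:
            largest = left
        if right < n and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            return                     # already in place (or a leaf)
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest


def pop_max(heap: List[int]) -> int:
    """Dequeue the root of a non-empty max heap."""
    top = heap[0]
    last = heap.pop()                  # take the last element x_end
    if heap:
        heap[0] = last                 # step 1: put x_end at the top
        sift_down(heap, 0)             # steps 2-3: sift it down
    return top


heap = [37, 22, 26, 12, 11, 25, 7, 3, 8, 10]
largest = pop_max(heap)
# largest == 37, heap == [26, 22, 25, 12, 11, 10, 7, 3, 8]
```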


##### Auxiliary Operations

**(3) Replace**<br/>
Now suppose we need to replace the root node, *i.e.*, dequeue the root and enqueue a new element. We could dequeue the root (one `SiftDown`) and then push the new element (one `SiftUp`), but that takes two $O(\log N)$ operations. Instead, we can define a function `replace()` that overwrites the root with the new element and performs a single `SiftDown`, which needs only one $O(\log N)$ operation.
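
A minimal sketch of such a `replace()` as a standalone function on a non-empty max heap: it overwrites the root and sifts the new value down once. The standard library offers the same operation for min-heaps as `heapq.heapreplace`.

```python
from typing import List


def replace(heap: List[int], x_new: int) -> int:
    """Return the old root and insert x_new using a single sift down."""
    old_root = heap[0]
    heap[0] = x_new
    i, n = 0, len(heap)
    while True:
        child = 2 * i + 1                      # left child
        if child >= n:
            break                              # reached a leaf
        if child + 1 < n and heap[child + 1] > heap[child]:
            child += 1                         # right child is larger
        if heap[i] >= heap[child]:
            break                              # heap property restored
        heap[i], heap[child] = heap[child], heap[i]
        i = child
    return old_root


heap = [37, 22, 26, 12, 11, 25, 7, 3, 8, 10]
old = replace(heap, 20)
# old == 37, heap == [26, 22, 25, 12, 11, 20, 7, 3, 8, 10]
```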


**(4) Heapify**<br/>
How can we quickly convert an array to a heap? It is possible to add each item of the array to the heap one by one, but this is not the fastest way. <br/>
Here we can take advantage of the characteristics of a complete binary tree: <u>start from the parent node of the last leaf node, and perform the `SiftDown` operation on every node from that starting node back to the root node</u>.<br/>
The leaf nodes, about half of all nodes, need no operation at all, and most of the remaining nodes sit near the bottom and move only a few levels, so the time complexity is reduced from $O(N \log N)$ to $O(N)$.
<img src="https://pic.leetcode-cn.com/1608021426-PVtenH-5.png"></img>
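
The $O(N)$ bound can be justified with a short counting argument. This is a sketch of the standard analysis, where $h$ (a symbol introduced here) is a node's height above the leaves: a heap of $N$ nodes has at most $\lceil N / 2^{h+1} \rceil$ nodes of height $h$, and sifting such a node down costs $O(h)$.

$$
\sum_{h=0}^{\lfloor \log_2 N \rfloor} \left\lceil \frac{N}{2^{h+1}} \right\rceil O(h)
= O\!\left( N \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
= O(N), \qquad \text{since } \sum_{h=0}^{\infty} \frac{h}{2^{h}} = 2 .
$$

By contrast, pushing the items one by one can cost up to $O(\log N)$ per item, which gives the $O(N \log N)$ bound.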



### 1.3 Implement a Max Heap from Scratch

In [47]:
from typing import List
from math import floor
import big_o
from binarytree import build

In [92]:
class MaxHeap:
    def __init__(self, arr: List[int]):
        if len(arr) == 0: 
            self.arr = []  
        else:
            self.heapify(arr)
    
    def getParent(self, i: int) -> int:
        if i == 0: return 0
        return int(floor((i - 1) / 2))
    
    def getLeftChild(self, i: int) -> int:
        return 2 * i + 1
    
    def getRightChild(self, i: int) -> int:
        return 2 * i + 2
    
    def swap(self, i, j):
        self.arr[i], self.arr[j] = self.arr[j], self.arr[i] 
    
    def push(self, new: int):
        # append at the bottom of the heap, then restore the heap property
        self.arr.append(new)
        self.siftUp(len(self.arr) - 1)
        
    def siftUp(self, i: int):
        parent = self.getParent(i)
        if self.arr[parent] < self.arr[i]:
            self.swap(parent, i)
            self.siftUp(parent)
            
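    def sift(self) -> int:
        # dequeue the root: swap it with the last element, pop it, repair the heap
        if len(self.arr) == 0: return
        self.swap(0, -1)
        top = self.arr.pop()  # avoid shadowing the built-in max()
        self.siftDown(0)
        return top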
    
    def siftDown(self, i: int):
        max = self.getLeftChild(i)
        if max >= len(self.arr): return
        if max + 1 < len(self.arr):
            if self.arr[max + 1] >= self.arr[max]:
                max += 1
        if self.arr[max] > self.arr[i]:
            self.swap(max, i)
            self.siftDown(max)
            
    def replace(self, new: int):
        # overwrite the root with the new element, then sift it down once
        if len(self.arr) == 0: return 
        top = self.arr[0]
        self.arr[0] = new
        self.siftDown(0)
        return top
    
    def heapify(self, arr: List[int]) -> None:
        # build the heap in place: sift down every non-leaf node, bottom up
        self.arr = arr
        start = self.getParent(len(self.arr) - 1)
        while start >= 0:
            self.siftDown(start)
            start -= 1

In [114]:
arr = [11, 25, 7, 37, 22, 3, 8, 26, 12, 10]  # [37, 22, 26, 12, 11, 25, 7, 3, 8, 10]
MH = MaxHeap(arr)
root = build(MH.arr)
print(root)


           _______37__
          /           \
     ____26___         8
    /         \       / \
  _25         _22    3   7
 /   \       /
11    12    10



### 1.4 Naive Heap Sort

<font color=yellow>Since the root node is the largest in the max heap, the sorted array can be obtained by dequeueing the elements of the heap one by one.</font>

In [118]:
arr = [11, 25, 7, 37, 22, 3, 8, 26, 12, 10]
MH = MaxHeap(arr)
res = []
while len(MH.arr) > 0:
    res.insert(0, MH.sift())
res

[3, 7, 8, 10, 11, 12, 22, 25, 26, 37]

#### 1.4.1 Heap Sort (use an extra array)

In [260]:
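class heapSort:
    def Sort(self, arr: List[int]) -> List[int]:
        if len(arr) == 0: return []
        self.heap = list(arr)  # work on a copy so the caller's list is not emptied
        self.heapify()
        res = []
        while len(self.heap) > 0:
            res.append(self.sift())  # largest remaining element first
        res.reverse()  # ascending order, without the O(N) cost of insert(0, ...)
        return res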
    
    def heapify(self) -> None:
        start = floor((len(self.heap) - 1) / 2)
        while start >= 0:
            self.siftDown(start)
            start -= 1
    
    def sift(self):
        self.heap[0], self.heap[-1] = self.heap[-1], self.heap[0]
        max = self.heap.pop()
        self.siftDown(0)
        return max
    
    def siftDown(self, i):
        switch = 2 * i + 1  # left child node of given parent node
        if switch > len(self.heap) - 1: return
        if switch + 1 < len(self.heap):
            if self.heap[switch + 1] > self.heap[switch]: 
                switch += 1
        if self.heap[i] < self.heap[switch]:
            self.heap[i], self.heap[switch] = self.heap[switch], self.heap[i]
            self.siftDown(switch)

In [261]:
arr = [11, 25, 7, 37, 22, 3, 8, 26, 12, 10]
HS = heapSort()
sorted_arr = HS.Sort(arr)
sorted_arr

[3, 7, 8, 10, 11, 12, 22, 25, 26, 37]


#### 1.4.2 Heap Sort (without extra arrays)

Sort directly on the original array without resorting to extra space.

Key idea: after the top element of the heap is exchanged with the last element of the current heap, the **right boundary is reduced** by one, so the sorted elements accumulate at the end of the array.

![reduce_boundary](https://pic.leetcode-cn.com/1608021436-UlGGVX-6.png)

In [255]:
class heapSort:
    def Sort(self, arr: List[int]) -> List[int]:
        if len(arr) == 0: return []
        self.heap = arr
        self.heapify()
        end = len(self.heap) - 1
        while end > 0:
            self.sift(end)
            end -= 1
        return self.heap
    
    def heapify(self) -> None:
        start = floor((len(self.heap) - 1) / 2)
        while start >= 0:
            self.siftDown(start, len(self.heap))
            start -= 1
    
    def sift(self, end):
        # move the current maximum to position `end`, then repair heap[0:end]
        self.heap[0], self.heap[end] = self.heap[end], self.heap[0]
        self.siftDown(0, end)
    
    def siftDown(self, i, end):
        switch = 2 * i + 1  # left child node of given parent node
        if switch > end - 1: return
        if switch + 1 < end:
            if self.heap[switch + 1] > self.heap[switch]: 
                switch += 1
        if self.heap[i] < self.heap[switch]:
            self.heap[i], self.heap[switch] = self.heap[switch], self.heap[i]
            self.siftDown(switch, end)

In [256]:
arr = [11, 25, 7, 37, 22, 3, 8, 26, 12, 10]
HS = heapSort()
sorted_arr = HS.Sort(arr)
sorted_arr

[3, 7, 8, 10, 11, 12, 22, 25, 26, 37]

#### 1.4.3 Complexity Analysis
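
A brief sketch of the standard analysis for the in-place version in 1.4.2, stated as well-known results rather than measurements from this notebook: heapify costs $O(N)$, and each of the $N - 1$ dequeue steps sifts down through a heap of $k$ remaining elements at a cost of $O(\log k)$.

$$
T(N) = O(N) + \sum_{k=2}^{N} O(\log k) = O(N \log N)
$$

This bound holds in the best, average, and worst case alike. The in-place version needs only $O(1)$ extra array space (the recursive `siftDown` above additionally uses an $O(\log N)$ call stack), while the extra-array version in 1.4.1 needs $O(N)$. Heap sort is not stable: equal elements may change their relative order.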

## 2. Applications of Heap Sort

[Applications of Heap Sort](https://leetcode.cn/circle/article/JyofwU/)