# Heaps

**Heap: A Special Tree with Rules!** 🔺

**Two Types of Heaps:**
- **Max-Heap**: Every parent node is **greater than or equal to** its children. (The biggest number sits at the top.)
- **Min-Heap**: Every parent node is **smaller than or equal to** its children. (The smallest number sits at the top.)
  
A heap is commonly implemented as a **binary tree**, but it’s stored in an **array** for efficient indexing.

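> Fun Fact: Heaps are so often used to implement **priority queues** that the two terms are sometimes used interchangeably, although a priority queue is the abstract interface and a heap is one way to build it.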

## Example of Max-Heap

```
        50
      /    \
    30      20
   /  \    /  \
  15  10  8    5
```

Here:
- 50 is the biggest and sits at the root.
- Every parent is bigger than its children.

## Concepts Explained

### Uses of Heaps:
- **Inserting** an element and **removing** the smallest (or largest) element each take O(log n) time.
- Used in algorithms like **Dijkstra's shortest path** and **Prim’s MST**.
- **Priority Queues**: When you need to always get the "most important" (or smallest/largest) item first.
- **Heap Sort**: A sorting algorithm based on heaps.

### Key Takeaways
A **heap** is a tree-based data structure that satisfies the **heap property**:
  - **Min-Heap**: The parent node is always smaller than (or equal to) its child nodes.
  - **Max-Heap**: The parent node is always greater than (or equal to) its child nodes.

**Binary Heap**: Most commonly implemented as a **complete binary tree**, where each node has at most two children and every level is filled from left to right.

**Efficient Operations**:
   - Insert: O(log n)
   - Remove Min/Max: O(log n)
   - Peek Min/Max: O(1)

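**Array Representation**: Heaps are typically stored as arrays, which avoids per-node pointers and makes it possible to find a node's relatives with simple arithmetic. For the node at index `i` (0-based):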
   - Parent index: `(i - 1) // 2`
   - Left child index: `2 * i + 1`
   - Right child index: `2 * i + 2`
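
To make the index arithmetic concrete, here is a small sketch that stores the max-heap from the example above as a list and checks the heap property using those formulas. The helper name `is_max_heap` is an assumption made for illustration; it is not part of `heapq`.

```python
def is_max_heap(heap):
    """Return True if every parent is >= its children (0-based array layout)."""
    n = len(heap)
    for i in range(n):
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and heap[i] < heap[left]:
            return False
        if right < n and heap[i] < heap[right]:
            return False
    return True

# The example tree, read level by level from top to bottom
heap = [50, 30, 20, 15, 10, 8, 5]

i = 4                  # index of the value 10
parent = (i - 1) // 2  # index 1, which holds the value 30
print(heap[parent], is_max_heap(heap))
```

Reading the tree level by level is what makes the formulas work: the children of index `i` always sit at `2 * i + 1` and `2 * i + 2`.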

---

## Python’s `heapq` Module
Python’s `heapq` is a **min-heap** by default, meaning the smallest element is always at the root (index 0). If you need a max-heap, you can simulate it using negated values.

### Common `heapq` Functions:
1. **`heapify(iterable)`**:
   - Turns a list into a valid heap in O(n) time.
   - Example:
     ```python
     import heapq
     nums = [3, 2, 1, 5, 6, 4]
     heapq.heapify(nums)  # Converts to a min-heap
     print(nums)  # Output: [1, 2, 3, 5, 6, 4]
     ```

2. **`heappush(heap, element)`**:
   - Adds an element to the heap while maintaining the heap property.
   - Example:
     ```python
     heapq.heappush(nums, 0)
     print(nums)  # Output: [0, 2, 1, 5, 6, 4, 3]
     ```

3. **`heappop(heap)`**:
   - Removes and returns the smallest element (root) from the heap.
   - Example:
     ```python
     smallest = heapq.heappop(nums)
     print(smallest)  # Output: 0
     print(nums)  # Output: [1, 2, 3, 5, 6, 4]
     ```

4. **`heappushpop(heap, element)`**:
   - Pushes a new element onto the heap, then pops and returns the smallest element in one step (more efficient than separate push and pop operations).
   - Example:
     ```python
     result = heapq.heappushpop(nums, 8)
     print(result)  # Output: 1 (smallest element popped)
     print(nums)  # Output: [2, 5, 3, 8, 6, 4]
     ```

5. **`heapreplace(heap, element)`**:
   - Pops and returns the smallest element, then pushes a new element onto the heap in one step.
   - Example:
     ```python
     result = heapq.heapreplace(nums, 10)
     print(result)  # Output: 2 (smallest element popped)
     print(nums)  # Output: [3, 5, 4, 8, 6, 10]
     ```

6. **`nlargest(n, iterable)` and `nsmallest(n, iterable)`**:
   - Fetches the `n` largest or smallest elements from the heap or any iterable, returned in sorted order (largest first for `nlargest`, smallest first for `nsmallest`).
   - Example:
     ```python
     print(heapq.nlargest(3, nums))  # Output: [10, 8, 6]
     print(heapq.nsmallest(3, nums))  # Output: [3, 4, 5]
     ```

---


# Problems
- [Finding the Kth Largest Element](#finding-the-kth-largest-element)
- [Merge K Sorted Lists](#merge-k-sorted-lists)
- [Top K Frequent Elements](#top-k-frequent-elements)
- [Find Median from Data Stream](#find-median-from-data-stream)
- [How to Simulate a Max-Heap](#how-to-simulate-a-max-heap)

## Finding the Kth Largest Element

### Problem:
Find the **Kth largest element** in an array.

### Approach:
1. Use a **min-heap** of size `k`.
   - The smallest element in the heap is the root (`heap[0]`).
   - The heap always keeps the top `k` largest elements.
2. Iterate through the array:
   - If the current element is larger than the root of the heap, replace the root with this new element (maintaining size `k`).
3. At the end, the root of the heap is the Kth largest element.

### Explanation:
1. For `nums = [3, 2, 1, 5, 6, 4]` and `k = 2`, start with the first `k` elements: `[3, 2]`.
   - Heapify: `[2, 3]`.
2. Process the rest of the array:
   - Element `1`: Ignore (smaller than root).
   - Element `5`: Replace root (`2` → `5`): `[3, 5]`.
   - Element `6`: Replace root (`3` → `6`): `[5, 6]`.
   - Element `4`: Ignore (smaller than root).
3. Final heap: `[5, 6]`. The root (`5`) is the 2nd largest element.

```python
import heapq

def find_kth_largest(nums, k):
    # Step 1: Create a min-heap with the first k elements
    heap = nums[:k]
    heapq.heapify(heap)  # Turn it into a heap (O(k))

    # Step 2: Process the rest of the elements
    for num in nums[k:]:
        if num > heap[0]:  # Compare with the smallest in the heap
            heapq.heappushpop(heap, num)  # Push num and pop the smallest (O(log k))

    # Step 3: Return the smallest in the heap (the Kth largest overall)
    return heap[0]

# Example
nums = [3, 2, 1, 5, 6, 4]
k = 2
print(find_kth_largest(nums, k))  # Output: 5
```
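
As a sanity check on the approach above, the same answer can be obtained with `heapq.nlargest` or with a plain sort. This is only a cross-check sketch; the variable names `kth_via_nlargest` and `kth_via_sort` are made up for illustration.

```python
import heapq

nums = [3, 2, 1, 5, 6, 4]
k = 2

# nlargest returns the k largest values in descending order,
# so the last one is the kth largest.
kth_via_nlargest = heapq.nlargest(k, nums)[-1]

# Sorting is the simplest baseline: O(n log n) instead of O(n log k).
kth_via_sort = sorted(nums, reverse=True)[k - 1]

print(kth_via_nlargest, kth_via_sort)  # 5 5
```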



## Merge K Sorted Lists

### Problem:
Merge `k` sorted linked lists into one sorted list.

### Approach:
1. Use a **min-heap** to track the smallest elements from all the lists.
2. Insert the first element from each list into the heap.
3. While the heap is not empty:
   - Remove the smallest element from the heap.
   - Add it to the merged list.
   - If the removed element has a next node, add that next node to the heap.
4. Continue until all elements are merged.

### Explanation:
- The heap ensures we always extract the smallest element efficiently.
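- Each heap entry is a tuple `(value, list_index, node)`. The heap orders entries by `value`; `list_index` breaks ties between equal values, so Python never has to compare two `ListNode` objects (which define no ordering and would raise a `TypeError`).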
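
The role of the list index is easiest to see by leaving it out. The sketch below uses a stand-in class named `Node` (an assumption for illustration, mirroring a linked-list node with no ordering defined) and shows that two entries with equal values fail to compare unless a unique index sits between the value and the node.

```python
import heapq

class Node:
    """Stand-in for a linked-list node; it defines no ordering (no __lt__)."""
    def __init__(self, val):
        self.val = val

a, b = Node(1), Node(1)

# Without a tie-breaker, equal values force Python to compare the nodes.
heap = []
heapq.heappush(heap, (a.val, a))
try:
    heapq.heappush(heap, (b.val, b))
    compared_nodes = False
except TypeError:
    compared_nodes = True  # '<' is not supported between Node instances

# With a unique index in the middle, the comparison stops before the node.
safe_heap = []
heapq.heappush(safe_heap, (a.val, 0, a))
heapq.heappush(safe_heap, (b.val, 1, b))

print(compared_nodes, len(safe_heap))
```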

```python
import heapq

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

    def __repr__(self):
        return f"{self.val} -> {self.next}"

def merge_k_sorted_lists(lists):
    # Min-heap to store (value, list index, node)
    heap = []

    # Step 1: Initialize the heap with the first element of each list
    for i, node in enumerate(lists):
        if node:  # Skip empty lists
            heapq.heappush(heap, (node.val, i, node))

    # Step 2: Merge lists
    dummy = ListNode(0)
    current = dummy

    while heap:
        val, i, node = heapq.heappop(heap)  # Get the smallest element
        current.next = node  # Add it to the merged list
        current = current.next
        if node.next:  # Add the next element from the same list
            heapq.heappush(heap, (node.next.val, i, node.next))

    return dummy.next

# Helper function to create linked lists from lists
def create_linked_list(arr):
    if not arr:
        return None
    head = ListNode(arr[0])
    current = head
    for val in arr[1:]:
        current.next = ListNode(val)
        current = current.next
    return head

# Create some lists
list1 = create_linked_list([1, 4, 5])
list2 = create_linked_list([1, 3, 4])
list3 = create_linked_list([2, 6])

lists = [list1, list2, list3]

# Run merge_k_sorted_lists
merged_list = merge_k_sorted_lists(lists)
print(merged_list)  # Output: 1 -> 1 -> 2 -> 3 -> 4 -> 4 -> 5 -> 6 -> None
```
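
If the inputs are plain sorted Python lists (or any sorted iterables) rather than linked lists, the standard library already offers `heapq.merge`, which performs the same heap-based k-way merge lazily. A minimal sketch, using the same values as the example:

```python
import heapq

# Plain Python lists standing in for the linked lists in the example
merged = list(heapq.merge([1, 4, 5], [1, 3, 4], [2, 6]))
print(merged)  # [1, 1, 2, 3, 4, 4, 5, 6]
```

`heapq.merge` returns an iterator, so wrapping it in `list(...)` is only needed when the whole result is wanted at once.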



## Top K Frequent Elements

### Problem:
Find the top `k` most frequent elements in an array.

### Approach:
1. Count the frequency of each element using a `Counter`.
2. Use a **min-heap** of size `k` to store the most frequent elements.
3. Iterate through the frequency map:
   - Add each element to the heap.
   - If the heap size exceeds `k`, remove the smallest frequency.

### Explanation:
- The heap keeps the top `k` elements based on frequency.
- At the end, we extract the elements from the heap for the result.

```python
from collections import Counter
import heapq

def top_k_frequent(nums, k):
    # Step 1: Count frequencies
    freq_map = Counter(nums)

    # Step 2: Use a min-heap to store (frequency, element)
    heap = []
    for num, freq in freq_map.items():
        heapq.heappush(heap, (freq, num))  # Push frequency first for comparison
        if len(heap) > k:
            heapq.heappop(heap)  # Remove the smallest frequency

    # Step 3: Extract the elements from the heap
    return [num for freq, num in heap]

# Example
nums = [1, 1, 1, 2, 2, 3]
k = 2
print(top_k_frequent(nums, k))  # Output: [2, 1] (heap order, not sorted by frequency)
```
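
When the result should come back ordered from most to least frequent, two standard-library shortcuts do the same job: `heapq.nlargest` with a `key`, and `Counter.most_common`. The sketch below is one possible alternative, not a replacement for the approach above; the names `by_heap` and `by_counter` are made up for illustration.

```python
from collections import Counter
import heapq

nums = [1, 1, 1, 2, 2, 3]
k = 2
freq_map = Counter(nums)

# nlargest ranks the distinct elements by their count, highest first
by_heap = heapq.nlargest(k, freq_map.keys(), key=freq_map.get)

# Counter.most_common returns (element, count) pairs, highest count first
by_counter = [num for num, _ in freq_map.most_common(k)]

print(by_heap, by_counter)  # [1, 2] [1, 2]
```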


## Find Median from Data Stream

### Problem:
Continuously find the median as numbers are added to a data stream.

### Approach:
1. Use two heaps:
   - A **max-heap** to store the smaller half of the numbers.
   - A **min-heap** to store the larger half of the numbers.
2. Balance the heaps such that:
   - The max-heap has at most one extra element compared to the min-heap.
3. The median is:
   - The root of the max-heap (odd total elements).
   - The average of the roots of both heaps (even total elements).

```python
import heapq

class MedianFinder:
    def __init__(self):
        self.small = []  # Max-heap (inverted min-heap)
        self.large = []  # Min-heap

    def addNum(self, num):
        heapq.heappush(self.small, -num)  # Add to max-heap
        # Balance: Move the largest from small to large
        heapq.heappush(self.large, -heapq.heappop(self.small))

        # Ensure small has more elements if odd total
        if len(self.small) < len(self.large):
            heapq.heappush(self.small, -heapq.heappop(self.large))

    def findMedian(self):
        if len(self.small) > len(self.large):  # Odd total
            return -self.small[0]
        return (-self.small[0] + self.large[0]) / 2  # Even total

# Example
finder = MedianFinder()
finder.addNum(1)
finder.addNum(2)
print(finder.findMedian())  # Output: 1.5
finder.addNum(3)
print(finder.findMedian())  # Output: 2
```



## How to Simulate a Max-Heap
Python’s `heapq` only supports min-heaps, but you can simulate a max-heap by negating values.

```python
import heapq

nums = [3, 2, 1, 5, 6, 4]
max_heap = [-num for num in nums]  # Negate all values
heapq.heapify(max_heap)

# Push and pop operations
heapq.heappush(max_heap, -7)
largest = -heapq.heappop(max_heap)  # Remember to negate back

print(largest)  # Output: 7
print([-num for num in max_heap])  # Output: [6, 5, 4, 3, 2, 1]
```
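
Negation only works for numbers. When the heap holds `(priority, item)` pairs, one common workaround is to negate just the priority and leave the item alone. The task names and priorities below are hypothetical data chosen for illustration.

```python
import heapq

# Hypothetical (priority, task name) pairs; a higher number is more urgent
tasks = [(2, "write tests"), (5, "fix outage"), (1, "update docs")]

# Negate only the numeric priority; the task name stays untouched.
max_heap = [(-priority, name) for priority, name in tasks]
heapq.heapify(max_heap)

neg_priority, name = heapq.heappop(max_heap)
print(-neg_priority, name)  # 5 fix outage
```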


## Another Max-Heap

```python
import heapq

my_list = [-4, 3, 1, 0, 2, 5, 10, 8, 12, 9]

# Negate every value so the min-heap behaves like a max-heap
my_list = [-x for x in my_list]

heapq.heapify(my_list)
print(my_list)  # Output: [-12, -9, -10, -8, -2, -5, -1, -3, 0, 4]

# Bonus peek: the root holds the negated maximum, so the real maximum is 12
print(my_list[0])  # Output: -12
```


## Heap Sort

```python
# Heap Sort
# Time: O(n log n), Space: O(n)
# Note: O(1) extra space is possible by swapping in place, but that version is more complex.

import heapq

def heap_sort(arr):
    heapq.heapify(arr)  # O(n); note that this mutates (and eventually empties) arr
    n = len(arr)
    new_list = [0] * n

    for i in range(n):
        smallest = heapq.heappop(arr)  # Named to avoid shadowing the built-in min()
        new_list[i] = smallest

    return new_list

print(heap_sort([1, 3, 5, 7, 9, 2, 4, 6, 8, 0]))  # Output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```
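
For reference, the in-place variant with O(1) extra space can be sketched as follows. It does not use `heapq`; instead it builds a max-heap inside the list and repeatedly swaps the root to the end. The function names `sift_down` and `heap_sort_in_place` are assumptions for illustration, and this is one possible formulation rather than a canonical one.

```python
def sift_down(arr, root, end):
    """Restore the max-heap property below `root`, looking only at indices < end."""
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        # Pick the larger of the two children
        if child + 1 < end and arr[child + 1] > arr[child]:
            child += 1
        if arr[root] >= arr[child]:
            return
        arr[root], arr[child] = arr[child], arr[root]
        root = child

def heap_sort_in_place(arr):
    n = len(arr)
    # Build a max-heap bottom-up, starting from the last parent
    for i in range(n // 2 - 1, -1, -1):
        sift_down(arr, i, n)
    # Move the current maximum to the end, then shrink the heap by one
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(arr, 0, end)
    return arr

data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
heap_sort_in_place(data)
print(data)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```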

# Build a Heap From Scratch (Sort Of)


```python
import heapq

# Time: O(n log n), since each of the n pushes costs O(log n)
# (heapify on the full list would be O(n))

my_list = [-5, 4, 2, 1, 7, 0, 3]
my_heap = []

for x in my_list:
    heapq.heappush(my_heap, x)
    print(my_heap)

# Output:
# [-5]
# [-5, 4]
# [-5, 4, 2]
# [-5, 1, 2, 4]
# [-5, 1, 2, 4, 7]
# [-5, 1, 0, 4, 7, 2]
# [-5, 1, 0, 4, 7, 2, 3]
```
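
To see what `heappush` is doing under the hood, here is a hand-written sketch of the "append, then sift up" step using the parent-index formula `(i - 1) // 2`. The function name `push` is an assumption for illustration; the result is compared against `heapq` on the same input.

```python
import heapq

def push(heap, item):
    """Append item, then sift it up while it is smaller than its parent."""
    heap.append(item)
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] >= heap[parent]:
            break
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

my_list = [-5, 4, 2, 1, 7, 0, 3]
manual, reference = [], []
for x in my_list:
    push(manual, x)
    heapq.heappush(reference, x)

print(manual)               # [-5, 1, 0, 4, 7, 2, 3]
print(manual == reference)  # True
```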


# Tuples

```python
import heapq
from collections import Counter

my_list = [5, 4, 3, 5, 4, 3, 5, 5, 4]
my_heap = []

# Count the number of occurrences of each value, e.g., 5 occurs 4 times
counter = Counter(my_list)

for k, v in counter.items():
    # Tuples compare element by element, so (k, v) is ordered by the value k first
    heapq.heappush(my_heap, (k, v))
    print(my_heap)

# Output:
# [(5, 4)]
# [(4, 3), (5, 4)]
# [(3, 2), (5, 4), (4, 3)]
```
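
Because tuples are compared from the first element onward, swapping the order to `(count, value)` makes the heap sort by frequency instead of by value. A minimal sketch on the same data (the name `freq_heap` is made up for illustration):

```python
import heapq
from collections import Counter

my_list = [5, 4, 3, 5, 4, 3, 5, 5, 4]
counter = Counter(my_list)

# Put the count first so the heap orders by frequency, not by value
freq_heap = [(count, value) for value, count in counter.items()]
heapq.heapify(freq_heap)

count, value = heapq.heappop(freq_heap)
print(value, count)  # 3 2  (3 is the least frequent value)
```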
