# Heap
A heap is a **complete binary tree** data structure that satisfies the heap property:
- in a **min heap**, the value of every node is less than or equal to the values of its children; in a **max heap**, it is greater than or equal to them.

Heaps are usually used to implement **priority queues**, where the smallest (or largest) element is always at the root of the tree.

### Binary Heap Representation
A binary heap is typically represented as an array.   
- The root element is at `arr[0]`.
- `arr[(i-1)/2]`: the parent of node `i` (integer division)
- `arr[2i+1]`: the left child of node `i`
- `arr[2i+2]`: the right child of node `i`
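
As a minimal sketch of the index arithmetic above (the helper names `parent`, `left`, and `right` are made up for this illustration), the relatives of a node can be read straight out of the array:

```python
def parent(i):
    # Integer division, so both children map back to the same parent
    return (i - 1) // 2

def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

# A max heap stored level by level: 9 is the root, 7 and 8 are its children
arr = [9, 7, 8, 3, 5]
i = 1  # the node holding 7
family = (arr[parent(i)], arr[left(i)], arr[right(i)])  # (9, 3, 5)
```

Callers still have to check that `left(i)` and `right(i)` are smaller than the heap size before indexing.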

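### Two Important Heap Operations
The descriptions below assume a max heap.
- `heapInsert(int i)`: after a new element is placed at position `i` (the end of the heap), move it up past smaller parents until the heap property holds again (look up)
- `heapify(int i)`: after the value of `arr[i]` changes, move it down past larger children until the heap property holds again (look down)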

# Heap sort
Using the heap property, we get an efficient sorting algorithm.

### Algorithm
- (1) Transform the whole array into a **max heap** by calling `heapify` on every node, from bottom to top
- (2) Swap the root of the heap (the current maximum) with the last element of the heap, `size--`, then `heapify(root)`
- (3) Repeat step (2) until `size == 0`

### Time Complexity: O(nlogn)     
- Building the heap from bottom to top takes O(n) time (step 1)
  - n/2 + 2 * n/4 + 3 * n/8 + 4 * n/16 + ... = O(n)
- Each execution of step 2 takes O(logn) time, and steps 2 and 3 run it n times, for O(nlogn) in total
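
The O(n) bound for step 1 can be checked with a short sum. As a sketch, assume a node that sits h levels above the bottom (leaves are h = 1) costs at most h units of work, and that there are about n/2^h such nodes:

$$
\sum_{h \ge 1} h \cdot \frac{n}{2^h} \;=\; n \sum_{h \ge 1} \frac{h}{2^h} \;=\; 2n \;=\; O(n)
$$

The last step uses the standard identity $\sum_{h \ge 1} h / 2^h = 2$.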

In [36]:
def heapsort(nums):
    heapsize = len(nums)
    # Step1: Transform the array into a MAX Heap
    for i in range(heapsize - 1, -1, -1):
        heapify(nums, i, heapsize)
    # Steps 2 and 3: move the current max to the end, shrink the heap, and restore the heap property
    while heapsize > 1:
        heapsize -= 1
        nums[0], nums[heapsize] = nums[heapsize], nums[0]
        heapify(nums, 0, heapsize)

def heapify(nums, i, size):
    l = i * 2 + 1
    while l < size:
        larger_child = l + 1 if l + 1 < size and nums[l + 1] > nums[l] else l
        if nums[larger_child] <= nums[i]:
            break
        nums[i], nums[larger_child] = nums[larger_child], nums[i]
        i = larger_child
        l = i * 2 + 1

# heapinsert is not used in our heapsort algorithm
def heapinsert(nums, i):
    # Stop at the root: in Python, (0 - 1) // 2 == -1 would index the last element
    while i > 0 and nums[i] > nums[(i - 1) // 2]:
        nums[i], nums[(i - 1) // 2] = nums[(i - 1) // 2], nums[i]
        i = (i - 1) // 2

mynums = [1, 4, 5, 7, 2, 3, 6, 9, 0, 8]
print(mynums)
heapsort(mynums)
print(mynums)

[1, 4, 5, 7, 2, 3, 6, 9, 0, 8]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
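
Although heapsort does not need it, the "look up" operation gives a second way to build the max heap: insert the elements one by one. The sketch below is self-contained and uses hypothetical names (`sift_up`, `build_by_insertion`, `is_max_heap`). Building this way costs O(nlogn) in the worst case, compared with O(n) for the bottom-up `heapify` pass.

```python
def sift_up(nums, i):
    # Move nums[i] up while it is larger than its parent (max heap)
    while i > 0 and nums[i] > nums[(i - 1) // 2]:
        p = (i - 1) // 2
        nums[i], nums[p] = nums[p], nums[i]
        i = p

def build_by_insertion(nums):
    # Treat nums[0..i-1] as a heap and insert nums[i] into it
    for i in range(len(nums)):
        sift_up(nums, i)

def is_max_heap(nums):
    # Every non-root node must be no larger than its parent
    return all(nums[(i - 1) // 2] >= nums[i] for i in range(1, len(nums)))

data = [1, 4, 5, 7, 2, 3, 6, 9, 0, 8]
build_by_insertion(data)
```

After the call, `is_max_heap(data)` holds and the maximum sits at `data[0]`.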


# A heap can be used to dynamically keep track of the min/max element of an array
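
A minimal sketch of this idea with Python's built-in `heapq` module, which is the module the questions below use. `heapq` only provides a min heap, so the max variant here stores negated values; the variable names are made up for the illustration.

```python
import heapq

values = [5, 3, 8, 1]

# Min heap: the smallest element is always at index 0
min_heap = []
for x in values:
    heapq.heappush(min_heap, x)
smallest = heapq.heappop(min_heap)   # removes and returns the minimum
next_smallest = min_heap[0]          # the new minimum is available at once

# Max heap: push negated values and negate again when reading
max_heap = [-x for x in values]
heapq.heapify(max_heap)
largest = -heapq.heappop(max_heap)
next_largest = -max_heap[0]
```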

---
### Q1: Merge k Sorted Lists (LC.23)
*You are given an array of k linked-lists lists, each linked-list is sorted in ascending order.*         
*Merge all the linked-lists into one sorted linked-list and return it.*

In [39]:
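import heapq

# ListNode is normally provided by LeetCode; a minimal stand-in so the cell runs on its own
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next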

class Solution:
    def mergeKLists(self, lists):
        heap = []
        
        for head in lists:
            if head:
                # Add an id as a tie-breaker: if two nodes have the same value (val), heapq falls back
                # to comparing the next tuple element, and ListNode objects cannot be compared.
                heapq.heappush(heap, (head.val, id(head), head))
    
        dummy = ListNode(0)
        current = dummy
        
        while heap:
            val, i, node = heapq.heappop(heap)
            
            current.next = node
            current = current.next
            
            if node.next:
                heapq.heappush(heap, (node.next.val, i, node.next))
        
        return dummy.next
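
The tie-breaker matters because tuples are compared element by element. The sketch below uses a hypothetical `Node` class in place of `ListNode` and a counter from `itertools.count` in place of `id(...)`; any value that is unique among the entries in the heap works the same way.

```python
import heapq
from itertools import count

class Node:
    # Hypothetical stand-in for ListNode; it defines no ordering
    def __init__(self, val):
        self.val = val

a, b = Node(1), Node(1)

# Without a tie-breaker, equal values force a comparison of the Node objects
unsafe = []
heapq.heappush(unsafe, (a.val, a))
try:
    heapq.heappush(unsafe, (b.val, b))
    failed = False
except TypeError:
    failed = True

# With a unique second element, the comparison never reaches the Node
tie = count()
safe = []
for node in (a, b):
    heapq.heappush(safe, (node.val, next(tie), node))
first = heapq.heappop(safe)[2]
```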


---
### Q2: Maximum Overlap of Intervals
*Given a set of intervals, represented as pairs of start and end points, determine the maximum number of intervals that overlap at any single point.*

*Example:*        
*Input: [(1, 5), (2, 6), (3, 7), (4, 8), (9, 10)]*        
*Output: 4*              
*Explanation: The maximum overlap occurs between the intervals (1, 5), (2, 6), (3, 7), (4, 8).*            

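Key thoughts: the point of maximum overlap can always be taken to be the left boundary of some line segment, so it is enough to count the overlap at each left boundary.
- We first sort the line segments by their left boundary
- We also need a minHeap, which stores the right boundaries of segments
- Then we iterate through the sorted array, and for each segment `i` with boundaries `li` and `ri`:
  - we pop all elements in the minHeap that are less than or equal to `li` (those segments end before segment `i` starts; segments that only touch at an endpoint are not counted as overlapping).
  - Then we push `ri` into the minHeap
  - Now every right boundary left in the minHeap belongs to a segment that overlaps with segment `i`, so the heap size is the overlap count at `li`.
- Finally we take the maximum heap size over all segments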

In [52]:
import heapq

def maxOverlap(intervals):
    # Sort intervals by start time
    intervals.sort(key=lambda x: x[0])
    
    heap = []  # Min-heap to store end times
    ans = 0    # Variable to track the maximum overlap
    
    for interval in intervals:
        # Remove intervals from the heap that end before the current interval starts
        while heap and heap[0] <= interval[0]:
            heapq.heappop(heap)
        
        # Add the current interval's end time to the heap
        heapq.heappush(heap, interval[1])
        
        # Update the maximum overlap
        ans = max(ans, len(heap))
    
    return ans
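
For small inputs, the key thought can be checked directly with a brute-force count at every left boundary. This is an O(n^2) sketch for cross-checking only; the name `max_overlap_bruteforce` is made up, and it uses the same convention as the heap version, where a segment ending exactly at a start point does not overlap it.

```python
def max_overlap_bruteforce(intervals):
    best = 0
    for s, _ in intervals:
        # Count the segments that cover the start point s
        covering = sum(1 for l, r in intervals if l <= s < r)
        best = max(best, covering)
    return best

example = [(1, 5), (2, 6), (3, 7), (4, 8), (9, 10)]
result = max_overlap_bruteforce(example)  # 4, reached at the start point 4
```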

---
### Q3: Minimum Operations To Halve Array Sum (LC.2208)
*You are given an array nums of positive integers. In one operation, you can choose any number from nums and reduce it to exactly half the number. (Note that you may choose this reduced number in future operations.)*           

*Return the minimum number of operations to reduce the sum of nums by at least half.*

In [59]:
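import heapq

class Solution:
    def halveArray(self, nums):
        target = sum(nums) / 2
        # heapq is a min heap, so store negated values to simulate a max heap
        max_heap = [-num for num in nums]
        heapq.heapify(max_heap)
        num_op = 0
        total_reduced = 0
        while total_reduced < target:
            # Greedy: halving the current largest number removes the most from the sum
            cur = -heapq.heappop(max_heap)
            total_reduced += cur / 2
            # The halved number may be chosen again, so push it back
            heapq.heappush(max_heap, -cur / 2)
            num_op += 1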

        return num_op
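
To see the greedy choice at work, the sketch below records which number is halved in each operation. The function name `halve_trace` is hypothetical; the logic mirrors the solution above.

```python
import heapq

def halve_trace(nums):
    target = sum(nums) / 2
    max_heap = [-x for x in nums]  # negated values simulate a max heap
    heapq.heapify(max_heap)
    picked, reduced = [], 0
    while reduced < target:
        cur = -heapq.heappop(max_heap)
        picked.append(cur)         # the number halved in this operation
        reduced += cur / 2
        heapq.heappush(max_heap, -cur / 2)
    return picked

trace = halve_trace([5, 19, 8, 1])  # [19, 9.5, 8]
```

The sum is 33, so at least 16.5 must be removed: halving 19, then 9.5, then 8 removes 9.5 + 4.75 + 4 = 18.25 in three operations.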

---
### Q4: Meeting Rooms II (LC.253)
*Given an array of meeting time intervals intervals where intervals[i] = [starti, endi], return the minimum number of conference rooms required.*     

**Solution:**        
We only need a new meeting room when a meeting is about to start but none of the previous meetings has ended.  
Therefore, we need a data structure that keeps track of the end times of previous meetings.  
Because the order of the previous meetings doesn't matter and only their end times do, we use a min heap to record the end times of previous meetings.  
Each time we have a new meeting, we can tell whether a meeting room is available by checking the top of the heap, since it holds the earliest end time.  
If even the earliest-ending meeting has not ended, then we definitely need a new meeting room.  
Before we start iterating, we need to sort the array by **start time**, corresponding to the way we check the heap:
- each time, we compare a start time to the earliest end time, so we must also process the meetings from the earliest start time onward; otherwise a freed room could go to a later meeting while an earlier one still needed it, and we would "waste" an end time

In [6]:
import heapq

class Solution:
    def minMeetingRooms(self, intervals):
        sorted_intervals = sorted(intervals, key=lambda x: x[0])
        # A MinHeap to keep track of meeting end time
        minHeap = []
        heapq.heapify(minHeap)
        ans = 0
        for interval in sorted_intervals:
            # If this is the first meeting or no meeting has ended by the time this one starts, a new room is needed
            if not minHeap or minHeap[0] > interval[0]:
                ans += 1
            else:
                # minHeap[0] <= interval[0], meaning that a meeting has already ended
                # since we reuse the room that became available, pop its old end time
                heapq.heappop(minHeap)
            heapq.heappush(minHeap, interval[1])

        return ans
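
An equivalent way to read the answer: the heap never shrinks, so its final size is the number of rooms opened. The sketch below is a compact variant under that observation, not a replacement for the solution above. The name `min_meeting_rooms` is made up, and `heapq.heapreplace` pops the smallest end time and pushes the new one in a single call.

```python
import heapq

def min_meeting_rooms(intervals):
    ends = []  # min heap of end times, one entry per room in use
    for start, end in sorted(intervals):
        if ends and ends[0] <= start:
            # The earliest-ending meeting is over: reuse its room
            heapq.heapreplace(ends, end)
        else:
            # Every room is still busy: open a new one
            heapq.heappush(ends, end)
    return len(ends)

rooms = min_meeting_rooms([[0, 30], [5, 10], [15, 20]])  # 2
```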

---
### Q5: Meeting Rooms III (LC.2402)
*You are given an integer n. There are n rooms numbered from 0 to n - 1.*

*You are given a 2D integer array meetings where meetings[i] = [starti, endi] means that a meeting will be held during the half-closed time interval [starti, endi). All the values of starti are unique.*

*Meetings are allocated to rooms in the following manner:*

- *Each meeting will take place in the unused room with the lowest number.*
- *If there are no available rooms, the meeting will be delayed until a room becomes free. The delayed meeting should have the same duration as the original meeting.*
- *When a room becomes unused, meetings that have an earlier original start time should be given the room.*

*Return the number of the room that held the most meetings. If there are multiple rooms, return the room with the lowest number.*

*A half-closed interval [a, b) is the interval between a and b including a and not including b.*

**Solution:**      
The difference between this and the previous question is that when a new meeting starts, we do not use the room that became available first. Instead, we use the room with the smallest room number among all available rooms.
Therefore, we need two heaps:
- `endTimes`: keeps track of the end times of ongoing meetings together with their room numbers
- `availableRooms`: keeps track of the available room numbers

In [11]:
import heapq

class Solution:
    def mostBooked(self, n, meetings):
        meetings.sort()
        cnts = [0] * n  # Track the number of meetings per room
        availableRooms = list(range(n))  # Min-heap for available rooms
        heapq.heapify(availableRooms)
        endTimes = []  # Min-heap of (end time, room number) for rooms in use

        for start, end in meetings:
            # endTimes[0][0] <= start means that a room has become free by the time the new meeting starts
            # Free up all such rooms (not just the earliest one) so that the lowest-numbered free room can be chosen
            while endTimes and endTimes[0][0] <= start:
                heapq.heappush(availableRooms, heapq.heappop(endTimes)[1])
            
            # If any room is available, use the one with the lowest number
            if availableRooms:
                room = heapq.heappop(availableRooms)
                heapq.heappush(endTimes, (end, room))
            else:
                # No room is available, so delay the meeting until the earliest room frees up
                # (ties on end time are broken by the lower room number); the duration stays end - start
                earliestEnd, room = heapq.heappop(endTimes)
                heapq.heappush(endTimes, (earliestEnd + (end - start), room))

            cnts[room] += 1  # Track meeting count per room

        return cnts.index(max(cnts))
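
To make the delay rule concrete, the sketch below returns the full schedule instead of only the busiest room. The function name `room_schedule` and the `(room, actual start, actual end)` output format are assumptions made for this illustration; the two-heap logic is the same as in the solution above.

```python
import heapq

def room_schedule(n, meetings):
    free = list(range(n))  # min heap of free room numbers
    heapq.heapify(free)
    busy = []              # min heap of (end time, room number)
    schedule = []
    for start, end in sorted(meetings):
        # Release every room whose meeting has ended by `start`
        while busy and busy[0][0] <= start:
            heapq.heappush(free, heapq.heappop(busy)[1])
        if free:
            room, begin = heapq.heappop(free), start
        else:
            # Delayed: start when the earliest room frees up
            begin, room = heapq.heappop(busy)
        finish = begin + (end - start)  # the duration is preserved
        heapq.heappush(busy, (finish, room))
        schedule.append((room, begin, finish))
    return schedule

plan = room_schedule(2, [[0, 10], [1, 5], [2, 7], [3, 4]])
# [(0, 0, 10), (1, 1, 5), (1, 5, 10), (0, 10, 11)]
```

In this run, the third meeting `[2, 7)` is delayed to `[5, 10)` in room 1 and the fourth, `[3, 4)`, to `[10, 11)` in room 0. Both rooms hold two meetings, so the answer is room 0.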