<div style="line-height:0.5">
<h1 style="color:#709594"> Heaps usage </h1>
<span style="display: inline-block;">
    <h3 style="color: lightblue; display: inline;">Keywords:</h3> heapq functions: nlargest + merge + heappush + heappop + heappushpop + heapify
</span>
</div>

In [8]:
from copy import copy
import heapq
from collections import Counter

<h3 style="color:#709594"> Example #1: Finding the K Largest Elements in a List </h3>

In [4]:
def k_largest_elements(nums, k):
    return heapq.nlargest(k, nums)

nums = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
k_largest_elements(nums, 3)

[9, 6, 5]

<h3 style="color:#709594"> Example #2: Merging Sorted Iterables </h3>

In [5]:
list1 = [1, 3, 5, 7]
list2 = [2, 4, 6, 8]
merged_list = list(heapq.merge(list1, list2))
merged_list

[1, 2, 3, 4, 5, 6, 7, 8]
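
As a small extension of the merge example, the sketch below uses the `reverse` and `key` parameters of `heapq.merge`. The input lists and the variable names `descending` and `by_length` are made up for illustration. Note that `merge` returns a lazy iterator, which is why it is wrapped in `list()`.

```python
import heapq

# Inputs sorted in descending order need reverse=True to merge correctly
descending = list(heapq.merge([7, 5, 1], [8, 4, 2], reverse=True))
print(descending)  # [8, 7, 5, 4, 2, 1]

# key picks the value used for comparison (here, the string length)
by_length = list(heapq.merge(["a", "ccc"], ["bb", "dddd"], key=len))
print(by_length)  # ['a', 'bb', 'ccc', 'dddd']
```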

<h3 style="color:#709594"> Example #3: Heap Sort Implementation </h3>

In [6]:
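def heapsort(iterable):
    h = []
    # Push every value onto the heap
    for value in iterable:
        heapq.heappush(h, value)
    # Pop repeatedly: the smallest remaining value comes out each time
    return [heapq.heappop(h) for _ in range(len(h))]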

heapsort([3, 1, 4, 1, 5, 9, 2, 6, 5, 3])

[1, 1, 2, 3, 3, 4, 5, 5, 6, 9]

<h4 style="color:#709594"> Note: </h4>
<div style="margin-top: -9px;">
The heapq functions used here maintain a min-heap, where the smallest element sits at the root of the heap (index 0).<br>
However, some problems need a max-heap, where the largest element sits at the root.<br>
Negating the values before pushing them (using -) inverts the order, so the min-heap behaves like a max-heap. Negate again when reading a value back.
</div>
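
A minimal sketch of the negation trick described in the note. The names `values`, `max_heap`, `largest`, and `next_largest` are illustrative only.

```python
import heapq

values = [3, 1, 4, 1, 5]

# Store negated values: the smallest stored entry is the largest original value
max_heap = [-v for v in values]
heapq.heapify(max_heap)

# Negate again on the way out to recover the original value
largest = -heapq.heappop(max_heap)
# Peek at the new root without removing it
next_largest = -max_heap[0]
print(largest, next_largest)  # 5 4
```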

<h3 style="color:#709594"> Example #4: Implementing a Priority Queue </h3>

In [11]:
class PriorityQueue:
    """
    Since heapq maintains a min-heap, the negated priority (-priority) is stored, so that an item with a
    higher priority (a larger number) is treated as smaller in the heap's ordering and comes out first.
    The running _index breaks ties: items with equal priority are popped in insertion order,
    and the items themselves never need to be compared.
    """
    def __init__(self):
        self._queue = []
        self._index = 0

    def push(self, item, priority):
        heapq.heappush(self._queue, (-priority, self._index, item))
        self._index += 1

    def pop(self):
        return heapq.heappop(self._queue)[-1]

pq = PriorityQueue()
pq.push('task1', 1)
pq.push('task2', 3)
pq.push('task3', 2)

print(pq.pop())

task2
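
The sketch below illustrates why the index is part of each heap entry. It works on plain tuples rather than the class above, and the dict items are hypothetical. Dicts do not support `<`, so without the index two entries with equal priority would raise a `TypeError` when compared.

```python
import heapq

queue = []
tasks = [({"name": "a"}, 2), ({"name": "b"}, 2)]  # same priority for both items

for index, (item, priority) in enumerate(tasks):
    # The index settles the comparison before the dicts are ever reached
    heapq.heappush(queue, (-priority, index, item))

first = heapq.heappop(queue)[-1]
second = heapq.heappop(queue)[-1]
print(first, second)  # {'name': 'a'} {'name': 'b'}
```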


<h3 style="color:#709594"> Example #5: Finding the K Smallest Sum Combinations </h3>
<div style="margin-top: -9px;">

=> heappushpop() pushes an item and then pops the smallest one, more efficiently than a heappush() followed by a separate heappop().
</div>
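
A short sketch comparing the two approaches on identical heaps. The names `heap_a`, `heap_b`, and the `popped_*` variables are illustrative.

```python
import heapq

heap_a = [1, 3, 5]
heap_b = [1, 3, 5]
heapq.heapify(heap_a)
heapq.heapify(heap_b)

# Two separate calls
heapq.heappush(heap_a, 4)
popped_a = heapq.heappop(heap_a)

# One combined call
popped_b = heapq.heappushpop(heap_b, 4)

print(popped_a, popped_b)              # 1 1
print(sorted(heap_a), sorted(heap_b))  # [3, 4, 5] [3, 4, 5]

# If the new item is smaller than the root, it is returned immediately
# and the heap is left unchanged
popped_small = heapq.heappushpop(heap_b, 0)
print(popped_small)  # 0
```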

In [19]:
def k_smallest_pairs(nums1, nums2, k):
    """
    Return k pairs (i, j), with i from nums1 and j from nums2, that have the smallest sums.
    Heap entries are tuples, which Python compares element by element: the first element
    (the negated sum) decides the order, and the later elements only break ties.
    """
    if not nums1 or not nums2:
        # If either input list is empty, there are no pairs to return
        return []

    # The early exit below relies on nums2 being in ascending order
    nums2 = sorted(nums2)
    heap = []
    for i in nums1:
        for j in nums2:
            if len(heap) < k:
                # Fewer than k pairs so far: push one 3-tuple (negated sum, i, j) as a single heap entry.
                # Negating the sum makes the min-heap behave as a max-heap on the sum.
                heapq.heappush(heap, (-(i + j), i, j))
            elif heap and -heap[0][0] > i + j:
                # heap[0] holds the largest sum kept so far (negated back to positive here).
                # The current pair has a smaller sum, so it replaces that entry.
                heapq.heappushpop(heap, (-(i + j), i, j))
            else:
                # The sum is not smaller, and later values of j can only increase it,
                # so skip the rest of nums2 for this i
                break
    # Drop the negated sums and return the pairs (in heap order, not sorted by sum)
    return [(x[1], x[2]) for x in heap]


nums1 = [1, 7, 11]
nums2 = [2, 6, 4]
print(k_smallest_pairs(nums1, nums2, 3))

[(1, 6), (1, 2), (1, 4)]


<h3 style="color:#709594"> Example #6: Computing the Running Median </h3>

In [12]:
def get_running_median(stream):
    # Initialize two heaps: min_heap for larger half, max_heap for smaller half
    min_heap, max_heap = [], []  
    medians = []
    for number in stream:
        # Push number onto min_heap, pop its smallest value, and move that value to max_heap (negated)
        heapq.heappush(max_heap, -heapq.heappushpop(min_heap, number))
        # Keep min_heap at least as large as max_heap
        if len(max_heap) > len(min_heap):
            heapq.heappush(min_heap, -heapq.heappop(max_heap))
        # max_heap stores negated values, so min_heap[0] - max_heap[0] is the sum of the two middle values
        medians.append((min_heap[0] - max_heap[0]) / 2.0 if len(min_heap) == len(max_heap) else min_heap[0])
    return medians

stream = [2, 1, 5, 7, 2, 0, 5]
get_running_median(stream)

[2, 1.5, 2, 3.5, 2, 2.0, 2]
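
As a sanity check, the same medians can be computed by brute force with `statistics.median` on each prefix of the stream. This is only a cross-check sketch: it re-sorts every prefix, so it is much slower than the two-heap approach on long streams. The name `expected` is illustrative.

```python
import statistics

stream = [2, 1, 5, 7, 2, 0, 5]

# Median of the first 1, 2, ..., n elements
expected = [statistics.median(stream[:i]) for i in range(1, len(stream) + 1)]
print(expected)  # [2, 1.5, 2, 3.5, 2, 2.0, 2]
```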

<h3 style="color:#709594"> Example #7: Finding the Closest Points to the Origin </h3>

In [None]:
def k_closest_points(points, k):
    heap = []
    for (x, y) in points:
        # Negate the squared distance so the farthest kept point sits at the heap root
        dist = -(x**2 + y**2)
        # If the heap is at capacity (k items)
        if len(heap) == k:
            # Push the current point, then pop the farthest of the k + 1 candidates
            heapq.heappushpop(heap, (dist, x, y))
        else:
            # Add the current point to the heap
            heapq.heappush(heap, (dist, x, y))
    # Drop the distances and return the points (in heap order, not sorted by distance)
    return [(x, y) for (_, x, y) in heap]
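
The cell above has no sample run, so here is a hedged cross-check that solves the same problem with `heapq.nsmallest` and a distance key. The function name `k_closest_points_nsmallest` and the sample points are hypothetical. Unlike the heap-based version, `nsmallest` returns the points ordered from nearest to farthest.

```python
import heapq

def k_closest_points_nsmallest(points, k):
    # Rank points by squared distance to the origin; the square root is not needed for ordering
    return heapq.nsmallest(k, points, key=lambda p: p[0] ** 2 + p[1] ** 2)

points = [(1, 3), (-2, 2), (5, 8), (0, 1)]
closest = k_closest_points_nsmallest(points, 2)
print(closest)  # [(0, 1), (-2, 2)]
```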

<h3 style="color:#709594"> Example #8: Finding Frequent Elements </h3>

In [None]:
def top_k_frequent(nums, k):
    if k == len(nums):
        # Shortcut: assuming k never exceeds the number of distinct values,
        # k == len(nums) means every element is distinct, so nums is already the answer
        return nums
    # Count the frequency of each element in nums
    count = Counter(nums)
    # Pick the k distinct elements with the highest counts
    return heapq.nlargest(k, count.keys(), key=count.get)
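
The cell above has no sample run. As a hedged alternative sketch, `Counter.most_common` gives the same kind of result directly. The function name `top_k_frequent_most_common` and the sample input are hypothetical.

```python
from collections import Counter

def top_k_frequent_most_common(nums, k):
    # most_common(k) returns (value, count) pairs, highest count first
    return [value for value, _ in Counter(nums).most_common(k)]

result = top_k_frequent_most_common([1, 1, 1, 2, 2, 3], 2)
print(result)  # [1, 2]
```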

<h3 style="color:#709594"> Example #9: Reorganizing String to Avoid Adjacent Duplicates </h3>

In [20]:
def reorganize_string(s):
    # Count the frequency of each character in the string
    count = Counter(s)  
    # Create a max heap (using negative counts) of characters and their frequencies
    max_heap = [(-cnt, char) for char, cnt in count.items()]
    # Convert the list into a heap
    heapq.heapify(max_heap)  
    prev_cnt, prev_char = 0, ''  
    result = ''
    while max_heap:
        # Pop the most frequent character
        cnt, char = heapq.heappop(max_heap)  
        # Add the most frequent character to the result string
        result += char  
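        if prev_cnt < 0:
            # The previous character still has occurrences left (counts are negative), so push it back.
            # Holding it out for one round is what prevents adjacent duplicates.
            heapq.heappush(max_heap, (prev_cnt, prev_char))
        # Remember the current character with one occurrence used up (cnt + 1 moves toward 0)
        prev_cnt, prev_char = cnt + 1, char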
    # Return the result if it's valid. Otherwise, return an empty string
    return result if len(result) == len(s) else ""

s = "aabdasda"
print(reorganize_string(s))

adabadas


<h3 style="color:#709594"> Example #10: Finding the Kth Largest Element </h3>
<div style="margin-top: -9px;">
Given an integer array nums and an integer k, return the kth largest element in the array.<br>
Note that it is the kth largest element in sorted order, not the kth distinct element.<br>
Can you solve it without sorting?
</div>

In [10]:
def findKthLargest(nums, k):
    # Solution 1: sort in descending order => complexity O(n * log(n))
    # nums.sort(reverse=True)
    # return nums[k-1]
    #
    # Solution 2: remove the maximum k times => complexity O(k * n)
    # temp = copy(nums)
    # for _ in range(k):
    #     max_el = max(temp)
    #     temp.remove(max_el)
    # return max_el
    #
    # Solution 3: min-heap of size k => complexity O(n * log(k))
    # Push each value onto the min-heap.
    # Whenever the heap grows beyond k elements, remove the smallest one.
    # After the loop the heap holds the k largest values, so its root is the kth largest.
    min_heap = []
    for x in nums:
        heapq.heappush(min_heap, x)
        if len(min_heap) > k:
            heapq.heappop(min_heap)

    return min_heap[0]

nums = [2,3,4,5,1,6,5]
k = 2
res = findKthLargest(nums, k)
res

5
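
For comparison, a compact sketch of the same idea using `heapq.nlargest`, which was introduced in Example #1. The function name `kth_largest_nlargest` is hypothetical.

```python
import heapq

def kth_largest_nlargest(nums, k):
    # nlargest returns the k largest values in descending order,
    # so the last one is the kth largest
    return heapq.nlargest(k, nums)[-1]

kth = kth_largest_nlargest([2, 3, 4, 5, 1, 6, 5], 2)
print(kth)  # 5
```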