# Background
- pg 143
- Big-O
    * heappush(heap, item): O(log n)
    * heappop(heap): O(log n)
    * peek (heap[0]): O(1)
    * heapify(list): O(n)
      - Better than a full sort, which is typically O(n log n).
      - Converts an unordered list into a valid heap in place. It works bottom-up, starting from the lowest non-leaf nodes and sifting each one down. Most nodes sit near the bottom of the tree and only have a short distance to sift, so the total work summed over all nodes is linear.
- Key question: How do I know whether to use a min heap or a max heap?
    * Ex: if we want the k largest (or k most frequent) items,
      - Key question: what does the heap contain?
          * max_heap: if the heap contains all n items and we pop k times to get the final result. Time O(n + k log n), space O(n).
          * min_heap: if the heap contains only the top k items seen so far. This is more space efficient (O(k)), but the run time is generally worse: O(n log k).
          * Motto (when the heap is capped at size k):
            - if k largest / most frequent --> min heap
            - if k smallest / least frequent --> max heap
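
The motto above can be sketched with plain numbers. This is a minimal illustration, not taken from the source: `k_largest` is a hypothetical helper name, and it assumes the inputs are mutually comparable values. The min heap is capped at size k, so its root is always the weakest value still kept.

```python
import heapq

def k_largest(nums, k):
    """Return the k largest values in descending order using a size-k min heap."""
    if k <= 0:
        return []
    min_heap = []
    for x in nums:
        if len(min_heap) < k:
            heapq.heappush(min_heap, x)      # O(log k)
        elif x > min_heap[0]:
            # x beats the weakest kept value, so swap it in for the root
            heapq.heapreplace(min_heap, x)   # O(log k)
    return sorted(min_heap, reverse=True)    # O(k log k)

print(k_largest([5, 2, 15, 8, 3], 2))  # [15, 8]
```

For the k smallest values, the same loop works on negated values, which turns the capped heap into a max heap.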

In [1]:
import heapq

# Python default heap is a min heap
print('MIN HEAP')
a = [5,2,15]
heapq.heapify(a) 
print(a) # a[0] is the min value


print('\nMAX HEAP')
b = [6, 2, 8, 3]
c = [-i for i in b]
heapq.heapify(c) # O(n)
print(c)
-c[0]  # negate again to read the max value

MIN HEAP
[2, 5, 15]

MAX HEAP
[-8, -3, -6, -2]


8
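
The same negation trick carries over to pushes and pops. A minimal sketch, assuming numeric values (negation does not work for strings or other non-numeric items):

```python
import heapq

max_heap = []
for x in [6, 2, 8, 3]:
    heapq.heappush(max_heap, -x)    # store negated values

largest = -heapq.heappop(max_heap)  # negate again on the way out
second = -heapq.heappop(max_heap)
print(largest, second)  # 8 6
```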

### K Most Frequent Strings
- Find the k most frequent strings in an array, and return them sorted by frequency in descending order. If two strings have the same frequency, sort them in lexicographical order.
- Ex:
    * strs = ["go", "coding", "byte", "byte", "go", "interview", "go"]  k=2
    * res = ["go", "byte"]

In [9]:
import heapq
from collections import Counter

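class Pair:
    """Heap entry ordered so that the weakest candidate sits at the root."""
    def __init__(self, word, cnt):
        self.word, self.cnt = word, cnt

    def __lt__(self, other):
        if self.cnt == other.cnt:
            # On ties, the lexicographically larger word is evicted first
            return self.word > other.word
        return self.cnt < other.cnt

def k_most_freqs_via_minheap(strs, k):
    word_to_cnt = Counter(strs)  # O(n)

    min_heap = []  # because we want to remove the weakest candidate first

    for s, cnt in word_to_cnt.items():  # O(n) iterations
        heapq.heappush(min_heap, Pair(s, cnt))  # O(log(k)) bc the heap has at most k + 1 elements

        if len(min_heap) > k:
            heapq.heappop(min_heap)  # O(log(k))

    # Pops come out weakest-first, so reverse to get descending frequency
    res = [heapq.heappop(min_heap).word for _ in range(len(min_heap))]  # O(k * log(k))
    res.reverse()
    return res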

strs = ["go", "coding", "byte", "byte", "go", "interview", "go"]

k_most_freqs_via_minheap(strs, 2)

['go', 'byte']

In [8]:
import heapq
from collections import Counter 

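# n = num of words

def k_most_freq_via_maxheap(strs, k):

    # RT: O(n)
    freqs = Counter(strs)

    # RT: O(n). Negating the count puts the most frequent word at the root;
    # on ties the tuple falls back to the word, so the lexicographically smaller word wins.
    max_heap = [(-cnt, w) for w, cnt in freqs.items()]

    # RT: O(n)
    heapq.heapify(max_heap)

    # RT: O(k * log(n)); min() guards against k exceeding the number of distinct words
    return [heapq.heappop(max_heap)[1] for _ in range(min(k, len(max_heap)))]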
        

strs = ["go", "coding", "byte", "byte", "go", "interview", "go"]

k_most_freq_via_maxheap(strs, 2)

['go', 'byte']
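
A sort-based reference is handy for cross-checking the heap versions, especially the tie-breaking rule. This is a sketch, not one of the heap approaches: `k_most_freq_via_sort` is a hypothetical name, and it runs in O(n log n) because it sorts every distinct word.

```python
from collections import Counter

def k_most_freq_via_sort(strs, k):
    freqs = Counter(strs)
    # Sort by descending count, then by ascending word to break ties
    ranked = sorted(freqs, key=lambda w: (-freqs[w], w))
    return ranked[:k]

strs = ["go", "coding", "byte", "byte", "go", "interview", "go"]
print(k_most_freq_via_sort(strs, 3))  # ['go', 'byte', 'coding']
```

With k=3, "coding" and "interview" both appear once, so the lexicographically smaller "coding" is the one returned.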

### Combine Sorted Linked List
- Given k singly linked lists, each sorted in ascending order, combine them into one sorted linked list.

In [21]:
class ListNode:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

    # heapq orders items with <, so define a strict comparison on value
    def __lt__(self, other):
        return self.value < other.value

In [23]:
from typing import List
import heapq

def combine_sorted_linked_list(lists: List[ListNode]) -> ListNode:
    
    # Aim to have heap have a member from each of the lists
    heap = []

    for head in lists:
        if head:
            heapq.heappush(heap, head)

    dummy = ListNode(-1, None)  # placeholder head; the result starts at dummy.next

    # curr points to the tail of our result linked list
    curr = dummy

    while heap:
        smallest_node = heapq.heappop(heap)

        curr.next = smallest_node
        curr = curr.next

        if smallest_node.next:
            heapq.heappush(heap, smallest_node.next)

    return dummy.next
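
An alternative that avoids defining `__lt__` on the node is to push `(value, list_index, node)` tuples. The sketch below is an assumption-labeled variant, not the version above: `Node`, `merge_with_tuples`, `build`, and `to_list` are hypothetical names introduced for illustration. Each input list has at most one node in the heap at a time, so the `(value, list_index)` prefix is always unique and the nodes themselves are never compared.

```python
import heapq
from typing import List, Optional

class Node:
    """Plain singly linked node with no comparison methods (hypothetical name)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def merge_with_tuples(lists: List[Optional[Node]]) -> Optional[Node]:
    heap = []
    for i, head in enumerate(lists):
        if head:
            # i breaks ties between equal values from different lists
            heapq.heappush(heap, (head.value, i, head))

    dummy = Node(-1)
    curr = dummy
    while heap:
        _, i, node = heapq.heappop(heap)
        curr.next = node
        curr = node
        if node.next:
            heapq.heappush(heap, (node.next.value, i, node.next))
    return dummy.next

def build(values):
    """Build a linked list from a Python list (helper for the demo)."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head

def to_list(head):
    """Collect linked-list values into a Python list (helper for the demo)."""
    out = []
    while head:
        out.append(head.value)
        head = head.next
    return out

merged = merge_with_tuples([build([1, 6]), build([1, 4, 6]), build([3, 7])])
print(to_list(merged))  # [1, 1, 3, 4, 6, 6, 7]
```

With N total nodes across k lists, the heap never holds more than k entries, so the run time is O(N log k).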

### Median of an Integer Stream
- Design a DS that supports adding integers from a data stream and retrieving the median of all elements at any point
    * add(num: int) -> None: adds an integer to the DS
    * get_median() -> float: returns the median of all integers so far

- Example:
    * Input: [add(3), add(6), get_median(), add(1), get_median() ]
    * Output: [4.5, 3.0]

In [5]:
import heapq

class Median:
    def __init__(self):
        self.left = []  # max_heap (stores negated values): lower half of the stream
        self.right = []  # min_heap: upper half of the stream
    
    def add(self, num):
        # Values below the current max of the lower half belong on the left
        if not self.left or num < -self.left[0]:
            heapq.heappush(self.left, -num)
            # Rebalance: left may hold at most one more element than right
            if len(self.left) > len(self.right) + 1:
                heapq.heappush(self.right, -heapq.heappop(self.left))
        else:
            heapq.heappush(self.right, num)
            # Rebalance: right must never be larger than left
            if len(self.left) < len(self.right):
                heapq.heappush(self.left, -heapq.heappop(self.right))

    def get_median(self):
        n, m = len(self.left), len(self.right)
        if n == m:
            return (-self.left[0] + self.right[0]) / 2.0

        # left is always equal to or one larger than right based on the add implementation
        return float(-self.left[0])
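
To sanity-check the example in this section, a simple sorted-list baseline can replay the same calls. This is a swapped-in technique, not the two-heap design: `SortedListMedian` is a hypothetical name, and `add` is O(n) here because `bisect.insort` shifts elements, whereas the two-heap version adds in O(log n).

```python
import bisect

class SortedListMedian:
    """Baseline: keep every value in a sorted list (hypothetical name)."""
    def __init__(self):
        self.values = []

    def add(self, num):
        bisect.insort(self.values, num)  # O(n) because elements are shifted

    def get_median(self):
        n = len(self.values)
        mid = n // 2
        if n % 2 == 1:
            return float(self.values[mid])
        return (self.values[mid - 1] + self.values[mid]) / 2.0

stream = SortedListMedian()
stream.add(3)
stream.add(6)
first = stream.get_median()
stream.add(1)
second = stream.get_median()
print([first, second])  # [4.5, 3.0]
```

Feeding the same random stream to both classes and comparing medians after each `add` is a quick way to test the heap rebalancing logic.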