# Heaps 🏉

## Priority Queues -`cpujobscheduling` or `airtrafficcontrol`

We need something more complex than a Queue for an Air Traffic Controller because there are a lot of things to think about before letting a plane land or takeoff (remaining fuel, time waited, distance from runway). 

So we make **Priority Queues.**

For visualization you can check out this [wonderful site.](https://www.cs.usfca.edu/~galles/visualization/Heap.html)

This is a collection of prioritized elements that allows arbitrary element insertion, and allows the removal of the element that has first priority.

## Extra - Priority Queues in Real World 🥰

## Heaps - `O(log n)` time insert and remove

Python implements priority queues with `heaps`. A min heap is all you need!

A more efficient realization of a priority queue using a data structure called a `binary heap`. 

This data structure allows us to perform both **insertions and removals in logarithmic time**, which is a significant improvement over the list-based implementations.

The fundamental way the heap achieves this improvement is to use the structure of a binary tree to find a compromise between elements being entirely unsorted and perfectly sorted.

### Heap-Order Property - `min is always root` - `parent are always small`

In a heap T, for every position p other than the root, the key stored at p is greater than or equal to the key stored at p's parent. 

A minimum key is always stored at the root of T.

### Complete Binary Tree Property

A heap T with height $h$ is a complete binary tree if levels $0, 1, 2, ..., h - 1$ of T, have the maximum number of nodes possible (level i has $2^i$ nodes, for $0 ≤ i ≤ h - 1$).

And the remaining nodes at level h reside in the leftmost possible positions at that level.

You can check the website to understand this part 😍

## max heaps ? 🤔

If you really want to make max heaps, you should multiply all values in your list with $-1$

Now you have a max heap ! 

Meaning the max value is in your root!

## `kClosest` and `smash_stones` are beautiful examples. 

In [8]:
from heapq import heapify, heappop, heappush

a = [4,2,1,6,5,8,2]

# O(n)
heapify(a)
print(a) # [1, 2, 2, 6, 5, 8, 4]

# O(log(n))
print(heappop(a)) # 1
print(a) # [2, 2, 4, 6, 5, 8]

# this should go to the root of the heap
# O(log(n))
heappush(a, 0)
print(a)

[1, 2, 2, 6, 5, 8, 4]
1
[2, 2, 4, 6, 5, 8]
[0, 2, 2, 6, 5, 8, 4]


In [10]:
# O(log(n))
print(heappop(a)) # 2
print(a) # [2, 5, 4, 6, 8]

# heaps adjust lists in place

# all True - so True
print(all([a[0] == 2, a[1] == 5, a[2] == 4, 
            a[3] == 6, a[4] == 8]))

2
[2, 5, 4, 6, 8]
True


## The Heap Data Structure 💖

A heap T storing n entries has height $h = ⌊log(n⌋$ ($h = ⌊log(n⌋$ - floor of log n).

After a removal of an item, there will be a down heap bubbling.

## Analysis of a Heap-Based Priority Queue

Just results:

| Operation                | Running Time |     |
| ------------------------ | ------------ | --- |
| `len(P)`, `P.is_empty()` | $O(1)$       |     |
| `P.min()`                | $O(1)$       |     |
| `P.add()`                | $O(log(n))$* |     |
| `P.remove_min()`         | $O(log(n))$* |     |

`*` means amortized, if array based.

## Python’s `heapq` Module 😍

Python’s standard distribution includes a `heapq` module that provides support for heap-based priority queues. 

**That module does not provide any priority queue class; instead it provides functions that allow a standard Python list to be managed as a heap.** 
 
Its model is essentially the same as our own, with n elements stored in list cells $L[0]$ through $L[n − 1]$, based on the level-numbering indices with the smallest element at the root in $L[0]$. 

We note that `heapq` does not separately manage associated values; elements serve as their own key.

The `heapq` module supports the following functions, all of which presume that existing list L satisfies the heap-order property prior to the call:

| Method                   | Explained                                                                                                                                                                                                                                                                                                                                                                                                    |
| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `heappush(L, e)`         | Push element e onto list L and restore the heap-order property. The function executes in $O(log n)$ time.                                                                                                                                                                                                                                                                                                    |
| `heappop(L)`             | Pop and return the element with smallest value from list L, and reestablish the heap-order property. The operation executes in $O(log n)$ time.                                                                                                                                                                                                                                                              |
| `heappushpop(L, e)`      | Push element `e` on list L and then pop and return the smallest item. The time is $O(log n)$, but it is slightly more efficient than separate calls to `push` and `pop` because the size of the list never changes. If the newly pushed element becomes the smallest, it is immediately returned. Otherwise, the new element takes the place of the popped element at the root and a down-heap is performed. |
| `heapreplace(L, e)`      | Similar to `heappushpop`, but equivalent to the `pop` being performed before the `push` (in other words, the new element cannot be returned as the smallest). Again, the time is $O(log n)$, but it is more efficient that two separate operations. The module supports additional functions that operate on sequences that do not previously satisfy the heap-order property.                               |
| `heapify(L)`             | Transform unordered list to satisfy the heap-order property. This executes in $O(n)$ time by using the **bottom-up construction** algorithm.                                                                                                                                                                                                                                                                 |
| `nlargest(k, iterable)`  | Produce a list of the `k` largest values from a given iterable. This can be implemented to run in $O(n + k log n)$ time, where we use n to denote the length of the iterable.                                                                                                                                                                                                                                |
| `nsmallest(k, iterable)` | Produce a list of the `k` smallest values from a given iterable. This can be implemented to run in $O(n + k log n)$ time, using similar technique as `nlargest`.                                                                                                                                                                                                                                             |

In [1]:
# heapq have interesting methods:
from heapq import nlargest, nsmallest

print(nsmallest(6, [1,2,3,4,5,6,7,8,9,10])) # [1, 2, 3, 4, 5, 6]

print(nlargest(3, [10,20,30,400,5000])) # [5000, 400, 30]

[1, 2, 3, 4, 5, 6]
[5000, 400, 30]


# Examples are here! 🍉

In [4]:
from collections import Counter
from heapq import nlargest

def top_frequent_elements(nums: list[int], k : int ) -> list[int]:
    
    c = Counter(nums)

    # COunter has most_common method
    return list((elem for elem, value in c.most_common(k)))
    
print(top_frequent_elements(nums = [1,1,1,2,2,3], k = 2))

def topKFrequent(nums: list[int], k: int) -> list[int]:
    # small edge case
    if k == len(nums):
        return nums

    # frequency map 
    count = Counter(nums)
    
    # print(count) # Counter({1: 3, 2: 2, 3: 1})
    
    # parameters for nlargest(n, iterable, key)
    # return the n largest based on the values of the Counter keys
    return nlargest(k, count.keys(), key=count.get)

print(topKFrequent(nums = [1,1,1,2,2,3], k = 2))

[1, 2]
[1, 2]


In [1]:
"""Genius - about heapq"""

# C-9.26 
# Show how to implement the stack ADT using only a 
# priority queue and one additional integer 
# instance variable.

import heapq

class StackWithPriorityQueue:
    def __init__(self):

        # Priority queue as a min-heap
        self.pq = []  

        # Additional integer to maintain the order
        self.order = 0  

    def push(self, item):
        
        # Push the item into the priority queue 
        # with a negative order
        heapq.heappush(self.pq, (-self.order, item))

        # Increment the order for the next item
        self.order += 1

    def pop(self):
        if self.is_empty():
            raise IndexError("Stack is empty")
        
        # Pop the item with the highest order
        _, item = heapq.heappop(self.pq)  

        self.order -= 1  # Decrease the order
        return item

    def top(self):
        if self.is_empty():
            raise IndexError("Stack is empty")
        
        # Get the item with the highest order 
        # without popping it
        return self.pq[0][1]

    def is_empty(self):
        return len(self.pq) == 0

    def size(self):
        return len(self.pq)

# Example usage:
stack = StackWithPriorityQueue()
stack.push(1)
stack.push(2)
stack.push(3)

print(f"current stack: {stack.pq}") 
# current stack: [(-2, 3), (0, 1), (-1, 2)]

print("Top:", stack.top())  # Output: 3
print("Pop:", stack.pop())  # Output: 3
print("Pop:", stack.pop())  # Output: 2
print("Is Empty?", stack.is_empty())  # Output: False
print("Size:", stack.size())  # Output: 1

# In this implementation:

# We use a priority queue (min-heap) to store the 
# elements with a negative order so that 
# the highest order (lowest negative value) item 
# comes out first when popped.

# The order variable is used to assign a unique 
# order to each element pushed into the stack.

# When popping an element, we extract the element 
# with the highest order (lowest negative value) from 
# the priority queue.

# The top method returns the element with 
# the highest order without popping it.

# The is_empty and size methods provide 
# information about the stack's state.

# This implementation effectively simulates a stack 
# using a priority queue and an order variable.

current stack: [(-2, 3), (0, 1), (-1, 2)]
Top: 3
Pop: 3
Pop: 2
Is Empty? False
Size: 1


In [5]:
"""
Design a class to find the kth largest element in a stream. 

Note that it is the kth largest element in 
the sorted order, not the kth distinct element.

Implement KthLargest class:

    KthLargest(int k, int[] nums) Initializes the object with 
        the integer k and the stream of integers nums.

    int add(int val) Appends the integer val to the stream 
        and returns the element representing the kth 
        largest element in the stream.
 
Example 1:

    Input:
        ["KthLargest", "add", "add", "add", "add", "add"]
        [[3, [4, 5, 8, 2]], [3], [5], [10], [9], [4]]
    
    Output:
        [null, 4, 5, 5, 8, 8]

    Explanation:

        KthLargest kthLargest = new KthLargest(3, [4, 5, 8, 2]);
        kthLargest.add(3);   // return 4
        kthLargest.add(5);   // return 5
        kthLargest.add(10);  // return 5
        kthLargest.add(9);   // return 8
        kthLargest.add(4);   // return 8
 
Constraints:

    1 <= k <= 10^4
    
    0 <= nums.length <= 10^4
    
    -10^4 <= nums[i] <= 10^4
    
    -10^4 <= val <= 10^4
    
    At most 10^4 calls will be made to add.
    
    It is guaranteed that there will be at least k elements 
    
    in the array when you search for the kth element.

Takeaway:

    if we use an array,
    sorting would be n log n
    finding where to insert would be o(n)

    lets use a min heap of size k
    we can get the min of min heap in o(1)
    we can add an element in log n time

    kth element will be the smallest element in 
    size k min heap
"""

from heapq import heapify, heappush, heappop


class KthLargest_:
    # THIS DOES NOT WORK
    # Nice try but yeah.

    def __init__(self, k: int, nums: list[int]):
        """This was an attempt to make MaxHeap
        Turns out its not needed."""
        negative_nums = [-elem for elem in nums]
        
        heapify(negative_nums)
        self.stream = negative_nums  
        self.k = k

    def add(self, val: int) -> int:
        # pushes a new item on the heap
        heappush(self.stream,  (-1) * val)
        #  return kth item on the heap without popping it
        return self.stream[self.k - 1]

class KthLargest:

    # if we use an array,
    # sorting would be n log n
    # finding where to insert would be o(n)

    # lets use a min heap of size k
    # we can get the min of min heap in o(1)
    # we can add an element in log n time
    
    def __init__(self, k: int,  nums: list[int]):
        # minheap with K largest integers
        self.heap = nums
        self.k = k
        # turn the list into a minimum heap
        heapify(self.heap)

        # we only need a heap with k elements
        while len(self.heap) > k:
            heappop(self.heap)

    def add(self, val: int) -> int:
        heappush(self.heap, val)

        # if we have more elements than k
        if len(self.heap) > self.k:
            heappop(self.heap)

        # kth element is the smallest now
        return (self.heap[0])

# Your KthLargest object will be instantiated and called as such:
# obj = KthLargest(k, nums)
# param_1 = obj.add(val)

In [9]:
"""
You are given an array of integers stones where 
stones[i] is the weight of the ith stone.

We are playing a game with the stones. 

On each turn, we choose the heaviest two stones 
and smash them together. 

Suppose the heaviest two stones have 
weights x and y with x <= y. 

The result of this smash is:

    If x == y, both stones are destroyed, and
    
    If x != y, the stone of weight x is destroyed, and the stone 
        of weight y has new weight y - x.

At the end of the game, there is at most one stone left.

Return the weight of the last remaining stone. 

If there are no stones left, return 0.

Example 1:

    Input: stones = [2,7,4,1,8,1]
    
    Output: 1
    
    Explanation: 
        
        We combine 7 and 8 to get 1 so the array 
            converts to [2,4,1,1,1] then,

        we combine 2 and 4 to get 2 so the array 
            converts to [2,1,1,1] then,

        we combine 2 and 1 to get 1 so the array 
            converts to [1,1,1] then,

        we combine 1 and 1 to get 0 so the array 
            converts to [1] then 
            
            that's the value of the last stone.

Example 2:

    Input: stones = [1]
    
    Output: 1
 
Constraints:

    1 <= stones.length <= 30
    
    1 <= stones[i] <= 1000

Takeaway:

    if we use a sorted approach, we have to sort 
    the list every time

    use a max heap, 
    heapify takes o(n)
    every time accessing max heap is o(log n) - possibly 
    running n times

    to make a max heap in Python, just multiply 
    every value with -1

    in the calculations, use the example in your mind.

"""

from heapq import heapify, heappop, heappush

class Solution:

    def lastStoneWeight_(self, stones: list[int]) -> int:
        # brute force approach
        # did not work
        def smash(seq):
            if len(seq) == 1:
                return seq
            biggest = max(seq)
            seq.remove(biggest)
            second_biggest = max(seq)  # Note: Find the max from the updated seq list.
            
            if biggest == second_biggest:
                seq.remove(second_biggest)
            else:
                seq.append(biggest - second_biggest)
            
            return seq  # Return the modified list
        
        while len(stones) > 2:
            stones = smash(stones)  # Update the stones list

        return stones[0]

    def lastStoneWeight(self, stones: list[int]) -> int:
        # if we use a sorted approach, we have to sort 
        # the list every time

        # use a max heap, 
        # heapify takes o(n)
        # every time accessing max heap is o(log n) - possibly 
        # running n times

        # to make a max heap in Python, just multiply 
        # every value with -1

        stones = [-elem for elem in stones]
        heapify(stones)

        while len(stones) > 1:
            first = heappop(stones)
            second = heappop(stones)

            if second > first:
                # -5 - - 8 = 3
                # so make it first - second instead of 
                # second - first

                # we want negatives, this is a 
                # max heap
                heappush(stones, first - second)

        # if there are not any stones in the sequence
        stones.append(0)
        return abs(stones[0])

if __name__ == "__main__":
    
    sol = Solution()
    print(sol.lastStoneWeight_(stones = [2,7,4,1,8,1]))
    print(sol.lastStoneWeight_(stones = [1]))

    print()
    print(sol.lastStoneWeight(stones = [2,7,4,1,8,1]))
    print(sol.lastStoneWeight(stones = [1]))

1
1

1
1


In [11]:
"""
Given an array of points where points[i] = [xi, yi] represents 
a point on the X-Y plane and an integer k, return the k 
closest points to the origin (0, 0).

The distance between two points on the X-Y plane is the Euclidean 
distance (i.e., √(x1 - x2)2 + (y1 - y2)2).

You may return the answer in any order. 

The answer is guaranteed to be unique (except for the order that it is in).

Example 1:

    Input: points = [[1,3],[-2,2]], k = 1
    
    Output: [[-2,2]]

    Explanation:
        
        The distance between (1, 3) and the origin is sqrt(10).
        The distance between (-2, 2) and the origin is sqrt(8).
        Since sqrt(8) < sqrt(10), (-2, 2) is closer to the origin.

        We only want the closest k = 1 points from the origin, so 
            the answer is just [[-2,2]].

Example 2:

    Input: points = [[3,3],[5,-1],[-2,4]], k = 2
    
    Output: [[3,3],[-2,4]]

    Explanation: 
    
        The answer [[-2,4],[3,3]] would also be accepted.

Constraints:

    1 <= k <= points.length <= 10^4
    -10^4 <= xi, yi <= 10^4

Takeaway:

    this solution works but it is O(n log n)
    [[1,3],[-2,2]], k = 1
    [[-2, 2]]

    if we can calculate the euclidian distance 
    for every point and
    sort the points list accordingly

    we can return a slice of it.

    heap solution

    we only want k points, we do not have to 
    sort all of the list
    we can use a min heap

    we can calculate for [[1,3],[-2,2]]
    [10, 1, 3], [8, -2, 2]
    put them all in a min heap with heapify - which is o(n)
    and select k items, which would be 
    k log n which would be better than n log n

"""


from heapq import heapify, heappop, _heapify_max

class Solution:

    def kClosest_(self, points: list[list[int]], k: int) -> list[list[int]]:
        # this solution works but it is O(n log n)
        # [[1,3],[-2,2]], k = 1
        # [[-2, 2]]
        
        # if we can calculate the euclidian distance for every point and
        # sort the points list accordingly
        
        # we can return a slice of it.
        points_and_distance = []       
        
        for elem in points:
            # we dont even need to square root because we are just comparing
            dist = (((elem[0] * elem[0]) + (elem[1] * elem[1]) ) **  0.5 )
            # add distance next to the coordinates
            points_and_distance.append(elem + [dist])

        print(points_and_distance)

        points_and_distance.sort(key = lambda x : float(x[2]))
        
        # return points_and_distance[k:][0] 

        return [point[:2] for point in points_and_distance[:k]]

    def kClosest(self, points: list[list[int]], k: int) -> list[list[int]]:
        # we only want k points, we do not have to 
        # sort all of the list
        # we can use a min heap

        # we can calculate for [[1,3],[-2,2]]
        # [10, 1, 3], [8, -2, 2]
        # put them all in a min heap with heapify - which is o(n)
        # and select k items, which would be 
        # k log n which would be better than n log n
        
        min_heap = []

        # o(n)
        for x, y in points:
            dist_representer = (x * x) + (y * y)
            min_heap.append([dist_representer, x, y])
        
        # o(n)
        heapify(min_heap)

        # this is a way to go
        res = []
        while k > 0:
            dist, x, y = heappop(min_heap)
            res.append([x, y])
            k -= 1

        return res

        # this is another way to go
        # return [[x, y] for _, x, y in [heappop(min_heap) for _ in range(k)]]


if __name__ == '__main__':
    sol = Solution()
    print(sol.kClosest(points = [[1,3],[-2,2]], k = 1))


[[-2, 2]]


In [12]:
"""
Given an integer array nums and an integer k, return the kth 
largest element in the array.

Note that it is the kth largest element in the sorted 
order, not the kth distinct element.

Can you solve it without sorting?

Example 1:

    Input: nums = [3,2,1,5,6,4], k = 2
    
    Output: 5

Example 2:

    Input: nums = [3,2,3,1,2,4,5,5,6], k = 4
    
    Output: 4

Constraints:

    1 <= k <= nums.length <= 10^5
    
    -10^4 <= nums[i] <= 10^4

Takeaway:

    instead of sorting - o(nlogn)
    we can use a heap

    every time we pop an element from the heap
    it is log n
    so this result will be - o(n + k log n)

    There is a solution whick uses quick_select
    which is kinda like quick_sort
"""

from heapq import heapify, heappop

class Solution:

    def findKthLargest__(self, nums: list[int], k: int) -> int:
        # instead of sorting - o(nlogn)
        # we can use a heap
        # THIS WORKS

        # every time we pop an element from the heap
        # it is log n
        # so this result will be - o(n + k log n)

        negative = [-elem for elem in nums]
        heapify(negative)
        # now we have a max heap
        # for [3,2,1,5,6,4] - negative - [-6, -5, -4, -3, -2, -1]
        
        for _ in range(k):
            result = heappop(negative)
            
        return -result

    def findKthLargest_(self, nums: list[int], k: int) -> int:
        # with sorting
        nums.sort(reverse = True)
        return nums[k - 1]

    def findKthLargest(self, nums: list[int], k: int) -> int:
        # quick select - kinda like quicksort
        # average case o(n) - worst case o(n**2)
        
        # if the array is sorted
        # kth largest element is:
        k  = len(nums) - k

        # left and right pointers
        def quick_select(l, r):
            # let pivot bve the rightmost element
            # p is just left pointer
            pivot, p = nums[r], l
            # iterate over the list, except last element
            for i in range(l, r):
                # 
                if nums[i] <= pivot:
                    nums[p], nums[i] = nums[i], nums[p]
                    # onto next position
                    p += 1
            # swapping the rightmost element with current 
            # position of the pointer
            nums[p], nums[r] = nums[r], nums[p]


            if p > k :
                # look at the left portion
                return quick_select(l , p - 1)
            elif p < k:
                # look at the right portion
                return quick_select(p + 1, r)
            else: 
                # we found the element
                return nums[p]

        return quick_select(0, len(nums) - 1)

In [15]:
"""
Given a characters array tasks, representing the tasks a CPU needs 
to do, where each letter represents a different task. 

Tasks could be done in any order. 

Each task is done in one unit of time. 

For each unit of time, the CPU could complete either one task or just be idle.

However, there is a non-negative integer n that represents the cooldown 
period between two same tasks (the same letter in the array), that is 
that there must be at least n units of time between any two same tasks.

Return the least number of units of times that the CPU will 
take to finish all the given tasks.

Example 1:

    Input: tasks = ["A","A","A","B","B","B"], n = 2
    
    Output: 8
    
    Explanation: 
        
        A -> B -> idle -> A -> B -> idle -> A -> B
        There is at least 2 units of time between any two same tasks.

Example 2:

    Input: tasks = ["A","A","A","B","B","B"], n = 0

    Output: 6

    Explanation: 
    
        On this case any permutation of size 6 would 
            work since n = 0.
        
        ["A","A","A","B","B","B"]
        ["A","B","A","B","A","B"]
        ["B","B","B","A","A","A"]
        ...
        And so on.

Example 3:

    Input: tasks = ["A","A","A","A","A","A","B",
        "C","D","E","F","G"], n = 2
    
    Output: 16
    
    Explanation: 
        
        One possible solution is
        A -> B -> C -> A -> D -> E -> A -> F -> G 
            -> A -> idle -> idle -> A -> idle -> idle -> A

Constraints:

    1 <= task.length <= 10^4

    tasks[i] is upper-case English letter.
    
    The integer n is in the range [0, 100].

Takeaway:

    we have a tasks list
    we need to make sure that there is at least 
    n difference between same tasks

    we need to know the most frequent task
    because it will occur first in our solution

    we will use a max heap
    which task is most frequent - time complexity o(log n)

    we will also use a queue to hold the frequencies of tasks
    and the time they will be scheduled again
"""

from collections import Counter, deque
import heapq

class Solution:

    def leastInterval__(self, tasks: list[int], n: int) -> int:
        # does not work
        
        # edge case
        if n == 0: return len(tasks)
        
        # most optimal way would be running 
        # distinct tasks if possible
        freq_map = {}
        for elem in tasks:
            freq_map[elem] = freq_map.get(elem, 0) + 1
        
        print(freq_map)
        
        length = 0
        
        # RuntimeError: dictionary changed size during iteration
        # you cannot do that
        
        while freq_map:
            for k in freq_map:
                if freq_map[k] == 0:
                    del freq_map[k]
                    continue
                freq_map[k] -= 1
                length += 1
                
        return length

    def leastInterval_(self, tasks: list[int], n: int) -> int:
        
        # we have a tasks list
        # we need to make sure that there is at least 
        # n difference between same tasks
        
        # we need to know the most frequent task
        # because it will occur first in our solution

        # we will use a max heap
        # which task is most frequent - time complexity o(log n)

        # we will also use a queue to hold the frequencies of tasks
        # and the time they will be scheduled again

        # count the occurrences of each task
        count = Counter(tasks)
        
        # make a max heap to store negative task 
        # counts (most frequent tasks first)
        max_heap = [-cnt for cnt in count.values()]
        heapq.heapify(max_heap)
        
        # Initialize time to keep track of the current time
        time = 0
        
        # make a queue to hold frequencies of 
        # tasks and the time they will be scheduled again
        # [frequency, time]
        q = deque()

        # Main loop
        while max_heap or q:
            time += 1

            if max_heap:
                # Retrieve the most frequent task and 
                # update its count
                cnt = 1 + heapq.heappop(max_heap)
                
                # If the count is not zero, schedule it for later
                if cnt:
                    q.append([cnt, time + n])

            if q and q[0][1] == time:
                # Check if a task is scheduled to be 
                # executed at the current time
                heapq.heappush(max_heap, q.popleft()[0])

        # Return the total time taken to complete 
        # the tasks with intervals
        return time

    def leastInterval(self, tasks: list[int], n: int) -> int:
        # much smaller code in size

        # edge case
        if n == 0:
            return len(tasks)

        # Get the count of each task using Counter and
        #  store them in a list
        counts = list(Counter(tasks).values())
        
        # Find the maximum count, i.e., the most repeated task
        most_repeats = max(counts)
        
        # Count how many tasks have the maximum count
        num_longest = counts.count(most_repeats)
        
        # Calculate the minimum time required to 
        # complete the tasks
        # 
        # This is based on the maximum count and the
        # cooling interval (n)
        # 
        # If there are gaps between the most repeated 
        # tasks, consider them
        # 
        # Otherwise, return the length of the tasks list
        return max(len(tasks), (most_repeats-1) * (n+1) + num_longest)
    
sol = Solution()
print(sol.leastInterval_(tasks = ["A","A","A","B","B","B"], n = 2)) # 8

8


In [7]:
"""
Design a simplified version of Twitter where users can 
post tweets, follow/unfollow another user, and is able to see the 
10 most recent tweets in the user's news feed.

Implement the Twitter class:

    - Twitter() Initializes your twitter object.

    - void postTweet(int userId, int tweetId) Composes a new tweet 
        with ID tweetId by the user userId. Each call to this function 
        will be made with a unique tweetId.

    - List<Integer> getNewsFeed(int userId) Retrieves the 10 most recent 
        tweet IDs in the user's news feed. Each item in the news feed must 
        be posted by users who the user followed or by the user themself. 
        Tweets must be ordered from most recent to least recent.

    - void follow(int followerId, int followeeId) The user with ID 
        followerId started following the user with ID followeeId.

    - void unfollow(int followerId, int followeeId) The user with ID 
        followerId started unfollowing the user with ID followeeId.
 
Example 1:

    Input:

        ["Twitter", "postTweet", "getNewsFeed", "follow", "postTweet", 
        "getNewsFeed", "unfollow", "getNewsFeed"]
        [[], [1, 5], [1], [1, 2], [2, 6], [1], [1, 2], [1]]

    Output:
    
        [null, null, [5], null, null, [6, 5], null, [5]]
    
    Explanation:

        Twitter twitter = new Twitter();
        twitter.postTweet(1, 5); // User 1 posts a new tweet (id = 5).
        twitter.getNewsFeed(1);  // User 1's news feed should return a list 
        with 1 tweet id -> [5]. return [5]
        twitter.follow(1, 2);    // User 1 follows user 2.
        twitter.postTweet(2, 6); // User 2 posts a new tweet (id = 6).
        twitter.getNewsFeed(1);  // User 1's news feed should return a list 
        with 2 tweet ids -> [6, 5]. Tweet id 6 should precede tweet id 5 
        because it is posted after tweet id 5.
        twitter.unfollow(1, 2);  // User 1 unfollows user 2.
        twitter.getNewsFeed(1);  // User 1's news feed should return a list 
        with 1 tweet id -> [5], since user 1 is no longer following user 2.

Constraints:

    1 <= userId, followerId, followeeId <= 500
    
    0 <= tweetId <= 10^4
    
    All the tweets have unique IDs.
    
    At most 3 * 104 calls will be made to postTweet, getNewsFeed, follow, and unfollow

Takeaway:

    The default thinking just works. Use hashmaps that have values of lists and
        hashmaps that have values of sets.

    Better approach:

    Using defaultdict is nice when you are sure for the 
        type of values of your hashmap

    Using a minheap in this question is especially important 
        for getnewsfeed method. Getting the latest tweets from people that are tweeting
        in their own pace, min_heap helps

"""

class Twitter__:
    # FIRST try
    # does not work
    def __init__(self):
        self.user_follow_dict = {}
        self.user_content = {}
        self.tweet_id = 0

    def postTweet(self, userId: int, tweetId: int) -> None:
        self.user_content[userId] = self.user_content.get(userId, []) + [tweetId]
        
    def getNewsFeed(self, userId: int) -> "list [int]":
        other_peoples_tweets = []
        for user in self.user_follow_dict:
            other_peoples_tweets += self.user_content[user]
        
        result = other_peoples_tweets + self.user_content[userId] 
        
        return sorted(result)

    def follow(self, followerId: int, followeeId: int) -> None:
        # using the huge dict for every person
        # append to the people who someone is following
        self.user_follow_dict[followeeId] = self.user_follow_dict.get(followeeId, []) \
            [followerId]

    def unfollow(self, followerId: int, followeeId: int) -> None:
        self.user_follow_dict[followeeId] = self.user_follow_dict.get(followeeId, []).remove(followeeId) \
            if self.user_follow_dict[followeeId] else []


from collections import defaultdict
from heapq import heappush, heappop, heapify


class Twitter_:
    # with expert help

    def __init__(self):
        # unique people only
        self.user_follow_dict = defaultdict(set)
        # just a lot of tweets for everyone
        self.user_content = defaultdict(list)
        # starting from 0
        self.time = 0
        # as it is given in question
        self.feed_size = 10

    def postTweet(self, userId: int, tweetId: int) -> None:
        # add the tweet to the writers content, as a tuple with id and 
        self.user_content[userId].append((self.time, tweetId))
        self.time += 1

    def getNewsFeed(self, userId: int):
        tweets = []
        # all the following and user his/herself
        users = self.user_follow_dict[userId] | {userId}  # Include the user's own tweets
        for user in users:
            # extend all tweets
            tweets.extend(self.user_content[user])
        tweets.sort(key=lambda x: x[0], reverse=True)
        # only return 10 of the tweets, returning only tweetIds also
        return [tweet[1] for tweet in tweets[:self.feed_size]]

    def follow(self, followerId: int, followeeId: int) -> None:
        # add the follower to followee
        self.user_follow_dict[followerId].add(followeeId)

    def unfollow(self, followerId: int, followeeId: int) -> None:
        # edge case
        if followerId != followeeId:  # Ensure a user cannot unfollow themselves
            # remove followeeId from the set
            self.user_follow_dict[followerId].discard(followeeId)

class Twitter:
    
    # This approach starts from methods, not the constructor
    def __init__(self):
        # time
        self.count = 0
        self.tweet_map = defaultdict(list) # userId -> list of [count, tweetIds]
        self.follow_map = defaultdict(set) # userId -> set of followeeId
        pass

    def postTweet(self, userId: int, tweetId: int) -> None:
        # a hashmap mapping userid to tweet list
        self.tweet_map[userId].append([self.count, tweetId])
        self.count -= 1

    def getNewsFeed(self, userId: int) -> "list[int]":
        # 10 most recent tweets
        # we cannot just compare tweetId , we also need a time.
        # because person2 might have the latest tweet in Twitter
        # compared to person1, but we might also get some tweets 
        # based on timing, from person1
        res = [] # ordered starting from recent
        min_heap = []

        # add the user themself to the list
        self.follow_map[userId].add(userId)

        for followeeId in self.follow_map[userId]:
            # does this person have at least one tweet
            if followeeId in self.tweet_map:
                # last index of the list
                index = len(self.tweet_map[followeeId]) - 1
                count, tweetId = self.tweet_map[followeeId][index]
                min_heap.append([count, tweetId, followeeId, index - 1])
                
        heapify(min_heap)

        while min_heap and len(res) < 10:
            count, tweetId, followeeId, index = heappop(min_heap)
            res.append(tweetId)
            if index >= 0:
                count, tweetId = self.tweet_map[followeeId][index]
                heappush(min_heap, [count, tweetId, followeeId, index -1])

        # we will be returning tweetId's
        return res

    def follow(self, followerId: int, followeeId: int) -> None:
        # we can use a hashmap that holds a list for each user - o(1)
        # But removing the people from this list would be o(n)
        # is there a more efficient way? 
        # dict(set) - which will be o(1) insert and delete
        self.follow_map[followerId].add(followeeId)

    def unfollow(self, followerId: int, followeeId: int) -> None:
        if followeeId in self.follow_map[followerId]:
            self.follow_map[followerId].remove(followeeId)

# Your Twitter object will be instantiated and called as such:
# obj = Twitter()
# obj.postTweet(userId,tweetId)
# param_2 = obj.getNewsFeed(userId)
# obj.follow(followerId,followeeId)
# obj.unfollow(followerId,followeeId)

In [8]:
"""
The median is the middle value in an ordered integer list. 

If the size of the list is even, there is no middle value, and the 
median is the mean of the two middle values.

For example, for arr = [2,3,4], the median is 3.

For example, for arr = [2,3], the median is (2 + 3) / 2 = 2.5.

Implement the MedianFinder class:

    - MedianFinder() initializes the MedianFinder object.

    - void addNum(int num) adds the integer num from the data stream 
        to the data structure.

    - double findMedian() returns the median of all elements so far. 

    Answers within 10-5 of the actual answer will be accepted.

Example 1:

    Input:
        ["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
        [[], [1], [2], [], [3], []]
    
    Output:
        [null, null, null, 1.5, null, 2.0]

    Explaination:
        
        MedianFinder medianFinder = new MedianFinder();
        medianFinder.addNum(1);    // arr = [1]
        medianFinder.addNum(2);    // arr = [1, 2]
        medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
        medianFinder.addNum(3);    // arr[1, 2, 3]
        medianFinder.findMedian(); // return 2.0
 
Constraints:

    -10^5 <= num <= 10^5

    There will be at least one element in the data structure 
        before calling findMedian.
    
    At most 5 * 104 calls will be made to addNum and findMedian.
 
Follow up:

    If all integer numbers from the stream are in the range [0, 100], 
        how would you optimize your solution?

    If 99% of all integer numbers from the stream are in the 
        range [0, 100], how would you optimize your solution?

Takeaway:

    Of course first approach woul be using a list.

    The idea is basically to use two heaps a small and a large heap
    Adding and removing elements from the heap will be o(logn)

    Finding the max / min in constant time - o(1)

    small heap will be a max heap, 
    large heap will be a min heap

    Size of the heaps should be approximately Equal
    AT MOST difference of 1.

    Also, every element in small heap 
    should be smaller than the big heap

"""

from heapq import heapify, heappop, heappush, heappushpop

class MedianFinderObvious:
    # Yeah

    def __init__(self):
        self.stream = []

    def addNum(self, num: int) -> None:
        self.stream.append(num)

    def findMedian(self) -> float:
        # size 5
        # 0, 1, 2, 3, 4
        size = len(self.stream)
        self.stream.sort()
        if size % 2 == 0:
            return (self.stream[(size//2)] + self.stream[(size//2) - 1]) / 2
        else:
            return self.stream[size//2]


class MedianFinder:

    def __init__(self):
        # two heaps, small and large
        # small is the max heap and large is the minheap
        self.small , self.large = [], []

    def addNum(self, num: int) -> None:
        heappush(self.small, -1 * num)
        
        # make sure every elem in small is <= than every elem in large
        # small heap is max heap so root is biggest
        # large heap is min heap so root is smallest 
        if (self.small and self.large and (-1 * self.small[0]) > self.large[0]):
            val = -1 * heappop(self.small)
            heappush(self.large, val)
        
        # check the size difference is only 1 or 0
        if len(self.small) > len(self.large) + 1:
            val = -1 * heappop(self.small)
            heappush(self.large, val)
            
        if len(self.large) > len(self.small) + 1:
            val = heappop(self.large)
            heappush(self.small, -1 * val)
        
    def findMedian(self) -> float:
        # odd number of elems
        if len(self.small)  > len(self.large):
            return self.small[0] * -1
        if len(self.large) > len(self.small):
            return self.large[0]

        # even number of elements
        # same sizes in small an large
        return ( - 1 * self.small[0] + self.large[0]) / 2
        

class MedianFinderFaster:
    # even faster
    # for the curious minds

    def __init__(self):
        self.small = []  # the smaller half of the list, max heap (invert min-heap)
        self.large = []  # the larger half of the list, min heap

    def addNum(self, num):
        if len(self.small) == len(self.large):
            heappush(self.small, -heappushpop(self.large, num))
        else:
            heappush(self.large, -heappushpop(self.small, -num))

    def findMedian(self):
        if len(self.small) == len(self.large):
            return float(self.large[0] - self.small[0]) / 2.0
        else:
            return float(-self.small[0])


# Your MedianFinder object will be instantiated and called as such:
# obj = MedianFinder()
# obj.addNum(num)
# param_2 = obj.findMedian()

**Stock Trading Idea PQ - `stocktrading`**

2 PQ's:

    - buy (max price at root)
    - sell (min price at root)

Process orders:

    - When buy order is entered insert into `BuyPQ`. When sell order is entered insert into `SellPQ`

Match Orders:

    - Check root of both PQ's.

    - If sell has smaller than buy - remove both!

In [2]:
"""The idea can be written down"""

# C-9.50 
# An online computer system for trading stocks needs to process orders of
# the form “buy 100 shares at $x each” or “sell 100 shares at $y each.” 
# 
# A buy order for $x can only be processed if there is an existing sell order
# with price $y such that y ≤x. 
# 
# Likewise, a sell order for $y can only be processed if there is an existing
#  buy order with price $x such that y ≤x.

# If a buy or sell order is entered but cannot be processed, it must wait for a
# future order that allows it to be processed. 
# 
# Describe a scheme that allows buy and sell orders to be entered in O(log n) time,
#  independent of whether or not they can be immediately processed.

"""Solution"""

"""2 Priority Queues"""

# To efficiently process buy and sell orders in O(log n) time, you can use
#  two priority queues: one for buy orders and one for sell orders. Each priority
#  queue is implemented as a binary heap. Here's a scheme to achieve this:

# 1) Buy Priority Queue: This priority queue will store buy orders, and its elements
#  will be sorted in descending order of price. The maximum-priced buy order will be at the root.

# 2) Sell Priority Queue: This priority queue will store sell orders, and its elements will
#  be sorted in ascending order of price. The minimum-priced sell order will be at the root.

# 3) Processing Orders:

#       When a buy order is entered (e.g., "buy 100 shares at $x each"), insert it into
#  the Buy Priority Queue.
#    
#       When a sell order is entered (e.g., "sell 100 shares at $y each"), insert it into the
#  Sell Priority Queue.

# 4) Processing Matched Orders:

#      To match a buy order with a sell order, check the root of both the Buy and Sell Priority Queues.

#      If the root of the Sell Priority Queue has a price less than or equal to the root of 
# the Buy Priority Queue, a match is found.

#      Process the matched buy and sell orders and remove them from their respective priority queues.

# 5) Waiting for Future Orders:

#       If a buy or sell order is entered but cannot be immediately processed, it remains in its 
# respective priority queue.

#       Orders will wait until a future order with a matching or better price is entered.


# 6) Time Complexity:

#       Inserting an order into a binary heap takes O(log n) time.

#       Finding the root of the heap (max or min price) takes O(1) time.

#       Matching orders involves checking the roots of both priority queues, which 
# also takes O(1) time.

#       Overall, the scheme allows orders to be entered and matched in O(log n) time complexity.

# This scheme efficiently handles orders by maintaining two priority queues that allow for
#  quick insertion and matching of buy and sell orders based on price. The priority queues
#  ensure that orders are processed in the order of their priority (price), and unmatched
#  orders can wait until future orders arrive.

'2 Priority Queues'

In [1]:
# C-5.30 
# When Bob wants to send Alice a message M on the Internet, he breaks M
# into n data packets, numbers the packets consecutively, and injects them
# into the network. When the packets arrive at Alice’s computer, they may
# be out of order, so Alice must assemble the sequence of n packets in order
# before she can be sure she has the entire message. Describe an efficient
# scheme for Alice to do this, assuming that she knows the value of n. What 
# is the running time of this algorithm?

# PRIORITY QUEUES MAN

# To efficiently assemble the sequence of n data packets in the correct order, Alice
#  can use a priority queue (e.g., a min-heap) to keep track of the packets as
#  they arrive. Each packet should be labeled with its packet number, allowing Alice 
# to identify the correct order once all packets have been received.

# Here's a step-by-step description of the algorithm:

# 1. Alice initializes an empty priority queue.

# 2. As each packet arrives, she inserts it into the priority queue, using
#  the packet number as the priority key.

# 3. Alice repeatedly extracts packets from the priority queue until
#  it is empty, which automatically sorts the packets in ascending order based on their packet numbers.

# 4. The packets extracted from the priority queue form the message M in the correct order.

# Since Alice uses a priority queue to sort the packets, the time complexity of
#  this algorithm is O(n log n). This is because inserting each packet into the
#  priority queue takes O(log n) time, and she needs to insert n packets.
#  Additionally, extracting n packets from the priority queue also takes O(n log n) time.
#  Therefore, the overall running time of the algorithm is O(n log n), which 
# is efficient for a large number of data packets.

In [8]:
"""
You are given a sorted integer array arr 
containing 1 and prime numbers, where all 
the integers of arr are unique. 

You are also given an integer k.

For every i and j where 0 <= i < j < arr.length, 
we consider the fraction arr[i] / arr[j].

Return the kth smallest fraction considered. 

Return your answer as an array of integers 
of size 2, where answer[0] == arr[i] and 
answer[1] == arr[j].

Example 1:

    Input: arr = [1,2,3,5], k = 3
    
    Output: [2,5]
    
    Explanation: 
        
        The fractions to be considered in sorted order are:
    
            1/5, 1/3, 2/5, 1/2, 3/5, and 2/3.

            The third fraction is 2/5.

Example 2:

    Input: arr = [1,7], k = 1
    
    Output: [1,7]
    
Constraints:

    2 <= arr.length <= 1000
    
    1 <= arr[i] <= 3 * 104
    
    arr[0] == 1
    
    arr[i] is a prime number for i > 0.
    
    All the numbers of arr are unique and 
    sorted in strictly increasing order.
    
    1 <= k <= arr.length * (arr.length - 1) / 2

Takeaway:

    Brute force -> optimized. Heaps!

"""

import heapq

class Solution:
    def kthSmallestPrimeFraction_(self, arr: list[int], 
                                   k: int) -> list[int]:
        # works, but really slow
        
        # sorted array, 1 and primes
        # unique elements
        # brute force would be just to find all fractions 00
        # return the kth one
        fractions = []
        for i in range(len(arr)):
            for j in range(i + 1, len(arr)):
                fractions.append((arr[i]/arr[j], 
                                  (arr[i], arr[j])))
        # print(fractions)
        fractions.sort(key = lambda x: x[0])
        # print(fractions)
        
        return fractions[k-1][1]
      
    def kthSmallestPrimeFraction(self, arr: list[int], 
                                   k: int) -> list[int]:
        
        # the two pointer approach wont work, 
        # because there are a lot of cases to 
        # consider 
        
        # let's try a heap approach!
        
        pq = []
        
        for j in range(1, len(arr)):
            # push all fractions
            # that are the lowest
            heapq.heappush(pq, (arr[0] / arr[j], 0, j))
        
        # this will be automatically sorted 
        # based on the fractions
        
        # because that's how tuples are sorted
        # (1,2,3) < (2,3,4)
        
        # for k times
        for _ in range(k - 1):
            # pop a value
            _, i, j = heapq.heappop(pq)
            
            if i + 1 < j:
                # between i and j
                # difference is bigger than 1
                # push the element onto heap
                heapq.heappush(pq, 
                               (arr[i + 1] / arr[j], 
                                i + 1, 
                                j))
        
        # min element is at 
        # the root of the heap
        _, i, j = pq[0]
        
        return [arr[i], arr[j]]
    

sol = Solution()
print(sol.kthSmallestPrimeFraction(arr = [1,2,3,5], k = 3))

[2, 5]
