# Heaps

Heaps optimize the insert, find-min and delete-min operations.

A heap can be implemented as a binary tree.

A binary heap is a complete binary tree: each node has at most 2 child nodes, every level except possibly the last is full, and the nodes on the last level are as far left as possible.

# Array Representation of a heap

A heap can be represented as an array: if a parent is at index i, its left child is at index 2*i + 1 and its right child is at index 2*i + 2.

The parent of the node at index i (for i > 0) is at index floor((i-1)/2).

A complete binary tree is used because any other shape would leave gaps in the array.
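The index arithmetic above can be sketched as three small helpers. The names `parent`, `left` and `right` are illustrative, and a 0-indexed Python list is assumed.

```python
def parent(i):
    # Index of the parent of node i (only meaningful for i > 0)
    return (i - 1) // 2

def left(i):
    # Index of the left child of node i
    return 2 * i + 1

def right(i):
    # Index of the right child of node i
    return 2 * i + 2

if __name__ == '__main__':
    heap = [12, 11, 10, 9, 5, 1, 3, 2]   # a max heap stored level by level
    print(heap[left(1)], heap[right(1)])  # children of the node holding 11
    print(heap[parent(7)])                # parent of the last node
```

Because the tree is complete, every index from 0 to n-1 is occupied, so these formulas never point at a gap inside the array.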

# Properties of Complete Binary Tree

<strong>Height of a node</strong> = Number of edges on the longest downward path from that node to a leaf

<strong>Height of a Tree</strong> = Height of the root node

<strong>Given height of a tree, max number of nodes </strong> = 2<sup>h+1</sup>-1

<strong>Given n nodes, minimum height of a tree</strong> = floor(log<sub>2</sub>n)
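As a quick sanity check of the two formulas, here is a minimal sketch. The function names are illustrative, and `int.bit_length()` is used to compute floor(log<sub>2</sub>n) without floating point.

```python
def max_nodes(h):
    # Maximum number of nodes in a binary tree of height h
    return 2 ** (h + 1) - 1

def min_height(n):
    # floor(log2(n)) for n >= 1, computed exactly with integer arithmetic
    return n.bit_length() - 1

if __name__ == '__main__':
    for n in (1, 7, 8, 15, 16):
        print(n, min_height(n))
    print(max_nodes(3))
```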

# Build a heap

One way is to sort the array in descending order (every descending array is a valid max heap); this takes O(nlogn) time.

Another property of a complete binary tree stored in a 0-indexed array of size n is that the leaves occupy indices floor(n/2) through n-1. We are interested in the leaves because every leaf is a tree with 1 node, and a tree with 1 node already satisfies both the min heap and the max heap property.

Since the leaves are already heaps, we start at the largest non-leaf index, floor(n/2)-1, and heapify each node down to index 0.

In [3]:
#Building a maxheap using the array
from math import floor

def maxHeapify(arr,i):
    heapsize=len(arr)
    l=2*i+1
    r=2*i+2
    if l<heapsize and arr[i]<arr[l]:
        largest=l
    else:
        largest=i
    if r<heapsize and arr[r]>arr[largest]:
        largest=r
    if largest!=i:
        arr[i],arr[largest]=arr[largest],arr[i]
        maxHeapify(arr,largest)

def buildHeap(arr,n):
    for i in range(floor(n//2)-1,-1,-1):
        maxHeapify(arr,i)

if __name__ == '__main__':
    arr=[9,5,10,11,12,1,3,2]
    buildHeap(arr,len(arr))
    print(arr)


[12, 11, 10, 9, 5, 1, 3, 2]


Space Complexity -> O(logn), (number of recursive calls made= number of levels (logn)), where n is the number of nodes in the subtree for which i is the root. So space complexity depends on where the maxHeapify is called

Time Complexity-> O(n) (notes) <strong>TODO</strong>

Building a heap takes O(n) time and heapify takes O(logn) time
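The O(n) bound for building a heap can be justified with a short sum. This is a sketch of the standard argument, assuming that a heap with n nodes has at most ceil(n / 2<sup>h+1</sup>) nodes of height h and that heapifying a node of height h costs O(h):

```latex
\begin{aligned}
T(n) &= \sum_{h=0}^{\lfloor \log_2 n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil \cdot O(h) \\
     &= O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right) \\
     &= O(2n) = O(n)
\end{aligned}
```

Most nodes sit near the bottom of the tree where heapify is cheap, which is why the total is linear rather than O(nlogn).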

<pre>
Heaps can be of two types:

<strong>Max-Heap:</strong> In a Max-Heap the key present at the root node must be the greatest among the keys present at all of its children. The same property must be recursively true for all sub-trees in that Binary Tree.

<strong>Min-Heap:</strong> In a Min-Heap the key present at the root node must be the minimum among the keys present at all of its children. The same property must be recursively true for all sub-trees in that Binary Tree.
</pre>

The traversal method used to obtain the array representation is Level Order.

# Applications of Heaps:

Heap Sort-> Uses binary heap to sort the array in O(nlogn) time

Priority Queue: Priority queues can be efficiently implemented using a Binary Heap because it supports insert(), delete(), extractMax() and decreaseKey() in O(logn) time. Binomial Heap and Fibonacci Heap are variations of Binary Heap. These variations also perform union efficiently.

Graph Algorithms: The priority queues are especially used in Graph Algorithms like Dijkstra’s Shortest Path and Prim’s Minimum Spanning Tree.

Order Statistics: A heap can be used to efficiently find the kth smallest (or largest) element in an array.
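As an illustration of the priority queue application, here is a minimal sketch built on the standard `heapq` module. The task names and priorities are made up for the example, and a lower number is assumed to mean higher priority.

```python
import heapq

def drain_by_priority(tasks):
    # tasks: iterable of (priority, name) pairs; lower priority value is served first
    queue = []
    for priority, name in tasks:
        heapq.heappush(queue, (priority, name))   # O(log n) insert
    order = []
    while queue:
        priority, name = heapq.heappop(queue)     # O(log n) extract-min
        order.append(name)
    return order

if __name__ == '__main__':
    tasks = [(2, "write report"), (1, "fix outage"), (3, "refactor")]
    print(drain_by_priority(tasks))
```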

# Operations on Min Heap:

getMin(): It returns the root element of Min Heap. Time Complexity of this operation is O(1).

extractMin(): Removes the minimum element from MinHeap. Time Complexity of this Operation is O(Logn) as this operation needs to maintain the heap property (by calling heapify()) after removing root.

decreaseKey(): Decreases the value of a key. The time complexity of this operation is O(Logn). If the decreased key value of a node is still greater than the key of its parent, then we don’t need to do anything. Otherwise, we need to traverse up to fix the violated heap property.

insert(): Inserting a new key takes O(Logn) time. We add the new key at the end of the tree. If the new key is greater than its parent, then we don’t need to do anything. Otherwise, we need to traverse up to fix the violated heap property.

delete(): Deleting a key also takes O(Logn) time. We replace the key to be deleted with minus infinity by calling decreaseKey(). After decreaseKey(), the minus infinity value must reach the root, so we call extractMin() to remove the key.

In [5]:
from heapq import heapify,heappop,heappush

class MinHeap:
    def __init__(self):
        self.heap=[]
        heapify(self.heap)

    def getMin(self):
        return self.heap[0]

    def insertKey(self,item):
        heappush(self.heap,item)

    def parent(self,i):
        return (i-1)//2

    def extractMin(self):
        return heappop(self.heap)

    def decreaseKey(self,i,newVal):
        self.heap[i]=newVal
        while (i!=0 and self.heap[self.parent(i)]>self.heap[i]):
            parent=self.parent(i)
            self.heap[i],self.heap[self.parent(i)]=(self.heap[self.parent(i)],self.heap[i])
            i=parent
        #heapify(self.heap)

    def deleteKey(self,i):
        self.decreaseKey(i,float('-infinity'))
        #print(self.heap)
        self.extractMin()

if __name__ == '__main__':
    heapObj=MinHeap()
    heapObj.insertKey(1)
    heapObj.insertKey(9)
    heapObj.insertKey(10)
    heapObj.insertKey(5)
    heapObj.insertKey(200)
    # print(heapObj.heap)
    heapObj.decreaseKey(4,0)
    # heapObj.deleteKey(2)
    #print(heapObj.getMin())
    #print(heapObj.heap)
    print(heapObj.heap)


[0, 1, 10, 9, 5]


# Binomial Heap

The main application of Binary Heap is in implementing priority queue.

Binomial Heap is an extension of Binary Heap that provides faster union or merge operation together with other operations provided by Binary Heap.

A Binomial Heap is a collection of Binomial Trees

<pre>
What is a Binomial Tree?
A Binomial Tree of order 0 has 1 node. A Binomial Tree of order k can be constructed by taking two binomial trees of order k-1 and making one the leftmost child of the other.
A Binomial Tree of order k has following properties.
a) It has exactly 2^k nodes.
b) It has depth as k.
c) There are exactly kCi nodes at depth i for i = 0, 1, . . . , k.
d) The root has degree k and children of root are themselves Binomial Trees with order k-1, k-2,.. 0 from left to right.
</pre>
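Properties (a) and (c) can be checked against each other numerically: the per-depth counts kCi must add up to 2^k. A minimal sketch, with an illustrative function name:

```python
from math import comb

def nodes_per_depth(k):
    # Number of nodes at depth 0, 1, ..., k in a Binomial Tree of order k
    return [comb(k, i) for i in range(k + 1)]

if __name__ == '__main__':
    for k in range(5):
        counts = nodes_per_depth(k)
        # The counts per depth must sum to the total node count 2^k
        print(k, counts, sum(counts) == 2 ** k)
```

For k = 3 this gives 1, 3, 3 and 1 nodes at depths 0 to 3, for 8 nodes in total.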

<pre>
k = 0 (Single Node)

 o

k = 1 (2 nodes) 
[We take two k = 0 order Binomial Trees, and
make one as child of other]
 o
   \
     o

k = 2 (4 nodes)
[We take two k = 1 order Binomial Trees, and
make one as child of other]
   o
 /   \
o     o
       \
        o

k = 3 (8 nodes)
[We take two k = 2 order Binomial Trees, and
make one as child of other]
    o   
 /  |  \ 
o   o    o
    |   /  \
    o  o    o
             \
              o
</pre>

<strong>Binomial Heap</strong>

A Binomial Heap is a set of Binomial Trees where each Binomial Tree follows Min Heap property. And there can be at most one Binomial Tree of any degree.

<pre>
12------------10--------------------20
             /  \                 /  | \
           15    50             70  50  40
           |                  / |    |     
           30               80  85  65 
                            |
                           100
A Binomial Heap with 13 nodes. It is a collection of 3 
Binomial Trees of orders 0, 2 and 3 from left to right. 
</pre>

<strong>Binary Representation of a number and Binomial Heaps</strong>

A Binomial Heap with n nodes has a number of Binomial Trees equal to the number of set bits in the binary representation of n. For example, let n be 13: there are 3 set bits in the binary representation of n (00001101), hence 3 Binomial Trees. We can also relate the orders of these Binomial Trees to the positions of the set bits. With this relation, we can conclude that there are O(Logn) Binomial Trees in a Binomial Heap with ‘n’ nodes.
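The relation between set bits and tree orders can be sketched in a few lines. The function name is illustrative; bit position k being set is taken to mean that the heap contains a Binomial Tree of order k.

```python
def binomial_tree_orders(n):
    # Orders of the Binomial Trees in a Binomial Heap holding n nodes:
    # one tree of order k for every set bit k in the binary form of n
    return [k for k in range(n.bit_length()) if (n >> k) & 1]

if __name__ == '__main__':
    print(binomial_tree_orders(13))  # 13 = 0b1101
```

For n = 13 the orders are 0, 2 and 3, matching the 13-node example heap, and the tree sizes 1 + 4 + 8 add up to 13.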

<strong>Operations of Binomial Heap</strong>

The main operation in Binomial Heap is union(), all other operations mainly use this operation. The union() operation is to combine two Binomial Heaps into one.

insert(H, k): Inserts a key ‘k’ to Binomial Heap ‘H’. This operation first creates a Binomial Heap with single key ‘k’, then calls union on H and the new Binomial heap.

getMin(H): A simple way to getMin() is to traverse the list of root of Binomial Trees and return the minimum key. This implementation requires O(Logn) time(because there will be logn binomial trees for n nodes, so total logn nodes to be checked). It can be optimized to O(1) by maintaining a pointer to minimum key root.

extractMin(H): This operation also uses union(). We first call getMin() to find the minimum key Binomial Tree, then we remove the node and create a new Binomial Heap by connecting all subtrees of the removed minimum node. Finally, we call union() on H and the newly created Binomial Heap. This operation requires O(Logn) time.

delete(H): Like Binary Heap, delete operation first reduces the key to minus infinite, then calls extractMin().

decreaseKey(H): decreaseKey() is also similar to Binary Heap. We compare the decreased key with its parent, and if the parent’s key is greater, we swap the keys and recur for the parent. We stop when we either reach a node whose parent has a smaller key or we hit the root node. Time complexity of decreaseKey() is O(Logn).

<strong>Union operation in Binomial Heap</strong>

<pre>
Given two Binomial Heaps H1 and H2, union(H1, H2) creates a single Binomial Heap.

1) The first step is to simply merge the two Heaps in non-decreasing order of degrees. In the following diagram, figure(b) shows the result after merging.

2) After the simple merge, we need to make sure that there is at most one Binomial Tree of any order. To do this, we combine Binomial Trees of the same order. We traverse the list of merged roots while keeping track of three pointers: prev, x and next-x. There can be the following 4 cases when we traverse the list of roots.
   Case 1: Orders of x and next-x are not the same; we simply move ahead.
   In the following 3 cases the orders of x and next-x are the same.
   Case 2: If the order of next-next-x is also the same, move ahead.
   Case 3: If the key of x is smaller than or equal to the key of next-x, make next-x a child of x by linking it with x.
   Case 4: If the key of x is greater, make x a child of next-x.
</pre>
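The following is a simplified sketch of union(), not the pointer-walking procedure described above: it combines trees of equal order the way binary addition propagates a carry, which gives the same result. The class and function names are assumptions made for this example, and children are stored in increasing order of their order (the reverse of the left-to-right layout in the diagrams).

```python
class BinomialTree:
    def __init__(self, key):
        self.key = key
        self.children = []  # children[i] is a Binomial Tree of order i

    @property
    def order(self):
        return len(self.children)

def link(a, b):
    # Combine two trees of the same order into one tree of order + 1,
    # keeping the smaller key at the root (min heap property)
    if b.key < a.key:
        a, b = b, a
    a.children.append(b)
    return a

def union(h1, h2):
    # h1 and h2 are lists of trees, each list holding at most one tree per order
    slots = {}  # order -> tree
    for tree in h1 + h2:
        # While a tree of this order already exists, link the two (the "carry")
        while tree.order in slots:
            tree = link(slots.pop(tree.order), tree)
        slots[tree.order] = tree
    return [slots[k] for k in sorted(slots)]

def insert(heap, key):
    # insert is a union with a single-node heap
    return union(heap, [BinomialTree(key)])

def get_min(heap):
    # Scan the O(log n) roots for the smallest key
    return min(tree.key for tree in heap)

if __name__ == '__main__':
    heap = []
    for key in [12, 10, 15, 50, 30, 20, 70, 50, 40, 80, 85, 65, 100]:
        heap = insert(heap, key)
    print([tree.order for tree in heap])  # [0, 2, 3] for 13 keys
    print(get_min(heap))                  # 10
```

Which keys end up in which tree depends on the insertion order, so the shape matches the 13-node example only in the orders of its trees, not key by key.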

![](https://media.geeksforgeeks.org/wp-content/uploads/Bionomial_tree_2.png)

# Fibonacci Heap

In terms of Time Complexity, Fibonacci Heap beats both Binary and Binomial Heaps.

<pre>
1) Find Min:      Θ(1)     [Same as both Binary and Binomial]
2) Delete Min:    O(Log n) amortized [Θ(Log n) in both Binary and Binomial]
3) Insert:        Θ(1)     [Θ(Log n) in Binary and Θ(1) amortized in Binomial]
4) Decrease-Key:  Θ(1) amortized [Θ(Log n) in both Binary and Binomial]
5) Merge:         Θ(1)     [Θ(m Log n) or Θ(m+n) in Binary and
                            Θ(Log n) in Binomial]
</pre>

Like Binomial Heap, Fibonacci Heap is a collection of trees with min-heap or max-heap property. In a Fibonacci Heap, trees can have any shape; even all trees can be single nodes (this is unlike Binomial Heap, where every tree has to be a Binomial Tree).

![](https://media.geeksforgeeks.org/wp-content/uploads/Fibonacci-Heap.png)

Fibonacci Heap maintains a pointer to minimum value (which is root of a tree). All tree roots are connected using circular doubly linked list, so all of them can be accessed using single ‘min’ pointer.

<pre>
Facts about Fibonacci Heap

The reduced time complexity of Decrease-Key has importance in Dijkstra and Prim algorithms. With Binary Heap, time complexity of these algorithms is O(VLogV + ELogV). If Fibonacci Heap is used, then time complexity is improved to O(VLogV + E)

Although Fibonacci Heap looks promising time complexity wise, it has been found slow in practice as hidden constants are high.

Fibonacci heaps are so named mainly because Fibonacci numbers are used in the running time analysis. Also, every node in a Fibonacci Heap has degree O(log n), and the size of a subtree rooted at a node of degree k is at least F<sub>k+2</sub>, where F<sub>k</sub> is the kth Fibonacci number.
</pre>

# Leftist Tree / Leftist Heap

# K-ary Heap

# HeapSort

In [11]:
from math import floor

def maxHeapify(arr,n,i):
    left=2*i+1
    right=2*i+2
    if left<n and arr[i]<arr[left]:
        largest=left
    else:
        largest=i
    if right<n and arr[right]>arr[largest]:
        largest=right
    if largest!=i:
        arr[largest],arr[i]=arr[i],arr[largest]
        maxHeapify(arr,n,largest)

def buildMaxHeap(arr,n):
    for i in range(floor(n//2)-1,-1,-1):
        maxHeapify(arr,n,i)
    # print(arr)

def heapSort(arr):
    n=len(arr)
    buildMaxHeap(arr,n)
    #get the largest element and store it at the end, now maxheapify the first replaced node with the updated heapsize
    for i in range(n-1,-1,-1):
        arr[0],arr[i]=arr[i],arr[0]
        maxHeapify(arr,i,0)

if __name__ == '__main__':
    arr=[ 12, 11, 13, 5, 6, 7]
    heapSort(arr)
    print(arr)


[5, 6, 7, 11, 12, 13]


Heap sort is an in-place algorithm.

Time Complexity: Time complexity of maxHeapify() is O(Logn). Time complexity of buildMaxHeap() is O(n) and the overall time complexity of Heap Sort is O(nLogn).

Heap sort is not stable: the swaps between the root and the end of the array can change the relative order of equal elements.
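A two-record example is enough to see that heap sort does not preserve the order of equal keys. The sketch below is a keyed variant of heap sort written for this illustration (the name `heap_sort_by_key` and the `key` parameter are assumptions, not part of the code above); records are compared by key only.

```python
def heap_sort_by_key(items, key):
    arr = list(items)
    n = len(arr)

    def sift_down(size, i):
        # Restore the max heap property below index i within arr[:size]
        while True:
            largest = i
            left, right = 2 * i + 1, 2 * i + 2
            if left < size and key(arr[left]) > key(arr[largest]):
                largest = left
            if right < size and key(arr[right]) > key(arr[largest]):
                largest = right
            if largest == i:
                return
            arr[i], arr[largest] = arr[largest], arr[i]
            i = largest

    # Build the max heap, then repeatedly move the root to the end
    for i in range(n // 2 - 1, -1, -1):
        sift_down(n, i)
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        sift_down(end, 0)
    return arr

if __name__ == '__main__':
    records = [(1, 'a'), (1, 'b')]
    print(heap_sort_by_key(records, key=lambda r: r[0]))  # [(1, 'b'), (1, 'a')]
    print(sorted(records, key=lambda r: r[0]))            # [(1, 'a'), (1, 'b')]
```

The first swap of the extraction phase moves the record that was originally first to the end of the array, so the two equal keys come out reversed, whereas the stable built-in `sorted()` keeps them in their original order.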

# Iterative HeapSort

HeapSort is a comparison-based sorting technique: we first build a Max Heap, then repeatedly swap the root element with the last element of the heap (size times), restoring the heap property each time, until the array is sorted.

In [12]:
def buildHeap(arr,n):
    for i in range(n):
        if arr[i]>arr[int((i-1)/2)]:
            j=i
            while arr[j]>arr[int((j-1)/2)]:
                arr[j],arr[int((j-1)/2)]=arr[int((j-1)/2)],arr[j]
                j=int((j-1)/2)

def heapSort(arr,n):
    buildHeap(arr,n)
    # print(arr)
    for i in range(n-1,0,-1):
        arr[0],arr[i]=arr[i],arr[0]
        j,index=0,0
        while True:
            index=2*j+1
            # if left child is less than right child
            if index<i-1 and arr[index]<arr[index+1]:
                index+=1
            # if parent is less than the child
            if index<i and arr[index]>arr[j]:
                arr[index],arr[j]=arr[j],arr[index]

            j=index
            if index>=i:
                break

if __name__ == '__main__':
    arr=[10,20,15,17,9,21]
    n=len(arr)
    heapSort(arr,n)
    print(arr)


[9, 10, 15, 17, 20, 21]


Time Complexity - O(nlogn)

# K largest(or smallest) elements in an array

Naive Approach -> Sort the array and then return the k elements. Time Complexity O(nlogn)

Approach 1-> Use Max heap

In [13]:
from heapq import _heapify_max, _heappop_max

def getKLargest(arr,k):
    n=len(arr)
    _heapify_max(arr)
    for i in range(k):
        value=_heappop_max(arr)
        print(value,end=" ")

if __name__ == '__main__':
    arr=[1, 23, 12, 9, 30, 2, 50]
    k=3
    result=getKLargest(arr,k)


50 30 23 

Time Complexity -> O(n + klogn) (O(n) to build the max heap, then k extractions that each re-heapify in O(logn))

Approach 2-> Use Min heap. Maintain a heap of size k, then for each element greater than the root of the heap, include that value as root of the heap and heapify

In [14]:
from heapq import heapify,heapreplace

def getKLargest(arr,k):
    # min heap holding the k largest elements seen so far
    heapArr=[]
    for i in range(k):
        heapArr.append(arr[i])
    heapify(heapArr)
    for i in range(k,len(arr)):
        if arr[i]>heapArr[0]:
            # replace the root and sift down in a single O(log k) step
            heapreplace(heapArr,arr[i])
    print(" ".join(map(str,heapArr)))

if __name__ == '__main__':
    arr=[1, 23, 12, 9, 30, 2, 50]
    k=3
    result=getKLargest(arr,k)


23 50 30


Time Complexity -> O(k + (n-k)logk), provided each root replacement costs O(logk) (a single sift-down, e.g. heapq.heapreplace(), rather than a full heapify()). The result is not in sorted order; if it must be sorted, an extra O(klogk) time is required.
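For comparison, the standard library already offers this operation as `heapq.nlargest`, which returns the k largest elements in descending order. A minimal sketch using the same input (the wrapper name `k_largest` is illustrative):

```python
import heapq

def k_largest(arr, k):
    # Returns the k largest elements, largest first, without modifying arr
    return heapq.nlargest(k, arr)

if __name__ == '__main__':
    arr = [1, 23, 12, 9, 30, 2, 50]
    print(k_largest(arr, 3))  # [50, 30, 23]
```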

# K’th Smallest/Largest Element in Unsorted Array

Naive Approach -> Sort the array and return the kth element. Time Complexity O(nlogn)

In [16]:
from heapq import heapify,heappop

def getKSmallestElement(arr,k):
    heapify(arr)
    for i in range(k):
        value=heappop(arr)

    return value

if __name__ == '__main__':
    arr=[7, 10, 4, 3, 20, 15]
    k=3
    result=getKSmallestElement(arr,k)
    print(result)


7


Time Complexity -> O(n +klogn)

Another Approach -> Using max heap

In [17]:
from heapq import _heapify_max, _heappop_max

def getKSmallestElement(arr,k):

    heapArr=[]
    for i in range(k):
        heapArr.append(arr[i])
    _heapify_max(heapArr)

    for i in range(k,len(arr)):
        if arr[i]<heapArr[0]:
            heapArr[0]=arr[i]
            _heapify_max(heapArr)

    return heapArr[0]


if __name__ == '__main__':
    arr=[7, 10, 4, 3, 20, 15]
    k=3
    result=getKSmallestElement(arr,k)
    print(result)


7


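Time Complexity -> O(k + (n-k)logk) when each root replacement is followed by a single O(logk) sift-down. The code above calls _heapify_max() on the whole heap after each replacement, which costs O(k) per replacement, i.e. O(k + (n-k)k) overall.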

# Sort a nearly sorted (or K sorted) array

Given an array of n elements, where each element is at most k away from its target position, devise an algorithm that sorts in O(n log k) time. For example, let us consider k is 2, an element at index 7 in the sorted array, can be at indexes 5, 6, 7, 8, 9 in the given array.

In [18]:
from heapq import heapify,heappop,heappush

def sortNearlySortedArray(arr,k):
    heapArr=[]

    # size k+1 because element is having its actual location atmost k positions away
    for i in range(k+1):
        heapArr.append(arr[i])

    heapify(heapArr)

    index=0
    for i in range(k+1,len(arr)):
        arr[index]=heappop(heapArr)
        heappush(heapArr,arr[i])
        index+=1

    while heapArr:
        arr[index]=heappop(heapArr)
        index+=1

if __name__ == '__main__':
    arr=[6, 5, 3, 2, 8, 10, 9]
    k=3
    sortNearlySortedArray(arr,k)
    print(arr)


[2, 3, 5, 6, 8, 9, 10]


Time Complexity -> O(k +(n-k)logk)

Another Approach-> Using insertion sort

In [19]:
def sortNearlySortedArray(arr,k):
    for i in range(len(arr)):
        j=i-1
        key=arr[i]
        # this will run atmost k times
        while j>=0 and arr[j]>key:
            arr[j+1]=arr[j]
            j-=1
        arr[j+1]=key


if __name__ == '__main__':
    arr=[6, 5, 3, 2, 8, 10, 9]
    k=3
    sortNearlySortedArray(arr,k)
    print(arr)


[2, 3, 5, 6, 8, 9, 10]


Time Complexity-> O(nk), the inner loop will run at most k times and the outer loop will run n times

We can also use a Balanced Binary Search Tree instead of a Heap to store K+1 elements. The insert and delete operations on a Balanced BST also take O(Logk) time. So the Balanced BST based method will also take O(nLogk) time, but the Heap based method is simpler in practice as the minimum element is always at the root. Also, a Heap doesn’t need extra space for left and right pointers.

# Tournament Tree TODO

# Check if a given Binary Tree is Heap

<pre>
Given a binary tree, check whether it has the heap property. The binary tree must fulfill the following two conditions to be a heap:

It should be a complete tree (i.e. all levels except the last should be full, and the last level is filled from left to right).
Every node’s value should be greater than or equal to the values of its child nodes (considering max-heap).
</pre>

In [1]:
class Node:
    def __init__(self,data):
        self.data=data
        self.left=None
        self.right=None

def isComplete(root,index,total):
    if root is None:
        return True
    if index>=total:
        return False
    return isComplete(root.left,2*index+1,total) and isComplete(root.right,2*index+2,total)

def countNodes(root):
    if root is None:
        return 0
    return 1+countNodes(root.left)+countNodes(root.right)

def doesSatisfyProperty(root):
    if root.left is None and root.right is None:
        return True
    if root.right is None:
        # in a complete tree a lone left child is always a leaf
        return root.data>=root.left.data
    else:
        # equal keys are allowed, matching the "greater than or equal" rule
        if root.data>=root.left.data and root.data>=root.right.data:
            return doesSatisfyProperty(root.left) and doesSatisfyProperty(root.right)
        else:
            return False

def isHeap(root):
    if root is None:
        return True
    n=countNodes(root)
    return isComplete(root,0,n) and doesSatisfyProperty(root)


if __name__ == '__main__':
    root = Node(5)
    root.left = Node(2)
    root.right = Node(3)
    root.left.left = Node(1)
    print(isHeap(root))


True


Time Complexity ->O(n)

Approach 2-> Level Order Traversal

In [1]:
class Node:
    def __init__(self,data):
        self.data=data
        self.left=None
        self.right=None


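from collections import deque

def isBinaryHeap(root):
    if root is None:
        return True
    # deque gives O(1) pops from the left, so the traversal stays O(n)
    queue=deque()
    queue.append(root)
    leavesStart=False
    while queue:
        ele=queue.popleft()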
        if ele.left:
            if leavesStart or ele.left.data>ele.data:
                return False
            queue.append(ele.left)
        else:
            leavesStart=True
        if ele.right:
            if leavesStart or ele.right.data>ele.data:
                return False
            queue.append(ele.right)
        else:
            leavesStart=True
    return True

if __name__ == '__main__':
    root=Node(97)
    root.left=Node(46)
    root.right=Node(37)
    root.left.left=Node(12)
    root.left.right=Node(3)
    root.right.left=Node(7)
    root.right.right=Node(31)
    # root.left.left.left=Node(6)
    # root.left.left.right=Node(9)
    root.left.right.left=Node(6)
    root.left.right.right=Node(9)
    print(isBinaryHeap(root))


False


Time Complexity O(n)

# How to check if a given array represents a Binary Heap?

In [2]:
def isBinaryHeap(arr,n):
    for i in range(n//2):
        left=2*i+1
        right=2*i+2
        if (left<n and arr[left]>arr[i]) or (right<n and arr[right]>arr[i]):
            return False
    return True

if __name__ == '__main__':
    arr=[90, 15, 10, 7, 12, 2]
    # arr=[9, 15, 10, 7, 12, 11]
    n=len(arr)
    print(isBinaryHeap(arr,n))


True


Time Complexity -> O(n)

# Connect n ropes with minimum cost

<pre>
Given n ropes of different lengths, we need to connect these ropes into one rope. The cost to connect two ropes is equal to the sum of their lengths. We need to connect the ropes with minimum cost.

For example, if we are given 4 ropes of lengths 4, 3, 2, and 6. We can connect the ropes in the following ways.
1) First, connect ropes of lengths 2 and 3. Now we have three ropes of lengths 4, 6, and 5.
2) Now connect ropes of lengths 4 and 5. Now we have two ropes of lengths 6 and 9.
3) Finally connect the two ropes and all ropes are connected.

Total cost for connecting all ropes is 5 + 9 + 15 = 29.
</pre>

In [3]:
from heapq import heapify,heappop,heappush

def connectRopes(arr,n):
    totalCost=0
    heapify(arr)
    while len(arr)>1:
        partA=heappop(arr)
        partB=heappop(arr)
        total=partA+partB
        totalCost+=total
        heappush(arr,total)
        # print(arr)
    # print(totalCost)
    return totalCost

if __name__ == '__main__':
    arr=[4, 3, 2, 6]
    n=len(arr)
    # print(arr)
    result=connectRopes(arr,n)
    print(result)


29


Time Complexity -> O(nlogn)

Approach used-> Maintain a min heap and take out 2 minimum cost ropes, join the ropes and push the joined rope in the heap back again

# Design an efficient data structure for given operations


Design a Data Structure for the following operations. The data structure should be efficient enough to accommodate the operations according to their frequency.

1) findMin() : Returns the minimum item.
   Frequency: Most frequent

2) findMax() : Returns the maximum item.
    Frequency: Most frequent

3) deleteMin() : Delete the minimum item.
    Frequency: Moderate frequent 

4) deleteMax() : Delete the maximum item.
    Frequency: Moderate frequent 

5) Insert() : Inserts an item.
    Frequency: Least frequent

6) Delete() : Deletes an item.
    Frequency: Least frequent. 

Maintain a sorted array where the smallest element is at the first position and the largest element is at the last. The time complexity of findMin(), findMax() and deleteMax() is O(1). But the time complexities of deleteMin() (deleting the first element requires shifting, which takes O(n) time), insert() and delete() will be O(n).

We can do the most frequent two operations in O(1) and other operations in O(Logn) time

The idea is to use two binary heaps (one max and one min heap). The main challenge is, while deleting an item, we need to delete from both min-heap and max-heap. So, we need some kind of mutual data structure. In the following design, we have used doubly linked list as a mutual data structure. The doubly linked list contains all input items and indexes of corresponding min and max heap nodes. The nodes of min and max heaps store addresses of nodes of doubly linked list. The root node of min heap stores the address of minimum item in doubly linked list. Similarly, root of max heap stores address of maximum item in doubly linked list. Following are the details of operations.

1) findMax(): We get the address of maximum value node from root of Max Heap. So this is a O(1) operation.

2) findMin(): We get the address of minimum value node from root of Min Heap. So this is a O(1) operation.

3) deleteMin(): We get the address of minimum value node from root of Min Heap. We use this address to find the node in doubly linked list. From the doubly linked list, we get node of Max Heap. We delete node from all three. We can delete a node from doubly linked list in O(1) time. delete() operations for max and min heaps take O(Logn) time.

4) Insert(): We always insert at the beginning of linked list in O(1) time. Inserting the address in Max and Min Heaps take O(Logn) time. So overall complexity is O(Logn)

5) Delete(): We first search the item in Linked List. Once the item is found in O(n) time, we delete it from linked list. Then using the indexes stored in linked list, we delete it from Min Heap and Max Heaps in O(Logn) time. So overall complexity of this operation is O(n). 
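The sketch below is a simplified variant of this design, not the doubly linked list version described above: it keeps the two heaps but ties them together with a shared set of live ids and deletes lazily, so stale entries are skipped when they reach a root. The class and method names are assumptions, values are assumed to be numeric (the max heap stores negated values), and the general Delete() of an arbitrary item is left out.

```python
import heapq
import itertools

class MinMaxStore:
    def __init__(self):
        self._min_heap = []              # entries are (value, uid)
        self._max_heap = []              # entries are (-value, uid)
        self._alive = set()              # uids of items not yet deleted
        self._ids = itertools.count()    # unique id per inserted item

    def insert(self, value):
        uid = next(self._ids)
        self._alive.add(uid)
        heapq.heappush(self._min_heap, (value, uid))
        heapq.heappush(self._max_heap, (-value, uid))

    def _prune(self, heap):
        # Drop entries at the root that were already deleted via the other heap
        while heap and heap[0][1] not in self._alive:
            heapq.heappop(heap)

    def find_min(self):
        self._prune(self._min_heap)
        return self._min_heap[0][0]

    def find_max(self):
        self._prune(self._max_heap)
        return -self._max_heap[0][0]

    def delete_min(self):
        self._prune(self._min_heap)
        value, uid = heapq.heappop(self._min_heap)
        self._alive.discard(uid)         # the max heap copy becomes stale
        return value

    def delete_max(self):
        self._prune(self._max_heap)
        negated, uid = heapq.heappop(self._max_heap)
        self._alive.discard(uid)         # the min heap copy becomes stale
        return -negated

if __name__ == '__main__':
    store = MinMaxStore()
    for value in [5, 1, 9, 3]:
        store.insert(value)
    print(store.find_min(), store.find_max())      # 1 9
    print(store.delete_min(), store.delete_max())  # 1 9
    print(store.find_min(), store.find_max())      # 3 5
```

With lazy deletion each entry is pushed and popped at most once per heap, so findMin() and findMax() are O(1) amortized rather than worst case; the linked-list design trades more bookkeeping for the worst-case bound.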

In [5]:
# Implementation

# Merge k sorted arrays

Given k sorted arrays of size n each, merge them and print the sorted output.

One simple solution can be to create the output array of size n* k, put all the elements of k lists into this array and sort this array

Time Complexity -> O(n* k log(n* k)) and Space Complexity -> O(n* k)

Another Approach-> The smallest remaining element is always among the current front elements of the k arrays, so maintain an array holding one candidate from each input array. Find the minimum of the candidates, put it in the output array, and replace it with the next element from the array it came from.

In [7]:
def mergeKSortedArrays(arr,k):
    n=len(arr[0])
    output=[None]*(n*k)
    nextElement=[None]*k
    for i in range(k):
        nextElement[i]=arr[i][0]

    nextElementIndex=[0]*k
    # print(output)
    for i in range(n*k):
        nextEle=min(nextElement)
        output[i]=nextEle
        for j in range(k):
            if nextEle==nextElement[j]:#arr[j][nextElementIndex[j]]:
                nextElementIndex[j]+=1
                # print(nextElementIndex[j])
                if nextElementIndex[j]<len(arr[j]):
                    nextElement[j]=arr[j][nextElementIndex[j]]
                else:
                    nextElement[j]=float('infinity')
                break
    print(output)

if __name__ == '__main__':
    arr=[[2,6,12,34],[1,9,20,1000],[23,34,90,2000]]
    k=len(arr)
    mergeKSortedArrays(arr,k)


[1, 2, 6, 9, 12, 20, 23, 34, 34, 90, 1000, 2000]


Time Complexity-> O(n* k* (k)) and Space Complexity -> O(n* k)

Another Approach-> Using min heap 

In [8]:
class Node:
    def __init__(self,data,i,j):
        self.data=data
        self.i=i #array from which the data is taken
        self.j=j #next data in the array to be picked

class MinHeap:
    def __init__(self,arr,k):
        self.heapArr=arr
        self.heapSize=k
        for i in range((self.heapSize//2)-1,-1,-1):
            self.heapify(i)

    def heapify(self,i):
        left=2*i+1
        right=2*i+2
        if left<self.heapSize and self.heapArr[left].data<self.heapArr[i].data:
            smallest=left
        else:
            smallest=i
        if right<self.heapSize and self.heapArr[right].data<self.heapArr[smallest].data:
            smallest=right
        if smallest!=i:
            self.heapArr[smallest],self.heapArr[i]=self.heapArr[i],self.heapArr[smallest]
            self.heapify(smallest)

    def getMin(self):
        # print(minheap.heapArr[0])
        return self.heapArr[0]

def mergeKSortedArrays(arr,k):
    resultSize=0
    heapArr=[]
    for i in range(len(arr)):
        resultSize+=len(arr[i])
        heapNode=Node(arr[i][0],i,1)
        heapArr.append(heapNode)
    minheap=MinHeap(heapArr,k)
    result=[0]*(resultSize)
    for i in range(resultSize):
        root=minheap.getMin()
        # print(root.data)
        result[i]=root.data
        if root.j<len(arr[root.i]):
            root.data=arr[root.i][root.j]
            root.j+=1
        else:
            root.data=float('infinity')
        minheap.heapArr[0]=root
        minheap.heapify(0)
    print(result)

if __name__ == '__main__':
    arr=[[2, 6, 12, 34], [1, 9, 20, 1000],   [23, 34, 90, 2000] ]
    mergeKSortedArrays(arr,len(arr))


[1, 2, 6, 9, 12, 20, 23, 34, 34, 90, 1000, 2000]


Time Complexity -> O(n* k logk)
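For comparison, the standard library provides `heapq.merge`, which lazily merges already-sorted inputs. A minimal sketch on the same data (the wrapper name is illustrative):

```python
import heapq

def merge_sorted_arrays(arrays):
    # heapq.merge yields the items of all sorted inputs in sorted order
    return list(heapq.merge(*arrays))

if __name__ == '__main__':
    arr = [[2, 6, 12, 34], [1, 9, 20, 1000], [23, 34, 90, 2000]]
    print(merge_sorted_arrays(arr))
```

Unlike the hand-written version, this does not need an infinity sentinel and also handles input arrays of different lengths.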

# Merge Sort Tree for Range Order Statistics

Prerequisite-> Segment Tree

# Sort numbers stored on different machines

Same as merge k sorted lists

# Smallest Derangement of Sequence

Given the sequence   S = {1, 2, 3 .... N}   find the lexicographically smallest (earliest in dictionary order) derangement of  S  .

A derangement of S is any permutation of S in which no element appears at the same position as it does in S.
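The definition can be turned into a small brute-force reference, useful for checking faster solutions on small N. This is a sketch with illustrative function names; it relies on `itertools.permutations` producing permutations in lexicographic order when the input is sorted, and it is only practical for small N because it may scan up to N! permutations.

```python
from itertools import permutations

def is_derangement(perm):
    # True if no value sits at its own 1-based position
    return all(value != position for position, value in enumerate(perm, start=1))

def smallest_derangement_bruteforce(n):
    # Permutations of a sorted range come out in lexicographic order,
    # so the first derangement found is the smallest one
    for perm in permutations(range(1, n + 1)):
        if is_derangement(perm):
            return list(perm)
    return None  # n = 1 has no derangement

if __name__ == '__main__':
    print(smallest_derangement_bruteforce(5))  # [2, 1, 4, 5, 3]
```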

Since we are interested in generating smallest derangement, we start putting smaller elements in more significant positions.

Start from left, at any position  i  place the next smallest element among the values of the sequence which have not yet been placed in positions before  i .

In [9]:
from heapq import heapify,heappop,heappush

def getSmallestDerangement(n):
    sequence=list(range(1,n+1))
    result=[0]*n
    heapArr=[]
    for i in sequence:
        heapArr.append(i)
    heapify(heapArr)

    for i in range(n):
        value=heappop(heapArr)
        if value!=sequence[i] or i==n-1:
            result[i]=value
        else:
            result[i]=heappop(heapArr)
            heappush(heapArr,value)
    if result[n-1]==sequence[n-1]:
        result[n-1],result[n-2]=result[n-2],result[n-1]

    print(result)

if __name__ == '__main__':
    n=5
    getSmallestDerangement(n)


[2, 1, 4, 5, 3]


Time Complexity -> O(n +nlogn)

Another Approach

Since we are given a very specific sequence, i.e. S<sub>i</sub> = i for all i ≤ N, we can calculate the answer even more efficiently.

Divide the original sequence into pairs of two elements, and then swap the elements of each pair.

If N is odd, the last element is left without a partner, so it is swapped with the element before it.

![](https://media.geeksforgeeks.org/wp-content/uploads/Derrangements-Algo-Even-1.png)

![](https://media.geeksforgeeks.org/wp-content/uploads/Derrangements-Algo-Odd-1.png)

Complexity: We perform at most N/2 + 1 swaps, so the complexity is O(N).

Why does this method work
At position i we already know the least element that can be put, which is either i+1 or i-1. Since we are already given the least permutation of S it is clear that the derangement must start from 2 and not 1 ie of the form i+1 (i = 1). The next element will be of the form i – 1 . The element after this will be i + 1 and then next i – 1. This pattern will continue until the end.

In [11]:
def getSmallestDerangement(n):
    sequence=list(range(1,n+1))
    result=[0]*n
    for i in range(n-1):
        if i%2==0:
            result[i]=sequence[i+1]
        else:
            result[i]=sequence[i-1]

    if n%2!=0:
        result[n-1]=result[n-2]
        result[n-2]=sequence[n-1]
    else:
        result[n-1]=sequence[n-2]


    print(result)

if __name__ == '__main__':
    n=6
    getSmallestDerangement(n)


[2, 1, 4, 3, 6, 5]


# Largest Derangement of a Sequence

Same approach as the smallest derangement, with a max heap in place of the min heap

In [1]:
from heapq import heapify,heappop,heappush,_heapify_max,_heappop_max

def getLargestDerangement(n):
    sequence=list(range(1,n+1))
    # sequence=[5, 4, 3, 2, 1]
    result=[0]*n
    heapArr=[]
    for i in sequence:
        heapArr.append(i)
    _heapify_max(heapArr)

    for i in range(n):
        value=_heappop_max(heapArr)
        if value!=sequence[i] or i==n-1:
            result[i]=value
        else:
            result[i]=_heappop_max(heapArr)
            heapArr.append(value)
            _heapify_max(heapArr)
    if result[n-1]==sequence[n-1]:
        result[n-1],result[n-2]=result[n-2],result[n-1]

    print(result)

if __name__ == '__main__':
    n=5
    getLargestDerangement(n)


[5, 4, 2, 3, 1]


Time Complexity-> O(n + nlogn), i.e. O(nlogn)

# K maximum sum combinations from two arrays

Given two equally sized arrays (A, B) and N (size of both arrays).

A sum combination is made by adding one element from array A and another element of array B. Display the maximum K valid sum combinations from all the possible sum combinations.

Naive Approach-> Put all the sum combinations into a max heap and pop the largest element k times

In [2]:
from heapq import _heapify_max,_heappop_max

def getMaxSumCombinations(arr1,arr2,n,k):
    heapArray=[]
    for i in range(n):
        for j in range(n):
            heapArray.append(arr1[i]+arr2[j])
    # print(heapArray)
    _heapify_max(heapArray)
    count=0
    while count<k:
        largest=_heappop_max(heapArray)
        print(largest,end=" ")
        count+=1


if __name__ == '__main__':
    arr1=[4, 2, 5, 1]
    arr2=[8, 0, 3, 5]
    k=3
    getMaxSumCombinations(arr1,arr2,len(arr1),k)


13 12 10 

In [1]:
def getSumCombinations(a,b,k):
    combinations=[]
    for i in range(len(a)):
        for j in range(len(b)):
            combinations.append(a[i]+b[j])
    combinations.sort(reverse=True)
    for i in range(k):
        print(combinations[i])

if __name__ == '__main__':
    a=[4, 2, 5, 1]
    b=[8, 0, 3, 5]
    k=3
    getSumCombinations(a,b,k)
    print()
    a=[3,2]
    b=[1,4]
    k=2
    getSumCombinations(a,b,k)


13
12
10

7
6


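Time Complexity-> O(n* n) to generate all the sums. Sorting them costs O(n^2 logn); building a heap over them instead costs O(n^2), plus O(klogn) for the k pops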

Another Approach-> 

<pre>
Limiting our search space using a max heap (priority queue)

1. Sort both arrays A and B.

2. Create a max heap (priority queue) to store the sum combinations along with the indices of the elements from A and B that make up each sum. The heap is ordered by the sum.

3. Initialize the heap with the maximum possible sum combination, i.e. the tuple (A[N-1] + B[N-1], N-1, N-1), where N is the size of the arrays.

4. Pop the heap to get the current largest sum along with the indices of the elements that make up the sum. Let the tuple be (sum, i, j). Repeat this step K times.

4.1. After each pop, insert (A[i-1] + B[j], i-1, j) and (A[i] + B[j-1], i, j-1) into the max heap, but only if the index pair (i-1, j) or (i, j-1) is valid (non-negative) and has not already been inserted. A set of visited index pairs handles this check.
</pre>

In [3]:
from queue import PriorityQueue

def getMaxSumCombinations(arr1,arr2,n,k):
    arr1.sort()
    arr2.sort()
    # print(arr1)
    # print(arr2)
    s=set()
    largest=(arr1[n-1]+arr2[n-1],n-1,n-1)
    pQueue=PriorityQueue()
    pQueue.put(((-1)*largest[0],largest))
    # print(largest)
    s.add((largest[1],largest[2]))
    for i in range(k):
        item=pQueue.get()
        # print(item)
        i=item[1][1]
        j=item[1][2]
        print(item[1][0],end=" ")
        if i>0:
            temp1=(arr1[i-1]+arr2[j],i-1,j)
            if (i-1,j) not in s:
                pQueue.put((-1*temp1[0],temp1))
                s.add((i-1,j))
        if j>0:
            temp2=(arr1[i]+arr2[j-1],i,j-1)
            if ((i,j-1)) not in s:
                pQueue.put((-1*temp2[0],temp2))
                s.add((i,j-1))
    # print(s)


if __name__ == '__main__':
    arr1=[4, 2, 5, 1]
    arr2=[8, 0, 3, 5]
    k=9
    getMaxSumCombinations(arr1,arr2,len(arr1),k)


13 12 10 10 9 9 8 7 7 

Time Complexity-> O(nlogn) for sorting plus O(klogk) for the k pops and at most 2k pushes; assuming k<=N this is O(nlogn) overall

# Maximum distinct elements after removing k elements

Given an array arr[] containing n elements. The problem is to find maximum number of distinct elements (non-repeating) after removing k elements from the array.
Note: 1 <= k <= n.

In [4]:
def getMaxDistinctElements(arr,k):
    n=len(arr)
    map={}
    for i in range(len(arr)):
        if arr[i] in map:
            k-=1
        else:
            map[arr[i]]=1
        if k==0:
            break
    if i+1<n:
        for j in range(i+1,n):
            if arr[j] in map:
                map[arr[j]]+=1
            else:
                map[arr[j]]=1
    count=0
    for i in map:
        if map[i]==1:
            count+=1
    if k!=0:
        return count-k
    return count


if __name__ == '__main__':
    arr=[5, 7, 5, 5, 1, 2, 2]
    # arr=[1, 2, 3, 4, 5, 6, 7]
    # arr=[1,2,2,2]
    k=3
    result=getMaxDistinctElements(arr,k)
    print(result)


4


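Approach -> Using Priority Queue (max heap keyed on frequency): remove one occurrence of the most frequent value k times, then count the values that still occur at least once. Note that this counts a value that still repeats (e.g. 2 in [1, 2, 2]) as distinct, whereas the previous function counts only values left with a single occurrence.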

In [2]:
from queue import PriorityQueue

def findMaxDistinctElements(arr,k):
    freqDict={}
    for i in arr:
        if i in freqDict:
            freqDict[i]+=1
        else:
            freqDict[i]=1
    pQueue=PriorityQueue()
    for i in freqDict:
        pQueue.put((-1*freqDict[i],i))
    for i in range(k):
        element=pQueue.get()
        freqDict[element[1]]-=1
        if freqDict[element[1]]>0:
            pQueue.put((-1*freqDict[element[1]],element[1]))
    count=0
    for i in freqDict:
        if freqDict[i]>0:
            count+=1
    return count

if __name__ == '__main__':
    arr=[5, 7, 5, 5, 1, 2, 2]
    k=3
    print(findMaxDistinctElements(arr,k))
    arr=[1, 2, 3, 4, 5, 6, 7]
    k=5
    print(findMaxDistinctElements(arr,k))
    arr=[1, 2, 2, 2]
    k=1
    print(findMaxDistinctElements(arr,k))


4
2
2


Time Complexity -> O(n) to count the frequencies, O(dlogd) to fill the priority queue and O(klogd) for the k removals, where d is the number of distinct elements

# Maximum difference between two subsets of m elements

Given an array of n integers and a number m, find the maximum possible difference between two sets of m elements chosen from given array.

In [5]:
def getMaxDiffSubset(arr,n,k):
    arr.sort()
    maxSubSum=0
    minSubSum=0
    j=n-1
    for i in range(k):
        minSubSum+=arr[i]
        maxSubSum+=arr[j]
        j-=1
    return maxSubSum-minSubSum

if __name__ == '__main__':
    arr=[5, 8, 11, 40, 15]
    # arr=[1, 2, 3, 4, 5]
    n=len(arr)
    k=2
    result=getMaxDiffSubset(arr,n,k)
    print(result)


42


Time Complexity -> O(nlogn)

Another Approach-> Using minheap and maxheap

In [6]:
from heapq import heapify,heappop,heappush,_heapify_max,_heappop_max

def getMaxDiffSubset(arr,n,k):
    maxHeap=[]#for minSubSum
    minHeap=[]#for maxSubSum
    for i in range(k):
        maxHeap.append(arr[i])
        minHeap.append(arr[i])
    _heapify_max(maxHeap)
    heapify(minHeap)
    for i in range(k,n):
        if arr[i]>minHeap[0]:
            minHeap[0]=arr[i]
            heapify(minHeap)
        if arr[i]<maxHeap[0]:
            maxHeap[0]=arr[i]
            _heapify_max(maxHeap)
    # print(minHeap)
    # print(maxHeap)
    return sum(minHeap)-sum(maxHeap)

if __name__ == '__main__':
    arr=[5, 8, 11, 40, 15]
    # arr=[1, 2, 3, 4, 5]
    n=len(arr)
    k=2
    result=getMaxDiffSubset(arr,n,k)
    print(result)


42


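Time Complexity-> O(k + (n-k)* logk) if each replacement is followed by a single sift-down (for example heapq.heapreplace). The code above re-heapifies the whole heap after each replacement, which costs O(k) per replacement, i.e. O((n-k)* k) in the worst case. Space Complexity-> O(k)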
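The same two-heap idea can be sketched more compactly with the standard library, which keeps a bounded heap internally. This is only an illustration: the function name `max_diff_subsets` is hypothetical, and it assumes 1 <= m <= len(arr).

```python
from heapq import nlargest, nsmallest

def max_diff_subsets(arr, m):
    # hypothetical helper: sum of the m largest values minus sum of the m smallest
    return sum(nlargest(m, arr)) - sum(nsmallest(m, arr))
```

For arr=[5, 8, 11, 40, 15] and m=2 this gives (40+15)-(5+8) = 42, matching the output above.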

# Height of a complete binary tree (or Heap) with N nodes

Height of a complete binary tree with N nodes is ceil(log2(N+1)) - 1, which is the same as floor(log2(N)).

In [7]:
import math 
def height(N): 
    return math.ceil(math.log2(N + 1)) - 1
 
N = 6
print(height(N)) 

2
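Since the height equals floor(log2(N)), it can also be computed with integer arithmetic only, which avoids floating point rounding for very large N. A minimal sketch, assuming N >= 1; the name `height_int` is hypothetical.

```python
def height_int(n):
    # n.bit_length() is the number of bits needed to write n in binary,
    # so bit_length() - 1 equals floor(log2(n)) for n >= 1
    return n.bit_length() - 1
```

For N = 6 this returns 2, the same value as the formula above.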


# Heap Sort for decreasing order using min heap

In [8]:
def heapify(arr,i,n):
    left=2*i+1
    right=2*i+2
    if left<n and arr[left]<arr[i]:
        smallest=left
    else:
        smallest=i
    if right<n and arr[right]<arr[smallest]:
        smallest=right
    if smallest!=i:
        arr[smallest],arr[i]=arr[i],arr[smallest]
        heapify(arr,smallest,n)

def buildHeap(arr,n):
    for i in range(n//2-1,-1,-1):
        heapify(arr,i,n)

def heapsort(arr,n):
    buildHeap(arr,n)

    for i in range(n-1,-1,-1):
        arr[0],arr[i]=arr[i],arr[0]
        heapify(arr,0,i)

if __name__ == '__main__':
    arr=[4, 6, 3, 2, 9]
    n=len(arr)
    heapsort(arr,n)
    print(arr)


[9, 6, 4, 3, 2]


Time Complexity -> O(nlogn)

# Print all nodes less than a value x in a Min Heap.

In [1]:
class Node:
    def __init__(self,data):
        self.data=data
        self.left=None
        self.right=None

def printNodes(root,x):
    if root is None:
        return
    if root.data<x:
        print(root.data,end=" ")
        printNodes(root.left,x)
        printNodes(root.right,x)


if __name__ == '__main__':
    root=Node(2)
    root.left=Node(3)
    root.right=Node(15)
    root.left.left=Node(5)
    root.left.right=Node(4)
    root.right.left=Node(45)
    root.right.right=Node(80)
    root.left.left.left=Node(6)
    root.left.left.right=Node(150)
    root.left.right.left=Node(77)
    root.left.right.right=Node(120)
    printNodes(root,80)


2 3 5 6 4 77 15 45 

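Time Complexity -> O(n) in the worst case (every node is smaller than x); in general only the printed nodes and their children are visited. Space Complexity-> O(h) for the function call stack, where h is the height of the tree (O(logn) for a heap)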

# Median of Stream of Running Integers

Given that integers are being read from a data stream. Find the median of all the elements read so far starting from the first integer till the last integer. This is also called the Median of Running Integers

Median can be defined as the element in the data set which separates the higher half of the data sample from the lower half. In other words, we can get the median element as, when the input size is odd, we take the middle element of sorted data. If the input size is even, we pick an average of middle two elements in the sorted stream.

Naive solution is to maintain a sorted array, insert each element in sorted order, and read the median from the middle element(s)

Time Complexity -> O(n^2)

Efficient Solution is to use min heap and max heap to segregate the smaller half and greater half elements

<pre>
Create two heaps: a max heap to maintain the elements of the lower half and a min heap to maintain the elements of the higher half at any point of time.

Take initial value of median as 0.

For every newly read element, insert it into either max heap or min-heap and calculate the median based on the following conditions:

If the size of max heap is greater than the size of min-heap and the element is less than the previous median then pop the top element from max heap and insert into min-heap and insert the new element to max heap else insert the new element to min-heap. Calculate the new median as the average of top of elements of both max and min heap.

If the size of max heap is less than the size of min-heap and the element is greater than the previous median then pop the top element from min-heap and insert into the max heap and insert the new element to min heap else insert the new element to the max heap. Calculate the new median as the average of top of elements of both max and min heap.

If the size of both heaps is the same. Then check if the current is less than the previous median or not. If the current element is less than the previous median then insert it to the max heap and a new median will be equal to the top element of max heap. If the current element is greater than the previous median then insert it to min-heap and new median will be equal to the top element of min heap.
</pre>

In [3]:
from heapq import heapify,heappop,heappush,_heapify_max,_heappop_max

def getMedian(arr,n):
    greaterHalf=[]#minheap
    smallerHalf=[]#maxheap
    # both heaps start empty; seed the max heap with the first element
    smallerHalf.append(arr[0])
    result=[]
    result.append(arr[0])
    for i in range(1,n):
        x=arr[i]
        if len(smallerHalf)>len(greaterHalf):
            if x<smallerHalf[0]:
                smallerHalf.append(x)
                _heapify_max(smallerHalf)
                heappush(greaterHalf,_heappop_max(smallerHalf))
            else:
                heappush(greaterHalf,x)
            result.append((smallerHalf[0]+greaterHalf[0])/2)
        else:
            #extra elements always needs to be put into the smallerhalf
            if x>greaterHalf[0]:
                heappush(greaterHalf,x)
                smallerHalf.append(heappop(greaterHalf))
                _heapify_max(smallerHalf)
            else:
                smallerHalf.append(x)
                _heapify_max(smallerHalf)
            result.append(smallerHalf[0])
#     print(smallerHalf)
#     print(greaterHalf)
    print(result)
if __name__ == '__main__':
    arr=[20,10,30,7]
    # arr=[5, 15, 10, 20, 3]
    getMedian(arr,len(arr))


[20, 15.0, 20, 15.0]


In [1]:
from heapq import heapify,heappop,heappush,_heapify_max,_heappop_max

def getMedian(arr):
    result=[]
    smallerHalf=[]#maxheap
    greaterHalf=[]#minheap
    smallerHalf.append(arr[0])
    _heapify_max(smallerHalf)
    heapify(greaterHalf)
    result.append(arr[0])
    for i in range(1,len(arr)):
        x=arr[i]
        if len(smallerHalf)==len(greaterHalf):
            if x<result[-1]:
                smallerHalf.append(x)
                _heapify_max(smallerHalf)
                result.append(smallerHalf[0])
            else:
                heappush(greaterHalf,x)
                result.append(greaterHalf[0])
        elif len(smallerHalf)>len(greaterHalf):
            if x<result[-1]:
                smallerHalf.append(x)
                _heapify_max(smallerHalf)
                heappush(greaterHalf,_heappop_max(smallerHalf))
            else:
                heappush(greaterHalf,x)
            result.append((smallerHalf[0]+greaterHalf[0])/2)
        else:
            if x<result[-1]:
                smallerHalf.append(x)
                _heapify_max(smallerHalf)
            else:
                heappush(greaterHalf,x)
                smallerHalf.append(heappop(greaterHalf))
                _heapify_max(smallerHalf)
            result.append((smallerHalf[0]+greaterHalf[0])/2)
    print(result)


if __name__ == '__main__':
    arr=[5,10,15]
    arr=[1, 2, 3, 4]
    arr=[20,10,30,7]
    getMedian(arr)


[20, 15.0, 20, 15.0]
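Both listings above rebuild the max heap with `_heapify_max` after every append, which is linear in the heap size. A common alternative is to keep the lower half in an ordinary `heapq` min heap of negated values, so every insertion is O(logn). The sketch below is one way to do that, not the notebook's method; the name `running_medians` is hypothetical.

```python
from heapq import heappush, heappop

def running_medians(stream):
    lower = []    # max heap of the smaller half, stored as negated values
    upper = []    # min heap of the larger half
    medians = []
    for x in stream:
        # push onto the lower half, then move its maximum to the upper half
        heappush(lower, -x)
        heappush(upper, -heappop(lower))
        # rebalance so the lower half holds the extra element for odd counts
        if len(upper) > len(lower):
            heappush(lower, -heappop(upper))
        if len(lower) > len(upper):
            medians.append(-lower[0])
        else:
            medians.append((-lower[0] + upper[0]) / 2)
    return medians
```

For the stream [20, 10, 30, 7] it returns [20, 15.0, 20, 15.0], the same medians as the listings above.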


# Largest triplet product in a stream

Given a stream of integers represented as arr[]. For each index i from 0 to n-1, print the multiplication of largest, second largest, third largest element of the subarray arr[0…i]. If i < 2 print -1.

Approach-> Use a min heap to maintain the three largest elements seen so far. The root is the smallest of the three, so a new element replaces it only when it is larger

In [4]:
from heapq import heapify,heappop,heappush

def largestTriplet(arr,n):
    heapArr=[]
    for i in range(3):
        heapArr.append(arr[i])
    # order the first three elements as a min heap before reading heapArr[0]
    heapify(heapArr)
    for i in range(3-1):
        print(-1,end=" ")
    print(heapArr[0]*heapArr[1]*heapArr[2],end=" ")
    for i in range(3,n):
        if heapArr[0]<arr[i]:
            heapArr[0]=arr[i]
            heapify(heapArr)
        print(heapArr[0]*heapArr[1]*heapArr[2],end=" ")


if __name__ == '__main__':
    arr=[1, 2, 3, 4, 5]
    largestTriplet(arr,len(arr))


-1 -1 6 24 60 

Time Complexity -> O(n), since the heap always holds exactly 3 elements and each update costs O(log(3)) = O(1)

# Find k numbers with most occurrences in the given array

Given an array of n numbers and a positive integer k. The problem is to find k numbers with most occurrences, i.e., the top k numbers having the maximum frequency. If two numbers have the same frequency then the larger number should be given preference. The numbers should be displayed in decreasing order of their frequencies. It is assumed that the array contains at least k distinct numbers.

In [1]:
class Node:
    def __init__(self,data,freq):
        self.data=data
        self.freq=freq

def heapify(heap,i,n):
    left=2*i+1
    right=2*i+2
    # print(left,right)
    # if left<n and heap[left].freq>heap[i].freq:
    #     largest=left
    # else:
    #     largest=i
    if left<n:
        if heap[left].freq!=heap[i].freq:
            if heap[left].freq>heap[i].freq:
                largest=left
            else:
                largest=i
        else:
            if heap[left].data>heap[i].data:
                largest=left
            else:
                largest=i
    else:
        largest=i
    # if right<n and heap[right].freq>heap[largest].freq:
    #     largest=right
    if right<n:
        if heap[right].freq!=heap[largest].freq:
            if heap[right].freq>heap[largest].freq:
                largest=right
        else:
            if heap[right].data>heap[largest].data:
                largest=right
    if largest!=i:
        heap[largest],heap[i]=heap[i],heap[largest]
        heapify(heap,largest,n)

def buildHeap(hashMap,k,n):
    heap=[]
    for i in hashMap:
        node=Node(i,hashMap[i])
        heap.append(node)
    # for i in range(len(heap)):
    #     print((heap[i].data,heap[i].freq))
    for i in range((len(heap)//2)-1,-1,-1):
        heapify(heap,i,len(heap))

    return heap


def getKFrequentElements(arr,k,n):
    hashMap={}
    for i in arr:
        if i in hashMap:
            hashMap[i]+=1
        else:
            hashMap[i]=1
    heap=buildHeap(hashMap,k,n)
    result=[]
    for i in range(k):
        item=heap[0]
        # result.append((item.data,item.freq))
        result.append(item.data)
        heap[0].freq=float('-infinity')
        heapify(heap,0,len(heap))

    return result


if __name__ == '__main__':
    # arr=[3, 1, 4, 4, 5, 2, 6, 1]
    arr=[7, 10, 11, 5, 2, 5, 5, 7, 11, 8, 9]
    # k=2
    k=4
    result=getKFrequentElements(arr,k,len(arr))
    print(" ".join(map(str,result)))


5 11 7 10


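Time Complexity -> O(n) to count the frequencies, O(d) to build the heap and O(klogd) for the k extractions, where d is the number of distinct elements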

Another Approach -> Hashing and Sorting with Custom Comparator

In [2]:
from functools import cmp_to_key

def compare(a,b):
    if a[1]<b[1]:
        return 1
    elif a[1]>b[1]:
        return -1
    else:
        if a[0]<b[0]:
            return 1
        elif a[0]>b[0]:
            return -1
        else:
            return 0

def getKFrequentElements(arr,k):
    hashMap={}
    for i in arr:
        if i in hashMap:
            hashMap[i]+=1
        else:
            hashMap[i]=1
    result=[]
    for i in hashMap:
        result.append((i,hashMap[i]))
    result.sort(key=cmp_to_key(compare))

    for i in range(k):
        print(result[i][0],end=" ")
    print('\n')


if __name__ == '__main__':
    arr=[3, 1, 4, 4, 5, 2, 6, 1]
    k=2
    getKFrequentElements(arr,k)
    arr=[7, 10, 11, 5, 2, 5, 5, 7, 11, 8, 9]
    k=4
    getKFrequentElements(arr,k)


4 1 

5 11 7 10 
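The counting and ranking steps can also be sketched with `collections.Counter` and `heapq.nlargest`, ranking by the pair (frequency, value) so that the larger number wins a frequency tie. This is an illustrative alternative; the name `k_most_frequent` is hypothetical.

```python
from collections import Counter
from heapq import nlargest

def k_most_frequent(arr, k):
    freq = Counter(arr)
    # rank by (frequency, value): higher frequency first, larger value breaks ties
    return nlargest(k, freq, key=lambda v: (freq[v], v))
```

With the two inputs used above it returns [4, 1] and [5, 11, 7, 10].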



# Convert BST to Min Heap

Given a binary search tree which is also a complete binary tree. The problem is to convert the given BST into a Min Heap with the condition that all the values in the left subtree of a node should be less than all the values in the right subtree of the node. This condition is applied on all the nodes in the so converted Min Heap.

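Approach-> Store the inorder traversal of the BST; it is always in sorted order. Then walk the tree in preorder and overwrite each node with the next value from the sorted array, so every node gets a value smaller than everything in its subtrees and the left subtree is filled before the right one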

In [2]:
class Node:
    def __init__(self,data):
        self.data=data
        self.left=None
        self.right=None

def storeInorder(root,arr):
    if root is None:
        return
    storeInorder(root.left,arr)
    arr.append(root.data)
    storeInorder(root.right,arr)

def convertBSTToMinHeapUtil(root,arr,i):
    if root is None:
        return
    root.data=arr[i[0]]
    i[0]+=1
    convertBSTToMinHeapUtil(root.left,arr,i)
    convertBSTToMinHeapUtil(root.right,arr,i)


def convertBSTToMinHeap(root):
    arr=[]
    storeInorder(root,arr)
    i=[0]
    convertBSTToMinHeapUtil(root,arr,i)

def preorder(root):
    if root is None:
        return
    print(root.data,end=' ')
    preorder(root.left)
    preorder(root.right)

if __name__ == '__main__':
    root=Node(4)
    root.left=Node(2)
    root.right=Node(6)
    root.left.left=Node(1)
    root.left.right=Node(3)
    root.right.left=Node(5)
    root.right.right=Node(7)
    preorder(root)
    convertBSTToMinHeap(root)
    print()
    preorder(root)


4 2 1 3 6 5 7 
1 2 3 4 5 6 7 

Time Complexity -> O(n) and space complexity -> O(n)

# Merge two binary Max Heaps

Approach-> Put all the elements in a separate array and then heapify it

In [3]:
def heapify(arr,i,n):
    left=2*i+1
    right=2*i+2
    if left<n and arr[left]>arr[i]:
        largest=left
    else:
        largest=i
    if right<n and arr[right]>arr[largest]:
        largest=right
    if largest!=i:
        arr[largest],arr[i]=arr[i],arr[largest]
        heapify(arr,largest,n)


def mergeMaxHeaps(a,b):
    heapArr=[]
    for i in range(len(a)):
        heapArr.append(a[i])
    for i in range(len(b)):
        heapArr.append(b[i])
    for i in range(len(heapArr)//2-1,-1,-1):
        heapify(heapArr,i,len(heapArr))
    return heapArr


if __name__ == '__main__':
    a=[10, 5, 6, 2]
    b=[12, 7, 9]
    result=mergeMaxHeaps(a,b)
    print(result)


[12, 10, 9, 2, 5, 7, 6]


Time Complexity -> O(m+n) (building a max heap)

# K-th Largest Sum Contiguous Subarray

A brute force approach is to store all the contiguous sums in another array, sort it, and print the k-th largest. But when the number of elements is large, the array of contiguous sums may run out of memory, because the number of contiguous subarrays is quadratic in n.

An efficient approach is to store the prefix sums of the array in a sum[] array (1-indexed, with sum[0] = 0). The sum of the contiguous subarray from index i to j is then sum[j]-sum[i-1]. A min heap of size k keeps the k largest sums seen so far, and its root is the k-th largest.

In [4]:
from heapq import heapify,heappop,heappush

def getKLargestContiguousSum(arr,k,n):
    sum=[]
    sum.append(0)
    sum.append(arr[0])
    for i in range(2,n+1):
        sum.append(arr[i-1]+sum[i-1])
    # print(sum)
    heapArr=[]
    heapify(heapArr)
    for i in range(1,n+1):
        for j in range(i,n+1):
            x=sum[j]-sum[i-1]
            if len(heapArr)<k:
                heappush(heapArr,x)
            else:
                if x>heapArr[0]:
                    heappop(heapArr)
                    heappush(heapArr,x)
    return heapArr[0]


if __name__ == '__main__':
    arr=[10, -10, 20, -40]
    k=6
    # arr=[20, -5, -1]
    # k=3
    result=getKLargestContiguousSum(arr,k,len(arr))
    print(result)


-10


Time Complexity-> O(n^2 log(k)) and heaps are cache friendly

# Minimum product of k integers in an array of positive Integers

Given an array of n positive integers. We are required to write a program to print the minimum product of k integers of the given array.

Approach-> Use a minheap, get the minimum k elements from the heap and find the product

Time Complexity-> O(n+klogn)
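A minimal sketch of this approach with `heapq`, assuming 1 <= k <= len(arr) and positive integers as stated in the problem; the name `min_product` is hypothetical.

```python
from heapq import heapify, heappop

def min_product(arr, k):
    heap = list(arr)     # copy so the caller's list is left untouched
    heapify(heap)        # O(n)
    product = 1
    for _ in range(k):   # k pops, O(logn) each
        product *= heappop(heap)
    return product
```

For example, min_product([198, 76, 544, 123, 154, 675], 2) multiplies the two smallest values, 76 and 123, giving 9348.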

# Leaf starting point in a Binary Heap data structure

In a 0-indexed array of n elements, the leaves occupy indices floor(n/2) till n-1

# Why is Binary Heap Preferred over BST for Priority Queue?

<pre>
A typical Priority Queue requires the following operations to be efficient, and a Binary Heap supports them with the following time complexities:

Get Top Priority Element (Get minimum or maximum): O(1)
Insert an element: O(Logn)
Remove top priority element: O(Logn)
Decrease Key: O(Logn)
</pre>

<pre>
A Self Balancing Binary Search Tree like AVL Tree, Red-Black Tree, etc can also support above operations with same time complexities.

Finding minimum and maximum are not naturally O(1), but can be easily implemented in O(1) by keeping an extra pointer to minimum or maximum and updating the pointer with insertion and deletion if required. With deletion we can update by finding inorder predecessor or successor.

Inserting an element is naturally O(Logn)

Removing maximum or minimum are also O(Logn)

Decrease key can be done in O(Logn) by doing a deletion followed by insertion.
</pre>

Since Binary Heap is implemented using arrays, there is always better locality of reference and operations are more cache friendly.

Although operations are of same time complexity, constants in Binary Search Tree are higher.

We can build a Binary Heap in O(n) time. Self Balancing BSTs require O(nLogn) time to construct.

Binary Heap doesn’t require extra space for pointers.

Binary Heap is easier to implement.

There are other heap variants, such as the Fibonacci Heap, that support insert and decrease-key in Θ(1) amortized time

<pre><strong>Is Binary Heap always better?</strong>

Although Binary Heap is preferred for Priority Queue, BSTs have their own advantages and the list of advantages is in fact bigger compared to binary heap.

Searching an element in self-balancing BST is O(Logn) which is O(n) in Binary Heap.

We can print all elements of BST in sorted order in O(n) time, but Binary Heap requires O(nLogn) time.

Floor and ceil can be found in O(Logn) time.

K’th largest/smallest element can be found in O(Logn) time by augmenting tree with an additional field.
</pre>

# Convert min Heap to max Heap

The idea is to build a Max Heap without caring that the input is a Min Heap. We start from the bottom-most, rightmost internal node of the Min Heap and heapify all internal nodes in a bottom-up way to build the Max Heap.

Time Complexity -O(n) [Time to build a heap]
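The idea can be sketched as follows, reusing the sift-down pattern from the earlier sections. The names `max_heapify` and `convert_to_max_heap` are chosen for illustration, and the array is modified in place.

```python
def max_heapify(arr, i, n):
    # sift arr[i] down until neither child is larger
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        max_heapify(arr, largest, n)

def convert_to_max_heap(arr):
    n = len(arr)
    # start at the last internal node and move up to the root
    for i in range(n // 2 - 1, -1, -1):
        max_heapify(arr, i, n)
    return arr
```

Nothing in the procedure depends on the input being a min heap; any array is turned into a max heap the same way.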

# Given level order traversal of a Binary Tree, check if the Tree is a Min-Heap

In [5]:
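def isMinHeap(arr,n):
    # internal nodes are at indices 0 .. n//2-1
    for i in range(n//2):
        left=2*i+1
        right=2*i+2
        if left<n and arr[left]<arr[i]:
            return False
        # the last internal node may have no right child
        if right<n and arr[right]<arr[i]:
            return False
    return True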

if __name__ == '__main__':
    # arr=[10, 15, 14, 25, 30]
    arr=[30, 56, 22, 49, 30, 51, 2, 67]
    print(isMinHeap(arr,len(arr)))


False


Time Complexity -> O(n)

# Rearrange characters in a string such that no two adjacent are same

In [6]:
from heapq import heappop

class Node:
    def __init__(self,data,freq):
        self.data=data
        self.freq=freq

def heapify(arr,i):
    n=len(arr)
    left=2*i+1
    right=2*i+2
    if left<n and arr[left].freq>arr[i].freq:
        largest=left
    else:
        largest=i
    if right<n and arr[right].freq>arr[largest].freq:
        largest=right
    if largest!=i:
        arr[largest],arr[i]=arr[i],arr[largest]
        heapify(arr,largest)

def buildHeap(heapArr):
    n=len(heapArr)
    for i in range(n//2-1,-1,-1):
        heapify(heapArr,i)

def rearrangeString(strValue):
    mapStr={}
    for i in strValue:
        if i in mapStr:
            mapStr[i]+=1
        else:
            mapStr[i]=1
    heapArr=[]
    for i in mapStr:
        node=Node(i,mapStr[i])
        heapArr.append(node)

    buildHeap(heapArr)
    size=len(heapArr)
    result=[None]*len(strValue)
    prev='#'
    # print(heapArr)
    # item=heapArr[0]
    # heapArr.pop(0)
    # print(heapArr)
    # print(item.data,item.freq)
    for i in range(len(result)):
        # print('inside')
        d=heapArr.pop(0)
        if prev!=d.data:
            result[i]=d.data
            prev=d.data
            d.freq-=1
            if d.freq>0:
                heapArr.insert(0,d)
                heapify(heapArr,0)
        else:
            if not len(heapArr):
                return "Not Possible"
            x=heapArr.pop(0)
            result[i]=x.data
            prev=x.data
            x.freq-=1
            if x.freq>0:
                heapArr.insert(0,x)
            heapArr.insert(0,d)
            heapify(heapArr,0)

    # print(result)
    return "".join(result)


if __name__ == '__main__':
    strValue="aaabc"
    # strValue="aaabb"
    # strValue="aaaabc"
    print(rearrangeString(strValue))


abaca


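This is the same greedy approach used in Smallest Derangement of a sequence

Time Complexity-> O(nlogd), where d is the number of distinct characters, provided each heap operation costs O(logd). Note that pop(0) and insert(0, ...) on a Python list are O(d) each, and removing the first element of an array-based heap shifts every remaining element by one position, so the remaining list is not guaranteed to be a valid heap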
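The listing above moves elements with pop(0) and insert(0, ...). A variant that relies only on `heapq` operations can be sketched as below: counts are negated so the min heap pops the most frequent character first, and the character just placed is held back for one step. This is an illustrative alternative rather than a drop-in replacement; the name `rearrange_string` is hypothetical.

```python
from collections import Counter
from heapq import heapify, heappop, heappush

def rearrange_string(s):
    # heapq is a min heap, so store negated counts to pop the most frequent first
    heap = [(-count, ch) for ch, count in Counter(s).items()]
    heapify(heap)
    result = []
    prev = None   # (negated remaining count, char) held back for one step
    while heap:
        count, ch = heappop(heap)
        result.append(ch)
        # the character used in the previous step becomes available again
        if prev is not None:
            heappush(heap, prev)
        # hold the current character back so it cannot be placed twice in a row
        prev = (count + 1, ch) if count + 1 < 0 else None
    if prev is not None:
        # a character is left over but nothing else can separate its copies
        return "Not Possible"
    return "".join(result)
```

For "aaabc" this returns "abaca", and for "aaab" it returns "Not Possible".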

# Array Representation Of Binary Heap

Discussed

# Sum of all elements between k1’th and k2’th smallest elements

Approach-> Build a min heap, pop the k1 smallest elements, then pop the next k2-k1-1 elements and add them up
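A minimal sketch of this approach, assuming 1 <= k1 < k2 <= len(arr); the name `sum_between` is hypothetical.

```python
from heapq import heapify, heappop

def sum_between(arr, k1, k2):
    heap = list(arr)   # copy so the caller's list is left untouched
    heapify(heap)
    # discard the k1 smallest elements (including the k1'th smallest itself)
    for _ in range(k1):
        heappop(heap)
    # the next k2-k1-1 pops are the elements strictly between the two ranks
    total = 0
    for _ in range(k2 - k1 - 1):
        total += heappop(heap)
    return total
```

For arr=[20, 8, 22, 4, 12, 10, 14] with k1=3 and k2=6, the 3rd smallest is 10 and the 6th smallest is 20, so the result is 12+14 = 26.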

# Minimum sum of two numbers formed from digits of an array

Approach 1-> Sort the array, and form two numbers with alternate indices

Time Complexity -> O(nlogn)
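Approach 1 can be sketched as follows: after sorting, digits at even positions build one number and digits at odd positions build the other. The name `min_sum_sorted` is hypothetical, and the input is assumed to be a list of single digits.

```python
def min_sum_sorted(digits):
    num1 = num2 = 0
    for idx, d in enumerate(sorted(digits)):
        # alternate the sorted digits between the two numbers
        if idx % 2 == 0:
            num1 = num1 * 10 + d
        else:
            num2 = num2 * 10 + d
    return num1 + num2
```

For [5, 3, 0, 7, 4] the two numbers are 047 and 35, so the sum is 82.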

Approach 2-> Maintain a min heap and alternately form a number by popping out the elements

In [7]:
from heapq import heapify,heappop

def getMinSum(arr):
    num1=0
    num2=0
    heapify(arr)
    flag=True
    while len(arr):
        if flag:
            x=heappop(arr)
            num1=num1*10+x
        else:
            x=heappop(arr)
            num2=num2*10+x
        flag=not flag

    return num1+num2

if __name__ == '__main__':
    # arr=[6, 8, 4, 5, 2, 3]
    arr=[5, 3, 0, 7, 4]
    result=getMinSum(arr)
    print(result)


82


# K’th largest element in a stream

In [8]:
from heapq import heapify

def getKLargest(k):
    count=1
    heapArr=[]
    heapify(heapArr)
    while True:
        num=int(input('Enter number or enter 9999 to quit\n'))
        if num==9999:
            break
        if count<=k:
            if count<k:
                print('_',end="\n")
            heapArr.append(num)
            heapify(heapArr)
            if count==k:
                print(heapArr[0])
        else:
            if num>heapArr[0]:
                heapArr[0]=num
                heapify(heapArr)
            print(heapArr[0],end='\n')
        count+=1

if __name__ == '__main__':
    k=3
    getKLargest(k)


Enter number or enter 9999 to quit
10
_
Enter number or enter 9999 to quit
20
_
Enter number or enter 9999 to quit
11
10
Enter number or enter 9999 to quit
70
11
Enter number or enter 9999 to quit
50
20
Enter number or enter 9999 to quit
40
40
Enter number or enter 9999 to quit
100
50
Enter number or enter 9999 to quit
5
50
Enter number or enter 9999 to quit
9999
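The interactive loop above can also be written as a function over any iterable, which is easier to test. The sketch below uses `heappush` and `heapreplace` so each element costs O(logk) instead of a full heapify. The name `kth_largest_stream` is hypothetical, and None stands in for the "_" printed while fewer than k elements have been read.

```python
from heapq import heappush, heapreplace

def kth_largest_stream(stream, k):
    heap = []   # min heap holding the k largest values seen so far
    out = []
    for num in stream:
        if len(heap) < k:
            heappush(heap, num)
        elif num > heap[0]:
            # drop the current k'th largest and insert the new value
            heapreplace(heap, num)
        # the root is the k'th largest once k values have been read
        out.append(heap[0] if len(heap) == k else None)
    return out
```

With k=3 and the numbers entered in the session above (10, 20, 11, 70, 50, 40, 100, 5) it returns [None, None, 10, 11, 20, 40, 50, 50].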


Another Approach-> A Simple Solution is to keep an array of size k. The idea is to keep the array sorted so that the k’th largest element can be found in O(1) time (we just need to return first element of array if array is sorted in increasing order)
How to process a new element of stream?
For every new element in stream, check if the new element is smaller than current k’th largest element. If yes, then ignore it. If no, then remove the smallest element from array and insert new element in sorted order. Time complexity of processing a new element is O(k).

Another Approach-> A Better Solution is to use a Self Balancing Binary Search Tree of size k. The k’th largest element can be found in O(Logk) time.
How to process a new element of stream?
For every new element in stream, check if the new element is smaller than current k’th largest element. If yes, then ignore it. If no, then remove the smallest element from the tree and insert new element. Time complexity of processing a new element is O(Logk).

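Time Complexity for finding the element-> O(logk) with the BST (walk to its smallest node); with the min heap of size k the k’th largest is the root, which is read in O(1)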