# 16. Heap Data Structure in Python
A heap is a specialized tree-based data structure that satisfies the heap property. In Python, heaps are implemented through the `heapq` module, which provides an efficient way of implementing priority queues.

![Min-heap and max-heap examples](https://media.geeksforgeeks.org/wp-content/cdn-uploads/20221220165711/MinHeapAndMaxHeap1.png)

## Types of Heaps
There are two types of heaps:
1. **Min-Heap**: In a min-heap, for any given node I, the value of I is less than or equal to the values of its children. The root node will always have the smallest value.
2. **Max-Heap**: In a max-heap, for any given node I, the value of I is greater than or equal to the values of its children. The root node will always have the largest value.
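
The min-heap property can be checked mechanically. The sketch below assumes the heap is stored as a flat list in which the children of index `i` sit at `2*i + 1` and `2*i + 2`, which is the layout `heapq` uses. `is_min_heap` is a hypothetical helper name chosen for illustration, not a library function.

```python
def is_min_heap(items):
    """Return True if every parent is <= each of its children."""
    n = len(items)
    for i in range(n):
        # Children of index i live at 2*i + 1 and 2*i + 2 in the list layout
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and items[i] > items[child]:
                return False
    return True


print(is_min_heap([1, 3, 2, 7, 4]))  # True
print(is_min_heap([5, 3, 8]))        # False, because 5 > 3
```

Flipping the comparison to `<` would give the matching check for a max-heap.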

Python's `heapq` module implements a min-heap.
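
Because `heapq` only provides a min-heap, one common workaround for max-heap behaviour is to store negated values. The sketch below assumes the items are numbers; the variable names are illustrative.

```python
import heapq

values = [3, 1, 4, 1, 5]

# Store negated values so the largest original value becomes the smallest key
max_heap = [-v for v in values]
heapq.heapify(max_heap)

largest = -heapq.heappop(max_heap)  # Negate again to recover the original value
print(largest)  # 5

heapq.heappush(max_heap, -9)
print(-max_heap[0])  # 9
```

This trick does not apply to items that cannot be negated, such as strings; those need a wrapper with a reversed comparison instead.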

## Operations on Heaps
Here's a breakdown of common operations you can perform on a heap using Python's `heapq` module:

### Importing the Module
```python
import heapq
```

### Creating a Heap
You can start with a regular list and transform it into a heap using `heapq.heapify`.
```python
heap = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
heapq.heapify(heap)
print(heap)  # Output will be a min-heap
```

### Inserting an Element
To add an element to the heap, use `heapq.heappush`.
```python
heapq.heappush(heap, 0)
print(heap)  # The smallest element will be at the root
```

### Removing the Smallest Element
To remove and return the smallest element from the heap, use `heapq.heappop`.
```python
smallest = heapq.heappop(heap)
print(smallest)  # Smallest element
print(heap)  # Heap after removal
```
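
Since each pop returns the smallest remaining element, popping until the heap is empty yields the items in ascending order. The sketch below illustrates that idea; `heap_sorted` is a hypothetical helper, and it works on a copy rather than sorting in place.

```python
import heapq


def heap_sorted(iterable):
    """Return the items in ascending order by heapifying, then popping until empty."""
    heap = list(iterable)  # Copy so the caller's data is left untouched
    heapq.heapify(heap)
    # range() is evaluated once, so this pops exactly the original number of items
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heap_sorted([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```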

### Peek at the Smallest Element
To access the smallest element without removing it, read `heap[0]`. This raises `IndexError` if the heap is empty, so check that the list is non-empty first.
```python
smallest = heap[0]
print(smallest)  # Smallest element
```

### Merging Two Heaps
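`heapq.merge` combines several already sorted inputs into a single sorted stream and returns a lazy iterator over the result. Note that it expects sorted iterables rather than heaps: a heapified list is generally not in sorted order, so sort it (or pop its items in order) before merging.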
```python
heap1 = [1, 3, 5]
heap2 = [2, 4, 6]
merged = heapq.merge(heap1, heap2)
print(list(merged))  # Merged sorted list
```
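
When the inputs really are heaps rather than sorted lists, one simple option is to concatenate them and heapify the result. This is a minimal sketch with illustrative variable names, not the only way to combine heaps.

```python
import heapq

heap_a = [5, 1, 3]
heap_b = [6, 2, 4]
heapq.heapify(heap_a)
heapq.heapify(heap_b)

# A heap's list is generally not sorted, so combine and re-heapify instead of merging
combined = heap_a + heap_b
heapq.heapify(combined)

print(combined[0])  # 1, the smallest value across both heaps
```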

### Transforming a List into a Heap
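`heapq.heapify` rearranges the list in place in linear time and returns `None`, so keep using the original list afterwards rather than the function's return value.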
```python
data = [3, 1, 4, 1, 5, 9, 2, 6, 5]
heapq.heapify(data)
print(data)  # Now 'data' is a heap
```

### Finding the n Largest or Smallest Elements
You can use `heapq.nlargest` and `heapq.nsmallest` to find the n largest or smallest elements.
```python
largest_three = heapq.nlargest(3, heap)
smallest_three = heapq.nsmallest(3, heap)
print(largest_three)
print(smallest_three)
```
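
Both functions also accept an optional `key` argument, which helps when the items are records rather than plain numbers. The sample data below is made up for illustration.

```python
import heapq

# Hypothetical sample data: (name, score) pairs
scores = [("ana", 82), ("ben", 95), ("cy", 67), ("dee", 90)]

# Rank by the score, which is the second field of each pair
top_two = heapq.nlargest(2, scores, key=lambda pair: pair[1])
bottom_one = heapq.nsmallest(1, scores, key=lambda pair: pair[1])

print(top_two)     # [('ben', 95), ('dee', 90)]
print(bottom_one)  # [('cy', 67)]
```

The input does not need to be a heap; any iterable works.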

## Conclusion
Heaps are useful for managing a dynamically changing dataset when you frequently need its smallest or largest element. With `heapq`, pushing and popping take O(log n) time, peeking at `heap[0]` takes O(1), and `heapify` builds a heap from a list in O(n), all with minimal code.
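
As an illustration of managing a changing dataset, here is a minimal priority-queue sketch built on `heapq`. The class name `TaskQueue`, its methods, and the sample tasks are all hypothetical; the counter is an assumed tie-breaking scheme so that tasks with equal priority come out in insertion order.

```python
import heapq
import itertools


class TaskQueue:
    """A small priority queue: lower priority numbers are served first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # Tie-breaker that preserves insertion order

    def push(self, priority, task):
        # Tuples compare element by element, so the counter settles equal priorities
        heapq.heappush(self._heap, (priority, next(self._counter), task))

    def pop(self):
        priority, _, task = heapq.heappop(self._heap)
        return priority, task

    def __len__(self):
        return len(self._heap)


queue = TaskQueue()
queue.push(2, "write report")
queue.push(1, "fix outage")
queue.push(2, "reply to email")

while queue:  # An empty queue has length 0 and is therefore falsy
    print(queue.pop())
```

The loop prints `(1, 'fix outage')` first, followed by the two priority-2 tasks in the order they were added.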

In [1]:
import heapq

# Create a list
nums = [4, 1, 7, 3, 8, 5]

# Convert list to a heap
heapq.heapify(nums)
print(f"Heapified list: {nums}")

# Add an element to the heap
heapq.heappush(nums, 2)
print(f"After pushing 2: {nums}")

# Pop the smallest element
smallest = heapq.heappop(nums)
print(f"Popped smallest element: {smallest}")
print(f"Heap after pop: {nums}")

# Peek at the smallest element
smallest_peek = nums[0]
print(f"Peeked smallest element: {smallest_peek}")

# Find the three largest elements
largest_three = heapq.nlargest(3, nums)
print(f"Three largest elements: {largest_three}")

# Find the three smallest elements
smallest_three = heapq.nsmallest(3, nums)
print(f"Three smallest elements: {smallest_three}")

Heapified list: [1, 3, 5, 4, 8, 7]
After pushing 2: [1, 3, 2, 4, 8, 7, 5]
Popped smallest element: 1
Heap after pop: [2, 3, 5, 4, 8, 7]
Peeked smallest element: 2
Three largest elements: [8, 7, 5]
Three smallest elements: [2, 3, 4]


## Implementing a Heap from Scratch

### Minimum Heap

In [2]:
class HeapNode:
    def __init__(self, data):
        self.data = data  # Store the data
        self.left = None  # Initialize left child as None
        self.right = None  # Initialize right child as None
        
class MinHeap:
    def __init__(self):
        self.root = None  # Initialize the root of the heap as None
        
    def add(self, data):
        if self.root is None:  # If heap is empty
            self.root = HeapNode(data)  # Create root node
            return
        
        # If root is not None, add the new data
        self.recursive_add(data, self.root)
            
    def recursive_add(self, data, node):
        if node.left is None:  # If left child is None
            node.left = HeapNode(data)  # Add new data to left child
            self.heapify_up(node.left)  # Perform heapify up operation
        elif node.right is None:  # If right child is None
            node.right = HeapNode(data)  # Add new data to right child
            self.heapify_up(node.right)  # Perform heapify up operation
        else:
            # If both left and right children are not None, add to the subtree with fewer nodes
            if self.get_count(node.left) <= self.get_count(node.right):
                self.recursive_add(data, node.left)
            else:
                self.recursive_add(data, node.right)
    
    def get_count(self, node):
        if node is None:  # Base case: if node is None, return 0
            return 0
        
        # Recursively count the nodes in the left and right subtrees
        return 1 + self.get_count(node.left) + self.get_count(node.right)
    
    def heapify_up(self, node):
        while node is not None and node != self.root:  # While node is not None and not root
            parentnode = self.get_parent(node, self.root)  # Get the parent node
            
            # If parent node's data is greater than current node's data, swap them
            if parentnode.data > node.data:
                parentnode.data, node.data = node.data, parentnode.data  # Swap the data
                node = parentnode  # Move up to the parent node
            else:
                break
    
    def get_parent(self, node, root):
        if root.left == node or root.right == node:  # If the root is the parent of the node
            return root
        
        if root.left is not None:
            parent = self.get_parent(node, root.left)  # Recursively find the parent in the left subtree
            if parent is not None:
                return parent
            
        if root.right is not None:
            parent = self.get_parent(node, root.right)  # Recursively find the parent in the right subtree
            if parent is not None:
                return parent
            
    def extract_min(self):
        if self.root is None:  # If heap is empty
            print("Heap is empty")
            return None
        
        min_value = self.root.data  # Store the minimum value (avoid shadowing built-in min)
        
        if self.root.left is None and self.root.right is None:
            self.root = None  # The root was the only node, so the heap is now empty
            return min_value
        
        # Move the last node's value to the root (this also works when that value is 0)
        self.root.data = self.remove_last_node()
        self.heapify_down(self.root)  # Restore the heap property from the root down
        
        return min_value  # Return the minimum value
    
    def remove_last_node(self):
        queue = [self.root]  # Initialize queue with root
        last_node = None
        
        while len(queue) != 0:  # While queue is not empty
            current = queue.pop(0)  # Get the first node in the queue
            
            if current.left:
                queue.append(current.left)  # Add left child to the queue
            if current.right:
                queue.append(current.right)  # Add right child to the queue
            
            if not current.left and not current.right:
                last_node = current  # Set the last node
                
        if last_node:
            parentnode = self.get_parent(last_node, self.root)  # Get the parent of the last node
            
            # Remove the last node from its parent
            if parentnode.left == last_node:
                parentnode.left = None
            else:
                parentnode.right = None
                
            return last_node.data  # Return the data of the last node
        
        return None
    
    def heapify_down(self, node):
        while node:
            small = node  # Initialize smallest as the current node
            
            if node.left and node.left.data < small.data:
                small = node.left  # Update smallest if left child is smaller
            
            if node.right and node.right.data < small.data:
                small = node.right  # Update smallest if right child is smaller
            
            if small != node:
                small.data, node.data = node.data, small.data  # Swap data
                node = small  # Move down to the smallest node
            else:
                break

# Create a MinHeap instance
heap = MinHeap()

# Add elements to the heap
heap.add(10)
heap.add(7)
heap.add(6)
heap.add(5)
heap.add(4)

# Extract the minimum element from the heap and print it
print(heap.extract_min())

4
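
For comparison, the tree-based class above locates a node's parent by searching from the root. A list-based layout avoids that search because parent and child positions follow from index arithmetic. The sketch below is an alternative under that assumption, not a drop-in replacement for the class above; `ArrayMinHeap` is a hypothetical name.

```python
class ArrayMinHeap:
    """Min-heap stored in a list: children of index i are at 2*i + 1 and 2*i + 2."""

    def __init__(self):
        self.items = []

    def add(self, value):
        self.items.append(value)  # Place the value in the last slot
        i = len(self.items) - 1
        while i > 0:
            parent = (i - 1) // 2
            if self.items[parent] <= self.items[i]:
                break
            # Parent is larger, so swap and continue from the parent's index
            self.items[parent], self.items[i] = self.items[i], self.items[parent]
            i = parent

    def extract_min(self):
        if not self.items:
            return None  # Empty heap
        smallest = self.items[0]
        last = self.items.pop()  # Remove the last slot
        if self.items:
            self.items[0] = last  # Move the last value to the root
            self._sift_down(0)
        return smallest

    def _sift_down(self, i):
        n = len(self.items)
        while True:
            small = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.items[child] < self.items[small]:
                    small = child
            if small == i:
                break
            self.items[small], self.items[i] = self.items[i], self.items[small]
            i = small


array_heap = ArrayMinHeap()
for value in [10, 7, 6, 5, 4]:
    array_heap.add(value)

print(array_heap.extract_min())  # 4
print(array_heap.extract_min())  # 5
```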


### Maximum Heap

In [3]:
class HeapNode:
    def __init__(self, data):
        self.data = data  # Store the data
        self.left = None  # Initialize left child as None
        self.right = None  # Initialize right child as None
        
class MaxHeap:
    def __init__(self):
        self.root = None  # Initialize the root of the heap as None
        
    def add(self, data):
        if self.root is None:  # If heap is empty
            self.root = HeapNode(data)  # Create root node
            return
        
        # If root is not None, add the new data
        self.recursive_add(data, self.root)
            
    def recursive_add(self, data, node):
        if node.left is None:  # If left child is None
            node.left = HeapNode(data)  # Add new data to left child
            self.heapify_up(node.left)  # Perform heapify up operation
        elif node.right is None:  # If right child is None
            node.right = HeapNode(data)  # Add new data to right child
            self.heapify_up(node.right)  # Perform heapify up operation
        else:
            # If both left and right children are not None, add to the subtree with fewer nodes
            if self.get_count(node.left) <= self.get_count(node.right):
                self.recursive_add(data, node.left)
            else:
                self.recursive_add(data, node.right)
    
    def get_count(self, node):
        if node is None:  # Base case: if node is None, return 0
            return 0
        
        # Recursively count the nodes in the left and right subtrees
        return 1 + self.get_count(node.left) + self.get_count(node.right)
    
    def heapify_up(self, node):
        while node is not None and node != self.root:  # While node is not None and not root
            parentnode = self.get_parent(node, self.root)  # Get the parent node
            
            # If parent node's data is less than current node's data, swap them
            if parentnode.data < node.data:
                parentnode.data, node.data = node.data, parentnode.data  # Swap the data
                node = parentnode  # Move up to the parent node
            else:
                break
    
    def get_parent(self, node, root):
        if root.left == node or root.right == node:  # If the root is the parent of the node
            return root
        
        if root.left is not None:
            parent = self.get_parent(node, root.left)  # Recursively find the parent in the left subtree
            if parent is not None:
                return parent
            
        if root.right is not None:
            parent = self.get_parent(node, root.right)  # Recursively find the parent in the right subtree
            if parent is not None:
                return parent
            
    def extract_max(self):
        if self.root is None:  # If heap is empty
            print("Heap is empty")
            return None
        
        max_value = self.root.data  # Store the maximum value (root data)
        
        if self.root.left is None and self.root.right is None:
            self.root = None  # The root was the only node, so the heap is now empty
            return max_value
        
        # Move the last node's value to the root (this also works when that value is 0)
        self.root.data = self.remove_last_node()
        self.heapify_down(self.root)  # Restore the heap property from the root down
        
        return max_value  # Return the maximum value
    
    def remove_last_node(self):
        queue = [self.root]  # Initialize queue with root
        last_node = None
        
        while len(queue) != 0:  # While queue is not empty
            current = queue.pop(0)  # Get the first node in the queue
            
            if current.left:
                queue.append(current.left)  # Add left child to the queue
            if current.right:
                queue.append(current.right)  # Add right child to the queue
            
            if not current.left and not current.right:
                last_node = current  # Set the last node
                
        if last_node:
            parentnode = self.get_parent(last_node, self.root)  # Get the parent of the last node
            
            # Remove the last node from its parent
            if parentnode.left == last_node:
                parentnode.left = None
            else:
                parentnode.right = None
                
            return last_node.data  # Return the data of the last node
        
        return None
    
    def heapify_down(self, node):
        while node:
            large = node  # Initialize largest as the current node
            
            if node.left and node.left.data > large.data:
                large = node.left  # Update largest if left child is larger
            
            if node.right and node.right.data > large.data:
                large = node.right  # Update largest if right child is larger
            
            if large != node:
                large.data, node.data = node.data, large.data  # Swap data
                node = large  # Move down to the largest node
            else:
                break

# Create a MaxHeap instance
heap = MaxHeap()

# Add elements to the heap
heap.add(10)
heap.add(7)
heap.add(6)
heap.add(5)
heap.add(4)

# Extract the maximum element from the heap and print it
print(heap.extract_max())

10


## Applications
1. **Priority Queues:**
   - Heaps are often used to implement priority queues, where elements are assigned priorities and the element with the highest priority is dequeued first. This is useful in scenarios like task scheduling, job queue management, and event simulation.

2. **Heap Sort:**
   - Heap sort is a comparison-based sorting algorithm that uses a binary heap to sort elements. It runs in O(n log n) time and sorts in place with O(1) extra space, although it is not a stable sort.

3. **Graph Algorithms:**
   - **Dijkstra's Algorithm:** Heaps are used to efficiently find the shortest path in weighted graphs by selecting the vertex with the minimum distance.
   - **Prim's Algorithm:** Heaps help in finding the minimum spanning tree by selecting the edge with the minimum weight.

4. **Order Statistics:**
   - Heaps can find the k-th smallest or largest element of an unsorted array in O(n log k) time by keeping a heap of only k elements. This is particularly useful in applications like median finding, percentile calculations, and data stream processing.

5. **Median Maintenance:**
   - Heaps are used to maintain the median of a stream of numbers. Two heaps are used: a max-heap to store the lower half of the numbers and a min-heap to store the upper half. The median is easily derived from the roots of the two heaps.

6. **Load Balancing:**
   - In load balancing, heaps can be used to efficiently distribute tasks among servers by always assigning the task to the server with the minimum load.

7. **Merge Operations:**
   - Heaps are used in algorithms that need to merge multiple sorted lists into a single sorted list, such as in external sorting (e.g., multi-way merge sort) and in database query processing.

8. **Cache Implementations:**
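   - Heaps can be used in cache implementations to decide which entry to evict when the cache is full, for example the entry with the lowest access frequency or the earliest expiry time. (A Least Recently Used (LRU) cache is more commonly built from a hash map and a doubly linked list.)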

9. **Event Simulation:**
   - In discrete event simulation, heaps are used to manage the event queue where events are processed in chronological order. The event with the earliest time is always processed next.

10. **Range Queries:**
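    - A plain binary heap cannot efficiently answer minimum or maximum queries over an arbitrary range, but heap-ordered structures such as Cartesian trees are used for range-minimum queries, and a heap with lazy deletion can track the minimum or maximum of a sliding window.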

11. **Memory Management:**
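    - The "heap" region used for dynamic memory allocation shares only its name with the heap data structure. Even so, in memory management and garbage collection, a priority queue (for example, one keyed by block size) can be used to choose which free memory block to hand out next.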
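
The two-heap approach described under Median Maintenance can be sketched as follows. `RunningMedian` and its method names are hypothetical, and the max-heap for the lower half is simulated by storing negated numbers in a `heapq` min-heap.

```python
import heapq


class RunningMedian:
    """Track the median of a stream using two heaps."""

    def __init__(self):
        self.lower = []  # Max-heap of the smaller half, stored as negated values
        self.upper = []  # Min-heap of the larger half

    def add(self, value):
        # Push onto the lower half, then move its largest value to the upper half
        heapq.heappush(self.lower, -value)
        heapq.heappush(self.upper, -heapq.heappop(self.lower))
        # Rebalance so lower holds the same number of items as upper, or one more
        if len(self.upper) > len(self.lower):
            heapq.heappush(self.lower, -heapq.heappop(self.upper))

    def median(self):
        if not self.lower:
            raise ValueError("no values added yet")
        if len(self.lower) > len(self.upper):
            return -self.lower[0]  # Odd count: the middle value is the root of lower
        return (-self.lower[0] + self.upper[0]) / 2  # Even count: average the two roots


stream = RunningMedian()
for number in [5, 2, 8, 1]:
    stream.add(number)
    print(stream.median())  # Prints 5, 3.5, 5, 3.5 in turn
```

Each `add` costs O(log n) and each `median` call O(1), which is the reason for keeping two heaps instead of re-sorting the stream.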

#### Prepared By,
Ahamed Basith