# Find Median from Data Stream

The median is the middle value in an ordered integer list. If the size of the list is even, there is no single middle value, and the median is the mean of the two middle values.

- For example, for `arr = [2,3,4]`, the median is 3

- For example, for `arr = [2,3]`, the median is $(2 + 3) / 2 = 2.5$
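
As a quick sanity check of the definition, the two examples above can be reproduced with the standard library's `statistics.median`. This is only an illustrative sketch; the helper name `median_of` is hypothetical and not part of the problem statement.

```python
from statistics import median


def median_of(values):
    # statistics.median works on a sorted copy of the data and
    # averages the two middle values when the count is even.
    return median(values)


print(median_of([2, 3, 4]))  # 3
print(median_of([2, 3]))     # 2.5
```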

Implement the MedianFinder class:

- `MedianFinder()` initializes the `MedianFinder` object. 

- `void addNum(int num)` adds the integer `num` from the data stream to the data structure. 

- `double findMedian()` returns the median of all elements so far. Answers within $10^{-5}$ of the actual answer will be accepted.

### Example:

```
Input:
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output:
[null, null, null, 1.5, null, 2.0]

Explanation:
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1);    // arr = [1]
medianFinder.addNum(2);    // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3);    // arr = [1, 2, 3]
medianFinder.findMedian(); // return 2.0
```

```python
class MedianFinder:
    def __init__(self):
        self.number_list = []

    def add_num(self, num_to_add):
        self.number_list.append(num_to_add)

    def find_median(self):
        # Sort the list in ascending order (O(n log n) on every call)
        self.number_list.sort()

        n = len(self.number_list)
        mid = n // 2
        if n % 2 == 0:
            # Even count: average the two middle values
            return (self.number_list[mid - 1] + self.number_list[mid]) / 2
        # Odd count: return the single middle value
        return self.number_list[mid]


# Use a distinct instance name so the class itself is not shadowed
median_finder = MedianFinder()
median_finder.add_num(1)
print(median_finder.number_list)
median_finder.add_num(5)
print(median_finder.number_list)

print(median_finder.find_median())

median_finder.add_num(2)
print(median_finder.number_list)

print(median_finder.find_median())
```

```
[1]
[1, 5]
3.0
[1, 5, 2]
2
```


## Solution

To solve this problem efficiently, we'll use _two heaps_: a max heap and a min heap. The max heap contains the smaller half of the numbers. The min heap contains the larger half of the numbers.

We make sure the sizes of these heaps differ by at most 1. This makes the median lookup fast: if the total count of numbers is odd, the median is the top of the larger heap. Otherwise, the median is the average of the two tops.
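
The two invariants (sizes differ by at most 1, and every value in the smaller half is no greater than any value in the larger half) can be sketched with plain lists standing in for the heaps. The names `halves_are_valid` and `median_from_halves` are hypothetical, and `max(lower)` and `min(upper)` play the role of the heap tops; a real heap would give those values without a linear scan.

```python
def halves_are_valid(lower, upper):
    # Sizes may differ by at most one element.
    sizes_ok = abs(len(lower) - len(upper)) <= 1
    # Everything in the smaller half must be <= everything in the larger half.
    order_ok = not lower or not upper or max(lower) <= min(upper)
    return sizes_ok and order_ok


def median_from_halves(lower, upper):
    # Assumes the halves are valid and at least one of them is non-empty.
    if len(lower) == len(upper):
        return (max(lower) + min(upper)) / 2
    return max(lower) if len(lower) > len(upper) else min(upper)


print(halves_are_valid([1, 2], [5]))    # True
print(median_from_halves([1, 2], [5]))  # 2
print(median_from_halves([1], [5]))     # 3.0
print(halves_are_valid([1, 6], [5]))    # False
```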

This approach allows us to find the median in $O(1)$ time and add numbers in $O(\log n)$ time.

One thing to note is that Python's `heapq` module implements a min heap. We store negated values to simulate a max heap on top of it.
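
The negation trick can be seen in isolation in the short sketch below. The variable names are illustrative only: each value is negated on the way in and negated again on the way out, so the smallest stored entry corresponds to the largest original value.

```python
import heapq

max_heap = []  # holds negated values
for value in [3, 1, 4]:
    heapq.heappush(max_heap, -value)

largest = -max_heap[0]              # peek at the largest value: 4
popped = -heapq.heappop(max_heap)   # remove the largest value: 4
next_largest = -max_heap[0]         # the new largest value: 3

print(largest, popped, next_largest)  # 4 4 3
```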

```python
import heapq


class MedianFinder:
    def __init__(self):
        # Max heap (stored as negated values) for the lower half of numbers
        self.max_heap = []
        # Min heap for the upper half of numbers
        self.min_heap = []

    def add_num(self, num):
        # First, add to the max heap (negated)
        heapq.heappush(self.max_heap, -num)

        # Ensure every element in the max heap
        # is <= every element in the min heap
        if self.max_heap and self.min_heap and -self.max_heap[0] > self.min_heap[0]:
            max_top = -heapq.heappop(self.max_heap)
            heapq.heappush(self.min_heap, max_top)

        # Balance the heaps so their sizes differ by at most 1
        if len(self.max_heap) > len(self.min_heap) + 1:
            max_top = -heapq.heappop(self.max_heap)
            heapq.heappush(self.min_heap, max_top)

        if len(self.min_heap) > len(self.max_heap) + 1:
            min_top = heapq.heappop(self.min_heap)
            heapq.heappush(self.max_heap, -min_top)

    def find_median(self):
        # If the heaps are equal in size,
        # return the average of the tops
        if len(self.max_heap) == len(self.min_heap):
            return (-self.max_heap[0] + self.min_heap[0]) / 2.0

        # Otherwise, return the top of the larger heap
        if len(self.max_heap) > len(self.min_heap):
            return -self.max_heap[0]
        # min_heap stores values as-is, so no negation here
        return self.min_heap[0]


# Use a distinct instance name so the class itself is not shadowed
median_finder = MedianFinder()
median_finder.add_num(1)
median_finder.add_num(5)
print(median_finder.find_median())

median_finder.add_num(2)
print(median_finder.find_median())
```

```
3.0
2
```
