# 295. Find Median from Data Stream

The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.

* For example, for `arr = [2,3,4]`, the median is `3`.
* For example, for `arr = [2,3]`, the median is `(2 + 3) / 2 = 2.5`.

Implement the MedianFinder class:

* `MedianFinder()` initializes the `MedianFinder` object.
* `void addNum(int num)` adds the integer `num` from the data stream to the data structure.
* `double findMedian()` returns the median of all elements so far. Answers within `10^-5` of the actual answer will be accepted.

**Example 1:**

Input:
`["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]`
`[[], [1], [2], [], [3], []]`

Output:
`[null, null, null, 1.5, null, 2.0]`

Explanation:

* `MedianFinder medianFinder = new MedianFinder();`
* `medianFinder.addNum(1);` gives `arr = [1]`
* `medianFinder.addNum(2);` gives `arr = [1, 2]`
* `medianFinder.findMedian();` returns `1.5` (i.e., `(1 + 2) / 2`)
* `medianFinder.addNum(3);` gives `arr = [1, 2, 3]`
* `medianFinder.findMedian();` returns `2.0`

**Constraints:**

* `-10^5 <= num <= 10^5`
* There will be at least one element in the data structure before calling `findMedian`.
* At most `5 * 10^4` calls will be made to `addNum` and `findMedian`.

**Follow up:**

* If all integer numbers from the stream are in the range `[0, 100]`, how would you optimize your solution?
* If 99% of all integer numbers from the stream are in the range `[0, 100]`, how would you optimize your solution?
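As a reference point, the definition above can be turned directly into a brute-force baseline. The sketch below is only an illustration of the definition, not the intended solution: the class name `NaiveMedianFinder` is hypothetical, and it keeps a fully sorted list, so each insertion costs O(n).

```python
import bisect


class NaiveMedianFinder:
    """Hypothetical baseline: keep every value in a sorted list."""

    def __init__(self):
        self.values = []  # always kept in ascending order

    def addNum(self, num: int) -> None:
        # insort finds the position in O(log n) but shifts elements in O(n)
        bisect.insort(self.values, num)

    def findMedian(self) -> float:
        n = len(self.values)
        mid = n // 2
        if n % 2 == 1:
            # Odd count: the single middle value
            return float(self.values[mid])
        # Even count: mean of the two middle values
        return (self.values[mid - 1] + self.values[mid]) / 2
```

A baseline like this is mainly useful as an oracle when checking a faster implementation against random inputs.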

## Solution Explanation
To find the median of a stream of numbers efficiently, we need a data structure that allows us to:

1. Add new numbers quickly
2. Find the middle element(s) quickly

A key insight is that we don't need to keep the entire array sorted; we only need to know the middle element(s). This suggests using two heaps:

* A max heap for the smaller half of the numbers
* A min heap for the larger half of the numbers

With this structure:

* The max of the smaller half (top of the max heap) and the min of the larger half (top of the min heap) are the middle elements
* We keep the heaps balanced so that the max heap holds either the same number of elements as the min heap or exactly one more
* When the total count is odd, the median is the top of the max heap, which holds the extra element
* When the total count is even, the median is the average of the tops of both heaps

The algorithm works as follows:

1. When adding a number, push it onto the max heap first. If the top of the max heap is now greater than the top of the min heap, move that top element to the min heap, so that every element in the smaller half is <= every element in the larger half
2. After adding, rebalance the heaps if needed to maintain the size property
3. To find the median, compare the sizes of the heaps and return either the top of the max heap or the average of both tops
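Python's `heapq` module only provides a min heap, so the max heap for the smaller half is usually simulated by storing negated values. A minimal sketch of that trick follows; the helper names `push_max`, `peek_max`, and `pop_max` are hypothetical and exist only for illustration.

```python
import heapq


def push_max(heap, value):
    """Push onto a simulated max heap by storing the negated value."""
    heapq.heappush(heap, -value)


def peek_max(heap):
    """Return the largest value without removing it."""
    return -heap[0]


def pop_max(heap):
    """Remove and return the largest value."""
    return -heapq.heappop(heap)


# Build the "smaller half" from a few sample values
smaller_half = []
for value in [5, 1, 3]:
    push_max(smaller_half, value)

largest = peek_max(smaller_half)  # the top of the simulated max heap
```

Because the smallest negated value corresponds to the largest original value, `heap[0]` always gives constant-time access to the maximum.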

In [None]:
import heapq


class MedianFinder:
    def __init__(self):
        # Max heap for the smaller half (values are negated to simulate a max heap)
        self.small = []
        # Min heap for the larger half
        self.large = []

    def addNum(self, num: int) -> None:
        # Default strategy: add to the small heap first
        heapq.heappush(self.small, -num)

        # Ensure every element in small is <= every element in large:
        # move the largest element from small to large if needed
        if self.small and self.large and -self.small[0] > self.large[0]:
            val = -heapq.heappop(self.small)
            heapq.heappush(self.large, val)

        # Balance the heaps so that small has the same size as large or one more
        if len(self.small) > len(self.large) + 1:
            val = -heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        elif len(self.large) > len(self.small):
            val = heapq.heappop(self.large)
            heapq.heappush(self.small, -val)

    def findMedian(self) -> float:
        if len(self.small) > len(self.large):
            # Odd number of elements: the small heap holds the extra element
            return float(-self.small[0])
        # Even number of elements: average the two middle values
        return (-self.small[0] + self.large[0]) / 2

## Time and Space Complexity
**Time Complexity:**

* `addNum`: O(log n) - each call performs a constant number of heap pushes and pops, and each of those takes O(log n) time, where n is the number of elements.
* `findMedian`: O(1) - we only need to look at the top elements of the heaps.

**Space Complexity:**

* O(n) - we store all n elements across the two heaps.

**Follow-up Optimizations:**

1. If all numbers are in range [0, 100]:
   * We could use a counting array of size 101 to track frequencies
   * Finding the median would involve counting from both ends until we reach the middle
   * This would give O(1) time for `addNum` and O(100) = O(1) time for `findMedian`
   * Space complexity would be O(1), as we only need 101 counters
2. If 99% of numbers are in range [0, 100]:
   * Use the counting array for numbers in the range
   * Use separate data structures (like the heap solution) for outliers
   * Keep track of how many numbers are in each structure to find the median
   * In-range insertions would then be O(1), and only the roughly 1% of outliers would pay the cost of the slower structure
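The first follow-up (all values in [0, 100]) can be sketched with a counting array. This is a minimal sketch under that assumption, not a tuned implementation: the class name `CountingMedianFinder` and the helper `_kth` are hypothetical, and the helper scans cumulative counts from the low end rather than from both ends, which locates the same middle element(s).

```python
class CountingMedianFinder:
    """Sketch for the first follow-up: every value is assumed to lie in [0, 100]."""

    def __init__(self):
        self.counts = [0] * 101  # counts[v] = number of times v was added
        self.total = 0           # total number of values added so far

    def addNum(self, num: int) -> None:
        if not 0 <= num <= 100:
            raise ValueError("this sketch only handles values in [0, 100]")
        self.counts[num] += 1
        self.total += 1

    def _kth(self, k: int) -> int:
        """Return the k-th smallest value (0-indexed) via cumulative counts."""
        seen = 0
        for value, count in enumerate(self.counts):
            seen += count
            if seen > k:
                return value
        raise IndexError("k is out of range")

    def findMedian(self) -> float:
        if self.total == 0:
            raise ValueError("no elements have been added")
        # For an odd total both indices coincide; for an even total they
        # are the two middle positions.
        lower = self._kth((self.total - 1) // 2)
        upper = self._kth(self.total // 2)
        return (lower + upper) / 2
```

Each `findMedian` call scans at most 101 counters twice, so its cost does not grow with the number of elements added.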

## Test Cases


In [None]:
def test_median_finder():
    # Test case 1: Example from the problem
    mf1 = MedianFinder()
    mf1.addNum(1)
    mf1.addNum(2)
    assert mf1.findMedian() == 1.5
    mf1.addNum(3)
    assert mf1.findMedian() == 2.0

    # Test case 2: Larger sequence with odd count
    mf2 = MedianFinder()
    for num in [5, 15, 1, 3, 2, 8, 7]:
        mf2.addNum(num)
    assert mf2.findMedian() == 5.0

    # Test case 3: Larger sequence with even count
    mf3 = MedianFinder()
    for num in [5, 15, 1, 3, 2, 8, 7, 9]:
        mf3.addNum(num)
    assert mf3.findMedian() == 6.0

    # Test case 4: Duplicate values
    mf4 = MedianFinder()
    for num in [1, 1, 1, 2, 3]:
        mf4.addNum(num)
    assert mf4.findMedian() == 1.0

    # Test case 5: Negative numbers
    mf5 = MedianFinder()
    for num in [-5, -10, 4, 0, 3]:
        mf5.addNum(num)
    assert mf5.findMedian() == 0.0

    print("All test cases passed!")


test_median_finder()  # run the tests