Commit 7b49406

Problem: 169, Find the median of the stream
1 parent b84580b commit 7b49406

2 files changed: 142 additions, 1 deletion

README.md

Lines changed: 2 additions & 1 deletion
@@ -6,7 +6,7 @@

 | Current Status| Stats |
 | :------------: | :----------: |
-| Total Problems | 168 |
+| Total Problems | 169 |

 </center>

@@ -240,4 +240,5 @@ Include contains single header implementation of data structures and some algori
 |Product of Array Except Self. Given an array of n integers where n > 1, nums, return an array output such that output[i] is equal to the product of all the elements of nums except nums[i].| [product_except_self.cpp](leet_code_problems/product_except_self.cpp)|
 |Given a sorted array, remove duplicates in place and return the new length. It doesn't matter what is in array beyond the unique elements size. Expected O(1) space and O(n) time complexity.| [remove_duplicates.cpp](leet_code_problems/remove_duplicates.cpp) |
 | Count the number of islands in a grid. Given a grid representing 1 as land body, and 0 as water body, determine the number of islands (more details in problem comments)|[count_islands.cpp](leet_code_problems/count_islands.cpp)|
+| Find median from a data stream. Design a data structure that supports addNum to add a number to the stream, and findMedian to return the median of the numbers seen so far. If the count of numbers is even, return the average of the two middle elements; otherwise return the middle element.|[median_stream.cpp](leet_code_problems/median_stream.cpp)|


leet_code_problems/median_stream.cpp

Lines changed: 140 additions & 0 deletions
@@ -0,0 +1,140 @@
/*
 * Source   : Leetcode, 295. Find Median from Data Stream
 * Category : hard
 *
 * Find median from a data stream. Design a data structure that supports addNum to add a number to the stream,
 * and findMedian to return the median of the numbers seen so far. If the count of numbers is even, the median
 * is the average of the two middle elements; otherwise it is the middle element.
 *
 * In a naive approach, we could maintain an internal vector, sort it every time the median is requested,
 * and return the middle element (odd count) or the mean of the two middle elements (even count).
 *
 * However, this doesn't scale well as the vector grows or when findMedian is called frequently,
 * since each query costs O(n log n). We want a solution that reduces the cost of finding the median,
 * even if it makes addNum a little slower.
 *
 * We could use a structure that keeps the stream ordered, so that the median can be retrieved in O(1)
 * time and a number can be added in O(log n) time.
 *
 * This can be achieved with a self-balancing binary search tree, or with two heaps.
 * We solve it here with the two-heap approach.
 *
 * Why two heaps?
 * We want the two middle numbers (in the even-count case). If we maintain two heaps, one max-heap and one
 * min-heap, the max-heap holds the lower half of the ordered numbers while the min-heap holds the
 * upper half.
 * Consider an even count of numbers in the stream so far. The top of the lower heap (max-heap)
 * is the left middle number of the stream and the top of the upper heap (min-heap) is the
 * right middle number. Thus we can retrieve the two middle numbers in O(1) time.
 *
 * [1, 2, 3, 4, 5]  <-- values in lower heap (max-heap)
 * [6, 7, 8, 9, 10] <-- values in upper heap (min-heap)
 *
 * lower.top() = 5
 * upper.top() = 6
 * So we want (6 + 5) / 2 = 5.5 as the answer.
 *
 * Now, the algorithm:
 * To keep the two middle numbers of the stream on top of the two heaps,
 * we have to rebalance them after each new number is added.
 *
 * When the count is odd, the lower heap holds one more element than the upper heap.
 * When the count is even, both heaps hold the same number of elements.
 *
 * To add a new number, we push it onto lower, then move the top of lower to upper so that every
 * element of lower is <= every element of upper. If upper is now larger than lower, we move the top
 * of upper back to lower to restore the size property described above.
 */

#include <algorithm>   // std::reverse
#include <functional>  // std::greater
#include <iostream>
#include <queue>
#include <random>
#include <vector>

class MedianFinder {
public:
    MedianFinder() {}
    void addNum(int num)
    {
        // First add it to lower.
        lower.push(num);

        // Balance the heaps to set the right values at the top.
        upper.push(lower.top());
        lower.pop();

        // Maintain the size property.
        if (lower.size() < upper.size()) {
            lower.push(upper.top());
            upper.pop();
        }
    }

    // Precondition: at least one number has been added (top() on an empty heap is undefined).
    double findMedian()
    {
        // Convert to double before adding so the sum of two large ints cannot overflow.
        return (lower.size() > upper.size()) ? static_cast<double>(lower.top()) :
            (static_cast<double>(lower.top()) + upper.top()) / 2;
    }

    // Utility function to print the current sorted stream.
    void printSortedStream()
    {
        // Work on copies so the heaps themselves are left untouched.
        std::priority_queue<int> lc{lower};
        std::priority_queue<int, std::vector<int>, std::greater<int>> uc{upper};
        std::vector<int> temp;
        // Lower half first; the max-heap yields it in descending order.
        while (!lc.empty()) {
            temp.push_back(lc.top());
            lc.pop();
        }

        std::reverse(temp.begin(), temp.end());

        for (auto n : temp) {
            std::cout << n << " ";
        }

        while (!uc.empty()) {
            std::cout << uc.top() << " ";
            uc.pop();
        }

        std::cout << std::endl;
    }
private:
    // Max-heap maintaining the lower half of the ordered stream.
    std::priority_queue<int> lower;
    // Min-heap maintaining the upper half of the ordered stream.
    std::priority_queue<int, std::vector<int>, std::greater<int>> upper;
};


int main()
{
    std::default_random_engine generator;
    std::uniform_int_distribution<int> distribution(0, 20);
    MedianFinder medianFinder;

    for (int i = 0; i < 5; ++i)
    {
        medianFinder.addNum(distribution(generator));
    }

    // Current state of stream.
    medianFinder.printSortedStream();
    std::cout << "Median of current stream: " << medianFinder.findMedian()
              << std::endl;

    // Add 5 more.
    for (int i = 0; i < 5; ++i)
    {
        medianFinder.addNum(distribution(generator));
    }

    // Current state of stream.
    medianFinder.printSortedStream();
    std::cout << "Median of current stream: " << medianFinder.findMedian()
              << std::endl;

    return 0;
}
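
The file comment contrasts a naive sort-on-every-query approach with the two-heap approach. One way to gain confidence in the heap version is to compare the two after every insertion. The following is a minimal sketch of such a cross-check, not part of the commit: `TwoHeapMedian`, `naiveMedian`, and `mediansAgree` are hypothetical names introduced here, and `TwoHeapMedian` only restates the committed `addNum`/`findMedian` logic so that the snippet compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical restatement of the two-heap logic, kept minimal for the check.
class TwoHeapMedian {
public:
    void addNum(int num) {
        lower.push(num);           // push onto the max-heap first
        upper.push(lower.top());   // move its largest element to the min-heap
        lower.pop();
        if (lower.size() < upper.size()) {  // lower may never be the smaller heap
            lower.push(upper.top());
            upper.pop();
        }
    }

    double findMedian() const {
        assert(!lower.empty());    // assumption: at least one number was added
        if (lower.size() > upper.size()) {
            return lower.top();
        }
        return (static_cast<double>(lower.top()) + upper.top()) / 2;
    }

private:
    std::priority_queue<int> lower;                                       // max-heap
    std::priority_queue<int, std::vector<int>, std::greater<int>> upper;  // min-heap
};

// Naive baseline: sort a copy of everything seen so far on each query.
double naiveMedian(std::vector<int> values) {
    assert(!values.empty());
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();
    if (n % 2 == 1) {
        return values[n / 2];
    }
    return (static_cast<double>(values[n / 2 - 1]) + values[n / 2]) / 2;
}

// Returns true if both approaches report the same median after every insertion.
bool mediansAgree(const std::vector<int>& stream) {
    TwoHeapMedian finder;
    std::vector<int> seen;
    for (int x : stream) {
        finder.addNum(x);
        seen.push_back(x);
        if (finder.findMedian() != naiveMedian(seen)) {
            return false;
        }
    }
    return true;
}
```

Calling `mediansAgree` on a handful of fixed or randomly generated streams would exercise both the odd-count and even-count branches, since the count alternates with every insertion.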
