Approximately counting unique items in a stream. Algorithms course project at KAUST.

Streaming Unique Counting

Algorithms, KAUST, Fall 2013 - with Prof. Mikhail Moshkov.

Implementations of HyperLogLog and adaptive sampling for approximately counting the unique items in a stream. A report describing the empirical comparison results is also available.
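For readers new to HyperLogLog, here is a minimal sketch of the idea. It is not the code in this repository: the names `hash64` and `hyperloglog_estimate` and the precision parameter `p` are illustrative assumptions, and it uses `hashlib` from the standard library as a stand-in 64-bit hash so that it runs without extra packages.

```python
import hashlib
import math


def hash64(item: str) -> int:
    """Map a string to a 64-bit integer (hypothetical stand-in hash)."""
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def hyperloglog_estimate(items, p: int = 12) -> float:
    """Estimate the number of distinct items using m = 2**p registers."""
    m = 1 << p
    registers = [0] * m
    tail_bits = 64 - p
    for item in items:
        h = hash64(item)
        index = h >> tail_bits                # top p bits select a register
        rest = h & ((1 << tail_bits) - 1)     # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits
        # (tail_bits + 1 if they are all zero).
        rank = tail_bits - rest.bit_length() + 1
        if rank > registers[index]:
            registers[index] = rank

    alpha = 0.7213 / (1 + 1.079 / m)          # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    # Small-range correction: fall back to linear counting while many
    # registers are still empty.
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros > 0:
        return m * math.log(m / zeros)
    return raw
```

With `p = 12` the sketch keeps 4096 small registers regardless of stream length, and the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%. Repeated items never change a register, so duplicates do not affect the estimate.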


pip install mmh3
git clone
cd streaming-unique-counting

python "test_data/wuthering_heights.txt" 0 exact
python "test_data/wuthering_heights.txt" 0 adaptive 1150

python "test_data/big_test/" 1 exact
python "test_data/big_test/" 1 adaptive 5000
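
The trailing number in the adaptive commands above (1150, 5000) appears to be a bound on the sample size; that reading is an assumption, not something this README states. Under that assumption, adaptive sampling can be sketched as follows. The names `hash64`, `adaptive_sampling_estimate`, and `capacity` are hypothetical, and `hashlib` again stands in for the hash function.

```python
import hashlib


def hash64(item: str) -> int:
    """Map a string to a 64-bit integer (hypothetical stand-in hash)."""
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def adaptive_sampling_estimate(items, capacity: int = 1150) -> int:
    """Estimate the number of distinct items while storing at most `capacity` hashes."""
    depth = 0        # a hash is sampled only if its low `depth` bits are all zero
    sample = set()   # distinct sampled hash values
    for item in items:
        h = hash64(item)
        if h & ((1 << depth) - 1) == 0:
            sample.add(h)
            # On overflow, raise the depth and drop hashes that no longer
            # qualify; each increase keeps about half of the sample.
            while len(sample) > capacity:
                depth += 1
                mask = (1 << depth) - 1
                sample = {x for x in sample if x & mask == 0}
    # Each surviving hash represents about 2**depth distinct items.
    return len(sample) * (1 << depth)
```

While the number of distinct items stays below `capacity`, the depth remains 0 and the result is the exact count (barring hash collisions). Beyond that, memory stays bounded and the accuracy improves as `capacity` grows.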