Streaming Count Sketches with HyperLogLog in Amazon MemoryDB for Redis
Updated Mar 11, 2024 - Python
Approximate Privacy-Preserving Neighbourhood Estimations
This repository contains several projects completed for the Stream Processing Analytics course in IE HST's MS in Business Analytics and Big Data.
Distributed Cardinality Tracking
UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting
Yet Another Lame Algorithm Library
Implementation and experimental tests of various algorithms.
Experiments with RedisBloom and the text from Moby Dick
Python implementations of the Flajolet-Martin, LogLog, SuperLogLog, and HyperLogLog cardinality estimation algorithms, used here to estimate the number of unique traffic violations in NYC in the 2019 fiscal year.
A simple, time-tested family of random hash functions in Python, based on CRC32 and xxHash, affine transformations, and the Mersenne Twister. 🎲
Exploring Probabilistic Data Structures in Python: my talk from PyCon USA 2021, PyCon Australia 2021, and PyCon MEA 2022.
HyperLogLog and other probabilistic data structures for mining in data streams
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW