A streaming data pipeline to perform basic analytics with scalability in mind
Updated Feb 25, 2018 - Scala
Implementation of HyperLogLog algorithms for distinct-count estimation
Spark with probabilistic algorithms: Bloom filter, HLL, QTree, and Count-min sketch