Sketching algorithms implemented in Chapel and Python
README.md
driver-hh.chpl
driver-hh.py
driver-hll.chpl
driver-hll.py
driver-quant.chpl
driver-quant.py
driver-sampling.chpl
driver-sampling.py
driver-theta.chpl
driver-theta.py
heavyhitter.chpl
heavyhitter.py
hll.chpl
hll.py
makefile
quantile.chpl
quantile.py
run.sh
sampling.chpl
sampling.py
theta.chpl
theta.py

CHIUW2017

chpl-sketching

Sketching algorithms implemented in Chapel

Sketch Origins:

Sketching is a relatively recent development in the theoretical field of Stochastic Streaming Algorithms, which deals with algorithms that can extract information from a stream of data in a single pass (sometimes called “one-touch” processing) using various randomization techniques.

HyperLogLog

HyperLogLog on Wikipedia:

HyperLogLog is an algorithm for the count-distinct problem, approximating the number of distinct elements in a multiset ... The HyperLogLog algorithm can estimate cardinalities well beyond 10^9 with a relative accuracy (standard error) of 2% while only using 1.5 kB of memory.

count-distinct problem on Wikipedia:

The count-distinct problem (also known in applied mathematics as the cardinality estimation problem) is the problem of finding the number of distinct elements in a data stream with repeated elements.
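
For contrast with the sketches in this repository, the exact answer to the count-distinct problem can be computed by remembering every element seen so far. The snippet below is a minimal illustration only; the function name `count_distinct_exact` is hypothetical and does not come from the repository's code. Its memory use grows with the number of distinct elements, which is the cost that sketches such as HyperLogLog avoid.

```python
def count_distinct_exact(stream):
    """Exact count-distinct baseline: one pass, but O(distinct) memory."""
    seen = set()
    for item in stream:
        seen.add(item)  # duplicates are absorbed by the set
    return len(seen)


if __name__ == "__main__":
    print(count_distinct_exact(["a", "b", "a", "c", "b"]))
```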

Cardinality Estimation for Big Data:

HyperLogLog takes advantage of the randomized distribution of bits from hashing functions in order to estimate how many things you would’ve needed to see in order to experience a specific phenomenon.
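
The idea quoted above can be sketched in a few lines of Python. This is a simplified illustration, not the code in `hll.py`: the class name, the choice of SHA-1 as the hash function, and the default of 2^11 registers are all assumptions made for the example. Each item is hashed to 64 bits, the first `p` bits pick a register, and the register keeps the longest run of leading zeros (plus one) seen in the remaining bits. The estimate combines the registers with a harmonic mean, and falls back to linear counting when many registers are still empty.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog sketch (illustrative, not the repository's code)."""

    def __init__(self, p=11):
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16 for this sketch")
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    @staticmethod
    def _hash64(item):
        # Assumption: SHA-1 truncated to 64 bits stands in for a fast hash.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, item):
        x = self._hash64(item)
        width = 64 - self.p
        idx = x >> width                 # first p bits select the register
        rest = x & ((1 << width) - 1)    # remaining bits
        # Position of the leftmost 1-bit in `rest` (width + 1 if rest == 0).
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        harmonic = sum(2.0 ** -r for r in self.registers)
        raw = self.alpha * self.m * self.m / harmonic
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros > 0:
            # Small-range correction: linear counting on empty registers.
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(100000):
        hll.add(i % 20000)   # 20000 distinct values, each repeated
    print(round(hll.estimate()))
```

The standard error of the estimate is roughly 1.04 / sqrt(m), so doubling the accuracy requires four times as many registers. Memory use depends only on `m`, not on the length of the stream.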