Algebird's HyperLogLog support for Facebook Presto (prestodb.io).
There are two types of functions: aggregation and scalar.
merge(<hll: HLL>) -> HLL
hll_create(<element:VARCHAR>, <bits: BIGINT>) -> HLL
cardinality(<hll: HLL>) -> BIGINT
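To make the semantics of these three signatures concrete, here is a minimal Python sketch of a toy HyperLogLog. It is not Algebird's implementation and not the plugin's code: the HLL class, the SHA-1 hash, the register layout and the small-range correction are all assumptions chosen for illustration. It also assumes that merge is the aggregation and that hll_create and cardinality are the scalars.

```python
import hashlib
import math


class HLL:
    """Toy sketch with 2**bits registers (hypothetical, for illustration only)."""

    def __init__(self, bits, registers=None):
        self.bits = bits
        self.m = 1 << bits
        self.registers = list(registers) if registers is not None else [0] * self.m


def hll_create(element, bits):
    """Scalar: build a sketch that contains a single element."""
    hll = HLL(bits)
    # Assumption: a 64-bit hash taken from SHA-1; the real plugin may hash differently.
    h = int.from_bytes(hashlib.sha1(element.encode("utf-8")).digest()[:8], "big")
    width = 64 - bits
    index = h >> width                    # top `bits` bits pick the register
    rest = h & ((1 << width) - 1)         # remaining bits feed the rank
    rank = width - rest.bit_length() + 1  # position of the leftmost 1-bit, at least 1
    hll.registers[index] = rank
    return hll


def merge(hlls):
    """Aggregation: combine sketches by taking the register-wise maximum."""
    hlls = list(hlls)
    if not hlls:
        raise ValueError("merge needs at least one sketch")
    bits = hlls[0].bits
    regs = [0] * (1 << bits)
    for h in hlls:
        if h.bits != bits:
            raise ValueError("all sketches must use the same number of bits")
        regs = [max(a, b) for a, b in zip(regs, h.registers)]
    return HLL(bits, regs)


def cardinality(hll):
    """Scalar: estimate the number of distinct elements in a sketch."""
    m = hll.m
    alpha = 0.7213 / (1 + 1.079 / m)  # bias constant, meant for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in hll.registers)
    zeros = hll.registers.count(0)
    if raw <= 2.5 * m and zeros > 0:
        # Small-range correction (linear counting on the empty registers).
        return round(m * math.log(m / zeros))
    return round(raw)


if __name__ == "__main__":
    # Duplicates do not change the estimate because merge is a register-wise max.
    names = ["alice", "bob", "carol", "alice", "bob"]
    merged = merge(hll_create(name, 12) for name in names)
    print(cardinality(merged))
```

In a query the same pattern would read as cardinality(merge(hll_create(col, bits))) over a group of rows, which is why sketches for separate partitions can be stored and combined later without rescanning the raw data.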
As this is a Presto plugin, it relies on presto-spi. This means you have to build Presto first, and this project expects to find Presto as its parent project.
Run mvn clean install in /presto. Once that has finished, run
mvn clean install in /presto/presto-hyperloglog.
Finally add the plugin to
Deployment on AWS EMR
The uber jar is now built by Maven.
mkdir /usr/lib/presto/plugin/presto-hyperloglog
cd /usr/lib/presto/plugin/presto-hyperloglog
sudo wget https://github.com/vitillo/presto-hyperloglog/raw/master/target/presto-hyperloglog-$PRESTO_VERSION-jar-with-dependencies.jar