This repository has been archived by the owner on May 12, 2021. It is now read-only.

METRON-627: Add HyperLogLogPlus implementation to Stellar #397

Closed
wants to merge 9 commits into from
2 changes: 2 additions & 0 deletions dependencies_with_url.csv
@@ -23,6 +23,7 @@ com.maxmind.geoip2:geoip2:jar:2.8.0:compile,Apache v2,https://github.com/maxmind
com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:compile,CDDL,http://jaxb.java.net/
com.sun.xml.bind:jaxb-impl:jar:2.2.5-2:compile,CDDL,http://jaxb.java.net/
com.twitter:jsr166e:jar:1.1.0:compile,CC0 1.0 Universal,http://github.com/twitter/jsr166e
it.unimi.dsi:fastutil:jar:7.0.6:compile,ASLv2,https://github.com/vigna/fastutil
Member

Any idea whether we can rely on just 7.0.6, rather than on both 6.5.11 and 7.0.6? Does streamlib depend on both?

Contributor Author

I have to look at this further. I just did another cursory dependency review and I only see 7.0.6 referenced. 6.5.11 was definitely showing up in the report.

Contributor Author

Update - this 6.5.11 dep was added somewhere else along the way: org.apache.solr:solr-test-framework:jar:5.2.1:test

Contributor Author

This is part of the transitive test dependencies for Solr. I excluded fastutil and ran the integration tests for metron-solr and nothing broke. Worst case scenario, we can later add 7.0.6 as a test dep if needed, or simply keep the original. But seeing as it works without it, I'm inclined to keep the exclusion unless anyone has any objections.
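
For reference, the exclusion described above might look roughly like the fragment below. This is a sketch, not the change made in this PR: the coordinates come from the comments in this thread, but which pom.xml declares `solr-test-framework`, and whether its version is written literally or through a property, is assumed here.

```xml
<!-- Sketch only: placement and the literal version are assumptions -->
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-test-framework</artifactId>
  <version>5.2.1</version>
  <scope>test</scope>
  <exclusions>
    <!-- Drop the transitive fastutil 6.5.11 so that only 7.0.6 is resolved -->
    <exclusion>
      <groupId>it.unimi.dsi</groupId>
      <artifactId>fastutil</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running `mvn dependency:tree -Dincludes=it.unimi.dsi:fastutil` before and after is one way to check which fastutil versions are still being resolved.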

javassist:javassist:jar:3.12.1.GA:compile,Apache v2,http://www.javassist.org/
javax.activation:activation:jar:1.1:compile,Common Development and Distribution License (CDDL) v1.0,http://java.sun.com/products/javabeans/jaf/index.jsp
javax.annotation:jsr250-api:jar:1.0:compile,COMMON DEVELOPMENT AND DISTRIBUTION LICENSE (CDDL) Version 1.0,http://jcp.org/aboutJava/communityprocess/final/jsr250/index.html
@@ -91,6 +92,7 @@ com.github.tony19:named-regexp:jar:0.2.3:compile,Apache License, Version 2.0,
com.google.code.findbugs:jsr305:jar:1.3.9:compile,The Apache Software License, Version 2.0,http://findbugs.sourceforge.net/
com.google.code.findbugs:jsr305:jar:3.0.0:compile,The Apache Software License, Version 2.0,http://findbugs.sourceforge.net/
com.carrotsearch:hppc:jar:0.7.1:compile,ASLv2,
com.clearspring.analytics:stream:jar:2.9.5:compile,ASLv2,https://github.com/addthis/stream-lib
com.codahale.metrics:metrics-core:jar:3.0.2:compile,MIT,https://github.com/codahale/metrics
com.codahale.metrics:metrics-graphite:jar:3.0.2:compile,MIT,https://github.com/codahale/metrics
com.esotericsoftware.reflectasm:reflectasm:jar:shaded:1.07:compile,BSD,https://github.com/EsotericSoftware/reflectasm
646 changes: 646 additions & 0 deletions metron-analytics/metron-statistics/HLLP.md

Large diffs are not rendered by default.

28 changes: 28 additions & 0 deletions metron-analytics/metron-statistics/README.md
@@ -10,6 +10,34 @@ functions can be used from everywhere where Stellar is used.

## Stellar Functions

### Approximation Statistics

#### `HLLP_ADD`
* Description: Adds a value to the HyperLogLogPlus estimator set. See [HLLP README](HLLP.md)
* Input:
    * hyperLogLogPlus - the hllp estimator to add a value to
    * value+ - value to add to the set. Takes a single item or a list.
* Returns: The HyperLogLogPlus set with the new value added

#### `HLLP_CARDINALITY`
* Description: Returns the HyperLogLogPlus-estimated cardinality for this set. See [HLLP README](HLLP.md)
* Input:
    * hyperLogLogPlus - the hllp set
* Returns: Long value representing the cardinality for this set

#### `HLLP_INIT`
* Description: Initializes the HyperLogLogPlus estimator set. p must be a value between 4 and sp, and sp must be greater than 4 and less than 32. See [HLLP README](HLLP.md)
* Input:
    * p - the precision value for the normal set
    * sp - the precision value for the sparse set. If p is set, but sp is 0 or not specified, the sparse set will be disabled.
* Returns: A new HyperLogLogPlus set

#### `HLLP_MERGE`
* Description: Merges hllp sets together. The resulting estimator is initialized with the p and sp precision values from the first provided hllp estimator set. See [HLLP README](HLLP.md)
* Input:
    * hllp - List of hllp estimators to merge. Takes a single hllp set or a list.
* Returns: A new merged HyperLogLogPlus estimator set

### Mathematical Functions

#### `ABS`
7 changes: 7 additions & 0 deletions metron-analytics/metron-statistics/pom.xml
@@ -36,6 +36,13 @@
      <artifactId>metron-common</artifactId>
      <version>${project.parent.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.metron</groupId>
      <artifactId>metron-common</artifactId>
      <version>${project.parent.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-math3</artifactId>