ddth-simplehll

Simplify the usage of HyperLogLog library.

Project home: https://github.com/DDTH/ddth-simplehll

There are many excellent HLL libraries, and this small library is not another HLL implementation. Its aim is to simplify the usage of existing HLL libraries by:

  • Wrapping a unified, simple API around other HLL libraries.
  • Providing default parameters for each HLL library that balance accuracy against storage.

License

See LICENSE.txt for details. Copyright (c) 2016 Thanh Ba Nguyen.

Third party libraries are distributed under their own license(s).

Installation

Latest release version: 0.1.2. See RELEASE-NOTES.md.

Maven dependency: if you use only a subset of ddth-simplehll's functionality, choose the corresponding artifact(s) to avoid pulling in unused jar files.

ddth-simplehll-core: includes Prasanth Jayachandran's HLL implementation.

<dependency>
    <groupId>com.github.ddth</groupId>
    <artifactId>ddth-simplehll-core</artifactId>
    <version>0.1.2</version>
</dependency>

ddth-simplehll-ak: includes ddth-simplehll-core and Aggregate Knowledge's HLL implementation.

<dependency>
    <groupId>com.github.ddth</groupId>
    <artifactId>ddth-simplehll-ak</artifactId>
    <version>0.1.2</version>
    <type>pom</type>
</dependency>

ddth-simplehll-ats: includes ddth-simplehll-core and AddThis' Stream HLL implementation.

<dependency>
    <groupId>com.github.ddth</groupId>
    <artifactId>ddth-simplehll-ats</artifactId>
    <version>0.1.2</version>
    <type>pom</type>
</dependency>

ddth-simplehll-all: includes all HLL implementations.

<dependency>
    <groupId>com.github.ddth</groupId>
    <artifactId>ddth-simplehll-all</artifactId>
    <version>0.1.2</version>
    <type>pom</type>
</dependency>

Usage

Initialize an IHLL instance using one of the available implementations:

IHLL hll = new PjHll().init();  // Prasanth Jayachandran's implementation

IHLL hll = new AkHll().init();  // or Aggregate Knowledge's implementation

IHLL hll = new AtsHll().init(); // or AddThis' Stream implementation

Add items and count cardinality:

//add an item
hll.add(obj);

//count number of distinct items
long cardinality = hll.count();

Merge/Union:

hll.merge(anotherHLL);
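
The calls above leave two properties implicit: adding the same item twice should not raise the count, and a merge behaves like a set union rather than a sum of the two counts. The sketch below illustrates both with an exact, HashSet-backed stand-in. The names DistinctCounter, ExactCounter, and MergeDemo are hypothetical and are not part of ddth-simplehll; a real IHLL returns an approximate count while using far less memory.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the add/count/merge contract (not the library's IHLL). */
interface DistinctCounter {
    void add(Object item);

    long count();

    void merge(DistinctCounter other);
}

/** Exact, set-backed implementation used only to illustrate the semantics. */
final class ExactCounter implements DistinctCounter {
    private final Set<Object> items = new HashSet<>();

    @Override
    public void add(Object item) {
        // Adding an item that is already present leaves the count unchanged.
        items.add(item);
    }

    @Override
    public long count() {
        return items.size();
    }

    @Override
    public void merge(DistinctCounter other) {
        // This sketch can only merge counters of its own kind.
        if (!(other instanceof ExactCounter)) {
            throw new IllegalArgumentException("can only merge ExactCounter instances");
        }
        // Union: items seen by both counters are counted once.
        items.addAll(((ExactCounter) other).items);
    }
}

final class MergeDemo {
    public static void main(String[] args) {
        DistinctCounter monday = new ExactCounter();
        monday.add("alice");
        monday.add("bob");
        monday.add("alice"); // duplicate

        DistinctCounter tuesday = new ExactCounter();
        tuesday.add("bob");
        tuesday.add("carol");

        monday.merge(tuesday);
        System.out.println(monday.count()); // prints 3, not 4: "bob" is shared
    }
}
```

In this stand-in, merging changes only the receiving counter; the argument is left as it was.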

Serialize and deserialize:

byte[] data = HLLUtils.toBytes(hll);

IHLL hll = HLLUtils.fromBytes(data);

Comparison of HLL implementations

See http://koff.io/posts/comparison-of-hll/ and COMPARE.md.

Credits