Set cardinality estimates using HyperLogLog implementation

Cardinality estimation using HyperLogLog algorithm

The HyperLogLog algorithm estimates the cardinality of a data set (i.e. the number of distinct elements in it) without having to store the actual elements seen, which a naive unique count implementation would require. Achieving a high degree of accuracy with a low memory footprint depends on choosing a good hash algorithm, one whose output bits are close to uniformly distributed.
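The estimation procedure can be sketched in a few lines of JavaScript. This is a minimal illustration under stated assumptions, not the API of this module: the names `hash32` and `createHyperLogLog` are hypothetical, the hash is an FNV-1a-style loop with a MurmurHash3-style finalizer chosen only for brevity, and the large-range correction from the paper is left out.

```javascript
'use strict';

// Hypothetical 32-bit string hash: FNV-1a-style mixing plus a
// MurmurHash3-style finalizer to spread entropy across all bits.
function hash32(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // force unsigned 32-bit
}

// Hypothetical factory: b index bits give m = 2^b one-byte registers.
function createHyperLogLog(b = 10, hash = hash32) {
  if (b < 4 || b > 16) throw new RangeError('b must be between 4 and 16');
  const m = 1 << b;
  const registers = new Uint8Array(m);
  // Bias-correction constant alpha_m from the HyperLogLog paper.
  const alpha = m === 16 ? 0.673
    : m === 32 ? 0.697
    : m === 64 ? 0.709
    : 0.7213 / (1 + 1.079 / m);

  return {
    add(value) {
      const x = hash(String(value));
      const index = x >>> (32 - b);   // top b bits select a register
      const rest = (x << b) >>> 0;    // remaining bits, left-aligned
      // Rank: 1-based position of the leftmost 1 bit in the remaining bits.
      const rank = rest === 0 ? 32 - b + 1 : Math.clz32(rest) + 1;
      if (rank > registers[index]) registers[index] = rank;
    },
    count() {
      let sum = 0;
      let zeros = 0;
      for (let j = 0; j < m; j++) {
        sum += Math.pow(2, -registers[j]);
        if (registers[j] === 0) zeros++;
      }
      let estimate = (alpha * m * m) / sum; // raw harmonic-mean estimate
      if (estimate <= 2.5 * m && zeros > 0) {
        // Small-range correction: fall back to linear counting.
        estimate = m * Math.log(m / zeros);
      }
      return Math.round(estimate);
    },
  };
}

// Usage: duplicates do not change the estimate.
const visitors = createHyperLogLog(12);
['alice', 'bob', 'alice', 'carol'].forEach((v) => visitors.add(v));
console.log(visitors.count());
```

With `b = 12` the sketch keeps 4096 one-byte registers regardless of how many elements are added, and the paper's standard error of about 1.04/sqrt(m) works out to roughly 1.6%.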


npm install cardinality



Recognizing that other people might not use the algorithm in the exact same way I do, I have attempted to preserve the integrity of the core algorithm while allowing end-users to extend many pieces of the implementation; in particular, the hash algorithm and the storage mechanisms are designed to be easily replaced in a modular fashion.
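To illustrate what replaceable hash and storage pieces could look like, here is a sketch under assumed interfaces. Nothing below is taken from this module: `createEstimator`, `sha256Hash32`, `denseStore`, `sparseStore`, and the `get`/`set` storage contract are all hypothetical names invented for this example.

```javascript
'use strict';
const crypto = require('crypto');

// Hypothetical hash plug-in: first 4 bytes of a SHA-256 digest as a uint32.
function sha256Hash32(value) {
  return crypto.createHash('sha256').update(String(value)).digest().readUInt32BE(0);
}

// Hypothetical storage plug-ins. Both expose get(index) and set(index, rank).
function denseStore(m) {
  const registers = new Uint8Array(m); // fixed m bytes
  return {
    get: (i) => registers[i],
    set: (i, v) => { registers[i] = v; },
  };
}

function sparseStore() {
  const registers = new Map(); // only touched registers use memory
  return {
    get: (i) => registers.get(i) || 0,
    set: (i, v) => { registers.set(i, v); },
  };
}

// Hypothetical estimator that only talks to the hash and storage
// through the small interfaces above (assumes 7 <= b <= 16).
function createEstimator({ b = 12, hash = sha256Hash32, store } = {}) {
  const m = 1 << b;
  const registers = store || denseStore(m);
  const alpha = 0.7213 / (1 + 1.079 / m); // valid for m >= 128

  return {
    add(value) {
      const x = hash(value) >>> 0;
      const index = x >>> (32 - b);
      const rest = (x << b) >>> 0;
      const rank = rest === 0 ? 32 - b + 1 : Math.clz32(rest) + 1;
      if (rank > registers.get(index)) registers.set(index, rank);
    },
    count() {
      let sum = 0;
      let zeros = 0;
      for (let j = 0; j < m; j++) {
        const r = registers.get(j);
        sum += Math.pow(2, -r);
        if (r === 0) zeros++;
      }
      let estimate = (alpha * m * m) / sum;
      if (estimate <= 2.5 * m && zeros > 0) {
        estimate = m * Math.log(m / zeros); // linear counting for small sets
      }
      return Math.round(estimate);
    },
  };
}

// Usage: same core algorithm, different storage behind it.
const inMemory = createEstimator();
const sparse = createEstimator({ store: sparseStore() });
for (const id of ['a', 'b', 'c', 'a']) {
  inMemory.add(id);
  sparse.add(id);
}
console.log(inMemory.count(), sparse.count());
```

Keeping the core behind two narrow interfaces means a storage backed by a database or a different hash function can be dropped in without touching the estimation logic, which is the kind of modularity described above.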

Known extensions:


Many tech bloggers and scalability evangelists have been writing about HyperLogLog and related ideas recently; however, this work is principally derived from the following sources:

  1. The GitHub repo of reference PHP and JavaScript implementations of the LogLog and HyperLogLog algorithms by Vadim Semenov, from which this repository was originally forked.

  2. The paper by Philippe Flajolet, Éric Fusy, Olivier Gandouet and Frédéric Meunier entitled "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm". A copy is included in this repository at blob/master/HyperLogLog.pdf for your reference.

  3. (For future work) A description of a minor HyperLogLog variation which provides for sliding windows of estimation.