Cardinality Estimators.
See HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm
Example:
HyperLogLog hll = new HyperLogLog();
int elements = 100000;
for (int i = 0; i < elements; i++) {
hll.add(new Object());
}
double cardinality = hll.cardinality();
System.out.println("Elements: " + elements);
System.out.println("Cardinality: " + cardinality);
double epsilon = (1.0 - (cardinality / elements)) * 100;
System.out.println("Epsilon: " + epsilon + "%");
- Add Murmur hashing
- implement concurrent version
- implement HyperLogLog++ (see HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm)
- Optimize and clean code