Otmar Ertl (@oertl) did a great job on HyperLogLog improvements in his article "New cardinality estimation algorithms for HyperLogLog sketches" https://arxiv.org/abs/1702.01284. Later he proposed changing the Redis implementation of HyperLogLog (redis/redis#4749), and @antirez found that the new implementation is around 20% faster, simpler, and has lower theoretical error.
We could improve the ClickHouse uniqHLL12 and uniqCombined aggregate functions, and maybe introduce a new function uniqHLL(q, p)(x), using the same algorithm.
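For orientation, here is a rough sketch of the histogram-based estimator as I understand it from the paper and the Redis change. It is not ClickHouse code: the names `hllSigma`, `hllTau` and `estimateFromHistogram` and the histogram layout are assumptions made for illustration. The input is a histogram of register values, where `histogram[k]` is the number of registers holding value `k` for `k` in `[0, q + 1]`.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// sigma(x) = x + sum_{k>=1} x^(2^k) * 2^(k-1), iterated until it stops changing.
inline double hllSigma(double x)
{
    if (x == 1.0)
        return std::numeric_limits<double>::infinity();
    double y = 1.0;
    double z = x;
    double z_prev;
    do
    {
        x *= x;
        z_prev = z;
        z += x * y;
        y += y;
    } while (z_prev != z);
    return z;
}

// tau(x) = (1 - x - sum_{k>=1} (1 - x^(2^-k))^2 * 2^-k) / 3.
inline double hllTau(double x)
{
    if (x == 0.0 || x == 1.0)
        return 0.0;
    double y = 1.0;
    double z = 1.0 - x;
    double z_prev;
    do
    {
        x = std::sqrt(x);
        z_prev = z;
        y *= 0.5;
        z -= (1.0 - x) * (1.0 - x) * y;
    } while (z_prev != z);
    return z / 3.0;
}

// Hypothetical helper: histogram must have q + 2 entries (values 0..q+1)
// and a non-zero total register count m.
inline double estimateFromHistogram(const std::vector<std::uint64_t> & histogram)
{
    const double alpha_inf = 0.5 / std::log(2.0); // 1 / (2 ln 2)
    const std::size_t q = histogram.size() - 2;

    double m = 0.0;
    for (std::uint64_t count : histogram)
        m += static_cast<double>(count);

    // Contribution of saturated registers (value q + 1).
    double z = m * hllTau((m - static_cast<double>(histogram[q + 1])) / m);
    // Horner-style accumulation of histogram[k] * 2^-k for k = q..1.
    for (std::size_t k = q; k >= 1; --k)
    {
        z += static_cast<double>(histogram[k]);
        z *= 0.5;
    }
    // Contribution of empty registers (value 0).
    z += m * hllSigma(static_cast<double>(histogram[0]) / m);

    return alpha_inf * m * m / z;
}
```

With 4096 registers (p = 12, so q = 52 for a 64-bit hash), an all-zero sketch yields an estimate of 0, and a sketch with a single register set to 1 yields an estimate close to 1. No separate small-range or large-range correction branch is needed in this form, which is where the simplification comes from.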
…d overflow)
The return type of uniqCombined() is UInt64, but uniqCombined() uses
CombinedCardinalityEstimator, and CombinedCardinalityEstimator::size()
returns UInt32, while the underlying HyperLogLog::size() returns
UInt64.
After this patch, uniqCombined() can be used for values above UINT_MAX. The
outcome is not ideal (ClickHouse#2073) but at least sane.
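
The narrowing described above can be illustrated with a minimal, self-contained sketch. The types `WideEstimator`, `NarrowWrapper` and `WideWrapper` are hypothetical stand-ins and not the actual ClickHouse classes. They only show how a 64-bit count is silently reduced modulo 2^32 when a wrapper's size() returns a 32-bit type.

```cpp
#include <cstdint>

// Hypothetical stand-in for an estimator whose size() is 64-bit.
struct WideEstimator
{
    std::uint64_t count;
    std::uint64_t size() const { return count; }
};

// Hypothetical wrapper whose size() narrows the result to 32 bits,
// mimicking the mismatch described in the commit message.
struct NarrowWrapper
{
    WideEstimator inner;
    std::uint32_t size() const { return static_cast<std::uint32_t>(inner.size()); }
};

// Same wrapper with the return type widened to 64 bits.
struct WideWrapper
{
    WideEstimator inner;
    std::uint64_t size() const { return inner.size(); }
};
```

For a count of 5,000,000,000, `NarrowWrapper::size()` returns 705,032,704 (the value modulo 2^32), while `WideWrapper::size()` returns the full count.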