Improve HyperLogLog documentation #1590

Merged: syvb merged 2 commits into latest from sv/hyperloglog-low-bucket-warning on Sep 13, 2022

syvb (Member) commented Sep 13, 2022

Description

Updates the HyperLogLog documentation:

  • Emphasizes the warning about low bucket count: using a value for buckets that's less than 1024 results in very poor
    accuracy, and is almost never wanted.
  • Removes the outdated claim about poor performance with low cardinality: this used to be true, but no longer is, since Toolkit now uses HyperLogLog++, which has better accuracy for low-cardinality data.
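
For a rough sense of scale behind the bucket-count warning: the classic HyperLogLog analysis gives a relative standard error of about 1.04 / sqrt(m) for m buckets. The sketch below assumes that textbook formula; it is not taken from the Toolkit source, the HyperLogLog++ implementation may behave somewhat differently, and the function name `hll_relative_error` is hypothetical.

```python
import math


def hll_relative_error(buckets: int) -> float:
    """Approximate relative standard error of a HyperLogLog estimate.

    Uses the textbook approximation 1.04 / sqrt(m), where m is the number
    of buckets. This is an assumption for illustration, not Toolkit's code.
    """
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    return 1.04 / math.sqrt(buckets)


if __name__ == "__main__":
    # Compare a few small bucket counts against 1024 and a larger value.
    for m in (16, 64, 256, 1024, 32768):
        print(f"{m:>6} buckets -> ~{hll_relative_error(m):.2%} relative error")
```

Under this approximation, 1024 buckets gives an error of roughly 3.25%, while 64 buckets gives roughly 13%, which is consistent with treating values below 1024 as a pitfall.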

Links

https://docs.timescale.com/api/latest/hyperfunctions/approx_count_distincts/hyperloglog/

Writing help

For information about style and word usage, see the style guide

Review checklists

Reviewers: use this section to ensure you have checked everything before approving this PR:

Subject matter expert (SME) review checklist

  • Is the content technically accurate?
  • Is the content complete?
  • Is the content presented in a logical order?
  • Does the content use appropriate names for features and products?
  • Does the content provide relevant links to further information?

Documentation team review checklist

  • Is the content free from typos?
  • Does the content use plain English?
  • Does the content contain clear sections for concepts, tasks, and references?
  • Have any images been uploaded to the correct location, and are resolvable?
  • If the page index was updated, are redirects required
    and have they been implemented?
  • Have you checked the built version of this content?

@github-actions

Please allow 10 minutes from last push for the staging site to build. For internal reviewers, check web-documentation repo actions for staging build status. Link to build for this PR: http://docs-dev.timescale.com/docs-sv-hyperloglog-low-bucket-warning

Using a value for buckets that's less than 1024 results in very poor
accuracy, and is almost never wanted. As such, emphasize the warning
about this pitfall.

The claim about poor performance with low cardinality used to be true,
but since it was written Toolkit has been updated to use HyperLogLog++,
which has better accuracy for low-cardinality datasets.
syvb force-pushed the sv/hyperloglog-low-bucket-warning branch from 47da524 to 28fe612 on September 13, 2022 20:39
syvb merged commit 42f0a6b into latest on Sep 13, 2022
syvb deleted the sv/hyperloglog-low-bucket-warning branch on September 13, 2022 23:07