feat(metrics): Another statsd metric to measure bucket duplication [INGEST-421] #1128

Merged (5 commits) on Nov 18, 2021

untitaker
Member

Add a set metric that measures the number of unique bucket keys observed at a point/interval in time. Combined with other metrics, this can help us measure how much bucket duplication happens because of horizontal scaling.

This requires us to implement some way of hashing bucket keys such that statsd can consume them. BucketKey already implements Hash so that it can be used in hashmaps. However, we cannot use the std hashers:

  • The docs for DefaultHasher explicitly state that the hashes may change across Rust releases (I guess this would not be too much of a blocker for our purpose, but it's annoying to keep in mind).

  • SipHasher is deprecated (but could probably be used).

We already have crc32fast in our dependency tree, so let's depend on it explicitly and just use that.

There's still one caveat: the impls of Hash may call different methods depending on CPU architecture (as stated in https://docs.rs/deterministic-hash/1.0.1/deterministic_hash/), so the same key is not guaranteed to hash identically on every platform. I think we can live with that for now.
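For illustration, here is a minimal sketch of the approach described above, not the code from this PR. To stay dependency-free it uses a small bitwise CRC-32 as a stand-in for `crc32fast::Hasher`. The `BucketKey` fields, the `bucket_key_hash` and `statsd_set_line` helpers, and any metric name passed in are all assumptions made up for the example. The only statsd detail relied on is the set line format `<name>:<value>|s`.

```rust
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Stand-in for `crc32fast::Hasher`: a bitwise CRC-32 (IEEE polynomial) over
/// every byte a `Hash` impl writes. Slower than crc32fast, but std-only.
struct Crc32Hasher {
    state: u32,
}

impl Crc32Hasher {
    fn new() -> Self {
        Self { state: 0xFFFF_FFFF }
    }
}

impl Hasher for Crc32Hasher {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.state ^= u32::from(byte);
            for _ in 0..8 {
                // All ones if the lowest bit is set, otherwise zero.
                let mask = (self.state & 1).wrapping_neg();
                self.state = (self.state >> 1) ^ (0xEDB8_8320 & mask);
            }
        }
    }

    fn finish(&self) -> u64 {
        // Final XOR of CRC-32, widened to the u64 that `Hasher` requires.
        u64::from(!self.state)
    }
}

/// Hypothetical bucket key; the real `BucketKey` may have different fields.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct BucketKey {
    timestamp: u64,
    metric_name: String,
    tags: BTreeMap<String, String>,
}

/// Reuses the existing `Hash` impl, but feeds it into a hasher whose
/// algorithm is fixed. Integer fields are still written in native byte order
/// and widths, which is the architecture caveat mentioned above.
fn bucket_key_hash(key: &BucketKey) -> i64 {
    let mut hasher = Crc32Hasher::new();
    key.hash(&mut hasher);
    hasher.finish() as i64
}

/// Formats a statsd set datagram line; statsd counts the distinct values
/// it receives for `metric` within a flush interval.
fn statsd_set_line(metric: &str, key: &BucketKey) -> String {
    format!("{}:{}|s", metric, bucket_key_hash(key))
}
```

If several horizontally scaled instances each emit such a line for every bucket they flush, the set metric reports the number of distinct keys. Comparing that with the total number of flushed buckets gives a rough picture of duplication.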

untitaker requested a review from a team November 17, 2021 11:49
untitaker merged commit 113b719 into master Nov 18, 2021
untitaker deleted the feat/bucket-set-metric branch November 18, 2021 10:57
jan-auer added a commit that referenced this pull request Nov 24, 2021
* master:
  test(outcomes): Fix sort order in flaky test (#1135)
  feat(outcomes): Aggregate more outcomes (#1134)
  ref(outcomes): Fold processing vs non-processing into single actor (#1133)
  build: Update symbolic to support UE5 (#1132)
  feat(metrics): Extract measurement ratings, port from frontend (#1130)
  feat(metrics): Another statsd metric to measure bucket duplication (#1128)
  feat(outcomes): Emit outcomes as client reports (#1119)
  fix: Move changelog line to right version (#1129)
  fix(dangerjs): Do not suggest to add JIRA ticket to changelog (#1125)
  feat(metrics): Tag metrics by transaction name [INGEST-542] (#1126)