[statsd] Add caching to tag normalization for Python3.2+ #674

Merged
merged 1 commit into from
Jul 1, 2021

Conversation

sgnn7
Contributor

@sgnn7 sgnn7 commented Jun 30, 2021

What does this PR do?

On Python 3.2+, we now use the built-in @lru_cache decorator to add a small cache (512 entries) to
normalize_tags, avoiding expensive re.sub calls when previously seen tags are reused in metrics.
This significantly decreases statsd latency, CPU usage, and benchmark test duration (~10-30%) on
Python 3.2+, with negligible impact on Python 2.

Since this function is used in the submit() API too, it may offer
significant performance improvement there as well.

Description of the Change

Since tag normalization is still the largest bottleneck in metric
submission latency, this change adds a small cache (512 entries) to
that function's calls via the built-in @lru_cache where available
(Python 3.2+). On a cache hit, we skip the expensive re.sub
operation entirely, which improves performance.
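
The approach described above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the regex pattern, the `_normalize_tag` helper, and the no-op fallback decorator are all assumptions made for the example. Because `lru_cache` requires hashable arguments, the sketch caches per tag string rather than per tag list.

```python
import re

try:
    from functools import lru_cache  # available on Python 3.2+
except ImportError:
    # Assumed fallback for older interpreters: a no-op decorator factory,
    # so the function simply runs uncached.
    def lru_cache(maxsize=128):
        def decorator(func):
            return func
        return decorator

# Assumed pattern for illustration: any character outside this set is replaced.
TAG_INVALID_CHARS_RE = re.compile(r"[^\w\d_\-:/\.]", re.UNICODE)
TAG_INVALID_CHARS_SUBS = "_"


@lru_cache(maxsize=512)
def _normalize_tag(tag):
    # Hypothetical helper: with lru_cache available, the regex substitution
    # only runs on a cache miss.
    return TAG_INVALID_CHARS_RE.sub(TAG_INVALID_CHARS_SUBS, tag)


def normalize_tags(tag_list):
    # Lists are not hashable, so each tag string is looked up individually.
    return [_normalize_tag(tag) for tag in tag_list]
```

With this shape, a repeated call such as `normalize_tags(["env:prod", "bad tag!"])` only pays for the substitution the first time each distinct tag is seen.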

Fixes #673

Alternate Designs

We could either add a new dependency or roll our own lru_cache to support older Python versions, but that
seems like potentially wasted effort and/or added bloat.

Possible Drawbacks

  • Possible memory usage increase in clients that use many distinct custom tags, but the cache size limit should keep
    that in check.
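
To illustrate why the size limit bounds memory, here is a small sketch (Python 3.2+ only) that feeds many distinct tags through a cached function and inspects `cache_info()`. The `_normalize_tag` function is a hypothetical stand-in, not the library's implementation; only the caching behavior matters here.

```python
from functools import lru_cache  # Python 3.2+


@lru_cache(maxsize=512)
def _normalize_tag(tag):
    # Stand-in for the real normalization (assumption for this example).
    return tag.replace(" ", "_")


# Submit far more distinct tags than the cache can hold.
for i in range(10000):
    _normalize_tag("request id:%d" % i)

info = _normalize_tag.cache_info()
# currsize never exceeds maxsize; older entries are evicted.
print(info.currsize, info.maxsize)
```

The trade-off is that a workload with more than 512 distinct, constantly rotating tags gets few cache hits, so it would see little of the speedup while still staying within the bounded memory footprint.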

Verification Process

  • Run general tests on Python2 and Python3 (or do manual statsd metric sending)
  • Ensure that there are no failures

Additional Notes

Benchmark results:

Note: Benchmark code uses a limited number of mostly static global and metric tags.
Single-threaded:

  • Python2:
    • Single-threaded UDP: +1% CPU/rss/test duration
    • Single-threaded UDS: +3% CPU/rss/test duration
  • Python3:
    • Single-threaded UDP: -22% CPU/rss/test duration
    • Single-threaded UDS: -23% CPU/rss/test duration

Multi-threaded:

  • Python2:
    • Multi-threaded UDP: +5.2% CPU/rss/test duration
    • Multi-threaded UDS: +3.2% CPU/rss/test duration
  • Python3:
    • Multi-threaded UDP: -10% CPU/rss/test duration
    • Multi-threaded UDS: -29% CPU/rss/test duration

Memory overhead: Negligible (see note about tags)
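
For readers who want a rough feel for the effect in isolation, the following micro-benchmark sketch compares a cached and an uncached normalization function over a small, static tag set. It is not the benchmark used for the numbers above; the pattern, tag values, and round count are assumptions, and absolute timings will vary by machine and Python version.

```python
import re
import timeit
from functools import lru_cache  # Python 3.2+

PATTERN = re.compile(r"[^\w\d_\-:/\.]", re.UNICODE)  # assumed pattern
TAGS = ["env:prod", "service:web api", "region:us-east-1", "team:core infra"]


def normalize_uncached(tag):
    return PATTERN.sub("_", tag)


# Same function wrapped in a 512-entry LRU cache.
normalize_cached = lru_cache(maxsize=512)(normalize_uncached)


def run(normalize, rounds=2000):
    # Repeatedly normalize the same small tag set, as a mostly static
    # tag workload would.
    for _ in range(rounds):
        for tag in TAGS:
            normalize(tag)


uncached_s = timeit.timeit(lambda: run(normalize_uncached), number=1)
cached_s = timeit.timeit(lambda: run(normalize_cached), number=1)
print("uncached: %.4fs, cached: %.4fs" % (uncached_s, cached_s))
```

Since the tag set is static, only the first pass misses the cache, which mirrors the note above about the benchmark using mostly static tags.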

Release Notes

Review checklist (to be filled by reviewers)

  • Feature or bug fix MUST have appropriate tests (unit, integration, etc...)
  • PR title must be written as a CHANGELOG entry (see why)
  • Files changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have one changelog/ label attached. If applicable it should have the backward-incompatible label attached.
  • PR should not have do-not-merge/ label attached.
  • If Applicable, issue must have kind/ and severity/ labels attached at least.

@sgnn7 sgnn7 added the changelog/Changed, kind/feature-request, and severity/normal labels Jun 30, 2021
@sgnn7 sgnn7 added this to the Next milestone Jun 30, 2021
@sgnn7 sgnn7 requested review from a team as code owners June 30, 2021 21:31
@sgnn7 sgnn7 force-pushed the sgnn7/cache-tag-normalization-results-2 branch from 3a054f2 to db5a014 Compare June 30, 2021 21:34
@sgnn7 sgnn7 merged commit 6b18d6c into master Jul 1, 2021
@sgnn7 sgnn7 deleted the sgnn7/cache-tag-normalization-results-2 branch July 1, 2021 14:24
Successfully merging this pull request may close these issues.

[statsd] Optimize tag normalization further
3 participants