Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[statsd] Improve tag normalization speed #672

Merged
merged 2 commits into from
Jun 30, 2021

Conversation

sgnn7
Copy link
Contributor

@sgnn7 sgnn7 commented Jun 29, 2021

What does this PR do?

Speeds up the tag normalization function. By being in the hot path in statsd, this improves the
performance of statsd significantly (~30% reduction in latency, CPU utilization, and benchmark duration across the board).

Since this function is used in the submit() API too, it may offer significant performance improvement there as well.

Description of the Change

  • Old code uses a string for sanitization of tags instead of a compiled regex, causing an excessive CPU overhead. This has now been changed to use the pre-compiled version.
  • Adds profiling support to the statsd benchmark
  • Adds tests for normalize_tags function

Fixes #671

Alternate Designs

Possible Drawbacks

None

Verification Process

  • Run the attached tests with python3 -m unittest -vvv tests.unit.util.test_format.TestNormalizeTags and/or python2 -m unittest -vvv tests.unit.util.test_format.TestNormalizeTags
  • Ensure that tests pass

Additional Notes

Benchmark results vs main branch:

  • Python 3 - single thread - UDS: -28% in latency/CPU/duration
  • Python 3 - single thread - UDP: -28% in latency/CPU/duration
  • Python 2 - single thread - UDS: -33% in latency/CPU/duration
  • Python 2 - single thread - UDP: -30% in latency/CPU/duration

Release Notes

Review checklist (to be filled by reviewers)

  • Feature or bug fix MUST have appropriate tests (unit, integration, etc...)
  • PR title must be written as a CHANGELOG entry (see why)
  • Files changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have one changelog/ label attached. If applicable it should have the backward-incompatible label attached.
  • PR should not have do-not-merge/ label attached.
  • If Applicable, issue must have kind/ and severity/ labels attached at least.

This change allows us to exactly see where statsd is spending the
majority of its time when processing metrics. The profiling can be
activated via `BENCHMARK_PROFILING` variable.
@sgnn7 sgnn7 added changelog/Changed Changed features results into a major version bump kind/feature-request Feature request related issue severity/normal Normal severity issue labels Jun 29, 2021
@sgnn7 sgnn7 added this to the Next milestone Jun 29, 2021
@sgnn7 sgnn7 requested review from a team as code owners June 29, 2021 21:11
@sgnn7 sgnn7 self-assigned this Jun 29, 2021
@sgnn7 sgnn7 force-pushed the sgnn7/statsd-improve-tag-normalization-speed-main branch 2 times, most recently from ffa4dfc to edac974 Compare June 29, 2021 21:28
Previously, we used a raw string in `re.sub` which is in the hot-path
and needed compilation every time so now we use the precompiled version
of the regex to do the substitution, yielding a much better performance.
@sgnn7 sgnn7 force-pushed the sgnn7/statsd-improve-tag-normalization-speed-main branch from edac974 to 68a1ebc Compare June 29, 2021 21:31
@therve
Copy link
Contributor

therve commented Jun 30, 2021

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@therve therve merged commit 1a90bea into master Jun 30, 2021
@therve therve deleted the sgnn7/statsd-improve-tag-normalization-speed-main branch June 30, 2021 07:46
Copy link
Member

@olivielpeau olivielpeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff, thanks!

@zippolyte zippolyte added changelog/Added Added features results into a minor version bump and removed changelog/Changed Changed features results into a major version bump labels Jul 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog/Added Added features results into a minor version bump kind/feature-request Feature request related issue severity/normal Normal severity issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[statsd] Optimize tag normalization
4 participants