Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance optimizations #61

Merged
merged 7 commits into from
Apr 20, 2023
Merged

Performance optimizations #61

merged 7 commits into from
Apr 20, 2023

Conversation

mariomac
Copy link
Contributor

@mariomac mariomac commented Apr 18, 2023

Addresses #41 .

Benchs are measured without traces (only metrics submission).

Taking the results in the main branch as base bench (20% CPU in my local dev environment), I added some changes that improved the CPU consumption:

  • Instead of forwarding individual traces through the pipeline graph, we send batches of traces. This reduced the CPU from 20% to ~13%.
  • Instead of waking up on each message from the kernel to the user space, we allow the user optionally select how many messages need to be submitted before waking up the user space (property BPF_WAKEUP_LEN). This reduced the CPU from ~13% to ~7%.

In addition, the users can now enable a Pprof port by setting the PROFILE_PORT option (see profiling.md). This will facilitate asking our users for performance stats for our own offline analysis.

@mariomac mariomac requested a review from grcevski April 19, 2023 09:17
@mariomac mariomac changed the title WIP: performance optimization via ring buffer Performance optimizations Apr 19, 2023
@codecov-commenter
Copy link

codecov-commenter commented Apr 19, 2023

Codecov Report

Merging #61 (f0007d0) into main (43ae3d4) will decrease coverage by 1.95%.
The diff coverage is 40.14%.

@@            Coverage Diff             @@
##             main      #61      +/-   ##
==========================================
- Coverage   48.77%   46.82%   -1.95%     
==========================================
  Files          15       15              
  Lines        1060     1119      +59     
==========================================
+ Hits          517      524       +7     
- Misses        501      548      +47     
- Partials       42       47       +5     
Flag Coverage Δ
unittests 46.82% <40.14%> (-1.95%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/ebpf/nethttp/nethttp.go 0.00% <0.00%> (ø)
pkg/export/debug/debug.go 14.28% <0.00%> (-0.53%) ⬇️
pkg/pipe/config.go 48.48% <0.00%> (-4.85%) ⬇️
pkg/export/otel/traces.go 73.10% <74.46%> (-7.85%) ⬇️
pkg/export/otel/metrics.go 78.30% <100.00%> (+0.20%) ⬆️
pkg/transform/routes.go 100.00% <100.00%> (ø)
pkg/transform/spanner.go 82.27% <100.00%> (+0.69%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@mariomac mariomac force-pushed the rbperf branch 2 times, most recently from db4e77f to 1edd26c Compare April 19, 2023 10:00
Copy link
Contributor

@grcevski grcevski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great work!

@grcevski
Copy link
Contributor

I just wanted to report some numbers, with this branch on my tests setup, this reduces the overall overhead of the instrumentation to 30% on 10,000 QPS. I think it's pretty incredible. All we do in the test is just accept the request and ignore it. Running locally, both tempo and wrk2 take more CPU now than the instrumenter.

@mariomac mariomac merged commit cdf31a9 into grafana:main Apr 20, 2023
2 checks passed
@mariomac mariomac deleted the rbperf branch April 20, 2023 07:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants