Skip to content

ref(span-buffer): Introduce multiprocessed flusher #93824

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Jun 23, 2025

Conversation

untitaker
Copy link
Member

@untitaker untitaker commented Jun 18, 2025

Refs STREAM-266

Compression helped a lot bring the time on the main thread down. We now think we can scale down the consumers. However, we also need to ensure the flusher can keep up. So let's make the flusher spawn a process per redis shard.

We think this is easier to do than making the process_spans/insert_spans multiprocessed. Over time we can offload mroe things like statistics/metrics to the flusher, and keep the main thread minimal.

untitaker and others added 9 commits June 16, 2025 14:19
This reverts commit d124072.
Resolved merge conflicts in spans buffer compression logic:
- Kept origin/master's sophisticated compression level handling
- Maintained compatibility with multiprocessing changes
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Jun 18, 2025
Copy link

sentry-io bot commented Jun 18, 2025

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: src/sentry/spans/consumers/process/flusher.py

Function Unhandled Issue
submit KafkaException: KafkaError{code=_DESTROY,val=-197,str="Commit failed: Local: Broker handle destroyed"} ...
Event Count: 254

Did you find this useful? React with a 👍 or 👎

@untitaker untitaker changed the title spans buffer multiprocessing ref(span-buffer): Introduce multiprocessed flusher Jun 18, 2025
Copy link

codecov bot commented Jun 18, 2025

Codecov Report

Attention: Patch coverage is 74.25743% with 26 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/sentry/spans/consumers/process/flusher.py 66.23% 26 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #93824   +/-   ##
=======================================
  Coverage   88.04%   88.04%           
=======================================
  Files       10358    10358           
  Lines      598302   598361   +59     
  Branches    23239    23239           
=======================================
+ Hits       526761   526813   +52     
- Misses      71073    71080    +7     
  Partials      468      468           

@untitaker untitaker marked this pull request as ready for review June 18, 2025 16:30
@untitaker untitaker requested review from a team as code owners June 18, 2025 16:30
with metrics.timer("spans.buffer.flusher.wait_produce"):
for shard in shards:
with metrics.timer("spans.buffer.flusher.produce", tags={"shard": shard}):
for _, flushed_segment in flushed_segments.items():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by this logic. This looks like you are sending the same spans to every shard?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah this is nonsense... leftover from a prev version

@untitaker untitaker marked this pull request as draft June 23, 2025 09:00
@untitaker untitaker marked this pull request as ready for review June 23, 2025 09:52
Copy link
Member

@jan-auer jan-auer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backpressure and healthy signals could be reduced to one per process. Other than that, LGTM

@untitaker untitaker enabled auto-merge (squash) June 23, 2025 10:54
@untitaker untitaker merged commit 9bf83b0 into master Jun 23, 2025
64 checks passed
@untitaker untitaker deleted the spans-buffer-multiprocessing branch June 23, 2025 11:07
@github-actions github-actions bot locked and limited conversation to collaborators Jul 8, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants