
fix(spans): Prevent inflated drop counts from race condition in segment flushing #110467

Closed
victoria-yining-huang wants to merge 1 commit into master from vic/fix_race_condition_dropped_segments


@victoria-yining-huang
Contributor

When process_spans() adds spans between SSCAN and GET operations during
_load_segment_data(), the ingested count includes new spans but SSCAN
doesn't see them. This causes inflated drop counts in outcome tracking:
dropped = (old + new spans) - (only old spans).

Fix by reading ingested_count once BEFORE SSCAN, ensuring the count
matches what SSCAN will see. Drop calculation is now accurate:
dropped = (spans at scan start) - (spans loaded by scan).
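
The ordering problem described above can be reproduced without Redis. The following is a minimal sketch, not the code from this PR: `FakeSegmentStore` and both helper functions are hypothetical names, and the in-memory set only stands in for the SSCAN/GET pair on the real keys.

```python
class FakeSegmentStore:
    """Hypothetical in-memory stand-in for a segment's span set and its ingested counter."""

    def __init__(self, spans):
        self.spans = set(spans)
        self.ingested_count = len(self.spans)

    def add_span(self, span):
        # Mirrors a writer such as process_spans(): grow the set and bump the counter.
        self.spans.add(span)
        self.ingested_count += 1

    def scan(self):
        # Snapshot of the members, standing in for a completed SSCAN.
        return set(self.spans)


def dropped_count_after_scan(store, concurrent_write):
    """Racy ordering: scan first, read the counter afterwards."""
    loaded = store.scan()
    concurrent_write()  # a writer lands between the scan and the counter read
    ingested = store.ingested_count
    return ingested - len(loaded)


def dropped_count_before_scan(store, concurrent_write):
    """Ordering proposed in the description: read the counter once, then scan."""
    ingested = store.ingested_count
    loaded = store.scan()
    concurrent_write()  # the same late write no longer affects the count
    return ingested - len(loaded)
```

With three spans in the store and one span written concurrently, the first helper reports one dropped span even though nothing was dropped, while the second reports zero.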

Note: Spans added during SSCAN will still be deleted by
done_flush_segments() without being flushed. This is a known data loss
issue requiring a more substantial fix (selective deletion, write locks,
or data model changes). This commit focuses on accurate outcome tracking
for billing and metrics.
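
To make the remaining data loss concrete, here is a rough sketch of the difference between deleting the whole segment after a flush and deleting only the members that were loaded (the "selective deletion" option named above). The function names are hypothetical and a plain Python set stands in for the Redis set; this is an illustration of the idea, not a proposal for how done_flush_segments() should be changed.

```python
from __future__ import annotations


def flush_and_delete_all(segment: set[str], late_span: str) -> tuple[set[str], set[str]]:
    """Whole-segment delete: a span written after the scan is lost."""
    loaded = set(segment)   # what the scan saw and what gets flushed
    segment.add(late_span)  # writer lands after the scan
    segment.clear()         # everything is deleted, including the late span
    return loaded, segment


def flush_and_delete_selected(segment: set[str], late_span: str) -> tuple[set[str], set[str]]:
    """Selective delete: only flushed members are removed, the late span survives."""
    loaded = set(segment)
    segment.add(late_span)
    segment.difference_update(loaded)  # remove only what was actually flushed
    return loaded, segment
```

In the first variant the late span is neither flushed nor kept; in the second it stays in the set for a later flush.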

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

@victoria-yining-huang victoria-yining-huang requested review from a team as code owners March 11, 2026 19:12
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Mar 11, 2026
@victoria-yining-huang victoria-yining-huang marked this pull request as draft March 11, 2026 19:13
Member

@evanh evanh left a comment


Nice!

# data loss bug that requires a more substantial fix (selective deletion,
# write locks, or data model changes). For now, we focus on accurate outcome
# tracking to avoid inflated billing/metrics.
initial_counts: dict[SegmentKey, int | None] = {}
Contributor


Why do you want to allow the value to be None? Is there any reason to treat 0 and None differently?
If not, please do not support None and use 0 instead. int | None means that None represents a valid case that cannot be represented as 0. If there is no such case, supporting None adds cognitive overhead.
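
If 0 and None do not need to be distinguished, one way to follow this suggestion is to normalize a missing counter to 0 when the values are read, so the mapping can be typed as plain `int`. This is only a sketch under assumptions: `parse_initial_counts` is a hypothetical helper, `SegmentKey` is aliased to `bytes` here for illustration, and the raw values are assumed to arrive as bytes or None.

```python
from __future__ import annotations

SegmentKey = bytes  # assumption for this sketch; the real alias lives in the codebase


def parse_initial_counts(
    keys: list[SegmentKey], raw_values: list[bytes | None]
) -> dict[SegmentKey, int]:
    """Convert raw counter replies to ints, treating a missing counter as 0."""
    return {
        key: int(raw) if raw is not None else 0
        for key, raw in zip(keys, raw_values)
    }
```

Callers can then subtract from the count directly without a None check.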

@getsantry getsantry bot added the Stale label Apr 7, 2026
@getsantry
Contributor

getsantry bot commented Apr 7, 2026

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you remove the label Waiting for: Community, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

@getsantry getsantry bot closed this Apr 15, 2026

Labels

Scope: Backend Automatically applied to PRs that change backend components Stale


3 participants