fix(spans): Prevent inflated drop counts from race condition in segment flushing#110467
Closed
victoria-yining-huang wants to merge 1 commit into master
Conversation
fpacifici reviewed Mar 17, 2026
# data loss bug that requires a more substantial fix (selective deletion,
# write locks, or data model changes). For now, we focus on accurate outcome
# tracking to avoid inflated billing/metrics.
initial_counts: dict[SegmentKey, int | None] = {}
Contributor
Why do you want to allow the value to be None? Is there any reason to treat 0 and None differently?
If not, please do not support None; use 0 instead. `int | None` means that None represents a valid case that cannot be expressed as 0. If that is not the case here, supporting None only adds cognitive overhead.
When process_spans() adds spans between SSCAN and GET operations during
_load_segment_data(), the ingested count includes new spans but SSCAN
doesn't see them. This causes inflated drop counts in outcome tracking:
dropped = (old + new spans) - (only old spans).
Fix by reading ingested_count once BEFORE SSCAN, ensuring the count
matches what SSCAN will see. Drop calculation is now accurate:
dropped = (spans at scan start) - (spans loaded by scan).
Note: Spans added during SSCAN will still be deleted by
done_flush_segments() without being flushed. This is a known data loss
issue requiring a more substantial fix (selective deletion, write locks,
or data model changes). This commit focuses on accurate outcome tracking
for billing and metrics.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.