
(span-repro): add metric to track hypothetical data loss #110364

Merged
victoria-yining-huang merged 2 commits into master from vic/fix_silently_dropped_spans
Mar 11, 2026

@victoria-yining-huang
Contributor

@victoria-yining-huang victoria-yining-huang commented Mar 10, 2026

From this Linear ticket, part of the span buffer quality investigations. This is a silent failure mode: for a variety of reasons (including consumer backpressure), the span data and the metadata used for outcomes can expire in Redis at the same time. When that happens, the spans are dropped from Redis without being processed or tracked in outcomes.

This is part 1 of a series of PRs to investigate this silent failure mode.

  • PR 1 (current): adds simple monitoring. The metric counts occurrences of this failure mode, not the number of spans dropped. Monitor the metric afterwards to decide whether and how to alert.
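To illustrate the failure mode and the kind of counter described above, here is a minimal sketch. It is not the code in `src/sentry/spans/buffer.py`: the in-memory `FakeTtlStore`, the `spans:`/`meta:` key layout, the `flush_segment` helper, and the metric name are all hypothetical, and it assumes the flusher learns segment IDs from somewhere other than the expiring keys.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FakeTtlStore:
    """In-memory stand-in for Redis: keys vanish once their TTL elapses."""

    now: float = 0.0  # manually advanced clock, in seconds
    _data: dict = field(default_factory=dict)  # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self._data[key] = (value, self.now + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= self.now:
            self._data.pop(key, None)
            return None
        return entry[0]


# Hypothetical stand-in for a real metrics client.
metrics = Counter()


def flush_segment(store, segment_id):
    """Return the segment's spans, counting flushes where nothing is left."""
    spans = store.get(f"spans:{segment_id}")
    meta = store.get(f"meta:{segment_id}")
    if spans is None and meta is None:
        # Both keys expired: the spans are gone and no outcome can be emitted.
        # This counts occurrences, not the number of spans lost.
        metrics["segment_expired_before_flush"] += 1
        return []
    return spans or []


store = FakeTtlStore()
store.set("spans:a", ["s1", "s2"], ttl=60)
store.set("meta:a", {"span_count": 2}, ttl=60)

store.now = 30.0
on_time = flush_segment(store, "a")  # flusher keeps up: spans are returned

store.now = 61.0
too_late = flush_segment(store, "a")  # flusher is late: counter is incremented
```

Because the span count lives in the metadata that expired together with the payload, a counter of this shape can only say how often the situation occurred, which matches the caveat in the PR description.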

The items below were removed after syncing with @untitaker:

  • ~~[ ] PR 2: adds alerting, as I believe this requires immediate attention when it happens. A potential hotfix would be extending all TTLs by 2x at the cost of more Redis memory.~~
  • ~~[ ] Maybe PR 3 (long-term fix, TBD): implement backpressure earlier? That is, stop writing to Redis when segments are close to expiring, to give the flusher time to catch up.~~

@victoria-yining-huang victoria-yining-huang requested review from a team as code owners March 10, 2026 21:10
@github-actions github-actions Bot added the Scope: Backend Automatically applied to PRs that change backend components label Mar 10, 2026
Comment thread src/sentry/spans/buffer.py
Member

@evanh evanh left a comment

Looks good to me.

@victoria-yining-huang victoria-yining-huang changed the title from "add metric" to "(span-repro): add metric to track hypothetical data loss" Mar 11, 2026
@victoria-yining-huang victoria-yining-huang merged commit b978009 into master Mar 11, 2026
61 checks passed
@victoria-yining-huang victoria-yining-huang deleted the vic/fix_silently_dropped_spans branch March 11, 2026 17:57
@github-actions github-actions Bot locked and limited conversation to collaborators Mar 27, 2026

Labels

Scope: Backend Automatically applied to PRs that change backend components


3 participants