ref(spans): introduce spans buffer store abstraction#116382
Introduce LoadedSegment to keep a flush candidate, loaded payloads, payload keys, and ingest metadata together through the flush pipeline. This replaces parallel maps keyed by segment key and makes the next Redis store abstraction smaller and safer.
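A minimal sketch of the idea, assuming a simplified shape for the data: the field names below, the `FlushCandidate` stand-in, and the `load_segments` helper are hypothetical and only illustrate how one object per segment can replace several maps keyed by segment key.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class FlushCandidate:
    """Hypothetical stand-in for a segment that is due to be flushed."""

    segment_key: bytes
    queue_key: bytes


@dataclass
class LoadedSegment:
    """Bundles everything the flush pipeline needs for one segment."""

    candidate: FlushCandidate
    payloads: list[bytes] = field(default_factory=list)
    payload_keys: list[bytes] = field(default_factory=list)
    ingested_count: int = 0  # hypothetical piece of ingest metadata


def load_segments(
    candidates: list[FlushCandidate],
    raw: dict[bytes, list[tuple[bytes, bytes]]],
) -> list[LoadedSegment]:
    """Build one LoadedSegment per candidate from (payload_key, payload) pairs.

    `raw` maps a segment key to the pairs loaded for that segment; a missing
    key yields an empty LoadedSegment rather than a KeyError.
    """
    loaded = []
    for candidate in candidates:
        entries = raw.get(candidate.segment_key, [])
        loaded.append(
            LoadedSegment(
                candidate=candidate,
                payloads=[payload for _, payload in entries],
                payload_keys=[key for key, _ in entries],
                ingested_count=len(entries),
            )
        )
    return loaded
```

Later pipeline stages can then take a list of `LoadedSegment` objects instead of several dictionaries that all have to be indexed by the same key.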
I'm very suspicious about this layer of abstraction. I feel that SpansBuffer itself becomes a very shallow and useless abstraction this way. A lot of the "orchestration" is deep inside Redis isn't slow, so I don't see a need to mock it out. It's about as fast as in-memory operations.
@untitaker I agree with this, but I do feel like this makes the code easier to follow.
SpansBuffer is still doing all the orchestration and the other smaller operations that do not interact with Redis (e.g. grouping spans by parent).
I agree, maybe we shouldn't word this as adding an "abstraction" over Redis; we're really just adding a class that owns any operation that talks to Redis. I would say this addition is mainly for readability.
untitaker
left a comment
didn't mean to block this. refactoring for readability is always good. i'm just not sure about using this as a testing strategy (i.e. writing most of our tests without redis)
That makes sense now, I misunderstood the initial comment. I thought we were talking about how separating spans buffer redis operations out of the main orchestration class would impact speed.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 27826fc.
fpacifici
left a comment
Mostly high-level suggestions for follow-up. But there is one blocker: why are you not using the real Redis in the test_buffer_store module?
Mocking redis is just removing test coverage here.
- remove SpansBuffer key-helper passthroughs
- move payload preparation into SpansBufferStore
- add docstrings for public store methods
- simplify insert_subsegments and flush candidate mapping
pytestmark = [pytest.mark.django_db]

# Keep these tests in their own Redis keyspace. CI runs test files in parallel,
the units of concurrency for running tests have their own redis each. you can use flushdb and in fact it's running automatically after every test.
With flushdb, I ran into a problem locally: when I ran pytest tests/sentry/spans/test_buffer*, a single test (test_deep2) failed, but the same test ran fine in isolation. I will look into it more.
fpacifici
left a comment
Please see the comment inline.
Re: deployment, while this code is reasonably well tested, it is a large change of the kind we generally ship behind feature flags. Please be extra careful to validate it on s4s2 (both correctness and performance) when you ship it. You need to catch issues before it goes out to prod.
For the future:
- please contain the size of your changes. That makes it not only easier to review but also easier to validate. The key is not necessarily the number of lines (copying a file in a different place makes a lot of lines but it is trivial), but the actual logic change. Here the risk is passing the wrong parameter to a function for example.
- consider moving some test coverage from test_buffer to test_buffer_store. That is now a smaller system, so we can test more corner cases. You do not necessarily have to remove tests from test_buffer.
if compression_level == -1:
    self._zstd_compressor = None
else:
    self._zstd_compressor = zstandard.ZstdCompressor(level=compression_level)
You were doing this once per call to process_spans before. Now you are doing this at every batch.
Is this an expensive operation that has to be done rarely? I'd consider moving it into store_payloads unless we are sure this is cheap.
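One way to sidestep the cost question, sketched under assumptions: build the compressor once per store instance and reuse it for every batch. To keep the sketch runnable with the standard library only, `CountingCompressor` swaps in zlib for zstandard, and `PayloadStoreSketch` and `prepare_payloads` are hypothetical names, not the PR's code.

```python
import zlib


class CountingCompressor:
    """zlib-based stand-in for zstandard.ZstdCompressor (an assumption made
    so the sketch has no third-party dependency). Counts constructions."""

    created = 0

    def __init__(self, level: int) -> None:
        CountingCompressor.created += 1
        self.level = level

    def compress(self, data: bytes) -> bytes:
        return zlib.compress(data, self.level)


class PayloadStoreSketch:
    """Hypothetical store that owns a single long-lived compressor."""

    def __init__(self, compression_level: int) -> None:
        # -1 disables compression, mirroring the snippet under review.
        if compression_level == -1:
            self._compressor = None
        else:
            self._compressor = CountingCompressor(compression_level)

    def prepare_payloads(self, payloads: list) -> list:
        """Compress each payload with the shared compressor, if any."""
        if self._compressor is None:
            return list(payloads)
        return [self._compressor.compress(p) for p in payloads]
```

With this shape, the number of batches no longer affects how often the compressor is constructed, so the construction cost only matters at startup.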

Refs STREAM-1002
SpansBuffer had orchestration, Redis command details, Lua result mapping, key construction, and observability all mixed together. This PR keeps the high-level buffer flow readable while moving low-level Redis mechanics behind SpansBufferStore.

This PR introduces SpansBufferStore as the Redis store class for the SpansBuffer.

This moves Redis-specific mechanics out of SpansBuffer and into the store, including:
- add-buffer.lua script loading and EVALSHA execution
- InsertedSubsegment
- FlushCandidate

flowchart TB
    Relay["Relay span payloads"] --> Process["process_spans"]
    SpanFlusher["SpanFlusher"] --> Flush["flush_segments"]
    SpanFlusher --> Done["done_flush_segments"]
    subgraph Buffer["SpansBuffer"]
        Process --> BuildSubsegments["build Subsegment objects"]
        BuildSubsegments --> ProcessStore
        ProcessStore --> ProcessObs["process metrics and logs"]
        Flush --> FlushStore
        FlushStore --> RecordLoss["record loss metrics"]
        RecordLoss --> BuildFlushed["build FlushedSegment objects"]
        BuildFlushed --> FlushObs["flush metrics and logs"]
        Done --> CleanupStore
        CleanupStore --> DoneObs["done_flush_segments metrics"]
        subgraph Store["self.store: SpansBufferStore"]
            ProcessStore["store_payloads, insert_subsegments, update_queue"]
            FlushStore["load_flush_candidates, acquire_flush_locks, load_segments, queue deadline"]
            CleanupStore["cleanup_flushed_segments"]
        end
    end
    ProcessStore -. "Redis commands and Lua result mapping" .-> Redis[(Redis)]
    FlushStore -. "Redis commands and result mapping" .-> Redis
    CleanupStore -. "Redis cleanup" .-> Redis
    BuildFlushed -->|"FlushedSegment objects"| SpanFlusher
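The process_spans branch of the diagram can be sketched roughly as follows. The method names store_payloads, insert_subsegments, and update_queue come from the diagram; their signatures, the `BufferSketch` class, and the `(parent_id, payload)` span shape are assumptions made for illustration. `RecordingStore` exists only to make the call order visible in a runnable snippet; it is not a suggestion to test without Redis.

```python
from __future__ import annotations


class RecordingStore:
    """Hypothetical stand-in for the store that records which methods ran."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def store_payloads(self, payloads: list[bytes]) -> list[str]:
        self.calls.append("store_payloads")
        # Return one made-up key per payload, in input order.
        return [f"payload:{i}" for i in range(len(payloads))]

    def insert_subsegments(self, subsegments: dict[str, list[str]]) -> int:
        self.calls.append("insert_subsegments")
        return len(subsegments)

    def update_queue(self, inserted: int) -> None:
        self.calls.append("update_queue")


class BufferSketch:
    """Orchestrates the flow; every storage operation goes through the store."""

    def __init__(self, store: RecordingStore) -> None:
        self.store = store

    def process_spans(self, spans: list[tuple[str, bytes]]) -> int:
        # Storage step: persist payloads and get their keys back.
        payload_keys = self.store.store_payloads([payload for _, payload in spans])

        # In-memory step owned by the buffer: group payload keys by parent.
        subsegments: dict[str, list[str]] = {}
        for (parent_id, _), key in zip(spans, payload_keys):
            subsegments.setdefault(parent_id, []).append(key)

        # Storage steps: insert the grouped subsegments, then update the queue.
        inserted = self.store.insert_subsegments(subsegments)
        self.store.update_queue(inserted)
        return inserted
```

The buffer keeps the grouping logic and the ordering of steps, while the store is the only object that would issue Redis commands.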