Skip to content

Querier: store-gateway Series stream resources released only via shared errgroup context (make per-stream cleanup explicit) #7575

@sandy2008

Description

@sandy2008

AI Tool Usage Notice

Drafted with help from Claude Code. All code references, line numbers, failure excerpts, and reproduction steps were reviewed and validated against master before submitting.

Describe the bug

In the querier, fetchSeriesFromStores (pkg/querier/blocks_store_queryable.go) opens a
store-gateway Series server-streaming gRPC for each store-gateway client, inside a
goroutine managed by an errgroup. Each stream is created with the shared errgroup context
gCtx and the goroutine never cancels its own stream. As a result, a stream's client-side
resources (the per-RPC context + its grpc stream goroutine / HTTP/2 stream bookkeeping) are
reclaimed only when gCtx is cancelled — and errgroup's derived context is cancelled only
on the first goroutine error, or when g.Wait() returns
(i.e. when the slowest concurrent
store-gateway request for the whole query has finished).

So a store-gateway request that finishes early — drained to EOF, or returning early on a
per-query limit / validation error — keeps its per-RPC gRPC context alive until every other
concurrent store-gateway request for that query also completes
. With many store-gateways /
blocks fanned out per query, this delays release of per-stream resources and makes cleanup
implicitly dependent on errgroup timing rather than explicit and self-contained.

This is robustness / resource-hygiene hardening, not a report of a confirmed production leak:
on every current return path the stream is eventually reclaimed (drain-to-EOF, a terminal
Recv() error, or gCtx cancellation). The change makes per-stream cleanup deterministic and
prompt
, and brings the code in line with the gRPC ClientConn.NewStream contract that the
generated client itself points at.

Affected code

  • pkg/querier/blocks_store_queryable.gofetchSeriesFromStores — the g.Go(func() error { … })
    goroutine that calls stream, err := c.Series(gCtx, req) and then loops on stream.Recv().

Root cause

  1. c.Series(gCtx, req) binds the stream's lifetime to the shared errgroup context gCtx.
  2. The goroutine returns through several paths (EOF; retryable Recv error → return nil;
    gCtx.Err() != nil; per-query limit / processSeries error; hints-unmarshal error) but never
    cancels the stream it created.
  3. errgroup.WithContext cancels gCtx only on the first non-nil error or when Wait()
    returns. So for the common all-success fan-out, a fast store-gateway's stream context lingers
    until the slowest store-gateway finishes.

Why the "obvious" fixes are wrong

  • defer stream.CloseSend() does not fix this. The generated Series client
    (pkg/storegateway/storegatewaypb/gateway.pb.go, and likewise thanos storepb) already calls
    SendMsg and CloseSend() internally for this server-streaming RPC. Per gRPC semantics,
    CloseSend() only half-closes the send direction; it does not release the receive-side
    stream goroutine / context. Only cancelling the context or draining RecvMsg to a
    non-nil error
    reclaims those resources (see the doc referenced from the generated client:
    https://pkg.go.dev/google.golang.org/grpc#ClientConn.NewStream).
  • LabelNames / LabelValues are not affected. In StoreGatewayClient they are unary
    (*LabelNamesResponse / *LabelValuesResponse) — there is no client stream and nothing to
    close. fetchLabelNamesFromStore / fetchLabelValuesFromStore need no change.

To Reproduce

This is resource-hygiene hardening surfaced by code inspection, not a crash with a runtime repro. The lifecycle can be observed as follows:

  1. Run a querier with the store-gateway read path and issue a query that fans out to multiple store-gateways / blocks.
  2. Arrange for one store-gateway Series stream to finish early (drained to EOF, or returning on a per-query limit / validation error) while another store-gateway request for the same query is still in flight.
  3. Observe that the early-finishing stream's per-RPC gRPC context is not released until the shared errgroup context is cancelled — i.e. until the slowest concurrent request completes (or g.Wait() returns). A unit test demonstrating this is described under Acceptance criteria.

Expected behavior

Each store-gateway Series stream's client-side resources (per-RPC context, stream goroutine / HTTP/2 bookkeeping) are released as soon as its own goroutine returns — success or error — independent of the other concurrent store-gateway requests for the same query.

Proposed fix

Give each Series stream its own cancellable child context and cancel it on every return path,
scoped to fetchSeriesFromStores only:

// Use a dedicated cancellable context per stream so the gRPC Series stream's
// resources are released as soon as this goroutine returns (via any path),
// instead of lingering until the shared errgroup context is cancelled when the
// slowest concurrent request finishes.
streamCtx, cancel := context.WithCancel(gCtx)
defer cancel()
stream, err := c.Series(streamCtx, req)

This mirrors the intent of the existing distributor pattern
(pkg/distributor/query.go uses defer stream.CloseSend() for the ingester QueryStream) but
uses the mechanism that actually releases stream resources.

Trade-offs / notes

  • No change to query results, error handling, retries, or limits — purely lifecycle hygiene.
  • Negligible overhead: one extra context.WithCancel per store-gateway request per query.
  • Cancelling streamCtx on the success path is safe: by then all *storepb.Series have been
    copied into the in-memory result set, so nothing reads from the stream afterwards.

Acceptance criteria

  • Each Series stream is created with a per-stream child of gCtx and that context is
    cancelled when its goroutine returns (success or error), independent of the other
    concurrent store-gateway requests.
  • Regression test demonstrating that a store-gateway whose request finishes first has its
    stream context cancelled while another (still in-flight) store-gateway request is blocked.
  • No behavioral change to returned series / warnings / queried blocks.
  • go build, go test ./pkg/querier/..., and lint pass.

Environment

  • Infrastructure: N/A — found via code inspection; code is present on master and all currently supported releases. Affects the querier store-gateway read path.
  • Deployment tool: N/A

Additional context

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions