
active series: abort streaming on context cancelled #7387

Merged 3 commits into main on Feb 15, 2024

Conversation

flxbk (Contributor) commented on Feb 14, 2024

What this PR does

This makes sharded active series queries abort response streaming once the request context is cancelled. Without this, processing would continue in the background even after requests are cancelled by the client.

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@flxbk flxbk marked this pull request as ready for review February 14, 2024 18:02
@flxbk flxbk requested a review from a team as a code owner February 14, 2024 18:02
Review thread on:

    return true
    })
    items <- item
    select {
A Contributor commented:
I think you could drop the select statement and simply check ctx.Err() at the beginning of each iteration. According to the documentation (ref), a non-nil value is always returned once the Done channel is closed.

flxbk (Contributor Author) replied:

I'm a bit worried about the performance implications of calling ctx.Err() on every iteration. What do you think about the periodic check I added in 2040eb8?

A Contributor commented:

Sorry for a post-merge comment, but wanted to note on this one:

I'm a bit worried about the performance implications of calling ctx.Err() on every iteration

We actually have benchmarks for this code path, which we can execute with

% go test ./pkg/frontend/querymiddleware/ -run XX -bench 'BenchmarkActiveSeriesMiddlewareMergeResponses/encoding=none'

Comparing the overhead of an if inside the loop against allocating a ticker per response, the former adds much lower overhead (+1% vs +7%) ✌️

% benchstat ~/tmp/{1,2,3}.txt
goos: darwin
goarch: arm64
pkg: github.com/grafana/mimir/pkg/frontend/querymiddleware
                                                                     │ /Users/v/tmp/1.txt │         /Users/v/tmp/2.txt         │         /Users/v/tmp/3.txt         │
                                                                     │       sec/op       │   sec/op     vs base               │   sec/op     vs base               │
ActiveSeriesMiddlewareMergeResponses/encoding=none/num-responses-2-4          7.400µ ± 0%   7.911µ ± 1%  +6.91% (p=0.000 n=10)   7.482µ ± 0%  +1.11% (p=0.000 n=10)

                                                                     │ /Users/v/tmp/1.txt │          /Users/v/tmp/2.txt          │         /Users/v/tmp/3.txt          │
                                                                     │        B/op        │     B/op      vs base                │     B/op      vs base               │
ActiveSeriesMiddlewareMergeResponses/encoding=none/num-responses-2-4         1.539Ki ± 0%   1.977Ki ± 0%  +28.46% (p=0.000 n=10)   1.586Ki ± 0%  +3.05% (p=0.000 n=10)

                                                                     │ /Users/v/tmp/1.txt │         /Users/v/tmp/2.txt         │       /Users/v/tmp/3.txt       │
                                                                     │     allocs/op      │ allocs/op   vs base                │ allocs/op   vs base            │
ActiveSeriesMiddlewareMergeResponses/encoding=none/num-responses-2-4           34.00 ± 0%   40.00 ± 0%  +17.65% (p=0.000 n=10)   34.00 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

Note: 1.txt holds the results from the original code; 2.txt the merged version with the ticker; 3.txt a version with if ctx.Err() != nil inside the loop.

flxbk (Contributor Author) replied:

Thanks, see follow-up in #7396.

Review thread on:

    default:
        item := labelBuilderPool.Get().(*labels.Builder)
        it.ReadMapCB(func(iterator *jsoniter.Iterator, s string) bool {
            item.Set(s, iterator.ReadString())
@replay (Contributor) commented on Feb 15, 2024:

How many iterations does one call to it.ReadMapCB typically have to perform?
If it's a small number (hundreds), then I think the current solution is fine. If it can sometimes be a large number (millions), then I'd consider putting something like this into the function that you pass to it.ReadMapCB(), instead of only doing it outside:

    select {
    case <-ticker.C:
        // check if ctx canceled
    default:
        item.Set(s, iterator.ReadString())
    }
    return true

@flxbk (Contributor Author) replied on Feb 15, 2024:

ReadMapCB iterates over the label set of one series, so it should typically be 40 or fewer iterations.
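To illustrate the scale of that loop, here is a stdlib stand-in for the per-label callback iteration. `readLabelMap` is hypothetical and uses encoding/json rather than jsoniter's ReadMapCB, which the PR actually uses, but the callback-per-key shape is the same:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// readLabelMap invokes cb once per key/value pair of a JSON object, mirroring
// how ReadMapCB calls its callback once per label of a single series. With
// typical label sets of ~40 pairs, each call performs only a handful of
// iterations.
func readLabelMap(doc string, cb func(name, value string) bool) error {
	dec := json.NewDecoder(strings.NewReader(doc))
	if _, err := dec.Token(); err != nil { // consume opening '{'
		return err
	}
	for dec.More() {
		keyTok, err := dec.Token() // label name
		if err != nil {
			return err
		}
		var val string
		if err := dec.Decode(&val); err != nil { // label value
			return err
		}
		if !cb(keyTok.(string), val) {
			break
		}
	}
	_, err := dec.Token() // consume closing '}'
	return err
}

func main() {
	n := 0
	_ = readLabelMap(`{"__name__":"up","job":"prometheus"}`, func(name, value string) bool {
		n++
		return true
	})
	fmt.Println(n) // 2
}
```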

@flxbk flxbk merged commit da0cc34 into main Feb 15, 2024
28 checks passed
@flxbk flxbk deleted the felix/active-series-context-cancellation branch February 15, 2024 11:11