-
Notifications
You must be signed in to change notification settings - Fork 460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
active series: abort streaming on context cancelled #7387
Conversation
return true | ||
}) | ||
items <- item | ||
select { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you could save the select
statement and simply check against ctx.Err
at the beginning of each iteration. according to the documentation (ref), a non-nil value is always expected when the Done
channel is closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit worried about the performance implications of calling ctx.Err()
on every iteration, what do you think about the periodic check I added in 2040eb8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for a post-merge comment, but wanted to note on this one:
I'm a bit worried about the performance implications of calling
ctx.Err()
on every iteration
We actually have benchmarks for this code path, which we can execute with
% go test ./pkg/frontend/querymiddleware/ -run XX -bench 'BenchmarkActiveSeriesMiddlewareMergeResponses/encoding=none'
When one compares the overhead of an if
inside the loop and allocating a ticker
per response, the former adds a much lower overhead (+1% -vs- +7%) ✌️
% benchstat ~/tmp/{1,2,3}.txt ⋯
goos: darwin
goarch: arm64
pkg: github.com/grafana/mimir/pkg/frontend/querymiddleware
│ /Users/v/tmp/1.txt │ /Users/v/tmp/2.txt │ /Users/v/tmp/3.txt │
│ sec/op │ sec/op vs base │ sec/op vs base │
ActiveSeriesMiddlewareMergeResponses/encoding=none/num-responses-2-4 7.400µ ± 0% 7.911µ ± 1% +6.91% (p=0.000 n=10) 7.482µ ± 0% +1.11% (p=0.000 n=10)
│ /Users/v/tmp/1.txt │ /Users/v/tmp/2.txt │ /Users/v/tmp/3.txt │
│ B/op │ B/op vs base │ B/op vs base │
ActiveSeriesMiddlewareMergeResponses/encoding=none/num-responses-2-4 1.539Ki ± 0% 1.977Ki ± 0% +28.46% (p=0.000 n=10) 1.586Ki ± 0% +3.05% (p=0.000 n=10)
│ /Users/v/tmp/1.txt │ /Users/v/tmp/2.txt │ /Users/v/tmp/3.txt │
│ allocs/op │ allocs/op vs base │ allocs/op vs base │
ActiveSeriesMiddlewareMergeResponses/encoding=none/num-responses-2-4 34.00 ± 0% 40.00 ± 0% +17.65% (p=0.000 n=10) 34.00 ± 0% ~ (p=1.000 n=10) ¹
¹ all samples are equal
Note, 1.txt
are the results from the original code; 2.txt
the merged version with ticker; 3.txt
a version with if ctx.Err() != nil
inside the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, see follow up in #7396
default: | ||
item := labelBuilderPool.Get().(*labels.Builder) | ||
it.ReadMapCB(func(iterator *jsoniter.Iterator, s string) bool { | ||
item.Set(s, iterator.ReadString()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how many iterations does on call to it.ReadMapCB
typically have to perform?
if it's a small number (hundreds) then I think the current solution is fine, if it can sometimes be a large number (millions) then i'd consider putting something like this into the function that you pass into it.ReadMapCB()
instead of only doing it outside of it:
select {
case <- ticker.C:
// check if ctx canceled
default:
item.Set(s, iterator.ReadString())
}
return true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReadMapCB
iterates over the label set of one series, so it should typically be 40 or fewer iterations.
What this PR does
This makes sharded active series queries abort response streaming once the request context is cancelled. Without this, processing would continue in the background even after requests are cancelled by the client.
Which issue(s) this PR fixes or relates to
Fixes #
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
.about-versioning.md
updated with experimental features.