Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not fail as consistency check for canceled requests #3837

Merged

Conversation

colega
Copy link
Contributor

@colega colega commented Jan 2, 2023

What this PR does

Stops ignoring the context.Canceled error from store gateway interaction: this made the entire request fail as "consistency check" because that store gateway didn't have the block, isntead of actually failing as context.Canceled which isn't a server's fault (usually).

Which issue(s) this PR fixes or relates to

Fixes internal.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
@colega colega marked this pull request as ready for review January 2, 2023 16:54
@colega colega requested a review from a team as a code owner January 2, 2023 16:54
Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

Copy link
Collaborator

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job!

@pracucci pracucci merged commit 6fca816 into main Jan 3, 2023
@pracucci pracucci deleted the do-not-fail-as-consistency-check-for-canceled-requests branch January 3, 2023 08:08
@@ -718,6 +718,10 @@ func (q *blocksStoreQuerier) fetchSeriesFromStores(

stream, err := c.Series(gCtx, req)
if err != nil {
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a stream, would it also make sense to handle the cancellations when receiving from the stream as well?

resp, err := stream.Recv()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be covered by the line above:

// Ensure the context hasn't been canceled in the meanwhile (eg. an error occurred
// in another goroutine).
if gCtx.Err() != nil {
return gCtx.Err()
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking about the store-gateway streaming implementation - most of the time in the sotre-gateway is spent waiting to load series and chunks. And much less sending to the querier. To me this means that most of the time in the querier will be spent waiting for a store-gateway response. So the context cancellation is more likely to occur while the querier is waiting on a response from the store-gateway.

unless i'm missing something i can open a PR for this

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the context cancellation is more likely to occur while the querier is waiting on a response from the store-gateway.

True, but you will still catch it right after the next message is received from gRPC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't understand. I thought srv.Recv() will return a contect.Cancelled error here and we will just swallow it

resp, err := stream.Recv()
if errors.Is(err, io.EOF) {
break
}
if err != nil {
level.Warn(spanLog).Log("msg", "failed to receive series", "remote", c.RemoteAddress(), "err", err)
return nil
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I see what you mean and you're right. I misunderstood your previous message. We need to fix it and happy if you can work on it, as you proposed above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're definitely right @dimitarvdimitrov, good catch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants