-
Notifications
You must be signed in to change notification settings - Fork 464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enforce max_fetched_series_per_query in store-gateway #4056
Conversation
890b5a8
to
c89c168
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a quick look at the logic and left few comments. I will review tests once comments are addressed. Thanks!
Please updated the CLI flag description for MaxFetchedSeriesPerQuery
in pkg/util/validation/limits.go
to mention it's also enforced in the store-gateway. Then run make reference-help doc
to rebuild the auto-generated doc.
CHANGELOG.md
Outdated
@@ -4,7 +4,9 @@ | |||
|
|||
### Grafana Mimir | |||
|
|||
* [CHANGE]: Querier: returning 422 when query hits `max_fetched_chunks_per_query` and `max_fetched_series_per_query` limits in the store-gateway. #4056 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How was tested the querier now returns 422? Have you done any manual test? We may consider an integration test (I can give you guidance).
I'm asking because the fact that we return 422 via gRPC from store-gateway doesn't mean the querier will honor it. We have to check the whole path and having an integration test may be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pracucci As agreed, we will check the outcome of manual tests tomorrow. However, I run the existing integration test querier_test.TestQueryLimitsWithBlocksStorageRunningInMicroServices
, and realized that the error returned by the call to QueryRange()
does NOT contain the required status code 422
.
However, I did some debugging and came to the following conclusion: the call to QueryRange()
gets propagated to httpAPI.QueryRange()
, which calls apiClientImpl.DoGetFallback()
. The latter successfully returns (*http.Response, []byte, Warnings, error)
, which first output DOES contain the required status code 422
, but that output gets ignored by httpAPI.QueryRange()
. I don't know whether httpAPI
is used only for integration test purposes or it is actually used in production. In the first case, we would need to fix the integration test. In the second case, httpAPI
should be fixed. What do you think @pracucci @dimitarvdimitrov?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As i understand the problem is in the query client that we use in the integration tests. So the HTTP 422 gets from the store-gateway to the querier and from the querier to the HTTP response, but the golang client filters it out at some point.
In production we don't run HTTP requests against the querier. At least not directly. The querier receives HTTP requests over gRPC from the query-scheduler, executes the queries and returns the results to a query-frontend (address of QF is in the original request).
I believe that these are the two HTTP handlers that handle a query in the querier - query_range, instant query. They both use some injected queriables that we set up in NewqQurierHandler
They are HTTP handlers, but their HTTP response is sent over gRPC to the query-frontend. You can check the querier logic which receives HTTP requests over gRPC and send the responses to the query-frontend here frontendProcessor.runRequest
So i believe that as those two HTTP handlers return the right HTTP codes, they will also get to the query-frontend and to the client (it looks like they do, but I haven't looked at the call chain between these handlers and the store-gateway client)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @dimitarvdimitrov
I will then override the golang client in my integration test.
Could you please re-approve and merge my PR (I changed some names, as Marco recommended)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job! Changes LGTM. I left a couple of minor comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Yuri manually tested it and looks working as expected, so going to merge it! |
What this PR does
There are a few limits related to the max amount of work a query may do. In particular:
Although both of these limits are enforced in the queriers, only max_fetched_chunks_per_query is enforced in the store-gateway. This PR enforces max_fetched_series_per_query in the store-gateway as well.
Which issue(s) this PR fixes or relates to
Fixes #3382
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]