Allow server to return HTTP 4xx errors #7045
Do I understand correctly that this change will also force the httpgrpc server to return an error and ignore the httpgrpc response when the code is 4xx?
There is this place in the query-frontend which needs an HTTP response (taken from httpgrpc) in order to propagate the error back to the user.
mimir/pkg/frontend/querymiddleware/codec.go
Lines 406 to 417 in 9fd2b19
switch r.StatusCode {
case http.StatusServiceUnavailable:
	return nil, apierror.New(apierror.TypeUnavailable, string(mustReadResponseBody(r)))
case http.StatusTooManyRequests:
	return nil, apierror.New(apierror.TypeTooManyRequests, string(mustReadResponseBody(r)))
case http.StatusRequestEntityTooLarge:
	return nil, apierror.New(apierror.TypeTooLargeEntry, string(mustReadResponseBody(r)))
default:
	if r.StatusCode/100 == 5 {
		return nil, apierror.New(apierror.TypeInternal, string(mustReadResponseBody(r)))
	}
}
What's referred to as r in this snippet should be the response that the querier computes here:
response, err := sp.handler.Handle(ctx, request)
If my understanding is correct, then we will convert all 4xx PromQL errors into 4xx errors. At the same time, no tests are failing, so maybe I'm missing something.
pkg/alertmanager/multitenant.go
Outdated
@@ -400,7 +400,7 @@ func createMultitenantAlertmanager(cfg *MultitenantAlertmanagerConfig, fallbackC
 		return nil, errors.Wrap(err, "failed to initialize Alertmanager's ring")
 	}

-	am.grpcServer = server.NewServer(&handlerForGRPCServer{am: am})
+	am.grpcServer = server.NewServer(&handlerForGRPCServer{am: am}, []server.Option{server.WithReturn4XXErrors}...)
Isn't it enough to do this?
-	am.grpcServer = server.NewServer(&handlerForGRPCServer{am: am}, []server.Option{server.WithReturn4XXErrors}...)
+	am.grpcServer = server.NewServer(&handlerForGRPCServer{am: am}, server.WithReturn4XXErrors)
The same applies to the other invocations.
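To illustrate why the shorter form is equivalent, here is a minimal sketch of Go's variadic functional-options pattern. The Server and Option types and the WithReturn4XXErrors function below are hypothetical stand-ins defined locally for the example. They are not dskit's actual definitions.

```go
package main

import "fmt"

// Server is a hypothetical stand-in for a server type configured via options.
type Server struct {
	return4XXErrors bool
}

// Option is a hypothetical functional option that mutates a Server.
type Option func(*Server)

// WithReturn4XXErrors is a hypothetical option named after the one in this PR.
func WithReturn4XXErrors(s *Server) {
	s.return4XXErrors = true
}

// NewServer applies every option, in order, to a freshly created Server.
func NewServer(opts ...Option) *Server {
	s := &Server{}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	// Spreading a one-element slice into the variadic parameter...
	viaSlice := NewServer([]Option{WithReturn4XXErrors}...)
	// ...configures the server the same way as passing the option directly.
	direct := NewServer(WithReturn4XXErrors)

	fmt.Println(viaSlice.return4XXErrors == direct.return4XXErrors)
}
```

Both calls hand the callee a slice containing the same single option, so the direct form is simply less noisy.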
You are right, I will change this. Thank you
Is this PR a duplicate of #6983?
Yes, that's correct. I've done an analysis of possible side effects here.
Going to take a look now.
@dimitarvdimitrov That's correct, but I believe it's handled correctly there: see mimir/pkg/querier/worker/scheduler_processor.go, lines 247 to 257 in fb7f57b.
I didn't check the linked issue before. The analysis there makes sense to me.
LGTM
pkg/ruler/remotequerier.go
Outdated
@@ -183,7 +183,9 @@ func (q *RemoteQuerier) Read(ctx context.Context, query *prompb.Query) (*prompb.

 	resp, err := q.client.Handle(ctx, &req)
 	if err != nil {
-		level.Warn(log).Log("msg", "failed to perform remote read", "err", err, "qs", query)
+		if code := grpcutil.ErrorToStatusCode(err); code/100 == 5 {
I would do the opposite check, because we may get a code < 100 in case of a non-HTTP error.
-		if code := grpcutil.ErrorToStatusCode(err); code/100 == 5 {
+		if code := grpcutil.ErrorToStatusCode(err); code/100 != 4 {
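The difference between the two checks can be sketched with plain integer arithmetic. The helper names below are hypothetical, and the sample codes 2 and 14 are only illustrative values below 100, such as might accompany a non-HTTP error.

```go
package main

import "fmt"

// warnOnlyOn5xx mirrors the original check: warn only for codes in 500-599.
func warnOnlyOn5xx(code int) bool {
	return code/100 == 5
}

// warnUnless4xx mirrors the suggested check: warn for everything outside 400-499.
func warnUnless4xx(code int) bool {
	return code/100 != 4
}

func main() {
	// Illustrative codes: two values below 100, two 4xx codes, two 5xx codes.
	for _, code := range []int{2, 14, 400, 422, 500, 503} {
		fmt.Printf("code=%d only5xx=%v unless4xx=%v\n",
			code, warnOnlyOn5xx(code), warnUnless4xx(code))
	}
}
```

With the original check, an error whose code is below 100 would be silently skipped. With the suggested check, only 4xx errors are excluded from the warning log.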
pkg/ruler/remotequerier.go
Outdated
@@ -227,7 +229,9 @@ func (q *RemoteQuerier) query(ctx context.Context, query string, ts time.Time, l

 	resp, err := q.sendRequest(ctx, &req, logger)
 	if err != nil {
-		level.Warn(logger).Log("msg", "failed to remotely evaluate query expression", "err", err, "qs", query, "tm", ts)
+		if code := grpcutil.ErrorToStatusCode(err); code/100 == 5 {
Same here
		error4xx = status.Error(http.StatusUnprocessableEntity, "this is a 4xx error")
		error5xx = status.Error(http.StatusInternalServerError, "this is a 5xx error")
	)
	testCases := map[string]struct {
Add a test case returning a non-httpgrpc error (e.g. errors.New("mocked error")). We expect the log.
LGTM (modulo a couple of minor comments)
…on metrics Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
What this PR does
This PR initializes dskit's server.Server and httpgrpc.Server in such a way that they return errors with HTTP status code 4xx, and therefore allow the latter to appear as the status_code label in the request duration metrics.
Which issue(s) this PR fixes or relates to
Part of #1830
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
about-versioning.md updated with experimental features.