Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTPoGRPC masking errors #35

Closed
tomwilkie opened this issue May 25, 2017 · 3 comments · Fixed by #36
Closed

HTTPoGRPC masking errors #35

tomwilkie opened this issue May 25, 2017 · 3 comments · Fixed by #36
Assignees

Comments

@tomwilkie
Copy link
Contributor

As long as all the IO works, all HTTP responses will be returned as success. We should ensure we return HTTP 500s as error.

@bboreham
Copy link
Collaborator

Do you remember why 500s were singled out? Why should a 400 be a success?

@tomwilkie
Copy link
Contributor Author

For alerting/SLO purposes, 400's aren't normally considered errors (there problems with the request, not the serve). So the idea behind this change was that we could alert on gRPC-level errors if 500s are considered errors, but not if 400s are considered errors. Does that make sense?

@bboreham
Copy link
Collaborator

OK, thanks for the explanation.

In #40 code was added to extract the underlying status code and report that in the metric, which makes it possible to filter the errors at the time SLIs are computed. Filtering must already be necessary for http (not-over-grpc) errors which are not masked.

I find it confusing to see (say) Mimir Distributor reporting errors and not its caller; it makes it look like the problem is in the Distributor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants