-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ext_authz(http): Envoy turns AuthorizationResponse status 500 into 403 and drops body #14001
Comments
I don't think this a bug, handling errors from the auth service service is either fail-open or fail-closed (default). envoy/source/extensions/filters/http/ext_authz/ext_authz.cc Lines 292 to 307 in 2a25296
Proxying an auth service error to the client may just be a feature request. You can set the error status code with |
@derekargueta the documentation for status_on_error says:
However, this is not the case. If I understand correctly, filter == Envoy and authorization server == my http authservice. In this case, there is no network error between them, but something went wrong in the authservice. In addition, when I tried returning error code 400 Bad Request, the error code was not transformed into a 403 and the message was kept. Which means this is a problem only with 500 (and maybe 5xx) http codes. |
kind ping @derekargueta @mattklein123 |
Thanks @esmet! I see this is triaged as a question and I don't think it's correct. The |
@yanniszark I agree. It sounds like we want to add configuration to set the definition of an error for the external authorization server. Similar to https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/router_filter#x-envoy-retry-on but of course we can tweak it so that it makes the most sense for the ext_authz use case. What do you think? @alyssawilk feel free to assign this one to me |
@esmet sorry, I'm not sure I understood that. IMO, all error returned by the AuthService should be surfaced to the downstream client. So if the AuthService returns 503 with a body, this response will be proxied back to the client, same as other non-2xx codes (3xx, 4xx). |
By definition of error, I mean the case where a call to the authorization service is considered "failed" as opposed to "completed, but with an error". Today, the ext_authz filter considers "completed, but with a 5xx error" to be "failed" though one could argue that the call succeeded, it just came back with a 5xx denial. This would fit the use case that you are describing. We could add configuration to allow you to specify what you consider a failure. It could be just network errors, or network errors + 5xx errors, or something else, sort of like the 'retry-on' logic I linked above. The default value for this configuration would be the current behavior, which is network errors and 5xx codes. For your use case, you'd configure it to only consider network errors as "failures" (for which the response code to the client is the preconfigured error response code) and everything else would be considered successful communication to the auth service, such that 2XX means pass through and everything else means deny with the code, headers, and body provided by the auth service. |
@esmet got it! IMO the current behavior of
is a bug, but if you feel it makes sense to keep as a feature then all the power to you! |
I agree that probably making the "definition of errors" to be configurable seems useful. |
I could see why someone would want 5XX errors to be considered failures and not legitimate responses to send downstream. For example, one might not want someone to know that their auth service is actually down (or dying) or if some proxy between Envoy and the auth service is unhealthy. At any rate, the existing behavior is now canon so that's the primary reason we'd to keep it as the default. @dio do you think mimicking the retry-on configuration is the right path forward? I looked into the code and was sort of surprised to see that the retry-on values (gateway-error, 5xx, etc...) is largely implemented in code (retry_state_impl.cc) as opposed to a proto API that we could just reuse. |
I think so. Yes, it actually controlled by an array of pre-defined strings. Probably we can have something like: |
Sounds right. I'll try this soon. |
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions. |
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions. |
This is very much still an issue! |
Let's move this to "no stale bot" |
No new activity on this yet? it would be super useful to differentiate between network drops vs 5xx errors |
Is this one still on road map? +1 on |
there are people that still need this - for example we have a flow where we return 5xx codes but they are intended to be handled specially by some applications. |
hi, can I pick this up? Thanks! |
@imrenagi I think you can take it. Please let me know if you need help. Thanks! cc. @ggreenway. |
@imrenagi, after re-reading #14001 (comment). I think I agree with @yanniszark. So, we need to introduce runtime guarding, to deprecate the current behavior (with runtime guarding we allow users to fallback to the previous (current) behavior). Basically, you need the following patch: diff --git a/source/common/runtime/runtime_features.cc b/source/common/runtime/runtime_features.cc
index 36832d73f5..47e4297c58 100644
--- a/source/common/runtime/runtime_features.cc
+++ b/source/common/runtime/runtime_features.cc
@@ -43,6 +43,7 @@ RUNTIME_GUARD(envoy_reloadable_features_delta_xds_subscription_state_tracking_fi
RUNTIME_GUARD(envoy_reloadable_features_do_not_await_headers_on_upstream_timeout_to_emit_stats);
RUNTIME_GUARD(envoy_reloadable_features_do_not_count_mapped_pages_as_free);
RUNTIME_GUARD(envoy_reloadable_features_enable_compression_bomb_protection);
+RUNTIME_GUARD(envoy_reloadable_features_ext_authz_http_service_5xx_is_denied);
RUNTIME_GUARD(envoy_reloadable_features_fix_hash_key);
RUNTIME_GUARD(envoy_reloadable_features_get_route_config_factory_by_type);
RUNTIME_GUARD(envoy_reloadable_features_http2_delay_keepalive_timeout);
diff --git a/source/extensions/filters/common/ext_authz/ext_authz_http_impl.cc b/source/extensions/filters/common/ext_authz/ext_authz_http_impl.cc
index 3157136b43..a5d02c28dc 100644
--- a/source/extensions/filters/common/ext_authz/ext_authz_http_impl.cc
+++ b/source/extensions/filters/common/ext_authz/ext_authz_http_impl.cc
@@ -317,11 +317,14 @@ void RawHttpClientImpl::onBeforeFinalizeUpstreamSpan(
ResponsePtr RawHttpClientImpl::toResponse(Http::ResponseMessagePtr message) {
const uint64_t status_code = Http::Utility::getResponseStatus(message->headers());
- // Set an error status if the call to the authorization server returns any of the 5xx HTTP error
- // codes. A Forbidden response is sent to the client if the filter has not been configured with
- // failure_mode_allow.
- if (Http::CodeUtility::is5xx(status_code)) {
- return std::make_unique<Response>(errorResponse());
+ if (!Runtime::runtimeFeatureEnabled(
+ "envoy.reloadable_features.ext_authz_http_service_5xx_is_denied")) {
+ // Set an error status if the call to the authorization server returns any of the 5xx HTTP error
+ // codes. A Forbidden response is sent to the client if the filter has not been configured with
+ // failure_mode_allow.
+ if (Http::CodeUtility::is5xx(status_code)) {
+ return std::make_unique<Response>(errorResponse());
+ }
}
// Extract headers-to-remove from the storage header coming from the Hence with this, when this
Please add unit and integration tests. Good luck! |
Sure @dio . Will take a look and try to update my PR. thanks! |
+1 on this request. There should be a way to make a difference between network failures when communicating with the external authorization services vs the external authorization service denies the request. @yanniszark have you found any workarounds for this in the meantime? |
@dragoscalinescu the obvious workaround is to send a 40x error, which should go through, and include extra information in the response body. |
@yanniszark thanks for your reply. Initially I thought every error status code gets converted to the status_on_error value. I figured it out you can do a 4xx from the authz service to bypass this. |
Title: ext_authz(http): Envoy turns AuthorizationResponse status 500 into 403 and drops body
Description:
Repro steps:
This came up as part of a project so it's easier to include instructions for that project:
Config:
Here is the config dump taken from port 15000 of the istio ingressgateway proxy:
config_dump.log
Logs:
access_log.log
The text was updated successfully, but these errors were encountered: