Skip to content

Conversation

@yanjunxiang-google
Copy link
Contributor

@yanjunxiang-google yanjunxiang-google commented Apr 10, 2025

This is the 1st step to fix: #36982.
i.e, support default append to be "true".

Once it is in, then a follow up PR will be created to support append_action.

with a runtime

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@repokitteh-read-only
Copy link

CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @abeyad
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #39066 was opened by yanjunxiang-google.

see: more, trace.

@yanjunxiang-google yanjunxiang-google changed the title Ext_proc: changing the append default value to be true and protect it Ext_proc: changing the append default value to be true and protect it with a runtime Apr 10, 2025
@yanjunxiang-google yanjunxiang-google marked this pull request as draft April 10, 2025 20:36
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google yanjunxiang-google marked this pull request as ready for review April 10, 2025 21:12
@yanjunxiang-google
Copy link
Contributor Author

/assign @yanavlasov @adisuissa @stevenzzzz

// :ref:`append_action <envoy_v3_api_field_config.core.v3.HeaderValueOption.append_action>` as replacement.
//
// .. note::
// The :ref:`external authorization service <envoy_v3_api_msg_service.auth.v3.CheckResponse>` and
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking ext_authz code, it had moved to take default append as true. So, this comments is stale.

@yanjunxiang-google
Copy link
Contributor Author

yanjunxiang-google commented Apr 10, 2025

This PR is the 1st step to support append_action in ext_proc filter. Once this PR is in, under the default append true case(runtime flag protected), we can add below logic to support append_action with backward compatibility:

If append.has_value()
take append.value()
else
take append_action value

If default append is false, like what's in ext_proc right now, then there is no way to support append_action at same time maintain backward compatibility. This is due to default append = false is equal to append_action = OVERWRITE_IF_EXISTS_OR_ADD, however default append_action is APPEND_IF_EXISTS_OR_ADD. Thus default append false will create backward compatibility problem if neither append nor append_action is set----- with legacy code, it is OVERWRITE_IF_EXISTS_OR_ADD, with the above proposal it is APPEND_IF_EXISTS_OR_ADD

BTW, ext_proc filter is the only one in Envoy which take default append to be false. ext_authz already changed to default to be true based on this code:

abeyad
abeyad previously approved these changes Apr 10, 2025
Copy link
Contributor

@abeyad abeyad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm api

@agrawroh
Copy link
Member

@yanjunxiang-google Is there a merit in re-opening #37139 once we merge this in?

@stevenzzzz
Copy link
Contributor

Thanks for working on this feature!
how long would we keep the runtime flag before we deprecate the old field?

Copy link
Contributor

@stevenzzzz stevenzzzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

const LowerCaseString header_name(sh.header().key());
const bool append = PROTOBUF_GET_WRAPPED_OR_DEFAULT(sh, append, false);
bool append;
if (Runtime::runtimeFeatureEnabled("envoy.reloadable_features.ext_proc_append_default_true")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this as the default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep the runtime flag as false to keep the legacy behavior for a while.

This will buy sometime for ext_proc servers to change to always explicitly encode "append" = "false" or "true", instead of leaving it un-encoded for "false" case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the behavior to have default append value as "true". This will be a breaking change.
Keep default append value to false will avoid breaking change now. But, it will be a breaking change when remove that runtime flag.


FALSE_RUNTIME_GUARD(envoy_reloadable_features_ext_proc_graceful_grpc_close);

FALSE_RUNTIME_GUARD(envoy_reloadable_features_ext_proc_append_default_true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remind me the reason for this being the default value?
Runtime flags should generally be true by default. The reason to make something false by default is introducing a feature that should be gradually tested in a prod environment, and allows specific instances to enable the feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two goals:

  1. Keep the legacy behavior for now.

  2. When we later on remove this runtime flag, the if () case is kept, the else case is removed:

    if (Runtime::runtimeFeatureEnabled("envoy.reloadable_features.ext_proc_append_default_true")) {
    append = PROTOBUF_GET_WRAPPED_OR_DEFAULT(sh, append, true);
    } else {
    append = PROTOBUF_GET_WRAPPED_OR_DEFAULT(sh, append, false);
    }

I can change the runtime to
RUNTIME_GUARD(envoy_reloadable_features_ext_proc_append_default_false);

Then the logic will be:

if (Runtime::runtimeFeatureEnabled("envoy.reloadable_features.ext_proc_append_default_false")) {
  append = PROTOBUF_GET_WRAPPED_OR_DEFAULT(sh, append, false);
} else {
  append = PROTOBUF_GET_WRAPPED_OR_DEFAULT(sh, append, true);
}

Then we need to add a comments that when the runtime flag is removed, the else logic need to be kept. I can make this change if it is better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, using a negated feature sounds better to me.
Otherwise, if the flag is "false" by default, the broken users will only notice after it is deprecated and removed.
I propose using a different name for the runtime-flag, something like: ext_proc_modified_append_default_value, that is set to true by default.

Then we need to add a comments that when the runtime flag is removed, the else logic need to be kept. I can make this change if it is better.

Not sure I fully grok the intention here.
Say you have:

append = PROTOBUF_GET_WRAPPED_OR_DEFAULT(sh, append,
                  Runtime::runtimeFeatureEnabled("envoy.reloadable_features.ext_proc_modified_append_default_value"));

And add a comment above it stating that when the runtime flag is removed, the argument should be replaced with true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can do that.

So, this will break those ext_proc users who did not explicitly encode append=false. I guess they can toggle the runtime flag in Envoy to false avoid the issue. At same time start make changes in the ext_proc server code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE default values of WKTs (taken from the API guide):

Note that changes to default values for wrapped types, e.g. google.protobuf.UInt32Value are not governed by the above policy. Any management server requiring stability across Envoy API or implementations within a major version should set explicit values for these fields.

So the change is allowed. The use of a runtime-flag in this case is actually making it nicer for users to move to the newer version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know. In that case, maybe we can remove the runtime-guard a little faster, like two months after this PR merged, if nobody reporting any issues?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest giving the normal six-months if you want to gather more certainty.
The motivation behind the 6 months time frame was 2 version upgrades, as many users upgrade when a new release is cut.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

@yanjunxiang-google
Copy link
Contributor Author

@yanjunxiang-google Is there a merit in re-opening #37139 once we merge this in?

Thanks for working on #37139.

Once #39066 is landed, the plan is to having a logic like :

If append.has_value()
take append.value()
else
take append_action value

and implementing it in a common place, so the logic can be shared by every place which need to decode "append" and "append_action":

if (header_value_option.has_append()) {

@yanjunxiang-google
Copy link
Contributor Author

Thanks for working on this feature! how long would we keep the runtime flag before we deprecate the old field?

Normally 6 months to give enough time for ext_proc servers to migrate to explicitly encode "append" with "false" or "true".

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
…d_def_true

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Copy link
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nits, but otherwise LGTM, thanks!

FALSE_RUNTIME_GUARD(envoy_reloadable_features_quic_disable_client_early_data);

FALSE_RUNTIME_GUARD(envoy_reloadable_features_ext_proc_graceful_grpc_close);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add empty line back to keep the next block visually separated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@yanavlasov
Copy link
Contributor

/wait

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Copy link
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, added one minor request for a test.
I think this PR should be merged after the next Envoy release (probably tomorrow).

s = mutation.add_set_headers();
s->mutable_header()->set_key(":status");
s->mutable_header()->set_raw_value("418");
// Default of "append" is "false" and mutations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add/modify a test that shows the default behavior now is "true".

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
…d_def_true

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google yanjunxiang-google marked this pull request as draft April 14, 2025 21:55
@yanjunxiang-google
Copy link
Contributor Author

Thanks, added one minor request for a test. I think this PR should be merged after the next Envoy release (probably tomorrow).

Let me convert the PR into draft for now to avoid it being accidentally merged.

…nd error logs

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
FALSE_RUNTIME_GUARD(envoy_reloadable_features_quic_disable_client_early_data);

FALSE_RUNTIME_GUARD(envoy_reloadable_features_ext_proc_graceful_grpc_close);
FALSE_RUNTIME_GUARD(envoy_reloadable_features_ext_proc_modified_append_default_value);
Copy link
Contributor Author

@yanjunxiang-google yanjunxiang-google Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the runtime default value to false based on an offline discussion about keeping the legacy behavior for now to avoid risk. This enables us to collect the data about how many ext_proc servers out there are not explicitly encoding "append".

@yanjunxiang-google yanjunxiang-google marked this pull request as ready for review April 15, 2025 20:49
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google
Copy link
Contributor Author

Kind ping!

agrawroh
agrawroh previously approved these changes Apr 16, 2025
Copy link
Member

@agrawroh agrawroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google
Copy link
Contributor Author

There is an offline comments that the added stats for data collection purpose won't be easily removed. Converting the PR to draft for now.

@github-actions
Copy link

github-actions bot commented Jun 8, 2025

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Jun 8, 2025
@github-actions
Copy link

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot closed this Jun 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stale stalebot believes this issue/PR has not been touched recently

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ext_proc does not honour HeaderAppendAction

6 participants