Ext authz caching #44874

Open
toddmgreer wants to merge 11 commits into envoyproxy:main from toddmgreer:ext_authz_caching

Conversation

@toddmgreer
Contributor

@toddmgreer toddmgreer commented May 6, 2026

Commit Message: Cache-enable Envoy External Authentication HTTP Filter
Additional Description: Give ext_authz a way to be overridden by cached authentication results and to inform a cache of the same.
Risk Level: Low. The new feature is only active when configured, and is limited in scope.
Testing: See new tests in PR.
Docs Changes: In ext_authz_filter.rst.
Release Notes: In current.yaml
Platform Specific Features: none
Fixes #44852

I used the Gemini generative AI for both code and tests. I have reviewed and understand all generated code, though I'm new to ext_authz, so I could have misunderstandings.

toddmgreer and others added 3 commits May 5, 2026 23:10
Enables cooperative caching between L7 ext_authz filter and CONE caching filter.

- Updated ext_authz.proto with warning documentation.

- Added raw_check_response to internal Response struct.

- Implemented tryCacheHit in L7 filter to bypass external call.

- Populated raw_check_response in gRPC and HTTP clients (with HTTP synthesis).

- Recorded CheckResponse to dynamic metadata in L7 filter onComplete.

- Added invalid_cached_response stat.

- Added extensive unit tests.

Signed-off-by: Todd Greer <tgreer@google.com>
Avoids creating temporary variables raw_check_response and error_response in RawHttpClientImpl::toResponse. Cleaned up trailing whitespace in common BUILD.

Signed-off-by: Todd Greer <toddmgreer@gmail.com>
Inlined metadata variable, cleaned up make_unique Response construction, and added ASSERT to Denied fallback in L7 filter. Refactored cache test to use compound namespaces and removed empty constructor. Added CacheHitErrorFailClosed and CacheHitErrorFailOpen tests.

Signed-off-by: Todd Greer <toddmgreer@gmail.com>
@repokitteh-read-only

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #44874 was opened by toddmgreer.

see: more, trace.

Added Cooperative Caching Bypass section to ext_authz_filter.rst describing metadata-based bypass, misses, and fallback. Added invalid_cached_response to statistics table. Addressed review comments regarding CONE mention and redundant security considerations.

Signed-off-by: Todd Greer <toddmgreer@gmail.com>
@toddmgreer toddmgreer marked this pull request as draft May 6, 2026 09:57
Added release note to changelogs/current.yaml under new_features describing ext_authz cooperative caching bypass and the new invalid_cached_response statistic. Refined wording to refer to it as external authorization HTTP filter.

Signed-off-by: Todd Greer <toddmgreer@gmail.com>
@toddmgreer
Contributor Author

Added docs and release notes.

@toddmgreer toddmgreer marked this pull request as ready for review May 6, 2026 10:04
# Conflicts:
#	changelogs/current.yaml

Signed-off-by: Todd Greer <toddmgreer@gmail.com>
@ggreenway
Member

Decoupling concerns is a nice goal, but this is very hard to use or configure. I think this needs a built-in cache implementation that does something sensible. This topic has come up before, and I think we should at least consider having the caching be an extension point on the ext_authz filter.

What is the expected flow for inserting something into the cache? How does the message get from the ext_authz response to the caching filter?

This requires an integration test.

/wait

@toddmgreer
Contributor Author

Thank you for the quick initial review. I clearly didn't adequately explain the motivation--sorry.

My team is working on a caching service for Envoy with the following goals:

  1. cache ext_authz, ext_proc, and HTTP responses
  2. minimize the burden on projects using Envoy proxies by integrating into Envoy using the ext_proc filter. (There is resistance to compiling and linking in new types of filters, both to minimize the code that could crash Envoy, and to reduce administrative burden.)

From here on, I'm just going to talk about caching for ext_authz filters, but caching for ext_proc filters will use the same flows. (It just gets confusing otherwise because the caching filter is itself an ext_proc filter.)

Users of this service will add an ext_proc filter downstream of their ext_authz filters. There are two new flows:

  1. Cache hit:
    In decodeHeaders, the caching ext_proc filter will send a request to an external caching service. If it returns any cached responses, the caching ext_proc filter will put them in dynamic metadata. In ext_authz's decodeHeaders, it will find the cached response and apply it, as if it had received it as an RPC response.
  2. Cache miss:
    In decodeHeaders, the caching ext_proc filter will send a request to an external caching service, which will look for cached responses. If the external caching service doesn't return any cached responses, the caching ext_proc filter will let iteration continue normally. In ext_authz's decodeHeaders, it will store the response it got from the external authorization service. In its encodeHeaders, the caching ext_proc filter will send that (along with the request) to the external service for caching.
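
A minimal sketch of the two flows above, as a toy Python model rather than Envoy code. The names `Stream`, `CachingFilter`, `AuthzFilter`, `handle`, and the `NAMESPACE` value are hypothetical, and the cache key (the `:path` header alone) is an assumption made only to keep the example small.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

NAMESPACE = "cached_authz_response"  # hypothetical dynamic-metadata location


@dataclass
class Stream:
    """Per-request state shared by the filters in the chain."""
    headers: Dict[str, str]
    metadata: Dict[str, str] = field(default_factory=dict)  # stand-in for dynamic metadata


class CachingFilter:
    """Stand-in for the caching ext_proc filter plus its external cache service."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    def _key(self, stream: Stream) -> str:
        return stream.headers[":path"]  # assumed key; real key construction is configurable

    def decode_headers(self, stream: Stream) -> None:
        cached = self.store.get(self._key(stream))
        if cached is not None:
            stream.metadata[NAMESPACE] = cached  # cache hit: publish for ext_authz

    def encode_headers(self, stream: Stream) -> None:
        response = stream.metadata.get(NAMESPACE)
        if response is not None:
            # On a miss this stores the fresh response; on a hit it rewrites the same value.
            self.store[self._key(stream)] = response


class AuthzFilter:
    """Stand-in for ext_authz with the cache bypass enabled."""

    def __init__(self, check: Callable[[Dict[str, str]], str]) -> None:
        self.check = check
        self.external_calls = 0

    def decode_headers(self, stream: Stream) -> str:
        cached = stream.metadata.get(NAMESPACE)
        if cached is not None:
            return cached  # apply it as if it had arrived as an RPC response
        self.external_calls += 1
        response = self.check(stream.headers)
        stream.metadata[NAMESPACE] = response  # leave it for the caching filter
        return response


def handle(cache: CachingFilter, authz: AuthzFilter, headers: Dict[str, str]) -> str:
    stream = Stream(headers)
    cache.decode_headers(stream)
    decision = authz.decode_headers(stream)
    cache.encode_headers(stream)
    return decision
```

Calling `handle` twice with the same `:path` should reach the authorization callback only once; the second request is served from the metadata that the caching filter populated.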

I'm leaving out a lot of details on the caching service, including mountains of configuration (various ttls, freshness and validation rules, cache key construction, variant handling, and so much more), because this PR is only about the ext_authz changes.

BTW, this feature could be used for purposes other than caching. I could imagine a filter that applies some sort of custom logic to force the ext_authz filter to accept/reject a request. Perhaps there are more use cases.

I'll add an integration test.

Thank you,
Todd

@ggreenway
Member

Why even use ext_authz? ext_proc is roughly a superset of what ext_authz can do. Why involve two filters and create this complicated relationship, when you could just have ext_proc send back an unauthorized response?

Created ext_authz_cache_integration_test.cc with OK and Denied cache hit integration tests. Simulates the caching filter using header_to_metadata to dynamically inject Base64 CheckResponse. Verifies end-to-end gRPC bypass and header mutation/local reply logic in a real Envoy process. Registered in BUILD.

Signed-off-by: Todd Greer <toddmgreer@gmail.com>
@toddmgreer
Contributor Author

Some of our proxies have multiple ext_authz filters. Their configurations have been debugged and work. They talk to mature authentication services owned by different groups. We want to put one caching ext_proc filter in front of them that makes one request to a caching service to look for cached responses for all of them. With this proposal, we add a single field to each of their configs (to say where in dynamic metadata the cached responses go), and we add one new ext_proc filter that is configured to talk to all of them.
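
To illustrate the single-lookup idea, here is a toy Python model, not Envoy configuration. The filter names, the namespace strings in `FILTERS`, and the functions `cache_lookup` and `run_chain` are all hypothetical. It assumes the cache returns one entry per request that maps each filter's metadata namespace to its cached response.

```python
from typing import Callable, Dict

# Hypothetical per-filter config: the only new field is the metadata namespace.
FILTERS: Dict[str, str] = {
    "authz_team_a": "cache.authz_team_a",
    "authz_team_b": "cache.authz_team_b",
}


def cache_lookup(cache: Dict[str, Dict[str, str]], request_key: str) -> Dict[str, str]:
    """One round trip to the cache: {namespace: cached response} for this request."""
    return dict(cache.get(request_key, {}))


def run_chain(
    cache: Dict[str, Dict[str, str]],
    request_key: str,
    services: Dict[str, Callable[[str], str]],
) -> Dict[str, str]:
    """Return, per filter, whether its decision came from the cache or its service."""
    metadata = cache_lookup(cache, request_key)  # decode path of the caching filter
    sources: Dict[str, str] = {}
    for name, namespace in FILTERS.items():
        if namespace in metadata:
            sources[name] = "cache"
        else:
            metadata[namespace] = services[name](request_key)
            sources[name] = "service"
    cache[request_key] = metadata  # encode path: store whatever is new
    return sources
```

In this model a partially populated entry means only the filters without a cached response call their own service, and each filter needs to know nothing beyond its own namespace.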

If we just use ext_proc to do the caching and multiple auth checks, we'd have to write a custom filter, because ext_proc can't talk to multiple external services. Additionally, we'd have to combine all of the config from the cache and from all of the original ext_* filters. Our proxy owners would not accept this solution. (In fact, their dislike of compiling/linking/configuring custom filters is why we started on this path.) Alternatively, we could use a single ext_proc filter, and move all the logic about contacting external services into the caching service. This would be worse, because it would move proxy-specific behaviors and config into what is intended to be a proxy-agnostic caching service.

We're absolutely open to other ways to do this--this is the only good approach we've found, but it's always possible that we overlooked something.

BTW, when I wrote the design doc and the PR, I was unsure how much to go into these questions. (The ratio of design doc length to code length started to get pretty high.) I may have favored brevity too much.

@ggreenway
Member

My concern is that you're adding code and complexity to ext_authz for a very specific use case that it's unlikely anyone else will use.

Refactored ext_authz_cache_integration_test.cc incorporating your reviews: switched downstream to HTTP2, renamed initializeConfig() to initialize() override, removed redundant comments, and renamed the metadata key to cached_authz_response. Kept createUpstreams() to allocate secondary dynamic ports.

Signed-off-by: Todd Greer <toddmgreer@gmail.com>
@toddmgreer
Contributor Author

It's much simpler than any other way to cache the results of ext_authz calls. If your concern is that we're the only ones who want to cache ext_authz, then my response is that there are several very different teams in Google with proxies that would like to use this, and Google isn't all that weird anymore, so something needed by several of our groups is probably also wanted by some non-Google users as well.

Added integration tests.

@ggreenway
Member

There's been plenty of interest in caching ext_authz response; I believe you on that. But this implementation is very specific and difficult for others to use.

I'd like to see an implementation that is in some way usable for others. Maybe make an additional filter that is a cache implementation (with no external calls; just in-memory).

I think using untyped dynamic metadata makes the interface harder to use; it would be better to fix ext_proc to support typed metadata.

@toddmgreer
Contributor Author

If we switch from untyped to typed metadata, do you think that would give this wide enough appeal? Seems like a reasonable approach.

@toddmgreer
Contributor Author

toddmgreer commented May 7, 2026

The advantage of using untyped metadata is that the cache doesn't have to know what type of thing ext_authz uses. ext_authz hands the cache an opaque string, and the cache knows to hand that same opaque string back later (when it sees a request with headers that match the headers that generated that opaque string).
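
A toy Python illustration of the opaque-string idea: the cache stores bytes it never parses, keyed only by selected request headers. `cache_key`, `OpaqueCache`, and `key_headers` are hypothetical names, and hashing a fixed header list is an assumption; the real service's key construction is configurable and not described here.

```python
import hashlib
from typing import Dict, Iterable, Optional


def cache_key(headers: Dict[str, str], key_headers: Iterable[str]) -> str:
    """Hash the selected request headers into a stable key."""
    digest = hashlib.sha256()
    for name in sorted(key_headers):
        # NUL separators keep ("ab", "c") and ("a", "bc") from colliding.
        digest.update(name.encode() + b"\0" + headers.get(name, "").encode() + b"\0")
    return digest.hexdigest()


class OpaqueCache:
    """Stores blobs without knowing what they contain."""

    def __init__(self, key_headers: Iterable[str]) -> None:
        self.key_headers = tuple(key_headers)
        self.entries: Dict[str, bytes] = {}

    def put(self, headers: Dict[str, str], blob: bytes) -> None:
        self.entries[cache_key(headers, self.key_headers)] = blob

    def get(self, headers: Dict[str, str]) -> Optional[bytes]:
        return self.entries.get(cache_key(headers, self.key_headers))
```

Headers outside `key_headers` (a request ID, for instance) do not affect the lookup, so two requests that differ only in those headers share one entry.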

IIUC, adding typed_metadata support to ext_proc would require adding a new field to ProcessingResponse:
envoy.service.auth.v3.CheckResponse authz_check_response = ;

I don't think that would be an improvement, but I'm not totally clear on how typed metadata works. I assume the code has to actually know what types are involved.

@toddmgreer
Contributor Author

Ok, now that I've read some typed metadata docs, it again sounds reasonable. Please ignore my previous message.

To use typed dynamic metadata:
We'd add a new field to ProcessingResponse.response:
google.protobuf.Any typed_dynamic_metadata = 13;

and another like it in ProcessingRequest.request,
and some new config to say where to put it during decode / where to load from during encode,
and ext_proc would just copy between that dynamic typed metadata location and the ProcessingRequest/Response, without having to know what it is.

It would be somewhat less efficient, due to sending the type info to the caching service, but that's probably not a big deal. I prefer the existing version, due to its slightly better efficiency, but I'll see if the rest of the team raise any big concerns with using typed metadata.
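
A rough Python sketch of why the relay does not need to know the payload type, and where the extra cost comes from. `AnyEnvelope` is a stand-in for `google.protobuf.Any` (a type URL plus serialized bytes), not the protobuf library, and `relay`, `unpack_check_response`, and `type_overhead_bytes` are hypothetical helpers.

```python
from dataclasses import dataclass

# Standard Any type URL form for the CheckResponse message.
CHECK_RESPONSE_URL = "type.googleapis.com/envoy.service.auth.v3.CheckResponse"


@dataclass(frozen=True)
class AnyEnvelope:
    """Stand-in for google.protobuf.Any: a type URL plus serialized bytes."""
    type_url: str
    value: bytes


def relay(envelope: AnyEnvelope) -> AnyEnvelope:
    """What a relay such as ext_proc would do: copy both fields without parsing them."""
    return AnyEnvelope(envelope.type_url, envelope.value)


def unpack_check_response(envelope: AnyEnvelope) -> bytes:
    """What the consumer would do: verify the type before using the payload."""
    if envelope.type_url != CHECK_RESPONSE_URL:
        raise ValueError(f"unexpected type: {envelope.type_url}")
    return envelope.value


def type_overhead_bytes(envelope: AnyEnvelope) -> int:
    """Extra bytes carried per entry for the type URL (protobuf field framing not counted)."""
    return len(envelope.type_url.encode())
```

The overhead is a fixed-size string per cached entry, independent of the payload size, which is consistent with it being a small cost in practice.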

Switched L7 HTTP external authorization caching bypass to strongly-typed dynamic metadata. Replaced check_response_metadata_key with check_response_typed_metadata_namespace in API and C++ config. Refactored tryCacheHit() and onComplete() to directly read and write CheckResponse Any payloads under the configured namespace. Updated unit and integration test suites.

Signed-off-by: Todd Greer <toddmgreer@gmail.com>
@toddmgreer
Contributor Author

Switched to typed dynamic metadata. You were right--this is better.

Ran the official check_format script to auto-format C++ style violations (clang-format, buildifier) across the modified files. Reverted the unit test ext_authz_cache_test.cc back to standard nested namespaces to satisfy the strict Envoy linter.

Signed-off-by: Todd Greer <toddmgreer@gmail.com>
@repokitteh-read-only repokitteh-read-only Bot added the deps Approval required for changes to Envoy's external dependencies label May 8, 2026
@repokitteh-read-only

CC @envoyproxy/dependency-shepherds: Your approval is needed for changes made to (bazel/.*repos.*\.bzl)|(bazel/dependency_imports\.bzl)|(api/bazel/.*\.bzl)|(.*/requirements\.txt)|(.*\.patch).
envoyproxy/dependency-shepherds assignee is @agrawroh

🐱

Caused by: #44874 was synchronize by toddmgreer.

see: more, trace.

@toddmgreer toddmgreer force-pushed the ext_authz_caching branch from 8270de5 to a530160 Compare May 8, 2026 23:42
Rewrote the External Authorization cooperative caching bypass documentation in ext_authz_filter.rst to accurately describe the new strongly-typed keyless direct namespace design. Updated the upcoming v1.39.0-dev changelogs in current.yaml to replace the obsolete check_response_metadata_key references and document the new typed metadata bypass. Added .gitignore rule for /docs/bazel-* to clean the workspace.

Signed-off-by: Todd Greer <toddmgreer@gmail.com>
@toddmgreer toddmgreer force-pushed the ext_authz_caching branch from a530160 to b413bf0 Compare May 8, 2026 23:44
@tyxia tyxia self-assigned this May 11, 2026
@tyxia
Member

tyxia commented May 11, 2026

> There's been plenty of interest in caching ext_authz response; I believe you on that. But this implementation is very specific and difficult for others to use.
>
> I'd like to see an implementation that is in some way usable for others. Maybe make an additional filter that is a cache implementation (with no external calls; just in-memory).
>
> I think using untyped dynamic metadata makes the interface harder to use; it would be better to fix ext_proc to support typed metadata.

Hi @ggreenway, I just wanted to clarify and give an e2e example flow here. This PR gives ext_authz a way to use cached results from an ext_proc callout. In other words, we leverage ext_proc for the external callout to the caching service, and other filters can use it as well:
On cache hit: the downstream ext_proc filter populates the metadata; ext_authz checks the metadata and then skips the gRPC call. On cache miss: ext_proc sees the dynamic metadata set after the ext_authz gRPC call and writes it to the external cache.

Regarding the in-memory cache option: yes, it is a viable option and it is more performant. But it has its own limitation: an in-memory cache is tied to a single Envoy instance (and, depending on the implementation, to a filter, a thread, or the process). This leads to avoidable cache misses, because a decision cached by one instance is invisible to the others. To get a global view across Envoy/ext_authz instances, I think we need some kind of external service or storage. That said, I think we can provide multiple options here, remote vs. in-memory, for users to choose from based on their use cases and performance requirements. We are implementing the remote option first because of our deployment requirements and use case.
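
A small Python model of the limitation described above. The `Envoy` class and its `authorize` method are hypothetical stand-ins for an Envoy instance, and a plain dict plays the role of either a per-instance or a shared cache.

```python
from typing import Dict


class Envoy:
    """Toy proxy instance that consults a cache before calling the authz service."""

    def __init__(self, cache: Dict[str, str]) -> None:
        self.cache = cache
        self.authz_calls = 0

    def authorize(self, key: str) -> str:
        if key in self.cache:
            return self.cache[key]
        self.authz_calls += 1  # stands in for the external ext_authz call
        decision = "OK"
        self.cache[key] = decision
        return decision


# Per-instance caches: the second instance cannot see the first one's entry.
local_a, local_b = Envoy({}), Envoy({})
local_a.authorize("user-1")
local_b.authorize("user-1")

# Shared (remote-style) cache: the second instance reuses the first one's entry.
shared: Dict[str, str] = {}
remote_a, remote_b = Envoy(shared), Envoy(shared)
remote_a.authorize("user-1")
remote_b.authorize("user-1")
```

With per-instance caches both instances call the authorization service for the same key; with the shared cache only the first one does. The model ignores the latency of reaching a remote cache, which is the other side of the trade-off.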

@kyessenov
Contributor

kyessenov commented May 14, 2026

My 2c: RLQS is actually quite similar to what you'd want from a caching authz. The servers should specify a bucket and an expiration deadline for the decisions. The bucket is a matcher over the request, so that you can group requests. I don't quite understand why you'd want to use ext_authz if the decision is already made; it might be easier to just wrap it in some conditional filter.
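
One way to picture the bucket-plus-deadline idea, as a Python sketch that is not the RLQS API. `BucketedDecisionCache` and the `bucket_of` matcher are hypothetical, and the clock is passed in explicitly so the behavior is deterministic.

```python
from typing import Callable, Dict, Optional, Tuple


class BucketedDecisionCache:
    """Caches one decision per bucket until the server-supplied deadline."""

    def __init__(self, bucket_of: Callable[[Dict[str, str]], str]) -> None:
        self.bucket_of = bucket_of  # matcher that groups requests into buckets
        self.entries: Dict[str, Tuple[bool, float]] = {}

    def store(self, headers: Dict[str, str], allowed: bool, ttl: float, now: float) -> None:
        self.entries[self.bucket_of(headers)] = (allowed, now + ttl)

    def lookup(self, headers: Dict[str, str], now: float) -> Optional[bool]:
        bucket = self.bucket_of(headers)
        entry = self.entries.get(bucket)
        if entry is None:
            return None  # no decision for this bucket yet
        allowed, deadline = entry
        if now >= deadline:
            del self.entries[bucket]  # expired: the next request must ask again
            return None
        return allowed
```

For example, with `bucket_of=lambda h: h[":path"].split("/")[1]`, a decision stored for `/api/users` also covers `/api/orders` until the deadline passes.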

@markdroth
Contributor

I added some comments in #44852. I'm not sure this is really the best approach, given that this is something we're going to also need to support in gRPC.

Labels

api deps Approval required for changes to Envoy's external dependencies

Development

Successfully merging this pull request may close these issues.

Proposal: Cache-enable Envoy External Authentication HTTP Filter

6 participants