Ext authz caching#44874
Conversation
Enables cooperative caching between L7 ext_authz filter and CONE caching filter. - Updated ext_authz.proto with warning documentation. - Added raw_check_response to internal Response struct. - Implemented tryCacheHit in L7 filter to bypass external call. - Populated raw_check_response in gRPC and HTTP clients (with HTTP synthesis). - Recorded CheckResponse to dynamic metadata in L7 filter onComplete. - Added invalid_cached_response stat. - Added extensive unit tests. Signed-off-by: Todd Greer <tgreer@google.com>
Avoids creating temporary variables raw_check_response and error_response in RawHttpClientImpl::toResponse. Cleaned up trailing whitespace in common BUILD. Signed-off-by: Todd Greer <toddmgreer@gmail.com>
Inlined metadata variable, cleaned up make_unique Response construction, and added ASSERT to Denied fallback in L7 filter. Refactored cache test to use compound namespaces and removed empty constructor. Added CacheHitErrorFailClosed and CacheHitErrorFailOpen tests. Signed-off-by: Todd Greer <toddmgreer@gmail.com>
|
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to |
Added Cooperative Caching Bypass section to ext_authz_filter.rst describing metadata-based bypass, misses, and fallback. Added invalid_cached_response to statistics table. Addressed review comments regarding CONE mention and redundant security considerations. Signed-off-by: Todd Greer <toddmgreer@gmail.com>
Added release note to changelogs/current.yaml under new_features describing ext_authz cooperative caching bypass and the new invalid_cached_response statistic. Refined wording to refer to it as external authorization HTTP filter. Signed-off-by: Todd Greer <toddmgreer@gmail.com>
|
Added docs and release notes. |
# Conflicts: # changelogs/current.yaml Signed-off-by: Todd Greer <toddmgreer@gmail.com>
|
Decoupling concerns is a nice goal, but this is very hard to use or configure. I think this needs a built-in cache implementation that does something sensible. This topic has come up before, and I think we should at least consider having the caching be an extension point on the ext_authz filter. What is the expected flow for inserting something into the cache? How does the message get from the ext_authz response to the caching filter? This requires an integration test. /wait |
|
Thank you for the quick initial review. I clearly didn't adequately explain the motivation--sorry. My team is working on a caching service for Envoy with the following goals:
From here on, I'm just going to talk about caching for ext_authz filters, but caching for ext_proc filters will use the same flows. (It just gets confusing otherwise because the caching filter is itself an ext_proc filter.) Users of this service will add an ext_proc filter downstream of their ext_authz filters. There are two new flows:
I'm leaving out a lot of details on the caching service, including mountains of configuration (various ttls, freshness and validation rules, cache key construction, variant handling, and so much more), because this PR is only about the ext_authz changes. BTW, this feature could be used for purposes other than caching. I could imagine a filter that applies some sort of custom logic to force the ext_authz filter to accept/reject a request. Perhaps there are more use cases. I'll add an integration test. Thank you, |
|
Why even use ext_authz? ext_proc is roughly a superset of what ext_authz can do. Why involve two filters and make this complicated relationship, when you could just have ext_proc send back an unauthorized response. |
Created ext_authz_cache_integration_test.cc with OK and Denied cache hit integration tests. Simulates the caching filter using header_to_metadata to dynamically inject Base64 CheckResponse. Verifies end-to-end gRPC bypass and header mutation/local reply logic in a real Envoy process. Registered in BUILD. Signed-off-by: Todd Greer <toddmgreer@gmail.com>
|
Some of our proxies have multiple ext_authz filters. Their configurations have been debugged and work. They talk to mature authentication services owned by different groups. We want to put one caching ext_proc filter in front of them that makes one request to a caching service to look for cached responses for all of them. With this proposal, we add a single field to each of their configs (to say where in dynamic metadata the cached responses go), and we add one new ext_proc filter that is configured to talk to all of them. If we just use ext_proc to do the caching and multiple auth checks, we'd have to write a custom filter, because ext_proc can't talk to multiple external services. Additionally, we'd have to combine all of the config from the cache and from all of the original ext_* filters. Our proxy owners would not accept this solution. (In fact, their dislike of compiling/linking/configuring custom filters is why we started on this path.) Alternatively, we could use a single ext_proc filter, and move all the logic about contacting external services into the caching service. This would be worse, because it would move proxy-specific behaviors and config into what is intended to be a proxy-agnostic caching service. We're absolutely open to other ways to do this--this is the only good approach we've found, it's always possible that we may have overlooked something. BTW, when I wrote the design doc and the PR, I was unsure how much to go into these questions. (The ratio of design doc length to code length started to get pretty high.) I may have favored brevity too much. |
|
My concern is that you're adding code and complexity to ext_authz for a very specific use case that it's unlikely anyone else will use. |
Refactored ext_authz_cache_integration_test.cc incorporating your reviews: switched downstream to HTTP2, renamed initializeConfig() to initialize() override, removed redundant comments, and renamed the metadata key to cached_authz_response. Kept createUpstreams() to allocate secondary dynamic ports. Signed-off-by: Todd Greer <toddmgreer@gmail.com>
|
It's much simpler than any other way to cache the results of ext_authz calls. If your concern is that we're the only ones who want to cache ext_authz, then my response is that there are several very different teams in Google with proxies that would like to use this, and Google isn't all that weird anymore, so something needed by several of our groups is probably also wanted by some non-Google users as well. Added integration tests. |
|
There's been plenty of interest in caching ext_authz response; I believe you on that. But this implementation is very specific and difficult for others to use. I'd like to see an implementation that is in some way usable for others. Maybe make an additional filter that is a cache implementation (with no external calls; just in-memory). I think using untyped dynamic metadata makes the interface harder to use; it would be better to fix ext_proc to support typed metadata. |
|
If we switch from untyped to typed metadata, do you think that would give this wide enough appeal? Seems like a reasonable approach. |
|
|
|
Ok, now that I've read some typed metadata docs, it again sounds reasonable. Please ignore my previous message. To use typed dynamic metadata: and another like it in ProcessingRequest.request, It would be somewhat less efficient, due to sending the type info to the caching service, but that's probably not a big deal. I prefer the existing version, due to its slightly better efficiency, but I'll see if the rest of the team raise any big concerns with using typed metadata. |
Switched L7 HTTP external authorization caching bypass to strongly-typed dynamic metadata. Replaced check_response_metadata_key with check_response_typed_metadata_namespace in API and C++ config. Refactored tryCacheHit() and onComplete() to directly read and write CheckResponse Any payloads under the configured namespace. Updated unit and integration test suites. Signed-off-by: Todd Greer <toddmgreer@gmail.com>
|
Switched to typed dynamic metadata. You were right--this is better |
Ran the official check_format script to auto-format C++ style violations (clang-format, buildifier) across the modified files. Reverted the unit test ext_authz_cache_test.cc back to standard nested namespaces to satisfy the strict Envoy linter. Signed-off-by: Todd Greer <toddmgreer@gmail.com>
|
CC @envoyproxy/dependency-shepherds: Your approval is needed for changes made to |
8270de5 to
a530160
Compare
Rewrote the External Authorization cooperative caching bypass documentation in ext_authz_filter.rst to accurately describe the new strongly-typed keyless direct namespace design. Updated the upcoming v1.39.0-dev changelogs in current.yaml to replace the obsolete check_response_metadata_key references and document the new typed metadata bypass. Added .gitignore rule for /docs/bazel-* to clean the workspace. Signed-off-by: Todd Greer <toddmgreer@gmail.com>
a530160 to
b413bf0
Compare
Hi @ggreenway , I just wanted to clarify and give a e2e example flow here: This PR is to give ext_authz a way to use cached results from ext_proc callout. In other words, we leverage the ext_proc here for external callout to caching service and other filters can use it as well: Regarding the in-memory cache option: Yes, it is viable option and it is more performant. But it has its own limitation: in-memory cache also is tied with Envoy instance (depends on implementation: per filter, thread local, or process wide). This will lead to inaccurate cache miss. In order to have a global view across envoy/ext_authz instances, I think we will need to have some kinds of external service/storage. With that said, i think we can provide multiple options here : remote vs in-memory for customer to choose based on their use cases and performance requirement. We are currently first implement the remote option due to our deployment requirement and use case. |
|
my 2c: RLQS is actually quite similar to what you'd want from a caching authz. The servers should specify a bucket and an expiration deadline for the decisions. The bucket is a matcher over the request so that you can group. I don't quite understand why you'd want to use |
|
I added some comments in #44852. I'm not sure this is really the best approach, given that this is something we're going to also need to support in gRPC. |
Commit Message: Cache-enable Envoy External Authentication HTTP Filter
Additional Description: Give ext_authz a way to be overridden by cached authentication results and to inform a cache of the same.
Risk Level: Low. The new feature is only active when configured, and is limited in scope.
Testing: See new tests in PR.
Docs Changes: In ext_authz_filter.rst.
Release Notes: In current.yaml
Platform Specific Features: none
Fixes #44852
I used the Gemini generative AI for both code and tests. I have reviewed and understand all generated code, though I'm new to ext_authz, so I could have misunderstandings.