-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
api: audit_log api specific to RBAC http filters #26001
Conversation
Hi @rockspore, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already. |
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to |
// condition specified in the RBAC is met. | ||
// | ||
// [#not-implemented-hide:] | ||
repeated config.core.v3.TypedExtensionConfig audit_log = 54; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a new extension category or is the access log good enough? We really just need log()
callback, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @rockspore Could you make te slides and doc public then most people could access it? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, it's not clear to me whether I could publicize those. But I will link a gRFC PR asap with most of the context available. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other alternative here is that we extend access log configuration to allow different loggers at different request lifecycle points, and they get to access the audit metadata as needed there. That seems less hackish. Given this will almost certaily be used in Envoy once we introduce (there is a clear need IMHO), I think we want to spend time to get this API right CC @envoyproxy/api-shepherds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems if we are adding APIs to generalize logging, there are other events you might want to log, in addition to start / end of request you have periodic logs for long lived streams (every N seconds, every N bytes?), and with the right filters perhaps the ability to log on individual gRPC stream messages. I'm not sure how much of this we want to bake into HCM or not, but worth thinking about when adding a new hook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we really want a new API to cover both audit logging and other generalized logging scenarios? From my perspective, audit logging is more about auditing accesses to specific services so it's very much tied to authn/z and cares little about anything after the stream starts.
@htuch When you said there was a clear need in Envoy, did you mean Envoy also primarily needed this audit logging solution or it was referring to the generalized logging where a logger can access different lifecycle points? I'd love to know more about the use cases for the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for audit logging specifically, it would be better to have a mechanism specifically for that purpose than to use a more generalized logging mechanism. For audit logging, it's important to be able to easily validate from looking at the config that the logging will actually be done, and the more complicated that configuration is, the harder it is to do that validation.
To be clear, I am not opposed to having a more general logging solution in the future. But I think that should be a separate effort from audit logging, even if audit logging could also be implemented as a special case of the more generalized solution.
I think this PR is basically what we want for audit logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a very strong opinion on this one way or the other, though if pushed I would tend to agree it would be better to have a common logging API that could be hooked in various different ways and at various lifecycle points.
The normal access log paradigm is that there's one log line for each request that includes both request and response info.
I agree this is generally true historically, but we already support in Envoy emitting multiple logs for a connection/stream. See the recent #25824 for example.
It's also supported by the TCP proxy:
envoy/api/envoy/extensions/filters/network/tcp_proxy/v3/tcp_proxy.proto
Lines 215 to 221 in 3c26c22
// The interval to flush access log. The TCP proxy will flush only one access log when the connection | |
// is closed by default. If this field is set, the TCP proxy will flush access log periodically with | |
// the specified interval. | |
// The interval must be at least 1ms. | |
google.protobuf.Duration access_log_flush_interval = 15 | |
[(validate.rules).duration = {gte {nanos: 1000000}}]; | |
} |
It seems like we might want something similar where we could configure RBAC to just invoke a set of loggers (with the common logging API) when the RBAC decision happens. I think this would be more flexible than a new type of logger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we configure the RBAC filters to invoke the existing access loggers, IIUC it would flush the log but a final log will still happen when the request is complete. How should we modify the API to address the use case where normal access logs are not wanted without introducing confusion?
I'd argue this is a very reasonable need and tricky cases I can think of if we support this via some configs in RBAC would be: RBAC filters added to the filter chain accidentally stopped normal access logging process or vice versa.
Opened grpc/proposal#346 to give some context with gRPC. cc @wbpcode |
There was some internal discussion held with @htuch @markdroth @ejona86 @erm-g @adisuissa. I believe we concluded that the This would lead to a bit config duplication if multiple RBACs simply wanted to use the same loggers. But as we argue it is already not a common use case to have multiple RBAC filters, it should be fine. Besides we don't expect users to deal with raw RBACs often. And in either case, some de-duplication logic will still be needed in both gRPC and Envoy to avoid things like the same file being written by multiple loggers concurrently. I commented on the added fields that they are specific to HTTP filters, since gRPC is only interested in L7. I think it's extensible to network filters in the future if needed in Envoy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we argue it is already not a common use case to have multiple RBAC filters, it should be fine.
I don't actually think we were saying that it's not a common use case to have multiple RBAC filters; I suspect it's actually fairly common to have an ALLOW filter followed by a DENY filter. But I don't think that's really why we decided to move this config into the RBAC filter.
And in either case, some de-duplication logic will still be needed in both gRPC and Envoy to avoid things like the same file being written by multiple loggers concurrently.
Yeah, I think that's actually the key point here. Even if we configured the audit logger in the HCM, we'd still wind up needing to deal with de-duplication in the logger implementation, because there could be multiple HCM instances using the same logger (either due to having multiple Listener filter chains, or even just due to warming a new Listener instance while the old one is still around). So pulling it out into the HCM config doesn't really buy us anything, and hopefully it makes this change less controversial if it's not spreading its tendrils into the HCM config.
api/envoy/config/rbac/v3/rbac.proto
Outdated
@@ -114,6 +129,18 @@ message RBAC { | |||
// Maps from policy name to policy. A match occurs when at least one policy matches the request. | |||
// The policies are evaluated in lexicographic order of the policy name. | |||
map<string, Policy> policies = 2; | |||
|
|||
// Condition for the audit logging to be happen. Specific to HTTP filters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't actually think there's anything specific to HTTP filters about either of these fields. I think audit logging could just as easily be supported for L4 filters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. It's mainly that as L4 filters, it won't be able to log as much as the context we intend to include.
Speaking of this, I suppose the audit context (or maybe Envoy prefers to call it metadata) should be technically part of the API but not reflected in a proto, unless we have a service logger (like the grpc access logger). What would be the general approach to document that besides our gRFC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to say, I removed this sentence, "Specific to HTTP filters."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the context passed to the audit logging plugin is not really a function of the xDS API, so I don't think it needs to be specified here. I think that context is actually a function of the extension API that the data plane provides to implement that plugin. Each data plane is likely to do that extension API a little bit differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. This makes a lot of sense. Thank you, Mark!
Thanks for the correction. That's right we didn't really talk about this. So I should not have used it as an argument here. FWIW, IIRC, there was a time we talked about Traffic Director allowing only one authorization policy, which is one RBAC filter, per workload. In Istio, I checked that multiple can be applied which will be additively inserted as RBAC filters, but the order, which may be implemented to be deterministic, is not guaranteed by the API (not that it matters mostly). |
This approach looks good to me. @htuch or @mattklein123, any concerns? |
I'm LGTM on this API but would like a non-Googler to also sign-off, CC @mattklein123 who has the context. I think we can simultaneously improve lifecycle logging in HCM, which is going on in places like #26094 while having a separate audit log facility. This seems best to live in the RBAC filter as it permits direct logging on RBAC events and avoids problems like some later filter delaying the log message by some non-trivial time period. At least that's one plausible argument.
This is another approach that would be totally valid. I think it would need gRPC to buy-in to implementing access loggers. |
Hi @mattklein123 - could you take a look? This design has the highest priority for us (gRPC) at the moment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with small comments, thanks.
/wait
api/envoy/config/rbac/v3/rbac.proto
Outdated
// Condition for the audit logging to be happen. | ||
// If this condition is met, all the audit loggers configured here will be invoked. | ||
// | ||
// [#not-implemented-hide:] | ||
AuditCondition audit_condition = 3; | ||
|
||
// Configurations for RBAC-based authorization audit loggers. | ||
// | ||
// [#not-implemented-hide:] | ||
// [#extension-category: envoy.rbac.audit_loggers] | ||
repeated core.v3.TypedExtensionConfig audit_log = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logically I think it would make more sense to have a new message called AuditLog
and then put these fields within that message. That way all of the fields can be grouped together or not.
On this field I would use PGV to specify at least one logger configured. And on the enum I would require a valid value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @mattklein123, for the quick review! I wrapped them and added the PGVs. To avoid identical field names, I renamed the wrapping message AuditOptions
. Please let me know what you think. I don't have strong opinions so could change it back to AuditLog
if preferred.
LGTM. Can you fix DCO? Rebase and force push is fine, thanks. /wait |
Signed-off-by: Luwei Ge <lwge@google.com>
Signed-off-by: Luwei Ge <lwge@google.com>
Signed-off-by: Luwei Ge <lwge@google.com>
Co-authored-by: Matt Klein <mattklein123@gmail.com> Signed-off-by: Luwei Ge <rockingspore@gmail.com>
Signed-off-by: Luwei Ge <lwge@google.com>
Signed-off-by: Luwei Ge <lwge@google.com>
Signed-off-by: Luwei Ge <lwge@google.com>
Following DCO's instruction I always ended up rebasing on an old "main" branch and the PR would try to revert the changes between that commit and HEAD. Did I miss something obivious? The side effect was it also assigned (spammed) a number of reviewers that didn't need to be involved and I can't un-assign them. Nevertheless, I then rebased my branch with HEAD and hopefully it should be good now. Again, thank you for the quick review, @mattklein123 ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!
api/envoy/config/rbac/v3/rbac.proto
Outdated
|
||
// Audit options that includes the condition for audit logging to happen and | ||
// audit logger configurations. | ||
AuditOptions audit_options = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs a [#not-implemented-hide:]
comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch! Done.
api/envoy/config/rbac/v3/rbac.proto
Outdated
|
||
// Audit options that includes the condition for audit logging to happen and | ||
// audit logger configurations. | ||
AuditOptions audit_options = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest calling this message AuditLoggingOptions
and the field audit_logging_options
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
api/envoy/config/rbac/v3/rbac.proto
Outdated
// | ||
// [#not-implemented-hide:] | ||
// [#extension-category: envoy.rbac.audit_loggers] | ||
repeated core.v3.TypedExtensionConfig audit_log = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest calling this audit_loggers
. I think we don't need to match the semantics of the access logger anymore, since we're not configuring this in the same context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Done.
Signed-off-by: Luwei Ge <lwge@google.com>
Signed-off-by: Luwei Ge <lwge@google.com>
Friendly ping for another look, @mattklein123. Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
/retest |
Retrying Azure Pipelines: |
@mattklein123 thank you so much for the quick turnaround! |
Commit Message: Add audit_log typed config in HCM and audit_condition enum in RBAC config protos.
Risk Level: low
Docs Changes: N/A
Release Notes: N/A
API Considerations:
This API change is proposed for the feature implementation in gRPC first.
gRPC authorization is based on xDS RBAC proto. The audit logging support we are trying to add is to enable users configure loggers and audit authorization events when the audit condition is met.
The primary difference from the access log is that audit log should be invoked right after the RBAC evaluation is done and will not wait till after the entire call finishes. It seems to me this is something Envoy also lacks and could potentially benefit from.
The API change presented here has gone through some preliminary reviews by @markdroth and @ejona86. I'm working on a gRFC which will be linked here as soon as it's available. Meanwhile, I am opening this PR now such that any question or concern could be raised early.