Hubble: add option to sanitize sensitive L7 data from flows #23887
Comments
|
Hi @rolinh, we would like to take a stab at this with @ChrsMark; maybe we could split the items you listed. Looking at the codebase, we understand that we need to extend Hubble's Layer 7 parser and implement the described logic there. This is similar to how Hubble's Layer 7 parser already redacts the user's password so that it is not included in the network flow. Does this hold? Thanks! |
|
Hi @ioandr,
Nice! I'm happy to create sub-issues if that helps, but we can otherwise simply add your GitHub handle next to each of the tasks.
Yes, that's the way to do it indeed. Thanks for your contributions! |
|
Hi @rolinh we would like to continue working on the rest of the items listed in #23887 (comment) so that we can deliver the whole L7 redaction/sanitization feature in Hubble. At your convenience please advise on the following items to make sure we are on the same page and we start with the implementation:
Maybe we could keep sanitization logic in a single place and reuse it from multiple places. Other than the above, it seems that Envoy itself provides a list of HTTP headers that it potentially sanitizes: https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/header_sanitizing Maybe we could get some inspiration from this, given that Cilium itself ships with Envoy. We are looking forward to your thoughts on the above - once again, we can split tasks with @ChrsMark |
|
Thanks @ioandr for the detailed comment!
Sounds good to me but again, I'd like some consistency around defaults. In my view, both Kafka API keys and HTTP passwords are secrets (in every context) and we should apply the same default redact policy. Redacting Kafka API keys by default would be the less intrusive option, as changing the default for HTTP passwords would be a kind of "breaking change" and has security implications. Agree that the issue description is lacking in terms of what "sanitize" means concretely. In my opinion: because at the Hubble level we don't know in which context these values are going to be used, we shouldn't modify them. It's up to consumers inserting into a SQL database to take the usual precautions about these tainted values, and the same goes for consumers generating HTML, printing to a terminal, etc. cc @rolinh to clarify the intent.
While the list is interesting, I think it covers a different use-case. In a reverse-proxy setup (edge node / front-proxy in Envoy documentation) it is critical to "sanitize" some headers (e.g. |
With @ChrsMark we agree on this, so we can do the following:
I can send a small, dedicated PR for 1 and 2 listed above.
Regarding sanitization let's wait for some feedback by @rolinh who originally opened the issue to decide how we will move forward. |
|
Sorry @ioandr, @ChrsMark and @kaworu, I missed the updates on this issue. Don't hesitate to ping me on Slack if I'm unresponsive as it's sometimes hard to keep up with everything going on 🙂
Fully agree with this. Without context, it's impossible to do the correct job. Sanitization in this context means, for example, given a list of allowed HTTP headers, leave these untouched and remove all others.
I should have used the term "redact" rather than sanitize, sorry about this. I don't think it's on Hubble to take the call and decide what is a strong password, etc. What Hubble should do is simply to replace usernames and passwords with
Same answer here. We can provide a list of default allowed headers (e.g.
Sounds like a good plan to me and doing 1. will make the behavior consistent.
🚀 |
|
Thanks @rolinh! Makes sense to me. Regarding the http-headers' redaction: how about instead of defining the allowed headers, to define the list of "sensitive" headers to be redacted (optionally)? The following headers could be those we can start with and ofc we can extend:
|
I'd strongly prefer to have users provide a list of allowed headers rather than a list of headers to redact. The reason is that there could be a lot of custom headers and whatnot that aren't covered, and it would give a false sense of security ("secrets are sanitized, it's safe to share the L7 flows"). I think it's simply not possible to be exhaustive. On the other hand, if a header not on the allow-list is redacted, it's easy to add it to the list. This implies that we need to keep all header keys and only redact header values, which seems totally fine to me. |
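To make the allow-list idea above concrete, here is a minimal Go sketch: every header key is kept, and only the values of headers that are not on the allow-list are replaced. The function name `redactHeaders` and the `REDACTED` placeholder are hypothetical and chosen for illustration only; this is not Hubble's actual parser code.

```go
package main

import (
	"fmt"
	"net/http"
)

// placeholder is a hypothetical replacement value (an assumption, not
// necessarily the string Hubble uses).
const placeholder = "REDACTED"

// redactHeaders returns a copy of h in which all header keys are preserved
// and the values of headers NOT on the allow-list are replaced by the
// placeholder. Header names are compared case-insensitively by
// canonicalizing them first.
func redactHeaders(h http.Header, allow []string) http.Header {
	allowed := make(map[string]struct{}, len(allow))
	for _, name := range allow {
		allowed[http.CanonicalHeaderKey(name)] = struct{}{}
	}

	out := make(http.Header, len(h))
	for key, values := range h {
		if _, ok := allowed[http.CanonicalHeaderKey(key)]; ok {
			// Allowed header: copy values untouched.
			out[key] = append([]string(nil), values...)
			continue
		}
		// Not allowed: keep the key, redact every value.
		redacted := make([]string, len(values))
		for i := range redacted {
			redacted[i] = placeholder
		}
		out[key] = redacted
	}
	return out
}

func main() {
	h := http.Header{}
	h.Set("traceparent", "00-abc-def-01")
	h.Set("Authorization", "Bearer secret-token")

	out := redactHeaders(h, []string{"traceparent", "tracestate", "Cache-Control"})
	fmt.Println(out.Get("Traceparent"))   // value kept
	fmt.Println(out.Get("Authorization")) // value replaced by the placeholder
}
```

With this shape, a custom header that nobody anticipated is redacted by default, and making it visible again only requires adding its name to the allow-list.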
Definitely. In the values file, I picture something like the following:

    hubble:
      redact:
        enabled: false
        http:
          userInfo: true
          urlQuery: true
          headers:
            allow:
              - traceparent
              - tracestate
              - Cache-Control
              - ...
        kafka:
          apiKey: true

So setting |
|
Thanks for the feedback @chancez and @rolinh! Just to verify what was suggested at #23887 (comment), would that mean that we need to change the format of the So far we have the following: or in terms of values as

    hubble:
      redact:
        - http-url-query
        - kafka-api-key

So would your proposal @rolinh mean that we need to change the schema to sth like the following: |
|
@ChrsMark Correct, that looks reasonable. |
|
@ChrsMark Given that #25746 made it to the v1.14 branch but #25844 did not, and that we have a plan to rework the values file, I suggest pulling #25746 out of the v1.14 branch. I believe this would provide a better experience than shipping only a portion of this meta issue. If we ship in v1.14 and then change the values options, we'll also have to deprecate in the next version, which means the whole improvement would land in v1.16 only. |
|
Hey folks. I have filed #26989 to implement the discussed changes. For now, only the |
|
A heads-up on this one. Now that #27553 has been merged, we can resume the work on the following pending tasks:
@ioandr and everyone, feel free to add anything that I might have missed here. |
|
Hey there, it seems that the only pending item to close this is to add business logic to control whether we redact the user info part from URLs, if present. I have started working on this; however, I bumped into some unexpected behavior that I describe in #28798. As soon as we build context around it and have a way forward, I should be able to proceed and file a PR to:
|
Add business logic to L7 HTTP parser to conditionally redact sensitive user info (e.g., password used in basic authentication) when present in observed URLs. * Add the '--hubble-redact-http-userinfo' option to the Cilium CLI. Preserve existing functionality by setting the default value to true. * Add unit tests for redacting user info in L7 parser * Update the `visibility.rst` document * Update Helm chart templates, values and docs Finally, ensure that values are redacted both in L7 HTTP flows and corresponding (HTTP) summaries. Closes cilium#23887 Signed-off-by: Ioannis Androulidakis <androulidakis.ioannis@gmail.com>
Add business logic to L7 HTTP parser to conditionally redact sensitive user info (e.g., password used in basic authentication) when present in observed URLs. * Add the '--hubble-redact-http-userinfo' option to the Cilium CLI. Preserve existing functionality by setting it to true by default. * Add unit tests to verify that password in observed URL is redacted. * Fix issue in L7 HTTP parser where sensitive values were redacted in (L7) HTTP flows, but not in (L7) HTTP summaries. * Update documentation as needed. * Update Helm chart templates, values and docs as needed. Closes cilium#23887 Signed-off-by: Ioannis Androulidakis <androulidakis.ioannis@gmail.com>
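As a rough illustration of the conditional user info redaction described in the commit message, here is a minimal Go sketch built on the standard `net/url` package. The function name `redactUserInfo`, the `enabled` parameter (standing in for the `--hubble-redact-http-userinfo` option) and the `REDACTED` placeholder are assumptions made for this example; the actual parser code may differ, for instance in whether the username is kept.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactedValue is a hypothetical placeholder (an assumption, not
// necessarily the string Hubble uses).
const redactedValue = "REDACTED"

// redactUserInfo replaces the password in the user info part of rawURL
// when redaction is enabled. The username is kept in this sketch.
// URLs without a password, or any URL when redaction is disabled, are
// returned unchanged.
func redactUserInfo(rawURL string, enabled bool) (string, error) {
	if !enabled {
		return rawURL, nil
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	if u.User == nil {
		return rawURL, nil
	}
	if _, hasPassword := u.User.Password(); hasPassword {
		u.User = url.UserPassword(u.User.Username(), redactedValue)
	}
	return u.String(), nil
}

func main() {
	got, err := redactUserInfo("https://alice:s3cret@example.com/path?x=1", true)
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

Applying the same helper wherever the URL is rendered (the flow itself and its summary) is one way to avoid the mismatch between flows and summaries that the commit message mentions.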
Hubble can provide visibility into L7 protocols such as HTTP or Kafka. This Layer 7 protocol visibility feature is opt-in: to enable it, users must either create an L7 policy or add explicit pod annotations. Layer 7 Hubble flows, however, may contain sensitive information, for instance as part of some HTTP headers or in a URL itself.
Hubble should provide an option for users to decide which potentially sensitive L7 data to keep in Hubble flows and it should be finely configurable.