-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Additional safeguards against printing Secret content #474
Conversation
a06920f
to
99497da
Compare
99497da
to
f52b7ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM
As for the philosophical question of how we should handle secrets: I think we should continue our best effort support, emphasise that it's just that, and commit to handling leaks in our logs as high priority issues. As you say the tool may still handle them anyway, and treating them like any other resource seems like it could result in more accidental leakage than our current approach. Actively refusing support doesn't seem pragmatic either, though perhaps we could add a flag for users to opt into this - if secrets are detected, we fail early and refuse to continue.
@@ -62,7 +62,11 @@ def build_secrets | |||
|
|||
secrets.map do |secret_name, secret_spec| | |||
validate_secret_spec(secret_name, secret_spec) | |||
generate_secret_resource(secret_name, secret_spec["_type"], secret_spec["data"]) | |||
resource = generate_secret_resource(secret_name, secret_spec["_type"], secret_spec["data"]) | |||
unless resource.validate_definition(@kubectl) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this result in additional calls to the API server compared to today, or is it processed entirely locally? I ask because we've seen a recent increase in deploy failures to due overladed masters at the validation stage, and this might add additional load and exacerbate the issue. It's not a good argument to skip this, but perhaps raises the priority of adding some resiliency patterns around these calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does make API calls, but these new ones will be in serial. I don't think we were aware that the validation site was a particular problem. It may be easy to add additional attempts, but I'd need to check what validation failure looks like to make sure the retries would be appropriate. Can you PM me an example?
How does this sound as an addition to the README? I'm thinking it could go at the end of the Usage section.
|
What are you trying to accomplish with this PR?
Further reduce the likelihood that we'll accidentally print the content of sensitive resources (i.e. Secrets) when something goes wrong. Now that we officially support the Secret resource and push our EJSON secrets through that code path, this is super important.
How is this accomplished?
These are the ways I thought of that secrets could get logged:
I added tests for the three additional cases (test_apply_failure_with_sensitive_resources_hides_raw_output covers the last one already) and tried to fix them as solidly as I could think to.
record_invalid_template
is a bit of a choke point for these problems, so in addition to fixing the specific cases, I made that method do a blunt scan of the content it is about to log and replace it if it containskind: Secret
.I also made the ejson secret provisioner proactively validate the resulting resources and raise its own error if it isn't valid. This is duplicated effort, but it gives us stronger guarantees that nothing will go wrong further down the path.
Note that this is based on #473 to avoid linter noise.What could go wrong?
It's still possible there's a code path that I haven't thought of that will still log secret content on error. It is also possible someone will add a new way for this to go wrong in the future, and nothing will stand in their way (note that our contributor's guide does warn about this:
We handle Kubernetes secrets, so it is critical that changes do not cause the contents of any secrets in the template set to be logged.
).An alternative / additional safeguard could be to have our logger class scan every single string we give it for
kind: Secret
. However, that is very heavy-handed and arguably not effective, since not all strings with that text contain sensitive data, and not all secrets have that string (e.g. they could be invalid, or the string could be a partial resource).If we don't think we can get to Good Enough here, we could revert Secret support. I would rather not though--the community asked for it, and deploying them always actually worked even before we made it official. Even if we took it a step further and prevented Secrets from being deployed with this gem, that still doesn't prevent this gem from handling them.
Another thing that occurred to me is that CRs could contain sensitive data for all we know. We should consider letting administrators tell us this in our CRD annotations. I'll open an issue about this.
@Shopify/cloudx