-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KETTLE-73: Allow censoring of sensitive information which may be present in URL of DataSource #49
Conversation
…ent in URL of DataSource
CI job failed: https://ci.fluidproject.org/job/kettle-pull-request/9/ |
CI job passed: https://ci.fluidproject.org/job/kettle-pull-request/10/ |
lib/dataSource-url.js
Outdated
var censorURL = function (url) { | ||
sensitiveValues.forEach(function (sensitiveValue) { | ||
if (sensitiveValue) { | ||
url = url.replace(sensitiveValue, "[CENSORED]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This straightforward string replacement will fail whenever there is percent-encoding in the URL. For example, if we change the test URL in file tests/DataSourceSimpleTests.js
from "https://secret-user:secret-password@thing.available:997/path"
to "https://secret-user:secret-%25-password@thing.available:997/path"
(that is, a password with value "secret-%-password"), the test will fail.
In this case, the full URL is logged as "https://secret-user:secret-%25-password@thing.available:997/path"
.
Inside the censorURL
function, sensitiveValue
is decoded and has the value "secret-user:secret-%-password"
. Which will not be found in the URL for replacement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it feasible to never log the URL?
An approach might be to encode the value we want to remove and do replacement with the encoded value.
But I fear that this may be difficult to get right as there are multiple possible notations that result in the same decoded value.
It looks like some reserved characters are processed leniently even if they are not escaped. For example, the following URL passes the test:
https://secret-user:secret-=-password@thing.available:997/path
Even though "="
is a reserved character. A URL with another reserved character fails:
https://secret-user:secret-#-password@thing.available:997/path
It's also possible (though less likely) to percent encode unreserved characters. For example the following are all equivalent:
- AB
- A%42
- %41B
- %41%42
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spotting this risk, @simonbates - I decided the readability advantages of logging the URL warranted keeping it, and changed the workflow to regenerate the URL by re-encoding the already censored broken-down fields rather than attempting to censor it in place. Ready for another look - cheers
CI job passed: https://ci.fluidproject.org/job/kettle-pull-request/11/ |
tests/StaticTests.js
Outdated
@@ -54,12 +54,13 @@ fluid.defaults("kettle.tests.middleware.verifyingUnmarked", { | |||
gradeNames: ["kettle.plainAsyncMiddleware"], | |||
middleware: kettle.tests.verifyingUnmarkedMiddleware | |||
}); | |||
|
|||
fluid.logObjectRenderChars = 10240; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This debug line can be removed.
@cindyli, @simonbates - ready for another look |
CI job passed: https://ci.fluidproject.org/job/kettle-pull-request/12/ |
With the current censoring implementation, I wonder if it's worthwhile to keep the presence of these options in the log but replacing their values with a string like [SENSITIVE] so we don't leak sensitive values but are still aware these info have been sent. |
Thanks for this good suggestion, @cindyli - ready for another look |
CI job passed: https://ci.fluidproject.org/job/kettle-pull-request/13/ |
fluid.defaults("kettle.tests.KETTLE34dataSource", { | ||
gradeNames: "kettle.dataSource.URL", | ||
url: "https://user:password@thing.available:997/path", | ||
headers: { | ||
"x-custom-header": "x-custom-value" | ||
}, | ||
censorRequestOptionsLog: { | ||
auth: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test for auth: true
with sensitive values being replaced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But that is the default value - is the idea just to test that options merging works correctly?
I understand |
I couldn't follow the link in your comment. This test https://github.com/fluid-project/kettle/pull/49/files#diff-f338ac10c905127921ee7d0c4ab2f01bR142 shows the sensitive option "auth" being censored from the logs |
Right. Was thinking to enhance this test to have sensitive options, which doesn't seem necessary with the test you pointed out. Thanks. Wonder if "header.Authorization" request field should be censored by default in kettle too. In universal, this field holds access tokens when requests are sent to the cloud based flow manager for fetching or saving user settings. See the get endpoint doc for your reference. |
@cindyli - good idea, implementation and tests enhanced to allow "deep censoring" - ready for another look |
CI job passed: https://ci.fluidproject.org/job/kettle-pull-request/14/ |
docs/DataSources.md
Outdated
@@ -144,6 +144,15 @@ We document these configuration options in the next section: | |||
nonexistent file, or an HTTP resource giving a 404) will result in a <code>resolve</code> with an empty | |||
payload rather than a <code>reject</code> response.</td> | |||
</tr> | |||
<tr> | |||
<td><code>censorRequestOptionsLog</code></td> | |||
<td><code>Object</code> (map of <code>String</code> to <code>Boolean</code>) (default: <code>{auth: true}</code>)</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add "headers.Authorization": true
into the default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well caught. Ready for another look
@simonbates, this pull request looks good to me. |
CI job passed: https://ci.fluidproject.org/job/kettle-pull-request/15/ |
Merged at 281d9aa |
No description provided.