Switch to built-in data structures in SecretsMasker #16424
Using Iterable in SecretsMasker can cause undesirable side effects when the object passed as a log parameter is iterable and iterating it is not idempotent. For example, botocore passes a StreamingBody object to the logger; this object is Iterable, but it can be iterated only once. Masking iterates the object during logging, so the body is empty when the actual results are retrieved later. Instead of walking any Iterable, this change iterates only lists and recursively redacts only dicts/strs/tuples/sets/lists, which should never produce side effects because accessing those objects has none. Fixes: apache#16148
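A minimal sketch of the failure mode described above. It does not use botocore or Airflow: `OneShotBody` is a hypothetical stand-in for a body that can be iterated only once, and `redact_iterable` / `redact_builtin` are simplified illustrations of the old and new checks, not the project's actual code.

```python
from collections.abc import Iterable


class OneShotBody:
    """Hypothetical stand-in for a streaming body: iterable, but only once."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)

    def __iter__(self):
        # Always hands back the same iterator, so a second pass yields nothing.
        return self._chunks


def redact_iterable(item):
    """Old approach (sketch): walk anything that is Iterable."""
    if isinstance(item, str):
        return "***" if item == "secret" else item
    if isinstance(item, dict):
        return {key: redact_iterable(value) for key, value in item.items()}
    if isinstance(item, Iterable):
        return [redact_iterable(sub) for sub in item]
    return item


def redact_builtin(item):
    """New approach (sketch): walk only built-in containers."""
    if isinstance(item, str):
        return "***" if item == "secret" else item
    if isinstance(item, dict):
        return {key: redact_builtin(value) for key, value in item.items()}
    if isinstance(item, (tuple, set)):
        return tuple(redact_builtin(sub) for sub in item)
    if isinstance(item, list):
        return [redact_builtin(sub) for sub in item]
    return item


# Logging with the old check drains the body before the caller reads it.
consumed = OneShotBody([b"a", b"b"])
redact_iterable({"Body": consumed})
leftover_after_old = list(consumed)

# With the built-in-only check the body is passed through untouched.
intact = OneShotBody([b"a", b"b"])
redact_builtin({"Body": intact})
leftover_after_new = list(intact)
```

In this sketch `leftover_after_old` is empty while `leftover_after_new` still holds both chunks, which mirrors the empty-body symptom reported in the issue.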
return list(self.redact(subval) for subval in item)
else:
return item
# I think this should never happen, but it does not hurt to leave it just in case
I could not find an easy way to raise an exception any more after removing Iterable. I believe we cannot get the Exception if we do what we do now - i.e. walking the built-in structures in the way we do. But maybe there is some way an exception can be raised here?
I think it’s technically possible if the user passes in some kind of custom subclass (e.g. class MyList(list): that overrides some weird stuff), so yeah let’s keep the exception handler there, but I don’t think it’s worthwhile to have a test for it.
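A sketch of how such a subclass could still reach the exception handler. `MyList` and the simplified `redact` below are hypothetical; in particular, recording the error and returning the item unchanged is an assumption made for illustration, not necessarily what the project's handler does.

```python
errors = []  # collected here only so the sketch has something to inspect


class MyList(list):
    """Hypothetical list subclass whose iteration misbehaves."""

    def __iter__(self):
        raise RuntimeError("iteration not supported")


def redact(item):
    """Sketch: walk built-in containers, guarding against odd subclasses."""
    try:
        if isinstance(item, str):
            return "***" if item == "secret" else item
        if isinstance(item, list):
            # MyList passes the isinstance check, but iterating it raises.
            return [redact(sub) for sub in item]
        return item
    except Exception as exc:
        # Assumed fallback: note the failure and return the item unredacted.
        errors.append(type(exc).__name__)
        return item


weird = MyList(["secret"])
result = redact(weird)
```

Here `result` is the original `weird` object and `errors` contains `"RuntimeError"`, so the handler is reachable even though plain built-ins never trigger it.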
Agreed
@@ -196,12 +195,11 @@ def redact(self, item: "RedactableItem", name: str = None) -> "RedactableItem":
elif isinstance(item, (tuple, set)):
# Turn set in to tuple!
return tuple(self.redact(subval) for subval in item)
elif isinstance(item, io.IOBase):
I think the only reason IOBase was handled here specifically was that it is Iterable. This branch is no longer needed, since such objects now fall through to the final return item anyway.
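A small sketch of that fall-through, using `io.StringIO` as an example of an `io.IOBase` object. `redact_builtin_only` is a hypothetical, simplified walker that omits the actual masking of strings.

```python
import io
from collections.abc import Iterable


def redact_builtin_only(item):
    """Sketch: recurse into built-in containers only; return anything else as-is."""
    if isinstance(item, dict):
        return {key: redact_builtin_only(value) for key, value in item.items()}
    if isinstance(item, (tuple, set)):
        return tuple(redact_builtin_only(sub) for sub in item)
    if isinstance(item, list):
        return [redact_builtin_only(sub) for sub in item]
    return item  # file-like objects end up here without being read


stream = io.StringIO("line 1\nline 2\n")
stream_is_iterable = isinstance(stream, Iterable)  # file objects define __iter__
result = redact_builtin_only(stream)
remaining = stream.read()  # nothing was consumed by the walker
```

Because the stream matches none of the built-in container checks, it is returned as the same object and its contents are still fully readable afterwards.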
@@ -72,22 +72,6 @@ def test_args(self, logger, caplog):

assert caplog.text == "INFO Cannot connect to user:***\n"

def test_non_redactable(self, logger, caplog):
Could not find a way to trigger Exception :)
Inheriting from list or dict should trigger it, but I don't mind deleting this test
@ashb, does it look good to you? I think switching to list solves the problem entirely.
@@ -165,7 +164,7 @@ def _redact_all(self, item: "RedactableItem") -> "RedactableItem":
elif isinstance(item, (tuple, set)):
# Turn set in to tuple!
return tuple(self._redact_all(subval) for subval in item)
- elif isinstance(item, Iterable):
+ elif isinstance(item, list):
Tuple too please
Oh that's a few lines up
The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it onto the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
The same commit message (Fixes: #16148) reappears in four later commits that reference this pull request, cherry-picked in turn from commits d1d02b6, 4c37aea, 7e5968a, and 523bba0.