Redact rendered template fields while still structured to preserve nested-key masking on truncation #65906
@potiuk can you resolve the conflicts on this? I see it fixes security issue https://github.com/airflow-s/airflow-s/issues/345. Adding to the 3.2.2 milestone.
Damn. I will be away from my PC for a few hours. I hope it will be OK if I do it later?
…sted-key masking on truncation Generated-by: Claude Opus 4.7 (1M context) following the guidelines at https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions
The new test_rendered_templates_mask_nested_keys_with_truncation shares the singleton SecretsMasker with earlier tests in the file. One of those (test_get_connection_from_context) fetches a connection whose password fixture value happens to be the literal string "password", which the SDK runtime registers as a regex mask via mask_secret(). When the new test runs after it, that regex substitutes the literal token "password" inside str(redacted) -- including the dict KEY name -- so the assertion "'password': '***'" fails because the key itself is also masked. Reset patterns/replacer for the test via monkeypatch (auto-restored on teardown) so the assertion isolates value-masking (the behavior under test) from key-token replacement (a side effect of leaked patterns).
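The leak described in this commit message can be reproduced with a toy singleton. This is only a sketch: ToyMasker, render, and the payload are hypothetical stand-ins, not the real SecretsMasker API, and the standard library's unittest.mock.patch.object is used in place of pytest's monkeypatch so the snippet runs without pytest. Both restore the original attributes when the scope ends.

```python
import re
from unittest import mock


class ToyMasker:
    """Hypothetical stand-in for a process-wide singleton masker."""

    def __init__(self):
        self.patterns = set()
        self.replacer = None

    def add_mask(self, secret):
        # Register a literal value as a regex pattern, as a leaked fixture would.
        self.patterns.add(re.escape(secret))
        self.replacer = re.compile("|".join(sorted(self.patterns)))

    def redact_text(self, text):
        return self.replacer.sub("***", text) if self.replacer else text


MASKER = ToyMasker()


def render(value):
    # Key-based masking on the structured value, then pattern masking on the string.
    masked = {k: ("***" if k == "password" else v) for k, v in value.items()}
    return MASKER.redact_text(str(masked))


# An earlier test registers the literal string "password" as a secret value.
MASKER.add_mask("password")
leaked = render({"password": "hunter2"})  # the key name itself gets masked

# Reset patterns/replacer only for the scope of the test under isolation.
with mock.patch.object(MASKER, "patterns", set()), mock.patch.object(MASKER, "replacer", None):
    isolated = render({"password": "hunter2"})

restored = MASKER.replacer is not None  # the leaked state is back after the block
```

With the leaked pattern in place, an assertion on "'password': '***'" fails because the key token is replaced too; with the patterns reset, only the value is masked.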
4304e53 to e8380b2
Rebased and green @vatsrahul1001
I'd love to get this one merged, and would love it in 3.2.2 if it's not too late. cc @vatsrahul1001 (3.2.2 RM)
Drafted-by: Claude Code (Opus 4.7); reviewed by @potiuk before posting
Backport successfully created: v3-2-test
Note: As of Merging PRs targeted for Airflow 3.X In matter of doubt please ask in #release-management Slack channel.
… preserve nested-key masking on truncation (apache#65906)
* Redact rendered template fields while still structured to preserve nested-key masking on truncation
  Generated-by: Claude Opus 4.7 (1M context) following the guidelines at https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions
* Isolate masker patterns in nested-key truncation test
  The new test_rendered_templates_mask_nested_keys_with_truncation shares the singleton SecretsMasker with earlier tests in the file. One of those (test_get_connection_from_context) fetches a connection whose password fixture value happens to be the literal string "password", which the SDK runtime registers as a regex mask via mask_secret(). When the new test runs after it, that regex substitutes the literal token "password" inside str(redacted) -- including the dict KEY name -- so the assertion "'password': '***'" fails because the key itself is also masked. Reset patterns/replacer for the test via monkeypatch (auto-restored on teardown) so the assertion isolates value-masking (the behavior under test) from key-token replacement (a side effect of leaked patterns).
(cherry picked from commit 4ceb0db)
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
… preserve nested-key masking on truncation (#65906) (#67117)
* Redact rendered template fields while still structured to preserve nested-key masking on truncation
  Generated-by: Claude Opus 4.7 (1M context) following the guidelines at https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions
* Isolate masker patterns in nested-key truncation test
  The new test_rendered_templates_mask_nested_keys_with_truncation shares the singleton SecretsMasker with earlier tests in the file. One of those (test_get_connection_from_context) fetches a connection whose password fixture value happens to be the literal string "password", which the SDK runtime registers as a regex mask via mask_secret(). When the new test runs after it, that regex substitutes the literal token "password" inside str(redacted) -- including the dict KEY name -- so the assertion "'password': '***'" fails because the key itself is also masked. Reset patterns/replacer for the test via monkeypatch (auto-restored on teardown) so the assertion isolates value-masking (the behavior under test) from key-token replacement (a side effect of leaked patterns).
(cherry picked from commit 4ceb0db)
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Co-authored-by: Rahul Vats <43964496+vatsrahul1001@users.noreply.github.com>
When a rendered template field exceeds [core] max_templated_field_length, the JSON-serializable serialization path stringifies the value before applying redact(). That order loses the nested-key context that redact() uses to mask values under sensitive keys such as password, token, secret, and api_key; only registered mask_secret() value patterns survive the truncation path.
This change applies redact() to the structured value first, then stringifies the redacted result for truncation. Both nested-key-context masking and value-pattern masking now behave consistently regardless of whether the rendered field crosses the truncation boundary. The fit-in-limits branch is unchanged.
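The ordering difference can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: toy_redact is a simplified stand-in for redact(), the sensitive-key set is only the four names mentioned above, and truncate() with its suffix is hypothetical rather than the message Airflow actually emits.

```python
import re

SENSITIVE_KEYS = {"password", "token", "secret", "api_key"}  # assumed subset
MASK = "***"


def toy_redact(value, key=None, patterns=()):
    """Toy stand-in for redact(): mask by sensitive key name, then by value pattern."""
    if isinstance(key, str) and key.lower() in SENSITIVE_KEYS:
        return MASK
    if isinstance(value, dict):
        return {k: toy_redact(v, k, patterns) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [toy_redact(v, None, patterns) for v in value]
    if isinstance(value, str):
        for secret in patterns:
            value = re.sub(re.escape(secret), MASK, value)
    return value


def truncate(text, max_length):
    # Hypothetical truncation helper; the real suffix text differs.
    return text[:max_length] + "... (truncated)"


def stringify_then_redact(value, max_length, patterns=()):
    # Old order: the dict is already a flat string, so key context is gone.
    return truncate(toy_redact(str(value), patterns=patterns), max_length)


def redact_then_stringify(value, max_length, patterns=()):
    # New order: redact while the value is still structured, then stringify.
    return truncate(str(toy_redact(value, patterns=patterns)), max_length)


payload = {"conn": {"password": "hunter2", "host": "db.example.com"}, "padding": "x" * 200}
old = stringify_then_redact(payload, 60)
new = redact_then_stringify(payload, 60)
old_with_pattern = stringify_then_redact(payload, 60, patterns=("hunter2",))
```

In this sketch the old order leaks the nested password unless the value also happens to be registered as a pattern, while the new order masks it through the key name alone.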
The same fix is applied in both airflow-core/src/airflow/serialization/helpers.py (serialize_template_field) and task-sdk/src/airflow/sdk/execution_time/task_runner.py (_serialize_template_field), since the two functions diverged into near-duplicates after #59566 and carried the same bug pattern.
Test plan
* Added test_serialize_template_field_masks_nested_sensitive_keys_on_truncation to airflow-core/tests/unit/serialization/test_helpers.py, covering the structured-redact-before-stringify behaviour for an oversized nested password payload.
* Added test_rendered_templates_mask_nested_keys_with_truncation to task-sdk/tests/task_sdk/execution_time/test_task_runner.py, covering the same behaviour through the runtime path.
* test_serialize_template_field_with_very_small_max_length and test_rendered_templates_mask_secrets_with_truncation continue to pass.
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Opus 4.7 (1M context) following the guidelines at https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions