Tighten deserialization allowlist regex to require full-string match#66499
Open
potiuk wants to merge 2 commits intoapache:mainfrom
Open
Tighten deserialization allowlist regex to require full-string match#66499potiuk wants to merge 2 commits intoapache:mainfrom
potiuk wants to merge 2 commits intoapache:mainfrom
Conversation
The ``allowed_deserialization_classes_regexp`` allowlist used ``re.match()``, which only anchors at the start of the string. A pattern like ``airflow\.models\.Variable`` therefore also admitted classnames such as ``airflow.models.Variable_Malicious``. Switch to ``re.fullmatch()`` so the admin's pattern matches the entire classname; document the semantics in the config description so operators know to use ``.*`` for prefix-style allowances.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Tighten the deserialization allowlist (
[core] allowed_deserialization_classes_regexp)to use
re.fullmatch()instead ofre.match(). Previously a pattern such asairflow\.models\.Variableadmitted not only the intended class but alsoairflow.models.Variable_Malicious—re.matchonly anchors at the startof the string. Using
fullmatchrequires the pattern to match the entireclassname, eliminating the prefix-bypass footgun.
Updated the config description so admins know patterns are full-match and
that
.*is needed for prefix-style allowances. Updated the existing testthat relied on prefix-match semantics, and added a dedicated test for the
bypass scenario.
Compatibility note for reviewers
This is a behaviour change for any deployment that configured
allowed_deserialization_classes_regexpwith patterns relying onprefix-match semantics (e.g.
airflow\.models\.to mean "any class underairflow.models"). Such deployments need to add.*to the pattern.The default value is empty, so out-of-the-box deployments are unaffected.
Default off, admin-only config — leaving the newsfragment decision to the
reviewer.
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Opus 4.7 (1M context) following the guidelines