gh-125346: Fix decoding with non-standard Base64 alphabet #141128
The "+" and "/" characters are no longer recognized as part of the Base64 alphabet in base64.urlsafe_b64decode() and in base64.b64decode() with an altchars argument that does not contain them.
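To illustrate the described semantics, here is a minimal sketch that emulates them on top of the existing API. The helper name `decode_ignoring_stale_chars` is hypothetical and is not part of this PR; it assumes non-validating mode, where bytes outside the alphabet are discarded.

```python
import base64


def decode_ignoring_stale_chars(data: bytes, altchars: bytes = b"-_") -> bytes:
    """Hypothetical helper: treat "+" and "/" as non-alphabet bytes
    whenever altchars does not contain them."""
    # Bytes from the standard alphabet that the alternative alphabet replaced.
    stale = bytes(ch for ch in b"+/" if ch not in altchars)
    # Discard them up front, the way non-validating decoding discards
    # any other byte that is outside the alphabet.
    cleaned = data.translate(None, delete=stale)
    return base64.b64decode(cleaned, altchars=altchars)


# "-_-_" is the URL-safe spelling of the bytes fb ff bf.
decoded = decode_ignoring_stale_chars(b"-_+/-_")
```

Note that discarding bytes changes the input length, so the padding of the remaining data still has to be valid.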
@serhiy-storchaka Thanks for this. Can you link this PR to either the original issue or a new issue for tracking purposes?
sethmlarson
left a comment
I'm really concerned about the subtle breakages that this change could cause, especially because the default behavior is to throw away characters that aren't in the current alphabet. If the default behavior was to raise an error I would feel better about this change.
Makes me wonder if we should target validate=True with this behavior change (because IMO, the silent dropping of invalid characters is in itself a concerning behavior) and then long-term move to having validate be enabled by default?
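For reference, the difference between the default mode and validate=True can be seen with the standard alphabet alone. This is only a small illustration of the existing behavior, not of the change proposed here; the variable names are made up for the example.

```python
import base64
import binascii

noisy = b"YW!Jj"  # "YWJj" (which decodes to b"abc") with a stray "!" inserted

# Default mode: the non-alphabet byte is silently discarded.
lenient = base64.b64decode(noisy)

# validate=True: the same input is rejected with binascii.Error.
try:
    base64.b64decode(noisy, validate=True)
    strict_rejected = False
except binascii.Error:
    strict_rejected = True
```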
serhiy-storchaka
left a comment
This worries me too. We can keep the old behavior but emit a warning if characters + or / occur in Base64 data with the alternative alphabet.
But urlsafe_b64decode() does not have the validate parameter.
I wonder if for |
But is not passing |
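The warning-based fallback mentioned above could look roughly like the following sketch. The wrapper name `b64decode_warn_on_standard_chars` and the choice of FutureWarning are assumptions made for illustration, not what the PR implements; the decoding itself is left to the installed base64 module.

```python
import base64
import warnings


def b64decode_warn_on_standard_chars(data: bytes, altchars: bytes = b"-_") -> bytes:
    """Hypothetical wrapper: warn when "+" or "/" appear in data that is
    supposed to use an alternative alphabet, then decode as usual."""
    unexpected = [chr(ch) for ch in b"+/" if ch in data and ch not in altchars]
    if unexpected:
        warnings.warn(
            f"characters {unexpected!r} are not in the alternative Base64 alphabet",
            FutureWarning,  # assumed category, chosen only for this sketch
            stacklevel=2,
        )
    return base64.b64decode(data, altchars=altchars)
```

A caller that wants hard failures instead can promote the warning to an error with `warnings.simplefilter("error", FutureWarning)`.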