[CVE-2023-27043] gh-102988: Reject malformed addresses in email.parseaddr() #111116

Merged
merged 14 commits into from
Dec 15, 2023

vstinner
Member

@vstinner vstinner commented Oct 20, 2023

Detect email address parsing errors and return empty tuple to indicate the parsing error (old API). Add an optional 'strict' parameter to getaddresses() and parseaddr() functions. Patch by Thomas Dwyer.
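
A minimal sketch of how the new parameter might be exercised, assuming the keyword-only `strict` argument described above. Interpreters without the fix do not accept it, so the hypothetical helper `parse_one()` (not part of this PR) inspects the signature first and falls back to the legacy call.

```python
import inspect
from email.utils import parseaddr

# True only on interpreters whose parseaddr() accepts the new keyword.
HAS_STRICT = "strict" in inspect.signature(parseaddr).parameters


def parse_one(address, strict=True):
    """Hypothetical wrapper: pass strict= through only when it is supported."""
    if HAS_STRICT:
        return parseaddr(address, strict=strict)
    return parseaddr(address)  # legacy behavior, no validation


well_formed = parse_one("Jane Doe <jane@example.net>")
print(well_formed)  # ('Jane Doe', 'jane@example.net')

# A value holding two addresses is not a single address. The result depends
# on the mode and the interpreter version, so no output is claimed here.
print(parse_one("alice@example.org, bob@example.com"))
print(parse_one("alice@example.org, bob@example.com", strict=False))
```

Well-formed input parses the same way in both modes; only malformed input is affected by the parameter.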


📚 Documentation preview 📚: https://cpython-previews--111116.org.readthedocs.build/

@vstinner
Member Author

@gpshead @serhiy-storchaka @bitdancer @warsaw: Would you mind reviewing this security fix?

See issue gh-102988 for the context.

This PR is a copy of PR #108250, but I added a strict parameter (default: strict=True), so it's possible to get the old behavior with strict=False. I added tests for both modes, strict=True and strict=False.

@vstinner
Member Author

This PR is a copy of PR #108250, but I added a strict parameter

My colleague Lumir Balhar @frenzymadness ran an impact check of PR #108250 on Fedora: in short, there is no impact; the test suites of all Python packages in Fedora pass with the change. While there were some build errors, they were unrelated to the email issue. For details, see the COPR at https://copr.fedorainfracloud.org/coprs/lbalhar/email-CVE/builds/ which has more than 4300 builds.

Now with an additional strict parameter, if there is any impacted project, at least there is a way to "opt out".

@vstinner
Member Author

@tdwyer: Would you mind reviewing my change, to see if I preserved your work correctly? (code and tests)

@vstinner vstinner added the following labels on Oct 27, 2023: type-security (A security issue), needs backport to 3.8 (only security fixes), needs backport to 3.9 (only security fixes), needs backport to 3.10 (only security fixes), needs backport to 3.11 (only security fixes), needs backport to 3.12 (bug and security fixes)
@vstinner
Member Author

I think that we should backport the change to all branches accepting security fixes. Problem: the change refers to version numbers, such as .. versionchanged:: 3.13. I suppose that if the change is backported, we should compute the next version of each branch, so the backports have to be done manually.

@vstinner
Member Author

@ambv @SethMichaelLarson: Would you mind reviewing this PR?

@ambv ambv changed the title from "gh-102988: email parseaddr() now rejects malformed address" to "gh-102988: Reject malformed addresses in email.parseaddr()" Oct 27, 2023
@ambv
Contributor

ambv commented Oct 27, 2023

Why is this a separate PR from #108250?

@@ -165,7 +165,7 @@ email
 encountered instead of potentially inaccurate values. Add optional *strict*
 parameter to these two functions: use ``strict=False`` to get the old
 behavior, accept malformed inputs.
-(Contributed by Thomas Dwyer for :gh:`102988` to ameliorate CVE-2023-27043
+(Contributed by Thomas Dwyer for :gh:`102988` to improve the CVE-2023-27043
Member

TIL a new word.

@@ -42,6 +42,8 @@

 specialsre = re.compile(r'[][\\()<>@,:;".]')
 escapesre = re.compile(r'[\\"]')
+realname_comma_re = re.compile(r'"[^"]*,[^"]*"')
Member

Suggested change
-realname_comma_re = re.compile(r'"[^"]*,[^"]*"')
+realname_comma_re = re.compile(r'"[^",]*+,[^"]*+"')

It is faster. But I am not sure that the use of such regex is correct.
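
To make the two patterns concrete, here is a small sketch that runs both on two sample inputs. The sample strings are assumptions chosen for illustration. Possessive quantifiers (`*+`) are only accepted by `re` on Python 3.11 and later, so the suggested pattern is compiled conditionally. This checks agreement on these inputs only; it is neither a benchmark nor a proof of equivalence.

```python
import re
import sys

# Pattern from the diff under review.
plain = re.compile(r'"[^"]*,[^"]*"')

# The suggested pattern uses possessive quantifiers, which older
# interpreters reject at compile time.
possessive = (
    re.compile(r'"[^",]*+,[^"]*+"') if sys.version_info >= (3, 11) else None
)

samples = [
    '"Doe, Jane" <jane@example.net>',  # comma inside the quoted real name
    '"Jane Doe" <jane@example.net>',   # no comma at all
]

for text in samples:
    plain_hit = plain.search(text)
    print(text, "->", plain_hit.group() if plain_hit else None)
    if possessive is not None:
        poss_hit = possessive.search(text)
        # On these two inputs both patterns agree on match / no match.
        assert (poss_hit is None) == (plain_hit is None)
```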

def _pre_parse_validation(email_header_fields):
    accepted_values = []
    for v in email_header_fields:
        s = v.replace('\\(', '').replace('\\)', '')
Member

But what if that backslash was already escaped with a backslash? For example \\) or \\\\).
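
The concern can be reproduced with plain string operations. In the sketch below, `strip_escaped_parens_naive()` mirrors the two chained `replace()` calls from the diff, while `strip_quoted_pairs()` is a hypothetical alternative (not code from this PR) that consumes backslash escapes left to right, so that an escaped backslash does not "escape" the parenthesis after it.

```python
def strip_escaped_parens_naive(value):
    # Mirrors the chained replace() calls quoted in the review.
    return value.replace('\\(', '').replace('\\)', '')


def strip_quoted_pairs(value):
    # Hypothetical alternative: drop each backslash together with the one
    # character it escapes, scanning from left to right.
    out = []
    i = 0
    while i < len(value):
        if value[i] == '\\' and i + 1 < len(value):
            i += 2  # skip the backslash and the escaped character
            continue
        out.append(value[i])
        i += 1
    return ''.join(out)


sample = 'a\\\\)b'  # characters: a \ \ ) b  (escaped backslash, then a bare paren)

naive = strip_escaped_parens_naive(sample)
scanned = strip_quoted_pairs(sample)

print(repr(naive))    # 'a\\b'  -> the bare paren was removed along with a backslash
print(repr(scanned))  # 'a)b'   -> the bare paren survives
```

With the naive approach the unescaped `)` disappears, so a later check that counts parentheses would not see it.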

@vstinner
Member Author

Why is this a separate PR from #108250?

I'm not the author of the other PR. I copied the other PR and added the strict parameter.

@ambv
Contributor

ambv commented Oct 27, 2023

I'm not the author of this PR and I was able to make commits to it.

@vstinner
Member Author

I'm not the author of this PR and I was able to make commits to it.

I don't feel comfortable making significant changes to a PR without asking the author. I prefer to create a separate PR and ask for review.

@vstinner
Member Author

vstinner commented Oct 30, 2023

Is this behavior a bug or a feature? I don't know how ; is supposed to behave.

$ python
Python 3.11.6 (main, Oct  3 2023, 00:00:00) [GCC 13.2.1 20230728 (Red Hat 13.2.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from email.utils import getaddresses
>>> from pprint import pprint
>>> pprint(getaddresses('<bob@example.org>; <alice@example.org>'))
[('', ''),
 ('', 'b'),
 ('', 'o'),
 ('', 'b'),
 ('', ''),
 ('', 'e'),
 ('', 'x'),
 ('', 'a'),
 ('', 'm'),
 ('', 'p'),
 ('', 'l'),
 ('', 'e'),
 ('', '.'),
 ('', 'o'),
 ('', 'r'),
 ('', 'g'),
 ('', ''),
 ('', ''),
 ('', ''),
 ('', ''),
 ('', 'a'),
 ('', 'l'),
 ('', 'i'),
 ('', 'c'),
 ('', 'e'),
 ('', ''),
 ('', 'e'),
 ('', 'x'),
 ('', 'a'),
 ('', 'm'),
 ('', 'p'),
 ('', 'l'),
 ('', 'e'),
 ('', '.'),
 ('', 'o'),
 ('', 'r'),
 ('', 'g'),
 ('', '')]

@vstinner
Member Author

Is this behavior a bug or a feature? I don't know how ; is supposed to behave.

Oh. getaddresses() expects a sequence, not a string :-)
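
For comparison, a minimal sketch of the call with a sequence of header values. It uses a comma as the separator; the `;` case from the question above is deliberately left out.

```python
from email.utils import getaddresses

# One header value containing two comma-separated addresses,
# wrapped in a list because getaddresses() iterates over its argument.
header_values = ['<bob@example.org>, <alice@example.org>']

parsed = getaddresses(header_values)
print(parsed)  # [('', 'bob@example.org'), ('', 'alice@example.org')]
```

Passing a bare string makes the function treat every character as a separate header value, which explains the one-character results in the session above.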

@vstinner
Member Author

Except for the parsedate_tz() function, the Lib/email/_parseaddr.py file hasn't evolved much since it was added in 2002 by commit 030ddf7. The file was created from Lib/rfc822.py, which was added in 1992 (commit 01ca336).

The latest major change was done in... 1997 with commit be7c45e

New address parser by Ben Escoto replaces Sjoerd Mullender's parseaddr()

The latest minor change was done in 2019 to fix CVE-2019-16056: commit 8cb65d1 of issue #78336.

@vstinner
Member Author

What if the input is '"Jane Doe" <jane@example.net>, "John Doe" <john@example.net>'?

Oh, realname_comma_re replaces "Jane Doe" <jane@example.net>, "John Doe" <john@example.net> with "Jane Doe <john@example.net> which is invalid...
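
The overmatch is easy to see by searching with the pattern directly. This sketch only shows what the regex matches; it makes no assumption about what the PR code then does with the match.

```python
import re

realname_comma_re = re.compile(r'"[^"]*,[^"]*"')

# Two addresses, each with a quoted real name that contains no comma.
two_addresses = '"Jane Doe" <jane@example.net>, "John Doe" <john@example.net>'
match = realname_comma_re.search(two_addresses)
print(match.group())  # " <jane@example.net>, "

# The intended case: a comma inside a single quoted real name.
one_address = '"Doe, Jane" <jane@example.net>'
print(realname_comma_re.search(one_address).group())  # "Doe, Jane"
```

In the first input the match runs from the closing quote of one real name to the opening quote of the next, so the pattern treats the text between two names as if it were a quoted name.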

@vstinner vstinner marked this pull request as draft October 30, 2023 14:20
frenzymadness pushed a commit to frenzymadness/cpython that referenced this pull request Dec 22, 2023
…n email.parseaddr() (python#111116)

Detect email address parsing errors and return empty tuple to
indicate the parsing error (old API). Add an optional 'strict'
parameter to getaddresses() and parseaddr() functions. Patch by
Thomas Dwyer.

Co-Authored-By: Thomas Dwyer <github@tomd.tel>

Changes for Python 2:
- Define encoding for test_email
- Adjust import so we don't need change the tests
- Do not use f-strings
- Do not use SubTest
- KW only function arguments are not supported
hroncok pushed a commit to hroncok/cpython that referenced this pull request Jan 17, 2024
@tdwyer
Contributor

tdwyer commented Jan 27, 2024

Thank you @vstinner for picking this up and helping to get this patch merged. I agree, adding a flag to disable the new strict behavior is a good idea to avoid any issues it may cause for others.

hrnciar pushed a commit to fedora-python/cpython that referenced this pull request Feb 7, 2024
hrnciar pushed a commit to fedora-python/cpython that referenced this pull request Feb 7, 2024
aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
stratakis pushed a commit to stratakis/cpython that referenced this pull request Feb 27, 2024
hroncok pushed a commit to fedora-python/cpython that referenced this pull request Mar 7, 2024
stratakis pushed a commit to stratakis/cpython that referenced this pull request Mar 11, 2024
stratakis pushed a commit to stratakis/cpython that referenced this pull request Mar 11, 2024
stratakis pushed a commit to stratakis/cpython that referenced this pull request Mar 20, 2024
hrnciar pushed a commit to fedora-python/cpython that referenced this pull request Mar 20, 2024
hrnciar pushed a commit to fedora-python/cpython that referenced this pull request Mar 20, 2024
hrnciar pushed a commit to fedora-python/cpython that referenced this pull request Mar 20, 2024
stratakis pushed a commit to stratakis/cpython that referenced this pull request Mar 20, 2024
stratakis pushed a commit to stratakis/cpython that referenced this pull request Mar 20, 2024
stratakis pushed a commit to stratakis/cpython that referenced this pull request Mar 20, 2024
stratakis pushed a commit to stratakis/cpython that referenced this pull request Mar 25, 2024
hroncok pushed a commit to fedora-python/cpython that referenced this pull request Mar 26, 2024
mcepl pushed a commit to openSUSE-Python/cpython that referenced this pull request Apr 2, 2024
hrnciar pushed a commit to fedora-python/cpython that referenced this pull request Apr 9, 2024
hrnciar pushed a commit to fedora-python/cpython that referenced this pull request Apr 10, 2024
@mcepl
Contributor

mcepl commented Apr 16, 2024

I already have some testing on this backport of the patch for Python 2.7. Comments and reviews are more than welcome.

CVE-2023-27043-email-parsing-errors.patch

@stratakis
Contributor

I already have some testing on this backport of the patch for Python 2.7. Comments and reviews are more than welcome.

CVE-2023-27043-email-parsing-errors.patch

Hey @mcepl. We have already backported this to Python 2.7 in Fedora: fedora-python@5ae7bab

@vstinner
Member Author

Is it time to backport the change to Python 3.8-3.12? So far, nobody reported any regression in Python 3.13.

@mcepl
Contributor

mcepl commented Apr 17, 2024

Of course, I am eager for reviews and comments.

So far, the only issue I have heard about was confusion about str/unicode objects in Python 2.7 (https://bugzilla.suse.com/1222537), which I think may be your problem as well.

Also, I left some comments on that commit.

@encukou
Member

encukou commented Apr 22, 2024

Is it time to backport the change to Python 3.8-3.12

According to this comment, not yet: #102988 (comment)

hrnciar pushed a commit to fedora-python/cpython that referenced this pull request Jun 7, 2024
Labels
type-bug An unexpected behavior, bug, or error type-security A security issue
10 participants