email.parser.Parser hang #81642

Closed
Guido mannequin opened this issue Jul 1, 2019 · 19 comments
Labels
3.7 only security fixes 3.8 only security fixes 3.9 only security fixes topic-email type-security A security issue

Comments

@37ef113c-b05e-4e81-a20a-9a1a5e0c3784
Mannequin

Guido mannequin commented Jul 1, 2019

BPO 37461
Nosy @warsaw, @vstinner, @larryhastings, @ned-deily, @alex, @bitdancer, @postmasters, @maxking, @eamanu, @miss-islington, @tirkarthi, @iritkatriel
PRs
  • [WIP] bpo-37461: Fix email.parser.Parse hang #14551
  • bpo-37461: Fix infinite loop in parsing of specially crafted email headers #14794
  • [3.7] bpo-37461: Fix infinite loop in parsing of specially crafted email headers (GH-14794) #14816
  • [3.6] bpo-37461: Fix infinite loop in parsing of specially crafted email headers (GH-14794) #14817
  • [3.8] bpo-37461: Fix infinite loop in parsing of specially crafted email headers (GH-14794) #14818
  • [3.7] bpo-37461: Fix typo (inifite -> infinite) #15430
  • [3.6] bpo-37461: Fix typo (inifite -> infinite) #15432
  • [3.5] bpo-37461: Fix infinite loop in parsing of specially crafted email headers (GH-14794) #15446
Files
  • reproducer.py
  • reproducer2.py
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-10-19.21:45:01.609>
    created_at = <Date 2019-07-01.03:21:13.231>
    labels = ['type-security', '3.7', '3.8', 'expert-email', '3.9']
    title = 'email.parser.Parser hang'
    updated_at = <Date 2020-10-19.21:45:01.607>
    user = 'https://bugs.python.org/Guido'

    bugs.python.org fields:

    activity = <Date 2020-10-19.21:45:01.607>
    actor = 'barry'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-10-19.21:45:01.609>
    closer = 'barry'
    components = ['email']
    creation = <Date 2019-07-01.03:21:13.231>
    creator = 'Guido'
    dependencies = []
    files = ['48483', '48484']
    hgrepos = []
    issue_num = 37461
    keywords = ['patch', 'security_issue']
    message_count = 19.0
    messages = ['346953', '347014', '347108', '347263', '347953', '348060', '348064', '348070', '348071', '348072', '348850', '348867', '350344', '350345', '350346', '350348', '351289', '378396', '379027']
    nosy_count = 14.0
    nosy_names = ['barry', 'vstinner', 'larry', 'ned.deily', 'alex', 'r.david.murray', 'Nam.Nguyen', 'maxking', 'Guido', 'eamanu', 'miss-islington', 'xtreak', 'Marcin Niemira', 'iritkatriel']
    pr_nums = ['14551', '14794', '14816', '14817', '14818', '15430', '15432', '15446']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue37461'
    versions = ['Python 3.5', 'Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9']

    @37ef113c-b05e-4e81-a20a-9a1a5e0c3784
    Mannequin Author

    Guido mannequin commented Jul 1, 2019

    The following will hang, and consume a large amount of memory:

    from email.parser import Parser
    from email.policy import default

    # Crafted Content-Type header, spelled out as code points.
    payload = "".join(chr(c) for c in [0x43, 0x6f, 0x6e, 0x74, 0x65, 0x6e, 0x74, 0x2d, 0x54, 0x79, 0x70, 0x65, 0x3a, 0x78, 0x3b, 0x61, 0x72, 0x1b, 0x2a, 0x3d, 0x22, 0x73, 0x4f, 0x27, 0x23, 0x61, 0xff, 0xff, 0x27, 0x5c, 0x22])
    # On an affected interpreter this call never returns.
    Parser(policy=default).parsestr(payload)
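
Because the call above hangs and keeps allocating memory on an affected interpreter, it is safer to probe for the problem in a child process with a timeout. The following is a minimal sketch of that idea, not part of the original report; the names `CHILD_SCRIPT` and `finishes_within` are made up for illustration, and the 2-second default is an arbitrary assumption.

```python
import subprocess
import sys

# Script run in the child interpreter: parse whatever arrives on stdin.
CHILD_SCRIPT = """
import sys
from email.parser import Parser
from email.policy import default

text = sys.stdin.buffer.read().decode("utf-8")
Parser(policy=default).parsestr(text)
"""


def finishes_within(payload: str, seconds: float = 2.0) -> bool:
    """Return True if parsing `payload` in a child process ends within `seconds`."""
    try:
        subprocess.run(
            [sys.executable, "-c", CHILD_SCRIPT],
            input=payload.encode("utf-8"),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising, so a hung
        # parser does not keep consuming memory in the background.
        return False
    return True
```

Calling `finishes_within(payload)` with the crafted payload from the report should return `False` on an affected interpreter; a `True` result only says the child exited in time, not that the header was valid.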

    @37ef113c-b05e-4e81-a20a-9a1a5e0c3784 Guido mannequin added type-crash A hard crash of the interpreter, possibly with a core dump 3.9 only security fixes topic-email labels Jul 1, 2019
    @5da58861-5180-44d0-b5d2-ff85c2be17d8
    Mannequin

    MarcinNiemira mannequin commented Jul 1, 2019

    Sounds like there is an infinite loop here:

    (Pdb) 
    > /usr/lib/python3.7/email/_header_value_parser.py(2370)get_parameter()
    -> v.append(token)
    (Pdb) 
    > /usr/lib/python3.7/email/_header_value_parser.py(2365)get_parameter()
    -> while value:
    

    The list v just keeps growing: every pass through the loop calls v.append(token) with another empty token, so it fills up with ValueTerminal(''), ValueTerminal(''), ValueTerminal('')...
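
The failure mode described here can be shown with a toy model: a `while value:` loop whose tokenizer returns an empty token without consuming any input. This is not the stdlib code; `toy_qcontent`, `unguarded_loop` and `guarded_loop` are hypothetical names, and the explicit double-quote branch is just one way to guarantee progress, not necessarily what any patch in this thread does.

```python
def toy_qcontent(value):
    """Hypothetical stand-in for a tokenizer: take text up to the next '"'."""
    end = value.find('"')
    if end == -1:
        end = len(value)
    return value[:end], value[end:]


def unguarded_loop(value, max_steps=5):
    """The risky pattern; max_steps exists only so this demo terminates."""
    tokens = []
    while value and len(tokens) < max_steps:
        # If value starts with '"', the token is empty and value is unchanged.
        token, value = toy_qcontent(value)
        tokens.append(token)
    return tokens


def guarded_loop(value):
    """Same loop, but a leading '"' is consumed explicitly, so value always shrinks."""
    tokens = []
    while value:
        if value[0] == '"':
            token, value = '"', value[1:]
        else:
            token, value = toy_qcontent(value)
        tokens.append(token)
    return tokens


print(unguarded_loop('a"'))  # ['a', '', '', '', '']
print(guarded_loop('a"'))    # ['a', '"']
```

Without the `max_steps` cap, the first loop would append empty tokens forever, which matches the growing list seen in the debugger.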

    I'd be happy to try to fix this.

    @tirkarthi
    Member

    Since the parser can be fed user-supplied input, this looks like a security issue to me, in addition to causing high CPU usage. Feel free to remove the tag if it's not a security issue. Thanks.

    @tirkarthi tirkarthi added 3.7 only security fixes 3.8 only security fixes labels Jul 2, 2019
    @5da58861-5180-44d0-b5d2-ff85c2be17d8
    Mannequin

    MarcinNiemira mannequin commented Jul 4, 2019

    I'm terribly sorry, but I feel I won't be able to fix this issue. Sorry for the fuss. Closing my PR, because it's broken.

    @vstinner vstinner added type-security A security issue and removed type-crash A hard crash of the interpreter, possibly with a core dump labels Jul 15, 2019
    @vstinner
    Member

    >>> bytes([0x43, 0x6f, 0x6e, 0x74, 0x65, 0x6e, 0x74, 0x2d, 0x54, 0x79, 0x70, 0x65, 0x3a, 0x78, 0x3b, 0x61, 0x72, 0x1b, 0x2a, 0x3d, 0x22, 0x73, 0x4f, 0x27, 0x23, 0x61, 0xff, 0xff, 0x27, 0x5c, 0x22])
    b'Content-Type:x;ar\x1b*="sO\'#a\xff\xff\'\\"'

    The following loop in Lib/email/_header_value_parser.py never terminates:

    def get_parameter(value):
        """ attribute [section] ["*"] [CFWS] "=" value

        The CFWS is implied by the RFC but not made explicit in the BNF.  This
        simplified form of the BNF from the RFC is made to conform with the RFC BNF
        through some extra checks.  We do it this way because it makes both error
        recovery and working with the resulting parse tree easier.
        """
        ...
        if remainder is not None:
            ...
            while value:
                ...
    

    The attached reproducer.py is the code from the initial message, msg346953.

    reproducer2.py simplifies the input and calls get_parameter() directly. Simplified input string:

    r*="'a'\"
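
The attached reproducer2.py is not reproduced in this thread. As a rough sketch of what a direct call might look like, assuming only that `get_parameter` lives in the private module named above: the wrapper `try_get_parameter` and the constant `SIMPLIFIED` are hypothetical names introduced here.

```python
from email import errors
from email._header_value_parser import get_parameter  # private module, may change

# The simplified input from the comment above, written as a Python literal:
# r * = " ' a ' \ "   (9 characters)
SIMPLIFIED = 'r*="\'a\'\\"'


def try_get_parameter(text):
    """Call get_parameter() and return its (parameter, remaining_text) result.

    Returns None if the parser rejects the text with HeaderParseError.
    """
    try:
        return get_parameter(text)
    except errors.HeaderParseError:
        return None
```

On an affected interpreter, `try_get_parameter(SIMPLIFIED)` is expected to hang, so it is best run in a child process with a timeout or only after confirming the fix is present.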

    @maxking
    Contributor

    maxking commented Jul 17, 2019

    I have proposed a PR for this: #14794

    Reviews are welcome.

    @37ef113c-b05e-4e81-a20a-9a1a5e0c3784
    Mannequin Author

    Guido mannequin commented Jul 17, 2019

    I used fuzzing to find this bug. After applying your patch, the infinite loop is gone and the fuzzer finds no other bugs of this nature.

    @warsaw
    Member

    warsaw commented Jul 17, 2019

    New changeset a4a994b by Barry Warsaw (Abhilash Raj) in branch 'master':
    bpo-37461: Fix infinite loop in parsing of specially crafted email headers (GH-14794)
    a4a994b

    @miss-islington
    Contributor

    New changeset 391511c by Miss Islington (bot) in branch '3.7':
    bpo-37461: Fix infinite loop in parsing of specially crafted email headers (GH-14794)
    391511c

    @miss-islington
    Contributor

    New changeset 6816ca3 by Miss Islington (bot) in branch '3.8':
    bpo-37461: Fix infinite loop in parsing of specially crafted email headers (GH-14794)
    6816ca3

    @tirkarthi
    Member

    3.5 also seems to be affected. git cherry-pick applies cleanly and the patch fixes the problem, so I assume the backport would be straightforward. Since 3.5 is still open for security fixes, with 3.5.8 as the next release, I am adding Larry.

    $ git cherry-pick a4a994b
    Performing inexact rename detection: 100% (566740/566740), done.
    [detached HEAD 9877e9283c] bpo-37461: Fix infinite loop in parsing of specially crafted email headers (GH-14794)
     Author: Abhilash Raj <maxking@users.noreply.github.com>
     Date: Wed Jul 17 09:44:27 2019 -0700
     3 files changed, 12 insertions(+)
     create mode 100644 Misc/NEWS.d/next/Security/2019-07-16-08-11-00.bpo-37461.1Ahz7O.rst

    @ned-deily
    Member

    New changeset 1789bbd by Ned Deily (Miss Islington (bot)) in branch '3.6':
    bpo-37461: Fix infinite loop in parsing of specially crafted email headers (GH-14794) (GH-14817)
    1789bbd

    @ned-deily
    Member

    New changeset 799e4e5 by Ned Deily (GeeTransit) in branch '3.7':
    [3.7] bpo-37461: Fix typo (inifite -> infinite) (GH-15430)
    799e4e5

    @ned-deily
    Member

    New changeset f1f9c0c by Ned Deily (GeeTransit) in branch '3.6':
    [3.6] bpo-37461: Fix typo (inifite -> infinite) (GH-15432)
    f1f9c0c

    @ned-deily
    Member

    xtreak, would you be willing to make the PR for 3.5? That will make it easier for Larry to decide whether to include it in a 3.5 security release.

    @maxking
    Contributor

    maxking commented Aug 24, 2019

    I manually created a backport PR for 3.5 and added Larry as a reviewer.

    #15446

    @larryhastings
    Contributor

    New changeset c28e4a5 by larryhastings (Abhilash Raj) in branch '3.5':
    [3.5] bpo-37461: Fix infinite loop in parsing of specially crafted email headers (GH-14794) (GH-15446)
    c28e4a5

    @iritkatriel
    Member

    This seems complete, can it be closed?

    @warsaw
    Member

    warsaw commented Oct 19, 2020

    Thanks everyone for the fixes; I think this bug is now resolved. Closing.

    @warsaw warsaw closed this as completed Oct 19, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022