Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

email 3.0+ stops parsing headers prematurely #42984

Closed
msapiro mannequin opened this issue Mar 6, 2006 · 3 comments
Closed

email 3.0+ stops parsing headers prematurely #42984

msapiro mannequin opened this issue Mar 6, 2006 · 3 comments
Assignees
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@msapiro
Copy link
Mannequin

msapiro mannequin commented Mar 6, 2006

BPO 1443866
Nosy @warsaw, @terryjreedy, @msapiro, @bitdancer
Files
  • junk.txt: Example message
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/bitdancer'
    closed_at = <Date 2010-08-06.20:39:35.954>
    created_at = <Date 2006-03-06.04:10:02.000>
    labels = ['type-bug', 'library']
    title = 'email 3.0+ stops parsing headers prematurely'
    updated_at = <Date 2010-08-06.20:39:35.953>
    user = 'https://github.com/msapiro'

    bugs.python.org fields:

    activity = <Date 2010-08-06.20:39:35.953>
    actor = 'r.david.murray'
    assignee = 'r.david.murray'
    closed = True
    closed_date = <Date 2010-08-06.20:39:35.954>
    closer = 'r.david.murray'
    components = ['Library (Lib)']
    creation = <Date 2006-03-06.04:10:02.000>
    creator = 'msapiro'
    dependencies = []
    files = ['1913']
    hgrepos = []
    issue_num = 1443866
    keywords = []
    message_count = 3.0
    messages = ['27686', '113047', '113134']
    nosy_count = 4.0
    nosy_names = ['barry', 'terry.reedy', 'msapiro', 'r.david.murray']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue1443866'
    versions = ['Python 3.1']

    @msapiro
    Copy link
    Mannequin Author

    msapiro mannequin commented Mar 6, 2006

    given the following input:

    Received: by main.example.com; Sun Nov 4 02:45:50 2001
    X-From_: postmaster@example.com Sun Nov 4 02:45:50 2001

    From bob Sun Nov 4 02:45:50 2001
    Return-Path: <postmaster@example.com>
    Delivered-To: bob@example.com

    followed by more headers and message body, the email
    3.0+ parser parses everything beginning with the

    From bob Sun Nov 4 02:45:50 2001

    line as the body of the message with only the first two
    lines as the header. RFC 2822 is clear that the message
    headers are separated from the body by an empty line,
    so I think the parser should continue parsing
    everything as headers until an empty line or the end of
    input is encountered, and should consider lines such as

    From bob Sun Nov 4 02:45:50 2001

    or

    Some arbitrary text

    encountered in the headers to be MalformedHeaderDefect.
    A complete example message is atteched.

    @msapiro msapiro mannequin assigned warsaw Mar 6, 2006
    @msapiro msapiro mannequin added the stdlib Python modules in the Lib dir label Mar 6, 2006
    @msapiro msapiro mannequin assigned warsaw Mar 6, 2006
    @msapiro msapiro mannequin added the stdlib Python modules in the Lib dir label Mar 6, 2006
    @devdanzin devdanzin mannequin added type-bug An unexpected behavior, bug, or error labels Mar 21, 2009
    @warsaw warsaw assigned bitdancer and unassigned warsaw May 5, 2010
    @terryjreedy
    Copy link
    Member

    The email module has several problems. RDM is working on overhauling the email module for 3.2. Existing issues may not get individual attention.

    @bitdancer
    Copy link
    Member

    It's not clear to me that this is a valid bug. It is true that the RFC says that a blank line preceeds the body. However, the line in question is not a valid header line. Mail parsers trying to implement the "be liberal in what you accept" portion of Postel's law should parse messages that where the blank line between the headers and body is missing. With the input given, there are three valid Postel interpretations: the body starts at the >From line, the >From line is missing a folding indent and is part of the value of the preceding header, and the >From line is garbage and should be discarded.

    Since a leading >From is a token that occurs validly with reasonable frequency in message bodies and is never valid in message headers, I think the current choice is a sane one. A smarter heuristic might look at the subsequent line and note that they look like headers, but headers can occur in the body of messages, so that heuristic would probably be wrong more often than it was right. Especially considering that putting headers in a message body is the time when you are most likely to see the leading '>From ' token, since it would be quoting the mbox 'From ' header.

    So, I'm closing this bug as rejected. (Rejected rather than invalid, since reasonable people can certainly disagree about the best heuristics for handling invalid data.)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants