email.parser.Parser is inefficient with large strings #52257

Closed
marcio mannequin opened this issue Feb 24, 2010 · 2 comments
Labels
performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@marcio
Mannequin

marcio mannequin commented Feb 24, 2010

BPO 8009
Nosy @bitdancer
Files
  • test.py: Simple speed test
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/bitdancer'
    closed_at = <Date 2010-12-28.04:24:37.894>
    created_at = <Date 2010-02-24.09:44:03.496>
    labels = ['library', 'performance']
    title = 'email.parser.Parser is inefficient with large strings'
    updated_at = <Date 2010-12-28.04:24:37.892>
    user = 'https://bugs.python.org/marcio'

    bugs.python.org fields:

    activity = <Date 2010-12-28.04:24:37.892>
    actor = 'r.david.murray'
    assignee = 'r.david.murray'
    closed = True
    closed_date = <Date 2010-12-28.04:24:37.894>
    closer = 'r.david.murray'
    components = ['Library (Lib)']
    creation = <Date 2010-02-24.09:44:03.496>
    creator = 'marcio'
    dependencies = []
    files = ['16355']
    hgrepos = []
    issue_num = 8009
    keywords = []
    message_count = 2.0
    messages = ['100019', '124760']
    nosy_count = 2.0
    nosy_names = ['r.david.murray', 'marcio']
    pr_nums = []
    priority = 'normal'
    resolution = 'wont fix'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue8009'
    versions = ['Python 2.6']

    @marcio
    Mannequin Author

    marcio mannequin commented Feb 24, 2010

    The email parser class is slow and memory intensive when dealing with sufficiently large strings.

    For example, on a 64-bit Windows 7 machine running at 1.60 GHz, the attached test file gives the following results (time in seconds to parse a 10 MiB string):
    Original: 76.6973627829
    Modified: 0.231140741387
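
    The attached test.py is not reproduced in this thread. As a rough illustration of the kind of measurement described above, here is a minimal sketch written for Python 3 (the issue was filed against Python 2.6). The function name `time_parsestr`, the synthetic message layout, and the 1 MiB size are assumptions made for illustration, not the contents of the attachment, and it only times the stock parser rather than any modified version.

```python
import time
from email.parser import Parser


def time_parsestr(size_bytes):
    """Build a synthetic message of roughly size_bytes and time Parser.parsestr on it."""
    header = "Subject: speed test\n\n"
    line = "x" * 77 + "\n"  # one 78-character body line
    body = line * (size_bytes // len(line))
    text = header + body

    start = time.perf_counter()
    msg = Parser().parsestr(text)
    elapsed = time.perf_counter() - start
    return msg, elapsed


# Hypothetical size: 1 MiB keeps the run short; raise it to approximate the 10 MiB case.
msg, seconds = time_parsestr(1024 * 1024)
print("parsed %d payload characters in %.3f s" % (len(msg.get_payload()), seconds))
```

    Timings will vary with the Python version and hardware, so the numbers reported above should not be expected from this sketch.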

    @marcio marcio mannequin added stdlib Python modules in the Lib dir performance Performance or resource usage labels Feb 24, 2010
    @bitdancer
    Member

    Parser is a legacy API, and message_from_string (which uses it) is just a convenience function. If performance is an issue for your application, call feedparser directly and optimize the feeding to suit your application.
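
    A minimal sketch of what calling the feed parser directly might look like, assuming Python 3 and `email.feedparser.FeedParser`. The helper name `parse_in_chunks` and the chunk sizes are hypothetical choices for illustration; the right chunk size depends on the application and should be measured.

```python
from email.feedparser import FeedParser


def parse_in_chunks(text, chunk_size=64 * 1024):
    """Feed text to a FeedParser in fixed-size slices and return the parsed message."""
    parser = FeedParser()
    for start in range(0, len(text), chunk_size):
        # feed() accepts partial lines; the parser buffers them until a newline arrives.
        parser.feed(text[start:start + chunk_size])
    # close() finishes parsing and returns the root message object.
    return parser.close()


raw = "Subject: demo\nFrom: a@example.com\n\n" + "line of body text\n" * 1000
msg = parse_in_chunks(raw, chunk_size=4096)
print(msg["Subject"], len(msg.get_payload()))
```

    When the data comes from a file or socket, the same loop can feed each block as it is read instead of building one large string first. For bytes input, `email.feedparser.BytesFeedParser` offers the same `feed()`/`close()` interface.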

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022