regex version pin breaks Python 3 support #23

Closed
wichert opened this issue Dec 20, 2013 · 6 comments

@wichert
Contributor

wichert commented Dec 20, 2013

Currently flanker has a hard version pin on a very old version of the regex package. That version does not support Python 3, which means flanker cannot be used in Python 3 projects either.

The version pin has a comment that indicates that this is done for performance reasons. I am wondering a few things:

  • Is there a benchmark script that one can run to test this performance problem? That would make it possible to discuss this with the regex developers.
  • Which regex version showed this degradation?
  • For many sites slow performance in flanker is not a problem. Could you consider applying the same pinned/unpinned trick you added for #20 (Impossible to install due to version pins) to regex as well?
@cool-RR

cool-RR commented Dec 23, 2013

I am affected by this bug as well.

@russjones
Contributor

Hey guys, we are looking into this. I'll try and figure out what the issue was with the regex library as soon as possible.

@wichert
Contributor Author

wichert commented Jan 26, 2014

@russjones can you give us a quick status update?

@russjones
Contributor

Hi @wichert @cool-RR, I'll discuss this with the original developer who set that requirement.

@russjones
Contributor

Hi @wichert @cool-RR, I discussed this with the original developer and the regex team, and we unpinned the version on master. It shouldn't be an issue anymore, but if you do see a serious performance degradation, you can always revert to the stable branch.

@russjones
Contributor

By the way, we just updated the benchmarks for Flanker, comparing the pinned version on the stable branch with the unpinned version on master. For large files, there is still a 6x performance improvement if you use the pinned version.

You can test it yourself using the following script:

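# Runs unchanged on Python 2 and Python 3.
from __future__ import print_function

import sys
import time

from flanker import mime
from flanker.mime import MimeError

def parse(data, count):
    """Parse `data` `count` times and return the per-run wall-clock times."""
    times = []

    for _ in range(count):
        try:
            start = time.time()
            msg = mime.from_string(data)
            for p in msg.walk():
                if p.is_attachment():
                    p.body  # touch the body so the attachment is included in the timing
            times.append(time.time() - start)
        except MimeError:
            print("Bad MIME")
            mime.recover(data)

    return times

if __name__ == "__main__":
    # Usage: python <this script> <path to a MIME message file>
    with open(sys.argv[1]) as f:
        data = f.read()

    times = parse(data, 50)
    print('Mean:', sum(times) / len(times))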
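
If you want a bit more than the mean when comparing the two branches, something along these lines may help. This is only a sketch and not part of flanker: `summarize` and `speedup` are hypothetical helper names, the numbers in the example are made up, and it assumes Python 3.4+ for the standard `statistics` module. It takes the list of timings the script above collects and reports the median and range as well, which are less sensitive to a single slow run.

```python
import statistics

def summarize(times):
    """Return basic statistics for a list of per-run timings in seconds."""
    if not times:
        raise ValueError("no successful runs to summarize")
    return {
        "runs": len(times),
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "min": min(times),
        "max": max(times),
    }

def speedup(baseline_times, candidate_times):
    """Ratio of median timings; a value above 1 means the candidate is faster."""
    return statistics.median(baseline_times) / statistics.median(candidate_times)

if __name__ == "__main__":
    # Made-up numbers purely to show the call shape, not real measurements.
    unpinned = [0.60, 0.62, 0.61]
    pinned = [0.10, 0.11, 0.10]
    print(summarize(pinned))
    print("speedup:", speedup(unpinned, pinned))
```

Run the benchmark once per branch, save each list of timings, and pass both lists to `speedup` to get a single ratio to compare against the figure quoted above.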
