Computed-goto patch for RE engine #51842
Part of Unladen Swallow's roadmap is to use a threaded-interpreter The current patch is attached. To try it: run configure Still to do:
In principle, nothing looks wrong. There are some tabs-vs-spaces It needs some benchmarks to know whether it's efficient. Also, I think
_sre is listed in Modules/Setup, so it will be a built-in module by default.
I finally got around to benchmarking this change, and unfortunately the results are not good. I used the regex tests in the Unladen Swallow test suite, regex_effbot and regex_v8. The tests are written for Python 2.x, but the fixes for 3.x are straightforward (use print() in one function; replace xrange with range in the bm_regex_effbot.py and bm_regex_v8.py files; remove a few uses of u''; I'll provide a patch later).

Hardware: MacBook, Intel Core 2 Duo, 1.83 GHz, 2 MB L2 cache, 667 MHz bus.

Tests invoked with:
./perf.py -b regex_effbot -r -v ../py3k/python.exe ../threaded-3000/python.exe

regex_effbot is 1.1002 times slower with the computed-goto patch. I'd like to see a few people replicate these results -- maybe the effect is very platform dependent or I ran the tests incorrectly -- but on current evidence, this patch is not worth pursuing.
Actually, I really want someone to verify that measurement. As a control, I tried running the call_method benchmark (after a few more xrange fixes). The Python 3.x trunk version with my patch is measured as 1.0227x slower, even though the patch only touches the re module and call_method doesn't use the module at all. I recompiled both binaries; both builds are using the same compiler arguments; both have the same version from trunk. I'm mystified about why the patched version is slower.
You should disassemble the output (or produce assembler from gcc) and check that the various indirect jumps at the end of each case block don't get merged into a single shared indirect jump. Or perhaps it's simply that regular expression matching isn't really sensitive to bytecode dispatch overhead.
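For readers unfamiliar with the technique under discussion, here is a minimal sketch of switch dispatch next to computed-goto dispatch. It is not taken from the attached patch: the opcodes, the accumulator, and the names run_switch, run_goto and DISPATCH are all hypothetical, and the threaded variant relies on the GNU C "labels as values" extension, so it falls back to the switch version on other compilers.

```c
/* Hypothetical opcodes for illustration only; these are NOT the real _sre opcodes. */
enum { OP_HALT = 0, OP_INC = 1, OP_DOUBLE = 2, OP_DEC = 3 };

/* Baseline: a single switch, so every opcode goes through one shared
 * dispatch point at the top of the loop. */
static long run_switch(const unsigned char *code)
{
    long acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_DEC:    acc -= 1; break;
        case OP_HALT:   return acc;
        default:        return -1;  /* unknown opcode */
        }
    }
}

#if defined(__GNUC__)
/* Threaded variant: each handler ends with its own indirect jump.
 * Assumes every byte in `code` is a valid opcode (no range check). */
static long run_goto(const unsigned char *code)
{
    /* Table order must match the opcode values above. */
    static void *const targets[] = { &&do_halt, &&do_inc, &&do_double, &&do_dec };
    long acc = 0;

#define DISPATCH() goto *targets[*code++]
    DISPATCH();
do_inc:    acc += 1; DISPATCH();
do_double: acc *= 2; DISPATCH();
do_dec:    acc -= 1; DISPATCH();
do_halt:   return acc;
#undef DISPATCH
}
#else
/* No labels-as-values support: fall back to the switch loop. */
static long run_goto(const unsigned char *code) { return run_switch(code); }
#endif
```

Both functions should return the same result for the same program, for example the byte sequence OP_INC, OP_INC, OP_DOUBLE, OP_DEC, OP_HALT. To check the concern raised in the comment above, one option is to generate assembler with gcc -O2 -S and count the indirect jump instructions in run_goto: if only one appears, the compiler has probably merged the per-handler jumps back into a shared one, which would remove the expected benefit.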