Computed-goto patch for RE engine #51842

akuchling · 2009-12-29T02:11:30Z

BPO	7593
Nosy	@loewis, @akuchling, @birkenfeld, @pitrou, @ezio-melotti
Files	re-computed-goto-v1.txt: re-computed-goto-v1.txt

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2013-11-06.20:23:25.718>
created_at = <Date 2009-12-29.02:11:30.382>
labels = ['expert-regex', 'performance']
title = 'Computed-goto patch for RE engine'
updated_at = <Date 2013-11-06.20:23:25.717>
user = 'https://github.com/akuchling'

bugs.python.org fields:

activity = <Date 2013-11-06.20:23:25.717>
actor = 'akuchling'
assignee = 'none'
closed = True
closed_date = <Date 2013-11-06.20:23:25.718>
closer = 'akuchling'
components = ['Regular Expressions']
creation = <Date 2009-12-29.02:11:30.382>
creator = 'akuchling'
dependencies = []
files = ['15690']
hgrepos = []
issue_num = 7593
keywords = ['patch', 'needs review']
message_count = 6.0
messages = ['96984', '97038', '97097', '99471', '99473', '99478']
nosy_count = 6.0
nosy_names = ['loewis', 'akuchling', 'georg.brandl', 'pitrou', 'cdleary', 'ezio.melotti']
pr_nums = []
priority = 'normal'
resolution = 'wont fix'
stage = 'patch review'
status = 'closed'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue7593'
versions = ['Python 3.4']

akuchling · 2009-12-29T02:11:29Z

Part of Unladen Swallow's roadmap is to use a threaded-interpreter
technique for the regular expression engine. That sounded like an
interesting idea, so I went ahead and tried to implement it.

The current patch is attached. To try it: run configure
--with-computed-gotos; apply the patch; and compile Python.

Still to do:

Benchmarking, to determine if it's actually an improvement.
Possibly integrate construction of the Modules/re_opcodes.h file into
the build process. The current file is supplied; to rebuild it, run
"python Modules/makereopcodes.py > Modules/re_opcodes.h". (But it only
needs rebuilding if you add new regex opcodes, which is rarely done --
maybe it's sufficient to include a reminder to update it plus the script).

pitrou · 2009-12-30T16:36:48Z

On the principle nothing looks wrong. There are some tabs-vs-spaces
issues in _sre.c. Can some out-of-bounds crash be triggered if an opcode
is greater than 33?

It needs some benchmarks to know whether it's efficient. Also, I think
it would be nice to include regeneration of the opcode table in the
build process. Since it's very light-weight it can be done
unconditionally (I suppose it will be done from setup.py, right?).

birkenfeld · 2009-12-31T14:54:16Z

_sre is listed in Modules/Setup, so it will be a built-in module by default.

akuchling · 2010-02-17T14:36:50Z

I finally got around to benchmarking this change, and unfortunately the results are not good.

I used the regex tests in the Unladen Swallow test suite, regex_effbot and regex_v8. The tests are written for Python 2.x, but the fixes for 3.x are straightforward (use print() in one function; replace xrange with range in the bm_regex_effbot.py and bm_regex_v8.py files; remove a few uses of u''; I'll provide a patch later.)

Hardware: MacBook, Intel Core 2 Duo, 1.83GHz, 2MB L2 cache, 667 MHz bus.

Tests invoked with ./perf.py -b regex_effbot -r -v ../py3k/python.exe ../threaded-3000/python.exe

regex_effbot is 1.1002 times slower with the computed-goto patch, and
regex_v8 is 1.0081x slower. perf.py indicates that both results are significant.

I'd like to see a few people replicate these results -- maybe the effect is very platform dependent or I ran the tests incorrectly -- but on current evidence, this patch is not worth pursuing.

akuchling · 2010-02-17T15:13:58Z

Actually, I really want someone to verify that measurement. As a control, I tried running the call_method benchmark (after a few more xrange fixes). The Python 3.x trunk version with my patch is measured as 1.0227x slower, even though the patch only touches the re module and call_method doesn't use the module at all. I recompiled both binaries; both builds are using the same compiler arguments; both have the same version from trunk. I'm mystified about why the patched version is slower.

pitrou · 2010-02-17T18:50:13Z

You should disassemble the output (or produce assembler from gcc) and check that the various indirect jumps at the end of each case block don't get merged into a single shared indirect jump.

Or perhaps it's simply that regular expression matching isn't really sensitive to bytecode dispatch overhead.

akuchling added the topic-regex label Dec 29, 2009

ezio-melotti added the performance Performance or resource usage label Dec 29, 2009

birkenfeld changed the title ~~Computed-goto patch~~ Computed-goto patch for RE engine Dec 30, 2009

akuchling closed this as completed Nov 6, 2013

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Computed-goto patch for RE engine #51842

Computed-goto patch for RE engine #51842

akuchling commented Dec 29, 2009

akuchling commented Dec 29, 2009

pitrou commented Dec 30, 2009

birkenfeld commented Dec 31, 2009

akuchling commented Feb 17, 2010

akuchling commented Feb 17, 2010

pitrou commented Feb 17, 2010

Computed-goto patch for RE engine #51842

Computed-goto patch for RE engine #51842

Comments

akuchling commented Dec 29, 2009

akuchling commented Dec 29, 2009

pitrou commented Dec 30, 2009

birkenfeld commented Dec 31, 2009

akuchling commented Feb 17, 2010

akuchling commented Feb 17, 2010

pitrou commented Feb 17, 2010