IntFlag makes re.compile slower #75852
Comments
The flags exposed by the re module have been IntFlag since Python 3.6. Since they are passed down to sre_parse and sre_compile, this makes re.compile() slower.
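To make the claimed overhead concrete, here is an illustrative microbenchmark (my sketch, not taken from the report): it compares the bitwise test that sre_parse performs many times with a plain int flag versus a RegexFlag (IntFlag) flag. The numbers it prints will vary by machine.

    # Illustrative sketch (not from the original report): the cost of the
    # "flags & SOME_BIT" tests that sre_parse does during compilation, with a
    # plain int flag versus an IntFlag (re.RegexFlag) flag.
    import re
    import timeit

    VERBOSE = int(re.VERBOSE)   # the bit tested by sre_parse for verbose mode

    plain   = timeit.timeit("flags & VERBOSE",
                            globals={"flags": 2, "VERBOSE": VERBOSE})
    intflag = timeit.timeit("flags & VERBOSE",
                            globals={"flags": re.IGNORECASE, "VERBOSE": VERBOSE})

    print(f"int & int:     {plain:.3f} s per 1,000,000 loops")
    print(f"IntFlag & int: {intflag:.3f} s per 1,000,000 loops")

The IntFlag case goes through enum's __and__ and member lookup in Python, which is where the extra cost comes from.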
See also bpo-28637: using IntFlag in the re module also impacted Python startup time. That was "fixed" by getting rid of re in site.py.
Oh, Python already accepts floating point numbers:

haypo@selma$ ./python
Python 3.7.0a1+ (heads/master-dirty:efb560eee2, Oct 3 2017, 12:15:58)
[GCC 7.2.1 20170915 (Red Hat 7.2.1-2)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.compile("[a-z]", flags=re.I).match("A")
<_sre.SRE_Match object; span=(0, 1), match='A'>
>>> re.I
<RegexFlag.IGNORECASE: 2>
>>> re.compile("[a-z]", flags=2).match("A")
<_sre.SRE_Match object; span=(0, 1), match='A'>
# and now a float
>>> re.compile("[a-z]", flags=2.0).match("A")
<_sre.SRE_Match object; span=(0, 1), match='A'>
:) This is due to caching. 2.0 == 2.
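A simplified sketch of why the float hits the cache (a stand-in dict, not the actual re internals): dict lookup compares keys with ==, and 2.0 == 2 with hash(2.0) == hash(2), so a float flag finds the entry that was stored for the int flag.

    # Stand-in for re's internal pattern cache (not the real implementation):
    # a dict keyed on (pattern, flags) treats 2 and 2.0 as the same key.
    cache = {}

    def fake_compile(pattern, flags):
        key = (pattern, flags)        # the real cache key also includes type(pattern)
        if key in cache:
            return cache[key]
        result = f"compiled({pattern!r}, flags={flags!r})"  # stand-in for sre_compile.compile()
        cache[key] = result
        return result

    print(fake_compile("[a-z]", 2))      # stores the entry with an int flag
    print(fake_compile("[a-z]", 2.0))    # 2.0 == 2, so this returns the cached entry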
Oh, right. Strange.

>>> re.compile("e", flags=2.0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/haypo/prog/python/master/Lib/re.py", line 240, in compile
return _compile(pattern, flags)
File "/home/haypo/prog/python/master/Lib/re.py", line 288, in _compile
p = sre_compile.compile(pattern, flags)
File "/home/haypo/prog/python/master/Lib/sre_compile.py", line 747, in compile
p = sre_parse.parse(p, flags)
File "/home/haypo/prog/python/master/Lib/sre_parse.py", line 890, in parse
p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
TypeError: unsupported operand type(s) for &: 'float' and 'int'
I proposed PR 3867 to fix the re.compile() cache: check the flags type.
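The PR diff itself is not shown here; one way "check the flags type" could look (a hypothetical sketch, not the actual PR 3867 code) is to include type(flags) in the cache key, so 2 and 2.0 no longer share an entry:

    # Hypothetical sketch of a type-aware cache key; cached_compile and
    # compile_func are illustrative names, not real re internals.
    cache = {}

    def cached_compile(pattern, flags, compile_func):
        key = (type(pattern), pattern, type(flags), flags)
        try:
            return cache[key]
        except KeyError:
            pass
        compiled = compile_func(pattern, flags)
        cache[key] = compiled
        return compiled

With such a key, a float flag would always miss the cache and fail in sre_parse exactly like the uncached call above.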
Victor, how large is the performance regression of your patch?
Serhiy: "Victor, how large is performance regression of your patch?" I tested bm_regex_compile.py of the Python performance benchmark suite: My patch has no impact on *uncached* re.compile() performances according to this benchmark. haypo@selma$ ./python -m perf compare_to ref.json patch.json haypo@selma$ ./python -m perf compare_to ref.json patch.json -v -- On a microbenchmark on the *cache* itself, the difference is visible: 1018 ./python -m perf timeit -s 'import re; re_compile=re.compile' 're_compile("a", flags=2)' --duplicate=1024 -o ref.json --inherit=PYTHONPATH -v haypo@selma$ ./python -m perf compare_to ref.json patch.json If you replace type(flags) with flags.__class__, the change has no impact on performances :-) But obj.__class__ can be different than type(obj). |
I don't think there is a problem here worth solving with PR 3867. It is very unlikely that anyone uses re functions with a pattern and float flags that accidentally match an already cached valid pattern and integer flags. If a float value is passed as a flag, it is likely passed for the first time. Fractional floats are rejected in any case. This is different from the problem with PR 3862.
When proposing to undo recent decisions, please add the people who were involved in making that decision in the first place to the nosy list.
I don't propose reverting IntFlag. |
Nice speedup! I ran a benchmark on PR 3862:

./python ~/prog/python/performance/performance/benchmarks/bm_regex_compile.py --inherit=PYTHONPATH -v -o patch.json

Mean +- std dev: [ref] 396 ms +- 16 ms -> [patch] 347 ms +- 8 ms: 1.14x faster (-12%)

The benchmark works on *uncached* regular expressions, so it really measures re.compile() performance, not the cache.
Thank you for benchmarking. I've added a news entry about it.
Naoki: "I've added news entry about it." Nice. You might also mention the "optimization" in What's New in Python 3.7 doc, in the Optimisations section. The issue here is that it's only faster than Python 3.6, but Python 3.6 was slower than 3.5 :-) The "optimization" just makes Python 3.7 as fast as Python 3.5 ... :-) |
Did you compare the benchmarking results against 3.7 or 3.6?
My latest benchmark is on the master branch: original code ("ref") vs code patched with PR 3862 ("patch").
It makes sense to report the performance gain only in comparison with the previous released version. I expect that re compiling is slower in 3.7 due to new features and optimizations. Generating more optimal code takes time.
I don't really care if Python 3.7 is still slower than 3.6 with PR 3862. The speedup is significant, the change is short and safe, so the PR LGTM :-) We can implement further optimizations later ;-) By the way, speed.python.org can be used to track performance over time. Sadly, I haven't had much time to take care of the website: I have to run the job manually to draw new plots :-)
IntFlag.__and__ does not create a new instance every time -- all new instances are cached in the IntFlag machinery (so RegexFlag(7) is only created once). If all the RegexFlag combinations are created before the regex compile benchmark, do we still see a speed-up?
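Ethan's point about the cache can be checked quickly; assuming the enum machinery caches composite members (which, as far as I know, holds for the CPython versions discussed here), repeating the same combination returns the same object:

    # Quick check of the IntFlag composite cache (assumes composite members
    # are cached by the enum machinery, as described in the comment above).
    import re

    a = re.IGNORECASE | re.MULTILINE | re.DOTALL
    b = re.IGNORECASE | re.MULTILINE | re.DOTALL
    print(a is b)   # expected True: the composite flag object is created only once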
I'm sorry, I misunderstood.
I believe that's what Victor benchmarked.
The perf module always starts with a "warmup" run to fill caches. If enum caches flag instances internally, that warmup run should already fill the enum cache.
INADA Naoki said:
Only the first two calls always happen -- the last two calls only happen for numbers that haven't been seen yet. And yes, two Python-level calls are much slower than an int & int.
Thanks Naoki!
My PR 3867 fixes a corner case where re.compile() is misused ("on purpose"?). I'm going to reject (abandon) my own PR since it makes re.compile() slower:

Mean +- std dev: [ref] 364 ns +- 6 ns -> [patch] 459 ns +- 10 ns: 1.26x slower (+26%)

@serhiy, @naoki: Are you OK with rejecting this PR and keeping the corner-case bug?
Instead of caching type(flags), how about this?

if not isinstance(flags, int):
    raise TypeError(f"flags must be int or RegexFlag, got {flags!r}")
flags = int(flags)
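For illustration, here is how that check would behave, wrapped in a hypothetical helper (the name _normalize_flags and the surrounding code are mine, not part of re):

    # Hypothetical helper wrapping Naoki's suggested check; _normalize_flags
    # is not a real re function, just a name for this sketch.
    import re

    def _normalize_flags(flags):
        if not isinstance(flags, int):    # RegexFlag passes (it subclasses int); float/str do not
            raise TypeError(f"flags must be int or RegexFlag, got {flags!r}")
        return int(flags)                 # drop the IntFlag wrapper, keep the bit values

    print(_normalize_flags(re.I | re.M))  # 10, now a plain int
    print(_normalize_flags(2))            # 2, unchanged
    # _normalize_flags(2.0) would raise TypeError instead of silently hitting the cache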
PR 3867 looks unpythonic to me. We usually don't check the types of arguments. This complicates and slows down the code. Do you have a realistic example where the current behavior causes harm?
Hum, I ran my microbenchmark on the re.compile() cache again:

haypo@selma$ ./python -m perf timeit -s 'import re; re_compile=re.compile' 're_compile("a", flags=2)' --duplicate=1024 -o ref.json --inherit=PYTHONPATH -v

Sadly, commit c1c47c1 made the cache slower:

Mean +- std dev: [ref] 364 ns +- 26 ns -> [patch] 545 ns +- 19 ns: 1.50x slower (+50%)

Just to check, I reverted the change on Scanner; the benchmark is still "560 ns +- 27 ns". Did "if isinstance(flags, RegexFlag): flags = flags.value" add 181 nanoseconds? A quick "isinstance(flags, RegexFlag)" timeit microbenchmark says that this operation costs 46.2 ns (+- 1.6 ns). Benchmarks are strange, as usual :-)
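The exact command used for that last isinstance measurement is not shown in the comment; something along these lines would measure it (an assumption, in the same style as the commands above):

./python -m perf timeit -s 'import re; flags = re.IGNORECASE' 'isinstance(flags, re.RegexFlag)'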
Serhiy: "PR 3867 looks unpythonic to me. We usually don't check the type of arguments. This complicate and slow down a code. Do you have a realistic example when the current behavior harms?" No, I don't have any realistic example. I just noticed the bug and I was surprised. In practice, flags=2.0 instead of flags=2 is not totally wrong, it's okish to accept 2.0. Let's just forget this corner case. I abandon my change. |