-
-
Notifications
You must be signed in to change notification settings - Fork 30.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compile("-"*3000000 + "4", '', mode) causes hard crash #90268
Comments
|
In case it helps track this down. On my system I've tested these two setups: On Windows, on the main branch, python just exists with no message when I run this from the REPL. Also on Windows, with the Cygwin 3.8.12 version, I get MemoryError: Python 3.8.12 (default, Nov 23 2021, 20:18:25)
[GCC 11.2.0] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> eval("-"*3000000 + "4")
s_push: parser stack overflow
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError Those are the only two systems I have available to test with. |
This is the top of the stacktrace I got on Windows in Visual Studio:
|
See also https://bugs.python.org/issue32758 |
Windows, IDLE, 3.10.1: compile("-"*3000000 + "4", '', mode) crashes execution process for any of 'exec', 'eval', 'single'. bpo-42609 is also about 'too high' string multiplication with new compilet, though the exact breaking point in crash dumps seems different. |
eval("-"*3000000 + "4")
in cmd causes hard crasheval("-"*3000000 + "4")
in cmd causes hard crash
This is a stack overflow in the parser, unfortunately, which unfortunately is very difficult to defend against. |
This is a known issue for recursive descendent parsers. The only thing we can do here is somehow limit the maximum stack depth of the C stack, but that can:
|
I worked in the past in a refactor of the math based rules (https://github.com/python/cpython/pull/20696/files) that could prevent **this** particular example, but others could still make the parser crash by stack overflow |
I made a draft PR here: to fix the issue. But we should benchmark and evaluate it before deciding anything. |
It seems that the PR was merged without discussion about 85% regression in python_startup benchmark. https://speed.python.org/timeline/?ben=python_startup https://speed.python.org/changes/?rev=e9898bf153 |
Ugh, that's quite bad. We measured performance impact in general and that was quite acceptable but seems that for startup this is quite sensitive :( There isn't many other ways we can do this that I can think of unfortunately, so we need to make a decision on what we care most here, unless someone has a better idea on how we can overcome the recursion problem. Adding Guido and Eric as they gave been working on startup quite a lot. |
I am not able to reproduce on Linux either, with pyperformance or manual testing in the CLI. Interestingle, this shows up in both machines: |
Does python_startup benchmark start with all modules parsed and __pycache__d, or with no cache, so it includes the normally one-time parse time? |
I don't know what pyperf does with the cache (adding Victor as maybe he knowns). |
Ran pyperformance with PGO/LTO CPU-isol on my Linux box and I cannot reproduce either: ❯ pyperf compare_to json/* --table --table-format=md -G
|
Maybe something unrelated changed on the benchmark machines? (Like installing a new version of pyperformance???) Though it happened across both benchmark machines. What configuration and flags are being used to run the benchmark suite on that machine? |
Very unlikely, it happened on two separate machines with different distributions and nothing was updated in the machines that I can see. Both use a configuration file for pyperformance that looks like this: ------------- $ cat bench.conf [config] [scm] [compile] [run_benchmark] [upload] [compile_all] [compile_all_revisions] |
I propose a test: revert the PR and see if speed.Python.org shows a speedup |
Ok, let's do that and see what happens |
I wrote a tiny script that calls compile() on raw bytes read from some source file, does this 100 times, and reports the total time. I tested the script with Lib/pydoc_data/topics.py (which happens to be the largest source file in the CPython repo, but mostly string literals) and with Lib/test/test_socket.py (the second-largest file). I built python.exe on a Mac with PGO/LTO, from "make clean", both before and after (at) PR 30177. For both files, the difference between the results is well in the noise caused by my machine (I don't have a systematic way to stop background jobs). But it's very clear that this PR cannot have been the cause of an 85% jump in the time taken by the python_startup benchmark in PyPerformance. For topics.py, the time was around 7.2 msec/compile; for test_socket.py, it was around 38. (I am not showing separate before/after numbers because the noise in my data really is embarrassing.) The compilation speed comes down to ~170,000 lines/sec on my machine (an Intel Mac from 2019; 2.6 GHz 6-Core Intel Core i7 running macOS Big Sur 11.6.1; it has clang 12.0.5). It must be something weird on the benchmark machines. I suspect that a new version of some package was installed in the venv shared by all the benchmarks (we are still using PyPerformance 1.0.2) and that affected something, perhaps through a .pth file? |
I ran the benchmarks machines with the revert and seems that it doesn't go back to the previous timing: (check last data point for 9d35ded). So this seems that the difference in speed remains a mystery but is not due to this fix. |
also doesn't show any difference with the revert |
I ran all benchmarks on installed optimized framework builds of 3.9 with the change (-a) and with the revert (-revert). It shows no change: ❯ ./python3.9 -m pyperf compare_to /Volumes/RAMDisk/py39* Benchmark hidden because not significant (35): chameleon, chaos, crypto_pyaes, deltablue, django_template, dulwich_log, json_dumps, json_loads, logging_silent, logging_simple, mako, meteor_contest, nbody, pathlib, pickle, pickle_list, python_startup_no_site, raytrace, regex_compile, scimark_fft, scimark_lu, scimark_monte_carlo, scimark_sor, scimark_sparse_mat_mult, sqlalchemy_declarative, sqlalchemy_imperative, sqlite_synth, sympy_integrate, sympy_sum, tornado_http, unpickle_pure_python, xml_etree_parse, xml_etree_iterparse, xml_etree_generate, xml_etree_process Geometric mean: 1.00x faster |
(that's on M1 Macbook Pro on macOS Monterey) |
At thing at this point we can confidently say that is very very unlike that there is no actual regression. What's going on with the performance servers is something I still cannot explain. I at least can confirm the servers system packages were not updated between these runs but I cannot think of anything that could have influenced that change. I propose to close this issue as we are clearly unable to reproduce said slowdown. |
"is very very unlike that there is no actual regression" I presume you meant "it is very very *likely* that there is no actual regression" This shouldn't hold up releases, but (having spent months trying to improve startup speed) I would still like to get to the bottom of the speed.python.org regression. |
Yes, sorry, that's what I meant :)
Cool, we will proceed with 3.9.10 and 3.11.0a3 tomorrow. |
I'm still wondering why speed.python.org showed such a slowdown, and how we can revert that. Here's a new theory. Thanks to an investigation I did together with Eric, I now suspect that the release of setuptools 60.0.0 on Dec 19 is a smoking gun. PyPerformance (re)installs the latest versions of pip and setuptools. Setuptools 60.0 makes a change in the file distutils-precedence.pth that causes it (by default) to import something called _distutils_hack and to call its add_shim() function. In previous setuptools this was off by default, but in 60.0 it switched to on. See https://github.com/pypa/setuptools/blob/main/CHANGES.rst#v6000 That add_shim() call in turn installs an extra entry in front of sys.meta_path, which does a bunch of extra work. See https://github.com/pypa/setuptools/blob/main/_distutils_hack/__init__.py Pablo, can we change the PyPerformance configuration or the script that runs it to set and export SETUPTOOLS_USE_DISTUTILS=stdlib, to see whether that affects perf at all? (Note that the code in distutils-precedence.pth is executed by site.py in addpackage().) |
More data. On my Mac, with SETUPTOOLS_USE_DISTUTILS=stdlib, using -X importtime I see the following extra modules being imported: import time: 278 | 278 | types
import time: 112 | 112 | _operator
import time: 419 | 531 | operator
import time: 129 | 129 | itertools
import time: 325 | 325 | keyword
import time: 468 | 468 | reprlib
import time: 258 | 258 | _collections
import time: 978 | 2156 | collections
import time: 78 | 78 | _functools
import time: 835 | 3068 | functools
import time: 1359 | 5235 | enum
import time: 138 | 138 | _sre
import time: 497 | 497 | sre_constants
import time: 528 | 1025 | sre_parse
import time: 512 | 1674 | sre_compile
import time: 109 | 109 | _locale
import time: 886 | 886 | copyreg
import time: 671 | 8574 | re
import time: 471 | 471 | warnings
import time: 330 | 801 | importlib
import time: 906 | 10279 | _distutils_hack That's around 10 msec, so in the right ballpark. |
I did a run of pyperformance manually forcing setuptools<60.0 and another with setuptools>=60.0 I can reproduce the timing difference. I assume we can therefore close this issue and maybe open another one thinking about how to deal with the setuptools problem (maybe reaching to the package maintainera). |
If the issue is about how pyperformance creates its virtual environment (how setuptools is installed), I suggest to continue discussion the issue in pyperformance: https://github.com/python/pyperformance/issues/ ;-) |
To close the loop, setuptools 60.4.0 should fix this, see pypa/setuptools#3009. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: