gh-148284: Block inlining of gigantic functions in ceval.c for clang 22#148334
gh-148284: Block inlining of gigantic functions in ceval.c for clang 22#148334Fidget-Spinner merged 6 commits intopython:mainfrom
Conversation
Co-Authored-By: Victor Stinner <vstinner@python.org>
|
Tested with
|
|
So it seems whether the original bug reproduces on main or not is nondeterministic 😨 . However, I still believe this fix is right. |
I built the Python main branch (without the fix) 3 times in a row from scratch ( I built the Python main branch on Fedora Rawhide (clang 22.1.3) with: On the two first builds, Script: import _testcapi
import functools
def py_call():
return _testcapi.stack_pointer()
def call():
return py_call()
start = _testcapi.stack_pointer()
call = functools.partial(call)
top = call()
diff = start - top
print("stack memory per call: %.1f kB" % (diff / 1024))Output on the first two builds: Output on the third build: Note: I installed clang and its dependencies using |
|
To get a more reproducible output, I tried using Using I built the Python main branch 3 times with these options and I got the same result each time: |
|
Ok, now testing the fix. I built Python with the fix 5 times in a row and
Commands: ./configure --enable-optimizations --with-lto --enable-shared CC=clang LD=clang
time make -j14
LD_LIBRARY_PATH=$PWD ./python -m test -v test_callResult: test_call pass successfully and my script outputs Moreover, the C flags are set as expected in Makefile: |
vstinner
left a comment
There was a problem hiding this comment.
LGTM. The fix works as expected according to my manual tests (see above).
In general, I'm not fan of using a specific compiler option depending on the compiler veresion. In this case, I'm fine with it, it's a good trade-off.
|
Thanks @vstinner for the very thorough testing and review :). |
|
Thanks @Fidget-Spinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14. |
…clang 22 (pythonGH-148334) (cherry picked from commit e007631) Co-authored-by: Ken Jin <kenjin@python.org> Co-authored-by: Victor Stinner <vstinner@python.org>
|
GH-148349 is a backport of this pull request to the 3.14 branch. |
It seems that on clang-22, the inliner is too aggressive on _PyEval_EvalFrameDefault when on computed goto interpreter. Together with some strange interaction with the stackref buffer, the function requires 40kB of stack space (!!!) versus the usual 1-2kB normally used.
This sets the inline limit to functions of max 512B stack space (1/4th of normal) allowed to be inlined in
ceval.c. I checked the dissasembly and the new function uses about 2kB of stack.python -m test -v test_callsegfault with clang 22.1.3 build #148284📚 Documentation preview 📚: https://cpython-previews--148334.org.readthedocs.build/