Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-46329: Split calls into precall and call instructions. #30855

Merged

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Jan 24, 2022

Splits most calls (excluding CALL_FUNCTION_EX) into a PRECALL and a CALL.
The two call opcodes CALL_FUNCTION and CALL_METHOD are unified into a single CALL opcode.

We already split method calls into PRECALL_METHOD and CALL_METHOD, this PR adds a PRECALL_FUNCTION and KW_NAMES.

"normal" calls, f(args) are split into PRECALL_FUNCTION; CALL.
"method" calls, o.m(args) are split into PRECALL_METHOD; CALL
Any calls with named arguments have KW_NAMES inserted in between the PRECALL and CALL.

https://bugs.python.org/issue46329

@pablogsal
Copy link
Member

Seems that this PR has caused some segfaults in the buildbots:

https://buildbot.python.org/all/#/builders/441/builds/1345/steps/5/logs/stdio

Is still not clear that is exactly this one, but this is a possible candidate.

@pablogsal
Copy link
Member

YEah, I can reproduce the segfault locally and doesn't happen if I revert this PR:

$ ./python -m test -u all -W --slowest --fail-env-changed   test_lib2to3 -F -v  
  File "/home/pablogsal/github/cpython/Lib/lib2to3/pytree.py", line 786 in _recursive_matches
  File "/home/pablogsal/github/cpython/Lib/lib2to3/pytree.py", line 786 in _recursive_matches
  File "/home/pablogsal/github/cpython/Lib/lib2to3/pytree.py", line 786 in _recursive_matches
  ...                                                                                                                                                                                                                                                                                                                               Extension modules: _testcapi (total: 1)                                                                                                                           [1]    26282 segmentation fault  ./python -m test -u all -W --slowest --fail-env-changed test_lib2to3 -F -v

This is the stack trace:

Program received signal SIGSEGV, Segmentation fault.                                                                                                              0x00000000006a2c74 in _PyEval_EvalFrameDefault (tstate=0xaee548 <_PyRuntime+130792>, frame=0x7ffff5cee4a0, throwflag=0) at Python/ceval.c:1653
1653        _Py_EnsureTstateNotNULL(tstate);                                                                                                                      (gdb)                                                                                                                                                             (gdb) bt                                                                                                                                                          #0  0x00000000006a2c74 in _PyEval_EvalFrameDefault (tstate=0xaee548 <_PyRuntime+130792>, frame=0x7ffff5cee4a0, throwflag=0) at Python/ceval.c:1653
#1  0x000000000051eae9 in _PyEval_EvalFrame (tstate=0xaee548 <_PyRuntime+130792>, frame=0x7ffff5cee4a0, throwflag=0) at ./Include/internal/pycore_ceval.h:53
#2  0x000000000051e6c1 in gen_send_ex2 (gen=0x7ffff5cee450, arg=0x0, presult=0x7fffff7ff360, exc=0, closing=0) at Objects/genobject.c:219
#3  0x000000000051c385 in gen_iternext (gen=0x7ffff5cee450) at Objects/genobject.c:585
#4  0x00000000006b76cf in _PyEval_EvalFrameDefault (tstate=0xaee548 <_PyRuntime+130792>, frame=0x7ffff5da68a0, throwflag=0) at Python/ceval.c:4214
#5  0x000000000051eae9 in _PyEval_EvalFrame (tstate=0xaee548 <_PyRuntime+130792>, frame=0x7ffff5da68a0, throwflag=0) at ./Include/internal/pycore_ceval.h:53
#6  0x000000000051e6c1 in gen_send_ex2 (gen=0x7ffff5da6850, arg=0x0, presult=0x7fffff8015f0, exc=0, closing=0) at Objects/genobject.c:219
#7  0x000000000051c385 in gen_iternext (gen=0x7ffff5da6850) at Objects/genobject.c:585
#8  0x00000000006b76cf in _PyEval_EvalFrameDefault (tstate=0xaee548 <_PyRuntime+130792>, frame=0x7ffff5da6120, throwflag=0) at Python/ceval.c:4214
#9  0x000000000051eae9 in _PyEval_EvalFrame (tstate=0xaee548 <_PyRuntime+130792>, frame=0x7ffff5da6120, throwflag=0) at ./Include/internal/pycore_ceval.h:53
#10 0x000000000051e6c1 in gen_send_ex2 (gen=0x7ffff5da60d0, arg=0x0, presult=0x7fffff803880, exc=0, closing=0) at Objects/genobject.c:219
#11 0x000000000051c385 in gen_iternext (gen=0x7ffff5da60d0) at Objects/genobject.c:585

@markshannon
Copy link
Member Author

I can't reproduce the failure on Ubuntu linux with gcc. Which configure options and platform?

@pablogsal
Copy link
Member

I am using clang, configure with ---with-pydebug and I am running ./python -m test -u all -W --slowest --fail-env-changed test_lib2to3 -F -v

@pablogsal
Copy link
Member

pablogsal commented Jan 28, 2022

I am probably going to revert this unless we have a fix soon as there are 35 buildbots failing

Ugh, seems that this issue is piling up with more issues, we have almost all buildbots failing :(+-

@pablogsal
Copy link
Member

I am blocking the main branch until all these issues are resolved. Please ping me if you have a fix for this

@markshannon
Copy link
Member Author

I can reproduce with clang, but it is mystifying.
I'll look into it further tomorrow.

@markshannon
Copy link
Member Author

It's a C stack overflow.

>>> import _testcapi
>>> def f():
...     yield _testcapi.stack_pointer()
... 
>>> _testcapi.stack_pointer() - next(f())

Spot the odd one out:

GCC Clang
./configure 544 656
./configure --with-pydebug 608 9200

I'd like to raise an exception, rather than segfaulting, but the SC rejected PEP 651.
I'm genuinely puzzled as to why the SC thinks segfaults are better than raising an exception 🤷

@pablogsal
Copy link
Member

pablogsal commented Jan 29, 2022

I'm genuinely puzzled as to why the SC thinks segfaults are better than raising an exception 🤷

I'm not taking in the name of the SC here (and this is not the place to discuss this) but I'm sure we can agree this is an extreme and unhelpful oversimplification of the SC resolution.

Unfortunately seems that this commit creates the stack overflow while it didn't happen before. Even if we raise an exception, that was not happening before so technically will be also a regression.

Unless we have a fix for this, we probably need to revert the commit. Do you know why clang stack overflows but GCC doesn't?

@markshannon
Copy link
Member Author

Do you know why clang stack overflows but GCC doesn't?

As explained above, Clang uses ridiculous amounts of stack space using --with-pydebug

GCC Clang
./configure 544 656
./configure --with-pydebug 608 9200

@markshannon
Copy link
Member Author

Maybe relegate the clang with debug buildbot(s) to unstable?

@pablogsal
Copy link
Member

pablogsal commented Jan 29, 2022

Maybe relegate the clang with debug buildbot(s) to unstable?

Absolutely not. They were passing before, and this breaks the buildbots. The solution is not deactivating previous passing checks.

Notice that any system running the tests under clang will fail (which by default is MacOS systems) so this is clearly not acceptable.

Maybe relegate the clang with debug buildbot(s) to unstable?

Yeah, I get that, but why?

@markshannon
Copy link
Member Author

Why?
Because clang and --with-pydebug uses huge amounts of C stack, so that (on linux) it is uses over 8Mb of stack at recursion depth of 1000. We should not be rejected changes just because they push the amount of C stack used by Clang over an arbitrary threshold of ~8300 bytes.

In release mode, it's fine as the stack consumption is ~600 bytes.

We could reduce the default recursion depth to ~800 in debug mode. That seems the best solution, short term.

Longer term there are two things I want to do:

  1. Make iteration over generators not consume C stack, this will fix the failure with clang, at least for the failing test.
  2. Check for C stack use and raise an exception the stack is near exhaustion. I really think this is essential.

@pablogsal
Copy link
Member

Because clang and --with-pydebug uses huge amounts of C stack, so that (on linux) it is uses over 8Mb of stack at recursion depth of 1000.

That doesn't answer the question I think. The question is why GCC is not going over the limit and clang does. What is this extra stack space? What is this being used for?

We should not be rejected changes just because they push the amount of C stack used by Clang over an arbitrary threshold of ~8300 bytes.

That is subjective. I kind of agree with you, but I don't think this is something it can be decided here in this PR. Technically this makes functions that didn't raise/segfault now change their behaviour in debug mode and that's a backwards incompatible change so at least needs to be discussed in python-dev

@markshannon
Copy link
Member Author

markshannon commented Jan 30, 2022

The question is why GCC is not going over the limit and clang does. What is this extra stack space? What is this being used for?

You'll need to ask the Clang/llvm developers.
My guess is that at -O0 GCC assigns values to registers and Clang/llvm leaves them on the stack in non-overlapping locations. But that's just a guess.

@jkloth
Copy link
Contributor

jkloth commented Jan 30, 2022

Why wouldn't just increasing the stack size for Clang debug builds be a solution? It was deemed an OK resolution when a similar overflow was occurring on Windows in bpo-17206. Also, the cause of increased stack usage for Clang may be for related reasons as in afore mentioned issue.

@pablogsal
Copy link
Member

Why wouldn't just increasing the stack size for Clang debug builds be a solution? It was deemed an OK resolution when a similar overflow was occurring on Windows in bpo-17206. Also, the cause of increased stack usage for Clang may be for related reasons as in afore mentioned issue.

👍

I'm ok with that as long as it doesn't need to be manually configured by the end user (so is automatically done by the configure step).

@vstinner
Copy link
Member

vstinner commented Jan 30, 2022

It was deemed an OK resolution when a similar overflow was occurring on Windows in bpo-17206.

https://bugs.python.org/issue17206 was closed in 2013. Are you sure about the issue number? https://bugs.python.org/issue17206

@jkloth
Copy link
Contributor

jkloth commented Jan 30, 2022

Very. I don't see how being an old, closed issue makes a difference as to the resolution. Mind you, the stack overflow wasn't mentioned until a while into the discussion. msg190623 is where that starts, coincidentally by yourself.

@markshannon
Copy link
Member Author

Rather than increasing the allowed stack, how about setting Clang to -O1 for the debug build?

ambv pushed a commit that referenced this pull request Jan 31, 2022
@vstinner
Copy link
Member

Rather than increasing the allowed stack, how about setting Clang to -O1 for the debug build?

I proposed to enable optimization on Windows debug build when debugging a test_exceptions crash: https://bugs.python.org/issue44348 => this idea was rejected, it's a bad idea: https://bugs.python.org/issue45115

The lib2to3 crash is tracked at https://bugs.python.org/issue46542 I already wrote a fix for test_json: use support.infinite_recursion() which reduces the recusion limit while running the test.

For the lib2to3 crash, my problem is that so far I failed to reproduce it. Does someone know how to reproduce the test_lib2to3 crash?

@vstinner
Copy link
Member

I proposed PR #31035 to prevent the crash without having to revert this change.

@markshannon markshannon deleted the split-calls-into-precall-and-call-part-2 branch September 26, 2023 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants