gh-143361: Pass PY_VECTORCALL_ARGUMENTS_OFFSET in _Py_CallBuiltinClass_StackRefSteal #143367
+3
−1
This PR fixes a missed optimization.
The function _Py_CallBuiltinClass_StackRefSteal uses STACKREFS_TO_PYOBJECTS, which reserves a spare slot in front of the argument array. However, the code was not passing the PY_VECTORCALL_ARGUMENTS_OFFSET flag to the callee. This forced vector-callable types to reallocate and copy the arguments whenever they needed to prepend an argument.
Verification
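The nargsf value inspected in this section packs the positional argument count and the PY_VECTORCALL_ARGUMENTS_OFFSET flag into a single size_t, with the flag in the top bit. A minimal Python model of that encoding, assuming a 64-bit build; decode_nargsf and SIZE_T_BITS are illustrative names, not part of CPython:

```python
# Assumption: size_t is 64 bits wide, as on typical 64-bit builds.
SIZE_T_BITS = 64

# Mirrors the C definition: the flag is the most significant bit of a size_t.
PY_VECTORCALL_ARGUMENTS_OFFSET = 1 << (SIZE_T_BITS - 1)


def decode_nargsf(nargsf):
    """Split nargsf into (positional argument count, offset flag set?).

    Hypothetical helper; the count is recovered the same way the C macro
    PyVectorcall_NARGS does it, by masking out the flag bit.
    """
    nargs = nargsf & ~PY_VECTORCALL_ARGUMENTS_OFFSET
    has_offset = bool(nargsf & PY_VECTORCALL_ARGUMENTS_OFFSET)
    return nargs, has_offset


if __name__ == "__main__":
    for value in (0x1, 0x8000000000000001):
        print(hex(value), decode_nargsf(value))
```

Under this model both 0x1 and 0x8000000000000001 describe a call with one positional argument; only the second tells the callee that it may temporarily overwrite the slot just before the argument array.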
Using LLDB to inspect long_vectorcall when reached via this path:
Before: nargsf = 1
After: nargsf = 0x8000000000000001 (high bit correctly set)
Benchmarks
I ran a small benchmark script that instantiates a class in a tight loop to trigger this path:
gh-143361
Benchmark script
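The script itself is not reproduced here. As a rough sketch of what a benchmark of this kind could look like, the following times repeated calls to a builtin class from inside a function so that the interpreter can specialize the call site. The choice of int (matching the long_vectorcall frame seen in LLDB), the function name, and the iteration count are all assumptions, not the author's actual script:

```python
import time


def instantiate_in_loop(iterations):
    """Call a builtin class repeatedly and return the elapsed seconds.

    Hypothetical benchmark body: int(value) is assumed to reach the
    builtin-class call path once the call site has warmed up.
    """
    value = 7
    start = time.perf_counter()
    for _ in range(iterations):
        int(value)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Warm-up run so the call site is specialized before timing.
    instantiate_in_loop(10_000)
    elapsed = instantiate_in_loop(5_000_000)
    print(f"5,000,000 calls took {elapsed:.3f} s")
```

Running the same script against builds with and without the change would give a before/after comparison; absolute timings depend on the machine and build options.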