Tracing causes module globals to be mutated when calling functions from C #90609
Starting here, but there could be Cython interaction or something else in theory. But, when running the following:
It can happen that module globals in the called function's scope seem to be modified. Oddly enough, to a value used in the locals of that function?! The pandas issue:
has a reproducer (sorry that it takes NumPy and pandas for now). I will paste it at the end here also. I can find that the value is modified by the time the (Reproducible using NumPy 1.21.5 and Pandas 1.3.5, but except maybe pandas due to the Cython version, I don't expect version dependency.) The output of the script is very brief:
The full reproducer script is:

import sys
import numpy as np
import pandas as pd
from numpy.core import numeric

stop = False

def trace(frame, event, arg):
    global stop
    if stop:
        return None
    if np.core.numeric.dtype is not np.dtype:
        print("Something happened here, `np.core.numeric.dtype IS np.dtype`")
        print(frame, event, arg)
        stop = True
    else:
        print(frame, event, arg)
    return trace

sys.settrace(trace)
pd._libs.lib.maybe_convert_objects(np.array([None], dtype=object))

For completeness, the Cython code calling the NumPy function in question is (likely; there is more, this is Cython, I just cut it out a bit :)):

#if CYTHON_FAST_PYCALL
  if (PyFunction_Check(__pyx_t_5)) {
    PyObject *__pyx_temp[3] = {__pyx_t_2, __pyx_t_6, Py_False};
    __pyx_t_15 = __Pyx_PyFunction_FastCall(__pyx_t_5, __pyx_temp+1-__pyx_t_8, 2+__pyx_t_8); if (unlikely(!__pyx_t_15)) __PYX_ERR(0, 2441, __pyx_L1_error)
    __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;
    __Pyx_GOTREF(__pyx_t_15);
    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  } else
#endif
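To make the detection idea in the reproducer easier to try without NumPy or pandas, here is a minimal pure-Python sketch. It does not reproduce the bug itself. It only shows how a trace function can pin down the first event at which a module global stops being the expected object. The module `fake_numeric` and the function `clobber` are hypothetical stand-ins, not anything from NumPy, pandas or Cython.

```python
import sys
import types

# Hypothetical stand-in for the module whose global gets clobbered.
fake_numeric = types.ModuleType("fake_numeric")
fake_numeric.dtype = float
EXPECTED = fake_numeric.dtype

reports = []  # (function name, event, line number, observed value)


def watch(frame, event, arg):
    """Trace function: record the first event at which the global changed."""
    if reports:
        # Already reported once; stop tracing new scopes.
        return None
    current = fake_numeric.__dict__.get("dtype")
    if current is not EXPECTED:
        reports.append((frame.f_code.co_name, event, frame.f_lineno, current))
        return None  # turn off tracing in this scope
    return watch


def clobber():
    """Hypothetical function that simulates the corruption explicitly."""
    x = 1
    fake_numeric.dtype = None  # stands in for the unexpected mutation
    y = 2
    return x + y


sys.settrace(watch)
try:
    clobber()
finally:
    sys.settrace(None)

print(reports)
```

Because `line` events fire before a line executes, the report points at the line after the one that performed the mutation.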
Ahh, a further data point. The name from the module scope that is overwritten IS a parameter name used in the function locals. Strangely, if I modify the tracing to print more:

stop = 0

def trace(frame, event, arg):
    global stop
    if stop > 10:
        return None
    if np.core.numeric.dtype is not np.dtype:
        #print("Something happened here, `np.core.numeric.dtype IS np.dtype`")
        print(np.core.numeric.dtype)
        print(frame, event, arg)
        stop += 1
    else:
        print(frame, event, arg)
    return trace

Then what I get is: None So, upon entering the function, the value is (already) cleared/set to None (which is correct of course for For the fact that it keeps changing during the function run, points very strongly at CPython?
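Since the overwritten global shares its name with a parameter of the called function, one way to narrow things down is to list such name collisions at each `call` event. The following is only a sketch under that assumption: the helper `shadowed_parameter_names`, the `namespace` dict and the `asarray` function are hypothetical and exist just for illustration. It deliberately avoids touching `frame.f_locals`, so the check itself does not materialise the locals.

```python
import sys


def shadowed_parameter_names(frame):
    """Return parameter names of the frame's function that also exist as globals."""
    code = frame.f_code
    # co_varnames starts with positional parameters, then keyword-only ones.
    nparams = code.co_argcount + code.co_kwonlyargcount
    params = code.co_varnames[:nparams]
    return sorted(name for name in params if name in frame.f_globals)


collisions = {}  # function name -> parameter names that shadow a global


def trace(frame, event, arg):
    if event == "call":
        names = shadowed_parameter_names(frame)
        if names:
            collisions[frame.f_code.co_name] = names
    return None  # no line-level tracing needed for this check


# Hypothetical module namespace with a global named like a parameter.
namespace = {"dtype": float}
exec("def asarray(a, dtype=None):\n    return a\n", namespace)

sys.settrace(trace)
try:
    namespace["asarray"]([1])
finally:
    sys.settrace(None)

print(collisions)
```

Any function listed here is a candidate for the kind of overwrite described above; the listing says nothing about whether the overwrite actually happens.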
Can you reproduce this in pure Python? If you can't do either, I'm going to have to assume that this is a NumPy or Pandas bug. Maybe NumPy or Pandas is accessing CPython internals, but not via the C-API, and those internals changed between 3.9 and 3.10?
Thanks for having a look. I have confirmed this is related to Cython (no pandas/NumPy involved) – repro at https://github.com/seberg/bpo46451. What happens under the hood in Cython is probably: Which generates Otherwise, will just close this, and may reopen if Cython hits a wall.
Not reopening for now, but I will note again that (AFAIK) Cython uses So it seems pretty plausible that the bug is in
While I have a repro for Python, I think the pre-release of Cython already fixes it (and I just did not regenerate the C sources when trying, I guess). A