Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracing causes module globals to be mutated when calling functions from C #90609

Closed
seberg mannequin opened this issue Jan 21, 2022 · 6 comments
Closed

Tracing causes module globals to be mutated when calling functions from C #90609

seberg mannequin opened this issue Jan 21, 2022 · 6 comments
Labels
3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@seberg
Copy link
Mannequin

seberg mannequin commented Jan 21, 2022

BPO 46451
Nosy @markshannon, @seberg

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2022-02-01.15:21:06.748>
created_at = <Date 2022-01-21.04:54:22.071>
labels = ['interpreter-core', '3.10', 'type-crash']
title = 'Tracing causes module globals to be mutated when calling functions from C'
updated_at = <Date 2022-02-01.16:32:18.364>
user = 'https://github.com/seberg'

bugs.python.org fields:

activity = <Date 2022-02-01.16:32:18.364>
actor = 'seberg'
assignee = 'none'
closed = True
closed_date = <Date 2022-02-01.15:21:06.748>
closer = 'seberg'
components = ['Interpreter Core']
creation = <Date 2022-01-21.04:54:22.071>
creator = 'seberg'
dependencies = []
files = []
hgrepos = []
issue_num = 46451
keywords = []
message_count = 6.0
messages = ['411082', '411084', '412240', '412276', '412280', '412288']
nosy_count = 2.0
nosy_names = ['Mark.Shannon', 'seberg']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'resolved'
status = 'closed'
superseder = None
type = 'crash'
url = 'https://bugs.python.org/issue46451'
versions = ['Python 3.10']

@seberg
Copy link
Mannequin Author

seberg mannequin commented Jan 21, 2022

Starting here, but there could be Cython interaction or something else in theory. But, when running the following:

  • Python 3.10.1 (not 3.9.9, debug version or not)
  • Setting a tracing function (not setting a trace-function will fix the issue)
  • Running Cython (maybe also C code) calling back into Python (the best I can tell)

It can happen that module globals in the called funtions scope seem to be modified. Oddly enough to a value used in the locals of that function?!

The pandas issue:

https://github.com/pandas-dev/pandas/issues/41935

has a reproducer (sorry that it takes NumPy and pandas for now). I will paste it at the end here also.

I can find that the value is modified by the time the trace function is called. No other "trace" triggers are processed before in this example (Cython calls other functions in NumPy, but all of those are C implemented, so I guess that is why).
The function in question (unlike all?) should further be called with __Pyx_PyFunction_FastCall, so that is probably an important data point: Fastcall protocol seems involved.

(Reproducible using NumPy 1.21.5 and Pandas 1.3.5, but except maybe pandas due to the Cython version, I don't expect version dependency.)

The output of the script is very brief:

Something happened here, `np.core.numeric.dtype IS np.dtype`
<frame at 0x7ff8d84416c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 289, code full> call None

The full reproducer script is:

import sys
import numpy as np
import pandas as pd

from numpy.core import numeric

stop = False
def trace(frame, event, arg):
    global stop
    if stop:
        return None

    if np.core.numeric.dtype is not np.dtype:
        print("Something happened here, `np.core.numeric.dtype IS np.dtype`")
        print(frame, event, arg)
        stop = True
    else:
        print(frame, event, arg)

    return trace

sys.settrace(trace)

pd._libs.lib.maybe_convert_objects(np.array([None], dtype=object))

For completeness, the Cython code calling the NumPy function in question, is (likley, there is more, this is Cython, I just cut it out a bit :)):

  #if CYTHON_FAST_PYCALL
  if (PyFunction_Check(__pyx_t_5)) {
    PyObject *__pyx_temp[3] = {__pyx_t_2, __pyx_t_6, Py_False};
    __pyx_t_15 = __Pyx_PyFunction_FastCall(__pyx_t_5, __pyx_temp+1-__pyx_t_8, 2+__pyx_t_8); if (unlikely(!__pyx_t_15)) __PYX_ERR(0, 2441, __pyx_L1_error)
    __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;
    __Pyx_GOTREF(__pyx_t_15);
    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
  } else
  #endif

@seberg seberg mannequin added 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Jan 21, 2022
@seberg
Copy link
Mannequin Author

seberg mannequin commented Jan 21, 2022

Ahh, a further data-point. The name from the module scope that is overwritten IS a parameter name used in the function locals. Strangly, if I modify the tracing to print more:

stop = 0
def trace(frame, event, arg):
    global stop
    if stop > 10:
        return None

    if np.core.numeric.dtype is not np.dtype:
        #print("Something happened here, `np.core.numeric.dtype IS np.dtype`")
        print(np.core.numeric.dtype)
        print(frame, event, arg)
        stop += 1
    else:
        print(frame, event, arg)

    return trace

Then what I get is:

None
<frame at 0x7faa254396c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 289, code full> call None
None
<frame at 0x7faa254396c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 337, code full> line None
None
<frame at 0x7faa254396c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 340, code full> line None
None
<frame at 0x7faa254396c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 341, code full> line None
None
<frame at 0x7faa254396c0, file '/home/sebastian/forks/numpy/build/testenv/lib/python3.10/site-packages/numpy/core/numeric.py', line 342, code full> line None
bool

So, upon entering the function, the value is (already) cleared/set to None (which is correct of course for dtype=None) and then while the function runs storing into the function locals mutates the module global?

For the fact that it keeps changing during the function run, points very strongly at CPython?

@seberg seberg mannequin changed the title Possibly bad interaction with tracing and cython? Tracing causes module globals to be mutated when calling functions from C Jan 31, 2022
@seberg seberg mannequin changed the title Possibly bad interaction with tracing and cython? Tracing causes module globals to be mutated when calling functions from C Jan 31, 2022
@markshannon
Copy link
Member

Can you reproduce this in pure Python?
If not, can you produce a minimal reproducer using just NumPy?

If you can't do either, I'm going to have to assume that this is a NumPy or Pandas bug.

Maybe NumPy or Pandas is accessing CPython internals, but not via the C-API, and those internals changed between 3.9 and 3.10?

@seberg
Copy link
Mannequin Author

seberg mannequin commented Feb 1, 2022

Thanks for having a look. I have confirmed this is related to Cython (no pandas/NumPy involved) – repro at https://github.com/seberg/bpo46451. What happens under the hood in Cython is probably:
https://github.com/cython/cython/blob/master/Cython/Utility/ObjectHandling.c#L2569-L2611

Which generates PyEval_EvalCodeEx, and I could not repro with just a PyObject_FastCallDict, so I assume Cython is doing something wrong and will open an issue there, but if you have a quick tip as to what might wrong, that could be nice :).

Otherwise, will just close this, and may reopen if Cython hits a wall.

@seberg seberg mannequin closed this as completed Feb 1, 2022
@seberg seberg mannequin closed this as completed Feb 1, 2022
@seberg
Copy link
Mannequin Author

seberg mannequin commented Feb 1, 2022

Not reopening for now, but I will note again that (AFAIK) Cython uses PyEval_EvalCodeEx, and the docs say that it is not used internally to CPython anymore.

So it seems pretty plausible that the bug is in PyEval_EvalCodeEx and not the generated Cython code?

@seberg
Copy link
Mannequin Author

seberg mannequin commented Feb 1, 2022

While I have a repro for Python, I think the pre release of cython already fixes it (and I just did not regenerated the C sources when trying, I guess. A git clean to the rescue...).

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

1 participant