Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault accessing frame_locals #118746

Open
ngoldbaum opened this issue May 7, 2024 · 10 comments
Open

Segfault accessing frame_locals #118746

ngoldbaum opened this issue May 7, 2024 · 10 comments
Labels
3.13 bugs and security fixes type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@ngoldbaum
Copy link

ngoldbaum commented May 7, 2024

Bug report

Bug description:

I don't know how to cause this otherwise, but I can reproducibly trigger this segfault running:

python -m pytest "sklearn/tests/test_common.py::test_estimators[NuSVC()-check_classifiers_one_label_sample_weights]"

goldbaum at Nathans-MBP in ~/Documents/scikit-learn on main!
± $(pyenv which python) $(pyenv which pytest) -rsx -svx "sklearn/tests/test_common.py::test_estimators[NuSVC()-check_classifiers_one_label_sample_weights]"
===================================================================================== test session starts =====================================================================================
platform darwin -- Python 3.13.0a6+, pytest-8.2.0, pluggy-1.5.0 -- /Users/goldbaum/.pyenv/versions/3.13-dev-debug/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase(PosixPath('/Users/goldbaum/Documents/scikit-learn/.hypothesis/examples'))
rootdir: /Users/goldbaum/Documents/scikit-learn
configfile: setup.cfg
plugins: hypothesis-6.100.5
collected 1 item

sklearn/tests/test_common.py::test_estimators[NuSVC()-check_classifiers_one_label_sample_weights] I: Seeding RNGs with 1678026760
Fatal Python error: Segmentation fault

Current thread 0x00000001dd5dd000 (most recent call first):
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/_code/code.py", line 149 in f_locals
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/_code/code.py", line 285 in ishidden
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/_code/code.py", line 412 in <lambda>
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/_code/code.py", line 342 in __init__
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/_code/code.py", line 415 in filter
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/python.py", line 1644 in _traceback_filter
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/_code/code.py", line 973 in repr_traceback
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/_code/code.py", line 1064 in repr_excinfo
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/_code/code.py", line 697 in getrepr
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/nodes.py", line 462 in _repr_failure_py
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/python.py", line 1669 in repr_failure
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/reports.py", line 363 in from_item_and_call
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/runner.py", line 368 in pytest_runtest_makereport
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/runner.py", line 243 in call_and_report
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/runner.py", line 135 in runtestprotocol
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/runner.py", line 116 in pytest_runtest_protocol
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/main.py", line 364 in pytest_runtestloop
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/main.py", line 339 in _main
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/main.py", line 285 in wrap_session
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/main.py", line 332 in pytest_cmdline_main
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/config/__init__.py", line 178 in main
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/lib/python3.13/site-packages/_pytest/config/__init__.py", line 206 in console_main
  File "/Users/goldbaum/.pyenv/versions/3.13-dev-debug/bin/pytest", line 8 in <module>

Extension modules: numpy._core._multiarray_umath, numpy.linalg._umath_linalg, cython.cimports.libc.math, scipy._lib._ccallback_c, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, sklearn.__check_build._check_build, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._ansari_swilk_statistics, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.stats._unuran.unuran_wrapper, sklearn.utils._isfinite, sklearn.utils.sparsefuncs_fast, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, scipy.io.matlab._mio_utils, scipy.io.matlab._streams, scipy.io.matlab._mio5_utils, sklearn.datasets._svmlight_format_fast, sklearn.utils._random, sklearn.utils._vector_sentinel, sklearn.feature_extraction._hashing_fast, sklearn.metrics.cluster._expected_mutual_info_fast, sklearn.metrics._dist_metrics, sklearn.metrics._pairwise_distances_reduction._datasets_pair, sklearn.utils._cython_blas, sklearn.metrics._pairwise_distances_reduction._base, sklearn.metrics._pairwise_distances_reduction._middle_term_computer, sklearn.utils._heap, sklearn.utils._sorting, sklearn.metrics._pairwise_distances_reduction._argkmin, sklearn.metrics._pairwise_distances_reduction._argkmin_classmode, sklearn.metrics._pairwise_distances_reduction._radius_neighbors, sklearn.metrics._pairwise_distances_reduction._radius_neighbors_classmode, sklearn.metrics._pairwise_fast, sklearn.utils._fast_dict, sklearn.cluster._hierarchical_fast, sklearn.cluster._k_means_common, sklearn.cluster._k_means_elkan, sklearn.cluster._k_means_lloyd, sklearn.cluster._k_means_minibatch, sklearn.neighbors._partition_nodes, sklearn.neighbors._ball_tree, sklearn.neighbors._kd_tree, sklearn.utils.arrayfuncs, sklearn.utils._seq_dataset, sklearn.linear_model._cd_fast, _loss, sklearn._loss._loss, sklearn.svm._liblinear, sklearn.svm._libsvm, sklearn.svm._libsvm_sparse, sklearn.utils._weight_vector, sklearn.linear_model._sgd_fast, sklearn.linear_model._sag_fast, sklearn.decomposition._online_lda_fast, sklearn.decomposition._cdnmf_fast, sklearn.cluster._dbscan_inner, sklearn.cluster._hdbscan._tree, sklearn.cluster._hdbscan._linkage, sklearn.cluster._hdbscan._reachability, sklearn._isotonic, sklearn.tree._utils, sklearn.tree._tree, sklearn.tree._splitter, sklearn.tree._criterion, sklearn.neighbors._quad_tree, sklearn.manifold._barnes_hut_tsne, sklearn.manifold._utils, sklearn.ensemble._gradient_boosting, sklearn.ensemble._hist_gradient_boosting.common, sklearn.ensemble._hist_gradient_boosting._gradient_boosting, sklearn.ensemble._hist_gradient_boosting._binning, sklearn.ensemble._hist_gradient_boosting._bitset, sklearn.ensemble._hist_gradient_boosting.histogram, sklearn.ensemble._hist_gradient_boosting._predictor, sklearn.ensemble._hist_gradient_boosting.splitting (total: 166)
[1]    462 segmentation fault  $(pyenv which python) $(pyenv which pytest) -rsx -svx

CPython versions tested on:

CPython main branch

Operating systems tested on:

macOS

Linked PRs

@ngoldbaum ngoldbaum added the type-bug An unexpected behavior, bug, or error label May 7, 2024
@AlexWaygood AlexWaygood added type-crash A hard crash of the interpreter, possibly with a core dump and removed type-bug An unexpected behavior, bug, or error labels May 7, 2024
@ngoldbaum
Copy link
Author

C traceback: https://gist.github.com/ngoldbaum/5da31a3b055a9563d4cb5101516bc8d8

f_locals is NULL in frame_getlocals.

@AlexWaygood AlexWaygood added the 3.13 bugs and security fixes label May 7, 2024
@gvanrossum
Copy link
Member

@gaogaotiantian Can you submit a PR for this?

gvanrossum pushed a commit that referenced this issue May 8, 2024
We don't know how to create an unoptimized frame with f_locals == NULL,
but they are seen in the wild, and this fixes the crash.
@gvanrossum
Copy link
Member

I think the release can continue now, but I'm keeping this open so we can investigate the root cause (and possibly construct a more elegant fix).

@gaogaotiantian
Copy link
Member

Should we remove the release-blocker tag then?

@gvanrossum
Copy link
Member

Should we remove the release-blocker tag then?

@Yhg1s can do that -- I don't know what the policy is.

@gaogaotiantian
Copy link
Member

Site note for the actual issue:

I added assert(!PyCode_Check(frame->f_executable) || (_PyFrame_GetCode(frame)->co_flags & CO_OPTIMIZED) || frame->f_locals != NULL); in DISPATCH() macro and ran the full python test suite without any issue. This has to be some dark issue.

@JelleZijlstra
Copy link
Member

The frame is apparently from Cython; I made pytest print each frame object before attempting to access its f_locals:

<frame at 0x127a763e0, file '_libsvm.pyx', line 216, code sklearn.svm._libsvm.fit>
Fatal Python error: Segmentation fault

I was able to reproduce with just Cython. Put this in exc.pyx:

def f():
    raise ValueError

Compile it with python -m Cython.Build.Cythonize -3 -i exc.pyx (where python is built from current main from before the fix associated with this issue).

And in crash.py:

from exc import f

try:
    f()
except Exception as e:
    tb = e.__traceback__
    while tb:
        print(tb.tb_frame.f_locals)
        tb = tb.tb_next
% python crash.py 
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <_frozen_importlib_external.SourceFileLoader object at 0x1018dc350>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, '__file__': '/Users/jelle/py/pybuild/cycrash/crash.py', '__cached__': None, 'f': <cyfunction f at 0x1018a3ed0>, 'e': ValueError(), 'tb': <traceback object at 0x101917ec0>}
zsh: segmentation fault  python crash.py

With about the same C backtrace as @ngoldbaum saw:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x000000010060bce0 libpython3.13.dylib`frame_getlocals [inlined] Py_INCREF(op=0x0000000000000000) at object.h:822:30 [opt]
    frame #1: 0x000000010060bce0 libpython3.13.dylib`frame_getlocals [inlined] _Py_NewRef(obj=0x0000000000000000) at object.h:1056:5 [opt]
    frame #2: 0x000000010060bce0 libpython3.13.dylib`frame_getlocals(f=0x0000000100bdb4c0, closure=0x0000000000000000) at frameobject.c:745:16 [opt]
    frame #3: 0x0000000100644720 libpython3.13.dylib`_PyObject_GenericGetAttrWithDict(obj=0x0000000100bdb4c0, name=0x00000001004033b0, dict=0x0000000000000000, suppress=0) at object.c:1578:19 [opt]

I looked in the Cython source code and there are four calls to PyFrame_New, each of which passes NULL for the locals argument. PyFrame_New is apparently not documented as a public API.

For now it seems we have to live with the reality that f_locals may always be NULL. I looked through the code briefly and didn't find any other cases where we're not handling this case, so we should be good.

@gaogaotiantian
Copy link
Member

Thanks for the investigation, I did not find any way to do it with pure CPython. PyFrame_New() is a legacy API I think? It's not even documented.

@gvanrossum
Copy link
Member

The status of PyFrame_New() is ambiguous -- undocumented, but lives in a "public" header (frameobject.h), doesn't start with an underscore, and has been around since the Python 2.1 days. The implementation has a comment saying /* Legacy API */. We'll have to tread carefully there -- we can't just remove it. Before we can start deprecating it we should probably understand why Cython is calling it -- my hunch is it all has to do with creating seamless tracebacks. In any case, that's all for 3.14 at the earliest.

@gaogaotiantian
Copy link
Member

Yeah I'm definitely not saying that we should throw Cython under the bus and claim we never supported the function. We should try to clarify on the matter and maybe make it less ambiguous. CPython always sets a f_locals of an unoptimized frame and this is interpreter frame, not the FrameObject. It would be nice if we explicitly enforce it or allow otherwise.

SonicField pushed a commit to SonicField/cpython that referenced this issue May 8, 2024
…ython#118748)

We don't know how to create an unoptimized frame with f_locals == NULL,
but they are seen in the wild, and this fixes the crash.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

6 participants