BUG: import scipy.stats fails under valgrind #17368

Closed
PhilMiller opened this issue Nov 7, 2022 · 13 comments · Fixed by #17432
Labels
defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.stats
Milestone

Comments

@PhilMiller

PhilMiller commented Nov 7, 2022

Describe your issue.

When running under valgrind, importing scipy.stats fails, preventing use of valgrind to debug any subsequent errors in application code.

The latest version of NumPy at the time of my report, 1.23.4, also exhibits this failure. I'm using 1.22.4 due to compatibility constraints with Numba.

Reproducing Code Example

PYTHONMALLOC=malloc valgrind python3 -m scipy.stats

Error message

==30715== Memcheck, a memory error detector
==30715== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==30715== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==30715== Command: python3 -m scipy.stats
==30715== 
==30715== Conditional jump or move depends on uninitialised value(s)
==30715==    at 0x58B11F: PyUnicode_Decode (in /usr/bin/python3.8)
==30715==    by 0x58B474: PyUnicode_FromEncodedObject (in /usr/bin/python3.8)
==30715==    by 0x577A5A: ??? (in /usr/bin/python3.8)
==30715==    by 0x5F73E2: _PyObject_MakeTpCall (in /usr/bin/python3.8)
==30715==    by 0x570D54: _PyEval_EvalFrameDefault (in /usr/bin/python3.8)
==30715==    by 0x59C86A: ??? (in /usr/bin/python3.8)
==30715==    by 0x5F745E: _PyObject_MakeTpCall (in /usr/bin/python3.8)
==30715==    by 0x570D54: _PyEval_EvalFrameDefault (in /usr/bin/python3.8)
==30715==    by 0x569DB9: _PyEval_EvalCodeWithName (in /usr/bin/python3.8)
==30715==    by 0x5F6EB2: _PyFunction_Vectorcall (in /usr/bin/python3.8)
==30715==    by 0x570B25: _PyEval_EvalFrameDefault (in /usr/bin/python3.8)
==30715==    by 0x569DB9: _PyEval_EvalCodeWithName (in /usr/bin/python3.8)
==30715== 
==30723== Warning: invalid file descriptor 8180 in syscall close()
==30723== Warning: invalid file descriptor 8181 in syscall close()
==30723== Warning: invalid file descriptor 8182 in syscall close()
==30723== Warning: invalid file descriptor 8183 in syscall close()
==30723==    Use --log-fd=<number> to select an alternative log fd.
==30723== Warning: invalid file descriptor 8184 in syscall close()
==30723== Warning: invalid file descriptor 8185 in syscall close()
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 185, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.8/runpy.py", line 144, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/home/ubuntu/env/2022-11-07-newamrex/lib/python3.8/site-packages/scipy/stats/__init__.py", line 467, in <module>
    from ._stats_py import *
  File "/home/ubuntu/env/2022-11-07-newamrex/lib/python3.8/site-packages/scipy/stats/_stats_py.py", line 46, in <module>
    from . import distributions
  File "/home/ubuntu/env/2022-11-07-newamrex/lib/python3.8/site-packages/scipy/stats/distributions.py", line 10, in <module>
    from . import _continuous_distns
  File "/home/ubuntu/env/2022-11-07-newamrex/lib/python3.8/site-packages/scipy/stats/_continuous_distns.py", line 31, in <module>
    import scipy.stats._boost as _boost
  File "/home/ubuntu/env/2022-11-07-newamrex/lib/python3.8/site-packages/scipy/stats/_boost/__init__.py", line 1, in <module>
    from scipy.stats._boost.beta_ufunc import (
SystemError: initialization of beta_ufunc raised unreported exception
==30715== 
==30715== HEAP SUMMARY:
==30715==     in use at exit: 5,882,271 bytes in 35,007 blocks
==30715==   total heap usage: 591,124 allocs, 556,117 frees, 104,206,788 bytes allocated
==30715== 
==30715== LEAK SUMMARY:
==30715==    definitely lost: 46,936 bytes in 415 blocks
==30715==    indirectly lost: 20,712 bytes in 208 blocks
==30715==      possibly lost: 2,329,450 bytes in 10,996 blocks
==30715==    still reachable: 3,485,173 bytes in 23,388 blocks
==30715==                       of which reachable via heuristic:
==30715==                         stdstring          : 38 bytes in 1 blocks
==30715==         suppressed: 0 bytes in 0 blocks
==30715== Rerun with --leak-check=full to see details of leaked memory
==30715== 
==30715== Use --track-origins=yes to see where uninitialised values come from
==30715== For lists of detected and suppressed errors, rerun with: -s
==30715== ERROR SUMMARY: 3 errors from 1 contexts (suppressed: 0 from 0)

SciPy/NumPy/Python version information

1.9.3 1.22.4 sys.version_info(major=3, minor=8, micro=10, releaselevel='final', serial=0)

PhilMiller added the defect label Nov 7, 2022
@rgommers
Member

rgommers commented Nov 8, 2022

Thanks for the report @PhilMiller. How did you install SciPy and Python?

@seberg I think you're running NumPy under Valgrind regularly - are you aware of 1.23.4 not working?

@seberg
Contributor

seberg commented Nov 8, 2022

Nah, I run things regularly (as in before any release except bug-fix releases). I patched Python to show what should be the real error (it would be nice to chain the error, but I think I may need to ask for permission before just making a PR to CPython):

  File "/home/sebastian/python-dev/lib/python3.10/site-packages/scipy/stats/_boost/__init__.py", line 1, in <module>
    from scipy.stats._boost.beta_ufunc import (
OverflowError: Error in function boost::math::erfc_inv<e>(e, e): Overflow Error

which in turn points to scipy.special.erfcinv. NumPy wouldn't raise this as an error unless that was requested with np.errstate (which I doubt), so it looks more like something going wrong in SciPy's custom ufunc error creation when it tries to propagate C++ exceptions (IIRC this is what SciPy does)?!
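To illustrate what "chaining the error" would mean here, the following is a minimal pure-Python sketch, not CPython's actual import machinery. The functions `init_beta_ufunc` and `import_beta_ufunc` are hypothetical stand-ins; only the two error messages are taken from the tracebacks in this thread.

```python
def init_beta_ufunc():
    # Hypothetical stand-in for the extension module's init function,
    # which leaves the real (C++-originated) error behind.
    raise OverflowError(
        "Error in function boost::math::erfc_inv<e>(e, e): Overflow Error"
    )


def import_beta_ufunc():
    # Hypothetical stand-in for the import step that reports the failure.
    try:
        init_beta_ufunc()
    except OverflowError as exc:
        # "raise ... from exc" keeps the original error as __cause__,
        # so the traceback shows both instead of only the generic one.
        raise SystemError(
            "initialization of beta_ufunc raised unreported exception"
        ) from exc


try:
    import_beta_ufunc()
except SystemError as err:
    cause = err.__cause__
    print(f"{type(err).__name__}: {err}")
    print(f"  caused by {type(cause).__name__}: {cause}")
```

With chaining, the OverflowError would be visible in the original report without patching the interpreter.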

@seberg
Contributor

seberg commented Nov 8, 2022

Maybe valgrind causes a spurious floating-point exception (FPE) flag to be set, but I'm not sure.

@seberg
Contributor

seberg commented Nov 8, 2022

Ah, the error originates from this boost policy: https://github.com/scipy/scipy/blob/main/scipy/stats/_boost/include/func_defs.hpp#L41

It would probably be best if someone familiar with Boost/C++ and with stats/special took a look.

@PhilMiller
Author

> Thanks for the report @PhilMiller. How did you install SciPy and Python?

In case it's still material, Python is as packaged in Ubuntu 20.04, and SciPy and everything else are from pip install.

@PhilMiller
Author

If you need C++ expertise in following through on debugging, let me know and we can set a time.

@seberg
Contributor

seberg commented Nov 8, 2022

@PhilMiller not sure if that is necessary. I would suggest continuing to patch (or just setting a breakpoint) at that OverflowError I pointed to in SciPy, to get a full backtrace in gdb for where the error is actually raised; maybe that gives the right pointer.

EDIT: Btw. reproducing the issue clearly "just works" for me. That was on Debian with a python3.9 (and a python3.9 or 3.10 dev build I still had lying around). I also used the pip-installed SciPy with current NumPy main, but I don't think the NumPy version should matter.

@WarrenWeckesser
Member

WarrenWeckesser commented Nov 15, 2022

This is the result of an uncaught exception in the Boost code. We have Cython code (generated during the build) that wraps a bunch of the methods of Boost's beta distribution as ufuncs. Some of those Boost methods depend on boost::math::erfc_inv. It appears that, during the import of the beta_ufunc module, the C++ class erf_inv_initializer at https://github.com/scipy/boost-headers-only/blob/3af99e6d566043072e95bc882d32c9c26f37e0ba/boost/math/special_functions/detail/erf_inv.hpp#L328-L379 is instantiated. When Python is run under valgrind, these lines

            if(is_value_non_zero(static_cast<T>(BOOST_MATH_BIG_CONSTANT(T, 64, 1e-800))))
               boost::math::erfc_inv(static_cast<T>(BOOST_MATH_BIG_CONSTANT(T, 64, 1e-800)), Policy());

result in the call boost::math::erfc_inv(0), even though it looks like the if expression should prevent that. That call of erfc_inv(0) raises an exception that is not caught in the Python wrapper, so Python crashes. I don't know what valgrind is doing to make that code misbehave; if I run python directly and import the wrapper module, the call erfc_inv(0) does not happen.
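As a rough illustration of why erfc_inv(0) ends in an overflow error: erfc(x) only reaches 0 as x goes to infinity, so the inverse at 0 has no finite value, and a "raise on overflow" policy turns that into an exception. The sketch below is plain Python using only `math.erfc`; `erfc_inv_checked` is a hypothetical helper written for this example and is neither Boost's algorithm nor SciPy's wrapper.

```python
import math


def erfc_inv_checked(p, tol=1e-12):
    """Hypothetical inverse of erfc that raises on overflow (illustration only)."""
    if not 0.0 <= p <= 2.0:
        raise ValueError("p must lie in [0, 2]")
    if p == 0.0 or p == 2.0:
        # erfc(x) -> 0 as x -> +inf and erfc(x) -> 2 as x -> -inf,
        # so the inverse is infinite at both endpoints.
        raise OverflowError("erfc_inv overflows at p == 0 or p == 2")
    # erfc is strictly decreasing, so plain bisection is enough for a sketch.
    lo, hi = -30.0, 30.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erfc(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(erfc_inv_checked(0.5))
try:
    erfc_inv_checked(0.0)
except OverflowError as exc:
    print("OverflowError:", exc)
```

If nothing on the Python side catches the exception, as described above, it surfaces during module initialization.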

@mckib2, any ideas?

@WarrenWeckesser
Member

WarrenWeckesser commented Nov 15, 2022

I just noticed this a few lines down in the Boost file special_functions/detail/erf_inv.hpp:

template <class T, class Policy>
bool erf_inv_initializer<T, Policy>::init::is_value_non_zero(T v)
{
   // This needs to be non-inline to detect whether v is non zero at runtime
   // rather than at compile time, only relevant when running under valgrind
   // which changes long double's to double's on the fly.
   return v != 0;
}

So perhaps Boost's attempt to work around valgrind's behavior with long double isn't working as expected.
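The underflow behind that comment can be sketched in pure Python. This is only an analogy: `Decimal` stands in for a wider floating-point type (an assumption made for illustration), and the conversion to `float` stands in for the value being handled as an IEEE-754 double. It does not reproduce valgrind's actual mechanism.

```python
import sys
from decimal import Decimal

# Decimal plays the role of a wider type that can still represent 1e-800.
wide = Decimal("1e-800")

# Converting to an IEEE-754 double underflows silently to zero.
narrow = float(wide)

print(wide != 0)           # True: the guard passes for the wide value
print(narrow != 0.0)       # False: the same constant is zero as a double
print(sys.float_info.min)  # smallest positive normal double, about 2.2e-308
```

If the guard sees the wide value while the call receives the narrowed one, the result is exactly the erfc_inv(0) call described above.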

@WarrenWeckesser
Member

... and it looks like this has already been fixed in Boost: boostorg/math#809

So the issue should be fixed when we update our vendored version of Boost to 1.80+.

@mckib2
Contributor

mckib2 commented Nov 16, 2022

> @mckib2, any ideas?

I immediately thought of some weird compiler issues we've seen with Boost, but I don't think that's the case here. I think you're right @WarrenWeckesser: once gh-17207 goes in this will be fixed, or we can add the upstream patch. I think gh-17207 is currently blocked on me doing a refactor to get Boost.Math as a standalone project.

@seberg
Contributor

seberg commented Nov 16, 2022

valgrind doesn't support long double; it effectively handles such values in double precision, and these constants are out of range for double. So the error is not unexpected. In NumPy testing, these usually show up as just warnings (and quite a few tests fail because of it), but here the problem is that it already happens at import time.

(Some people report such issues with NumPy as well; they usually have more unusual environments, such as embedded Python setups.)

@mckib2
Contributor

mckib2 commented Jan 30, 2023

On the branch from gh-17432, the beta_ufunc exception is not raised as it is currently on main:

$ PYTHONMALLOC=malloc valgrind python -m scipy.stats
==401588== Memcheck, a memory error detector
==401588== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==401588== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==401588== Command: python -m scipy.stats
==401588== 
/venvs/scipy/bin/python: No module named scipy.stats.__main__; 'scipy.stats' is a package and cannot be directly executed
==401588== 
==401588== HEAP SUMMARY:
==401588==     in use at exit: 3,859,503 bytes in 28,995 blocks
==401588==   total heap usage: 1,087,455 allocs, 1,058,460 frees, 162,718,519 bytes allocated
==401588== 
==401588== LEAK SUMMARY:
==401588==    definitely lost: 44,664 bytes in 299 blocks
==401588==    indirectly lost: 23,184 bytes in 233 blocks
==401588==      possibly lost: 2,665,657 bytes in 22,869 blocks
==401588==    still reachable: 1,125,998 bytes in 5,594 blocks
==401588==         suppressed: 0 bytes in 0 blocks
==401588== Rerun with --leak-check=full to see details of leaked memory
==401588== 
==401588== For lists of detected and suppressed errors, rerun with: -s
==401588== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@PhilMiller Please check out gh-17432 to confirm that it resolves the issue on your system.
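For anyone who wants to script that check, here is a minimal sketch. It assumes valgrind is on PATH and that SciPy is installed for the interpreter running the script; the helper names `valgrind_import_cmd` and `check_import_under_valgrind` are made up for this example. It uses `-c "import scipy.stats"` rather than `-m scipy.stats`, since the package has no `__main__` and `-m` exits with the "cannot be directly executed" message even when the import itself succeeds.

```python
import os
import shutil
import subprocess
import sys


def valgrind_import_cmd(module="scipy.stats"):
    # Build the command line; the import runs in the child interpreter.
    return ["valgrind", sys.executable, "-c", f"import {module}"]


def check_import_under_valgrind(module="scipy.stats"):
    """Return (ok, stderr) for importing `module` under valgrind."""
    if shutil.which("valgrind") is None:
        raise RuntimeError("valgrind not found on PATH")
    # PYTHONMALLOC=malloc matches the reproducer from the original report.
    env = dict(os.environ, PYTHONMALLOC="malloc")
    proc = subprocess.run(
        valgrind_import_cmd(module),
        env=env,
        capture_output=True,
        text=True,
    )
    # By default valgrind passes through the client's exit status, so a
    # failed import shows up as a non-zero return code.
    return proc.returncode == 0, proc.stderr
```

Calling `ok, log = check_import_under_valgrind()` and then searching `log` for `SystemError` separates the import failure from unrelated Memcheck reports such as the `PyUnicode_Decode` one in the original output.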

@tupui tupui added this to the 1.11.0 milestone Feb 15, 2023

8 participants
@seberg @rgommers @PhilMiller @WarrenWeckesser @tupui @mckib2 @j-bowhay and others