Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Integer overflow in remainder for Windows #19260

Closed
wants to merge 3 commits into from

Conversation

ganesh-k13
Copy link
Member

@ganesh-k13 ganesh-k13 commented Jun 16, 2021

Integer overflow in remainder for Windows

  1. Aims to fix crash when % operator is used between MIN value of int and -1
  2. Test cases for integer remainders

TODO:

  • Commit actual fix.
  • Fix scalar case. Comment

Note: I have not committed the actual fix yet, I want to confirm if the crash is reproducible in Azure Pipelines

Local Reproduction
# ganes in DESKTOP-5UIVG9B  D:osnumpy on git:BUG_19246_mod_crash ≢  ~2                                                                                                             [21:58:45]
  3.9.4 ➜  python3 .\runtests.py -v -t numpy\core\tests\test_umath.py
Building, see build.log...
Build OK
NumPy version 0.3.0+25943.g8b0f1ad46
NumPy relaxed strides checking option: True
NumPy CPU features:  SSE SSE2 SSE3 SSSE3* SSE41* POPCNT* SSE42* AVX* F16C* FMA3* AVX2* AVX512F? AVX512CD? AVX512_SKX? AVX512_CLX? AVX512_CNL? AVX512_ICL?
====================================================================================== test session starts ======================================================================================
platform win32 -- Python 3.8.10, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: D:\os\numpy, configfile: pytest.ini
plugins: hypothesis-6.14.0, cov-2.12.1
collected 327 items

numpy\core\tests\test_umath.py ...........................................................................................FFWindows fatal exception: integer overflow

Current thread 0x00006478 (most recent call first):
  File "D:\os\numpy\build\testenv\Lib\site-packages\numpy\core\tests\test_umath.py", line 544 in test_remainder_integer
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\python.py", line 183 in pytest_pyfunc_call
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\callers.py", line 187 in _multicall
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\manager.py", line 84 in <lambda>
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\manager.py", line 93 in _hookexec
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\hooks.py", line 286 in __call__
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\python.py", line 1641 in runtest
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\runner.py", line 162 in pytest_runtest_call
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\callers.py", line 187 in _multicall
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\manager.py", line 84 in <lambda>
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\manager.py", line 93 in _hookexec
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\hooks.py", line 286 in __call__
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\runner.py", line 255 in <lambda>
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\runner.py", line 311 in from_call
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\runner.py", line 254 in call_runtest_hook
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\runner.py", line 215 in call_and_report
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\runner.py", line 126 in runtestprotocol
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\runner.py", line 109 in pytest_runtest_protocol
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\callers.py", line 187 in _multicall
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\manager.py", line 84 in <lambda>
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\manager.py", line 93 in _hookexec
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\hooks.py", line 286 in __call__
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\main.py", line 348 in pytest_runtestloop
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\callers.py", line 187 in _multicall
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\manager.py", line 84 in <lambda>
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\manager.py", line 93 in _hookexec
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\hooks.py", line 286 in __call__
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\main.py", line 323 in _main
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\main.py", line 269 in wrap_session
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\main.py", line 316 in pytest_cmdline_main
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\callers.py", line 187 in _multicall
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\manager.py", line 84 in <lambda>
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\manager.py", line 93 in _hookexec
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\pluggy\hooks.py", line 286 in __call__
  File "C:\Users\ganes\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\_pytest\config\__init__.py", line 162 in main
  File "D:\os\numpy\build\testenv\Lib\site-packages\numpy\_pytesttester.py", line 197 in __call__
  File ".\runtests.py", line 383 in main
  File ".\runtests.py", line 685 in <module>

resolves #19246

with pytest.raises(FloatingPointError):
np.remainder(a, -1)
# with pytest.raises(FloatingPointError):
# np.remainder(iinfo.min, -1)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not using the same function as

@TYPE@_remainder(char **args, npy_intp const *dimensions, npy_intp const *steps, void *NPY_UNUSED(func))

So only array cases are solved by adding the MIN&-1 check

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just stumbled on this. @ganesh-k13 might be nice to finish it?

@@ -849,7 +849,7 @@ NPY_NO_EXPORT void
BINARY_LOOP {
const @type@ in1 = *(@type@ *)ip1;
const @type@ in2 = *(@type@ *)ip2;
if (in2 == 0) {
if (in2 == 0 || (in1 == NPY_MIN_@TYPE@ && in2 == -1)) {
npy_set_floatstatus_divbyzero();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to set the error flag for the second branch?

@ganesh-k13
Copy link
Member Author

Yeah, let me rebase and try it now.

@ganesh-k13
Copy link
Member Author

So the problem, on the whole, has been transferred to a different location via #21124, able to reproduce it though, but the issue is still there.

@ganesh-k13
Copy link
Member Author

I'm stuck in this weird issue where: np.iinfo(np.int32).min % -1 is fixed but np.iinfo(np.int32).min % np.int32(-1) just flies past the if.

Now getting gdb working on windows seems harder than SIMD :). Will try to fix it soon.

@seberg
Copy link
Member

seberg commented May 7, 2022

Hmm, but its not that the first ends up in the array and the second the scalar path? I can't see why, but both should be 32bit on windows, so fine.

There is always the possibility that the compiler optimizes away an if and indeed "flies past" because it thinks it can reorder the code and just overwrite the result if the if evaluated true. godbolt might help, or adding a volatile to the pointers (or maybe disable optimization). The latter options might force the compiler to not do such shenanigans.

@ganesh-k13
Copy link
Member Author

Thanks Sebastian! Yeah I did volatile the variables but that gave no results. Pretty interesting issue this. There are quite a few paths now, although these two should take the same. If gdb works, this is pretty much a five mins debug :), will get it working first.

@mattip
Copy link
Member

mattip commented May 7, 2022

Now getting gdb working on windows seems harder than SIMD :). Will try to fix it soon.

I have done this in the past by

  • Setting the compiler flags to add the /Zi option, either by setting CFLAGS or by editing the compiler flags in numpy/distutils/ccompiler_opt.py
  • Set a breakpoint inside runtest.main() so that the command python runtests.py -t <relevant test> will stop just after starting up and importing NumPy. Then print the PID with os.getpid(). Importing NumPy will make sure the debug information for the DLL can be found.
  • Use the "Attach to Process" option in the Visual debugger (I use Visual Studio, maybe you can also use Visual Code?) to attach to the PID
  • Open the file where you want to stop in the debugger and try to set a breakpoint. If you cannot set a breakpoint, check the "output" window to see that the debug information was actually loaded when probing the DLL. If not, go back to the first step and make sure compilation is creating the debug info.
  • Hit c in the python prompt to continue running the test.

@ganesh-k13
Copy link
Member Author

Thanks for the detailed steps @mattip ! I'll give it a go.

@ganesh-k13
Copy link
Member Author

Thanks @mattip , I'm able to hit the code, etc. Just that the /Zi is not producing pdb files for some reason and /Z7 is not building with symbols so no symbols to debug.

I'll dig a bit deeper, but your steps were super useful! We should document this somewhere, will help CPython and not just NumPy.

Looking at build logs, it seems Zi is not picked although the env is set. Weird.

@mattip
Copy link
Member

mattip commented May 9, 2022

it seems Zi is not picked although the env is set

Try modifying the actual code with the flags in numpy/distutils/ccompiler_opt.py?

@ganesh-k13
Copy link
Member Author

ganesh-k13 commented May 9, 2022

Ah ok, I was editing in the wrong place, I edited this and got the pdb:

srcs, list(flags), ccompiler=ccompiler, **kwargs

-a----          5/9/2022  10:36 PM         290816 vc140.pdb

Only problem however, is that the pdb does not belong to any of the modules, so how/where do we load it to? As a workaround, Z7, /DEBUG:FASTLINK nor /DEBUG:FULL produced .pdb .pyd with required symbols:

...
'python3.10.exe' (Win32): Loaded 'D:\os\numpy\numpy\random\_bounded_integers.cp310-win_amd64.pyd'. Module was built without symbols.
'python3.10.exe' (Win32): Loaded 'D:\os\numpy\numpy\random\_mt19937.cp310-win_amd64.pyd'. Module was built without symbols.
'python3.10.exe' (Win32): Loaded 'D:\os\numpy\numpy\random\_philox.cp310-win_amd64.pyd'. Module was built without symbols.
'python3.10.exe' (Win32): Loaded 'D:\os\numpy\numpy\random\_pcg64.cp310-win_amd64.pyd'. Module was built without symbols.
'python3.10.exe' (Win32): Loaded 'D:\os\numpy\numpy\random\_sfc64.cp310-win_amd64.pyd'. Module was built without symbols.
'python3.10.exe' (Win32): Loaded 'D:\os\numpy\numpy\random\_generator.cp310-win_amd64.pyd'. Module was built without symbols.
...

My call stacks frame0:

>	_multiarray_umath.cp310-win_amd64.pyd!00007ffc5befee95()	Unknown

So on loading vc140.pdb with _multiarray_umath.cp310-win_amd64.pyd, I get error: A matching symbol file was not found in this folder.

So my doubt is how do I use the generated the vc140.pdb or, how do I generate inline pyd with symbols?

@mattip
Copy link
Member

mattip commented May 10, 2022

It's too bad we never captured this information for the debugging docs. Do you see /Zi (case sensitive) being used? Maybe you need to add a link argument /DEBUG ?

@ganesh-k13
Copy link
Member Author

Yep the flags seem right. I ended up using WSL to step through the code and verify changes on powershell :). We'll get back to windows debug someday and document it.

Somehow the problem is now in

if (a == 0 || b == 0) {
*out = 0;
if (b == 0) {
return NPY_FPE_DIVIDEBYZERO;

Which is weird cause the problem should have always been here right? Anyways it does fix the issue(take this with a pinch of salt, as I've said this 3 times with the issue transferring to a different file).

Now interesting thing is, // also does not have the min+-1 check, naturally, that crashed as well(kinda):
Powershell

>>> np.iinfo(np.int64).min // np.int64(-1)
<stdin>:1: RuntimeWarning: overflow encountered in longlong_scalars
-9223372036854775808

WSL:

>>> np.iinfo(np.int64).min // np.int64(-1)
fish: 'python3' terminated by signal SIGFPE (Floating point exception)

Linux: I need to reboot, will update here when I boot into it.

Now there are more than 2 places where we do unchecked // and %, let me fish all of them out and add proper UT covering all the cases.

@ganesh-k13
Copy link
Member Author

This is super interesting. TIL moment if I'm being honest:

This is in scalarmath.c

np.iinfo(np.int64).min % np.int64(-1)

while below is in loops_modulo.dispatch.c

np.iinfo(np.int64).min % np.int16(-1)

I'm basically able to repro the issues consistently on Linux, albeit only on 64 bits. Will fix them soon and add UT.

@seberg
Copy link
Member

seberg commented Sep 8, 2022

I think this was covered by gh-21507, so closing, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Int32 (and Int64) Min Value modulo -1 Causes Python to Crash
3 participants