numpy unexpectedly raises zero division error if array is long enough #11051

Closed
evfro opened this issue May 5, 2018 · 19 comments

Comments

@evfro

evfro commented May 5, 2018

Steps to reproduce:

import numpy as np

with np.errstate(invalid='ignore', divide='raise'):
    for i in range(10000):
        z = np.zeros(i)
        try:
            z / z
        except FloatingPointError as exc:
            print(exc, 'at size', i)
            break

output:
divide by zero encountered in true_divide at size 8001

numpy version: 1.14.2

Seems like this bug was introduced just recently, as there is no such issue with 1.13.3 and 1.14.1.
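
For reference, IEEE 754 classifies 0/0 as an invalid operation and only a nonzero numerator over zero as a divide-by-zero, which is why the loop above is expected to finish silently with invalid='ignore'. A minimal sketch of that distinction, assuming a NumPy build whose division loop follows the IEEE flags; the helper name fp_error_category is hypothetical, and the arrays are kept small on purpose:

```python
import numpy as np

def fp_error_category(numerator, denominator):
    """Return the message NumPy raises for numerator / denominator, or None."""
    with np.errstate(all='raise'):
        try:
            np.asarray(numerator, dtype=float) / np.asarray(denominator, dtype=float)
        except FloatingPointError as exc:
            return str(exc)
    return None

# 0/0 is expected to be reported as an invalid value, not a divide by zero.
print(fp_error_category(np.zeros(4), np.zeros(4)))
# 1/0 is expected to be reported as a divide by zero.
print(fp_error_category(np.ones(4), np.zeros(4)))
```

The exact wording of the messages differs between NumPy versions (for example true_divide versus divide), so only the category should be compared.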

@charris
Member

charris commented May 5, 2018

I cannot reproduce this. What platform?

@evfro
Author

evfro commented May 7, 2018

Sorry for not mentioning,
Windows 10, version 1709

@mattip
Member

mattip commented May 7, 2018

What do you get when you run import sys; print(sys.version)? If you are not using Anaconda, how did you obtain NumPy?

@evfro
Author

evfro commented May 7, 2018

3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56) [GCC 7.2.0]

@mattip
Member

mattip commented May 7, 2018

How do you get Anaconda on Windows 10 compiled with GCC 7.2.0? Is that standard? I thought they used MSVC.

@evfro
Author

evfro commented May 7, 2018

My bad, I mixed up my terminals. I have double-checked, and I cannot reproduce it on Windows either.
So the problem appears only on Ubuntu (at least 14.04.4 LTS and 16.04.3 LTS), in both cases with
Python 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56) [GCC 7.2.0] on linux and NumPy 1.14.2.
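
Since the platform was initially misreported here, it may help to collect the interpreter, operating system, and NumPy details from the same process that runs the reproduction. A small sketch using only the standard library plus an optional NumPy import; the helper name describe_environment is hypothetical:

```python
import platform
import sys

def describe_environment():
    """Collect the interpreter, OS, and NumPy details asked for in this thread."""
    info = {
        "python": sys.version.replace("\n", " "),
        "platform": platform.platform(),
        "numpy": None,
    }
    try:
        import numpy as np
        info["numpy"] = np.__version__
    except ImportError:
        pass  # NumPy is not installed in this interpreter
    return info

for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this in the same terminal as the failing script avoids mixing up reports from two different machines.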

@evfro
Author

evfro commented Oct 6, 2018

Hi! Any update on this issue? I also get this error with newer versions of Python (Python 3.7.0 (default, Jun 28 2018, 13:15:42) [GCC 7.2.0] :: Anaconda, Inc. on linux) and NumPy (1.15.2) on Ubuntu.

@charris
Member

charris commented Oct 6, 2018

I still cannot reproduce this, so it is hard to debug :)

@charris
Member

charris commented Oct 6, 2018

Anyone out there running ubuntu who can check this?

@charris
Member

charris commented Oct 6, 2018

I don't see anything suspicious between NumPy 1.14.1 and 1.14.2

A total of 5 pull requests were merged for this release.

* `#10674 <https://github.com/numpy/numpy/pull/10674>`__: BUG: Further back-compat fix for subclassed array repr
* `#10725 <https://github.com/numpy/numpy/pull/10725>`__: BUG: dragon4 fractional output mode adds too many trailing zeros
* `#10726 <https://github.com/numpy/numpy/pull/10726>`__: BUG: Fix f2py generated code to work on PyPy
* `#10727 <https://github.com/numpy/numpy/pull/10727>`__: BUG: Fix missing NPY_VISIBILITY_HIDDEN on npy_longdouble_to_PyLong
* `#10729 <https://github.com/numpy/numpy/pull/10729>`__: DOC: Create 1.14.2 notes and changelog.

@mattip
Member

mattip commented Oct 6, 2018

Maybe Anaconda switched compilers / MKL libraries. Please check with them.

evfro added a commit to evfro/polara that referenced this issue Oct 7, 2018
mattip added the 00 - Bug, 57 - Close?, and component: numpy.ufunc labels Oct 8, 2018
mattip added the 29 - Intel/Anaconda label and removed the 57 - Close? label Jan 16, 2019
@mattip
Member

mattip commented Aug 18, 2019

@evfro, @oleksandr-pavlyk can we close this?

@evfro
Author

evfro commented Aug 18, 2019

I've found a workaround and am no longer using the lines of code that trigger the issue.
That said, I've checked whether the issue still persists with a recent, compatible conda/numpy/mkl environment and, unfortunately, it does.

import sys; print(sys.version)
3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) [GCC 7.3.0]

numpy.__version__
1.16.4

conda list mkl
mkl 2019.4 243
mkl-service 2.0.2 py36h7b6447c_0

@oleksandr-pavlyk
Contributor

Yes, it is still reproducible:

In [2]: z = np.zeros(8001)

In [3]: z/z
/tmp/miniconda3/bin/ipython:1: RuntimeWarning: divide by zero encountered in true_divide
  #!/tmp/miniconda3/bin/python
/tmp/miniconda3/bin/ipython:1: RuntimeWarning: invalid value encountered in true_divide
  #!/tmp/miniconda3/bin/python
Out[3]: array([nan, nan, nan, ..., nan, nan, nan])

It is coming from use of MKL VML function vdDiv used to perform division.
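
Whether a given environment is affected therefore depends on whether its NumPy routes division through MKL. A rough check, assuming that the presence of the mkl (from mkl-service) or mkl_umath packages is a usable hint; it does not prove that the division loop is MKL-backed, and the helper name mkl_packages_present is hypothetical:

```python
from importlib.util import find_spec

def mkl_packages_present():
    """Report which MKL-related Python packages can be imported here."""
    names = ("mkl", "mkl_umath")  # "mkl" is provided by the mkl-service package
    return {name: find_spec(name) is not None for name in names}

print(mkl_packages_present())
```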

One can turn errors off by using mkl.vml_set_mode:

import mkl  # provided by the mkl-service package
saved = mkl.vml_set_mode(
    "ha",      # accuracy control
    "off",     # denormalized numbers handling
    "ignore",  # error mode control
)

See https://software.intel.com/en-us/mkl-developer-reference-c-vmlsetmode for more details

You can restore the VML behavior with

mkl.vml_set_mode(*saved)
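
The save-and-restore pattern above can be wrapped in a context manager so that the previous settings come back even if the enclosed code raises. This is a sketch under the assumption that mkl.vml_set_mode returns the previous settings as shown above; the name vml_errors_ignored is hypothetical, and on installs without mkl-service the block does nothing:

```python
from contextlib import contextmanager

@contextmanager
def vml_errors_ignored():
    """Temporarily set the VML error mode to "ignore"; yield True if applied."""
    try:
        import mkl  # provided by mkl-service; absent on non-MKL installs
    except ImportError:
        yield False  # nothing to change without mkl-service
        return
    # Same arguments as the snippet above: accuracy, denormals, error mode.
    saved = mkl.vml_set_mode("ha", "off", "ignore")
    try:
        yield True
    finally:
        mkl.vml_set_mode(*saved)  # restore the previous VML settings

with vml_errors_ignored() as applied:
    print("VML error mode changed:", applied)
```

Note that, like the original snippet, this also sets the accuracy and denormal handling to "ha" and "off" for the duration of the block.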

@mattip
Member

mattip commented Aug 20, 2019

NumPy specifically resets the error state when it is obvious (by emitting NaN) that divide-by-zero occurred. I think the MKL port should do the same inside the ufuncs without requiring global error mode control.

@mattip
Member

mattip commented Dec 2, 2020

Closing, as it seems this is a "won't fix" from the mkl port of NumPy. Please reopen if there is something actionable NumPy can do.

@mattip mattip closed this as completed Dec 2, 2020
@astroboylrx

> It is coming from use of MKL VML function vdDiv used to perform division.

Thanks a lot for explaining that. Relevant information is hard to find because so many web pages deal with genuine zero-division cases.

I ran into this issue as well, and it only occurs with NumPy built with MKL. I tried turning errors off with your code snippet, and it removed half of these errors.
@oleksandr-pavlyk @phil-blain I wonder what might be responsible for the other half of the "divide by zero encountered in true_divide" errors? Any thoughts or suggestions would be greatly appreciated!

@oleksandr-pavlyk
Contributor

Nowadays, the implementations of universal functions that use MKL VML have all been moved to the mkl_umath package.

The ufunc loops for certain types are registered with NumPy's universal functions via PyUFunc_RegisterLoopForType.

One can deregister these loops using mkl_umath.restore().
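
A guarded sketch of that call, assuming only that mkl_umath exposes restore() as described above; the wrapper name deregister_mkl_loops is hypothetical, and the function does nothing on installs without mkl_umath:

```python
def deregister_mkl_loops():
    """Call mkl_umath.restore() if mkl_umath is installed; return True if called."""
    try:
        import mkl_umath
    except ImportError:
        return False  # not an mkl_umath-enabled environment
    mkl_umath.restore()  # deregisters the MKL-backed ufunc loops, per the comment above
    return True

print("MKL loops deregistered:", deregister_mkl_loops())
```

After a successful call, rerunning the reproduction from the top of this issue should show whether the remaining errors still come from the MKL loops.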
