Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

mxnet.numpy and numpy differ regarding binary op result type with broadcast 0-dim array input #20898

Open
DickJC123 opened this issue Feb 17, 2022 · 4 comments

Comments

@DickJC123
Copy link
Contributor

DickJC123 commented Feb 17, 2022

Description

I discovered this issue when troubleshooting a unittest failure of ./tests/python/unittest/test_numpy_op.py::test_np_standard_binary_funcs. The command-line repro is given in the steps to reproduce section below. The test involves adding 2 operands, one with dtype float16, the other of dtype float64.

The standard numpy (onp below) has a somewhat strange behavior when one of the inputs is broadcast: if the broadcast input is a 0-dim array comprising a single scalar, then the output dtype will be the dtype of the other input, and the scalar is cast to that other input's precision. However, if the broadcast input is a 1-element array of shape (1,), then the output has the dtype of the most precise input, and the least precise input is cast-up to that precision.

MXNet's numpy does not match this behavior: in both cases the output is cast-up to the highest precision.

See the sequence below. The lines a_onp + b_onp and a_np + b_np match, showing the same behavior when broadcasting a 1-element array. However, the lines a2_onp + b_onp and a2_np + b_np do not match, showing a different behavior when broadcasting a 0-dim array:

$ python
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as onp
>>> a_onp = onp.array([-0.5456035234182584], dtype=onp.float64)              # dim-1 array
>>> b_onp = onp.array([0.54833984, 0], dtype=onp.float16)
>>> a_onp + b_onp
array([ 0.00273632, -0.54560352])
>>> type(a_onp + b_onp)
<class 'numpy.ndarray'>
>>> (a_onp + b_onp).dtype
dtype('float64')
>>> a2_onp = onp.array(-0.5456035234182584, dtype=onp.float64)               # dim-0 array
>>> a2_onp + b_onp
array([ 0.00293, -0.5454 ], dtype=float16)                                   # numpy 'ground truth' for issue
>>> import mxnet.numpy as np
>>> a_np = np.array([-0.5456035234182584], dtype=np.float64)                 # dim-1 array
>>> b_np = np.array([0.54833984, 0], dtype=np.float16)
>>> a_np + b_np
array([ 0.00273632, -0.54560352], dtype=float64)                             # mxnet.numpy matches numpy
>>> type(a_np + b_np)
<class 'mxnet.numpy.ndarray'>
>>> (a_np + b_np).dtype
dtype('float64')
>>> a2_np = np.array(-0.5456035234182584, dtype=np.float64)                  # dim-0 array
>>> a2_np + b_np
array([ 0.00273632, -0.54560352], dtype=float64)                             # mxnet.numpy differs from numpy 'ground truth'

Steps to reproduce

MXNET_TEST_SEED=1117974739 pytest --verbose -s tests/python/unittest/test_numpy_op.py::test_np_standard_binary_funcs[lshape6-rshape6-add-add-True-numeric-\<lambda\>-None--1.0-1.0]

Error Message

Partial output:

E       AssertionError:
E       Items are not equal:
E       Error 1.495529 exceeds tolerance rtol=1.000000e-02, atol=1.000000e-04 (mismatch 16.666667%).
E       Location of maximum error: (0, 0), a=0.00273632, b=0.00292969
E        ACTUAL: array([[ 0.00273632,  0.04570507, -0.57712818],
E              [-0.14863087, -1.35322071, -0.93403126]])
E        DESIRED: array([[ 0.00292969,  0.04589844, -0.57714844],
E              [-0.1484375 , -1.35351562, -0.93359375]])

Environment

Ubuntu 20.04. Recent master build. Failure of cpu test, so GPU config not relevant.

@szha
Copy link
Member

szha commented Feb 21, 2022

In general we want to conform to the array API standard. While fp16 is not in the array standard yet, I tried the same with ints and the same discrepancy exists.

In [17]: onp.array(-2, dtype=onp.int64) + onp.array([2, 0], dtype=onp.int16)
Out[17]: array([ 0, -2], dtype=int16)

In [18]: onp.array([-2], dtype=onp.int64) + onp.array([2, 0], dtype=onp.int16)
Out[18]: array([ 0, -2])

In [19]: np.array(-2, dtype=onp.int64) + np.array([2, 0], dtype=onp.int16)
Out[19]: array([ 0, -2], dtype=int64)

In [20]: np.array([-2], dtype=onp.int64) + np.array([2, 0], dtype=onp.int16)
Out[20]: array([ 0, -2], dtype=int64)

According to the type promotion rules in array API standard, mxnet's current behavior is desired.

Type promotion rules must apply when determining the common result type for two array operands during an arithmetic operation, regardless of array dimension. Accordingly, zero-dimensional arrays must be subject to the same type promotion rules as dimensional arrays.

@rgommers do you expect this to become a problem for API standard adoption on the numpy side?

@rgommers
Copy link

@rgommers do you expect this to become a problem for API standard adoption on the numpy side?

Type promotion in NumPy has a lot of weird corner cases. Right now there's an effort to get rid of those, including value-based casting. A draft NEP (Numpy Enhancement Proposal) was just posted: https://mail.python.org/archives/list/numpy-discussion@python.org/thread/RWOZYX2LN66T243C6II2BCXRWCPM3CV2/.

Type casting issues are the most important reason for why we added array API support as a separate namespace (numpy.array_api), even though we initially aimed for the main namespace. I'm hoping we can resolve those, but with the deprecation policy that will take at least 2 more releases (and perhaps more, because it is hairy), so adding array API standard support to the main numpy namespace is still a long-term goal but not an easy fix.

Many of the issues like the 0-D array special-casing brought up here continue to yield bug reports in NumPy itself and downstream libraries, and are design problems that shouldn't be emulated but needed fixing already before we designed the array API standard.

@rgommers
Copy link

@rgommers do you expect this to become a problem for API standard adoption on the numpy side?

To clarify, let me also say: numpy.array_api is 100% complete and compliant already, CuPy has a matching namespace (cupy.array_api) and there's work ongoing on scikit-learn and SciPy to use that array API standard support. So I'd say that NumPy has already adopted the standard.

For MXNet I think it makes sense to not emulate the separate array_api module, but do it in the main namespace. The differences are (almost) all either naming things or design warts in NumPy. PyTorch is also adding support in its main namespace.

@szha
Copy link
Member

szha commented Mar 2, 2022

@DickJC123 to sum up this should be a bug in numpy and they need to fix the type promotion. We can use the array API namespace for comparison in tests.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants