[numpy] array ufunc and array function protocols #16097

reminisce · 2019-09-05T05:05:33Z

Description

This PR implemented NumPy's array ufunc and array function protocols so that mxnet.numpy.ndarray can be fed into a subset of official NumPy operators and get dispatched to the corresponding MXNet operators for execution. See the following for examples.

import os

import numpy as onp  # the official NumPy's version must be >= 1.17
from mxnet import np as mx_np

a = mx_np.random.uniform(size=(1, 3))
b = mx_np.random.uniform(size=(2, 1))
assert type(onp.concatenate([a, a])) == mx_np.ndarray
assert type(onp.sum(a)) == mx_np.ndarray

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

Feature1, tests, (and when applicable, API doc)
Feature2, tests, (and when applicable, API doc)

Comments

If this change is a backward incompatible change, why must this change be made.
Interesting edge cases to note here

szha · 2019-09-05T16:50:33Z

python/mxnet/array_function_protocol.py

+    return decorator
+
+
+_NUMPY_ARRAY_FUNCTION_LIST = [


is this the list of operators where our operator definition complies with the official numpy behavior?

Yes, it is the operators in our codebase that is dispatchable through the array function protocol. Others will be dispatched through either fluent methods or array ufunc protocol, which will be implemented later.

Awesome. We can use this list to track the full-compatibility with numpy operators once we start measuring it.

reminisce · 2019-09-07T02:27:16Z

@szha This PR is ready for merge if there are no more comments. Thanks.

szha · 2019-09-10T06:58:23Z

LGTM. @leezu would you mind taking a look too?

leezu

Thanks @reminisce ! I noticed some issues, consider the following example

[ins] In [1]: import mxnet as mx
impor
[ins] In [2]: import numpy as np

[ins] In [3]: a1 = mx.np.arange(10)

[ins] In [4]: np.dot(a1.asnumpy(),a1.asnumpy())
Out[4]: 285.0

[ins] In [5]: mx.np.dot(a1,a1)
Out[5]: array(285.)

[ins] In [6]: np.dot(a1,a1)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-be002a006ef6> in <module>
----> 1 np.dot(a1,a1)

ValueError: dot: too many dimensions in result

[ins] In [7]:

Would it make sense to fix it in this PR? My biggest concern is that this was not caught by the tests, so there may be more issues that we are not yet aware of. Could the tests be improved?
It's also OK to do in a follow-up PR, but if you can integrate it in this PR it's of course best.

leezu · 2019-09-10T09:28:14Z

python/mxnet/numpy_dispatch_protocol.py

+        if cur_np_ver >= np_1_17_ver:
+            try:
+                func(*args, **kwargs)
+            except:


Why is it necessary to catch all possible exceptions here and to overwrite their error message with a generic 'Running ... failed'?

Good point. I will append the generic message to the real exception.

reminisce · 2019-09-10T19:34:08Z

@leezu Thanks for the review. I tried the commands you showed, but it works for me. Could you confirm that you are using NumPy with version >= 1.17?

>>> from mxnet import npx, np
>>> a = np.arange(10)
>>> npx.set_np()
>>> a = np.arange(10)
>>> import numpy as onp
>>> print(onp.__version__)
1.17.1
>>> onp.dot(a, a)
array(285.)
>>> type(onp.dot(a, a))
<class 'mxnet.numpy.ndarray'>

reminisce · 2019-09-10T20:41:37Z

@leezu Re: Can test be improved? Yes, absolutely. This PR is for building the skeleton of testing suite. It enables mxnet.numpy operator developers to easily add test cases from the official NumPy package to MXNet. OpArgMngr in test_numpy_interoperatibility.py is for storing those test cases. We will send in those test cases in the follow-up PRs.

leezu · 2019-09-11T11:01:12Z

@leezu Could you confirm that you are using NumPy with version >= 1.17?

You're right, I was using numpy 1.16 and it ran into the well-known issue (I guess there is some github issue tracking it, but not sure) that numpy doesn't correctly convert ndarrays to numpy arrays.

Would it make sense to support Numpy 1.16 when NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1 environment variables is defined? If not, how about bumping the required minimum numpy version in setup.py to 1.17?

https://github.com/apache/incubator-mxnet/blob/e87995db6f8ac7e8d4424001e1db5296398562f6/python/setup.py#L33

reminisce · 2019-09-11T20:01:37Z

@leezu I prefer to bump the required version to 1.17 to use this feature for two reasons. However, I'm not sure if we should bump the numpy version requirement in this or a separate PR. Maybe @szha can provide some insights.

This functionality was claimed as experimental in 1.16. There might be bugs that we are not aware of, and consequently would have to deal with unexpected behaviors. I tried using numpy 1.16 during the development. However, there is some issue with NumPy correctly dispatching calls to MXNet even when the environment variable is set (I printed the environ value and verified it's 1) using nosetests. In 1.17, there is no such problem.
Asking users to set the environment variable manually to enable this functionality may open the door to unknown issues, especially under multi-threaded circumstances (the one I saw may be related to this). To close the possibilities of using it incorrectly and make the user experience straightforward without surprises, we can stick with numpy versions >= 1.17.

szha · 2019-09-12T07:24:35Z

The downside of upgrading is the potential conflict of the numpy dependency version between mxnet’s and other packages’. Shall we just produce proper error message for older numpy versions instead?

reminisce · 2019-09-12T07:42:41Z

I will add proper error message when it is used with numpy 1.16 versions. Thanks.

leezu · 2019-09-12T08:36:27Z

@szha this assumes that people can't upgrade to numpy 1.17 if they are using 1.16, even though there are no breaking changes https://docs.scipy.org/doc/numpy/release.html

If we properly declare the 1.17 dependency, users that install mxnet via pip will automatically get numpy 1.17 and everything will work. If we don't bump the dependency, users have to read through error messages to understand they need to upgrade..

Add test Fix Fix sanity Fix build failure Use array_function protocol only when np.version >= 1.17 Add unit tests for array ufunc protocol Fix pylint Reformat Fix build Fix Refactor test suite for numpy interoperability

szha · 2019-09-18T17:07:35Z

@leezu I'm referring to the situation where another package on an end user's environment declares a dependency on numpy 1.16, in which case the control of upgrading numpy isn't in the hands of the end user.

leezu · 2019-09-18T17:31:06Z

This seems quite hypothetical / uncommon to me. I'm not convinced if we need to take consideration of this, given the user experience impact of not declaring the correct numpy version dependency and our current strategy of throwing runtime errors or having numpy throw errors such as ValueError: dot: too many dimensions in result (which is not very helpful for users).
I don't want to block this PR though.

After all we have bumped numpy dependency for 1.5 release as well.. 1e6a1ab

reminisce · 2019-09-18T18:16:57Z

@leezu @szha How about we merge this PR first and later consider if it's necessary to bump the numpy version to 1.17?

szha · 2019-09-18T19:40:41Z

I'm OK with bumping the version and merging the PR. The error message during installation is sufficiently clear.

reminisce added the Numpy label Sep 5, 2019

reminisce requested a review from szha as a code owner September 5, 2019 05:05

reminisce added this to In progress in numpy via automation Sep 5, 2019

szha reviewed Sep 5, 2019

View reviewed changes

reminisce changed the title ~~[numpy] __array_function__ protocol~~ [numpy] array ufunc and array function protocols Sep 5, 2019

reminisce changed the title ~~[numpy] array ufunc and array function protocols~~ [numpy][DO NOT MERGE] array ufunc and array function protocols Sep 6, 2019

reminisce changed the title ~~[numpy][DO NOT MERGE] array ufunc and array function protocols~~ [numpy] array ufunc and array function protocols Sep 6, 2019

leezu reviewed Sep 10, 2019

View reviewed changes

Add __array_function__

f3548b2

Add test Fix Fix sanity Fix build failure Use array_function protocol only when np.version >= 1.17 Add unit tests for array ufunc protocol Fix pylint Reformat Fix build Fix Refactor test suite for numpy interoperability

reminisce force-pushed the implement_array_function_protocol branch from 32efa10 to f3548b2 Compare September 17, 2019 19:58

Improved error mesages

39ba49d

reminisce force-pushed the implement_array_function_protocol branch from e72720d to 39ba49d Compare September 18, 2019 04:25

Fix np.choice

b5296f4

szha merged commit 477e6f7 into apache:master Sep 18, 2019

numpy automation moved this from In progress to Done Sep 18, 2019

reminisce mentioned this pull request Sep 19, 2019

Bump numpy version >=1.17.0 #16207

Closed

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[numpy] array ufunc and array function protocols #16097

[numpy] array ufunc and array function protocols #16097

reminisce commented Sep 5, 2019 •

edited

szha Sep 5, 2019

reminisce Sep 5, 2019

szha Sep 5, 2019

reminisce commented Sep 7, 2019

szha commented Sep 10, 2019

leezu left a comment •

edited

leezu Sep 10, 2019

reminisce Sep 10, 2019

reminisce commented Sep 10, 2019

reminisce commented Sep 10, 2019 •

edited

leezu commented Sep 11, 2019 •

edited

reminisce commented Sep 11, 2019

szha commented Sep 12, 2019

reminisce commented Sep 12, 2019

leezu commented Sep 12, 2019 •

edited

szha commented Sep 18, 2019

leezu commented Sep 18, 2019 •

edited

reminisce commented Sep 18, 2019 •

edited

szha commented Sep 18, 2019

[numpy] array ufunc and array function protocols #16097

[numpy] array ufunc and array function protocols #16097

Conversation

reminisce commented Sep 5, 2019 • edited

Description

Checklist

Essentials

Changes

Comments

szha Sep 5, 2019

Choose a reason for hiding this comment

reminisce Sep 5, 2019

Choose a reason for hiding this comment

szha Sep 5, 2019

Choose a reason for hiding this comment

reminisce commented Sep 7, 2019

szha commented Sep 10, 2019

leezu left a comment • edited

Choose a reason for hiding this comment

leezu Sep 10, 2019

Choose a reason for hiding this comment

reminisce Sep 10, 2019

Choose a reason for hiding this comment

reminisce commented Sep 10, 2019

reminisce commented Sep 10, 2019 • edited

leezu commented Sep 11, 2019 • edited

reminisce commented Sep 11, 2019

szha commented Sep 12, 2019

reminisce commented Sep 12, 2019

leezu commented Sep 12, 2019 • edited

szha commented Sep 18, 2019

leezu commented Sep 18, 2019 • edited

reminisce commented Sep 18, 2019 • edited

szha commented Sep 18, 2019

reminisce commented Sep 5, 2019 •

edited

leezu left a comment •

edited

reminisce commented Sep 10, 2019 •

edited

leezu commented Sep 11, 2019 •

edited

leezu commented Sep 12, 2019 •

edited

leezu commented Sep 18, 2019 •

edited

reminisce commented Sep 18, 2019 •

edited