DOC: update NEP 31 for vendoring and being complementary to NEP-13/18 #3

Merged
merged 2 commits into from Aug 24, 2019
70 changes: 49 additions & 21 deletions doc/neps/nep-0031-uarray.rst
@@ -6,7 +6,7 @@ NEP 31 — Context-local and global overrides of the NumPy API
:Author: Ralf Gommers <rgommers@quansight.com>
:Status: Draft
:Type: Standards Track
:Created: 2019-07-31
:Created: 2019-08-22


Abstract
@@ -16,19 +16,27 @@ This NEP proposes to make all of NumPy's public API overridable via a backend
mechanism, using a library called ``uarray`` `[1]`_

``uarray`` provides global and context-local overrides, as well as a dispatch
mechanism similar to NEP-18 `[2]`_. This NEP proposes to supersede NEP-18,
and is intended as a comprehensive resolution to NEP-22 `[3]`_.
mechanism similar to NEP-18 `[2]`_. First experiences with ``__array_function__``
show that it is necessary to be able to override NumPy functions that
*do not take an array-like argument*, and hence aren't overridable via
``__array_function__``. The most pressing need is array creation and coercion
functions - see e.g. NEP-30 `[9]`_.

This NEP proposes to allow, in an opt-in fashion, overriding any part of the NumPy API.
It is intended as a comprehensive resolution to NEP-22 `[3]`_, and obviates the need to
add an ever-growing list of new protocols for each new type of function or object that needs
to become overridable.

Motivation and Scope
--------------------

The motivation behind this library is manifold: First, there have been several attempts to allow
The motivation behind ``uarray`` is manifold: First, there have been several attempts to allow
dispatch of parts of the NumPy API, including (most prominently), the ``__array_ufunc__`` protocol
in NEP-13 `[4]`_, and the ``__array_function__`` protocol in NEP-18 `[2]`_, but this has shown the
need for further protocols to be developed, including a protocol for coercion. `[5]`_. The reasons
need for further protocols to be developed, including a protocol for coercion (see `[5]`_). The reasons
these overrides are needed have been extensively discussed in the references, and this NEP will not
attempt to go into the details of why these are needed. Another pain point requiring yet another
protocol is the duck-array protocol. `[9]`_
protocol is the duck-array protocol (see `[9]`_).

This NEP takes a more holistic approach: It assumes that there are parts of the API that need to be
overridable, and that these will grow over time. It provides a general framework and a mechanism to
@@ -39,41 +47,51 @@ functions that can be easily expressed in terms of others, as well as a reposito
that help in the implementation of duck-arrays that most duck-arrays would require.

The third is the existence of actual, third party dtype packages, and
their desire to blend into the NumPy ecosystem. `[6]`_. This is a separate
their desire to blend into the NumPy ecosystem (see `[6]`_). This is a separate
issue compared to the C-level dtype redesign proposed in `[7]`_; it's about
allowing third-party dtype implementations to work with NumPy, much like third-party array
implementations.

This NEP proposes the following: That ``unumpy`` `[8]`_ becomes the recommended override mechanism
for the NumPy API.
for the parts of the NumPy API not yet covered by ``__array_function__`` or ``__array_ufunc__``,
and that ``uarray`` is vendored into a new namespace within NumPy to give users and downstream dependencies
access to these overrides. This vendoring mechanism is similar to what SciPy decided to do for
making ``scipy.fft`` overridable (see `[10]`_).


Detailed description
--------------------

This section will not attempt to explain the specifics or the mechanism of ``uarray``,
that is explained in the ``uarray`` documentation. `[1]`_ However, the NumPy community
will have input into the design of ``uarray``, and any backward-incompatible changes
will be discussed on the mailing list.
_Note that this section will not attempt to explain the specifics or the mechanism of ``uarray``,_
_that is explained in the ``uarray`` documentation. `[1]`_ However, the NumPy community_
_will have input into the design of ``uarray``, and any backward-incompatible changes_
_will be discussed on the mailing list._

The way we propose the overrides will be used by end users is::

    from numpy import unumpy
    TODO

The first goal of this NEP is as follows: To complete an overridable version of NumPy,
called ``unumpy`` `[8]`_, the implementation of which is already underway. Again, ``unumpy``
will not be explained here, the reader should refer to its documentation for this purpose.
And a library that implements a NumPy-like API will use it like::

    TODO: example corresponding to NEP 30 `duckarray`
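
Until the examples above are filled in, the general idea of global and context-local
overrides can be sketched in plain Python. This is only a toy illustration and is
**not** the ``uarray`` or ``unumpy`` API: the names ``set_global_backend``,
``set_backend``, ``multimethod`` and ``TupleBackend`` are hypothetical and chosen for
this sketch. Note that ``ones`` takes no array-like argument, which is the kind of
function that ``__array_function__`` cannot dispatch on.

```python
import contextlib
import contextvars

# Toy stand-in for a backend registry. This is NOT the uarray API; every name
# below is hypothetical and exists only for this illustration.
_global_backend = None
_local_backends = contextvars.ContextVar("local_backends", default=())


def set_global_backend(backend):
    """Install a process-wide fallback backend (or None to clear it)."""
    global _global_backend
    _global_backend = backend


@contextlib.contextmanager
def set_backend(backend):
    """Prefer `backend` inside a `with` block only (context-local override)."""
    token = _local_backends.set((backend,) + _local_backends.get())
    try:
        yield
    finally:
        _local_backends.reset(token)


def multimethod(default):
    """Make `default` overridable by whichever backend is currently active."""
    def dispatcher(*args, **kwargs):
        # Context-local backends are tried first, then the global one.
        candidates = list(_local_backends.get())
        if _global_backend is not None:
            candidates.append(_global_backend)
        for backend in candidates:
            impl = getattr(backend, default.__name__, None)
            if impl is not None:
                return impl(*args, **kwargs)
        return default(*args, **kwargs)
    dispatcher.__name__ = default.__name__
    return dispatcher


@multimethod
def ones(n):
    # Default implementation: a plain Python list.
    return [1] * n


class TupleBackend:
    """A hypothetical duck-array library that returns tuples instead."""
    @staticmethod
    def ones(n):
        return (1,) * n


with set_backend(TupleBackend):
    inside = ones(3)   # dispatched to TupleBackend.ones
outside = ones(3)      # no backend active: falls back to the default
```

The override is opt-in in both directions: the function must be declared overridable,
and the user must activate a backend before anything changes.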

The only change this NEP proposes at its acceptance, is to make ``unumpy`` the officially recommended
way to override NumPy. ``unumpy`` will remain a separate repository/package, and will be developed
way to override NumPy. ``unumpy`` will remain a separate repository/package (which we propose to vendor
to avoid a hard dependency, and use the separate ``unumpy`` package only if it is installed,
rather than depend on it for the time being), and will be developed

Owner Author

How about a soft dependency, like in scipy.fft?

Collaborator

That will probably be fine, just thinking about if we should put that in here. The key thing is vendoring. Whether to optionally override the vendored version with an installed version to be able to ship bug fixes quicker is a small detail.

Note also that it's not a normal soft dependency - that typically means that not installed == no functionality.

I'd actually leave it out, since it's not important at this point.
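
For reference, the "use the installed version if present, otherwise the vendored one" behaviour discussed in this thread might look roughly like the sketch below. The helper name `import_with_fallback` is hypothetical, and where a vendored copy would live inside NumPy is not specified here.

```python
import importlib


def import_with_fallback(preferred, fallback):
    """Return the `preferred` module if it is installed, else `fallback`."""
    try:
        return importlib.import_module(preferred)
    except ImportError:
        # Preferred package is not installed: use the bundled copy instead.
        return importlib.import_module(fallback)
```

Unlike a typical soft dependency, a missing preferred package does not disable any functionality; it only changes which copy gets imported.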

Collaborator

> Note also that it's not a normal soft dependency - that typically means that not installed == no functionality.

Is there an accepted term for this? I was just looking for a word between hard and optional.

Either way, the "optionally overriding" behaviour is very important in order for the vendored version to play nicely with other libraries. For example, if I have a decorator @implements(unumpy.sum) such as in NEP 18, then my backend won't work at all if my unumpy.sum is not the same as the user's unumpy.sum. This is exactly the problem that we get through normal vendoring.
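
A minimal sketch of that mismatch, assuming a NEP 18 style registry keyed by the multimethod object. `make_sum`, `implements`, `dispatch` and the two `*_sum` objects are hypothetical stand-ins, not actual `unumpy` code:

```python
# Two independent copies of the "same" multimethod, standing in for a vendored
# unumpy.sum and a separately installed unumpy.sum (hypothetical stand-ins).
def make_sum():
    def sum(values):
        raise NotImplementedError
    return sum


vendored_sum = make_sum()
installed_sum = make_sum()

IMPLEMENTATIONS = {}


def implements(multimethod):
    """Register an implementation keyed by the multimethod object itself."""
    def decorator(func):
        IMPLEMENTATIONS[multimethod] = func
        return func
    return decorator


@implements(installed_sum)  # the backend author imported the installed copy
def backend_sum(values):
    total = 0
    for v in values:
        total += v
    return total


def dispatch(multimethod, *args):
    """Look up the backend implementation for exactly this multimethod object."""
    impl = IMPLEMENTATIONS.get(multimethod)
    if impl is None:
        raise TypeError("no backend implements this multimethod")
    return impl(*args)
```

Here `dispatch(installed_sum, [1, 2, 3])` finds `backend_sum`, while `dispatch(vendored_sum, [1, 2, 3])` raises `TypeError`, because the two objects share a name but not an identity.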

Collaborator

> Is there an accepted term for this?

Not that I know of - I think it requires a one sentence description.

> then my backend won't work at all if my unumpy.sum is not the same as the user's unumpy.sum

that's a good point. isn't an issue for scipy.fft (right?) but is tricky here. I think the soft dependency solves the issue, as long as other libraries don't do the same. If they either import from the numpy vendored package, or from an installed unumpy directly that should both work.

Collaborator

> isn't an issue for scipy.fft (right?)

Yes, this is only an issue for numpy because unumpy is the package defining the multimethods. In scipy.fft we own the multimethods ourselves so these cannot be mismatched, even if the uarray libraries were.

> I think the soft dependency solves the issue

Completely agree, I just wanted to point out that shipping bug fixes faster is not the only motivation.

primarily with the input of duck-array authors and secondarily, custom dtype authors, via the usual
GitHub workflow. There are a few reasons for this:

* Faster iteration in the case of bugs or issues.
* Faster design changes, in the case of needed functionality.
* Lower likelihood to be stuck with a bad design decision.
* ``unumpy`` will work with older versions of NumPy as well.
* The user and library author opt in to the override process,
  rather than breakages happening when it is least expected.
  In simple terms, bugs in ``unumpy`` mean that ``numpy`` remains
  unaffected.
* The upgrade pathway to NumPy 2.0 becomes simpler, requiring just
  a backend change, and allowing both to exist side-by-side.

**FIXME: this section doesn't match the proposal in the abstract and motivation anymore.**

Owner Author

@rgommers Can you expand on this a little bit?

> obviates the need to add an ever-growing list of new protocols for each new type of function or object that needs to become overridable.

> It assumes that there are parts of the API that need to be overridable, and that these will grow over time. It provides a general framework and a mechanism to [...]

So it's consistent, even though it explains other use-cases. Would you like for this section to be removed? Personally, I'd at least like to mention the possibility of having such a system, and for unumpy as an alternative to other overrides.

Collaborator

> Personally, I'd at least like to mention the possibility of having such a system, and for unumpy as an alternative to other overrides.

Yes, I think it's good to mention that.

> Can you expand on this a little bit?

If the whole thing is opt-in and kept in a separate namespace, as in the abstract and motivation sections, then a lot of this reads strangely. The examples don't work as np.<funcname>, because that assumes you've already completely taken over the whole namespace. It's clearer when you add the full example, so I'll comment on your new PR.


Once maturity is achieved, ``unumpy`` will be moved into the NumPy organization,
and NumPy will become the reference implementation for ``unumpy``.
@@ -150,8 +168,13 @@ There are no backward incompatible changes proposed in this NEP.
Alternatives
------------

The current alternative to this problem, already implemented, is a
combination of NEP-18 and NEP-13.
The current alternative to this problem is NEP-30 plus adding more protocols
(not yet specified) in addition to it. Even then, some parts of the NumPy
API will remain non-overridable, so it's a partial alternative.

The main alternative to vendoring ``unumpy`` is to simply move it into NumPy
completely and not distribute it as a separate package. This would also achieve
the proposed goals; however, we prefer to keep it a separate package for now.


Discussion
@@ -204,6 +227,11 @@ References and Footnotes

[9] NEP 30 — Duck Typing for NumPy Arrays - Implementation: https://www.numpy.org/neps/nep-0030-duck-array-protocol.html

.. _[10]:

[10] http://scipy.github.io/devdocs/fft.html#backend-control


Copyright
---------
