Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement most ufunc attributes and ufunc.reduce #9123

Merged
merged 20 commits into from Sep 27, 2023

Conversation

guilhermeleobas
Copy link
Collaborator

@guilhermeleobas guilhermeleobas commented Aug 7, 2023

This PR exposes most ufunc attributes and implements ufunc.reduce for ufuncs created with @vectorize

  • ufunc.nin
  • ufunc.nout
  • ufunc.nargs
  • ufunc.ntypes
  • ufunc.types
  • ufunc.identity
  • ufunc.signature (always None)

Implementing ufunc.ntypes or ufunc.types is not possible as Numba needs to call the NumPy ufunc attribute at runtime. However, this is not possible as NumPy doesn't allow creating weak references to an ufunc, which is required by types.BoundFunction.


To-do:

  • Check how Cython implements ufunc.reduce
  • Raise an exception if ufunc doesn't have an identity.

Reduce is implemented for the following set of arguments:

def reduce(array, axis=0, dtype=None, initial=None):
    ...

keeping #1269 (comment) as a reference here.


Code I used to benchmark:

from numba import vectorize, njit
import numpy as np
import time

@vectorize(identity=0)
def add(a, b):
    return a + b


@njit
def bar(a, axis):
    return add.reduce(a, axis=axis)


def measure_time(fn, *args):
    start = time.time()
    fn(*args)
    end = time.time()
    print(f'{fn.__name__}: {end - start}s')


N = 1_000_000_000
T = (10,) * int(np.log10(N))
a = np.arange(N).reshape(T)
axis = tuple(range(a.ndim))

# jit compile `bar`
bar(a, axis)

# measure time
measure_time(bar, a, axis)
measure_time(bar.py_func, a, axis)

@guilhermeleobas
Copy link
Collaborator Author

@kc611, could you please take a look at this PR once you have a chance? I'd like to discuss possible ways to speedup ufunc.reduce

@gmarkall gmarkall requested a review from kc611 August 8, 2023 14:24
@guilhermeleobas guilhermeleobas marked this pull request as ready for review August 15, 2023 12:35
Copy link
Contributor

@mattip mattip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. It would be nice if numba could do reduce ufuncs.

# But this fails as it cannot a weak reference to an ufunc due to NumPy
# not setting the "tp_weaklistoffset" field. See:
# https://github.com/numpy/numpy/blob/7fc72776b972bfbfdb909e4b15feb0308cf8adba/numpy/core/src/umath/ufunc_object.c#L6968-L6983 # noqa: E501

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does NumPy need to explicitly set the field? From reading the documentation I would think it inherits from the object type.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When attempting to create a weak reference to an ufunc, CPython raises an error:

>>> import weakref
>>> import numpy as np
>>> weakref.ref(np.minimum)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create weak reference to 'numpy.ufunc' object

numba/np/ufunc/dufunc.py Show resolved Hide resolved
return s

@intrinsic
def compute_flat_idx__(typingctx, strides, itemsize, idx, axis):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: The compute_flat_idx__ is not being used anywhere. Is that intended?

And is it overall faster or slower than compute_flat_idx?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compute_flat_idx__ implements the same code as compute_flat_idx but it is just a bit switch table where each case is computed by hand beforehand. In short, it looks something like this:

def compute_flat_idx(axis, ...):
    if axis == 0:
        # compute flat idx when axis == 0
    else axis == 1:
        # compute flat idx when axis == 1
    ...

And is it overall faster or slower than compute_flat_idx?

Faster

numba/np/ufunc/dufunc.py Show resolved Hide resolved
# https://github.com/numpy/numpy/blob/7fc72776b972bfbfdb909e4b15feb0308cf8adba/numpy/core/src/umath/ufunc_object.c#L6968-L6983 # noqa: E501

at = types.Function(template)
attributes = ('nin', 'nout', 'nargs', # 'ntypes', # 'types',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The incapability of accessing types can be opened as a separate issue, to be fixed once there is a solution upstream. Otherwise if we don't expect a solution in future, might be a good idea to remove the commented code.

numba/tests/npyufunc/test_dufunc.py Outdated Show resolved Hide resolved
numba/tests/npyufunc/test_dufunc.py Show resolved Hide resolved
numba/np/ufunc/dufunc.py Show resolved Hide resolved
numba/tests/npyufunc/test_dufunc.py Outdated Show resolved Hide resolved
numba/np/ufunc/dufunc.py Show resolved Hide resolved
numba/np/ufunc/dufunc.py Show resolved Hide resolved
numba/np/ufunc/dufunc.py Show resolved Hide resolved
@kc611
Copy link
Collaborator

kc611 commented Sep 11, 2023

@guilhermeleobas Thanks for the patch. This PR mostly looks good. Some nitpicks and questions above before we can mark this as RTM.

Edit: This also needs a documentation entry that support for ufunc.reduce on DUFunc objects is now available.

@guilhermeleobas
Copy link
Collaborator Author

CI failure doesn't seem to be related to this PR.

@guilhermeleobas guilhermeleobas added 4 - Waiting on reviewer Waiting for reviewer to respond to author and removed 3 - Ready for Review labels Sep 15, 2023
Copy link
Collaborator

@kc611 kc611 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@gmarkall
Copy link
Member

gpuci run tests

@gmarkall
Copy link
Member

@kc611 Is this good to be RTM (assuming it passes the gpuCI run I just started)?

@kc611
Copy link
Collaborator

kc611 commented Sep 18, 2023

I think it's good to go. (Unless someone objects)

@gmarkall
Copy link
Member

gpuCI failed one CUDA version for some kind of Conda error. I expected if this was going to break CUDA, it'd break for all configurations, so I'm happy to ignore that.

@gmarkall gmarkall added 5 - Ready to merge Review and testing done, is ready to merge and removed 4 - Waiting on reviewer Waiting for reviewer to respond to author labels Sep 18, 2023
@guilhermeleobas
Copy link
Collaborator Author

guilhermeleobas commented Sep 18, 2023

Thanks for the review, folks. I'll create the issues to track future work:

  • Rewrite ufunc.reduce to not use np.ndenumerate
  • Remove experimental feature warning in 2(?) Numba versions?
  • Support ufunc.ntypes and ufunc.types attributes

Also, if it is ok, I'd like another commit that updates documentation and adds a test. Can do that in a second PR.

@sklam sklam added this to the 0.59.0-rc1 milestone Sep 27, 2023
@sklam sklam merged commit 0c04c45 into numba:main Sep 27, 2023
24 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to merge Review and testing done, is ready to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants