Description
I would like for there to be an easy way to convert a numpy.ndarray -> numpy.ndarray Python function into a generalized universal function (gufunc).
Motivation
In many cases we have a Python function that operates on a multi-dimensional array and produces another multi-dimensional array. As an example, consider the Scikit-Image function, skimage.feature.canny, which consumes and produces a 2d NumPy array.
import numpy as np
import skimage.feature
raw_image = np.random.random((1000, 1000))
processed_image = skimage.feature.canny(raw_image)
If I have a stack of these images, I would like the same function to broadcast across the extra dimensions.
raw_stack = np.random.random((10, 1000, 1000))
processed_stack = skimage.feature.canny(raw_stack)  # this doesn't work today
This logic could be placed inside each function, but it would be nice to implement it once and then decorate functions in the future:
@numpy.guvectorize(signature="(n, m) -> (n, m)")  # or something like this
def canny(...):
    ...
GUFuncs
The behavior that I want here is exactly the behavior of GUFuncs, but currently they're hard to construct, except in C (through the NumPy C API) or with Numba.
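As a point of comparison, `np.vectorize` already accepts a `signature` argument and loops in Python over the broadcast dimensions. The sketch below uses it on a cheap stand-in function; `center_image` is a hypothetical placeholder for something like `skimage.feature.canny`, not part of any library. Whether this covers everything a dedicated decorator should do is a separate question.

```python
import numpy as np


def center_image(image):
    """Hypothetical stand-in for a costly 2-D -> 2-D function."""
    return image - image.mean()


# Wrap the 2-D function so that leading dimensions are looped over in Python.
center_stack = np.vectorize(center_image, signature="(n,m)->(n,m)")

raw_stack = np.random.random((10, 50, 60))
processed_stack = center_stack(raw_stack)
print(processed_stack.shape)  # (10, 50, 60)
```

Each 2-D slice is processed independently, so `processed_stack[i]` matches `center_image(raw_stack[i])`.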
In many cases a pure Python decorator that simply uses for loops would be welcome here. The performance hit of looping in Python is relatively low because the function that we're looping over is fairly costly (often hundreds of milliseconds per call). This is the sort of decorator that a project like Scikit-Image (cc @jni and @stefanv) or ITK (cc @thewtex) would probably be happy to use.
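A minimal sketch of such a for-loop decorator, assuming a single array input and a single array output whose shape is the same for every slice. The names `loop_over_leading_dims` and `center_image` are hypothetical and only for illustration. This is not a proposed NumPy API, and it does not handle empty leading dimensions or multiple inputs.

```python
import functools

import numpy as np


def loop_over_leading_dims(core_ndim):
    """Hypothetical decorator: apply a function to every trailing
    core_ndim-dimensional slice of its first argument."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(array, *args, **kwargs):
            array = np.asarray(array)
            if array.ndim < core_ndim:
                raise ValueError(
                    f"expected at least {core_ndim} dimensions, got {array.ndim}"
                )
            # Everything before the core dimensions is looped over.
            loop_shape = array.shape[: array.ndim - core_ndim]
            # np.ndindex walks the loop dimensions in C order; with no loop
            # dimensions it yields a single empty index, so a plain 2-D
            # input still works.
            results = [
                func(array[index], *args, **kwargs)
                for index in np.ndindex(*loop_shape)
            ]
            stacked = np.stack(results)
            # Restore the loop dimensions in front of the per-slice output shape.
            return stacked.reshape(loop_shape + stacked.shape[1:])

        return wrapper

    return decorator


@loop_over_leading_dims(core_ndim=2)
def center_image(image):
    """Hypothetical stand-in for a costly 2-D -> 2-D function."""
    return image - image.mean()


raw_stack = np.random.random((4, 3, 20, 30))
processed_stack = center_image(raw_stack)
```

The Python-level loop adds a small fixed cost per slice, which is negligible when each call takes hundreds of milliseconds.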
From a Dask perspective this would be great because we can handle gufuncs nicely: we rechunk user-provided Dask arrays so that all core dimensions are single-chunked, and then apply the function automatically and lazily across the broadcast dimensions.
This conversation came out of the SciPy conference, where @seberg and @mattip discussed it.