
GPU support proposal #150

Closed · wants to merge 13 commits

GenevieveBuckley (Collaborator, author) commented Aug 3, 2020

Here's the code showing how we might implement GPU support for the convolve function. Relevant issue: #133

Two questions:

  1. If we use this pattern, how should I update the docstring wrapper? (Do I just decorate the dispatcher instance, or a method thereof?)
  2. Is there a free tier including GPU support available to us for CI? Ideally we'd want to have tests for cupy arrays too.

You can try this PR out like this:

pip uninstall dask_image
pip install git+https://github.com/GenevieveBuckley/dask-image.git@dispatch

import numpy as np
import cupy as cp
import dask.array as da
from dask_image.ndfilters import convolve

s = (10, 10)

# convolve with numpy
a = da.from_array(np.arange(int(np.prod(s)), dtype=np.float32).reshape(s), chunks=5)
w = np.ones(a.ndim * (3,), dtype=np.float32)
result = convolve(a, w)
result.compute()

# convolve with cupy
a = da.from_array(cp.arange(int(np.prod(s)), dtype=cp.float32).reshape(s), chunks=5)
w = cp.ones(a.ndim * (3,), dtype=cp.float32)
result = convolve(a, w)
result.compute()

GenevieveBuckley (Collaborator, author) commented Aug 3, 2020

Tests for Python 3.6 are failing because the way we look up whether a dask array is backed by numpy or cupy did not exist in dask version 0.16.1.

I'll try to figure out when that was introduced.

EDITED TO ADD: _meta was introduced in dask/dask#4543. I've updated our Python 3.6 CI environment to have the same dependencies as our Python 3.7 environment, and that seems to have fixed it. That in itself is a bit confusing, since dask=1.1.1 seems to have been released several months before dask/dask#4543, but it works, so ¯\_(ツ)_/¯
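For anyone following along, the lookup boils down to something like this (a quick sketch; the exact repr of the cupy type depends on the cupy version installed):

import numpy as np
import cupy as cp
import dask.array as da

a_np = da.from_array(np.ones((10, 10), dtype=np.float32), chunks=5)
a_cp = da.from_array(cp.ones((10, 10), dtype=cp.float32), chunks=5)

print(type(a_np._meta))  # <class 'numpy.ndarray'>
print(type(a_cp._meta))  # cupy's ndarray class (shown as cupy.core.core.ndarray on older cupy)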

GenevieveBuckley (Collaborator, author):

The coveralls report is a very good argument for separating out the dispatcher function registrations from the rest of the code. Your 100% coverage is a great motivator to keep things in good shape (I confess I do need that pressure/motivation!)

* Remove reference to closed dask issue 6442

* Dispatch internally from convolve function

* Call must include an argument

* Restore *args, **kwargs to function signatures

* Yes, you do need to import both cupy and cupyx

* Dispatch from within the convolve function

* Fix up imports

* PEP8 linting tidy up
@@ -17,7 +17,7 @@ def convolve(image,
     depth, boundary = _utils._get_depth_boundary(image.ndim, depth, "none")
 
     result = image.map_overlap(
-        scipy.ndimage.filters.convolve,
+        convolve_dispatch(image),
GenevieveBuckley (Collaborator, author):

This is a little bit weird: we need to pass an argument to the dispatcher (so it can check the array type), but it returns a bare function.
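For context, here's roughly what that pattern looks like (a simplified sketch, not the exact code in dask_image/utils/_dispatcher.py; the class and function names are illustrative):

import numpy as np
import scipy.ndimage
from dask.utils import Dispatch

class ArrayDispatch(Dispatch):
    # Dispatch on the type of the dask array's ._meta, i.e. the backing array type.
    def __call__(self, arg, *args, **kwargs):
        meth = self.dispatch(type(arg._meta))
        return meth(arg, *args, **kwargs)

convolve_dispatch = ArrayDispatch(name="convolve_dispatch")

@convolve_dispatch.register(np.ndarray)
def numpy_convolve(*args, **kwargs):
    return scipy.ndimage.filters.convolve  # returns the bare function, not a result

# and then in dask_image.ndfilters.convolve:
# result = image.map_overlap(convolve_dispatch(image), ...)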

GenevieveBuckley (Collaborator, author):

It does seem a bit simpler overall

jakirkham (Member) commented Aug 6, 2020:

That's a good point. Made a couple suggestions below that may help. Though please let me know if you already tried them and ran into issues.

jakirkham (Member):

We may also need to make this change as well. Sorry for missing that before.

Suggested change:
-        convolve_dispatch(image),
+        convolve_dispatch,

GenevieveBuckley (Collaborator, author):

> We may also need to make this change as well. Sorry for missing that before.

This makes more sense to me now, thanks!

jakirkham (Member) left a comment:

Thanks Genevieve! 😄

This looks good. Made a couple minor suggestions below to hopefully simplify the usage pattern in map_overlap calls. Otherwise LGTM

Am curious if you have had a chance to try using this as well? 🙂


@convolve_dispatch.register(np.ndarray)
def numpy_convolve(*args, **kwargs):
    return scipy.ndimage.filters.convolve
jakirkham (Member):

Suggested change:
-    return scipy.ndimage.filters.convolve
+    return scipy.ndimage.filters.convolve(*args, **kwargs)

GenevieveBuckley (Collaborator, author):

Is this really what we want here? I had thought the bare function itself was what we wanted returned, and that's what gets put in as the first argument to map_overlap. I didn't try it this way, though.

jakirkham (Member):

That's typically how it's used, e.g. with sizeof.

Since Dispatch instances (like convolve_dispatch) should behave like functions, I think this should work here. Please let me know if that's not the case, though.
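Concretely, something along these lines (a rough sketch, not the exact code in this PR): each registered function calls the backend directly, so the Dispatch instance itself behaves like the convolution function, and the custom __call__ that inspects ._meta isn't needed because dispatch happens on the type of each concrete chunk.

import numpy as np
import cupy
import scipy.ndimage
import cupyx.scipy.ndimage
from dask.utils import Dispatch

convolve_dispatch = Dispatch(name="convolve_dispatch")

@convolve_dispatch.register(np.ndarray)
def numpy_convolve(*args, **kwargs):
    return scipy.ndimage.filters.convolve(*args, **kwargs)

@convolve_dispatch.register(cupy.ndarray)
def cupy_convolve(*args, **kwargs):
    return cupyx.scipy.ndimage.filters.convolve(*args, **kwargs)

# The dispatcher itself is then the function applied to each chunk:
# result = image.map_overlap(convolve_dispatch, ...)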

GenevieveBuckley (Collaborator, author) commented Aug 6, 2020:

I think that's because sizeof isn't used with map_overlap or similar, so you need to pass the arguments into the function for it to work. Whereas here we have image.map_overlap(func, *args, **kwargs), and I am trying to replace func with the dispatch object (which should return func as a bare function object that is ingested by map_overlap).

I might just be confused about what you're trying to say here though

EDIT: I saw your comment above, I think I understand what you mean now. I'll give that a go and report back


@convolve_dispatch.register(cupy.ndarray)
def cupy_convolve(*args, **kwargs):
    return cupyx.scipy.ndimage.filters.convolve
jakirkham (Member):

Suggested change:
-    return cupyx.scipy.ndimage.filters.convolve
+    return cupyx.scipy.ndimage.filters.convolve(*args, **kwargs)

Comment on lines 9 to 14:

    def __call__(self, arg, *args, **kwargs):
        """
        Call the corresponding method based on type of dask array.
        """
        meth = self.dispatch(type(arg._meta))
        return meth(arg, *args, **kwargs)
jakirkham (Member):

I think with the changes below we can drop this.

GenevieveBuckley (Collaborator, author):

I'm not sure I understand what you mean by dropping this.
The important part here is that we use type(arg._meta) to dispatch according to the type of arrays contained in the dask array. So that part matters, but other than that I don't mind what we do.


GenevieveBuckley (Collaborator, author):

I've only just noticed this, but it seems that when we pass in cupy-backed dask arrays and dispatch to cupyx.scipy, what we get back is a dask array backed by numpy (not cupy). This seems unintuitive to me; is that what's supposed to happen here? I'm not exactly sure at which point of the computation things get turned back into numpy arrays.

Here's a gist that shows what I'm describing: https://gist.github.com/GenevieveBuckley/332a50020ce35baf3cea763e95d12bd8

Did you see anything like this when you did your initial exploration @jakirkham?

GenevieveBuckley (Collaborator, author):

Oh hang on, I see this did happen to you too, and you've added a line r._meta = cp.empty(r.ndim * (0,), dtype=r.dtype) in your code here to overwrite the reported metadata. Further digging reveals the chunks in our output dask array are actually cupy (so that's a relief).

Maybe the best way to handle this overall is to add a chunktype kwarg to map_blocks, similar to the dtype kwarg? Alternatively, we could handle it internally within dask-image the way you do in your script (if we didn't want to pin to a much higher minimum dask version, or wait on a new dask release if we made a PR there).
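Roughly, the two options look like this (a sketch; a and w are the cupy-backed arrays from the example at the top of this thread, and chunktype= is only a proposal, it doesn't exist in dask):

import cupy as cp
from dask_image.ndfilters import convolve

# Option A (what your script does): compute as usual, then overwrite the reported metadata
result = convolve(a, w)
result._meta = cp.empty(result.ndim * (0,), dtype=result.dtype)

# Option B (hypothetical): let the caller declare the chunk type up front,
# analogous to the dtype kwarg, e.g. something like
# result = image.map_blocks(func, dtype=image.dtype, chunktype=cp.ndarray)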

GenevieveBuckley (Collaborator, author) commented Aug 7, 2020

Update: there is a meta kwarg you can use in map_overlap, but if I use meta=image._meta then that seems to break our dispatching (throws up a TypeError: No dispatch for <class 'cupy.core.core.ndarray'>)

... but we only get this type error in dispatching if the weights array is an ordinary cupy array, not a dask array containing cupy chunks.

... and if I do use two dask arrays backed by cupy (for image and weights), even though we don't get the TypeError, it still reports a numpy backed array as the result. Ok, this is enough live debugging for now, I'll pick this up later.
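(For reference, the meta= attempt looks roughly like this, inside dask_image.ndfilters.convolve — a sketch, not the exact call in the branch:)

result = image.map_overlap(
    convolve_dispatch,
    depth=depth,
    boundary=boundary,
    dtype=image.dtype,
    meta=image._meta,
    weights=weights,
)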

GenevieveBuckley (Collaborator, author):

Updates:

  1. The issue with meta= is now fixed in the dask master branch, courtesy of Peter ("Add missing meta= for trim in map_overlap", dask#6494)
  2. I'm having some trouble with cupy + register_lazy for dispatch. I haven't distilled this down to a minimal example yet though, still not entirely sure where the problem is creeping in.

jakirkham (Member):

cc @grlee77 (who may be interested in this as well 😉)

jakirkham (Member):

Also thanks for the updates here Genevieve. Sorry for the slow reply. Glad Peter's work was helpful. Happy to work through CuPy dispatch issues (even non-minimal ones) if you need another set of eyes 🙂

GenevieveBuckley (Collaborator, author) commented Aug 17, 2020

Thanks @jakirkham I appreciate your willingness to help track down the weirdness. (Apologies for my own slow response, I was on leave from work last week. Now I'm back, I'd love to use the upcoming conference as motivation to get this in and done by the end of August if possible - although I understand if your time is much more limited here).

Here is a non-minimal example of the problem with lazy cupy dispatching. You will need a python environment with:

  • dask=2.23.0 (the latest release)
  • numpy
  • cupy

Then you can run the code example from my first comment in this thread with dask-image installed from each of these two branches in turn:

pip install git+https://github.com/GenevieveBuckley/dask-image.git@working-cupy-non-lazy-dispatch

This branch works! (But it requires cupy to be installed by the user, which is very inconvenient)

pip install git+https://github.com/GenevieveBuckley/dask-image.git@failing-cupy-lazy-dispatch

This branch does not work :(

There's only a single commit of difference between these branches, where I turn the lazy cupy dispatch into a non-lazy dispatch. All of the code related to this PR can be found in just two files: dask_image/ndfilters/_conv.py and dask_image/utils/_dispatcher.py

GenevieveBuckley (Collaborator, author):

Ok, I understand what's going on now. Because map_overlap delays computation, the lazy cupy dispatch function isn't registered at all until the first time you try to run compute() - by which time it's too late to actually use it for the computation. If you try the example above and run result.compute() twice in a row, the second time it works. (You can also take a look at dispatch_convolve._lookup before and after each compute call to see this directly.)

We didn't see this problem earlier because previously I had image.map_overlap(dispatch_convolve(image), ...), which executes dispatch_convolve at graph-construction time and registers the cupy function straight away.

For this reason I think we need to return to the pattern we had earlier, where the dispatcher returns a bare function object. (I am open to other solutions if you wanted to suggest something different). Once you're happy with it, we can roll it out to the rest of the dask-image library.
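To illustrate the timing problem (a rough sketch, simplified from dask_image/utils/_dispatcher.py):

from dask.utils import Dispatch

convolve_dispatch = Dispatch(name="convolve_dispatch")

@convolve_dispatch.register_lazy("cupy")
def register_cupy():
    # This only runs the first time the dispatcher is actually asked to dispatch on a cupy type.
    import cupy
    import cupyx.scipy.ndimage

    @convolve_dispatch.register(cupy.ndarray)
    def cupy_convolve(*args, **kwargs):
        return cupyx.scipy.ndimage.filters.convolve(*args, **kwargs)

# With image.map_overlap(convolve_dispatch, ...) the dispatcher isn't called until the
# chunks are computed, so the lazy registration above doesn't fire until the first
# .compute(). With the earlier pattern, image.map_overlap(convolve_dispatch(image), ...),
# the dispatcher runs eagerly at graph-construction time (dispatching on
# type(image._meta), as shown further up), which registers the cupy function straight away.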

* Remove reference to closed dask issue 6442

* John's suggestions for streamlining dispatch

* Use meta kwarg to fix array chunktype repr

* This shouldn't make a difference, but checking just in case

* Fix up dispatch lazy registration, and output array metadata

* Non-lazy cupy dispatcher

* Lazy cupy dispatching again

* Propagate array type metadata through map_overlap

* Try being more specific about the cupy array type

* Remove commented trial & error

* flake8 - remove unused import

* Add docstrings for dispatch helper functions

* Bump dask minimum version to 2.8.1 in CircleCI

* Bump minimum dask version to 2.8.1 for all CI services
GenevieveBuckley (Collaborator, author):

Speaking of rolling it out, I've started another branch in #151 to make those broader changes to the rest of the library. Both ndfourier and ndmeasure have some numpy-specific array creation code hiding in there, so if you have any suggestions for handling those areas I'd love to hear your thoughts.

GenevieveBuckley changed the title from "GPU support" to "GPU support proposal" on Aug 26, 2020
GenevieveBuckley (Collaborator, author):

This has served its purpose as a concrete example to discuss the implementation of #133. So, I'll close this PR now in favour of #151 and future work.
