GPU support proposal #150
Conversation
Tests for Python 3.6 are failing because the way we look up whether it's a numpy or cupy dask array did not exist in dask version 0.16.1. I'll try to figure out when that was introduced. EDITED TO ADD: …
The coveralls report is a very good argument for separating the dispatcher function registrations from the rest of the code. Your 100% coverage is a great motivator to keep things in good shape (I confess I do need that pressure/motivation!)
* Remove reference to closed dask issue 6442
* Dispatch internally from convolve function
* Call must include an argument
* Restore *args, **kwargs to function signatures
* Yes, you do need to import both cupy and cupyx
* Dispatch from within the convolve function
* Fix up imports
* PEP8 linting tidy up
dask_image/ndfilters/_conv.py (Outdated)

```diff
@@ -17,7 +17,7 @@ def convolve(image,
     depth, boundary = _utils._get_depth_boundary(image.ndim, depth, "none")

     result = image.map_overlap(
-        scipy.ndimage.filters.convolve,
+        convolve_dispatch(image),
```
This is a little bit weird, because we need to pass an argument to the dispatcher (so it can check the array type), but it returns a bare function.
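For context, here is a minimal sketch of the pattern being discussed, reconstructed from the snippets in this thread (assuming `Dispatch` is dask's `dask.utils.Dispatch`; the `ArrayDispatch` subclass name is illustrative):

```python
import dask.array as da
import numpy as np
import scipy.ndimage
from dask.utils import Dispatch


class ArrayDispatch(Dispatch):
    """Dispatch on the type of the chunks backing a dask array."""

    def __call__(self, arg, *args, **kwargs):
        meth = self.dispatch(type(arg._meta))
        return meth(arg, *args, **kwargs)


convolve_dispatch = ArrayDispatch(name="convolve_dispatch")


@convolve_dispatch.register(np.ndarray)
def numpy_convolve(*args, **kwargs):
    # Returns the bare function rather than calling it; map_overlap
    # then applies that function to every chunk.
    return scipy.ndimage.filters.convolve


image = da.ones((100, 100), chunks=(50, 50))

# convolve_dispatch(image) consumes the image only to inspect its chunk
# type, and evaluates to scipy.ndimage.filters.convolve itself:
func = convolve_dispatch(image)
```

So the argument exists purely for the type check, which is the asymmetry being pointed out here.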
It does seem a bit simpler overall.
That's a good point. Made a couple suggestions below that may help. Though please let me know if you already tried them and ran into issues.
We may also need to make this change as well. Sorry for missing that before.
```diff
-        convolve_dispatch(image),
+        convolve_dispatch,
```
> We may also need to make this change as well. Sorry for missing that before.

This makes more sense to me now, thanks!
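Putting the suggestions from this review together (pass the `Dispatch` object itself, shown above, and have the registered functions actually call the filter, suggested further below), the simplified pattern looks roughly like this. It is a sketch, not the final code:

```python
import dask.array as da
import numpy as np
import scipy.ndimage
from dask.utils import Dispatch

convolve_dispatch = Dispatch(name="convolve_dispatch")


@convolve_dispatch.register(np.ndarray)
def numpy_convolve(*args, **kwargs):
    # Calls the filter directly instead of returning it.
    return scipy.ndimage.filters.convolve(*args, **kwargs)


image = da.ones((100, 100), chunks=(50, 50))
weights = np.ones((3, 3)) / 9

# The Dispatch object is called once per chunk, dispatching on the
# concrete chunk type (numpy or cupy), so no custom __call__ is needed:
result = image.map_overlap(
    convolve_dispatch,
    depth=1,
    weights=weights,
)
```

Note the thread later returns to the bare-function pattern, for reasons discussed further below.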
Thanks Genevieve! 😄

This looks good. Made a couple minor suggestions below to hopefully simplify the usage pattern in `map_overlap` calls. Otherwise LGTM.

Am curious if you have had a chance to try using this as well? 🙂
```python
@convolve_dispatch.register(np.ndarray)
def numpy_convolve(*args, **kwargs):
    return scipy.ndimage.filters.convolve
```
```diff
-    return scipy.ndimage.filters.convolve
+    return scipy.ndimage.filters.convolve(*args, **kwargs)
```
Is this really what we want here? I had thought the bare function itself was what we wanted returned, and that's what gets put in as the first argument to `map_overlap`. I didn't try it this way, though.
That's typically how it is used, like with `sizeof`. As `Dispatch` instances (like `convolve_dispatch`) should behave like a function, I think this should work here. Though please let me know if that is not the case.
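For reference, `dask.sizeof.sizeof` is such a `Dispatch` instance, and it is called directly on the object being measured:

```python
import numpy as np
from dask.sizeof import sizeof  # a dask.utils.Dispatch instance

# Calling the instance dispatches on type(arg) and invokes the
# registered implementation immediately:
nbytes = sizeof(np.ones(1000))
```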
I think that's because `sizeof` isn't using `map_overlap` or similar, so you need to pass the arguments into the function for it to work. Whereas here we have `image.map_overlap(func, *args, **kwargs)`, and I am trying to replace `func` with the dispatch object (which should return `func` as a bare function object that is ingested into `map_overlap`).

I might just be confused about what you're trying to say here though.

EDIT: I saw your comment above; I think I understand what you mean now. I'll give that a go and report back.
```python
@convolve_dispatch.register(cupy.ndarray)
def cupy_convolve(*args, **kwargs):
    return cupyx.scipy.ndimage.filters.convolve
```
```diff
-    return cupyx.scipy.ndimage.filters.convolve
+    return cupyx.scipy.ndimage.filters.convolve(*args, **kwargs)
```
dask_image/utils/_dispatcher.py (Outdated)

```python
def __call__(self, arg, *args, **kwargs):
    """
    Call the corresponding method based on type of dask array.
    """
    meth = self.dispatch(type(arg._meta))
    return meth(arg, *args, **kwargs)
```
I think with the changes below we can drop this.
I'm not sure I understand what you mean by dropping this. The important part here is that we use `type(arg._meta)` to dispatch according to the type of arrays contained in the dask array. So that part matters, but other than that I don't mind what we do.
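To make the `_meta` point concrete (the cupy half is commented out since it needs a GPU):

```python
import dask.array as da

x = da.ones((100, 100), chunks=(50, 50))
type(x)        # dask.array.core.Array -- no good for choosing a backend
type(x._meta)  # numpy.ndarray -- the type of the underlying chunks

# With cupy-backed chunks, the meta type changes accordingly:
# import cupy
# y = x.map_blocks(cupy.asarray)
# type(y._meta)  # cupy.ndarray
```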
I've only just noticed this, but it seems that when we pass in cupy-backed dask arrays and dispatch to cupyx.scipy, what we get back is a dask array backed by numpy (not cupy). This seems unintuitive to me; is that what's supposed to happen here? I'm not exactly sure at which point of the computation things get turned back into numpy arrays. Here's a gist that shows what I'm describing: https://gist.github.com/GenevieveBuckley/332a50020ce35baf3cea763e95d12bd8 Did you see anything like this when you did your initial exploration @jakirkham?
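Judging by the later commit messages in this thread ("Use meta kwarg to fix array chunktype repr"), one way to pin the output chunk type is to pass `meta` through `map_overlap`. A sketch assuming a GPU environment, not the exact fix from the PR:

```python
import cupy
import dask.array as da

x = da.ones((512, 512), chunks=(128, 128)).map_blocks(cupy.asarray)

# `meta` declares the output chunk type explicitly, so the result
# reports cupy-backed chunks instead of falling back to numpy:
y = x.map_overlap(
    lambda block: block + 1,
    depth=1,
    meta=cupy.empty((0, 0), dtype=x.dtype),
)
```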
Oh hang on, I see this did happen to you and you've added a line …

Maybe the best way to handle this overall is to add a chunktype kwarg to …
Update: there is a type error … but we only get this type error in dispatching if the weights array is an ordinary cupy array, not a dask array containing cupy chunks. … And if I do use two dask arrays backed by cupy (for …)
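For reference, the two kinds of weights inputs being contrasted above look like this (illustrating the distinction only, not a fix; GPU required):

```python
import cupy
import dask.array as da

weights_plain = cupy.ones((3, 3))  # an ordinary cupy array

# a dask array containing cupy chunks:
weights_dask = da.ones((3, 3), chunks=(3, 3)).map_blocks(cupy.asarray)

type(weights_plain)       # cupy.ndarray
type(weights_dask._meta)  # cupy.ndarray, while type(weights_dask) is dask.array.Array
```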
Updates: …
cc @grlee77 (who may be interested in this as well 😉)
Also, thanks for the updates here, Genevieve. Sorry for the slow reply. Glad Peter's work was helpful. Happy to work through CuPy dispatch issues (even non-minimal ones) if you need another set of eyes 🙂
Thanks @jakirkham, I appreciate your willingness to help track down the weirdness. (Apologies for my own slow response, I was on leave from work last week. Now I'm back, I'd love to use the upcoming conference as motivation to get this in and done by the end of August if possible, although I understand if your time is much more limited here.)

Here is a non-minimal example of the problem with lazy cupy dispatching. You will need a python environment with: …
Then you can run the code example from my first comment in this thread with dask-image installed from each of these two branches in turn:
This branch works! (But it requires cupy to be installed by the user, which is very inconvenient)
This branch does not work :( There's only a single commit difference between these branches, where I turn the lazy cupy dispatch into a non-lazy dispatch. All of the code related to this PR can be found in just two files: `dask_image/ndfilters/_conv.py` and `dask_image/utils/_dispatcher.py`
Ok, I understand what's going on now. Because … We didn't see this problem earlier, because before I had …

For this reason I think we need to return to the pattern we had earlier, where the dispatcher returns a bare function object. (I am open to other solutions if you want to suggest something different.) Once you're happy with it, we can roll it out to the rest of the library.
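A sketch of the bare-function pattern combined with dask's lazy registration (`Dispatch.register_lazy`), which defers importing cupy until a cupy-backed array actually shows up. The names follow the snippets above, but treat this as an approximation of the branch rather than the branch itself:

```python
import numpy as np
import scipy.ndimage
from dask.utils import Dispatch

convolve_dispatch = Dispatch(name="convolve_dispatch")


@convolve_dispatch.register(np.ndarray)
def numpy_convolve(*args, **kwargs):
    # Bare function object, handed to map_overlap unchanged.
    return scipy.ndimage.filters.convolve


@convolve_dispatch.register_lazy("cupy")
def register_cupy():
    # Runs only the first time a cupy.ndarray needs dispatching,
    # so cupy never becomes a hard dependency.
    import cupy
    import cupyx.scipy.ndimage.filters

    @convolve_dispatch.register(cupy.ndarray)
    def cupy_convolve(*args, **kwargs):
        return cupyx.scipy.ndimage.filters.convolve
```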
* Remove reference to closed dask issue 6442
* John's suggestions for streamlining dispatch
* Use meta kwarg to fix array chunktype repr
* This shouldn't make a difference, but checking just in case
* Fix up dispatch lazy registration, and output array metadata
* Non-lazy cupy dispatcher
* Lazy cupy dispatching again
* Propagate array type metadata through map_overlap
* Try being more specific about the cupy array type
* Remove commented trial & error
* flake8 - remove unused import
* Add docstrings for dispatch helper functions
* Bump dask minimum version to 2.8.1 in CircleCI
* Bump minimum dask version to 2.8.1 for all CI services
Speaking of rolling it out, I've started another branch here, #151, to make those broader changes to the rest of the library. Both ndfourier and ndmeasure have some numpy-specific array creation code hiding in there, so if you have any suggestions for handling those areas I'd love to hear your thoughts.
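One possible shape for those areas, sketched with a hypothetical `arange_dispatch` helper (everything here is illustrative, not code from #151): dispatch array creation on the chunk type of an existing input array.

```python
import numpy as np
from dask.utils import Dispatch

arange_dispatch = Dispatch(name="arange_dispatch")


@arange_dispatch.register(np.ndarray)
def numpy_arange(*args, **kwargs):
    return np.arange(*args, **kwargs)


@arange_dispatch.register_lazy("cupy")
def register_cupy_arange():
    import cupy

    @arange_dispatch.register(cupy.ndarray)
    def cupy_arange(*args, **kwargs):
        return cupy.arange(*args, **kwargs)


# Creation functions take numbers, not arrays, so the implementation is
# looked up explicitly from the chunk type of an existing dask array:
# grid = arange_dispatch.dispatch(type(image._meta))(image.shape[0])
```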
Here's the code showing how we might implement GPU support for the `convolve` function. Relevant issue: #133

Two questions: …
You can try this PR out like this:
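The original snippet wasn't captured in this extract; a typical way to try out a PR branch looks like the following, where the fork and branch names are placeholders rather than the actual ones:

```
pip install git+https://github.com/<fork>/dask-image.git@<branch>
```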