Fixes ravel failure on 1d arrays (#5229) #8418
Conversation
gpuci run tests
Many thanks for the PR! This does fix the issue reported in #5229, but I think it still leaves some difference in behaviour between NumPy and Numba, because NumPy returns a new array with the same data, rather than returning the same array. For example,
import numpy as np
from numba import cuda
x = np.arange(10)
dx = cuda.to_device(x)
rx = x.ravel()
rdx = dx.ravel()
print(f"host array is the ravelled host array: {x is rx}")
print(f"device array is the ravelled device array: {dx is rdx}")
host_data = x.__array_interface__['data'][0]
host_ravelled_data = rx.__array_interface__['data'][0]
same_host_data = host_data == host_ravelled_data
print(f"host ravelled data is host original_data: {same_host_data}")
device_data = dx.__cuda_array_interface__['data'][0]
device_ravelled_data = rdx.__cuda_array_interface__['data'][0]
same_device_data = device_data == device_ravelled_data
print(f"device ravelled data is device original_data: {same_device_data}")
prints
host array is the ravelled host array: False
device array is the ravelled device array: True
host ravelled data is host original_data: True
device ravelled data is device original_data: True
I think the real problem is here:
numba/numba/misc/dummyarray.py, line 386 in c619212:
return self
The dummy array's ravel is only returning itself, but it should also return a list of extents.
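To make the expected return shape concrete, here is a toy, self-contained sketch; the class and attribute names are made up for illustration and are not Numba's actual dummyarray implementation:

# Illustrative only: a toy "dummy array" whose ravel() returns both the array
# object and a list of contiguous extents, so callers can unpack the result.
class ToyDummyArray:
    def __init__(self, start, nbytes):
        self.start = start
        self.nbytes = nbytes

    def iter_contiguous_extent(self):
        # A contiguous 1-D array has a single extent covering all of its bytes.
        yield (self.start, self.start + self.nbytes)

    def ravel(self):
        # Returning only `self` here would break callers expecting a pair.
        return self, list(self.iter_contiguous_extent())

arr = ToyDummyArray(start=0, nbytes=80)
ravelled, extents = arr.ravel()   # tuple unpacking succeeds
print(extents)                    # [(0, 80)]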
Thanks for the comments. I never realized that a "view" was a shallow copy as opposed to sometimes returning the same array, but you are right. NumPy's ravel on a contiguous array returns an array which shares data with the original but is not the same array (different pointers to the same memory). For reference, here is their implementation. I'm not sure where to take it from here. Simply eliminating the following line makes your code "work" (in the sense that it returns a view and not the original array), but I don't know if it will make other tests break (I don't have a GPU at the moment to test), and I also don't know the minutiae of the NumPy implementation well enough to come up with an equivalent:
numba/numba/misc/dummyarray.py, line 386 in c619212
With that said, I'm happy to try refactoring dummyarray so as to not "discriminate" against 1D arrays.
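As a quick, self-contained illustration of that view behaviour (nothing from the PR, just standard NumPy):

import numpy as np

a = np.arange(10)              # contiguous 1-D array
r = a.ravel()
print(r is a)                  # False: ravel returns a new array object
print(np.shares_memory(a, r))  # True: it shares the original buffer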
Ah, sorry for not being more clear. In particular, for the 1-D case the dummy array's ravel currently returns only itself; I think the right direction to move things in here is to return the contiguous extents as well, along these lines:

diff --git a/numba/misc/dummyarray.py b/numba/misc/dummyarray.py
index 8debc87b3..de88e0bf6 100644
--- a/numba/misc/dummyarray.py
+++ b/numba/misc/dummyarray.py
@@ -383,7 +383,7 @@ class Array(object):
             raise ValueError('order not C|F|A')
         if self.ndim <= 1:
-            return self
+            return self, list(self.iter_contiguous_extent())
         elif (order in 'CA' and self.is_c_contig or
               order in 'FA' and self.is_f_contig):

However, that doesn't get us all the way there for matching NumPy's behaviour; for example:

import numpy as np
a = np.arange(10)
a_view = av = a[::2]
a_view_ravel = a_view.ravel()
print(f"a: {a}")
print(f"a_view: {a_view}")
print(f"a_view_ravel: {a_view_ravel}")
print(f"a_view_flags: {a_view.flags}")
a_view_data = a_view.__array_interface__['data'][0]
a_view_ravel_data = a_view_ravel.__array_interface__['data'][0]
print(f"a_view_data: {a_view_data}")
print(f"a_view_ravel_data: {a_view_ravel_data}") Here numba/numba/cuda/cudadrv/devicearray.py Lines 617 to 623 in c619212
My hunch is that the Numba implementation doesn't currently match this copy-on-ravel behaviour for non-contiguous arrays. From here we could either support ravel on non-contiguous arrays in this PR, or keep this PR focused on the contiguous case and leave non-contiguous support for a follow-up.
There might be other discrepancies between the Numba CUDA implementation and NumPy arrays (I have not looked in detail at the NumPy implementation you identified), but I think the above options move us a good way forward towards finishing this PR. How does the above seem? Would some further guidance be helpful?
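A compact way to see the same copy behaviour, as a standalone illustration rather than anything from the PR itself:

import numpy as np

a = np.arange(10)
a_view = a[::2]                                   # non-contiguous view
print(np.shares_memory(a_view, a_view.ravel()))   # False: NumPy copies the data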
Thank you for the explanation, it does make a lot of sense! I think for this PR we could focus on the second option. The first option (supporting ravel on non-contiguous arrays) might be more suitable for a separate PR which implements that functionality.
OK - that sounds like a good way to proceed - please let me know if I can provide more help in following up.
do not test stride on cudasim
gpuci run tests
Many thanks for the updates! I think this is now looking mostly good, with one question to resolve:

This has been achieved by calling Array.from_desc instead of self.from_desc in Array.ravel in numba/misc/dummyarray.py. By calling the classmethod from_desc with the base class instead of the instance, we ensure that it returns a new object instead of itself.
I don't think that's what's going on here - as far as I can tell, a classmethod always receives the class as an implicit first argument (although the docs are not completely explicit about this, it appears to read that way: https://docs.python.org/3/library/functions.html#classmethod). Notably, if I apply:
diff --git a/numba/misc/dummyarray.py b/numba/misc/dummyarray.py
index 3f3815eb4..7ccb4f314 100644
--- a/numba/misc/dummyarray.py
+++ b/numba/misc/dummyarray.py
@@ -386,8 +386,8 @@ class Array(object):
               or order in 'FA' and self.is_f_contig):
             newshape = (self.size,)
             newstrides = (self.itemsize,)
-            arr = Array.from_desc(self.extent.begin, newshape, newstrides,
-                                  self.itemsize)
+            arr = self.from_desc(self.extent.begin, newshape, newstrides,
+                                 self.itemsize)
             return arr, list(self.iter_contiguous_extent())
         else:
to undo the change, then all the tests still appear to pass - I think earlier on in the patch series they might not have passed because of the 1D special-casing that was in devicearray.py, but that has since been removed.
Can you try without this change please? Or if I'm misunderstanding what's happening (a completely plausible scenario 🙂) could you clarify what's happening for me please?
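For background, a minimal self-contained sketch of classmethod dispatch (illustrative only; it borrows the from_desc name but is not Numba's code):

class Base:
    @classmethod
    def from_desc(cls, desc):
        # `cls` is the class, whether the call is made via the class itself
        # or via an instance.
        obj = cls.__new__(cls)
        obj.desc = desc
        return obj

b = Base.from_desc("via the class")
c = b.from_desc("via an instance")
print(type(b) is Base and type(c) is Base)  # True: both calls build a new Base
print(b is c)                               # False: a new object either way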
You are right, that change is not necessary: calling via the class or via an instance is the same. I reverted the change.
gpuci run tests
@cako many thanks for the update! There's a slight indentation change to resolve (left over from earlier) and then I think this is good to go!
Co-authored-by: Graham Markall <535640+gmarkall@users.noreply.github.com>
Thanks for the fix! gpuci run tests
Whoops, sorry that I totally missed this! @esc could this have a buildfarm run please?
You got it!
This was all green. 🍏
As reported in #5229, calling ravel on a DeviceNDArray raises an error. The issue stems from Array in dummyarray.py returning the array itself instead of both an Array and an Element: https://github.com/numba/numba/blob/main/numba/misc/dummyarray.py#L385-L386

This causes the tuple unpacking in DeviceNDArray.ravel() to fail, as it expects an Array and an Element.

In this PR, a simple fix is issued in DeviceNDArray.ravel() by just returning self when ndim <= 1, which is what Array.ravel() does. This PR also adds a test for 1D ravels.
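For completeness, a small usage sketch of the behaviour this fix targets; it needs a CUDA-capable GPU (or the simulator via NUMBA_ENABLE_CUDASIM=1) to actually run:

import numpy as np
from numba import cuda

dx = cuda.to_device(np.arange(10))  # 1-D device array
rdx = dx.ravel()                    # previously failed for 1-D arrays (#5229)
print(rdx.copy_to_host())           # [0 1 2 3 4 5 6 7 8 9]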