
Fixes ravel failure on 1d arrays (#5229) #8418

Merged: 8 commits, Dec 5, 2022
Conversation

@cako (Contributor) commented Sep 3, 2022

As reported in #5229, calling ravel() on a DeviceNDArray raises an error. The issue stems from Array.ravel() in dummyarray.py returning the array itself instead of both an Array and an Element: https://github.com/numba/numba/blob/main/numba/misc/dummyarray.py#L385-L386
This causes the tuple unpacking in DeviceNDArray.ravel() to fail, as it expects an Array and an Element.

In this PR, a simple fix is applied in DeviceNDArray.ravel() by just returning self when ndim <= 1, which is what Array.ravel() does. This PR also adds a test for 1D ravel.
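
A minimal reproducer sketch of the failure, assuming a CUDA-capable environment (not copied verbatim from #5229):

import numpy as np
from numba import cuda

dx = cuda.to_device(np.arange(10))
# Before this PR, the next line raised an error: the dummy array's ravel()
# returned only itself, so the expected tuple unpacking failed.
dx.ravel()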

@cako cako requested a review from gmarkall as a code owner September 3, 2022 05:25
@cako cako changed the title fixes #5229 fixes ravel failure on 1d arrays (#5229) Sep 3, 2022
@cako cako changed the title fixes ravel failure on 1d arrays (#5229) Fixes ravel failure on 1d arrays (#5229) Sep 3, 2022
@stuartarchibald stuartarchibald added CUDA CUDA related issue/PR 3 - Ready for Review labels Sep 5, 2022
@gmarkall (Member) commented Sep 6, 2022

Many thanks for the PR!

gpuci run tests

@gmarkall gmarkall self-assigned this Sep 6, 2022
@gmarkall gmarkall added this to the Numba 0.57 RC milestone Sep 6, 2022
@gmarkall (Member) left a comment

Many thanks for the PR! This does fix the issue reported in #5229, but I think it still leaves some difference in behaviour between NumPy and Numba, because NumPy returns a new array with the same data, rather than returning the same array. For example,

import numpy as np
from numba import cuda

x = np.arange(10)
dx = cuda.to_device(x)

rx = x.ravel()
rdx = dx.ravel()

print(f"host array is the ravelled host array: {x is rx}")
print(f"device array is the ravelled device array: {dx is rdx}")

host_data = x.__array_interface__['data'][0]
host_ravelled_data = rx.__array_interface__['data'][0]
same_host_data = host_data == host_ravelled_data
print(f"host ravelled data is host original_data: {same_host_data}")


device_data = dx.__cuda_array_interface__['data'][0]
device_ravelled_data = rdx.__cuda_array_interface__['data'][0]
same_device_data = device_data == device_ravelled_data
print(f"device ravelled data is device original_data: {same_device_data}")

prints

host array is the ravelled host array: False
device array is the ravelled device array: True
host ravelled data is host original_data: True
device ravelled data is device original_data: True

I think the real problem is here:

return self

The dummy array's ravel is only returning itself, but it should also return a list of extents.

@gmarkall gmarkall added 4 - Waiting on author Waiting for author to respond to review and removed 3 - Ready for Review labels Sep 8, 2022
@cako (Contributor, Author) commented Sep 9, 2022

Thanks for the comments. I never realized that a "view" is a shallow copy rather than sometimes being the same array, but you are right. NumPy's ravel on a contiguous array returns an array which shares data with the original but is not the same array (a different object pointing to the same memory). For reference, here is their implementation.
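
A quick NumPy-only sketch of that behaviour with a contiguous input:

import numpy as np

x = np.arange(10)
rx = x.ravel()
print(rx is x)                  # False: a new array object is returned
print(np.shares_memory(x, rx))  # True: it is a view over the same buffer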

I'm not sure where to take it from here. Simply eliminating the following line makes your code "work" (in the sense that it returns a view and not the array itself), but I don't know whether it will make other tests break (I don't have a GPU at the moment to test), and I also don't know all the minutiae of the NumPy implementation well enough to come up with an equivalent implementation.

return self

With that said, I'm happy to try refactoring dummyarray so as to not "discriminate" against 1D arrays.

@gmarkall (Member) commented Sep 9, 2022

Ah, sorry for not being clearer. dummyarray.Array is not really like a NumPy array, so it doesn't have the same API for its methods - its purpose and design is (I believe) to hold the bits of implementation needed for device arrays common to different targets - at one point it was used by both CUDA and ROCm. Now that ROCm has been deprecated and removed, only the CUDA target uses it.

In particular, for dummyarray.Array.ravel() the interface (not specified anywhere concrete, only implied by the implementation) is that it accepts the order parameter and returns a tuple of a dummy array (which may or may not be self) and a list of extents.

When the code was refactored into the CUDA Array / dummyarray.Array split, it looks like this interface was violated by dummyarray.Array.ravel() when ndim is 1: it returns only itself rather than a tuple (see 3dc6214), and there was never a specific test that hit this, so the error was never caught.
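
A runnable toy illustrating the implied contract (the class below is made up for illustration and is not Numba's dummyarray code):

class ToyDummyArray:
    def __init__(self, extent):
        self.extent = extent

    def iter_contiguous_extent(self):
        yield self.extent

    def ravel(self, order='C'):
        # The implied contract: always return (array, list of extents)
        # so that callers can tuple-unpack the result.
        return self, list(self.iter_contiguous_extent())

arr, extents = ToyDummyArray((0, 80)).ravel()
print(extents)  # [(0, 80)]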

I think the right direction to move things in here is to return self, and contiguous extents generated with iter_contiguous_extent() - a minimal change implementing this would look like:

diff --git a/numba/misc/dummyarray.py b/numba/misc/dummyarray.py
index 8debc87b3..de88e0bf6 100644
--- a/numba/misc/dummyarray.py
+++ b/numba/misc/dummyarray.py
@@ -383,7 +383,7 @@ class Array(object):
             raise ValueError('order not C|F|A')
 
         if self.ndim <= 1:
-            return self
+            return self, list(self.iter_contiguous_extent())
 
         elif (order in 'CA' and self.is_c_contig or
                           order in 'FA' and self.is_f_contig):

However, that doesn't get us all the way there for matching NumPy's behaviour for ravel() - the exception message (NotImplementedError("ravel on non-contiguous array")) is now a bit misleading and gives us a clue as to what else could be wrong - it seems you can ravel a non-contiguous 1D array. E.g.:

import numpy as np

a = np.arange(10)
a_view = av = a[::2]
a_view_ravel = a_view.ravel()

print(f"a: {a}")
print(f"a_view: {a_view}")
print(f"a_view_ravel: {a_view_ravel}")

print(f"a_view_flags: {a_view.flags}")

a_view_data = a_view.__array_interface__['data'][0]
a_view_ravel_data = a_view_ravel.__array_interface__['data'][0]
print(f"a_view_data: {a_view_data}")
print(f"a_view_ravel_data: {a_view_ravel_data}")

Here a_view is non-contiguous and 1D, but it can be ravelled - when you do, a new array with its own copy of the data is returned, not a shallow copy (as can be observed from the different data pointers). However, the CUDA target doesn't support this:

if extents == [self._dummy.extent]:
    return cls(shape=newarr.shape, strides=newarr.strides,
               dtype=self.dtype, gpu_data=self.gpu_data,
               stream=stream)
else:
    raise NotImplementedError("operation requires copying")

My hunch is that iter_contiguous_extent() will return some extents different from the extents of self._dummy, so it should fall into the else branch.

From here we could either:

  • Support ravel() of 1D non-contiguous arrays by implementing a copy of the data (plausible to implement, and it would bring the API closer to NumPy's, but it is more work than the alternative)
  • Or, not support ravelling 1D non-contiguous arrays, add a test for this failing case (a sketch of such a test follows below), and add a note at an appropriate place in the docs to state that ravel() is supported, but only for contiguous data (if it doesn't say that already).
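
A hedged sketch of the kind of test the second option implies (the function name is hypothetical and this is not the test actually added in this PR):

import numpy as np
from numba import cuda

def test_ravel_noncontiguous_1d_raises():
    d_arr = cuda.to_device(np.arange(10))
    d_view = d_arr[::2]  # non-contiguous 1D device view
    try:
        d_view.ravel()
    except NotImplementedError:
        pass  # expected: copying is not implemented on the CUDA target
    else:
        raise AssertionError("expected NotImplementedError")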

There might be other discrepancies between the Numba CUDA implementation and NumPy arrays (I have not looked in detail at the NumPy implementation you identified) but I think the above options move us a good way forward towards finishing this PR.

How does the above seem? Would some further guidance be helpful?

@cako (Contributor, Author) commented Sep 10, 2022

Thank you for the explanation, it makes a lot of sense! I think for this PR we could focus on the second option. The first option (supporting ravel on non-contiguous arrays) might be more suitable for a PR which implements .flatten(), since this is what NumPy's .ravel() falls back on.

@gmarkall (Member)

I think for this PR we could focus on the second option

OK - that sounds good to proceed - please let me know if I can provide more help in following up.

do not test stride on cudasim
@cako (Contributor, Author) commented Nov 12, 2022

  1. DeviceNDArray.ravel() now works on 1D arrays exactly the same as it does on higher-dimensional arrays. This has been achieved by simply eliminating the special cases for 1D arrays in DeviceNDArray.ravel in numba/cuda/cudadrv/devicearray.py and Array.ravel in numba/misc/dummyarray.py.
  2. DeviceNDArray.ravel() now always returns a different object than its input; the returned array shares the same data as the original array (see the sketch below). This has been achieved by calling Array.from_desc instead of self.from_desc in Array.ravel in numba/misc/dummyarray.py. By calling the class method from_desc with the base class instead of the instance, we ensure that it returns a new object instead of itself. The data remains the same.
  3. The DeviceNDArray.ravel() docs are now explicit that it does not operate on non-contiguous arrays.
  4. Testing has been enhanced to ensure that the behaviour in items 1 and 2 is respected for arrays of any dimension. It also includes a check on trying to ravel strided (non-contiguous) arrays.
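
A small sketch of the behaviour described in items 1 and 2, assuming a CUDA-capable environment (illustrative only, not one of the tests added in this PR):

import numpy as np
from numba import cuda

dx = cuda.to_device(np.arange(10))
rdx = dx.ravel()
print(rdx is dx)  # False: a new DeviceNDArray object is returned
same_ptr = (dx.__cuda_array_interface__['data'][0]
            == rdx.__cuda_array_interface__['data'][0])
print(same_ptr)   # True: it refers to the same device memory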

@gmarkall (Member)

gpuci run tests

@gmarkall (Member) left a comment

Many thanks for the updates! I think this is now looking mostly good, with one question to resolve:

This has been achieved by calling Array.from_desc instead of self.from_desc in Array.ravel in numba/misc/dummyarray.py. By calling the class method from_desc with the base class instead of the instance, we ensure that it returns a new object instead of itself.

I don't think that's what's going on here - as far as I can tell, a classmethod always receives the class as an implicit first argument (although the docs are not completely explicit about this, it appears to read that way: https://docs.python.org/3/library/functions.html#classmethod). Notably, if I apply:

diff --git a/numba/misc/dummyarray.py b/numba/misc/dummyarray.py
index 3f3815eb4..7ccb4f314 100644
--- a/numba/misc/dummyarray.py
+++ b/numba/misc/dummyarray.py
@@ -386,8 +386,8 @@ class Array(object):
                 or order in 'FA' and self.is_f_contig):
             newshape = (self.size,)
             newstrides = (self.itemsize,)
-            arr = Array.from_desc(self.extent.begin, newshape, newstrides,
-                                  self.itemsize)
+            arr = self.from_desc(self.extent.begin, newshape, newstrides,
+                                 self.itemsize)
             return arr, list(self.iter_contiguous_extent())
 
         else:

to undo the change, then all the tests still appear to pass - I think earlier on in the patch series they might not have passed because of the 1D special-casing that was in devicearray.py, but that has since been removed.

Can you try without this change please? Or if I'm misunderstanding what's happening (a completely plausible scenario 🙂) could you clarify what's happening for me please?
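
A standalone check of that classmethod behaviour (plain Python, not Numba code):

class Base:
    @classmethod
    def which(cls):
        return cls

b = Base()
print(Base.which() is Base)  # True: called on the class
print(b.which() is Base)     # True: called on an instance, the class is
                             # still passed as the implicit first argument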

@cako (Contributor, Author) commented Nov 14, 2022

You are right, that change is not necessary; calling via the class or a class instance is the same. I have reverted the change.

@gmarkall (Member)

gpuci run tests

@gmarkall (Member) left a comment

@cako many thanks for the update! There's a slight indentation change to resolve (left over from earlier) and then I think this is good to go!

Co-authored-by: Graham Markall <535640+gmarkall@users.noreply.github.com>
@gmarkall (Member)

Thanks for the fix!

gpuci run tests

@gmarkall (Member) commented Dec 5, 2022

Whoops, sorry that I totally missed this! @esc could this have a buildfarm run please?

@gmarkall gmarkall added Pending BuildFarm For PRs that have been reviewed but pending a push through our buildfarm 4 - Waiting on CI Review etc done, waiting for CI to finish and removed 4 - Waiting on author Waiting for author to respond to review labels Dec 5, 2022
@esc (Member) commented Dec 5, 2022

Whoops, sorry that I totally missed this! @esc could this have a buildfarm run please?

You got it! Build numba_smoketest_cuda_yaml_181 has started

@esc (Member) commented Dec 5, 2022

You got it! Build numba_smoketest_cuda_yaml_181 has started

This was all green. 🍏

@gmarkall gmarkall added BuildFarm Passed For PRs that have been through the buildfarm and passed 5 - Ready to merge Review and testing done, is ready to merge and removed Pending BuildFarm For PRs that have been reviewed but pending a push through our buildfarm 4 - Waiting on CI Review etc done, waiting for CI to finish labels Dec 5, 2022
@gmarkall (Member) commented Dec 5, 2022

@esc Thanks for the run and results!

@cako Many thanks for the contribution!

@sklam sklam merged commit 7f5202c into numba:main Dec 5, 2022
@cako cako deleted the patch-1d-ravel branch December 6, 2022 05:35