
Fixes ravel failure on 1d arrays (#5229) #8418

Merged: 8 commits, Dec 5, 2022
Conversation

@cako (Contributor) commented Sep 3, 2022

As reported in #5229, calling ravel() on a DeviceNDArray raises an error. The issue stems from Array.ravel() in dummyarray.py returning the array itself instead of both an Array and an Element: https://github.com/numba/numba/blob/main/numba/misc/dummyarray.py#L385-L386
This causes the tuple unpacking in DeviceNDArray.ravel() to fail, as it expects an Array and an Element.

In this PR, a simple fix is applied in DeviceNDArray.ravel() by just returning self when ndim <= 1, which is what Array.ravel() does. This PR also adds a test for 1D ravel.
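
A minimal reproducer sketch of the failure, assuming a CUDA-capable environment (not copied verbatim from #5229):

import numpy as np
from numba import cuda

dx = cuda.to_device(np.arange(10))
# Before this PR, the next line raised an error: the dummy array's ravel()
# returned only itself, so the expected tuple unpacking failed.
dx.ravel()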

@cako cako requested a review from gmarkall as a code owner September 3, 2022 05:25
@cako cako changed the title fixes #5229 fixes ravel failure on 1d arrays (#5229) Sep 3, 2022
@cako cako changed the title fixes ravel failure on 1d arrays (#5229) Fixes ravel failure on 1d arrays (#5229) Sep 3, 2022
@stuartarchibald stuartarchibald added CUDA CUDA related issue/PR 3 - Ready for Review labels Sep 5, 2022
@gmarkall (Member) commented Sep 6, 2022

Many thanks for the PR!

gpuci run tests

@gmarkall gmarkall self-assigned this Sep 6, 2022
@gmarkall gmarkall added this to the Numba 0.57 RC milestone Sep 6, 2022
@gmarkall (Member) left a comment

Many thanks for the PR! This does fix the issue reported in #5229, but I think it still leaves some difference in behaviour between NumPy and Numba, because NumPy returns a new array with the same data, rather than returning the same array. For example,

import numpy as np
from numba import cuda

x = np.arange(10)
dx = cuda.to_device(x)

rx = x.ravel()
rdx = dx.ravel()

print(f"host array is the ravelled host array: {x is rx}")
print(f"device array is the ravelled device array: {dx is rdx}")

host_data = x.__array_interface__['data'][0]
host_ravelled_data = rx.__array_interface__['data'][0]
same_host_data = host_data == host_ravelled_data
print(f"host ravelled data is host original_data: {same_host_data}")


device_data = dx.__cuda_array_interface__['data'][0]
device_ravelled_data = rdx.__cuda_array_interface__['data'][0]
same_device_data = device_data == device_ravelled_data
print(f"device ravelled data is device original_data: {same_device_data}")

prints

host array is the ravelled host array: False
device array is the ravelled device array: True
host ravelled data is host original_data: True
device ravelled data is device original_data: True

I think the real problem is here:

return self

The dummy array's ravel is only returning itself, but it should also return a list of extents.

@gmarkall gmarkall added 4 - Waiting on author Waiting for author to respond to review and removed 3 - Ready for Review labels Sep 8, 2022
@cako (Contributor, Author) commented Sep 9, 2022

Thanks for the comments. I never realized that a "view" is a shallow copy rather than sometimes being the same array, but you are right. NumPy's ravel on a contiguous array returns an array which shares data with the original but is not the same array (a different object pointing to the same memory). For reference, here is their implementation.
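
A quick NumPy-only sketch of that behaviour with a contiguous input:

import numpy as np

x = np.arange(10)
rx = x.ravel()
print(rx is x)                  # False: a new array object is returned
print(np.shares_memory(x, rx))  # True: it is a view over the same buffer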

I'm not sure where to take it from here. Simply eliminating the following line makes your code "work" (in the sense that it returns a view and not the array itself), but I don't know whether it will make other tests break (I don't have a GPU at the moment to test), and I also don't know all the minutiae of the NumPy implementation well enough to come up with an equivalent implementation.

return self

With that said, I'm happy to try refactoring dummyarray so as to not "discriminate" against 1D arrays.

@gmarkall (Member) commented Sep 9, 2022

Ah, sorry for not being clearer. dummyarray.Array is not really like a NumPy array, so it doesn't have the same API for its methods - its purpose and design is (I believe) to hold the bits of implementation needed for device arrays common to different targets - at one point it was used by both CUDA and ROCm. Now that ROCm has been deprecated and removed, only the CUDA target uses it.

In particular, for dummyarray.Array.ravel() the interface (not specified anywhere concrete, only implied by the implementation) is that it accepts the order parameter and returns a tuple of a dummy array (which may or may not be self) and a list of extents.

When the code was refactored into the CUDA Array / dummyarray.Array split, it looks like this interface was violated by dummyarray.Array.ravel() when ndim is 1: it returns only itself rather than a tuple (see 3dc6214), and there was never a specific test that hit this, so the error was never caught.
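
A runnable toy illustrating the implied contract (the class below is made up for illustration and is not Numba's dummyarray code):

class ToyDummyArray:
    def __init__(self, extent):
        self.extent = extent

    def iter_contiguous_extent(self):
        yield self.extent

    def ravel(self, order='C'):
        # The implied contract: always return (array, list of extents)
        # so that callers can tuple-unpack the result.
        return self, list(self.iter_contiguous_extent())

arr, extents = ToyDummyArray((0, 80)).ravel()
print(extents)  # [(0, 80)]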

I think the right direction to move things in here is to return self, and contiguous extents generated with iter_contiguous_extent() - a minimal change implementing this would look like:

diff --git a/numba/misc/dummyarray.py b/numba/misc/dummyarray.py
index 8debc87b3..de88e0bf6 100644
--- a/numba/misc/dummyarray.py
+++ b/numba/misc/dummyarray.py
@@ -383,7 +383,7 @@ class Array(object):
             raise ValueError('order not C|F|A')
 
         if self.ndim <= 1:
-            return self
+            return self, list(self.iter_contiguous_extent())
 
         elif (order in 'CA' and self.is_c_contig or
                           order in 'FA' and self.is_f_contig):

However, that doesn't get us all the way there for matching NumPy's behaviour for ravel() - the exception message (NotImplementedError("ravel on non-contiguous array")) is now a bit misleading and gives us a clue as to what else could be wrong - it seems you can ravel a non-contiguous 1D array. E.g.:

import numpy as np

a = np.arange(10)
a_view = av = a[::2]
a_view_ravel = a_view.ravel()

print(f"a: {a}")
print(f"a_view: {a_view}")
print(f"a_view_ravel: {a_view_ravel}")

print(f"a_view_flags: {a_view.flags}")

a_view_data = a_view.__array_interface__['data'][0]
a_view_ravel_data = a_view_ravel.__array_interface__['data'][0]
print(f"a_view_data: {a_view_data}")
print(f"a_view_ravel_data: {a_view_ravel_data}")

Here a_view is non-contiguous and 1D, but it can be ravelled - when you do, a new array with its own copy of the data is returned, not a shallow copy (as can be observed from the different data pointers). However, the CUDA target doesn't support this:

if extents == [self._dummy.extent]:
    return cls(shape=newarr.shape, strides=newarr.strides,
               dtype=self.dtype, gpu_data=self.gpu_data,
               stream=stream)
else:
    raise NotImplementedError("operation requires copying")

My hunch is that iter_contiguous_extent() will return some extents different from the extents of self._dummy, so it should fall into the else branch.

From here we could either:

  • Support ravel() of 1D non-contiguous arrays by implementing a copy of the data (plausible to implement, and it would bring the API closer to NumPy's, but it is more work than the alternative)
  • Or, not support ravelling 1D non-contiguous arrays, add a test for this failing case (a sketch of such a test follows below), and add a note at an appropriate place in the docs to state that ravel() is supported, but only for contiguous data (if it doesn't say that already).
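
A hedged sketch of the kind of test the second option implies (the function name is hypothetical and this is not the test actually added in this PR):

import numpy as np
from numba import cuda

def test_ravel_noncontiguous_1d_raises():
    d_arr = cuda.to_device(np.arange(10))
    d_view = d_arr[::2]  # non-contiguous 1D device view
    try:
        d_view.ravel()
    except NotImplementedError:
        pass  # expected: copying is not implemented on the CUDA target
    else:
        raise AssertionError("expected NotImplementedError")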

There might be other discrepancies between the Numba CUDA implementation and NumPy arrays (I have not looked in detail at the NumPy implementation you identified) but I think the above options move us a good way forward towards finishing this PR.

How does the above seem? Would some further guidance be helpful?

@cako (Contributor, Author) commented Sep 10, 2022

Thank you for the explanation, it makes a lot of sense! I think for this PR we could focus on the second option. The first option (supporting ravel on non-contiguous arrays) might be more suitable for a PR which implements .flatten(), since this is what NumPy's .ravel() falls back on.

@gmarkall (Member)

I think for this PR we could focus on the second option

OK - that sounds good to proceed - please let me know if I can provide more help in following up.

do not test stride on cudasim
@cako (Contributor, Author) commented Nov 12, 2022

  1. DeviceNDArray.ravel() now works on 1D arrays exactly the same as it does on higher-dimensional arrays. This has been achieved by simply eliminating the special cases for 1D arrays in DeviceNDArray.ravel in numba/cuda/cudadrv/devicearray.py and Array.ravel in numba/misc/dummyarray.py.
  2. DeviceNDArray.ravel() now always returns a different object than its input; the returned array shares the same data as the original array (see the sketch below). This has been achieved by calling Array.from_desc instead of self.from_desc in Array.ravel in numba/misc/dummyarray.py. By calling the class method from_desc with the base class instead of the instance, we ensure that it returns a new object instead of itself. The data remains the same.
  3. The DeviceNDArray.ravel() docs are now explicit that it does not operate on non-contiguous arrays.
  4. Testing has been enhanced to ensure that the behaviour in items 1 and 2 is respected for arrays of any dimension. It also includes a check on trying to ravel strided (non-contiguous) arrays.
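
A small sketch of the behaviour described in items 1 and 2, assuming a CUDA-capable environment (illustrative only, not one of the tests added in this PR):

import numpy as np
from numba import cuda

dx = cuda.to_device(np.arange(10))
rdx = dx.ravel()
print(rdx is dx)  # False: a new DeviceNDArray object is returned
same_ptr = (dx.__cuda_array_interface__['data'][0]
            == rdx.__cuda_array_interface__['data'][0])
print(same_ptr)   # True: it refers to the same device memory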

@gmarkall (Member)

gpuci run tests

@gmarkall (Member) left a comment

Many thanks for the updates! I think this is now looking mostly good, with one question to resolve:

This has been achieved by calling Array.from_desc instead of self.from_desc in Array.ravel in numba/misc/dummyarray.py. By calling the class method from_desc with the base class instead of the instance, we ensure that it returns a new object instead of itself.

I don't think that's what's going on here - as far as I can tell, a classmethod always receives the class as an implicit first argument (although the docs are not completely explicit about this, it appears to read that way: https://docs.python.org/3/library/functions.html#classmethod). Notably, if I apply:

diff --git a/numba/misc/dummyarray.py b/numba/misc/dummyarray.py
index 3f3815eb4..7ccb4f314 100644
--- a/numba/misc/dummyarray.py
+++ b/numba/misc/dummyarray.py
@@ -386,8 +386,8 @@ class Array(object):
                 or order in 'FA' and self.is_f_contig):
             newshape = (self.size,)
             newstrides = (self.itemsize,)
-            arr = Array.from_desc(self.extent.begin, newshape, newstrides,
-                                  self.itemsize)
+            arr = self.from_desc(self.extent.begin, newshape, newstrides,
+                                 self.itemsize)
             return arr, list(self.iter_contiguous_extent())
 
         else:

to undo the change, then all the tests still appear to pass - I think earlier on in the patch series they might not have passed because of the 1D special-casing that was in devicearray.py, but that has since been removed.

Can you try without this change please? Or if I'm misunderstanding what's happening (a completely plausible scenario 🙂) could you clarify what's happening for me please?
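
A standalone check of that classmethod behaviour (plain Python, not Numba code):

class Base:
    @classmethod
    def which(cls):
        return cls

b = Base()
print(Base.which() is Base)  # True: called on the class
print(b.which() is Base)     # True: called on an instance, the class is
                             # still passed as the implicit first argument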

@cako (Contributor, Author) commented Nov 14, 2022

You are right, that change is not necessary; calling via the class or a class instance is the same. I have reverted the change.

@gmarkall (Member)

gpuci run tests

@gmarkall (Member) left a comment

@cako many thanks for the update! There's a slight indentation change to resolve (left over from earlier) and then I think this is good to go!

Co-authored-by: Graham Markall <535640+gmarkall@users.noreply.github.com>
@gmarkall (Member)

Thanks for the fix!

gpuci run tests

@gmarkall (Member) commented Dec 5, 2022

Whoops, sorry that I totally missed this! @esc could this have a buildfarm run please?

@gmarkall gmarkall added Pending BuildFarm For PRs that have been reviewed but pending a push through our buildfarm 4 - Waiting on CI Review etc done, waiting for CI to finish and removed 4 - Waiting on author Waiting for author to respond to review labels Dec 5, 2022
@esc (Member) commented Dec 5, 2022

Whoops, sorry that I totally missed this! @esc could this have a buildfarm run please?

You got it! Build numba_smoketest_cuda_yaml_181 has started

@esc (Member) commented Dec 5, 2022

You got it! Build numba_smoketest_cuda_yaml_181 has started

This was all green. 🍏

@gmarkall gmarkall added BuildFarm Passed For PRs that have been through the buildfarm and passed 5 - Ready to merge Review and testing done, is ready to merge and removed Pending BuildFarm For PRs that have been reviewed but pending a push through our buildfarm 4 - Waiting on CI Review etc done, waiting for CI to finish labels Dec 5, 2022
@gmarkall (Member) commented Dec 5, 2022

@esc Thanks for the run and results!

@cako Many thanks for the contribution!

@sklam sklam merged commit 7f5202c into numba:main Dec 5, 2022
@cako cako deleted the patch-1d-ravel branch December 6, 2022 05:35