Fix for non-contiguous strides #4736

viclafargue · 2022-05-17T14:00:00Z

github-actions · 2022-06-23T15:03:03Z

This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.

github-actions · 2022-09-21T16:02:54Z

This PR has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates.

viclafargue · 2022-11-18T14:11:13Z

rerun tests

csadorf · 2022-11-23T09:58:04Z

python/cuml/common/array.py

+                else:
+                    cupy_data = cp.array(data, copy=True, order='C')
+                    self._ptr = cupy_data.data.ptr
+                    self._owner = cupy_data if cupy_data.flags.owndata \
+                        else data
+                    self.order = 'C'
+                    self.strides = cupy_data.strides


Since the newly created array should have a conformant CAI, could we just call the constructor again?

Suggested change

-                else:
-                    cupy_data = cp.array(data, copy=True, order='C')
-                    self._ptr = cupy_data.data.ptr
-                    self._owner = cupy_data if cupy_data.flags.owndata \
-                        else data
-                    self.order = 'C'
-                    self.strides = cupy_data.strides
+                else:
+                    cupy_data = cp.array(data, copy=True, order='C')
+                    super().__init__(data=cupy_data)
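For anyone reading along, here is a minimal standard-library sketch of what a non-contiguous stride looks like and why a dense copy resolves it. It is only an illustration: it uses `memoryview` and `array` in place of CuPy (an assumption made so it runs without a GPU), and the dense copy plays the role that `cp.array(data, copy=True, order='C')` plays in the diff.

```python
from array import array

# Six float64 values in one dense buffer (itemsize 8).
buf = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
full = memoryview(buf)

# Taking every other element gives a view with a 16-byte stride:
# it no longer describes one dense block of memory.
view = full[::2]
print(view.strides, view.c_contiguous)    # (16,) False

# Copying the elements into a fresh buffer restores a dense layout.
dense = memoryview(array("d", view.tolist()))
print(dense.strides, dense.c_contiguous)  # (8,) True
```

Once the data is dense again, its strides follow directly from shape and item size, which is why re-running the normal constructor on the copy should be enough.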

csadorf · 2022-11-23T10:00:31Z

python/cuml/common/memory_utils.py

+    itemsize = cp.dtype(dtype).itemsize
+    shape = list(shape)
+    strides = list(strides)


I don't think there is a need for these copies; you can just use sliced views of those arrays directly.
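A small sketch of the point, assuming `shape` is only read and never modified in place (the rest of the function is not shown here, so that is an assumption): slicing a tuple or list already returns a new object, so the caller's value is left untouched without an explicit `list(...)` copy.

```python
shape = (2, 3, 4)

reversed_shape = shape[::-1]  # new tuple; `shape` itself is unchanged
leading_dims = shape[:-1]     # also a new tuple

print(reversed_shape)  # (4, 3, 2)
print(leading_dims)    # (2, 3)
print(shape)           # (2, 3, 4)
```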

csadorf · 2022-11-23T10:24:42Z

python/cuml/common/memory_utils.py

+        shape = shape[::-1]
+        for dim_size in shape[:-1]:
+            strides.append(dim_size * strides[-1])
+        strides = strides[::-1]

    else:
        raise ValueError('Order must be "F" or "C". ')


Suggested change

-        raise ValueError('Order must be "F" or "C". ')
+        raise ValueError('Order must be "F" or "C".')

csadorf · 2022-11-23T10:53:45Z

python/cuml/common/memory_utils.py

+        shape = shape[::-1]
+        for dim_size in shape[:-1]:
+            strides.append(dim_size * strides[-1])
+        strides = strides[::-1]


We could consider using a combination of itertools.accumulate() and operator.mul() to compute the strides a bit more succinctly:

    from itertools import accumulate
    from operator import mul

    f_strides = list(accumulate(shape[:-1], func=mul, initial=item_size))
    c_strides = list(accumulate(shape[:0:-1], func=mul, initial=item_size))[::-1]

Edit: If you like the suggestion, I can run some benchmarks to ensure that this isn't slower by any chance.
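To make the arithmetic concrete, here is a self-contained version of the `accumulate` idea with a worked example. The function name `strides_via_accumulate` is hypothetical and only used for this sketch; the two expressions are the ones from the suggestion.

```python
from itertools import accumulate
from operator import mul


def strides_via_accumulate(shape, item_size):
    """Return (f_strides, c_strides) in bytes for a dense array."""
    # F-order: the first axis is the fastest-varying one.
    f_strides = tuple(accumulate(shape[:-1], func=mul, initial=item_size))
    # C-order: the last axis is the fastest-varying one.
    c_strides = tuple(accumulate(shape[:0:-1], func=mul, initial=item_size))[::-1]
    return f_strides, c_strides


f_strides, c_strides = strides_via_accumulate((2, 3, 4), 8)
print(f_strides)  # (8, 16, 48)
print(c_strides)  # (96, 32, 8)
```

For shape (2, 3, 4) with 8-byte items, the F-order strides grow as 8, 8*2, 8*2*3, while the C-order strides are built from the trailing axes as 8, 8*4, 8*4*3 and then reversed.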

viclafargue · 2022-11-23

Thanks for the review. Sure, that could be interesting :)

csadorf · 2022-11-23

Looks like loops are the fastest; however, I was able to apply a micro-optimization to the loops that is a bit faster (see below):

    Loops:           100.00%
    Loops (alt):      94.73%
    With accumulate: 132.87%

I ran this a few times and results appear pretty consistent.

Benchmark code

    # Save as benchmark_strides.py: the timeit setup strings import from that module.
    from itertools import accumulate
    from operator import mul


    def compute_strides(shape, item_size):
        tuple(accumulate(shape[:-1], func=mul, initial=item_size))
        tuple(accumulate(shape[:0:-1], func=mul, initial=item_size))[::-1]


    def compute_strides_loops(shape, item_size):
        strides = [item_size]
        for dim_size in shape[:-1]:
            strides.append(dim_size * strides[-1])
        tuple(strides)
        strides = [item_size]
        shape = shape[::-1]
        for dim_size in shape[:-1]:
            strides.append(dim_size * strides[-1])
        tuple(strides[::-1])


    def compute_strides_loops_alternative(shape, item_size):
        strides = [item_size]
        for dim_size in shape[:-1]:
            strides.append(dim_size * strides[-1])
        tuple(strides)
        strides = [item_size]
        for dim_size in shape[:0:-1]:
            strides.append(dim_size * strides[-1])
        tuple(strides[::-1])


    if __name__ == "__main__":
        from timeit import timeit

        result_w_loops = timeit(
            "compute_strides((2, 3), 8)",
            setup="from benchmark_strides import compute_strides_loops as compute_strides",
        )
        result_w_loops_alt = timeit(
            "compute_strides((2, 3), 8)",
            setup="from benchmark_strides import compute_strides_loops_alternative as compute_strides",
        )
        result_w_accumulate = timeit(
            "compute_strides((2, 3), 8)",
            setup="from benchmark_strides import compute_strides",
        )
        print(f"Loops: {result_w_loops / result_w_loops:.2%}")
        print(f"Loops (alt): {result_w_loops_alt / result_w_loops:.2%}")
        print(f"With accumulate: {result_w_accumulate / result_w_loops:.2%}")

Micro-optimization:

    # F-order
    strides = [item_size]
    for dim_size in shape[:-1]:
        strides.append(dim_size * strides[-1])
    tuple(strides)

    # C-order
    strides = [item_size]
    for dim_size in shape[:0:-1]:
        strides.append(dim_size * strides[-1])
    tuple(strides[::-1])
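As a rough sketch of how the micro-optimized loops could sit in a single helper, the version below adds an `order` argument and the error message from the earlier suggestion. The name `dense_strides` and its signature are hypothetical and not taken from `memory_utils.py`.

```python
def dense_strides(shape, item_size, order):
    """Byte strides of a dense array (hypothetical helper for illustration)."""
    strides = [item_size]
    if order == "F":
        for dim_size in shape[:-1]:
            strides.append(dim_size * strides[-1])
        return tuple(strides)
    if order == "C":
        # Iterate the trailing axes in reverse without reversing the whole shape.
        for dim_size in shape[:0:-1]:
            strides.append(dim_size * strides[-1])
        return tuple(strides[::-1])
    raise ValueError('Order must be "F" or "C".')


print(dense_strides((2, 3), 8, "F"))  # (8, 16)
print(dense_strides((2, 3), 8, "C"))  # (24, 8)
```

The only difference from the original loop is the `shape[:0:-1]` slice, which replaces reversing the shape and then dropping its last element.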

dantegd · 2022-11-29T00:05:56Z

@gpucibot merge

Fixes rapidsai#4731

Authors:
- Victor Lafargue (https://github.com/viclafargue)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#4736

Fix for non-contiguous strides

9eddb76

viclafargue requested a review from a team as a code owner May 17, 2022 14:00

github-actions bot added the Cython / Python Cython or Python issue label May 17, 2022

cjnolet added this to PR-WIP in v22.08 Release via automation May 23, 2022

viclafargue changed the base branch from branch-22.06 to branch-22.08 May 24, 2022 14:26

github-actions bot added the inactive-30d label Jun 23, 2022

caryr35 added this to PR-WIP in v22.10 Release via automation Aug 4, 2022

caryr35 moved this from PR-WIP to PR-Needs review in v22.10 Release Aug 4, 2022

caryr35 removed this from PR-WIP in v22.08 Release Aug 4, 2022

github-actions bot added the inactive-90d label Sep 21, 2022

caryr35 added this to PR-WIP in v22.12 Release via automation Oct 18, 2022

caryr35 moved this from PR-WIP to PR-Needs review in v22.12 Release Oct 18, 2022

caryr35 removed this from PR-Needs review in v22.10 Release Oct 18, 2022

viclafargue changed the base branch from branch-22.08 to branch-22.12 November 17, 2022 17:55

Merge branch 'branch-22.12' into non-contiguous-stride-fix

e7390ba

viclafargue added bug Something isn't working non-breaking Non-breaking change labels Nov 18, 2022

order is C when no stride

dd717fb

csadorf reviewed Nov 23, 2022

viclafargue added 2 commits November 23, 2022 15:33

addressing review

4b18579

fix

20ea3ef

dantegd approved these changes Nov 29, 2022

v22.12 Release automation moved this from PR-Needs review to PR-Reviewer approved Nov 29, 2022

rapids-bot bot merged commit 07f0bc4 into rapidsai:branch-22.12 Nov 29, 2022

v22.12 Release automation moved this from PR-Reviewer approved to Done Nov 29, 2022

viclafargue mentioned this pull request Dec 2, 2022

Provide host CumlArray and associated infrastructure #4908

Merged
