-
-
Notifications
You must be signed in to change notification settings - Fork 132
Implement roll (circular shift of elements) #160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #160 +/- ##
==========================================
+ Coverage 96.9% 96.96% +0.05%
==========================================
Files 11 11
Lines 1197 1218 +21
==========================================
+ Hits 1160 1181 +21
Misses 37 37
Continue to review full report at Codecov.
|
hameerabbasi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things are excellent here, mostly! Just a few comments.
sparse/coo/common.py
Outdated
| # roll across specified axis | ||
| else: | ||
| axis = normalize_axis_tuple(axis, a.ndim, allow_duplicate=True) | ||
| broadcasted = np.core.multiarray.broadcast(shift, axis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just broadcasting the shape? If so, we should use our own utility function. See
Lines 358 to 388 in b51d749
| def _get_broadcast_shape(shape1, shape2, is_result=False): | |
| """ | |
| Get the overall broadcasted shape. | |
| Parameters | |
| ---------- | |
| shape1, shape2 : tuple[int] | |
| The input shapes to broadcast together. | |
| is_result : bool | |
| Whether or not shape2 is also the result shape. | |
| Returns | |
| ------- | |
| result_shape : tuple[int] | |
| The overall shape of the result. | |
| Raises | |
| ------ | |
| ValueError | |
| If the two shapes cannot be broadcast together. | |
| """ | |
| # https://stackoverflow.com/a/47244284/774273 | |
| if not all((l1 == l2) or (l1 == 1) or ((l2 == 1) and not is_result) for l1, l2 in | |
| zip(shape1[::-1], shape2[::-1])): | |
| raise ValueError('operands could not be broadcast together with shapes %s, %s' % | |
| (shape1, shape2)) | |
| result_shape = tuple(max(l1, l2) for l1, l2 in | |
| zip_longest(shape1[::-1], shape2[::-1], fillvalue=1))[::-1] | |
| return result_shape |
In general, I don't think it's good to rely on an undocumented NumPy function, which could change and break things at any time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm what about normalize_axis_tuple is there a utility function for that as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind I found normalize_axis in utils.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll need to add allow_repeat into that one, I believe.
sparse/tests/test_coo.py
Outdated
| x = np.array([0, 1, 0, 2, 0, 3, 0, 0, 4, 0, 0, 0, 5, 0, 6]) | ||
| xs = sparse.as_coo(x) | ||
| for sh in (0, 2, -2, 20, -20): | ||
| assert_eq(np.roll(x, sh), sparse.roll(xs, sh).todense()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .todense() can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting.... This causes the test to fail at assert is_canonical(...) what is this checking for? Do I need to resort the coordinates or something inside sparse.roll(...) so that I maintain lexographic order of the indices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
COO arrays are meant to be immutable. This means you shouldn't construct and then change their coords/data.
What you should do instead is, make coords/data and pass that to the constructor.
is_canonical checks if the coords have any duplicates, and are sorted in lexographic order. This is needed by many operations internally. In your case, they're not sorted, because you're changing them directly in the object, instead of passing them to the constructor. I suggest you use COO(..., has_duplicates=False) (this will sort for you).
sparse/tests/test_coo.py
Outdated
| for ax in (None, 0, 1): | ||
| for sh in (0, 2, -2, 10, -10): | ||
| assert_eq(np.roll(x2, sh, axis=ax), | ||
| sparse.roll(x2s, sh, axis=ax).todense()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .todense() can be removed.
sparse/tests/test_coo.py
Outdated
| shifts = [(0, 1), (1, 0), (-1, 1), (1, -1)] | ||
| for sh in shifts: | ||
| assert_eq(np.roll(x2, sh, axis=(0, 1)), | ||
| sparse.roll(x2s, sh, axis=(0, 1)).todense()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .todense() can be removed.
sparse/tests/test_coo.py
Outdated
| for ax in ((0, 0, 0), (1, 1, 1), (0, 1, 1), (0, 0, 1)): | ||
| for sh in [(0, 1, 0), (0, 1, 1), (-1, 1, -1), (1, -1, 1)]: | ||
| assert_eq(np.roll(x2, sh, axis=(0, 1, 1)), | ||
| sparse.roll(x2s, sh, axis=(0, 1, 1)).todense()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .todense() can be removed.
sparse/tests/test_coo.py
Outdated
|
|
||
| # empty | ||
| x = np.array([]) | ||
| assert_eq(np.roll(x, 1), sparse.roll(sparse.as_coo(x), 1).todense()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .todense() can be removed.
sparse/tests/test_coo.py
Outdated
| COO.from_iter(x) | ||
|
|
||
|
|
||
| def test_roll_coo(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: Maybe follow the pattern of using sparse.random to generate something of a given shape and then roll it using both NumPy and sparse implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mandatory: Parameterize your tests. There are a lot of examples in the tests already, and docs here.
| broadcasted = np.core.multiarray.broadcast(shift, axis) | ||
|
|
||
| if broadcasted.ndim > 1: | ||
| raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a test for this branch. See this SO question.
sparse/coo/common.py
Outdated
|
|
||
| for sh, ax in broadcasted: | ||
| result.coords[ax] += sh | ||
| result.coords[ax] %= result.shape[ax] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I ask for a test that verifies that the original remains unchanged after the roll operation?
I suspect that result.coords is a.coords here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this is something I wasn't 100% sure on. What is the best/preferred way to deepcopy a COO array at the moment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't a built-in method, because we kind of assume immutability, and if things are immutable, you don't need to worry about deep copies.
However, you can just do coords = np.array(a.coords) and similarly for data and then proceed. You're also modifying the object after construction (which is a big no-no), so this method is best anyway.
|
Also needs an addition in |
|
I think all comments are addressed. Thanks so much for the feedback. A few remaining questions:
|
hameerabbasi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How should this function handle a non-COO input? Should I add a call to as_coo(...) at the top?
Yes, call as_coo at the top.
Should I do more work to check inputs (e.g. if user provides a 2D sequence for axis or shift, should I throw a more descriptive error message?)
Ideally, NumPy error messages should be mirrored. But as long as this fails, it's alright.
sparse/coo/common.py
Outdated
| return roll(a.reshape((-1,)), shift, 0).reshape(a.shape) | ||
|
|
||
| # rolling does nothing for empty array | ||
| elif a.nnz == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: Is this branch really required? I'd do the same for all special cases that aren't required. Will the above code already handle it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yep let me get rid of that branch quickly
ff25aaf to
c1e8a3e
Compare
|
Squashed everything to a single commit. Ready for final review. |
|
I need to request changes, but the review mechanism on GitHub seems broken right now. I'll come back to it soon. But in essence: Replace |
hameerabbasi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Final comments, hopefully. Thanks for your patience with me. Only one is mandatory.
sparse/coo/common.py
Outdated
| "If 'shift' is a 1D sequence, " | ||
| "'axis' must have equal length.") | ||
|
|
||
| if np.asarray(axis).ndim > 1 or np.asarray(shift).ndim > 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use np.ndim(...) instead of np.asarray(...).ndim
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice. thanks 👍
sparse/coo/common.py
Outdated
| "'axis' must have equal length.") | ||
|
|
||
| if np.asarray(axis).ndim > 1 or np.asarray(shift).ndim > 1: | ||
| raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mandatory: Add test for this ValueError, perhaps as a parametrized test in test_valerr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that should be in there already at line 1585.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. There are a few things wrong with that. It's not parameterized. And when the top line raises an error, that one is detected and the last line is ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can see here that it isn't covered here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Thanks for the patient explanation. Should be fixed now.
sparse/coo/common.py
Outdated
| shift = np.full(len(axis), shift) | ||
|
|
||
| # ensure axis is iterable | ||
| elif not np.iterable(axis): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace np.iterable(axis) with isinstance(axis, tuple) (normalize_axis does that for you).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
sparse/coo/common.py
Outdated
| shift = (shift,) | ||
|
|
||
| # handle broadcasting | ||
| if np.iterable(axis) and len(shift) == 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace np.iterable(axis) with isinstance(axis, tuple) (normalize_axis does that for you).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
sparse/coo/common.py
Outdated
| axis = normalize_axis(axis, a.ndim) | ||
|
|
||
| # make shift iterable | ||
| if not np.iterable(shift): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace np.iterable(shift) with isinstance(shift, collections.Iterable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
sparse/coo/common.py
Outdated
| Output array, with the same shape as a. | ||
| """ | ||
| from .core import COO, as_coo | ||
| import collections |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you do this at the top of the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep. done in latest commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And remove it from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a commit that's not syncing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, my mistake. Did you push it to the repo, are you sure it isn't local?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get the same. I'm assuming GitHub is migrating stuff, and since it uses microservices, one thing is breaking at a time. Guess we'll just have to wait until things catch up and the tests run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you possibly make a minor change and re-push to the repo? Possibly squash all the commits?
|
Squashed everything and forced update. Looks like the tests ran now. |
|
Thanks for the patience and the addition, @ahwillia! This is in! |

This is my attempt to implement the equivalent of
numpy.rollwhich I think would be really useful to have in this package.I think this is nearly done, but it could be improved. In particular, the function is supposed to produce a copy of the array but I wasn't sure the preferred way of doing that so I called the
COOconstructor directly. Also, my function assumes a COO array as input, which is maybe not so flexible as other sparse array types are developed?Thanks again for providing this awesome package. Looking forward to your comments!