ENH: allow numpy.apply_along_axis() to work with ndarray subclasses #7918
This commit modifies the numpy.apply_along_axis() function so that if it is called with an ndarray subclass, the internal func1d calls receive subclass instances and the overall function returns an instance of the subclass. There are two new tests for these two behaviours.
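A minimal sketch of the two behaviours described above, assuming a NumPy release that includes this change. The `Tagged` class and the `seen_types` list are hypothetical names used only for illustration; they are not part of the PR.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical do-nothing ndarray subclass, for illustration only."""


seen_types = []


def double(row):
    # Record the type of each 1-D slice that apply_along_axis hands to func1d.
    seen_types.append(type(row))
    return row * 2


data = np.arange(6).reshape(2, 3).view(Tagged)
result = np.apply_along_axis(double, 0, data)

print(type(result).__name__)               # subclass of the overall result
print({t.__name__ for t in seen_types})    # types seen inside func1d
```

If the subclass is preserved as described, both printed names are `Tagged`.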
I like anything that makes numpy more subclass aware, but worry this will not quite work at least for the I think one could circumvent these problems by not running
@@ -107,7 +107,7 @@ def apply_along_axis(func1d, axis, arr, *args, **kwargs):
             res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
             outarr[tuple(ind)] = res
             k += 1
-        return outarr
+        return arr.__array_wrap__(outarr)
     else:
         Ntot = product(outshape)
So, for this branch, I would suggest running res.__array_wrap__(outarr) just after creating outarr.
After looking at the test cases: what is your own usage where you would want the subclass to be preserved? Sticking with numpy's own
My use case is this: I have an ndarray subclass storing sets of temporal MR data, which has additional metadata properties such as the time between data points, spectrometer frequency, etc. For example, I might have a 3D array which contains temporal data on an x-y grid. In light of your suggestions, I agree that the matrix test case is not necessarily an ideal choice. It was motivated by a desire to use existing functions rather than add my own additional ones, but that caused it to drift rather far from my original use case. I will change this to something closer to what I wanted.
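The use case above can be sketched roughly as follows. The `TimeSeries` class, its `dt` attribute, and the `integrate` function are hypothetical stand-ins for the metadata-carrying subclass described in the comment, not the author's actual code. The sketch assumes a NumPy release that passes subclass slices to func1d.

```python
import numpy as np


class TimeSeries(np.ndarray):
    """Hypothetical subclass carrying the sampling interval `dt` as metadata."""

    def __new__(cls, input_array, dt=1.0):
        obj = np.asarray(input_array).view(cls)
        obj.dt = dt
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices; copy the metadata from the parent.
        if obj is None:
            return
        self.dt = getattr(obj, "dt", 1.0)


def integrate(row):
    # Relies on the slice still being a TimeSeries, so that row.dt exists.
    return float(np.asarray(row).sum() * row.dt)


# Temporal data on a 2 x 3 x-y grid, with 4 time points per grid position.
grid = TimeSeries(np.ones((2, 3, 4)), dt=0.5)
integrals = np.apply_along_axis(integrate, -1, grid)
```

Without subclass-aware slicing, `row.dt` would raise an AttributeError inside `integrate`, which is the motivation given in the comment.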
Modified pull request so that the result of the func1d call, rather than the input array, gets to call __array_wrap__. Modified tests to work with this.
I have also run into the same issue with array scalars coming from np.mean etc. I guess the correct solution in this case is to set the shape of outarr to be 1 in that dimension if the shape of res == (). This can then be short-circuited if desired at the subclass level in __array_wrap__ by checking if the array is in fact an array scalar and returning the scalar instead, which allows functions like np.mean to be used to get the scalars if desired.
Modified pull request to handle case where calling func1d generates scalar arrays
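As a small illustration of the scalar case, here is what happens with a reducing func1d such as np.mean on a plain ndarray. This is a sketch against the NumPy releases I am aware of, where the reduced axis is dropped from the result; it does not show the intermediate states of this PR.

```python
import numpy as np

arr = np.arange(6, dtype=float).reshape(2, 3)

# func1d returns a scalar for each column, so axis 0 disappears from the output.
means = np.apply_along_axis(np.mean, 0, arr)

print(means.shape)
```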
@bennyrowland - OK, sounds like we agree on what it should do! I also agree that setting the dimension to But maybe best to let the real numpy developers judge that (I'm guessing this may become sufficiently "new feature" that an e-mail to the mailing list would be in order; @ahaldane? @charris?) I did manage to come up with what I think is a decent test case using masked arrays (sadly, simple things like
@mhvk Actually in my current version I decided to squeeze that dimension if a scalar array is returned. Certainly could add a keepdims keyword argument though. If we are going to start adding arguments to apply_along_axis, one thing I thought might be interesting was optionally passing the index of the current slice to func1d, so that each call knows where it is in the larger grid. This could be used to index into other arguments passed through from the apply_along_axis call, for example to give each axis custom data. An example use case for this might be: if I have a 2D array containing a set of temporal data and I want to filter a different frequency from each dataset.
I did send a notification email to numpy-discussion to invite comment on this PR, was that not the right mailing list?
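The idea of passing the current slice index to func1d could look something like the sketch below. `apply_along_axis_indexed` is a hypothetical helper written for illustration; it is not part of NumPy or of this PR, it ignores subclasses, and it assumes that any non-scalar results are simply appended as trailing axes.

```python
import numpy as np


def apply_along_axis_indexed(func1d, axis, arr, *args, **kwargs):
    """Hypothetical variant: func1d is called as func1d(slice, index, ...)."""
    arr = np.asanyarray(arr)
    # Move the working axis to the end so that view[ind] is a 1-D slice.
    view = np.moveaxis(arr, axis, -1)
    out = None
    for ind in np.ndindex(view.shape[:-1]):
        res = np.asanyarray(func1d(view[ind], ind, *args, **kwargs))
        if out is None:
            # Allocate the output once the shape of the first result is known.
            out = np.zeros(view.shape[:-1] + res.shape, dtype=res.dtype)
        out[ind] = res
    return out


def remove_offset(row, ind, offsets):
    # Use the slice index to pick per-dataset data from an extra argument.
    return row - offsets[ind]


data = np.array([[11, 12, 13], [21, 22, 23]])
offsets = np.array([10, 20])
corrected = apply_along_axis_indexed(remove_offset, 1, data, offsets)
```

Here each row has its own offset removed, which mirrors the "different frequency for each dataset" example in the comment.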
@@ -112,8 +112,9 @@ def apply_along_axis(func1d, axis, arr, *args, **kwargs):
         Ntot = product(outshape)
         holdshape = outshape
         outshape = list(arr.shape)
-        outshape[axis] = len(res)
+        outshape[axis] = len(res) if res.shape != () else 1
I guess this could just be outshape[axis] = res.size.
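A quick check of why res.size can replace the conditional, assuming res is either a 0-d or a 1-d array. The variable names and the `length_along_axis` helper are illustrative only.

```python
import numpy as np

scalar_res = np.asarray(np.mean([1.0, 2.0, 3.0]))  # 0-d array, shape ()
vector_res = np.asarray([1.0, 2.0, 3.0])           # 1-d array, shape (3,)


def length_along_axis(res):
    """Hypothetical helper mirroring the suggested `res.size` expression."""
    return res.size


# len() is not defined for 0-d arrays, which is why the diff needed a special case.
try:
    len(scalar_res)
    len_works = True
except TypeError:
    len_works = False

print(len_works, length_along_axis(scalar_res), length_along_axis(vector_res))
```

For a 1-d result, size equals len; for a 0-d result, size is 1, which matches the `else 1` branch in the diff.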
@bennyrowland - I had two very small additional comments, where the most important might be whether or not to assume that p.s. Of course, with this PR you do beg the question of why not also do
Modified pull request to cast result of func1d into ndarray (or subclass) and ensure that it has the necessary size, dtype and __array_wrap__ properties
@mhvk - some further excellent suggestions, which I have acted on. As you say, we will have to wait for an opinion from @charris or @rkern. I did take a look at
This definitely needs tests before it's ready to merge.
@shoyer, thanks for commenting. There are already three new tests in this PR; what further tests would you like to see included?
def test_preserve_subclass(self):
    def double(row):
        return row * 2
    m = np.matrix(np.arange(4).reshape((2, 2)))
Just a minor style thing -- I think this test would be clearer as just np.matrix([[0, 1], [2, 3]]), without the reshape.
    result = apply_along_axis(double, 0, m)
    assert isinstance(result, np.matrix)
    assert_array_equal(
        result, np.matrix([0, 2, 4, 6]).reshape((2, 2))
Same as above. I find reshape on np.matrix especially non-intuitive (and I could only guess at what passing 1D input to np.matrix does).
I have a couple (very) minor suggestions for the tests, but otherwise looks good to me.
@bennyrowland oops, somehow I missed those :)
Modified pull request to make stylistic changes to tests suggested by @shoyer
OK, in it goes. Thanks!