Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: fix np.ma.median with only one non-masked value and an axis argument #8030

Merged

Conversation

lesteve
Copy link
Contributor

@lesteve lesteve commented Sep 7, 2016

Fix #8029.

@@ -736,7 +736,7 @@ def _median(a, axis=None, out=None, overwrite_input=False):
# duplicate high if odd number of elements so mean does nothing
odd = counts % 2 == 1
if asorted.ndim > 1:
np.copyto(low, high, where=odd)
low[odd] = high[odd]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.copyto(low, high, where=odd) doesn't unmask the values in contrary to low[odd] = high[odd].

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now there are indexing errors because odd is boolean...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I am following, do you have an example where things go wrong?

Copy link
Member

@charris charris Sep 7, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were problems when odd was a scalar or scalar array. Perhaps that no longer happens with the other changes...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So now that we are using the count function, that may be fixed. In fact, by the time we get there 0-d inputs should have been raveled to 1-d and 1-d arrays have been handled, so the asorted.ndim > 1 check should not be needed either.

I don't think any of the values should be masked except for zero values of count, which brings up the question of negative indices in creating low.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am afraid I haven't got a grasp of the issue as good as yours, but what about replacing h - 1 by np.maximum(h - 1, 0) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way you can keep using np.copyto as previously.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that should work also, I was just a bit concerned about speed. OTOH, it is a bit more obvious. Does is work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do that, can probably remove low._sharedmask = False also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that should work also, I was just a bit concerned about speed. OTOH, it is a bit more obvious. Does is work?

The tests pass on my machine, I pushed the change. Let's see what Travis and AppVeyor have to say!

If we do that, can probably remove low._sharedmask = False also.

OK I can try that too afterwards.

@charris
Copy link
Member

charris commented Sep 7, 2016

When (if) this passes the tests, would you squash the commits?

@lesteve lesteve force-pushed the fix-ma-median-with-only-one-non-masked-value branch from 2f06b05 to ed30048 Compare September 7, 2016 17:10
@lesteve
Copy link
Contributor Author

lesteve commented Sep 7, 2016

When (if) this passes the tests, would you squash the commits?

The tests were passing so I squashed the commits. BTW just in case you didn't know about it you can do squash + merge via the github UI these days.

@charris
Copy link
Member

charris commented Sep 7, 2016

I don't really trust git to do it right after trying their revert button ;) Haven't tried squashing though.

@charris charris merged commit 3333e8b into numpy:master Sep 7, 2016
@charris
Copy link
Member

charris commented Sep 7, 2016

Thanks @lesteve. I still want to revisit ma.median at some point, but I think we now have something that works. Of course, I've thought that before. If nothing else, the tests need to have better coverage.

@charris charris added this to the 1.11.2 release milestone Sep 7, 2016
@charris charris removed this from the 1.11.2 release milestone Sep 7, 2016
@lesteve lesteve deleted the fix-ma-median-with-only-one-non-masked-value branch September 7, 2016 17:57
@lesteve
Copy link
Contributor Author

lesteve commented Sep 7, 2016

Great!

If nothing else, the tests need to have better coverage.

If you have suggestions of weird edge cases to add I'd be happy to give it a go.

@charris
Copy link
Member

charris commented Sep 7, 2016

If nothing else, there should be tests for different values of ndim, say up to at least three, with all different valid axis values tested with different amounts of masking from all to none, and maybe empty arrays also (which will currently fail for some cases). There is also the out and overwrite_input keywords. It is a large combinational mess, but if we had had such tests we wouldn't have run into so many problems.

@lesteve
Copy link
Contributor Author

lesteve commented Sep 8, 2016

Thanks for the suggestions, I'll try to find some time to add more tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants