Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

profile_line's reduce_func parameter should be np.nanmean by default #5361

Closed
PinkShnack opened this issue Apr 28, 2021 · 9 comments
Closed

Comments

@PinkShnack
Copy link
Contributor

Description

When we implemented the reduce_func parameter for skimage.measure.profile_line I didn't realise that np.mean doesn't handle nan very well.

Scenario example 1:
By setting mode="constant" and cval=np.nan, profile_line will return nan when calculated by np.mean.

This could be fixed by your deprecation of mode="constant". Why is this being deprecated by the way?

Scenario example 1:
I guess this could also be a problem when an array/image just has some nans in it.

Proposal

Therefore, I propose changing the default reduce_func=np.mean parameter of skimage.measure.profile_line to reduce_func=np.nanmean. I can create a PR if you all feel this is worthwhile.

@PinkShnack PinkShnack changed the title bug: profile_line's reduce_func parameter could be np.nanmean by default bug: profile_line's reduce_func parameter should be np.nanmean by default Apr 28, 2021
@rfezzani
Copy link
Member

Hum, I will not call this a bug since NaN are IMHO exceptions, it must not rule the general case.
A user setting mode="constant" and cval=np.nan knows what he is doing and must also set reduce_func=np.nanmean by himself!

@PinkShnack PinkShnack changed the title bug: profile_line's reduce_func parameter should be np.nanmean by default profile_line's reduce_func parameter should be np.nanmean by default Apr 28, 2021
@PinkShnack
Copy link
Contributor Author

Changed it from bug.
Indeed in the first case you're right, good reasoning.
But for the second case, wouldn't it be better for it to just handle np.nan?

@grlee77
Copy link
Contributor

grlee77 commented Apr 28, 2021

But for the second case, wouldn't it be better for it to just handle np.nan?

np.nanmean handles NaNs but is quite a bit slower than np.mean, so it is a tradeoff between convenience and performance. If most users are not working with Data involving NaNs then we should probably not change the default.

@jni
Copy link
Member

jni commented Apr 29, 2021

Yeah, I agree with @grlee77 and @rfezzani: nanmean is slower and correct nan handling is very context-dependent, so we have in general preferred to leave this to users. Having said this, I can see how one might want to measure things near the edge of images and not include measurements from outside the image, in which case constant_value=np.nan, reduce_func=np.nanmean would be the right approach. And it seems like a common use case.

Perhaps a middle approach would be to include this example in the docstring?

@PinkShnack
Copy link
Contributor Author

All makes sense, thanks for all the comments!

Perhaps a middle approach would be to include this example in the docstring?

Brilliant, I shall do just that.

@rfezzani
Copy link
Member

Thank you @PinkShnack, any PR is welcome of course 😉. I close this issue, but feel free to reopen it if you wish.

@PinkShnack
Copy link
Contributor Author

A late reply from me....
A while ago I tried to change the docstring as mentioned above. However, the current docstring examples don't run correctly. It seems that when the default parameters were changed, no one updated the docstring examples.

Does scikit image test the docstring examples by default with your current github actions/CI integration?

If this has been corrected in the mean-time, then ignore the above.

@grlee77
Copy link
Contributor

grlee77 commented Aug 5, 2021

Does scikit image test the docstring examples by default with your current github actions/CI integration?

I would have thought so, but perhaps this got lost somewhere along the line with the move the GHA. Let me look into it.

However, the current docstring examples don't run correctly.

Indeed, the results in the example appear to still correspond to mode='constant' instead of the current default.

@grlee77
Copy link
Contributor

grlee77 commented Aug 5, 2021

@PinkShnack, thanks for raising the issue here about the broken examples. I opened #5502 about fixing this and running the doctests on CI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants