New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve SSIM documentation and warn about data range. #6595
Conversation
Many researchers rely on this function, but gives in case of floating-point data wrong results. I added some explanation on what might go wrong and how to accommodate for this. Personally, I'm in my fourth year of my PhD research and only find this out now. Probably, hundreds of papers have published incorrect SSIMs due to this subtlety.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this contribution! It should be highly valuable to many users.
`dtype_range` in `skimage.util.dtype.py` has defined intervals from -1 to | ||
+1. This yields an estimate of 2, instead of 1, which is most often |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... for signed integers. Then the question is that of type inference.
Maybe link to https://scikit-image.org/docs/stable/user_guide/data_types.html as well.
Perhaps we should error out unless the data range is explicitly provided? |
Yes! It would be safer this way. These docs improvements would go into the error message then. Only error when floating-point input images? |
Implementation would be simpler to always error, but will break a bunch of code unnecessarily. Erroring on float only is therefore slightly preferable, IMO. |
I think erroring is extremely useful actually. As this change propagates through to people across the world, they'll have to pause for a second and think about this. I would guess many people will figure out their results were wrong. In general, this is going to be annoying for a lot of individuals, but at least now they know, and it feels like the most honest thing to do. |
Co-authored-by: Marianne Corvellec <marianne.corvellec@ens-lyon.org>
Sort of the only case where the estimate is going to be reliably correct is if you have uint8 data. All other cases can be vague and misleading. You very easily have a numpy array of uint8 type where in some calculation you add a |
All makes good sense, I agree with your assessment. |
@mcourteaux would you like to implement the erroring behaviour on floating-point data here, by adding new commits? @stefanv or should we merge this first (since this PR already brings an improvement) and let @mcourteaux submit a follow-up PR? |
I could give it a go. Currently on a break. Will do this in a couple of days. Feel free to also just merge it and then in some days, I submit a new PR. |
Wonderful! I have just approved this PR; it takes an approval by another maintainer before we can merge it. Your second PR will be most welcome. Thanks again, @mcourteaux! |
Thanks! I filed a tracking issue. |
Many researchers rely on this function, but gives in case of floating-point data wrong results. I added some explanation on what might go wrong and how to accommodate for this. Personally, I'm in my fourth year of my PhD research and only find this out now. Probably, hundreds of papers have published incorrect SSIMs due to this subtlety.