Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow float->float conversion of any range #3052

Merged
merged 1 commit into from
Sep 6, 2018

Conversation

jni
Copy link
Member

@jni jni commented May 3, 2018

Fixes #3051.

When converting from float, we were checking for valid input ranges
before checking whether the target format was also float, in which
case the range doesn't matter. This PR corrects this so that float-only
conversions, which don't entail any rescaling, don't check for input
range.

Checklist

[It's fine to submit PRs which are a work in progress! But before they are merged, all PRs should provide:]

For reviewers

(Don't remove the checklist below.)

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.

Copy link
Member

@soupault soupault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, I don't think that the original issue is a bug at all.
From my standpoint, working with an image not satisfying our dtype conventions shouldn't be encouraged by the functions we ship.
On the other hand, there are some cases which the proposed modification could be justified for - e.g. when image intensity doesn't fully fit into [0; 1], [-1; 1] due to float representation innacuracies, or cases where preserve_range is meant to be used.
If you are 100% sure that loosening a constraint here won't affect the usability of the library, I'm fine with the change.

@jni
Copy link
Member Author

jni commented May 4, 2018

@soupault no, our policy, which may or may not be enshrined in the docs [1], is not to mess with input data unless absolutely necessary.

Secondly, we aim to preserve precision in our operations. If you want to preserve round-trip accuracy on rgb2lab, this means that Lab images have floating point values. If you want to have Lab values match the outside world, this means float values outside of the allowed recommended range:

>>> color.rgb2lab([[[1., 0., 0.]]])
array([[[53.24058794, 80.09230823, 67.20275104]]])

So, we must be able to deal with floating point values outside [0, 1].

Finally, any function that works for float64 and throws an error for float32 is buggy. Support for one (at the API level) should imply support for the other.

You are potentially right that preserve_range should be unambiguous and ubiquitous wherever these issues are encountered. However img_as_* currently don't have this option.

.. [1] The closest I could find is this from our data types page: "we aim to preserve the data range and type of input images". It is not strong enough wording.

@soupault soupault added this to the 0.14 milestone May 9, 2018
@hmaarrfk
Copy link
Member

@jni I think this just needs a rebase.

Fixes scikit-image#3051.

When converting from float, we were checking for valid input ranges
*before* checking whether the target format was also float, in which
case the range doesn't matter. This PR corrects this so that float-only
conversions, which don't entail any rescaling, don't check for input
range.
@jni
Copy link
Member Author

jni commented May 22, 2018

@hmaarrfk well-spotted. @stefanv please merge at your earliest convenience. =)

@codecov-io
Copy link

codecov-io commented May 22, 2018

Codecov Report

Merging #3052 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3052      +/-   ##
==========================================
+ Coverage   85.91%   85.91%   +<.01%     
==========================================
  Files         336      336              
  Lines       27305    27308       +3     
==========================================
+ Hits        23459    23462       +3     
  Misses       3846     3846
Impacted Files Coverage Δ
skimage/util/tests/test_dtype.py 100% <100%> (ø) ⬆️
skimage/util/dtype.py 89.15% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 255b4d5...e29ed81. Read the comment docs.

@stefanv
Copy link
Member

stefanv commented May 23, 2018

I think this PR is incorrect. img_as_float specifically is meant to check ranges of input data. If a function does not require input range checking, it should use image.astype(float), not img_as_float.

@stefanv
Copy link
Member

stefanv commented May 23, 2018

colorconv, on the other hand, has numerous mistakes in the way it interprets data. That module should be fixed to only interpret RGB data as within the dtype limits.

@stefanv
Copy link
Member

stefanv commented May 28, 2018

This should also be addressed by #3111

@jni
Copy link
Member Author

jni commented May 29, 2018

@stefanv I don't think the function name provides any indication of "I'm going to shoot you if your range is incorrect". For that I would use something like ensure_float_range. Our position with range conversions should be to raise an error when absolutely necessary and no earlier. This function currently raises an error way too early, in a place that is unexpected by most of our users (as numerous bug reports stemming from this behaviour show).

@jni
Copy link
Member Author

jni commented May 29, 2018

Also note that #3110 addresses a slightly different problem.

@soupault soupault modified the milestones: 0.14, 0.14.1 May 29, 2018
@jni
Copy link
Member Author

jni commented Jul 24, 2018

@stefanv can you have another look at this? We just got another ping in #3051

@stefanv
Copy link
Member

stefanv commented Jul 24, 2018

This change pushes the responsibility for range checking down to each individual skimage function. Have we done a survey to check whether there are any of these that require the range (0, 1)? Once that survey is done, this would be a reasonable change.

@JDWarner
Copy link
Contributor

JDWarner commented Jul 25, 2018

There are more than we might expect, particularly in the exposure submodule, which do need the assumed float range. Probably also color conversion.

It is worth noting that there are many other instances which will prove difficult to detect and correct - as some of our algorithms' parameters are dependent on the image range.

  1. Some features will technically work, but with defaults that could now be orders of magnitude off. This is a usability and accessibility issue.
  2. Similarly, there are potentially serious public facing reproducibility concerns. A particular set of parameters with the exact same inputs will no longer produce the same results in the same code.

Not trying to rain on any parades; I do see the benefits of letting sleeping floats lie. But the behavior of using img_as_float as an input type AND range sanitizer is extremely long-standing and ingrained.

Would it perhaps be better to retain the auto-scaling as default behavior, with the option to disable it for those who wish, or are unsatisfied with our current behavior? Alternatively, could we satisfy those who want to return to a standard range by providing a method where they can pass their image and we return the exact scaling factor which will be applied, so they can return the result to the same space?

@jni
Copy link
Member Author

jni commented Jul 29, 2018

@JDWarner just to be clear: this PR does not change the conversion behaviour when an input image is within range. It only removes the assertion that an input float image must be within range when passing it through. This means that this will only break someone's code if they were try:ing and catching the ValueError in case of an out-of-range image. I find this scenario extremely unlikely.

Going forward, we can (and should) have any functions that depend on an explicit range check to make that check explicitly.

@soupault soupault modified the milestones: 0.14.1, 0.15 Aug 20, 2018
@jni
Copy link
Member Author

jni commented Aug 21, 2018

Here's another argument that I discovered while doing this audit: img_as_float already lets float64 images through unchanged, and we depend on this behaviour during our testing. From skimage/feature/tests/test_hog.py:

img = img_as_float(data.astronaut()[:256, :].mean(axis=2))`

mean(axis=2) takes mean on the raw uint8 array, and returns a float64 array with values in [0, 255]. img_as_float lets those through like a champ. So this PR is merely extending this same behaviour to float32.

@@ -94,6 +94,11 @@ def test_float_out_of_range():
img_as_int(too_low)


def test_float_float_all_ranges():
arr_in = np.array([[-10., 10., 1e20]], dtype=np.float32)
np.testing.assert_all_close(img_as_float(arr_in), arr_in)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not strict equality?

@emmanuelle emmanuelle merged commit 68a0818 into scikit-image:master Sep 6, 2018
@stefanv
Copy link
Member

stefanv commented Sep 6, 2018

👍

@jni
Copy link
Member Author

jni commented Sep 6, 2018

🎉

@jni
Copy link
Member Author

jni commented Sep 6, 2018

@meeseeksdev please backport to v0.14.x

@jni
Copy link
Member Author

jni commented Sep 6, 2018

@meeseeksdev backport to v0.14.x

@jni
Copy link
Member Author

jni commented Sep 6, 2018

Haha turns out you can't be polite to a bot. No wonder they will eventually exterminate the human race. But, I tried!

@Carreau
Copy link
Contributor

Carreau commented Sep 6, 2018

I'll see what I can do about that.

@meeseeksdev please behave better.

@lumberbot-app
Copy link

lumberbot-app bot commented Sep 6, 2018

Awww, sorry Carreau you do not seem to be allowed to do that, please ask a repository maintainer.

@Carreau
Copy link
Contributor

Carreau commented Sep 6, 2018

Lol.

soupault added a commit that referenced this pull request Sep 7, 2018
…2-on-v0.14.x

Backport PR #3052 on branch v0.14.x (Allow float->float conversion of any range)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants