Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix huge imshow range #18458

Merged
merged 2 commits into from Sep 14, 2020
Merged

Conversation

tacaswell
Copy link
Member

PR Summary

closes #18415

We need to special case when rescaling the user input for
interpolation with LogNorm. The issue is that due to the inherent
precision limits of floating point math we can end up with errors on
the order if 1e-18 of the overall data range when rescaling. If the
bottom vlim is close to 0 and the top is order 10e20 than this error
can result in the rescaled vmin being negative which then fails when
we (temporarily) change the vmin/vmax on the norm.

We started adjusting the vmin/vmax to work around the same issue with
float precision in #17636 / 3dea5c7 to make sure that values exactly
equal to the limits did not move across the boundary and get
erroneously mapped is over/under.

Long term we may want to add the ability for the norms to suggest to
their clients what the constraints on the vmin/vmax are, but
hard-coding a special case for LogNorm is the pragmatic solution.

There are still some issues with this sort of extreme data range. With the default interpretation (anti-aliased) when zoomed out we will get stippling that disappears when you zoom in (because the interpolation kernels are not perfect and the constant background at -1 picks up noise on a scale big enough to get above 100 (this is because we shrink and re-scale the data to the [0, 1] range before interpolation and then scale back out to the full range at the end). It is extra visible on this example because things are flipping between "bad" (defaults to white) and "good but small".

PR Checklist

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (run flake8 on changed files to check).

closes matplotlib#18415

We need to special case when rescaling the user input for
interpolation with LogNorm.  The issue is that due to the inherent
precision limits of floating point math we can end up with errors on
the order if 1e-18 of the overall data range when rescaling.  If the
bottom vlim is close to 0 and the top is order 10e20 than this error
can result in the rescaled vmin being negative which then fails when
we (temporarily) change the vmin/vmax on the norm.

We started adjusting the vmin/vmax to work around the same issue with
float precision in matplotlib#17636 / 3dea5c7 to make sure that values exactly
equal to the limits did not move across the boundary and get
erroneously mapped is over/under.

Long term we may want to add the ability for the norms to suggest to
their clients what the constraints on the vmin/vmax are, but
hard-coding a special case for LogNorm is the pragmatic solution.
@tacaswell tacaswell added this to the v3.3.2 milestone Sep 11, 2020
@jklymak
Copy link
Member

jklymak commented Sep 11, 2020

This is probably fine for 3.3.2 but this really needs a proper rethink with tolerances for floating point error built in consistently and in a way we can explain to end users.

@tacaswell
Copy link
Member Author

but this really needs a proper rethink with tolerances for floating point error built in consistently and in a way we can explain to end users

I'm really not sure there is a "right" way to do this because no matter what you do it is wrong in some situation.

@jklymak
Copy link
Member

jklymak commented Sep 11, 2020

I agree its frustrating! But I feel we've been playing whack-a-mole - I think a bit of research and carefully laying out what we do and can't do would go a long way to fixing things.

FWIW, I think the current bug is pretty bad. I think that the "bug" we were trying to fix should really have been the user's domain. If you have vmin=1 and vmax=1e20, you cannot expect x=1e20 to be guaranteed to not be "over", and the user should know enough to put a bit of slop into the vmax they choose.

@dopplershift
Copy link
Contributor

@jklymak I'm not sure this is the right venue to hash this out, but it's not even obvious to me why you'd need any margin on vmax.

@jklymak
Copy link
Member

jklymak commented Sep 11, 2020

I think #16910 (comment) describes the issue relatively well, though I guess we eventually came to the wrong conclusion.

@tacaswell
Copy link
Member Author

I stand by the conclusion we landed on, we should not expect users to be aware of our internal mechanizations / representations of the data / transforms and we should not silently re-order values.

All of this difficulty is driven by us using the Agg re-sampler for our up/down sampling of the images and it's [0, 1] + hard clipping constraints (which was in turn driven by changing the default colormap, seeing red in a viridis colored image, and realizing the colormapping -> resampling in RGBA space was not the right thing to do). If we replaced the resampler with something without those constraints we would at least be able to move all of these issues to the other side of the function call or avoid all together.

@tacaswell
Copy link
Member Author

The two py38 failures are collisions with nbconvert. Not sure why it is windows only, but more recent jobs run with nbconvert 6.0.2 and these failed with 6.0.1, restarted the jobs to hopefully pick up the new versions.

@jklymak
Copy link
Member

jklymak commented Sep 14, 2020

I'm just suggesting that for 3.4 someone sit down and write it out carefully and prove that its mathematically correct, and that we clearly delineate any floating point limitations we can't get around.

@anntzer
Copy link
Contributor

anntzer commented Sep 14, 2020

Wouldn't it be simpler at some point to just rewrite the resampling code ourselves? It can't be that hard(*)...

(* conditions apply)

Copy link
Member

@jklymak jklymak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the current problem...

@jklymak
Copy link
Member

jklymak commented Sep 14, 2020

Wouldn't it be simpler at some point to just rewrite the resampling code ourselves? It can't be that hard(*)...

2-D convolution is pretty slow so I'd think it would have to be done in C? How quick is skimage and would we want that as a dependency? OTOH I don't see that it does filtering as it resamples, so its different than Agg. Same with scipy.ndimage.zoom

@anntzer
Copy link
Contributor

anntzer commented Sep 14, 2020

I don't mind rewriting the resampling code in C myself (some day).

@jklymak
Copy link
Member

jklymak commented Sep 14, 2020

... and all the filter/interp method support etc? It just seems like the kind of thing that people have done 100 times. Its just the Agg algorithm doesn't track over/under, whereas that is a "feature" we have.

@anntzer
Copy link
Contributor

anntzer commented Sep 14, 2020

The filters are likely not that hard to implement themselves, as they are probably just one formula each...

@dopplershift dopplershift merged commit 0b21c7c into matplotlib:master Sep 14, 2020
@lumberbot-app
Copy link

lumberbot-app bot commented Sep 14, 2020

Owee, I'm MrMeeseeks, Look at me.

There seem to be a conflict, please backport manually. Here are approximate instructions:

  1. Checkout backport branch and update it.
$ git checkout v3.3.x
$ git pull
  1. Cherry pick the first parent branch of the this PR on top of the older branch:
$ git cherry-pick -m1 0b21c7c2adbd547e4357d3ada9b0b3ab72d441af
  1. You will likely have some merge/cherry-pick conflict here, fix them and commit:
$ git commit -am 'Backport PR #18458: Fix huge imshow range'
  1. Push to a named branch :
git push YOURFORK v3.3.x:auto-backport-of-pr-18458-on-v3.3.x
  1. Create a PR against branch v3.3.x, I would have named this PR:

"Backport PR #18458 on branch v3.3.x"

And apply the correct labels and milestones.

Congratulation you did some good work ! Hopefully your backport PR will be tested by the continuous integration and merged soon!

If these instruction are inaccurate, feel free to suggest an improvement.

tacaswell pushed a commit to tacaswell/matplotlib that referenced this pull request Sep 14, 2020
Merge pull request matplotlib#18458 from tacaswell/fix_huge_imshow_range

Fix huge imshow range

Conflicts:
	lib/matplotlib/colors.py
          - had spurious commit in the initial PR which re-factored
          code not on the 3.3.x branch
	lib/matplotlib/tests/test_image.py
          - conflicts from many tests added to bottom of test_image.py
          on default branch
@tacaswell tacaswell deleted the fix_huge_imshow_range branch September 14, 2020 21:06
jklymak added a commit that referenced this pull request Sep 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

imshow with LogNorm crashes with certain inputs
5 participants