Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scatter color: moving #10809 forward #12422

Merged
merged 16 commits into from Dec 17, 2018
Merged

Conversation

efiring
Copy link
Member

@efiring efiring commented Oct 6, 2018

PR Summary

This is a minimal update plus fixup for #10809, followed by a changeset to address #10381.
Along the way, it turned out that examples/units/units_scatter.py was doubly broken. It purported to show masked array support by basic_units.py, but didn't, because the support didn't exist; and it's third panel labeled the y-axis as Minutes but showed the values in Hz. Both of these problems are fixed in this PR.

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@efiring
Copy link
Member Author

efiring commented Oct 7, 2018

#12293 is causing the merge conflict, and when I try to solve it via a rebase I run into a hard failure because the author name is missing on this commit:

| * commit 2def1850ccc697dce1505b3ee2d3b15747a8bb78
| | Author:  <qi.cui@mail.utoronto.ca>
| | Date:   Fri Mar 16 01:19:57 2018 -0400
| | 
| |     fix scatter not showing points w/ valid x/y invalid color #4354 and added test

I'm sure all this can be straightened out, but I don't want to spend the time on it until the ultimate design is clear.

@@ -4010,7 +4010,7 @@ def dopatch(xs, ys, **kwargs):
label_namer="y")
def scatter(self, x, y, s=None, c=None, marker=None, cmap=None, norm=None,
vmin=None, vmax=None, alpha=None, linewidths=None,
verts=None, edgecolors=None,
verts=None, edgecolors=None, plotinvalid=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make the key-word only now that we are 3only?

@anntzer
Copy link
Contributor

anntzer commented Nov 12, 2018

@efiring Thoughts about also keeping points with nonfinite coordinates (x/y) in? See #12801 for rationale.

@efiring
Copy link
Member Author

efiring commented Nov 13, 2018

@anntzer This PR is not deleting any points; it is passing masked arrays to PathCollection, so I think it addresses all of #12801, but I haven't tested that.
Any idea how to solve the merge/rebase problem I reference above?

@anntzer
Copy link
Contributor

anntzer commented Nov 13, 2018

Indeed, this PR also fixes the issue with invalid coordinates. I'll look into the rebase.

@anntzer
Copy link
Contributor

anntzer commented Nov 13, 2018

@efiring Can I force-push to your branch, or do you prefer that I open a new PR?

For the record: the first commit (by @QiCuiHub) has no author name (I'm actually puzzled how you can create one given that git appears to reject such patches), so I exported it in patch format with

$ git format-patch 2def1850~..2def1850 --stdout >/tmp/patch

manually edited the patch to add a username (QiCuiHub) based on the sole other commit from that author that I could find (https://github.com/OpenExoplanetCatalogue/open_exoplanet_catalogue/commit/9919d9204be3b88f3deb6994541fcf9fbd5fb26b.patch) and applied the patch with

$ git am -3 </tmp/patch

(resolving the merge conflicts along the way).

I then rebased the rest of the commits on top of that.

There are some other things I'd suggest changing (can handle them once the rebase is done):

  • replace the image comparison test by a image equivalency test (it's easy to do in this specific case) and again edit the log to remove the baseline image from the history,
  • make plotinvalid kwonly, as suggested by @tacaswell above,
  • make combine_masks privates, because heh,
  • extra space in test_colorbar,
  • probably worth adding a changelog note.

@efiring
Copy link
Member Author

efiring commented Nov 13, 2018

@anntzer Go ahead and force-push. Thank you.

@anntzer
Copy link
Contributor

anntzer commented Nov 13, 2018

Force-pushed the rebase; I'll work on the other items mentioned above.

@anntzer
Copy link
Contributor

anntzer commented Nov 14, 2018

Handled all my own points above.
Added an additional commit renaming plotinvalid to plotnonfinite in reference to the standard isfinite function (https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.isfinite.html https://en.cppreference.com/w/c/numeric/math/isfinite) (where "nonfinite" means either infinite or nan), as I mildly prefer this name. Feel free to revert if you disagree.

Copy link
Contributor

@anntzer anntzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually in this design plotnonfinite doesn't do anything; even when it is not set, nan-valued points will be plotted with the "bad" color.
Usually this won't have an effect as the default "bad" color is "none", but this can be seen with e.g.

cm = get_cmap("viridis", 16); cm.set_bad("k")
scatter(range(3), range(3), c=[0, np.nan, 1], cmap=cm)

Given that the "default" behavior is not changed and that plotnonfinite doesn't really achieve anything, perhaps we should just not add that flag?

@efiring
Copy link
Member Author

efiring commented Nov 14, 2018

@anntzer Good point--I missed that. It appears to mean, however, that the Collection is not handling masked arrays as offsets the way I thought it was--it must be ignoring the masking. I thought I checked that when I started this PR. On the other hand, presumably the reason that scatter deletes points (I think I wrote that part long ago) is that originally masked offsets were not respected.
Regardless of how that is resolved I would be happy to leave out the new kwarg so that scatter would be consistent with other Mappables, but this would require allowing an abrupt, but for most people invisible, change in behavior.

@anntzer
Copy link
Contributor

anntzer commented Nov 14, 2018

Just to clarify:

  1. Currently PathCollection doesn't respect masking.
  2. As it doesn't respect masking, plotnonfinite has no effect as it only tries to mask nonfinite points.
  3. Merging the PR with or without plotnonfinite would result in the same, mostly invisible change in behavior; I would thus favor merging it without introducing the new plotnonfinite API (happy to fix the PR for that).
  4. Strictly maintaining the previous behavior (not drawing nonfinite-valued points) would require fixing PathCollection to respect masking.

Agreed in the understanding of the situation? Agree with the strategy in 3?

@efiring
Copy link
Member Author

efiring commented Nov 14, 2018

@anntzer I do not agree with the strategy in 3, because with or without that strategy the PR is based on a false premise and is supplying masked arrays in a context where the masks are ignored. Either the Collection should be modified to accept masked arrays, or scatter should be modified to not supply them. I would like to investigate both possibilities, with a bias toward the former. I realize that in many lower-level places masked arrays end up replaced by floating-point arrays with NaNs, in part because they are easier to handle in C++ code. If that conversion is what is needed here, then it is just a question of where that conversion should occur.

@anntzer
Copy link
Contributor

anntzer commented Nov 15, 2018

Looking at how e.g. Line2D handle masked arrays, they actually have to keep both the masked and unmasked-nan-filled versions around (self._xorig/self._x) because indeed there is no "simple" C-API for the masked arrays (and also because xorig is unitful whereas x is unitless), and then do a careful dance to keep track of modifications to them.
Frankly, I'd rather not reproduce this contraption (which is also costly memorywise) for other artists; in general I find masked arrays a bit of a pain for what they provide (but I did see your comment arguing otherwise in the recent numpy-discussion thread).
My preference would be to make scatter not provide masked arrays. If instead we do want PathCollections to handle masked arrays then I think we should generally handle them at the renderer level (as I said above there's no simple C-API, but you can still access the mask attribute, etc., just need to be careful with refcounts and all; also note that the renderers "already" have to handle nans so they normally already have an "invalid value" code path).

@timhoffm
Copy link
Member

Ok, did only read the full thread after reviewing the code 😏. Comments withdrawn as there are more fundamental issues to discuss first.

@efiring
Copy link
Member Author

efiring commented Nov 19, 2018

It turns out that Collection was handling masked arrays as intended, but there was a bug in scatter, and one in its helper, _combine_masks. In addition to fixing those (I hope), I added a test for the plotnonfinite=False case.

@anntzer
Copy link
Contributor

anntzer commented Nov 19, 2018

Indeed, looks like this fixes most issues.
Last point: right now infinite-valued points are plotted (when plotnonfinite=True) using the bad color, rather than the over/under colors, which I think is a more intuitive behavior. (However looking at the colors.py code it's not clear it's that easy to fix, as this would effectively require computing the min/max having masked out the infinities, but then unmask them for the colormapping. So perhaps it's not worth it, at least for this PR.)

@anntzer anntzer dismissed their stale review November 19, 2018 10:14

comments handled

@anntzer
Copy link
Contributor

anntzer commented Dec 12, 2018

I force pushed the branch with the image edited out, but the rest of the commit still here.

@tacaswell tacaswell merged commit 11072f2 into matplotlib:master Dec 17, 2018
@tacaswell
Copy link
Member

Thanks @efiring and @anntzer !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants