Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make pcolor more mesh-like #25027

Merged
merged 1 commit into from Jul 6, 2023
Merged

Conversation

greglucas
Copy link
Contributor

@greglucas greglucas commented Jan 19, 2023

PR Summary

This is a follow-on from #24638
and makes pcolor more similar to pcolormesh with 2D arrays of x/y/c and just uses a different draw implementation. This should maintain backwards compatibility as we are subclassing the PolyCollection still to use that draw method.

  • pcolor: draw every quad individually as Polygons
  • pcolormesh: draw all quads together (depending on backend implementation)

Basically, this is an attempt to allow all inputs/getters/setters to be able to use 2D meshes rather than requiring raveling arrays and needing to remember when to use 1D/2D quantities for various functions.

closes #25770

PR Checklist

Documentation and Tests

  • Has pytest style unit tests (and pytest passes)
  • [N/A] Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • [N/A] New plotting related features are documented with examples.

Release Notes

  • [N/A] New features are marked with a .. versionadded:: directive in the docstring and documented in doc/users/next_whats_new/
  • [N/A] API changes are marked with a .. versionchanged:: directive in the docstring and documented in doc/api/next_api_changes/
  • [N/A] Release notes conform with instructions in next_whats_new/README.rst or next_api_changes/README.rst

Copy link
Member

@efiring efiring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm strongly in favor of this, but I have questions about one detail and one matter of strategy.

X = np.hstack((X[:, [0]] - dX[:, [0]],
X[:, :-1] + dX,
X[:, [-1]] + dX[:, [-1]]))
if isinstance(X, np.ma.core.MaskedArray):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more compact you can use if np.ma.isMA(X): or if np.ma.isMaskedArray(X):. Both just call isinstance(...). If you want to use the explicit isinstance, you can still make it more compact be deleting the .core part. The most compact version would be

hstack = np.ma.hstack if np.ma.isMA(X) else np.hstack

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I also cleaned up the handling above this as well.

@@ -2111,3 +2111,39 @@ def get_cursor_data(self, event):
if contained and self.get_array() is not None:
return self.get_array().ravel()[info["ind"]]
return None


class PolyQuadMesh(PolyCollection, QuadMesh):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense to reverse the inheritance order, and use "super" below? Is the PolyQuadMesh more like a QuadMesh than a PolyCollection, falling back on the latter only for its draw method, as is already done explicitly below? Or is the current order correct with respect to all inherited methods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we actually want most of the methods from PolyCollection because we are drawing Polygons and want that set_paths() method. We really want the QuadMesh coordinates stored internally is all I think. I reworked the inheritance a bit in the most recent commit.

One thing I need to think about a bit more is how to handle the masking of paths here... len(self._paths()) != coords[:2].size when there are masked elements. So, we may actually have to overwrite the set_paths() here to make sure that we are updating the proper facecolors to correspond to the mesh.

Currently, we return a PolyCollection that has a smaller number of paths, so if you want to update the array with the proper colors you need to account for that resizing yourself. (We do that in Cartopy, so perhaps just porting that upstream here is the way to go)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out my understanding of multiple inheritance was, and probably still is, incomplete. There are arguments in favor of "composition instead of inheritance", and I wonder whether this is an example where using composition--making an internal QuadMesh instance, and referring to its contents explicitly when needed--might be more readable, robust, and maintainable. It would reduce the inheritance to single--much easier to fit in one's brain.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, is there a part of QuadMesh code that should be factored out into a mixin?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a _MeshData mixin as a separate commit. Let me know your thoughts on how that organization seems now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mixin looks fine to me.

@greglucas
Copy link
Contributor Author

Discussed this on the call today and there was some interest in pursuing the class mixin idea here to try and make the multiple-inheritance a little bit more manageable.

We want the coordinate handling of QuadMesh, but the drawing capability of PolyCollection. The mixin will likely look like some 2D coordinate handling class without a .draw() method. Then the current QuadMesh would still implement its own .draw() relying on the coordinate handling from the mixin. This would eliminate some of the confusion for where the coordinate handling/drawing are coming from if the mixin is designed to only handle the 2D coordinate work and not the drawing.

class QuadCoordinatesMixin
    def __init__(self, coordinates):
        self._coordinates = coordinates
    
    def set_array(self, A):
        # current quadmesh array validation logic for the shapes

    def get_coordinates(self):
        return self._coordinates

    # ... other (static)methods like get/set_paths()?

 class QuadMesh(QuadCoordinatesMixin, Collection):
    # override draw/path methods from Collection

class PolyQuadMesh(QuadCoordinatesMixin, PolyCollection):
    # override draw/path methods from PolyCollection to utilize 2D coordinates from the mixin

Other thoughts and comments on the design are more than welcome!

@jklymak jklymak added the status: needs comment/discussion needs consensus on next step label Feb 6, 2023
@jklymak
Copy link
Member

jklymak commented Feb 20, 2023

This seems to be in draft state, but feel free to move back...

@greglucas
Copy link
Contributor Author

@rcomer, I think it was fairly straight forward to support RGB(A) here as well. I just pushed a new commit that seems to work well with that case, although I didn't add any tests yet.

@efiring, pinging you to get your thoughts on this restructure again.

@rcomer
Copy link
Member

rcomer commented Apr 29, 2023

I tried this out with Cartopy, and it breaks the usage there because Cartopy expects a masked array to be compressed and subsequently passes an array to set_array based on that assumption. Here is a simple example:

import matplotlib.pyplot as plt
import numpy as np

arr = np.ma.array(np.arange(12).reshape(3, 4), mask=[[0, 0, 0, 0],
                                                     [1, 1, 1, 1],
                                                     [0, 1, 0, 1]])
               
fig, ax = plt.subplots()
pc = ax.pcolor(arr)

pc.set_array(np.ones(6))

plt.show()

With v3.7 and main, this gives

test

With this branch we get

ValueError: For X (5) and Y (4) with flat shading, A should have shape (3, 4, 3) or (3, 4, 4) or (3, 4) or (12,), not (6,)

There is also handling in Cartopy that expects to get the compressed array out of get_array. It's very easy to fix all of this in Cartopy, but I wonder if there should be a deprecation pathway?

@greglucas
Copy link
Contributor Author

I'm not sure if the "compressed" nature of the output is a feature or a bug... A similar "bug" for scatter changing the size of the array and the colors #9891. In Cartopy, we decided to work with the compression, but I really don't think that is the right way to do it because you have to keep track of everything externally and remember how you compressed your masks each time. I've just pushed up a new commit that changes the number of polys based on the masked array passed into set_array() and grows/shrinks appropriately.

One thing this does bring up though is that there is an inconsistency between the array() methods which assume 2d in this PR and facecolors() / hatch etc.. that are all on the flattened PolyCollection with masked elements removed. So I am making a bit of an assumption that the array is the quantity of importance to the MeshCollection and not the others...

I checked seaborn and Astropy and didn't see any usage of pcolor(), so I am guessing this is pretty limited in actual impacted users.

@efiring
Copy link
Member

efiring commented May 1, 2023

I think the compression is a misfeature, an unfortunate historical artifact, so the question is not whether to keep it in the long run but whether to try to support it through a deprecation cycle. I haven't looked closely, but it seems like temporarily continuing to allow the compressed 1-D input (with a deprecation warning) could be done now that you are saving the mask.

@rcomer
Copy link
Member

rcomer commented May 1, 2023

I completely agree that it's better without the compression. It's just the transition that might be a problem. This will break users of GeoAxes.pcolormesh with wrapping until they get an updated Cartopy. (Apologies if you already thought of that and have plans for a quick release of Cartopy).

@greglucas
Copy link
Contributor Author

Thanks for both of your suggestions! I added a deprecation warning for when the compressed size of the array is used. We still defer to super's set_array() by making our own filled array, so this special casing just needs to be removed in a couple of releases.

@efiring
Copy link
Member

efiring commented May 14, 2023

I'm not at all sure it would be worthwhile, but it would be possible to give the _MeshData.get_facecolor etc. methods a flatten kwarg so that they would flatten for PolyQuadMesh, but maintain their dimensions for QuadMesh, making it more consistently 2-D.

@greglucas
Copy link
Contributor Author

I'm not at all sure it would be worthwhile, but it would be possible to give the _MeshData.get_facecolor etc. methods a flatten kwarg so that they would flatten for PolyQuadMesh, but maintain their dimensions for QuadMesh, making it more consistently 2-D.

I agree it would be nicer for consistency and that is actually the way it is on the current main branch after #24638. But, a few downstream packages test against get_facecolors() and thus expect the flattened shape (apparently get_array() is not as much of an issue downstream which I guess makes some sense in they already know the data they passed in so don't need to test against that yet again). The discussion over flattening or not for QuadMesh is in this issue: #25162

So the current state of this PR would have array/data as 2d arrays, but facecolors and edgecolors (and all other collection properties) are always flattened. I think this is reasonable for taking incremental steps forward here and we can approach turning more collection properties into 2d meshes in future PRs if we think it is worthwhile.

@greglucas greglucas force-pushed the pcolor-2dmesh branch 3 times, most recently from 7486412 to f182fc7 Compare June 3, 2023 23:17
super().__init__(coordinates=coordinates, shading=shading)
Collection.__init__(self, **kwargs)

self._antialiased = antialiased
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using a different approach to handling the antialiased kwarg in #26062. See lines 1991-2007 there in the modified collections.py file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'm happy to rebase/update if that goes in first. This is just using the same logic that was there before. I'm trying to keep this PR focused on just the substantial rewrites and leave other updates for follow-on PRs.

@tacaswell tacaswell added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Jun 15, 2023
@greglucas greglucas force-pushed the pcolor-2dmesh branch 2 times, most recently from 29709b2 to 890afaf Compare June 16, 2023 02:41
@greglucas
Copy link
Contributor Author

@jklymak I added two new notes, one for the deprecation and one trying to give a description of the pcolor() capabilities including masking and hatching in a "whats new" entry. Open to other suggestions if you think something else would be better here.

@@ -23,11 +23,11 @@
"lib/matplotlib/colorbar.py:docstring of matplotlib.colorbar:1"
],
"matplotlib.axes.Axes.patch": [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this file getting hit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc build was complaining about inherited private members (if I remember correctly). This file is automatically generated, so there are also other missing references that have been updated unrelated to this PR:

matplotlib/doc/conf.py

Lines 184 to 185 in f588d2b

# change this to True to update the allowed failures
missing_references_write_json = False

``PolyQuadMesh`` is a new class for drawing quadrilateral meshes
----------------------------------------------------------------

``plt.pcolor()`` now returns a ``PolyQuadMesh`` class. It is a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link to the class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's helpful to point out what used to be returned and maybe any major usage changes we expect the user to see (if any)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated it with references to what used to be returned PolyCollection, and tried to indicate that we are subclassing PolyCollection still, so things should hopefully be transparent to the end-user.

Copy link
Member

@jklymak jklymak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.pcolormesh.html#differences-pcolor-pcolormesh needs to be modified, and that is probably a good place to explain a bit more about what is going on here.

@greglucas
Copy link
Contributor Author

I think https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.pcolormesh.html#differences-pcolor-pcolormesh needs to be modified, and that is probably a good place to explain a bit more about what is going on here.

Thanks for pointing that out, I updated that with a new sentence at the end and updated the mentions of PolyCollection to point to PolyQuadMesh now.

@@ -0,0 +1,4 @@
``PolyQuadMesh.set_array()`` with compressed values
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
... is deprecated. Instead, pass the full 2D array of values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is note isn't useful to users as PolyQuadMesh is new. I think you need something like the object returned from pcolor wants 2D values passed to set_array. "Compressed values" isn't right either. I think you need to be more loquacious here. Maybe:

Object returned by `pcolor` has changed to `PolyQuadMesh`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The old object was a `PolyCollection` with flattened vertices and array data.   
The new  `set_array` method on `PolyQuadMesh` requires a 2D array rather than a flattened 
array.  The old flattened data is still accepted, but deprecated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I expanded upon this. But, just to note that flattened data can be input still (same as QuadMesh), the only thing deprecated is the part where a mask was used and previously compressed and lost information about which polygons were present.

return ~(mask | self._original_mask)
# Take account of the array data too, temporarily avoiding
# the compression warning and resetting the variable after the call
temp_deprecated_compression = self._deprecated_compression
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use _setattr_cm here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, much nicer, thanks! Just pushed that update. Yes, I believe this is ready to go.

@tacaswell
Copy link
Member

I think this is ready to go?

This adds a new collection PolyQuadMesh that is a combination of
a PolyCollection and QuadMesh, storing all of the coordinate data
and arrays through QuadMesh, but drawing each quad individually
through the PolyCollection.draw() method.

The common mesh attributes and methods are stored in a private
mixin class _MeshData that is used for both QuadMesh and PolyQuadMesh.

Additionally, this supports RGB(A) arrays for pcolor() calls now.

Previously to update a pcolor array you would need to set only the
compressed values. This required keeping track of the mask by the user.
For consistency with QuadMesh, we should only accept full 2D arrays
that include the mask if a user desires. This also allows for the
mask to be updated when setting an array.
@@ -818,12 +823,12 @@ def test_autolim_with_zeros(transform, expected):
np.testing.assert_allclose(ax.get_xlim(), expected)


def test_quadmesh_set_array_validation():
def test_quadmesh_set_array_validation(pcfunc):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional, but I'd rather put an explicit parametrization on the handful of functions that use this. Because of the indirections, it's hard to see what's going on with the fixture.

Suggested change
def test_quadmesh_set_array_validation(pcfunc):
@pytest.mark.parametrize(pcfunc, ["pcolormesh", "pcolor"])
def test_quadmesh_set_array_validation(pcfunc):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong preference either, probably slightly lean towards fixture as it currently is. I am away from a computer for a week though, so if someone wants to make updates to this feel free to push to my branch.

Copy link
Member

@jklymak jklymak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets merge and follow up if anything is problematic

@jklymak jklymak merged commit d59d7e5 into matplotlib:main Jul 6, 2023
37 of 39 checks passed
@scottshambaugh
Copy link
Contributor

scottshambaugh commented Jul 6, 2023

@jklymak It looks like this is failing CI on the main branch. Not immediately obvious to me what the issue is.

@jklymak
Copy link
Member

jklymak commented Jul 6, 2023

Can you elaborate on what broke

@rcomer
Copy link
Member

rcomer commented Jul 6, 2023

The problem is that the x- and y-values in the test have alternating columns that are all masked, so become completely masked arrays within _interp_grid. Previously np.hstack was used, which would have stripped the mask. But now we're using np.ma.hstack.

def _interp_grid(X):
# helper for below
if np.shape(X)[1] > 1:
dX = np.diff(X, axis=1)/2.
if not (np.all(dX >= 0) or np.all(dX <= 0)):
_api.warn_external(
f"The input coordinates to {funcname} are "
"interpreted as cell centers, but are not "
"monotonically increasing or decreasing. "
"This may lead to incorrectly calculated cell "
"edges, in which case, please supply "
f"explicit cell edges to {funcname}.")
hstack = np.ma.hstack if np.ma.isMA(X) else np.hstack
X = hstack((X[:, [0]] - dX[:, [0]],
X[:, :-1] + dX,
X[:, [-1]] + dX[:, [-1]]))

I guess the question is whether the test represents a realistic use-case.

@jklymak
Copy link
Member

jklymak commented Jul 6, 2023

I'm confused why this was passing CI before?

@rcomer
Copy link
Member

rcomer commented Jul 7, 2023

I'm confused why this was passing CI before?

Having slept on it, I realise it shouldn't have. I have opened #26273 to modify the test.

@greglucas greglucas deleted the pcolor-2dmesh branch July 9, 2023 20:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. status: needs comment/discussion needs consensus on next step topic: pcolor/pcolormesh
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENH]: support RGB(A) in pcolor
7 participants