Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored matplotlib ElementPlots by adding higher-level API #438

Merged
merged 34 commits into from Feb 16, 2016

Conversation

Projects
None yet
2 participants
@philippjfr
Copy link
Contributor

philippjfr commented Feb 3, 2016

This refactor avoids a lot of the duplication that has spread across all existing matplotlib plots. It should also make it simpler for a user to create a new plot. I'll summarize the overall API once I've worked on it a little bit more.

Philipp Rudiger added some commits Feb 3, 2016

@jlstevens

This comment has been minimized.

Copy link
Contributor

jlstevens commented Feb 3, 2016

Great! Will this put us in a position where we can document these classes and their methods properly? Maybe even write some sort of developer documentation about it?

@philippjfr

This comment has been minimized.

Copy link
Contributor Author

philippjfr commented Feb 13, 2016

The general idea behind this PR is to simplify the implementation of matplotlib plotting classes to reduce redundancy and make it easier to implement new matplotlib plotting classes.

Previously the main methods to implement a plotting class were:

  • initialize_plot: Would have to handle everything from unpacking the data in the element, getting ranges, getting styles, calling the the plotting function, storing the artist on handles and call finalize_axis. This meant that there were ~10 lines of boilerplate per plotting class.
  • update_handles: This method would update existing artists by accessing the handles and manually setting the new data. This had to be implemented if just to tear down the existing artist and redraw it from scratch. It also required boilerplate to get the styles and unpack the data in a very similar way as the initialize_plot call would already have done.

To make this old API workable a lot of plotting classes implemented custom workarounds, e.g. so that each plot didn't have two separate implementations to unpack the element. However in developing the bokeh backend I came up with a better solution. I've now ported some of the ideas from the bokeh backend to matplotlib. It's important to mention that because the matplotlib API is sometimes awkward this new API is entirely optional and some plots will retain the old API for now.

The general idea is to separate the unpacking of the data and keyword arguments from the element from the actual plotting. The ElementPlot baseclass now has a default implementation of initialize_plot, which does all the boilerplate stuff mentioned above, then calls a new method called get_data and passes the output from that method to another new method called init_artist. These two methods are generally enough to implement a plot type but if you want to manually update the handles on a plot instead of recreating them you also implement a update_handles method, which can itself call get_data to get the latest data. If the update_handles isn't subclassed the previous artist will be torn down by a method called teardown_handles and then reinitialized using init_artist.

The signature for these methods is the following:

  • get_data [Required]:
    • Inputs: element, ranges, style
    • Outputs: plot_args, plot_kwargs, axis_kwargs
    • This method should unpack the element and return the appropriate arguments for the plotting call. Additionally it may also supply axis_kwargs, which allow supplying custom ticks and axis labels (e.g. think of a HeatMap where the x-axis and y-axis ticks are derived from the data).
  • init_artist [Required]:
    • Inputs: axis, plot_args, plot_kwargs
    • Output: Dictionary of artists
    • This method is given an axis to draw onto as well as plot_args and plot_kwargs to pass to the plotting call, should return a dictionary of artists, by convention the main artist should have the key 'artist'.
  • update_handles [Optional]:
    • Optional method to update existing handles, if not implemented will simply recreate artist for each frame. Otherwise should use get_data method to unpack required data and then update artists manually with the data.
    • Inputs: key, axis, element, ranges, style
    • Output: Optional dictionary of axis arguments, e.g. xticks, yticks etc.
  • teardown_handles [Optional]:
    • Only required if some custom teardown procedure is required, i.e. if no update_handles method has been implemented and the plot does not consist of a simple matplotlib artist or collection. Should go through handles and remove any which are to be redrawn.
    • Inputs: None
    • Outputs: None

Overall implementing all this roughly halves the plotting code required and in future when matplotlib provides a better interface to update existing artists, it should be possible to feed he output of get_data straight to the artist without manually updating various attributes. As you can see this almost halves the length of the various plotting implementations I have refactored but there's still full flexibility to override initialize_plot and update_handles completely as is still the case for Histogram and Bar plots, where more flexibility is required.

@philippjfr philippjfr added this to the v1.5.0 milestone Feb 13, 2016

@philippjfr philippjfr self-assigned this Feb 13, 2016

Philipp Rudiger added some commits Feb 13, 2016

@philippjfr philippjfr force-pushed the mpl_refactor branch from 192f63e to 17e2778 Feb 13, 2016

@philippjfr philippjfr force-pushed the mpl_refactor branch from 04e5f6d to 5d0c781 Feb 14, 2016

@philippjfr philippjfr force-pushed the mpl_refactor branch from 5d0c781 to 8614435 Feb 14, 2016

@philippjfr philippjfr force-pushed the mpl_refactor branch from 315b06e to 003e9b7 Feb 14, 2016

Philipp Rudiger added some commits Feb 14, 2016

@philippjfr

This comment has been minimized.

Copy link
Contributor Author

philippjfr commented Feb 15, 2016

Okay I was wrong the 3d stuff wasn't the only change. There are three display changes:

  1. 3d plot edgecolors are handled better now resulting in some changes in Elements tutorial.
  2. HeatMap annotations now drawn slightly differently resulting in one display change in Continuous Coordinates.
  3. VectorFields thickness has changed and I can't figure out why. I checked that all the arguments to the quiver call are the same in both implementations. Mysterious but barely visually noticeable.

The first two are fixes/improvements and I really don't know what to do about 3). All three are very minor and purely aesthetic changes, so I'll focus on documenting now.


ranges = self.compute_ranges(self.hmap, key, ranges)
ranges = match_spec(element, ranges)
def init_artist(self, ax, plot_data, plot_kwargs):

This comment has been minimized.

@jlstevens

jlstevens Feb 16, 2016

Contributor

Should be called init_artists if multiple artists can be initialized here.

This comment has been minimized.

@philippjfr

philippjfr Feb 16, 2016

Author Contributor

Agreed, will rename.

@philippjfr philippjfr changed the title Initial refactor of matplotlib ElementPlots Refactored matplotlib ElementPlots by adding higher-level API Feb 16, 2016

return self._finalize_axis(self.keys[-1], ranges=ranges)
if element.get_dimension(self.size_index):
sizes = element.dimension_values(self.size_index)
ms = style.pop('s') if 's' in style else plt.rcParams['lines.markersize']

This comment has been minimized.

@jlstevens

jlstevens Feb 16, 2016

Contributor

Happy to use rcParams here? I thought rcParams could sometimes be tricky...

This comment has been minimized.

@philippjfr

philippjfr Feb 16, 2016

Author Contributor

Hmm I don't think there are any particular issues but I guess we could require that a 's' is supplied to apply points scaling.

This comment has been minimized.

@philippjfr

philippjfr Feb 16, 2016

Author Contributor

I think it's best to postpone this to another PR as to avoid changing any more test data.



def _get_min_dist(self, vfield):
"Get the minimum sampling distance."
xys = np.array([complex(x,y) for x,y in zip(vfield.dimension_values(0),
vfield.dimension_values(1))])
xys = vfield.array([0, 1]).view(dtype=np.complex128)

This comment has been minimized.

@jlstevens

jlstevens Feb 16, 2016

Contributor

Does 128 matter? Won't any complex datatype do and if so, why specify 128?

This comment has been minimized.

@philippjfr

philippjfr Feb 16, 2016

Author Contributor

Good point, it's relying on the two columns being 64-bit, will have to double check this or revert it.

artist = self.handles['artist']
data, array, clim = self.get_data(element, ranges)
(data,), kwargs, axis_kwargs = self.get_data(element, ranges, style)

This comment has been minimized.

@jlstevens

jlstevens Feb 16, 2016

Contributor

A unituple? I assume there are instances where this tuple is longer?

Edit: Later on I see it returned as a list. Are you expecting a tuple or a list? A list seems sensible and works nicer to pattern match a single element...

This comment has been minimized.

@philippjfr

philippjfr Feb 16, 2016

Author Contributor

It's the arguments to the plotting call and you're right they should be consistent.

def get_data(self, element, ranges, style):
xs, ys, zs = (element.dimension_values(i) for i in range(3))
self._compute_styles(element, ranges, style)
# Temporary fix until color handling is deterministic in py3

This comment has been minimized.

@jlstevens

jlstevens Feb 16, 2016

Contributor

Is it python 3 or matplotlib the issue here?

This comment has been minimized.

@philippjfr

philippjfr Feb 16, 2016

Author Contributor

The behavior seems to be deterministic in python 2 but not python 3, not sure if it's down to dictionary ordering or something else.

cdim = points.get_dimension(self.color_index)
style = self.style[self.cyclic_index]
offsets, style, _ = self.get_data(element, ranges, style)
artist._offsets3d = offsets

This comment has been minimized.

@jlstevens

jlstevens Feb 16, 2016

Contributor

You don't seem to use offset later so I would assign it straight to artist._offsets3d.

artist = axis.plot_trisurf(x, y, z, vmax=vrange[1],
vmin=vrange[0], **style_opts)
self.handles['artist'] = artist
return (x, y, z), style, {}

This comment has been minimized.

@jlstevens

jlstevens Feb 16, 2016

Contributor

Now back to returning data as a tuple...

This comment has been minimized.

@philippjfr

philippjfr Feb 16, 2016

Author Contributor

Right as above, they should be consistent.




class DrawPlot(ElementPlot):

This comment has been minimized.

@jlstevens

jlstevens Feb 16, 2016

Contributor

Good to see a class go. Think we have any backward compatibility concerns about this though?

This comment has been minimized.

@philippjfr

philippjfr Feb 16, 2016

Author Contributor

Bas' Sphere thing is the only example of this that I know of.

This comment has been minimized.

@jlstevens

jlstevens Feb 16, 2016

Contributor

I also had a networkX prototype that used this but that never went anywhere...

This comment has been minimized.

@philippjfr

philippjfr Feb 16, 2016

Author Contributor

Basically all you do now is define a teardown_handles method containing:

if self.zorder == 0: self.handles['axis'].cla()

and then define a get_data and init_artist method, if you ever figure out how to update the plot by changing an existing artist you just replace the teardown_handles method with an update_handles method and it will use that instead.

@jlstevens

This comment has been minimized.

Copy link
Contributor

jlstevens commented Feb 16, 2016

Made a few minor comments but overall it does seem like a nice improvement. As it is a fairly large PR, I have a few general comments:

  • We need to be very careful when updating the test data. We must make sure we know exactly what changed and that it is ok.
  • Probably worth running flakes (i.e flake8) on your branch to see if it catches anything before merging. Sometimes flakes helps catch actual bugs...
  • Originally the handles dictionary was meant for matplotlib artists. Seems handles can now hold almost any matplotlib component...

Anyway, I'm happy to merge once these issues are addressed. Frankly, I expect we will find bugs related to this refactor but I do think this is worth having cleaner, more easily understandable code.

@philippjfr

This comment has been minimized.

Copy link
Contributor Author

philippjfr commented Feb 16, 2016

We need to be very careful when updating the test data. We must make sure we know exactly what changed and that it is ok.

I've carefully reviewed the test data changes and listed the 3 things that have changed above. The only thing I'm sort of worried about is the VectorField changes as their width has changed slightly and I can't figure out why.

Probably worth running flakes (i.e flake8) on your branch to see if it catches anything before merging. Sometimes flakes helps catch actual bugs.

Definitely a good idea.

Originally the handles dictionary was meant for matplotlib artists. Seems handles can now hold almost any matplotlib component...

We've always been putting the axis and figure in it, and apart from that it's generally only used for artists. The artist key is treated somewhat specially as that is where the legend code expects to find the artist and by default that is what gets removed if you don't define an update_handles method so that it can simply be redrawn completely.

@jlstevens

This comment has been minimized.

Copy link
Contributor

jlstevens commented Feb 16, 2016

I understand this is ready for a merge and the tests are passing. I'm glad that the plotting code will be clearer to understand after this!

jlstevens added a commit that referenced this pull request Feb 16, 2016

Merge pull request #438 from ioam/mpl_refactor
Refactored matplotlib ElementPlots by adding higher-level API

@jlstevens jlstevens merged commit 5c0a001 into master Feb 16, 2016

3 of 4 checks passed

continuous-integration/travis-ci/push The Travis CI build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.1%) to 69.198%
Details
s3-reference-data-cache Test data is cached.
Details

@philippjfr philippjfr modified the milestones: 1.4.4, v1.5.0 Feb 25, 2016

@philippjfr philippjfr deleted the mpl_refactor branch Apr 1, 2016

@philippjfr philippjfr modified the milestones: v1.5.0, 1.4.4 Apr 20, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.