Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored matplotlib ElementPlots by adding higher-level API #438

Merged
merged 34 commits into from Feb 16, 2016

Conversation

philippjfr
Copy link
Member

This refactor avoids a lot of the duplication that has spread across all existing matplotlib plots. It should also make it simpler for a user to create a new plot. I'll summarize the overall API once I've worked on it a little bit more.

@jlstevens
Copy link
Contributor

Great! Will this put us in a position where we can document these classes and their methods properly? Maybe even write some sort of developer documentation about it?

@philippjfr
Copy link
Member Author

The general idea behind this PR is to simplify the implementation of matplotlib plotting classes to reduce redundancy and make it easier to implement new matplotlib plotting classes.

Previously the main methods to implement a plotting class were:

  • initialize_plot: Would have to handle everything from unpacking the data in the element, getting ranges, getting styles, calling the the plotting function, storing the artist on handles and call finalize_axis. This meant that there were ~10 lines of boilerplate per plotting class.
  • update_handles: This method would update existing artists by accessing the handles and manually setting the new data. This had to be implemented if just to tear down the existing artist and redraw it from scratch. It also required boilerplate to get the styles and unpack the data in a very similar way as the initialize_plot call would already have done.

To make this old API workable a lot of plotting classes implemented custom workarounds, e.g. so that each plot didn't have two separate implementations to unpack the element. However in developing the bokeh backend I came up with a better solution. I've now ported some of the ideas from the bokeh backend to matplotlib. It's important to mention that because the matplotlib API is sometimes awkward this new API is entirely optional and some plots will retain the old API for now.

The general idea is to separate the unpacking of the data and keyword arguments from the element from the actual plotting. The ElementPlot baseclass now has a default implementation of initialize_plot, which does all the boilerplate stuff mentioned above, then calls a new method called get_data and passes the output from that method to another new method called init_artist. These two methods are generally enough to implement a plot type but if you want to manually update the handles on a plot instead of recreating them you also implement a update_handles method, which can itself call get_data to get the latest data. If the update_handles isn't subclassed the previous artist will be torn down by a method called teardown_handles and then reinitialized using init_artist.

The signature for these methods is the following:

  • get_data [Required]:
    • Inputs: element, ranges, style
    • Outputs: plot_args, plot_kwargs, axis_kwargs
    • This method should unpack the element and return the appropriate arguments for the plotting call. Additionally it may also supply axis_kwargs, which allow supplying custom ticks and axis labels (e.g. think of a HeatMap where the x-axis and y-axis ticks are derived from the data).
  • init_artist [Required]:
    • Inputs: axis, plot_args, plot_kwargs
    • Output: Dictionary of artists
    • This method is given an axis to draw onto as well as plot_args and plot_kwargs to pass to the plotting call, should return a dictionary of artists, by convention the main artist should have the key 'artist'.
  • update_handles [Optional]:
    • Optional method to update existing handles, if not implemented will simply recreate artist for each frame. Otherwise should use get_data method to unpack required data and then update artists manually with the data.
    • Inputs: key, axis, element, ranges, style
    • Output: Optional dictionary of axis arguments, e.g. xticks, yticks etc.
  • teardown_handles [Optional]:
    • Only required if some custom teardown procedure is required, i.e. if no update_handles method has been implemented and the plot does not consist of a simple matplotlib artist or collection. Should go through handles and remove any which are to be redrawn.
    • Inputs: None
    • Outputs: None

Overall implementing all this roughly halves the plotting code required and in future when matplotlib provides a better interface to update existing artists, it should be possible to feed he output of get_data straight to the artist without manually updating various attributes. As you can see this almost halves the length of the various plotting implementations I have refactored but there's still full flexibility to override initialize_plot and update_handles completely as is still the case for Histogram and Bar plots, where more flexibility is required.

@philippjfr
Copy link
Member Author

Okay I was wrong the 3d stuff wasn't the only change. There are three display changes:

  1. 3d plot edgecolors are handled better now resulting in some changes in Elements tutorial.
  2. HeatMap annotations now drawn slightly differently resulting in one display change in Continuous Coordinates.
  3. VectorFields thickness has changed and I can't figure out why. I checked that all the arguments to the quiver call are the same in both implementations. Mysterious but barely visually noticeable.

The first two are fixes/improvements and I really don't know what to do about 3). All three are very minor and purely aesthetic changes, so I'll focus on documenting now.


ranges = self.compute_ranges(self.hmap, key, ranges)
ranges = match_spec(element, ranges)
def init_artist(self, ax, plot_data, plot_kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be called init_artists if multiple artists can be initialized here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, will rename.

@philippjfr philippjfr changed the title Initial refactor of matplotlib ElementPlots Refactored matplotlib ElementPlots by adding higher-level API Feb 16, 2016
return self._finalize_axis(self.keys[-1], ranges=ranges)
if element.get_dimension(self.size_index):
sizes = element.dimension_values(self.size_index)
ms = style.pop('s') if 's' in style else plt.rcParams['lines.markersize']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to use rcParams here? I thought rcParams could sometimes be tricky...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I don't think there are any particular issues but I guess we could require that a 's' is supplied to apply points scaling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's best to postpone this to another PR as to avoid changing any more test data.

@jlstevens
Copy link
Contributor

Made a few minor comments but overall it does seem like a nice improvement. As it is a fairly large PR, I have a few general comments:

  • We need to be very careful when updating the test data. We must make sure we know exactly what changed and that it is ok.
  • Probably worth running flakes (i.e flake8) on your branch to see if it catches anything before merging. Sometimes flakes helps catch actual bugs...
  • Originally the handles dictionary was meant for matplotlib artists. Seems handles can now hold almost any matplotlib component...

Anyway, I'm happy to merge once these issues are addressed. Frankly, I expect we will find bugs related to this refactor but I do think this is worth having cleaner, more easily understandable code.

@philippjfr
Copy link
Member Author

We need to be very careful when updating the test data. We must make sure we know exactly what changed and that it is ok.

I've carefully reviewed the test data changes and listed the 3 things that have changed above. The only thing I'm sort of worried about is the VectorField changes as their width has changed slightly and I can't figure out why.

Probably worth running flakes (i.e flake8) on your branch to see if it catches anything before merging. Sometimes flakes helps catch actual bugs.

Definitely a good idea.

Originally the handles dictionary was meant for matplotlib artists. Seems handles can now hold almost any matplotlib component...

We've always been putting the axis and figure in it, and apart from that it's generally only used for artists. The artist key is treated somewhat specially as that is where the legend code expects to find the artist and by default that is what gets removed if you don't define an update_handles method so that it can simply be redrawn completely.

@jlstevens
Copy link
Contributor

I understand this is ready for a merge and the tests are passing. I'm glad that the plotting code will be clearer to understand after this!

jlstevens added a commit that referenced this pull request Feb 16, 2016
Refactored matplotlib ElementPlots by adding higher-level API
@jlstevens jlstevens merged commit 5c0a001 into master Feb 16, 2016
@philippjfr philippjfr modified the milestones: 1.4.4, v1.5.0 Feb 25, 2016
@philippjfr philippjfr deleted the mpl_refactor branch April 1, 2016 14:28
@philippjfr philippjfr modified the milestones: v1.5.0, 1.4.4 Apr 20, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants