ENH: Pickle backend #9357

Closed
jnothman opened this issue Oct 11, 2017 · 15 comments

Comments

@jnothman

Feature summary

In non-interactive use, I often want to generate numerous plots, but sometimes I want to be able to tweak them after the fact. I would ordinarily use a script that allows me to specify an output filename (or filename format), with savefig used to dump multiple figures; but pdf, png, svg, etc. are too lossy to tweak after the fact. If I could specify /path/to/output.pkl and the script would automatically dump a pickle, then the layout could be tweaked later without having to modify the plotting script (which I may or may not have developed). Indeed, I could have generic post-processing scripts that tweak the layout and dump to disk again using the backend/format of my choice.
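A generic post-processing step of the kind described above could look like the following sketch. It assumes the pickle was written by the same matplotlib version that loads it, and `retouch`, its parameters, and the file names are hypothetical, not part of matplotlib.

```python
import pickle

import matplotlib
matplotlib.use("Agg")  # assumption: headless, non-interactive rendering
import matplotlib.pyplot as plt


def retouch(pickle_path, output_path, title=None, figsize=None):
    """Load a pickled figure, adjust its layout, and re-render it.

    `retouch` is a hypothetical helper, not a matplotlib function.
    """
    with open(pickle_path, "rb") as f:
        fig = pickle.load(f)  # only unpickle files you trust
    if title is not None:
        fig.suptitle(title)
    if figsize is not None:
        fig.set_size_inches(*figsize)
    # The output format is inferred from the extension (png, pdf, svg, ...).
    fig.savefig(output_path)
    return fig
```

A call such as `retouch("figure.pkl", "figure.pdf", figsize=(8, 3))` would then re-render an earlier result without rerunning the plotting script.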

It looks like it would be quite straightforward to implement a backend to perform pickle output (although there may be questions about whether a more efficient (joblib) or more lenient (dill) pickle variant should be employed).

I could implement it for myself, but I'm not sure how to then automatically register the backend, so I can use an existing script containing a savefig. (Or have I missed the existence of some matplotlibrc register_backend command?)

Do others think this would be useful?

@jklymak
Member

jklymak commented Oct 11, 2017

You can already pickle figure objects. Did you mean something more than this?

import numpy as np
import matplotlib.pyplot as plt
import pickle as pl

fig, ax = plt.subplots()
x = np.linspace(0,2*np.pi)
y = np.sin(x)
ax.plot(x,y)

# Save figure handle to disk (open() replaces the Python 2-only file() builtin)
with open('figure.pickle', 'wb') as f:
    pl.dump(fig, f)

@jnothman
Author

jnothman commented Oct 11, 2017

I mean that fig.savefig('out.pkl') should be supported. Thus if I have a command-line script, for instance, which saves a series of figures to disk at specified paths, I need only provide a pkl extension for it to dump something I can still interact with, or modify before rendering to image.

@anntzer
Contributor

anntzer commented Oct 11, 2017

FWIW a similar idea was rejected at #3819 (and I now agree with the rationale there).

In practice this can easily be implemented by adding a print_pkl method on the canvas (see backend_bases.py):

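import pickle
import matplotlib.pyplot as plt
# savefig dispatches on the file extension to canvas.print_<ext>
canvas = plt.gcf().canvas
canvas.print_pkl = lambda fname, **kwargs: pickle.dump(canvas.figure, open(fname, "wb"))
plt.savefig("...")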

which you can nicely wrap in an enable_pkl or whatever function.
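One way such a wrapper might look is sketched below. The name `enable_pkl` comes from the comment above, but its signature and body are assumptions; the sketch also assumes the default layout settings (no tight bounding box or layout engine), since those trigger an extra draw pass that this simple method does not handle.

```python
import pickle

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for the demo
import matplotlib.pyplot as plt


def enable_pkl(fig):
    """Attach a print_pkl method to the canvas of `fig`.

    savefig looks up canvas.print_<format>, so a ".pkl" path (or
    format="pkl") is routed here. Hypothetical helper, not matplotlib API.
    """
    canvas = fig.canvas

    def print_pkl(fname, **kwargs):
        # kwargs (dpi, facecolor, ...) are accepted and ignored.
        with open(fname, "wb") as f:
            pickle.dump(canvas.figure, f)

    canvas.print_pkl = print_pkl
    return fig
```

After `enable_pkl(fig)`, either `fig.savefig("out.pkl")` or `fig.savefig("out.dat", format="pkl")` should write a pickle of that one figure; other figures are unaffected because only this canvas instance is modified.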

@jnothman
Author

jnothman commented Oct 11, 2017 via email

@anntzer
Contributor

anntzer commented Oct 11, 2017

You can explicitly pass a format to savefig; the only question is whether to actively encourage people to save pickles. Right now we are definitely not making guarantees for cross-version compatibility of pickles, which is IMO the main argument for not encouraging their use.

@jnothman
Author

jnothman commented Oct 11, 2017 via email

@anntzer
Contributor

anntzer commented Oct 11, 2017

Alternatively, you can just patch the print_pkl method into FigureCanvasBase (e.g. you may have a utils module that you generally import? just import matplotlib and patch it from there) and be done with it.

@jnothman
Author

jnothman commented Oct 11, 2017 via email

@tacaswell
Member

👎 We keep the pickle support working primarily to support multi-process work (for the case where you have a figure that is expensive to generate but small in the end).

Please do not patch FigureCanvasBase on import.

See http://matplotlib.org/tutorials/introductory/usage.html#backends. If you have a custom subclass of your backend of choice that provides pkl, you can just do matplotlib.use("module://pickle_of_doom").
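A minimal sketch of such a backend module follows. It assumes Agg as the base backend and a matplotlib version in which a backend module only needs to expose `FigureCanvas`; the class name `FigureCanvasPickle` and the module file name are hypothetical.

```python
# Contents of a hypothetical module (e.g. pickle_of_doom.py) on the import path.
import pickle

from matplotlib.backends.backend_agg import FigureCanvasAgg


class FigureCanvasPickle(FigureCanvasAgg):
    """Agg canvas that additionally understands the "pkl" format."""

    # Build a new dict so the base class's table is not mutated.
    filetypes = {**FigureCanvasAgg.filetypes, "pkl": "Python pickle"}

    def print_pkl(self, filename, **kwargs):
        # Extra savefig keyword arguments are accepted and ignored.
        with open(filename, "wb") as f:
            pickle.dump(self.figure, f)


# matplotlib looks up this name when it loads a backend module.
FigureCanvas = FigureCanvasPickle
```

With the file importable as `pickle_of_doom`, calling `matplotlib.use("module://pickle_of_doom")` before creating figures should make `savefig("out.pkl")` reach `print_pkl`, while other extensions keep their Agg behaviour.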

@jklymak
Member

jklymak commented Oct 11, 2017

I'm still not following why pickle.dump is any worse than figure.savefig.

@jnothman
Author

@jklymak because I already have scripts using savefig. Let's make it more tangible. I'd written my scripts to use savefig for experimental analysis. Now I want to publish some of them. Rather than go hack my code, I'd like to just specify a different path, one that ends in .pkl. Then it should transparently behave as I want.

Thanks @tacaswell, I've realised a module:// backend is acceptable, as long as my base backend is consistent across platforms I want to apply this on. If I want to provide it as a tool for others to use, well, then, that's not really sufficient without having a way to allow the user to configure a base backend for my tool, and so the madness ensues (thus flat is better than nested).

Having a script which allows me to run other scripts in a context in which FigureCanvasBase, or at least the current backend, is patched is more portable.
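A runner of that kind could be sketched as below: the patch is applied to FigureCanvasBase only while the target script runs and is removed afterwards, rather than on import. The helper names `pkl_savefig` and `run_with_pkl` are hypothetical, and the sketch assumes the script saves with default layout settings.

```python
import contextlib
import pickle
import runpy

from matplotlib.backend_bases import FigureCanvasBase


@contextlib.contextmanager
def pkl_savefig():
    """Temporarily teach every canvas the "pkl" format (hypothetical helper)."""

    def print_pkl(self, filename, **kwargs):
        # Extra savefig keyword arguments are accepted and ignored.
        with open(filename, "wb") as f:
            pickle.dump(self.figure, f)

    FigureCanvasBase.print_pkl = print_pkl
    try:
        yield
    finally:
        # Undo the patch so other code sees an unmodified matplotlib.
        del FigureCanvasBase.print_pkl


def run_with_pkl(script_path):
    """Run an existing plotting script with pickle output enabled."""
    with pkl_savefig():
        return runpy.run_path(script_path, run_name="__main__")
```

Because every backend canvas inherits from FigureCanvasBase, this does not depend on which base backend the script selects, which is the portability argument made above.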

But then why would you not just offer me a way to register a backend just for one extension in my matplotlibrc?

@jnothman
Author

PS: yes, it's useful for multiprocessing; here it is for incremental processing, where similarly the figure(s) may be expensive to generate (or depend on data that is expensive to load or aggregate).

@tacaswell
Member

@jnothman
Author

jnothman commented Oct 11, 2017 via email

@jnothman
Author

I've not yet played with it much, but here's the result of this conversation: https://github.com/jnothman/pickleback. I have a problem where a pickled figure is restored with its canvas attribute set to None and I do not understand why. Insights are welcome.


4 participants