Recently there was a thread on the matplotlib-users mailing list about the speed of the PDF backend:
Whereas the Mac OS X and the Cairo backends make use of new_gc and gc.restore to keep track of the graphics context, the PDF backend uses check_gc and an internal stack of graphics contexts. Since matplotlib now has gc.restore functionality, I don't think the internal stack is needed any more. Removing these bits of code from the PDF backend may speed up this backend, as well as result in smaller PDF files.
See this revision for when gc.restore was added to matplotlib:
The principles behind graphics with Mac OS X and Cairo are very similar to those of PDF, so one could simply look at those backends and use the corresponding code in the PDF backend.
Probably the Cairo backend is easiest to follow, as it is in pure Python.
Below is an outline of what is needed. The key point is that since nowadays each call to new_gc is balanced by a call to gc.restore, we can simply print out Op.gsave and Op.grestore directly without having to push and pop GraphicsContextPdf instances. The other operations on the GraphicsContextPdf object can also be executed directly.
In the init method,
self.gc = self.new_gc()
should be replaced with
self.gc = GraphicsContextPdf(self.file)
The check_gc method and all calls to this method can be removed.
In the draw_image method, we still need to do the clipping, so we need to add something like
clippath, clippath_trans = gc.get_clip_path()
and output Op.clip at the appropriate point between the Op.gsave and Op.grestore. (It may help to look at the Cairo backend to see how this method is implemented there.)
In the draw_path method, if rgbFace is not None, then between an Op.gsave and an Op.grestore we need to change the fill color by Op.setrgb_nonstroke. (Comparing to the Cairo backend may help here).
In the new_gc method, instead of creating a new GraphicsContextPdf object, we simply apply Op.gsave to the current graphics context object. This is the important part.
We need to add a method "restore" which emits Op.grestore. Since in matplotlib each call to new_gc is balanced with a call to restore, each Op.gsave is balanced by an Op.grestore.
The following methods should be renamed, and the corresponding PDF operation can now be executed directly:
capstyle_cmd ---> set_capstyle
joinstyle_cmd ---> set_joinstyle
linewidth_cmd ---> set_linewidth
dash_cmd ---> set_dashes
alpha_cmd ---> set_alpha
rgb_cmd ---> set_foreground
For the method hatch_cmd, I am not quite sure how to implement it for PDF. Hatching was not implemented for the Cairo backend, but it was implemented for the MacOSX backend (in the C code), so it must be possible in the PDF backend also.
The push and pop methods can be removed.
Instead of the method clip_cmd, we now need a method set_clip_path. This method can apply Op.clip directly, without having to search for the appropriate graphics context.
The delta method and the copy_properties methods can be removed.
The repr method can be removed.
The .parent attribute can be removed.
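To make the outline above a bit more concrete, here is a minimal sketch in Python. It does not use the actual backend_pdf code: ToyContentStream, ToyGraphicsContext and ToyRenderer are hypothetical stand-ins, and the operators are written as literal PDF operator strings (q for gsave, Q for grestore, w for line width, RG for stroke color) rather than through Op. It only illustrates the idea that new_gc emits a save, restore emits the matching restore, and the setters write their operator immediately.

```python
class ToyContentStream:
    """Hypothetical stand-in for the PDF file object: collects operators as strings."""

    def __init__(self):
        self.ops = []

    def output(self, *tokens):
        # Join operands and operator into one content-stream line.
        self.ops.append(" ".join(str(t) for t in tokens))


class ToyGraphicsContext:
    """Hypothetical stand-in for GraphicsContextPdf with no stack and no parent."""

    def __init__(self, stream):
        self.stream = stream

    def set_linewidth(self, width):
        # Was linewidth_cmd in the outline; now written out directly.
        self.stream.output(width, "w")

    def set_foreground(self, rgb):
        # Was rgb_cmd in the outline; RG sets the stroke color.
        self.stream.output(*rgb, "RG")

    def restore(self):
        # Balances the "q" emitted by new_gc.
        self.stream.output("Q")


class ToyRenderer:
    """Hypothetical stand-in for the renderer: one shared graphics context."""

    def __init__(self):
        self.stream = ToyContentStream()
        self.gc = ToyGraphicsContext(self.stream)

    def new_gc(self):
        # No new object is created; we only save the PDF graphics state.
        self.stream.output("q")
        return self.gc


renderer = ToyRenderer()
gc = renderer.new_gc()
gc.set_linewidth(2)
gc.set_foreground((1, 0, 0))
gc.restore()
print(renderer.stream.ops)
```

Because every new_gc is paired with a restore, the q and Q operators in the output stay balanced without any bookkeeping on the Python side.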
Thanks for the suggestion!
There's an initial attempt at jkseppan:pdf-context, which isn't complete yet (doesn't do hatching, probably breaks usetex). Unfortunately, that version is slower and produces larger output than master. I tried with the test_speed2.py script provided by Gökhan Sever, and on my laptop (with nums=2, i.e. two pages of output), master takes about 16.3 seconds:
python ../test_speed2.py 15.87s user 0.44s system 99% cpu 16.339 total
python ../test_speed2.py 15.82s user 0.46s system 99% cpu 16.293 total
python ../test_speed2.py 15.81s user 0.43s system 99% cpu 16.267 total
and the refactored code takes about 17.8 seconds:
python ../test_speed2.py 17.32s user 0.45s system 99% cpu 17.797 total
python ../test_speed2.py 17.32s user 0.45s system 99% cpu 17.803 total
python ../test_speed2.py 17.59s user 0.46s system 99% cpu 18.185 total
Looking at the output, it seems that the code setting up the graphics context is doing quite a bit of repeated work (e.g. setting alpha to one value, then immediately to another; and it seems to wrap every tick mark in multiple layers of contexts, with identical setup for each one). Because the pdf backend has an explicit stack of graphics contexts, it can output only the part of the setup that is different from the previous drawing command. This saves on I/O, which is probably a net win, even though keeping track of the stack means more computation.
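As a toy illustration of this trade-off (not the real backend code), the sketch below compares the two strategies on ten identical tick marks. The functions emit_direct and emit_delta and the dictionary representation of the graphics state are assumptions made for the example; "S" stands in for the whole path-drawing sequence, and the real backend also has to handle clipping, which this ignores.

```python
def emit_direct(draw_calls):
    """Wrap every draw call in q/Q and write out its full state each time."""
    ops = []
    for state in draw_calls:
        ops.append("q")
        for operator, value in state.items():
            ops.append(f"{value} {operator}")
        ops.append("S")  # placeholder for the actual path and stroke
        ops.append("Q")
    return ops


def emit_delta(draw_calls):
    """Remember the last state written and output only what has changed."""
    ops = []
    current = {}
    for state in draw_calls:
        for operator, value in state.items():
            if current.get(operator) != value:
                ops.append(f"{value} {operator}")
                current[operator] = value
        ops.append("S")  # placeholder for the actual path and stroke
    return ops


# Ten tick marks with the same line width (w), cap style (J) and join style (j).
ticks = [{"w": 0.5, "J": 0, "j": 0}] * 10

print(len(emit_direct(ticks)), "operators with direct emission")
print(len(emit_delta(ticks)), "operators with delta emission")
```

In this simplified model the delta version writes the three style operators once and then only the drawing command for each tick, which is the kind of saving the explicit stack makes possible.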
Thanks for trying this out.
This is very interesting. If you see repeated work in the PDF output, then the same unnecessary work is also being done in the Cairo and Mac OS X backends. I understand that the existing PDF backend can save I/O by avoiding the unnecessary commands, but I would expect that this can be done further upstream also (e.g., if you see multiple layers of contexts around each tick mark, then this suggests that there are unneeded calls to new_gc somewhere upstream).
I'll try the same refactoring in the Postscript backend to see what happens there (and also since Postscript is a bit more human-readable it may be easier to understand).
So I found pdf.compression in matplotlibrc and set it to 0 for uncompressed PDF, which gave me PDF output that is more or less human-readable. Then I could compare the PDF output of the current PDF backend with that of the refactored one. Most if not all of the difference comes from the tick marks. Skipping the tick marks reduces the running time from about 17.2 seconds to 13.1 seconds for the current PDF backend; the refactored PDF backend has approximately the same running time if we skip the tick marks.
The tick marks are drawn in the draw() method of the Axis class in lib/matplotlib/axis.py as follows:
for tick in ticks_to_draw:
so each tick is drawn independently. We may be able to speed up the PDF backend and other backends if we can make use of what is conserved between the ticks.
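One possible way to use what is shared between ticks, sketched here purely as an assumption, is to group ticks that have identical style and set up the graphics state once per group. The function draw_ticks_batched and the (position, (linewidth, length)) tuples are hypothetical; real Tick objects look nothing like this, and I have not checked whether Axis.draw() can be restructured this way. The strings are plain PDF operators (q, Q, w, m, l, S).

```python
from collections import defaultdict


def draw_ticks_batched(ticks):
    """Emit one q/Q pair and one style setup per group of identically styled ticks."""
    groups = defaultdict(list)
    for position, style in ticks:
        groups[style].append(position)

    ops = []
    for (linewidth, length), positions in groups.items():
        ops.append("q")
        ops.append(f"{linewidth} w")  # style is written once per group
        for x in positions:
            # moveto, lineto, stroke for a single vertical tick mark
            ops.append(f"{x} 0 m {x} {length} l S")
        ops.append("Q")
    return ops


# Five major ticks and four minor ticks (hypothetical data).
major = [(x, (0.5, 4)) for x in range(5)]
minor = [(x + 0.5, (0.25, 2)) for x in range(4)]
ops = draw_ticks_batched(major + minor)
print("\n".join(ops))
```

With nine ticks in two style groups this writes two save/restore pairs and two line-width settings instead of nine of each.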
@mdehoon Is this still a problem?
Yes I think so, but I haven't found time to look at this in more detail. It may be better to keep this open until we can resolve it. But if you prefer to close issues that are not immediately actionable, then that is OK with me too.