Option to Compress Graphs for pgf-backend #5983

overdetermined · 2016-02-09T11:13:59Z

Working with the awesome .pgf backend, I hit a limit as to how pretty I can make the graphs look:

TeX capacity exceeded, sorry [main memory size={large number here}]

The reason is obvious: when I produce a scatter plot with tens of thousands of objects, it inevitably produces very large *.pgf documents. I may have overlooked some features and am not an expert on this, but in some cases it would be beneficial to have an option to save the graph itself as a PNG and let LaTeX do only the annotations (after all, it's the beautiful typesetting we are after).

As an example, my current workaround ("hack") is posted below. Obviously the quality of the graphs is reduced, and the PDF in my example has a bigger file size, but I have produced *.pgf files of several MB that reduced to a few kB as PDF (if they can be compiled at all). In my example the *.pgf size is reduced by a third.
I would try to implement this myself in matplotlib, but am not sure where to start.

Is this a good feature, or are there better ways to achieve what I want?

import numpy as np
import matplotlib as mpl
import matplotlib.image as mpimg

mpl.rc('pdf', fonttype=42)

pgf_with_latex = {                      # setup matplotlib to use latex for output
    "pgf.texsystem": "pdflatex",        # change this if using xelatex or lualatex
    "text.usetex": True,                # use LaTeX to write all text
    "font.family": "serif",
    }
mpl.rcParams.update(pgf_with_latex)

import matplotlib.pyplot as plt

def newfig(width):
    plt.clf()
    fig = plt.figure()#figsize=figsize(width))
    ax = fig.add_subplot(111)
    #ax2 = fig.add_subplot(122)
    return fig, ax

def savefig(filename):
    plt.savefig('{}.pgf'.format(filename))
    plt.savefig('{}.pdf'.format(filename))

np.random.seed(42)
randomData = np.random.rand(50,4)
fig, ax = newfig(0.9)
tmp =  ax.scatter(x=randomData[:,0],
                y=randomData[:,1]*10,
                s=randomData[:,2]*1000,
                c=randomData[:,3])

#remember the figure size
frame_ax = ax.get_window_extent().get_points()
frame_fig = fig.get_window_extent().get_points()

#keep the axis labels
ylim = ax.get_ylim()
xlim = ax.get_xlim()
extent = [xlim[0],xlim[1],ylim[0],ylim[1]]
#ax.annotate('help',xy=[0.5,0.5],xytext=[0.4,0.4]) #annotating has to be done after
savefig('large')

ax.axis('off')
plt.savefig('empty.png')
ax.axis('on')

tmp.remove() #remove the graphs
img = mpimg.imread('empty.png')
#print(img.shape)

#get only the axes part; image rows count from the top, window extents from the bottom
x1 = int(round(frame_ax[0][0]/frame_fig[1][0]*img.shape[1]))
x2 = int(round(frame_ax[1][0]/frame_fig[1][0]*img.shape[1]))
y1 = int(round((1 - frame_ax[1][1]/frame_fig[1][1])*img.shape[0]))
y2 = int(round((1 - frame_ax[0][1]/frame_fig[1][1])*img.shape[0]))
img2 = img[y1:y2,x1:x2] #crop image at the axes

ax.imshow(img2,extent=extent, aspect='auto')#,extent=extent,aspect='auto')

savefig('small')

plt.show()

Notes:
In this code I plot, then save an image of just the plot without the axes as 'empty.png'.
The problem is that this image has the size of the whole figure and not of the graph itself, so I have to crop it.
The cropped image is then included using .imshow.
I looked in the source and saw that the pgf backend has a draw_image function, but got lost trying to figure out how it determines whether the plotted stuff is an image or not.
I may have confused my x's and y's.
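The cropping step described in these notes can be sketched on its own, without matplotlib. This is only an illustration: `crop_to_axes` is a hypothetical helper name, and the extents below are made-up numbers in the shape returned by `get_window_extent().get_points()`. The one subtlety is that window extents measure y from the bottom of the figure while image rows count from the top, so the row indices have to be flipped.

```python
import numpy as np


def crop_to_axes(img, ax_extent, fig_extent):
    """Crop a full-figure image array to the axes rectangle.

    ax_extent and fig_extent are ((x0, y0), (x1, y1)) pairs in display
    units with the origin at the bottom-left corner.
    """
    (ax_x0, ax_y0), (ax_x1, ax_y1) = ax_extent
    (fig_x0, fig_y0), (fig_x1, fig_y1) = fig_extent
    rows, cols = img.shape[:2]
    fig_w = fig_x1 - fig_x0
    fig_h = fig_y1 - fig_y0

    # Columns run left to right, same direction as display x.
    left = int(round((ax_x0 - fig_x0) / fig_w * cols))
    right = int(round((ax_x1 - fig_x0) / fig_w * cols))
    # Rows run top to bottom, opposite to display y, so measure from the top.
    top = int(round((fig_y1 - ax_y1) / fig_h * rows))
    bottom = int(round((fig_y1 - ax_y0) / fig_h * rows))
    return img[top:bottom, left:right]


# Synthetic stand-in for the saved PNG (assumed 640x480 pixels, RGBA).
img = np.zeros((480, 640, 4))
cropped = crop_to_axes(
    img,
    ax_extent=((80.0, 52.8), (576.0, 422.4)),   # assumed axes box
    fig_extent=((0.0, 0.0), (640.0, 480.0)),    # assumed figure box
)
print(cropped.shape)
```

With nearly symmetric top and bottom margins the unflipped version gives almost the same crop, which is probably why the mix-up is easy to miss.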


jenshnielsen · 2016-02-09T11:24:59Z

I didn't have time to read your example in detail, but I think set_rasterized(True) may do what you want.

Something like

import matplotlib.pyplot as plt
import numpy as np
a = np.random.rand(10000)
b = np.random.rand(10000)
c = plt.scatter(a,b)
c.set_rasterized(True)

Should ensure that the scatter is rendered as a bitmap.

Hope that is helpful.
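A slightly fuller sketch of the same idea, for anyone landing here later. It is not taken from the matplotlib docs: it saves to an in-memory PDF (an assumption made so that it runs without a TeX installation), uses the `rasterized=True` keyword instead of calling `set_rasterized(True)` afterwards, and picks `dpi=300` arbitrarily as the resolution of the embedded bitmap.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
x = rng.random(10000)
y = rng.random(10000)

fig, ax = plt.subplots()
# Only the scatter collection is rasterized; axes, ticks and labels stay vector.
points = ax.scatter(x, y, s=4, rasterized=True)
ax.set_xlabel("x")
ax.set_ylabel("y")

# dpi controls the resolution of the rasterized artist in the output file.
buf = io.BytesIO()
fig.savefig(buf, format="pdf", dpi=300)
pdf_bytes = buf.getvalue()
plt.close(fig)
```

For the pgf backend the call would be `fig.savefig("figure.pgf", dpi=300)` instead (the file name is a placeholder), which needs a working TeX installation. As far as I understand, the rasterized part is then written as a PNG next to the .pgf file and included from it, so the image has to travel with the .pgf.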

overdetermined · 2016-02-09T11:27:10Z

That is exactly what I want. Thank you!
I spent almost as much time looking for a solution as I did trying to hack a workaround, but I guess I learned something. Thank you for the great support and very quick reply indeed.

jenshnielsen · 2016-02-09T11:33:19Z

Great. I think we need to document that better. I will rename this issue to reflect that if ok with you?

overdetermined · 2016-02-09T11:41:41Z

Sure. I was looking for set_rasterized in the docs after you mentioned it, and it is somewhat hard to find I guess.

jenshnielsen · 2016-02-09T11:46:30Z

I think the section about PGF should explain this. We actually have a test of this in the pgf backend https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/tests/test_backend_pgf.py#L160 but I don't think the docs here http://matplotlib.org/users/pgf.html mention it.

jenshnielsen · 2016-02-22T20:52:28Z

Closing now that the docs update has been merged.

jenshnielsen · 2016-02-22T20:52:44Z

Thanks for the work!

overdetermined mentioned this issue Feb 9, 2016

Suggestion for Rasterization to docs pgf-backend #5984

Merged

tacaswell added the Documentation label Feb 14, 2016

tacaswell added this to the 1.5.2 (Critical bug fix release) milestone Feb 14, 2016

jenshnielsen closed this as completed Feb 22, 2016
