Off-axes markers unnecessarily saved to PDF #2423

Merged
merged 1 commit into from Jan 9, 2014

mdboom
Member

@mdboom mdboom commented Oct 1, 2013

Plotting a bunch of markers without lines, then changing the axes limits so that none of the points are visible, and then saving the result to a PDF produces a file just as big as if the markers were all visible within their default axes limits. This doesn't happen for line-only plots.

I guess the PDF backend clips the lines before deciding what to save to file, but it doesn't clip the markers. Not only does this result in an unnecessarily large PDF file (and potentially a security issue, because data is unintentionally leaking out), but rendering that file in a PDF viewer can be a lot slower as well.

Here's some demo code:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.random(20000)
y = np.random.random(20000)

plt.figure()
plt.plot(x, y, 'k.')  # use markers only
plt.savefig('dots.pdf')
plt.xlim(2, 3)  # move axes away for empty plot
plt.savefig('dots_empty.pdf')
'''
file sizes in bytes:
dots.pdf:       327505
dots_empty.pdf: 327921
'''
plt.figure()
plt.plot(x, y, 'k-')  # use lines only
plt.savefig('lines.pdf')
plt.xlim(2, 3)  # move axes away for empty plot
plt.savefig('lines_empty.pdf')
'''
file sizes in bytes:
lines.pdf:       310071
lines_empty.pdf:   5905
'''
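
The guess above, that lines are clipped before being written but markers are not, can be illustrated with a small standard-library sketch. It is not the PDF backend's code: `emit_marker_ops` is a hypothetical helper, the command strings only mimic a content stream, and the visible rectangle is given in data coordinates for simplicity.

```python
import random


def emit_marker_ops(points, xlim, ylim, cull=True):
    """Return one drawing command per marker, optionally skipping hidden ones.

    Hypothetical helper: the strings only imitate a PDF content stream.
    """
    (x0, x1), (y0, y1) = xlim, ylim
    ops = []
    for x, y in points:
        visible = x0 <= x <= x1 and y0 <= y <= y1
        if cull and not visible:
            continue  # nothing is written for markers outside the view
        ops.append(f"{x:.4f} {y:.4f} m marker")
    return ops


random.seed(0)
points = [(random.random(), random.random()) for _ in range(20000)]

# With x limits of (2, 3), every point lies outside the visible range.
unculled = emit_marker_ops(points, xlim=(2, 3), ylim=(0, 1), cull=False)
culled = emit_marker_ops(points, xlim=(2, 3), ylim=(0, 1), cull=True)

print(len("\n".join(unculled)))  # size grows with the number of points
print(len("\n".join(culled)))    # 0: no marker data remains
```

Without culling, the output is as large for the empty view as for the full one, which matches the `dots.pdf` and `dots_empty.pdf` sizes reported above.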

@tacaswell
Member

@ghost ghost assigned mdboom Oct 1, 2013
@mdboom
Member

mdboom commented Oct 1, 2013

@mspacek: Can you confirm the attached PR fixes your issue?

@mspacek
Contributor Author

mspacek commented Oct 2, 2013

Thanks! Yes, it fixes the example provided above, but unfortunately that example was a bit too simplistic. In my specific case, I'm using scatter() and assigning a colour to each and every point. In that case, after applying the fix, the empty plot saved to PDF is still as large as the non-empty plot. It's the colour assignment that seems to be the culprit. Otherwise, scatter() seems to work as it should:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.random(20000)
y = np.random.random(20000)
c = np.random.random(20000)  # one colour value per point

plt.figure()
plt.scatter(x, y)
plt.savefig('scatter.pdf')
plt.xlim(2, 3)  # move axes away for empty plot
plt.savefig('scatter_empty.pdf')
'''
file sizes in bytes:
scatter.pdf:       324187
scatter_empty.pdf:   6617
'''
plt.figure()
plt.scatter(x, y, c=c)
plt.savefig('scatter_color.pdf')
plt.xlim(2, 3)  # move axes away for empty plot
plt.savefig('scatter_color_empty.pdf')
'''
file sizes in bytes:
scatter_color.pdf:       410722
scatter_color_empty.pdf: 413541
'''

@mdboom
Member

mdboom commented Oct 2, 2013

Scatter is an entirely different code path from markers (except in the optimized case where the points are all the same color/size/etc., in which case it falls back to markers). Scatter uses collections (which are more flexible/powerful), and thus will require a much more complex solution. Can you open a new bug for that?
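
The distinction between the two paths can be sketched as follows. This is only an illustration of the idea: `can_use_marker_path` is a hypothetical name, and it checks just size and colour, whereas the real conditions (the "etc." above) are not spelled out in this thread.

```python
def can_use_marker_path(sizes, colors):
    """True when every point shares one size and one colour.

    Hypothetical check, not Matplotlib's actual logic.
    """
    return len(set(sizes)) <= 1 and len(set(colors)) <= 1


n = 5
uniform = can_use_marker_path([20] * n, ["b"] * n)
per_point = can_use_marker_path([20] * n, [0.1, 0.5, 0.2, 0.9, 0.3])

print(uniform)    # True: could take the fast marker path
print(per_point)  # False: needs the more general collection path
```

This lines up with the report above: plain `scatter(x, y)` benefits from the marker fix, while `scatter(x, y, c=c)` does not.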

@mspacek
Contributor Author

mspacek commented Oct 2, 2013

Done. Opened #2488

Thanks!


pdf = io.BytesIO()
fig.savefig(pdf, format="pdf")
assert len(pdf.getvalue()) < 8000
Member

This seems reasonable. Is there a similar check we can apply to test SVG and PS?

Member Author

Good point. We probably should.

Member

Did this get done?

Member Author

No -- sorry to be off the radar for so long. I'll look into this.

Member

I merged this knowing that it hadn't been done. It wasn't a blocker, so I was happier moving things forward than holding it up.

@mdboom - hope you and the family are feeling better now.

pelson added a commit that referenced this pull request Jan 9, 2014
Off-axes markers unnecessarily saved to PDF
@pelson pelson merged commit 9e8acb2 into matplotlib:master Jan 9, 2014
@mdboom
Member

mdboom commented Jan 13, 2014

8eede37 updates this to also test SVG. We can't do PS, unfortunately, because it closes the file handle (that's a hairier bug for another time).

@mdboom mdboom deleted the cull-off-of-figure-markers branch March 3, 2015 18:43
@ebauch

ebauch commented Jul 1, 2017

Hi all,

Sorry to reopen this old shoe, but this problem still seems to exist when I save an SVG file. Exporting a PDF, however, seems to be okay.

image

using matplotlib 1.5.3
python 3.4 (anaconda)
linux x64

@tacaswell
Member

What are you using as your SVG renderer? There was a long-standing bug in rsvg, now fixed (see #4341), where it did not correctly render valid SVG.

@ebauch

ebauch commented Jul 1, 2017

Ah, interesting. I don't know exactly what version it is. However, Chromium renders it correctly. I use Inkscape for post-processing my SVG images, and Inkscape also gets it wrong, i.e. it renders the file with the data not clipped (Inkscape 0.91 and 0.92).

@anntzer
Contributor

anntzer commented Jul 1, 2017

Can you provide a minimal example? thanks.

@ebauch

ebauch commented Jul 2, 2017

@anntzer

svg + pdf.zip

import numpy as np
import matplotlib.pyplot as plt

f1 = plt.figure(num=None, figsize=(9/2.54/3, 9/2.54/2), dpi=400, facecolor='w', edgecolor='k')

c1 = 'blue'

# np.logspace takes base-10 exponents; with np.log (natural log) the data
# spans roughly 2.5e-5 to 1.6e6, i.e. wider than the x limits set below.
xx = np.logspace(np.log(0.01), np.log(500), 100)
p = 3
T2 = 10  # us
coh = lambda t: np.exp(-t/T2)
plt.plot(xx, coh(xx), lw=1.5, color=c1)

plt.xlim([0.01, 100000])
plt.ylim([0, 1.1])

# Hide all ticks and tick labels (booleans instead of the "off" strings).
ax = plt.gca()
ax.tick_params(axis="both", which="both", bottom=False, top=False,
               labelbottom=False, left=False, right=False, labelleft=False)

plt.xscale('log')
plt.yscale('linear')

plt.savefig('exp.svg')
plt.savefig('exp.pdf')
plt.show()

@LeBarbouze

Same problem here: the SVG export renders correctly in Firefox, but Inkscape does not clip data outside the requested axis limits. Did you find a solution, @ebauch?

@ebauch

ebauch commented Nov 12, 2017

@LeBarbouze I just import PDFs into Inkscape these days, which works much better! Python-exported SVGs come out wrong most of the time.

@tacaswell
Member

The problem is not the svgs generated by Matplotlib, it is old versions of librsvg not properly handling valid svg.
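
One way to check whether a given SVG file declares its clipping, independent of any renderer, is to look for `clipPath` definitions and `clip-path` references in the markup. The sketch below uses only the standard library; `SAMPLE_SVG` is a hand-written stand-in rather than real Matplotlib output, and `clip_report` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Hand-written sample, not real Matplotlib output. The path runs past the
# clip rectangle on both sides, so a renderer that ignores clip-path would
# draw it outside the 80x80 region.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <defs>
    <clipPath id="axes_clip"><rect x="10" y="10" width="80" height="80"/></clipPath>
  </defs>
  <g clip-path="url(#axes_clip)">
    <path d="M -50 50 L 150 50" stroke="blue" fill="none"/>
  </g>
</svg>"""


def clip_report(svg_text):
    """Return (ids of defined clipPaths, clip-path references in use)."""
    root = ET.fromstring(svg_text)
    defined = {el.get("id") for el in root.iter(SVG_NS + "clipPath")}
    used = {el.get("clip-path") for el in root.iter() if el.get("clip-path")}
    return defined, used


defined, used = clip_report(SAMPLE_SVG)
print(defined)  # {'axes_clip'}
print(used)     # {'url(#axes_clip)'}
```

If a file defines and references its clip paths like this but still shows data outside the axes, the renderer is the more likely suspect.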

@ebauch

ebauch commented Nov 16, 2017

@tacaswell sorry, I didn't mean to imply matplotlib is the culprit.

@tacaswell
Member

@ebauch Sorry, I am a tad touchy about this (it consumed a fair amount of oxygen a while ago and resulted in my reading a decent chunk of the SVG spec). Definitely do not want to re-litigate it 😉 .
