Acrobat: An error exists on this page. (with multiple SVG imports) #960
Comments
Thank you for the detailed report @gmischler! I made some tests this morning:
|
It's probably not something in the SVG data itself, but in how it interacts with compression. Adding the same SVG several times causes a lot of repetition in the text (the copies end up identical except for the placement/scaling transform), resulting in a very high compression ratio. Apparently we're not handling that situation in exactly the way Acrobat Reader expects.
I've found that some other software sometimes adds a "Length1" value to content streams. By the spec, this is only meant (and mandatory) for compressed font data, where it gives the uncompressed size of the data. I experimented with adding that to the content stream of my example file, but didn't see any change in behaviour. Given that it is off-spec, that isn't really a surprise, but it was worth a shot.
Acrobat Reader seems to issue (or not) those warnings depending on arbitrary criteria (including the Windows version, according to some reports). So it may well be that there's something in our use of compression it generally doesn't like, but only complains about when the compression ratio is particularly high. |
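To illustrate the point about repetition, here is a minimal sketch using only the standard `zlib` module. The path operators in `svg_like_block` are made-up stand-ins for an imported SVG, not real fpdf2 output, and `page_content` and `compression_ratio` are hypothetical helper names. It only shows that repeating a block with a different placement transform lowers the compressed/uncompressed percentage.

```python
import zlib

# Made-up stand-in for the drawing operators of one imported SVG
# (an assumption for illustration, not real fpdf2 output).
svg_like_block = "".join(
    f"{i}.{i % 100:02d} {3 * i}.{i % 97:02d} m "
    f"{i + 5}.{i % 89:02d} {2 * i}.{i % 83:02d} l S\n"
    for i in range(200)
).encode("ascii")


def page_content(copies):
    """Repeat the same block, changing only the placement transform."""
    parts = []
    for n in range(copies):
        parts.append(f"q 1 0 0 1 {40 * n} 0 cm\n".encode("ascii"))
        parts.append(svg_like_block)
        parts.append(b"Q\n")
    return b"".join(parts)


def compression_ratio(data):
    """Compressed size as a percentage of the uncompressed size."""
    return 100 * len(zlib.compress(data)) / len(data)


for copies in (1, 2, 3):
    print(f"{copies} copies: {compression_ratio(page_content(copies)):.2f}%")
```

The second and third copies fall inside the deflate window, so they are encoded almost entirely as back-references and the percentage drops with every added copy.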
In Have you tried displaying I'd be curious to know if this problem happens with other PDF readers... |
I have been digging a little deeper into the resulting zlib compressed streams, but could not find much...

import zlib
from fpdf import FPDF
from pypdf import PdfReader

for svg_file in ("test/svg/svg_sources/arcs01.svg", "test/svg/svg_sources/arcs02.svg"):
    print(svg_file)
    pdf = FPDF()
    pdf.add_page()
    pdf.image(svg_file, w=30, h=30)
    pdf.image(svg_file, w=30, h=30)
    pdf.image(svg_file, w=30, h=30)
    pdf.output("issue_960.pdf")
    reader = PdfReader("issue_960.pdf")
    compressed_stream = reader.pages[0]["/Contents"]._data
    # cf. https://www.rfc-editor.org/rfc/rfc1950
    cmf, flg = compressed_stream[0], compressed_stream[1]
    print(f"* cmf=0x{cmf:X} flg=0x{flg:X}")  # 0x78 0x9C => zlib: Default Compression
    decompressor = zlib.decompressobj(wbits=zlib.MAX_WBITS)
    decompressed_data = decompressor.decompress(compressed_stream)
    print(f"* length of decompressed data: {len(decompressed_data)} bytes")
    print(f"* compression ratio: {100*len(compressed_stream)/len(decompressed_data):.2f}%")
    print(f"* end of the compressed data stream reached? {decompressor.eof=}")
    print(f"* {decompressor.unconsumed_tail=}")
    print(f"* {decompressor.unused_data=}")
    print()

Output:
You are right @gmischler, this problem really seems correlated with a high compression ratio:
|
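For readers unfamiliar with the two header bytes printed by the script, here is a small sketch of how they decode according to RFC 1950. The function name `describe_zlib_header` and the dictionary keys are hypothetical, chosen only for this illustration.

```python
def describe_zlib_header(stream):
    """Decode the two-byte zlib header (CMF, FLG) described in RFC 1950."""
    cmf, flg = stream[0], stream[1]
    return {
        "method": cmf & 0x0F,                    # CM: 8 means "deflate"
        "window_size": 1 << ((cmf >> 4) + 8),    # CINFO: log2(window size) - 8
        "fdict": bool(flg & 0x20),               # preset dictionary flag
        "flevel": flg >> 6,                      # 0 fastest ... 3 maximum
        "header_valid": (cmf * 256 + flg) % 31 == 0,  # FCHECK condition
    }


print(describe_zlib_header(b"\x78\x9c"))
```

For `0x78 0x9C` this gives the deflate method, a 32 KiB window, no preset dictionary and `flevel` 2, which RFC 1950 labels as the default algorithm. `flevel` is informational only and is not needed for decompression.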
I suspect that Adobe Acrobat Reader's decompression function is implemented a bit like this, for "safety" reasons:

import zlib

def acrobat_decompress(compressed_data, growth_max=12):
    max_length = len(compressed_data) * growth_max
    decompressor = zlib.decompressobj()
    decompressed_data = decompressor.decompress(compressed_data, max_length=max_length)
    if not decompressor.eof:
        raise RuntimeError(f"Uncompressed content is at least {growth_max} times bigger than compressed data")
    return decompressed_data

Of course, |
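To make the guessed mechanism concrete, the sketch below shows how `max_length` behaves on two synthetic inputs. The helper `bounded_decompress`, the factor 12 and the sample data are assumptions for illustration only; nothing here is known about Acrobat's actual implementation.

```python
import zlib


def bounded_decompress(compressed, growth_max=12):
    """Decompress at most growth_max times the compressed size.

    Returns the (possibly partial) data and whether the stream end was reached.
    """
    decompressor = zlib.decompressobj()
    data = decompressor.decompress(compressed, len(compressed) * growth_max)
    return data, decompressor.eof


# Highly repetitive input: the output cap is hit before the stream ends.
repetitive = zlib.compress(b"0 0 m 10 10 l S\n" * 2000)
data, eof = bounded_decompress(repetitive)
print(f"repetitive: got {len(data)} bytes, eof={eof}")

# Input without any repetition: it fits well within the cap.
varied = zlib.compress(bytes(range(256)))
data, eof = bounded_decompress(varied)
print(f"varied: got {len(data)} bytes, eof={eof}")
```

With such a cap, only streams that expand by more than the chosen factor would be rejected, which is what a pure ratio threshold would predict.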
I made some extra tests with several source SVG files:
So it's not just a maximum ratio that is taken in consideration by Acrobat... |
Maybe |
Zlib comes with Python. My 3.10 installation uses 1.2.11, but I doubt that this makes any difference in the output.
A warning from fpdf2 seems a bit pointless as long as we don't know what the problem is. What is the user supposed to do with it?
Do all the affected files contain SVG data? I've tried to reproduce the error with other repetitive content subject to high compression, with no success. So it could still be some subtlety in the graphics commands, which Acrobat only complains about under certain arbitrary circumstances.
It would really be helpful if someone with Acrobat Pro could run those files through the preflight function. If the problem is real (and not just a viewer bug), that would give us the information straight from the horse's mouth. |
When I use Acrobat, I get the same error when printing a PDF. The only requirement is that there is a "path" in the code. Minimal test code:

from fpdf import FPDF

pdf = FPDF()
pdf.add_page()
with pdf.new_path() as path:
    path.move_to(1, 1)
    path.line_to(9, 9)
    path.close()
pdf.output("test.pdf")

Then print test.pdf using Acrobat Reader. The error should appear right after printing. The problem persists when pdf.compress = False |
I think this is a different problem, so I moved your comment into a dedicated issue 🙂 |
Different problem, same workaround -> #1144 also fixes this one. |
I was having exactly the same issue when using SVGs. It was not 100% reproducible and happened only rarely. Is there an ETA to land |
If @gmischler & @andersonhc agree, I think we could perform a new release this month! 🙂 |
While implementing "image paragraphs" for text regions, Acrobat reader suddenly started complaining about my test file:
Of course they want you to buy their other software to create PDFs, so the message is deliberately unhelpful.
Error details
I could boil it down to sections containing imported SVG data. Strangely, it takes a certain amount of data before the error triggers. With the SVG logo, it takes either three of them on one page, or two and a bunch of text (at least those are the combinations I found).
None of the other viewers and validators that I have easy access to indicate any errors.
Processing the file with qpdf and "--normalize-content=y" (or "--qdf") fixes the problem. But I was unable to glean any useful information from a comparison.
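For anyone wanting to script that workaround, a minimal sketch follows. It assumes the `qpdf` executable is on `PATH`; the helper names and any file names passed in are hypothetical. Only the `--normalize-content=y` option mentioned above is used.

```python
import shutil
import subprocess


def qpdf_normalize_command(src, dst):
    """Build the qpdf command line that rewrites content streams in normalized form."""
    return ["qpdf", "--normalize-content=y", src, dst]


def normalize_with_qpdf(src, dst):
    """Run qpdf on src, writing dst. Returns True if qpdf produced an output file."""
    if shutil.which("qpdf") is None:
        raise FileNotFoundError("qpdf executable not found on PATH")
    completed = subprocess.run(qpdf_normalize_command(src, dst))
    # qpdf exits with 0 on success and 3 when it succeeded with warnings.
    return completed.returncode in (0, 3)
```

This only hides the symptom in the processed copy; it does not explain what Acrobat objects to in the original file.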
I've seen reports that Adobe Preflight gives useful and detailed error reports. So if anyone has that available, it might lead us somewhere.
Minimal code
(for some reason, github doesn't want me to include PDF files here...)
Environment
fpdf2
version used: current HEAD