Writing text to specific pdf seems to break the structure #78
Comments
I have another file with the same problem (I think). Attaching it here! Edit: I was using v0.6.1-rc4 when I generated this file.
Hello @kevinswartz. I took a look at this today. Something about the source document seems to be causing the problem. You can sort of work around it by saving the document without object streams:
// With Object Streams
PDFDocumentWriter.saveToBytes(pdfDoc);
// Without Object Streams
PDFDocumentWriter.saveToBytes(pdfDoc, { useObjectStreams: false });
Acrobat was able to open the documents you shared after saving with useObjectStreams: false. Of course, this doesn't actually fix the bug, so I'll continue looking into this and let you know what I find.
Thanks @Hopding,
@kevinswartz The only real benefit to using object streams is that it makes the resulting PDF file a bit smaller. Many PDF libraries don't support object streams at all, and only write PDFs without them.
PDF files contain a structure known as a Cross Reference Table (since PDF v1.0). This table contains pointers (byte offsets) to each object in the document, which allows for fast random access to objects in large PDF files. These tables tend to get corrupted a lot, so most readers are able to reconstruct them without any perceptible change in the reader's performance. However, if the file is saved with object streams, then Cross Reference Streams are used instead of Cross Reference Tables. Cross Reference Streams were introduced in a later PDF version (v1.5). For whatever reason, not as many readers are able to reconstruct corrupted Cross Reference Streams (e.g. Google Chrome can, but Mac's Preview and Adobe Acrobat apparently cannot).
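To make the byte-offset idea concrete, here is a rough sketch of how a reader might rebuild a classic Cross Reference Table by scanning for "N G obj" headers. This is not pdf-lib's code or any particular reader's algorithm; the function names scanObjectOffsets and formatXrefTable are made up for illustration, and the sketch assumes the file has been decoded as latin1 so that string indexes equal byte offsets. A real reader would also have to avoid false matches inside stream data.

```javascript
// Hypothetical helper: find the byte offset of every "N G obj" header.
// Assumes pdfText is a latin1-decoded string (one character per byte).
function scanObjectOffsets(pdfText) {
  const offsets = new Map(); // object number -> { offset, generation }
  const header = /(?:^|[\r\n])(\d+) (\d+) obj\b/g;
  let match;
  while ((match = header.exec(pdfText)) !== null) {
    const label = `${match[1]} ${match[2]} obj`;
    // The match may begin with a newline; the object starts at its first digit.
    const offset = match.index + (match[0].length - label.length);
    // Later definitions overwrite earlier ones, as with incremental updates.
    offsets.set(Number(match[1]), { offset, generation: Number(match[2]) });
  }
  return offsets;
}

// Hypothetical helper: format the offsets as a classic xref section.
// Each entry is 20 bytes: 10-digit offset, 5-digit generation, "n" or "f".
// Object numbers need not be contiguous, so each run gets its own subsection.
function formatXrefTable(offsets) {
  const used = [...offsets.keys()].filter((n) => n !== 0).sort((a, b) => a - b);
  const numbers = [0, ...used]; // object 0 is always the head of the free list
  const lines = ['xref'];
  let start = 0;
  while (start < numbers.length) {
    let end = start;
    while (end + 1 < numbers.length && numbers[end + 1] === numbers[end] + 1) {
      end++;
    }
    lines.push(`${numbers[start]} ${end - start + 1}`); // "first count" header
    for (let k = start; k <= end; k++) {
      const n = numbers[k];
      if (n === 0) {
        lines.push('0000000000 65535 f ');
      } else {
        const { offset, generation } = offsets.get(n);
        const off = String(offset).padStart(10, '0');
        const gen = String(generation).padStart(5, '0');
        lines.push(`${off} ${gen} n `);
      }
    }
    start = end + 1;
  }
  return lines.join('\n') + '\n';
}

// Tiny made-up document with objects 1 and 3 (object 2 is deliberately missing).
const samplePdf =
  '%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n3 0 obj\n<< >>\nendobj\n';
const rebuiltTable = formatXrefTable(scanObjectOffsets(samplePdf));
console.log(rebuiltTable);
```

Because a table like this is plain text with fixed-width entries, rebuilding it is cheap. A Cross Reference Stream is a compressed binary stream object, which may be part of why fewer readers bother to recover one.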
Thanks @Hopding! Good information. We might stop using object streams if it means better compatibility.
Hello there @kevinswartz! I was able to find and fix the issue causing this in #101. Some of the logic used to write out the cross reference tables and streams was incorrect. In particular, the code assumed that all PDFs would have an object with an ID of I just cut prerelease You can install this prerelease with npm:
It's also available on unpkg:
Please try it out and let me know if it works for you!
Thanks! I'll check it out.
Version You can install this new version with npm:
It's also available on unpkg: (@kevinswartz if you find that you're still having trouble with this after using the new release, please go ahead and reopen this issue.) |
Thanks @Hopding! Looks like that fixes the issue.
Hi @Hopding,
I have a file here that I'm able to view without issue in pdf.js. Once I write some text to it via pdf-lib, the file can no longer be viewed in pdf.js and fails with the error "Invalid PDF Structure". I've attached PDFs from before and after the write. Do you have any ideas about ways to write text differently so this doesn't happen? These files are non-production.
Thanks again!
file_before.pdf
file_after.pdf