Problem with AutoCad generated PDF #24
Comments
Hi Olav - I've encountered the same with a number of PDFs, also generated from AutoCAD. I haven't been able to get to the bottom of what causes it, but it is something in the way AutoCAD writes the PDF file. If the files are opened and re-saved in Adobe, they can then be merged.
Hello, I can now merge PDFs generated by AutoCAD with this simple fix
Hi Matthew, thanks for your help! This fixed the problem for all but a few PDFs. Regards, Olav
adammorris wrote yesterday, in regard to AutoCAD, "... If the files are opened and re-saved in adobe, they can be then merged." I need to explain this in the FAQ: a LOT of PDF-producing software in the world at large, including scanners and AutoCAD, produces broken PDF. At the same time, Adobe software is exceptionally "forgiving" in doing its best to read anything it encounters. It's generally safe to assume that you can "mollify" any busted PDF you find by running it through Acrobat and writing it back out. Preview for macOS, incidentally, is almost as good in this role. One of the aims we have with PyPDF2 is to make it as intelligent in reading as Acrobat is, so that it does "the right thing" with PDF instances that don't conform to the PDF standard. laffen (and anyone else), send us any examples of PDF that cause PyPDF2 to fail, but which you think PyPDF2 should be able to handle. If your examples need to be kept private, be sure to tell us so; we're accustomed to handling proprietary material, and do so conscientiously.
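Before sending examples, it can help to separate files that are not even superficially PDF from files that fail deeper in parsing. The sketch below is a rough pre-check and nothing more: it only looks for the `%PDF-` header near the start of the file and the `%%EOF` marker near the end. The function name `looks_like_pdf` and the 1024-byte window are assumptions for illustration, not part of PyPDF2. A file that passes can still be broken in ways that stop a merge, and the AutoCAD files discussed here may well pass it.

```python
import os


def looks_like_pdf(path, window=1024):
    """Return True if a %PDF- header appears in the first `window` bytes
    and a %%EOF marker appears in the last `window` bytes of the file."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        head = handle.read(window)
        # Jump to the final `window` bytes (or the start, for small files).
        handle.seek(max(0, size - window))
        tail = handle.read()
    return b"%PDF-" in head and b"%%EOF" in tail
```

Files that fail this check are usually truncated or not PDF at all; files that pass it but still break PyPDF2 are the more interesting examples to report.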
I created fixes for a few more bugs that occur with the PDF you sent me - such as, PyPDF2 could not read 'x\00', which it thought was a number object (it is actually an unconventional representation of a null object). There are a few bugs left, though - I will commit the results soon.
See if commit 54e0b6d works for the remaining PDFs.
Hi Matthew! Thanks a lot for your help. I really appreciate it :-) I tried the commit. Regards, Olav (in reply to Matthew Stamy, 2013/9/6)
Sorry about that - I had added some code for debugging purposes and deleted more than I intended. The only essential lines of added code are
This was added in reading a dictionary object; but there should probably be a more general fix in case it occurs elsewhere.
Hi, Great! Regards, Olav
Awesome - this fixed all my problem PDFs as well! Thank you so much!
Hi
I am trying to use the PyPDF2 module to merge a lot of PDF files. For some of the PDF files it fails.
The failing PDF files are files generated directly from AutoCAD.
Traceback (most recent call last):
  File "", line 37, in
  File "", line 29, in main
  File "C:\Python27\lib\site-packages\PyPDF2\merger.py", line 168, in append
    self.merge(len(self.pages), fileobj, bookmark, pages, import_bookmarks)
  File "C:\Python27\lib\site-packages\PyPDF2\merger.py", line 116, in merge
    pages = (0, pdfr.getNumPages())
My script:
import os

def main():
    from PyPDF2 import PdfFileReader, PdfFileMerger
    doclistdir = r'xxxxxxxxxxxxxxxxxx'
    # list.txt holds one PDF file name per line
    doclistfile = open(r'xxxxxxxxxx\list.txt', 'r')
    doclist = doclistfile.readlines()
    doclistfile.close()
    merger = PdfFileMerger()
    for doc in doclist:
        # Build the full path to each PDF named in the list
        pdfdoc = os.path.join(doclistdir, doc.strip())
        mergerelement = open(pdfdoc, 'rb')
        #print 'Processing: ' + pdfdoc
        merger.append(mergerelement)
    output = open(os.path.join(doclistdir, "document-output.pdf"), "wb")
    merger.write(output)
    output.close()

if __name__ == '__main__':
    main()
regards
Olav
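To find out which of the listed files actually trigger the failure, the loop in the script can be wrapped so that one bad file does not abort the whole run. This is a minimal sketch, not a fix for the underlying parsing bug: the helper name `merge_reporting_failures` is hypothetical, and it assumes only that the merger object has an `append` method that accepts a file path, as `PdfFileMerger.append` does. It also assumes a failed `append` leaves the merger usable for the next file.

```python
def merge_reporting_failures(paths, merger):
    """Append each path to `merger` and return a list of (path, error)
    pairs for the files that could not be appended."""
    failures = []
    for path in paths:
        try:
            merger.append(path)
        except Exception as exc:  # deliberately broad: parser errors vary by file
            failures.append((path, exc))
    return failures
```

With PyPDF2 as used in the script, this would be called with `merger = PdfFileMerger()` and the list of full paths, followed by `merger.write(output)` as before. The returned list then names the files to re-save in Acrobat or to send to the maintainers as failing examples.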