python3 - TypeError: ord() expected string of length 1, but int found #254
Comments
Any chance you can post/send me the PDF(s) you're working with? Most likely this is a Python 3 type handling issue in the LZW decoding algorithm, in which case it is easily fixable.
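For context, here is a minimal sketch of the kind of type issue being described. It is not the actual PyPDF2 code. In Python 3, indexing a `bytes` object already yields an `int`, so passing that result to `ord()` raises this `TypeError`. The helper name `byte_value` is hypothetical and only used for illustration.

```python
def byte_value(b):
    """Return the integer value of a single byte on Python 2 and Python 3.

    Hypothetical helper for illustration; not part of PyPDF2.
    """
    if isinstance(b, int):
        # Python 3: indexing a bytes object already gives an int.
        return b
    # Python 2 str, or a length-1 bytes slice: convert with ord().
    return ord(b)


data = b"\x80\x0b\x60"

# On Python 3, ord(data[0]) fails because data[0] is already an int.
try:
    ord(data[0])
    raised = False
except TypeError:
    raised = True

first = byte_value(data[0])     # int input is returned unchanged
sliced = byte_value(data[0:1])  # length-1 bytes slice goes through ord()
```

A decoder that reads its input byte by byte could route every such access through a helper like this, so the same code path works whether the data arrives as a Python 2 `str` or a Python 3 `bytes`.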
My script is Here is the call: python3.4 pdf2pdfocr_multibackground.py first.pdf second.pdf result.pdf Thanks!
5bbd5af should take care of these type issues.
Let me know of any further issues!
It works! Thanks!
When will a new version be available at https://pypi.python.org/pypi/PyPDF2?
199709222.pdf
Here is my code:
    import PyPDF2
    search_words = []
    p1 = r'C:\data\pdfContainer\test'  # raw string, so "\t" is not read as a tab
    for i in file_name:
@mstamy2 I want to check the occurrence of the words listed in allwords.txt in the PDF file mentioned and write the results to Excel.
These fixes are still not merged.
I have multiple files that trigger this error.
The bug has been triggered 6 times in the first 1255 files, so I'm guessing the error rate is about 0.5%.
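A rough sketch of how such a rate could be measured over a batch of files, assuming Python 3. The function `failure_rate` and its `process` argument are hypothetical: `process` stands for whatever per-file routine triggers the error, for example one that opens the PDF and decodes its content streams.

```python
def failure_rate(paths, process):
    """Call process(path) for each path and collect those raising TypeError.

    Returns (failed_paths, rate), where rate is the fraction that failed.
    Hypothetical helper for illustration only.
    """
    failed = []
    for path in paths:
        try:
            process(path)
        except TypeError:
            # Count only the error discussed here; other exceptions propagate.
            failed.append(path)
    rate = len(failed) / len(paths) if paths else 0.0
    return failed, rate


# The figures quoted above: 6 failures in 1255 files.
quoted_rate = "{:.2%}".format(6 / 1255)
```

The list of failing paths is returned along with the rate so that the offending files can be shared with the maintainers.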
Can a release be made that includes this fix?
I am getting this error when using Python 3 and this simple code:
    import sys
    from PyPDF2 import PdfFileReader, PdfFileWriter

    output = PdfFileWriter()  # assumed: `output` is not defined in the snippet as posted
    imagepdf = PdfFileReader(open(sys.argv[1], 'rb'), strict=False)
    textpdf = PdfFileReader(open(sys.argv[2], 'rb'), strict=False)
    for i in range(imagepdf.getNumPages()):
        imagepage = imagepdf.getPage(i)
        textpage = textpdf.getPage(i)
        # Scale the image page to the size of the text page
        factor_x = textpage.mediaBox.upperRight[0] / imagepage.mediaBox.upperRight[0]
        factor_y = textpage.mediaBox.upperRight[1] / imagepage.mediaBox.upperRight[1]
        imagepage.scale(float(factor_x), float(factor_y))
        textpage.mergePage(imagepage)  # imagepage stays on top
        textpage.compressContentStreams()
        output.addPage(textpage)
Trace:
Traceback (most recent call last):
  File "...", line 34, in <module>
    imagepage.scale(float(factor_x), float(factor_y))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyPDF2/pdf.py", line 2493, in scale
    0, 0])
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyPDF2/pdf.py", line 2479, in addTransformation
    originalContent, self.pdf, ctm)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyPDF2/pdf.py", line 2180, in _addTransformationMatrix
    contents = ContentStream(contents, pdf)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyPDF2/pdf.py", line 2641, in __init__
    data += s.getObject().getData()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyPDF2/generic.py", line 837, in getData
    decoded._data = filters.decodeStreamData(self)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyPDF2/filters.py", line 350, in decodeStreamData
    data = LZWDecode.decode(data, stream.get("/DecodeParms"))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyPDF2/filters.py", line 255, in decode
    return LZWDecode.decoder(data).decode()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyPDF2/filters.py", line 228, in decode
    cW = self.nextCode();
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/PyPDF2/filters.py", line 205, in nextCode
    nextbits=ord(self.data[self.bytepos])
TypeError: ord() expected string of length 1, but int found
Am I doing something wrong?