eghlima opened this issue Jul 25, 2017 · 3 comments
Labels
Has MCVE: A minimal, complete and verifiable example helps a lot to debug / understand feature requests
is-bug: From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF
PdfReader: The PdfReader component is affected
I have attached an invalid PDF file.
When I tried to open it, reading the file took a very long time (more than 1 hour).
This seems to be a bug in PdfFileReader.
Here is my test to reproduce the bug with PyPDF2==1.26.0 and Python 3.6:
Sorry for digging this up, but we saw similar results, even with a "PDF" that was a 5MB file with all zeroes.
I'm not sure what exactly the code is doing, but it seems to get stuck in a loop looking for an EOF marker that doesn't exist. I'm not sure if this will ever be fixed in PyPDF2 given the current situation, but it would really be helpful to have PyPDF2 fail fast on these types of files.
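Until something like that exists in the library, a caller-side guard is one way to fail fast. The sketch below is an assumption of mine, not part of PyPDF2: `looks_like_pdf` is a hypothetical helper, and the 1024-byte window is a common reader convention rather than anything PyPDF2 enforces. It only checks that a `%PDF-` header appears near the start and a `%%EOF` marker appears near the end before the stream is handed to `PdfFileReader`.

```python
import io


def looks_like_pdf(stream, window=1024):
    """Hypothetical pre-check, not a PyPDF2 API.

    Returns True only if '%PDF-' appears in the first `window` bytes and
    '%%EOF' appears in the last `window` bytes. The stream position is
    reset to 0 before returning.
    """
    stream.seek(0, io.SEEK_END)
    size = stream.tell()

    # Inspect the head of the file for the header marker.
    stream.seek(0)
    head = stream.read(min(window, size))

    # Inspect the tail of the file for the end-of-file marker.
    stream.seek(max(0, size - window))
    tail = stream.read()

    stream.seek(0)
    return b"%PDF-" in head and b"%%EOF" in tail


# A 5 MB blob of null bytes is rejected without scanning the whole file.
zeros = io.BytesIO(b"\x00" * 5_000_000)
print(looks_like_pdf(zeros))
```

This reads at most two small windows, so the cost does not grow with the file size. It will not catch every malformed file, only the ones with no plausible header or EOF marker.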
Actually, the real function reads a line backwards, but it's the same idea... This ends up being quadratic in the total length of the line, which is particularly noticeable when the line gets very long (e.g. 5MB of null bytes, which, since it has no \r or \n characters, is treated as a single line). I'll open an issue with a fix as well as a way to monkey-patch PdfFileReader to work around it - I don't know if PyPDF2 is still being maintained but at least it'll be out there.
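To make the quadratic behaviour concrete, here is a minimal sketch of how a backwards line read can end up quadratic, and one linear alternative. Both functions are illustrations written for this thread, not PyPDF2's actual code; the names are hypothetical. The slow version prepends each byte to an immutable `bytes` object, which copies the whole buffer on every step. The fast version collects the bytes in a list and joins once at the end.

```python
import io


def read_line_backwards_quadratic(stream):
    """Illustrative only: each prepend copies the whole buffer, so a line
    of n bytes costs on the order of n^2 byte copies."""
    line = b""
    while stream.tell() > 0:
        stream.seek(-1, io.SEEK_CUR)
        ch = stream.read(1)
        stream.seek(-1, io.SEEK_CUR)  # step back over the byte just read
        if ch in b"\r\n":
            break
        line = ch + line  # O(len(line)) copy on every iteration
    return line


def read_line_backwards_linear(stream):
    """Same result, but bytes are appended to a list and joined once."""
    chunks = []
    while stream.tell() > 0:
        stream.seek(-1, io.SEEK_CUR)
        ch = stream.read(1)
        stream.seek(-1, io.SEEK_CUR)
        if ch in b"\r\n":
            break
        chunks.append(ch)  # amortized O(1)
    chunks.reverse()  # bytes were collected last-to-first
    return b"".join(chunks)


# Both versions agree on the content of the last line.
data = b"first line\n" + b"\x00" * 2000
for reader in (read_line_backwards_quadratic, read_line_backwards_linear):
    stream = io.BytesIO(data)
    stream.seek(0, io.SEEK_END)
    assert reader(stream) == b"\x00" * 2000
```

With a few thousand bytes the difference is invisible. With a 5MB line and no \r or \n, the first version has to copy trillions of bytes in total, which fits the hour-long reads reported above.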
MartinThoma added the is-bug and PdfReader labels Apr 8, 2022
MartinThoma added the Has MCVE label Jun 27, 2022
MCVE: Code + PDF
Example PDF: file1.pdf