BUG: UnboundLocalError when iterating on pages of malformed pdf (with strict=True) #2617
Comments
Thanks for the report. Do you want to submit a corresponding PR which initializes this value with a default to allow proper error reporting? |
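For readers unfamiliar with this class of bug, here is a minimal sketch of how an unassigned local can mask the real error, and how a default restores proper reporting. The function names, the `known_generations` mapping and the default value of 0 are all hypothetical and chosen for illustration; this is not the code in pypdf's _reader.py.

```python
def report_missing_buggy(obj_id: int, known_generations: dict) -> str:
    """Build an error message for an object that could not be resolved."""
    if obj_id in known_generations:
        generation = known_generations[obj_id]
    # Bug: if obj_id is absent, `generation` was never assigned, so the
    # next line raises UnboundLocalError instead of building the message.
    return f"Object {obj_id} {generation} not found"


def report_missing_fixed(
    obj_id: int, known_generations: dict, default_generation: int = 0
) -> str:
    """Same as above, but `generation` always has a value."""
    # Assumption: 0 is an acceptable placeholder for the error message.
    generation = default_generation
    if obj_id in known_generations:
        generation = known_generations[obj_id]
    return f"Object {obj_id} {generation} not found"
```

With the default in place, the caller sees the intended "not found" message rather than an unrelated UnboundLocalError.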
@farjasju for advice, the fix is to add after line 402 in _reader.py It would be great to make a test with your file in order to improve coverage: you might be the one who gets us over 95% code coverage 😀 |
Thanks for the suggestion! I was trying to understand what was this |
@pubpub-zz so you confirm that the expected exception when iterating over the pages is the following?
|
Correct! Your PDF is damaged and object 21 cannot be found properly in it (you can confirm this by reading the file with strict=False) |
Thanks! Should I add the file to |
Objects are identified by an id and a generation/version number. The generation makes it possible to detect when an id has been reused.
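As a small illustration of the id/generation pair: in PDF syntax an indirect reference is written as two integers followed by `R`, for example `21 0 R`. The sketch below parses such a reference into a tuple. `ObjectRef` and `parse_ref` are hypothetical names for this example, not pypdf APIs.

```python
import re
from typing import NamedTuple


class ObjectRef(NamedTuple):
    """An (object number, generation number) pair; name is illustrative."""
    idnum: int
    generation: int


# Matches an indirect reference such as "21 0 R".
_REF_PATTERN = re.compile(r"^\s*(\d+)\s+(\d+)\s+R\s*$")


def parse_ref(text: str) -> ObjectRef:
    """Parse an indirect reference like '21 0 R' into an ObjectRef."""
    match = _REF_PATTERN.match(text)
    if match is None:
        raise ValueError(f"Not an indirect reference: {text!r}")
    return ObjectRef(int(match.group(1)), int(match.group(2)))
```

Two references with the same id but different generations, such as `21 0 R` and `21 1 R`, parse to different tuples, which is how a reused id can be told apart from the original object.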
As long as you do not own any copyright on the file, please download it from the GitHub URL you uploaded the example to, i.e. https://github.com/py-pdf/pypdf/files/15186107/malformed_pdf.pdf |
The PDF seems to be a truncated version of this article. I do not own any rights to it, so I don't know if it is OK to upload it? |
As long as you are unsure, please do not use |
Sorry for being such a GitHub noob but I have to push my branch before creating the PR, right? I get a 403 when trying to push it:
EDIT: Okay, maybe it's better to fork the repo first instead of creating the branch on the cloned repo itself |
Yes, you need to create a fork and push to your fork. |
Closes #2617 Co-authored-by: jules <jules@harfanglab.fr>
An UnboundLocalError: local variable 'generation' referenced before assignment is raised when iterating over the pages of a malformed PDF (with len(PdfReader.pages), for example) when strict=True.
Environment
Code + PDF
This is a minimal, complete example that shows the issue:
The malformed pdf (coming from https://www.columbia.edu/~aw2951/Nations.pdf):
malformed_pdf.pdf
Traceback
This is the complete traceback I see: