PdfReadError: Could not find xref table at specified location #1089

Closed
MartinThoma opened this issue Jul 10, 2022 · 2 comments

Labels
Has MCVE: A minimal, complete and verifiable example helps a lot to debug / understand feature requests
is-robustness-issue: From a users perspective, this is about robustness
PdfReader: The PdfReader component is affected

Comments

@MartinThoma
Member

I wanted to get metadata from a PDF. The PDF opens fine in Chrome.

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
Linux-5.4.0-121-generic-x86_64-with-glibc2.31

$ python -c "import PyPDF2;print(PyPDF2.__version__)"
2.4.2

Code + PDF

The PDF: pdf/6f6d505d151b9769900a8bd315db6e67.pdf

>>> from PyPDF2 import PdfReader
>>> reader = PdfReader('pdf/6f6d505d151b9769900a8bd315db6e67.pdf')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/moose/Github/py-pdf/PyPDF2/PyPDF2/_reader.py", line 267, in __init__
    self.read(stream)
  File "/home/moose/Github/py-pdf/PyPDF2/PyPDF2/_reader.py", line 1305, in read
    raise PdfReadError("Could not find xref table at specified location")
PyPDF2.errors.PdfReadError: Could not find xref table at specified location
@MartinThoma added the PdfReader, is-robustness-issue, and Has MCVE labels Jul 10, 2022
@MartinThoma
Member Author

Maybe #493 helps?

pubpub-zz added a commit to pubpub-zz/pypdf that referenced this issue Jul 19, 2022
rebuild the xref if the parent chained xref is invalid
@pubpub-zz
Collaborator

The problem is that the /Prev entry in the first chained trailer points to an invalid position. In that case, the xref table should be rebuilt completely by parsing the whole file.
A fix is proposed in #1133.
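
For readers unfamiliar with the idea, here is a minimal sketch of what "rebuilding the xref by parsing the whole file" can look like. It is not the code from #1133 and not PyPDF2's API: the names `points_at_xref` and `rebuild_xref` are hypothetical, and the approach is a plain regex scan for `N G obj` headers. It assumes objects are stored uncompressed (objects inside object streams are not found) and that no stream payload happens to contain bytes that look like an object header.

```python
import re

# Start of an indirect object, e.g. b"12 0 obj". The lookbehind keeps the
# match from starting in the middle of a longer number.
OBJ_HEADER = re.compile(rb"(?<![0-9])(\d+)\s+(\d+)\s+obj\b")


def points_at_xref(data: bytes, offset: int) -> bool:
    """Hypothetical check: does `offset` plausibly start an xref section?

    A classic table starts with the keyword `xref`; an xref stream starts
    with an ordinary object header. Anything else (or an offset outside
    the file) is treated as invalid, as with the bad /Prev value here.
    """
    if not 0 <= offset < len(data):
        return False
    chunk = data[offset:offset + 32]
    return chunk.startswith(b"xref") or OBJ_HEADER.match(chunk) is not None


def rebuild_xref(data: bytes) -> dict:
    """Hypothetical fallback: map (object number, generation) -> byte offset.

    The whole file is scanned front to back, so a later definition of the
    same object (appended by an incremental update) overrides an earlier one.
    """
    table = {}
    for match in OBJ_HEADER.finditer(data):
        key = (int(match.group(1)), int(match.group(2)))
        table[key] = match.start()
    return table


if __name__ == "__main__":
    sample = (
        b"%PDF-1.4\n"
        b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
        b"2 0 obj\n<< >>\nendobj\n"
        b"xref\n0 3\n"
    )
    # An offset taken from a broken /Prev entry would fail this check ...
    print(points_at_xref(sample, 3))
    # ... and the reader could then fall back to a full scan.
    print(rebuild_xref(sample))
```

A reader could apply the check to every `/Prev` offset while following the trailer chain and switch to the full scan as soon as one fails, which is roughly the behavior described above.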
