Manipulated inline images can force PyPDF2 into an infinite loop #329

Closed
sekrause opened this issue Feb 17, 2017 · 0 comments · Fixed by #740
Labels
is-bug: From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF
nf-performance: Non-functional change: Performance
nf-security: Non-functional change: Security

Comments

@sekrause
Contributor

When you try to get the content stream of the attached PDF, PyPDF2 ends up in an infinite loop. This is probably a security issue, because an attacker could use such a file to mount a denial-of-service attack on applications that use PyPDF2.

The reason is that the last while-loop in ContentStream._readInlineImage only terminates when it finds the EI token, but never checks whether the stream has already ended. Triggering the bug therefore only requires a (broken) inline image that has no EI token at all, like the one in the attached PDF.
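
As a rough sketch of the kind of check that is missing: the code below is not the actual PyPDF2 implementation, and the names read_inline_image_data, InlineImageError and WHITESPACE_BYTES are made up for illustration. It scans a seekable byte stream for an EI token followed by whitespace (or the end of the stream), and it raises as soon as read() returns empty bytes instead of looping forever.

```python
import io

# Hypothetical names for illustration; none of this is PyPDF2 API.
WHITESPACE_BYTES = b" \t\r\n\x00\x0c"


class InlineImageError(Exception):
    """Raised when an inline image has no terminating EI token."""


def read_inline_image_data(stream):
    """Return the bytes before the EI token, or raise if the stream ends first."""
    data = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            # read() returns b"" at end of stream and keeps doing so on every
            # later call, so a loop that only waits for EI would never stop.
            raise InlineImageError("stream ended before an EI token was found")
        if byte != b"E":
            data += byte
            continue
        # Possible start of the EI token: look at the next two bytes.
        lookahead = stream.read(2)
        is_ei = lookahead[:1] == b"I" and (
            len(lookahead) == 1 or lookahead[1:] in WHITESPACE_BYTES
        )
        if is_ei:
            return bytes(data)
        # False alarm: keep the "E" and rewind so the lookahead is rescanned.
        data += byte
        stream.seek(-len(lookahead), io.SEEK_CUR)
```

With a well-formed stream such as io.BytesIO(b"abc EI Q") this returns the image bytes; with a stream that has no EI token it raises InlineImageError rather than spinning. The sketch is simplified (for example, it keeps the whitespace that precedes EI as part of the data).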

You can see the infinite loop by running this test script with the attached PDF:

import sys

from PyPDF2 import PdfFileReader
from PyPDF2.pdf import ContentStream

# Pass the path to the attached PDF as the first command-line argument.
with open(sys.argv[1], 'rb') as f:
    pdf = PdfFileReader(f, strict=False)
    for page in pdf.pages:
        # Building the ContentStream parses the page's operators, including inline images.
        contentstream = ContentStream(page.getContents(), pdf)
        for operands, command in contentstream.operations:
            if command == b'INLINE IMAGE':
                data = operands['data']
                print(len(data))
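
Since the script above never returns on an affected version, it can be convenient to run it under a time limit. A minimal sketch using only the standard library follows; finishes_within is a hypothetical helper name, and this is just one possible way to do it.

```python
import subprocess
import sys


def finishes_within(argv, seconds):
    """Run argv as a child process; return False if it had to be killed on timeout."""
    try:
        subprocess.run(
            argv,
            timeout=seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising TimeoutExpired.
        return False
    return True
```

For example, finishes_within([sys.executable, "repro.py", "broken.pdf"], 10) would report False if the parser hangs, where repro.py and broken.pdf are placeholder names for the script above and the attached PDF.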

I will soon prepare a pull request that fixes this issue.

MartinThoma added the is-bug, nf-performance and nf-security labels on Apr 8, 2022
MartinThoma pushed a commit that referenced this issue Apr 15, 2022
Closes #329 - potential infinite loop (SEC)
Closes #330 - performance issue of ContentStream._readInlineImage (PERF)