Denial of service issue #184

Closed
fgeek opened this Issue Mar 7, 2015 · 2 comments

fgeek commented Mar 7, 2015

Hi,

I found a denial-of-service issue in PyPDF2 version 1.24. With a fuzzed sample file, PyPDF2 ends up using all the CPU of one core.

The sample file is located at http://bugs.fi/media/afl/pypdf2/pypdf2-1.24-afl-dos.pdf. It was produced by fuzzing with American fuzzy lop, using the https://bitbucket.org/jwilk/python-afl project as the instrumentation component.

References:

crasher.py (SHA1: a4fcecaa1e49472d45d6b2155cf70d62620b9622)

import PyPDF2
import sys
from PyPDF2 import PdfFileReader, PdfFileWriter

output = PdfFileWriter()
try:
    input1 = PdfFileReader(open(sys.argv[1], "rb"))
except PyPDF2.utils.PdfReadError:
    sys.exit()
print "document has %d pages." % input1.getNumPages()
output.addPage(input1.getPage(0).rotateClockwise(90))
outputStream = file("example2.pdf", "wb")
output.write(outputStream)

Execution with Python 2.7.9 using the latest Git version (41d90b4):

python crasher.py pypdf2-1.24-afl-dos.pdf 1-2
PdfReadWarning: Xref table not zero-indexed. ID numbers for objects will not be corrected. [pdf.py:1509]
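
Because the process never exits on its own, a reproduction is easier to automate with a time limit. The following is a minimal sketch, not part of the original report: it assumes Python 3 (for `subprocess.run` with `timeout`), and `finishes_within` is a hypothetical helper name. The two commands are stand-ins so the sketch runs anywhere; for the real case the argument list would be something like `[sys.executable, "crasher.py", "pypdf2-1.24-afl-dos.pdf"]`, run with the interpreter that has PyPDF2 installed.

```python
import subprocess
import sys


def finishes_within(argv, seconds):
    """Run argv as a child process; return False if it exceeds the time limit."""
    try:
        subprocess.run(
            argv,
            timeout=seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising TimeoutExpired.
        return False
    return True


# Stand-in commands (assumptions for illustration only):
# one that spins forever like the reported hang, one that exits at once.
spinning = [sys.executable, "-c", "while True: pass"]
quick = [sys.executable, "-c", "pass"]

hang_detected = not finishes_within(spinning, 1.0)
quick_ok = finishes_within(quick, 30.0)
```

A short limit is enough here because the hang shows up as one core pinned at full CPU rather than as slow progress.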

dhudson1 commented Aug 17, 2015 (Collaborator)

If you insert the following lines into the readObject function in the generics.py file, PyPDF2 should raise an error:

            if len(tok) <= 0:   # Prevents an infinite loop by raising
                                # an error iff the stream is at the EOF
                raise PdfStreamError("File ended unexpectedly.")

You should insert the above code so that your readObject function now looks like this:

def readObject(stream, pdf):
    tok = stream.read(1)
    stream.seek(-1, 1) # reset to start
    idx = ObjectPrefix.find(tok)
    if idx == 0:
        # name object
        return NameObject.readFromStream(stream, pdf)
    elif idx == 1:
        # hexadecimal string OR dictionary
        peek = stream.read(2)
        stream.seek(-2, 1) # reset to start
        if peek == b_('<<'):
            return DictionaryObject.readFromStream(stream, pdf)
        else:
            return readHexStringFromStream(stream)
    elif idx == 2:
        # array object
        return ArrayObject.readFromStream(stream, pdf)
    elif idx == 3 or idx == 4:
        # boolean object
        return BooleanObject.readFromStream(stream)
    elif idx == 5:
        # string object
        return readStringFromStream(stream)
    elif idx == 6:
        # null object
        return NullObject.readFromStream(stream)
    elif idx == 7:
        # comment
        while tok not in (b_('\r'), b_('\n')):
            tok = stream.read(1)
            if len(tok) <= 0:   # Prevents an infinite loop by raising
                                # an error iff the stream is at the EOF
                raise PdfStreamError("File ended unexpectedly.")
        tok = readNonWhitespace(stream)
        stream.seek(-1, 1)
        return readObject(stream, pdf)
    else:
        # number object OR indirect reference
        if tok in NumberSigns:
            # number
            return NumberObject.readFromStream(stream)
        peek = stream.read(20)
        stream.seek(-len(peek), 1) # reset to start
        if IndirectPattern.match(peek) != None:
            return IndirectObject.readFromStream(stream, pdf)
        else:
            return NumberObject.readFromStream(stream)
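
To see why the check matters, the comment branch can be isolated from the rest of the parser. This is a minimal sketch under stated assumptions, not PyPDF2 code: `skip_comment` is a hypothetical helper that mirrors only the `idx == 7` loop, `PdfStreamError` is a local stand-in for the library's exception, and it is written for Python 3 with `io.BytesIO` as the stream.

```python
import io


class PdfStreamError(Exception):
    """Local stand-in for PyPDF2's PdfStreamError so the sketch runs alone."""


def skip_comment(stream):
    """Consume bytes up to and including the end-of-line that ends a comment."""
    tok = stream.read(1)
    while tok not in (b"\r", b"\n"):
        tok = stream.read(1)
        if len(tok) == 0:
            # At EOF, read() keeps returning b"", which is never a newline,
            # so without this check the loop would never terminate.
            raise PdfStreamError("File ended unexpectedly.")


# A comment that ends with a newline is skipped normally.
terminated = io.BytesIO(b"% a comment\n42")
skip_comment(terminated)
rest = terminated.read()

# A comment cut off by EOF now raises instead of spinning.
truncated = io.BytesIO(b"% comment with no newline")
try:
    skip_comment(truncated)
    outcome = "returned"
except PdfStreamError as exc:
    outcome = str(exc)
```

With the `len(tok) == 0` check removed, the second call would loop forever, which matches the single core pinned at 100% CPU described in the report.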

Thank you very much for finding this bug, and waiting for us to get back to you.

fgeek commented Sep 18, 2016

This is now also fixed in Debian via the 8.6 point update.
