This repository has been archived by the owner on Apr 15, 2024. It is now read-only.

Error with zlib headers #151

Open
pudo opened this issue Sep 23, 2016 · 1 comment

pudo commented Sep 23, 2016

With this PDF file:

http://www.gld.gov.hk/egazette/pdf/20162032/egn201620324533.pdf

The following error occurs:

  File "/aleph/aleph/ingest/pdf.py", line 81, in extract_pdf
    doc = PDFDocument(parser, '')
  File "/usr/local/lib/python2.7/site-packages/pdfminer/pdfdocument.py", line 566, in __init__
    xref.load(parser)
  File "/usr/local/lib/python2.7/site-packages/pdfminer/pdfdocument.py", line 204, in load
    parser1 = PDFStreamParser(stream.get_data())
  File "/usr/local/lib/python2.7/site-packages/pdfminer/pdftypes.py", line 294, in get_data
    self.decode()
  File "/usr/local/lib/python2.7/site-packages/pdfminer/pdftypes.py", line 253, in decode
    raise PDFException('Invalid zlib bytes: %r, %r' % (e, data))
PDFException: Invalid zlib bytes: error('Error -3 while decompressing data: incorrect header check',), '[snip]'

Where [snip] is:

N\x9c\xc9cv\x1b\xd3p\x98\xb7\x97\x8d`w\xb0c\x0f\x0c\x1d=(\xdaPw\x814\xa8.\xa9->\xca\x9b\xcf\xc8\xa8\xffT\xc2\xe3YJ\x1e&C\xf9q\xa9E\xa1\x1c,\n\xbf(\'\x03\x1a\x97\x8c\xce\x8f`\xdc\x07\xd4\xd6\x98\n\xad\xe5\xb6\x9d\xc3iQ\xdeVD/\x91\xc7!\xf7{\xe9WQ\x87\x0eVG}\x89:O\x13.\x9f\xe2\xb3Tu\x1c\xab\x170\x17f\xb2\x04\xd6\xbfa\x99\xd9u\x9c\r\x04\xa2\x1f\xe3\xc2Z\xb4~\xac)\xb5\xd7\x85\x03\xf0UB\xbd\xbb\xb8\x1eI_k\xdfn:\xecPmx\xe3\x18\xc2.^>\x7f\xb5w|\x98XOk\xafG\xc2\x1a\x0e\xdf\x12\x7fb.b\xa9\x0f>\x85G9m\xf2\xb3\'p\xcb\xe9\x86\x89\xf0\xa3\x11\xd7\xe6]^\x90\xc7\xa5\xa7J\xb4|\xc5\xef)\x8a\x11\xa57}H\x82\xf8y\x06\x81z:\xff\x1d<\x96\xe8\xed\xda_\xed\x07\xca\xd52\x90\x9b\x97\xbe\xc7\n\xe8\xfb\xa4+\xd0[\x13\x9e\xbd\x14u\x93\x13VMk\xdd\x13\x8e\x80f\xe0lc\xa1\t\xbf\xecE#\xa0\xe2\xfb\x14\x93\xa6\xbc\x94Y\xa7\xf1\x9d\xfb^A.\xd5\xd4P\r\xf0H%\x9d\x12b`\xdak\xa5\xb9h\x9a\x15\xdf\x8b\x96k\x1c\'\xd5\xc1/\xdd\x94\xc9O\x0f\x81\x17\x1cxdU\x80\xfe\x03\x98\xf3\xe3\xd8$J2\xfd\x00\x11gd5\t`\xbc\xbc\xd6\x8b\xe4\xae\x15\x8f\xe7\xe7\xca\x8b\xccMkk\x81\xbaF\x89\xcf-\xb4&I\xe3\r)\x83m\xae\xb4"\xda\x1c\xfe,\xce\xa1\x83\x14b\xb7\xae\xf5&\xf6hA\x00\xbf\xf7\xd1\xbcZ\xe1\xf1\xb2\xaa\x8c%\x90\xa1\xe5\xb9\x8d\x96\xb7T;\xe4\xe2\xb5\xfdC\x18\x02\x90\x1c\xf2\xa2\xf4\xa0@Bi\xdc@\x9cw\xbf\xb0\x1f\x8fA_\xbaQQGQ\xf3/\xd7\xdb\xf9\xf6\xe2\xe9\xac\xa1\xcfNJ<\xc8\xbc\x8f\xefJ\xe9\x12m\xfa\x88\xa5\xfa\x82;@j~\xf4\x7f\xdc\xdf\xf9\xc3\x83\'\xbe\xc6\x7f#\xd1\x01\xb4&\xa8\xc5]U\x0evE\xb7\xf1\x84\xe2\xa0~q$-<\xda\x04\x818\xe68\xb7\x11[;\xb9\xe9\xdf\xd2e\xaa\x87\xf0A2\xdc\xe59L\x92\xcd\x90\x89\n\x9c\x1e\xca\x822f\x00\xa6o\x1f^q\x0cl]\x8ex<\xd0$0oH@^\x16\xbc\xcc\xc7\x16\x99D]\x9b"\xac\x18\x03\xc3\xfb\x8e\xc4\xb4W7\xfe\xd4\x9b\xc3\xdd\xce\xfb\xaf@k\xec`\xbc\xd7G\xc9\xdb}\xe6f\xd1\xde\x99\xf0\xf2\x9b\xa5\xef5\xaf\xc6E[F\x0c_K!\t\x97\xa4\xd7\x1a\xb3X\x953\x18n\xdb\xd3\xac\x97W]^B\xd2m-\x84\x1c\x83e\xac\xe4fn\x1b\xea\xb7\x9a\x13\xcd\xd6fn)0O\x14\x05\xaa\xa4\x96\xde\xc6\xa1\xdc\x88\x93\x81Zf&\xe7\x8fV\x8d\xd0\x87d\x07\x1c\xd6\x96\xc9\x9
8\xee\xec\xf63\x18\t\x98\x0c\xfdWM1\xaf\xef \x10;\xde\xcd_\x8e\x05!m\xe5S\xcf\xa1\x82\x1c\xd1}F(\xc6\xe8\xc8\xb0\x07\xdb\xe2W|\x7fX\xd1\xa9z\r\n
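
The "incorrect header check" message comes from zlib's validation of the first two bytes of a stream. As a minimal sketch (written for Python 3, not taken from pdfminer; `has_valid_zlib_header` is a hypothetical helper name), the RFC 1950 header rules can be checked by hand against the leading bytes of the dump above, `N\x9c`:

```python
import zlib


def has_valid_zlib_header(data):
    """Return True if the first two bytes form a valid RFC 1950 zlib header.

    Hypothetical helper for illustration; it is not part of pdfminer or zlib.
    """
    head = bytearray(data[:2])  # bytearray yields ints on Python 2 and 3
    if len(head) < 2:
        return False
    cmf, flg = head[0], head[1]
    method_is_deflate = (cmf & 0x0F) == 8          # CM must be 8 (deflate)
    window_in_range = (cmf >> 4) <= 7              # CINFO above 7 is not allowed
    check_bits_ok = ((cmf << 8) | flg) % 31 == 0   # FCHECK makes the pair a multiple of 31
    return method_is_deflate and window_in_range and check_bits_ok


# Leading bytes copied from the dump in this issue.
reported_prefix = b"N\x9c\xc9cv\x1b"
# A known-good stream for comparison.
good_stream = zlib.compress(b"hello")
```

Here `has_valid_zlib_header(reported_prefix)` is `False` and `has_valid_zlib_header(good_stream)` is `True`. A stream written by zlib at its default settings starts with `x\x9c` (0x78 0x9c), so the reported data differs from a typical header in its very first byte. The sketch only shows why zlib rejects the bytes; it does not say how they came to be that way.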

@disillusions

I get the same error. Does anybody have a solution?
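
The thread does not record a fix. As a diagnostic aid only, the sketch below (Python 3; `lenient_inflate` is a hypothetical name, and nothing here is pdfminer's API) tries the standard zlib container first and then headerless raw deflate, which some malformed PDF streams use. It is an assumption that either format applies to the file in this issue.

```python
import zlib


def lenient_inflate(data):
    """Try zlib-wrapped deflate, then raw deflate; return None if both fail.

    Hypothetical helper for illustration, not a pdfminer function.
    """
    # Positive wbits expects a zlib header; negative wbits means raw deflate.
    for wbits in (zlib.MAX_WBITS, -zlib.MAX_WBITS):
        try:
            return zlib.decompressobj(wbits).decompress(data)
        except zlib.error:
            continue
    return None


# Build a raw (headerless) deflate stream to show the fallback path.
compressor = zlib.compressobj(6, zlib.DEFLATED, -zlib.MAX_WBITS)
raw_stream = compressor.compress(b"hello") + compressor.flush()

# Leading bytes copied from the dump in this issue.
reported_prefix = b"N\x9c\xc9cv\x1b"
```

With these inputs, `lenient_inflate(raw_stream)` recovers `b"hello"`, while `lenient_inflate(reported_prefix)` returns `None`: the bytes quoted in the report are rejected under both interpretations, so this check helps rule out a missing header but does not repair the file.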
