This repository has been archived by the owner on Apr 15, 2024. It is now read-only.

struct.error: unpack requires a string argument of length 8 #96

Open
jlegaye opened this issue Feb 19, 2015 · 7 comments

jlegaye commented Feb 19, 2015

I am running this command:

pdf2txt.py -S -t xml -o pdfMinerOutput.xml input.pdf

It crashes at the very beginning with this stack trace:

Traceback (most recent call last):
  File "C:\Tools\Python27\Scripts\pdf2txt.py", line 116, in <module>
    if __name__ == '__main__': sys.exit(main(sys.argv))
  File "C:\Tools\Python27\Scripts\pdf2txt.py", line 110, in main
    interpreter.process_page(page)
  File "C:\Tools\Python27\lib\site-packages\pdfminer\pdfinterp.py", line 839, in process_page
    self.render_contents(page.resources, page.contents, ctm=ctm)
  File "C:\Tools\Python27\lib\site-packages\pdfminer\pdfinterp.py", line 850, in render_contents
    self.init_resources(resources)
  File "C:\Tools\Python27\lib\site-packages\pdfminer\pdfinterp.py", line 356, in init_resources
    self.fontmap[fontid] = self.rsrcmgr.get_font(objid, spec)
  File "C:\Tools\Python27\lib\site-packages\pdfminer\pdfinterp.py", line 204, in get_font
    font = self.get_font(None, subspec)
  File "C:\Tools\Python27\lib\site-packages\pdfminer\pdfinterp.py", line 195, in get_font
    font = PDFCIDFont(self, spec)
  File "C:\Tools\Python27\lib\site-packages\pdfminer\pdffont.py", line 669, in __init__
    BytesIO(self.fontfile.get_data()))
  File "C:\Tools\Python27\lib\site-packages\pdfminer\pdffont.py", line 387, in __init__
    (ntables, _1, _2, _3) = struct.unpack('>HHHH', fp.read(8))
struct.error: unpack requires a string argument of length 8
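
One reading of the traceback above (an assumption, not a diagnosis confirmed anywhere in this thread) is that `fp.read(8)` returned fewer than 8 bytes, for example because the embedded font stream was empty or truncated, so `struct.unpack('>HHHH', ...)` had too little data to work with. The sketch below reproduces that situation in isolation and shows one way to guard against it. `read_table_count` is a hypothetical helper written for illustration; it is not part of pdfminer and not necessarily how any proposed patch handles the problem.

```python
import struct
from io import BytesIO

HEADER_FORMAT = '>HHHH'                       # same format string as in the traceback
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # four big-endian unsigned shorts


def read_table_count(fp):
    """Hypothetical helper: return the first header field (ntables),
    or None when the stream is too short to hold the header."""
    data = fp.read(HEADER_SIZE)
    if len(data) < HEADER_SIZE:
        # Short read: the caller can skip this font instead of crashing.
        return None
    ntables, _1, _2, _3 = struct.unpack(HEADER_FORMAT, data)
    return ntables


# Unguarded call on an empty stream: raises struct.error, as in the report.
try:
    struct.unpack(HEADER_FORMAT, BytesIO(b'').read(HEADER_SIZE))
    short_read_error = None
except struct.error as exc:
    short_read_error = exc

# Guarded calls: None for the empty stream, the count for a complete header.
empty_result = read_table_count(BytesIO(b''))
full_result = read_table_count(BytesIO(struct.pack(HEADER_FORMAT, 3, 0, 0, 0)))
```

Whether skipping the font is acceptable depends on the document; text that relies on that font's mapping may come out wrong or missing.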

NealJMD commented Apr 19, 2015

I am also experiencing this on Mac OS X 10.8.5


dvro commented Oct 2, 2015

@NealJMD @jlegaye I am having the same problem. Any luck solving this?


gfarce commented Oct 2, 2015

Is this an issue with all the PDFs you're processing, or only some?


dvro commented Oct 2, 2015

@gfarce Only some PDFs.


vionemc commented Feb 3, 2016

I am also experiencing this in Cygwin. It happens a lot, for example with this PDF:
https://comision6senado.files.wordpress.com/2013/03/acta-11-12-septiembre-25-de-2012.pdf

@pombredanne

Same problem here: nexB/scancode-toolkit#289 (comment)

@euske Shinyama-san, do you need some help applying patches and PRs?
You may be busy with many other things.

#132 by @abrasive seems to be a cure here?


danmash commented Oct 18, 2016

I have a similar issue. Does anyone know how I can solve it? #144
