'PDFObjRef' object does not support indexing #58

Open
travis-st opened this issue Jul 29, 2017 · 7 comments

@travis-st

`import pdfquery
import sys

pdf = pdfquery.PDFQuery(sys.argv[1])
pdf.load()`

Traceback (most recent call last):
  File "bin/parse_pdf.py", line 6, in <module>
    pdf.load()
  File "/usr/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 385, in load
    self.tree = self.get_tree(*_flatten(page_numbers))
  File "/usr/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 487, in get_tree
    for n, page in pages:
  File "/usr/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 608, in <genexpr>
    return (self.get_layout(page) for page in self._cached_pages())
  File "/usr/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 603, in get_layout
    layout = self._add_annots(layout, page.annots)
  File "/usr/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 647, in _add_annots
    annot = self._set_hwxy_attrs(annot)
  File "/usr/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 665, in _set_hwxy_attrs
    attr['x0'] = bbox[0]
TypeError: 'PDFObjRef' object does not support indexing
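
The last frame fails on `bbox[0]` because `bbox` is still a `PDFObjRef` (an indirect reference) rather than the list it points to. A minimal sketch of one possible workaround, not a confirmed fix: resolve the reference before indexing. pdfminer's `PDFObjRef` has a `resolve()` method, and `pdfminer.pdftypes` provides `resolve1` for the same purpose. To keep the snippet runnable without a PDF, `FakeObjRef`, `resolve_ref`, and `set_hwxy_attrs` below are hypothetical stand-ins, and the `"bbox"` key is an assumption about how the annotation dict is laid out.

```python
class FakeObjRef:
    """Hypothetical stand-in for pdfminer's PDFObjRef (illustration only)."""

    def __init__(self, target):
        self._target = target

    def resolve(self, default=None):
        return self._target


def resolve_ref(obj, max_depth=32):
    """Follow .resolve() until a plain (non-reference) value is reached."""
    for _ in range(max_depth):
        resolver = getattr(obj, "resolve", None)
        if not callable(resolver):
            return obj
        obj = resolver()
    raise ValueError("reference chain too deep or cyclic")


def set_hwxy_attrs(attr):
    """Assumed shape of the failing helper, with resolution added."""
    # Resolve the box itself, then each coordinate, before indexing.
    bbox = [resolve_ref(v) for v in resolve_ref(attr["bbox"])]
    attr["x0"], attr["y0"], attr["x1"], attr["y1"] = bbox
    attr["width"] = attr["x1"] - attr["x0"]
    attr["height"] = attr["y1"] - attr["y0"]
    return attr


# The bbox arrives as an indirect reference instead of a list.
annot = set_hwxy_attrs({"bbox": FakeObjRef([10, 20, 110, 70])})
print(annot["x0"], annot["width"], annot["height"])
```

With a real document, the same idea would mean calling `resolve1` on the value before it is indexed.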

@jcushman
Owner

Hi! I can't really debug this without the PDF that's causing a problem for you -- can you share it?

@travis-st
Author

I would love to, but it's proprietary and confidential. Sorry :(

@travis-st
Author

FYI, experienced a different problem this time:

>>> pdf = pdfquery.PDFQuery("input/2015/12-Dec/17-Dec/17-12.pdf")
>>> pdf.load()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 385, in load
    self.tree = self.get_tree(*_flatten(page_numbers))
  File "/usr/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 486, in get_tree
    pages = enumerate(self.get_layouts())
  File "/usr/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 608, in get_layouts
    return (self.get_layout(page) for page in self._cached_pages())
  File "/usr/local/lib/python2.7/site-packages/pdfquery/pdfquery.py", line 636, in _cached_pages
    self._pages += list(self._pages_iter)
  File "/usr/local/lib/python2.7/site-packages/pdfminer/pdfpage.py", line 100, in create_pages
    yield klass(document, objid, tree)
  File "/usr/local/lib/python2.7/site-packages/pdfminer/pdfpage.py", line 53, in __init__
    self.mediabox = resolve1(self.attrs['MediaBox'])
KeyError: 'MediaBox'
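
`MediaBox` is an inheritable page attribute in the PDF format, so a page dictionary may omit it and rely on an ancestor `Pages` node. Whether that is what happened in this file is unknown without the PDF. The sketch below only illustrates the lookup idea using plain dicts; `find_inherited` and the US Letter fallback are assumptions for illustration, not part of pdfminer or pdfquery.

```python
# Assumed fallback when no ancestor defines the box: US Letter, in points.
DEFAULT_MEDIABOX = [0, 0, 612, 792]


def find_inherited(node, key, default=None, max_depth=64):
    """Walk up the Parent chain until `key` is found, else return default."""
    for _ in range(max_depth):  # bounded, in case of a cyclic Parent chain
        if node is None:
            break
        if key in node:
            return node[key]
        node = node.get("Parent")
    return default


root = {"Type": "Pages", "MediaBox": [0, 0, 595, 842]}
page = {"Type": "Page", "Parent": root}  # inherits MediaBox from root
orphan = {"Type": "Page"}                # no MediaBox anywhere

print(find_inherited(page, "MediaBox", DEFAULT_MEDIABOX))
print(find_inherited(orphan, "MediaBox", DEFAULT_MEDIABOX))
```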

@acmisiti

acmisiti commented May 8, 2018

Any update on the "'PDFObjRef' object does not support indexing" issue?

@NickHeiner

I experienced this same issue and, unfortunately, also cannot share the PDF.

@kravchenkog

I have a similar problem.

>>> pdf.load()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/anaconda3/lib/python3.7/site-packages/pdfquery/pdfquery.py", line 385, in load
    self.tree = self.get_tree(*_flatten(page_numbers))
  File "/anaconda3/lib/python3.7/site-packages/pdfquery/pdfquery.py", line 487, in get_tree
    for n, page in pages:
  File "/anaconda3/lib/python3.7/site-packages/pdfquery/pdfquery.py", line 608, in <genexpr>
    return (self.get_layout(page) for page in self._cached_pages())
  File "/anaconda3/lib/python3.7/site-packages/pdfquery/pdfquery.py", line 601, in get_layout
    self.interpreter.process_page(page)
  File "/anaconda3/lib/python3.7/site-packages/pdfminer/pdfinterp.py", line 852, in process_page
    self.render_contents(page.resources, page.contents, ctm=ctm)
  File "/anaconda3/lib/python3.7/site-packages/pdfminer/pdfinterp.py", line 864, in render_contents
    self.execute(list_value(streams))
  File "/anaconda3/lib/python3.7/site-packages/pdfminer/pdfinterp.py", line 888, in execute
    func(*args)
  File "/anaconda3/lib/python3.7/site-packages/pdfminer/pdfinterp.py", line 772, in do_TJ
    self.device.render_string(self.textstate, seq, self.ncs, self.graphicstate.copy())
  File "/anaconda3/lib/python3.7/site-packages/pdfminer/pdfdevice.py", line 87, in render_string
    scaling, charspace, wordspace, rise, dxscale, ncs, graphicstate)
  File "/anaconda3/lib/python3.7/site-packages/pdfminer/pdfdevice.py", line 105, in render_string_horizontal
    ncs, graphicstate)
  File "/anaconda3/lib/python3.7/site-packages/pdfminer/converter.py", line 121, in render_char
    textwidth = font.char_width(cid)
  File "/anaconda3/lib/python3.7/site-packages/pdfminer/pdffont.py", line 525, in char_width
    return self.widths[cid] * self.hscale
TypeError: unsupported operand type(s) for *: 'PDFObjRef' and 'float'
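
Here the unresolved reference sits inside a container: an entry of the font's `widths` table is a `PDFObjRef`, so multiplying it by `hscale` fails. A rough sketch of resolving every nested entry before doing arithmetic, under the assumption that references expose a `resolve()` method as pdfminer's `PDFObjRef` does. `FakeObjRef` and `deep_resolve` are hypothetical names used only for illustration, and the width values are made up.

```python
class FakeObjRef:
    """Hypothetical stand-in for pdfminer's PDFObjRef (illustration only)."""

    def __init__(self, target):
        self._target = target

    def resolve(self, default=None):
        return self._target


def deep_resolve(obj):
    """Resolve references at any nesting level of dicts, lists, and tuples.

    Note: no cycle guard; a self-referencing structure would loop forever.
    """
    while callable(getattr(obj, "resolve", None)):
        obj = obj.resolve()
    if isinstance(obj, dict):
        return {k: deep_resolve(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [deep_resolve(v) for v in obj]
    return obj


# A widths table where some entries are indirect references.
widths = {65: 722, 66: FakeObjRef(667), 67: FakeObjRef(FakeObjRef(722))}
hscale = 0.001

clean = deep_resolve(widths)
char_width = clean[66] * hscale  # plain number now, so multiplication works
print(clean, char_width)
```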

@NickB23

NickB23 commented Jul 3, 2019

Here too. Same problem:

  File "pdfqueryparser.py", line 4, in <module>
    pdf.load()
  File "/usr/local/lib/python3.7/site-packages/pdfquery/pdfquery.py", line 385, in load
    self.tree = self.get_tree(*_flatten(page_numbers))
  File "/usr/local/lib/python3.7/site-packages/pdfquery/pdfquery.py", line 487, in get_tree
    for n, page in pages:
  File "/usr/local/lib/python3.7/site-packages/pdfquery/pdfquery.py", line 608, in <genexpr>
    return (self.get_layout(page) for page in self._cached_pages())
  File "/usr/local/lib/python3.7/site-packages/pdfquery/pdfquery.py", line 603, in get_layout
    layout = self._add_annots(layout, page.annots)
  File "/usr/local/lib/python3.7/site-packages/pdfquery/pdfquery.py", line 647, in _add_annots
    annot = self._set_hwxy_attrs(annot)
  File "/usr/local/lib/python3.7/site-packages/pdfquery/pdfquery.py", line 665, in _set_hwxy_attrs
    attr['x0'] = bbox[0]
TypeError: 'PDFObjRef' object does not support indexing
