Error #636

Puneet0353 · 2022-04-05T03:14:11Z

Describe the bug

ValueError: not enough values to unpack (expected 2, got 1)

The complete traceback is:

ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_14976/3193579313.py in <module>
5 #Get Basic Data and convert them to Dictionary
6 page = pdf.pages[1]
----> 7 Page1_Tables = page.extract_tables()
8 input(Page1_Tables)
9 B1 = pd.DataFrame(Page1_Tables[0])

~\anaconda3\lib\site-packages\pdfplumber\page.py in extract_tables(self, table_settings)
223 def extract_tables(self, table_settings={}):
224 table_settings = TableFinder.resolve_table_settings(table_settings)
--> 225 tables = self.find_tables(table_settings)
226
227 extract_kwargs = dict(

~\anaconda3\lib\site-packages\pdfplumber\page.py in find_tables(self, table_settings)
219
220 def find_tables(self, table_settings={}):
--> 221 return TableFinder(self, table_settings).tables
222
223 def extract_tables(self, table_settings={}):

~\anaconda3\lib\site-packages\pdfplumber\table.py in __init__(self, page, settings)
472 self.page = page
473 self.settings = self.resolve_table_settings(settings)
--> 474 self.edges = self.get_edges()
475 self.intersections = edges_to_intersections(
476 self.edges,

~\anaconda3\lib\site-packages\pdfplumber\table.py in get_edges(self)
568
569 if v_strat == "lines":
--> 570 v_base = utils.filter_edges(self.page.edges, "v")
571 elif v_strat == "lines_strict":
572 v_base = utils.filter_edges(self.page.edges, "v", edge_type="line")

~\anaconda3\lib\site-packages\pdfplumber\container.py in edges(self)
77 if hasattr(self, "_edges"):
78 return self._edges
---> 79 line_edges = list(map(utils.line_to_edge, self.lines))
80 self._edges = self.rect_edges + line_edges
81 return self._edges

~\anaconda3\lib\site-packages\pdfplumber\container.py in lines(self)
35 @property
36 def lines(self):
---> 37 return self.objects.get("line", [])
38
39 @property

~\anaconda3\lib\site-packages\pdfplumber\page.py in objects(self)
150 if hasattr(self, "_objects"):
151 return self._objects
--> 152 self._objects = self.parse_objects()
153 return self._objects
154

~\anaconda3\lib\site-packages\pdfplumber\page.py in parse_objects(self)
206 def parse_objects(self):
207 objects = {}
--> 208 for obj in self.iter_layout_objects(self.layout._objs):
209 kind = obj["object_type"]
210 if kind in ["anno"]:

~\anaconda3\lib\site-packages\pdfplumber\page.py in layout(self)
96 )
97 interpreter = PDFPageInterpreter(self.pdf.rsrcmgr, device)
---> 98 interpreter.process_page(self.page_obj)
99 self._layout = device.get_result()
100 return self._layout

~\anaconda3\lib\site-packages\pdfminer\pdfinterp.py in process_page(self, page)
1003 ctm = (1, 0, 0, 1, -x0, -y0)
1004 self.device.begin_page(page, ctm)
-> 1005 self.render_contents(page.resources, page.contents, ctm=ctm)
1006 self.device.end_page(page)
1007 return

~\anaconda3\lib\site-packages\pdfminer\pdfinterp.py in render_contents(self, resources, streams, ctm)
1021 self.init_resources(resources)
1022 self.init_state(ctm)
-> 1023 self.execute(list_value(streams))
1024 return
1025

~\anaconda3\lib\site-packages\pdfminer\pdfinterp.py in execute(self, streams)
1049 else:
1050 log.debug('exec: %s', name)
-> 1051 func()
1052 else:
1053 if settings.STRICT:

~\anaconda3\lib\site-packages\pdfminer\pdfinterp.py in do_s(self)
584 """Close and stroke path"""
585 self.do_h()
--> 586 self.do_S()
587 return
588

~\anaconda3\lib\site-packages\pdfminer\pdfinterp.py in do_S(self)
576 def do_S(self) -> None:
577 """Stroke path"""
--> 578 self.device.paint_path(self.graphicstate, True, False, False,
579 self.curpath)
580 self.curpath = []

~\anaconda3\lib\site-packages\pdfminer\converter.py in paint_path(self, gstate, stroke, fill, evenodd, path)
119 raw_pts = [cast(Point, p[-2:] if p[0] != 'h' else path[0][-2:])
120 for p in path]
--> 121 pts = [apply_matrix_pt(self.ctm, pt) for pt in raw_pts]
122
123 if shape in {'mlh', 'ml'}:

~\anaconda3\lib\site-packages\pdfminer\converter.py in <listcomp>(.0)
119 raw_pts = [cast(Point, p[-2:] if p[0] != 'h' else path[0][-2:])
120 for p in path]
--> 121 pts = [apply_matrix_pt(self.ctm, pt) for pt in raw_pts]
122
123 if shape in {'mlh', 'ml'}:

~\anaconda3\lib\site-packages\pdfminer\utils.py in apply_matrix_pt(m, v)
251 def apply_matrix_pt(m: Matrix, v: Point) -> Point:
252 (a, b, c, d, e, f) = m
--> 253 (x, y) = v
254 """Applies a matrix to a point."""
255 return a * x + c * y + e, b * x + d * y + f

ValueError: not enough values to unpack (expected 2, got 1)

Code to reproduce the problem

import pdfplumber
FILE = "D:\\Astro\\Charts\\" + Name + ".pdf"
pdf = pdfplumber.open(FILE)
#Get Basic Data and convert them to Dictionary
page = pdf.pages[1]
Page1_Tables = page.extract_tables()

Expected behavior

It should have extracted the tables. This was working fine; however, after I reinstalled Anaconda with Python 3.9, this error started appearing.

Additional context

Ajay VR Detailed.pdf

jsvine · 2022-04-11T13:19:37Z

Hi @Puneet0353, and thanks for sharing this interesting example. I have examined the file and the error, and have come to the following conclusions:

Per the traceback you've pasted above (and which I've confirmed), the error is raised by pdfminer.six, the library we use to extract the raw object information from the PDFs. So this isn't an issue that can be resolved directly through pdfplumber.
pdfminer.six appears to raise the error due to an unusual graphics command in the PDF. I'm not entirely sure whether the PDF is malformed or whether it's just unusual. In either case, the PDF appears to parse cleanly if you first repair it with Ghostscript:

 gs \
  -o "Ajay VR Detailed-repaired.pdf" \
  -sDEVICE=pdfwrite \
  -dPDFSETTINGS=/prepress \
  "Ajay VR Detailed.pdf"

I hope that helps. In the meantime, I plan to investigate whether there's a way to improve pdfminer.six's handling of the graphics command in your PDF, and will submit a PR on that repository if I find a solution.
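
For anyone who prefers to run the repair step from Python rather than the shell, the following is a minimal sketch of the same Ghostscript invocation. The helper names (`build_gs_command`, `repair_pdf`, `extract_tables_after_repair`) and the `gs_executable` parameter are illustrative assumptions, not part of pdfplumber; it also assumes Ghostscript is installed and on the PATH.

```python
import subprocess


def build_gs_command(src, dst, gs_executable="gs"):
    """Build the Ghostscript argument list used in the comment above.

    gs_executable is an assumption: on Windows the console binary is
    usually named gswin64c rather than gs.
    """
    return [
        gs_executable,
        "-o", dst,
        "-sDEVICE=pdfwrite",
        "-dPDFSETTINGS=/prepress",
        src,
    ]


def repair_pdf(src, dst, gs_executable="gs"):
    """Rewrite src into dst with Ghostscript and return dst."""
    # check=True raises CalledProcessError if Ghostscript exits non-zero.
    subprocess.run(build_gs_command(src, dst, gs_executable), check=True)
    return dst


def extract_tables_after_repair(src, dst, page_index=1):
    """Repair the PDF first, then extract tables from one page."""
    import pdfplumber  # imported lazily so the helpers above work without it

    repaired = repair_pdf(src, dst)
    with pdfplumber.open(repaired) as pdf:
        return pdf.pages[page_index].extract_tables()
```

With the file from this issue, the call would be `extract_tables_after_repair("Ajay VR Detailed.pdf", "Ajay VR Detailed-repaired.pdf")`.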

Puneet0353 · 2022-04-11T16:55:48Z

Thanks so much. Installing the previous version of pdfplumber resolved the issue, though.
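
The thread does not record which pdfplumber or pdfminer.six versions were involved before and after the reinstall. As a hedged sketch, the snippet below reports whatever is currently installed, which makes it easier to compare a working environment with a failing one. The helper name `installed_version` is an illustrative assumption.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or a marker if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


# Distribution names as published on PyPI.
for name in ("pdfplumber", "pdfminer.six"):
    print(name, installed_version(name))
```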

Puneet0353 added the bug label Apr 5, 2022

jsvine closed this as completed Apr 11, 2022

jsvine mentioned this issue Apr 22, 2022, in pdfminer/pdfminer.six#749, "Ignore path constructors that do not begin with m" (merged)
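
The title of the linked pull request suggests one way the unpacking error in the traceback can arise. The sketch below copies `apply_matrix_pt` and the `raw_pts` expression from the traceback and feeds them a hypothetical path that starts with `h` (close path) instead of `m` (move to). The actual graphics commands in the attached PDF are not shown in this thread, so `bad_path` is an assumption for illustration only.

```python
# apply_matrix_pt as shown in the traceback (pdfminer/utils.py).
def apply_matrix_pt(m, v):
    (a, b, c, d, e, f) = m
    (x, y) = v
    return a * x + c * y + e, b * x + d * y + f


def raw_points(path):
    # Same expression as converter.py lines 119-120 in the traceback,
    # without the typing cast.
    return [p[-2:] if p[0] != "h" else path[0][-2:] for p in path]


identity = (1, 0, 0, 1, 0, 0)

# A well-formed path: move to, line to, close path.
good_path = [("m", 0.0, 0.0), ("l", 10.0, 0.0), ("h",)]
print([apply_matrix_pt(identity, pt) for pt in raw_points(good_path)])

# Hypothetical malformed path: it begins with "h", so path[0][-2:] is
# the one-element tuple ("h",) and cannot be unpacked into (x, y).
bad_path = [("h",)]
try:
    [apply_matrix_pt(identity, pt) for pt in raw_points(bad_path)]
except ValueError as exc:
    print("ValueError:", exc)
```

Under this assumption, the second case fails with the same "expected 2, got 1" message reported at the top of the issue.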
