Error #636
Comments
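Hi @Puneet0353, and thanks for sharing this interesting example. I have examined the file and the error, and have come to the following conclusions:
- Per the traceback you've pasted above (and which I've confirmed), the error is raised by pdfminer.six, the library we use to extract the raw object information from the PDFs. So this isn't an issue that can be resolved directly through pdfplumber.
- pdfminer.six appears to raise the error due to an unusual graphics command in the PDF. I'm not entirely sure whether the PDF is malformed or whether it's just unusual. In either case, the PDF appears to parse cleanly if you first repair it with Ghostscript:
gs \
-o "Ajay VR Detailed-repaired.pdf" \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/prepress \
"Ajay VR Detailed.pdf"
I hope that helps. In the meantime, I plan to investigate whether there's a way to improve pdfminer.six's handling of the graphics command in your PDF, and will submit a PR on that repository if I find a solution.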
Thanks so much.
Installing the previous version of pdfplumber resolved the issue, though.
On Mon, 11 Apr, 2022, 18:49 Jeremy Singer-Vine wrote:
Hi @Puneet0353 <https://github.com/Puneet0353>, and thanks for sharing
this interesting example. I have examined the file and the error, and have
come to the following conclusions:
-
Per the traceback you've pasted above (and which I've confirmed), the
error is raised by pdfminer.six
<https://github.com/pdfminer/pdfminer.six>, the library we use to
extract the raw object information from the PDFs. So this isn't an issue
that can be resolved directly through pdfplumber.
-
pdfminer.six appears to raise the error due to an unusual graphics
command in the PDF. I'm not entirely sure whether the PDF is malformed or
whether it's just unusual. In either case, the PDF appears to parse cleanly
if you first repair it with GhostScript
<https://superuser.com/questions/278562/how-can-i-fix-repair-a-corrupted-pdf-file>
:
gs \
-o "Ajay VR Detailed-repaired.pdf" \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/prepress \
"Ajay VR Detailed.pdf"
I hope that helps. In the meantime, I plan to investigate whether there's
a way to improve pdfminer.six's handling of the graphics command in your
PDF, and will submit a PR on that repository if I find a solution.
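For anyone who needs to apply the Ghostscript repair from a script rather than the shell, here is a minimal sketch of the same command driven from Python. It assumes a `gs` executable is on the PATH; the helper names `build_gs_repair_command` and `repair_pdf` are made up for illustration and are not part of pdfplumber or pdfminer.six.

```python
import shutil
import subprocess
from pathlib import Path


def build_gs_repair_command(src, dst, gs_binary="gs"):
    """Return the Ghostscript argument list that rewrites src into dst."""
    return [
        gs_binary,
        "-o", str(dst),
        "-sDEVICE=pdfwrite",
        "-dPDFSETTINGS=/prepress",
        str(src),
    ]


def repair_pdf(src, dst=None, gs_binary="gs"):
    """Rewrite src with Ghostscript and return the path of the repaired copy."""
    src = Path(src)
    # Default output name mirrors the "-repaired" suffix used in the command above.
    dst = Path(dst) if dst else src.with_name(src.stem + "-repaired.pdf")
    if shutil.which(gs_binary) is None:
        raise FileNotFoundError(
            f"Ghostscript executable {gs_binary!r} not found on PATH"
        )
    # check=True raises CalledProcessError if Ghostscript exits with an error.
    subprocess.run(build_gs_repair_command(src, dst, gs_binary), check=True)
    return dst
```

The returned path can then be passed to `pdfplumber.open(...)` in place of the original file. On Windows the console executable is usually named differently (for example `gswin64c`), so `gs_binary` may need adjusting.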
Describe the bug
A clear and concise description of what the bug is.
ValueError: not enough values to unpack (expected 2, got 1)
The complete details of the error are:
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_14976/3193579313.py in <module>
5 #Get Basic Data and convert them to Dictionary
6 page = pdf.pages[1]
----> 7 Page1_Tables = page.extract_tables()
8 input(Page1_Tables)
9 B1 = pd.DataFrame(Page1_Tables[0])
~\anaconda3\lib\site-packages\pdfplumber\page.py in extract_tables(self, table_settings)
223 def extract_tables(self, table_settings={}):
224 table_settings = TableFinder.resolve_table_settings(table_settings)
--> 225 tables = self.find_tables(table_settings)
226
227 extract_kwargs = dict(
~\anaconda3\lib\site-packages\pdfplumber\page.py in find_tables(self, table_settings)
219
220 def find_tables(self, table_settings={}):
--> 221 return TableFinder(self, table_settings).tables
222
223 def extract_tables(self, table_settings={}):
~\anaconda3\lib\site-packages\pdfplumber\table.py in __init__(self, page, settings)
472 self.page = page
473 self.settings = self.resolve_table_settings(settings)
--> 474 self.edges = self.get_edges()
475 self.intersections = edges_to_intersections(
476 self.edges,
~\anaconda3\lib\site-packages\pdfplumber\table.py in get_edges(self)
568
569 if v_strat == "lines":
--> 570 v_base = utils.filter_edges(self.page.edges, "v")
571 elif v_strat == "lines_strict":
572 v_base = utils.filter_edges(self.page.edges, "v", edge_type="line")
~\anaconda3\lib\site-packages\pdfplumber\container.py in edges(self)
77 if hasattr(self, "_edges"):
78 return self._edges
---> 79 line_edges = list(map(utils.line_to_edge, self.lines))
80 self._edges = self.rect_edges + line_edges
81 return self._edges
~\anaconda3\lib\site-packages\pdfplumber\container.py in lines(self)
35 @property
36 def lines(self):
---> 37 return self.objects.get("line", [])
38
39 @property
~\anaconda3\lib\site-packages\pdfplumber\page.py in objects(self)
150 if hasattr(self, "_objects"):
151 return self._objects
--> 152 self._objects = self.parse_objects()
153 return self._objects
154
~\anaconda3\lib\site-packages\pdfplumber\page.py in parse_objects(self)
206 def parse_objects(self):
207 objects = {}
--> 208 for obj in self.iter_layout_objects(self.layout._objs):
209 kind = obj["object_type"]
210 if kind in ["anno"]:
~\anaconda3\lib\site-packages\pdfplumber\page.py in layout(self)
96 )
97 interpreter = PDFPageInterpreter(self.pdf.rsrcmgr, device)
---> 98 interpreter.process_page(self.page_obj)
99 self._layout = device.get_result()
100 return self._layout
~\anaconda3\lib\site-packages\pdfminer\pdfinterp.py in process_page(self, page)
1003 ctm = (1, 0, 0, 1, -x0, -y0)
1004 self.device.begin_page(page, ctm)
-> 1005 self.render_contents(page.resources, page.contents, ctm=ctm)
1006 self.device.end_page(page)
1007 return
~\anaconda3\lib\site-packages\pdfminer\pdfinterp.py in render_contents(self, resources, streams, ctm)
1021 self.init_resources(resources)
1022 self.init_state(ctm)
-> 1023 self.execute(list_value(streams))
1024 return
1025
~\anaconda3\lib\site-packages\pdfminer\pdfinterp.py in execute(self, streams)
1049 else:
1050 log.debug('exec: %s', name)
-> 1051 func()
1052 else:
1053 if settings.STRICT:
~\anaconda3\lib\site-packages\pdfminer\pdfinterp.py in do_s(self)
584 """Close and stroke path"""
585 self.do_h()
--> 586 self.do_S()
587 return
588
~\anaconda3\lib\site-packages\pdfminer\pdfinterp.py in do_S(self)
576 def do_S(self) -> None:
577 """Stroke path"""
--> 578 self.device.paint_path(self.graphicstate, True, False, False,
579 self.curpath)
580 self.curpath = []
~\anaconda3\lib\site-packages\pdfminer\converter.py in paint_path(self, gstate, stroke, fill, evenodd, path)
119 raw_pts = [cast(Point, p[-2:] if p[0] != 'h' else path[0][-2:])
120 for p in path]
--> 121 pts = [apply_matrix_pt(self.ctm, pt) for pt in raw_pts]
122
123 if shape in {'mlh', 'ml'}:
~\anaconda3\lib\site-packages\pdfminer\converter.py in <listcomp>(.0)
119 raw_pts = [cast(Point, p[-2:] if p[0] != 'h' else path[0][-2:])
120 for p in path]
--> 121 pts = [apply_matrix_pt(self.ctm, pt) for pt in raw_pts]
122
123 if shape in {'mlh', 'ml'}:
~\anaconda3\lib\site-packages\pdfminer\utils.py in apply_matrix_pt(m, v)
251 def apply_matrix_pt(m: Matrix, v: Point) -> Point:
252 (a, b, c, d, e, f) = m
--> 253 (x, y) = v
254 """Applies a matrix to a point."""
255 return a * x + c * y + e, b * x + d * y + f
ValueError: not enough values to unpack (expected 2, got 1)
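The last frames of the traceback can be reproduced without a PDF. The sketch below copies `apply_matrix_pt` and the `raw_pts` expression from the frames shown above (minus the type annotations). The `bad_path` value, a path consisting only of a close-path `h` segment, is an assumption about one kind of input that would fail this way; it has not been confirmed against the attached file. The helper name `raw_points` is illustrative.

```python
def apply_matrix_pt(m, v):
    """Apply a 6-element affine matrix to a 2D point (as in pdfminer/utils.py above)."""
    (a, b, c, d, e, f) = m
    (x, y) = v
    return a * x + c * y + e, b * x + d * y + f


def raw_points(path):
    """Same expression as raw_pts in paint_path: an 'h' segment reuses the first point."""
    return [p[-2:] if p[0] != "h" else path[0][-2:] for p in path]


ctm = (1, 0, 0, 1, 0, 0)  # identity transform

# A well-formed closed path: move, line, close.
good_path = [("m", 0, 0), ("l", 10, 0), ("h",)]

# Hypothetical degenerate path: a close-path with no preceding move.
# path[0] is then ("h",) itself, so its last two items hold only one value.
bad_path = [("h",)]

if __name__ == "__main__":
    print([apply_matrix_pt(ctm, pt) for pt in raw_points(good_path)])
    try:
        [apply_matrix_pt(ctm, pt) for pt in raw_points(bad_path)]
    except ValueError as exc:
        print("ValueError:", exc)
```

With `bad_path`, the point handed to `apply_matrix_pt` has a single element, so the `(x, y) = v` unpacking fails in the same place as in the traceback.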
Code to reproduce the problem
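import pdfplumber

# Name is assumed to be defined earlier in the notebook.
# A raw string keeps the Windows backslashes literal.
FILE = r"D:\Astro\Charts" + "\\" + Name + ".pdf"
pdf = pdfplumber.open(FILE)
# Get basic data and convert it to a dictionary
page = pdf.pages[1]
Page1_Tables = page.extract_tables()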
Paste it here, or attach a Python file.
PDF file
Please attach any PDFs necessary to reproduce the problem.
If you need to redact text in a sensitive PDF, you can run it through JoshData/pdf-redactor.
Expected behavior
What did you expect the result should have been?
It should have extracted the tables. It was working fine; however, after I reinstalled Anaconda with Python 3.9, this problem started occurring.
Actual behavior
What actually happened, instead?
Screenshots
If applicable, add screenshots to help explain your problem.
Environment
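The error appeared after a reinstall and went away on an earlier pdfplumber release, so the versions of Python, pdfplumber, and pdfminer.six are the relevant details for this section. A minimal sketch for collecting them with the standard library only (the helper names are illustrative):

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a distribution, or a marker if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"


def environment_report(packages=("pdfplumber", "pdfminer.six")):
    """Build a short plain-text report suitable for pasting into an issue."""
    lines = [
        f"Python: {platform.python_version()}",
        f"OS: {platform.platform()}",
    ]
    lines += [f"{name}: {installed_version(name)}" for name in packages]
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```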
Additional context
Ajay VR Detailed.pdf
Add any other context/notes about the problem here.