Closed
Description
Versions
- pymupdf==1.26.6
- pymupdf-layout==1.26.6
- pymupdf4llm==0.2.0
Steps to reproduce
import pymupdf.layout
import pymupdf
import pymupdf4llm
doc = pymupdf.open('file.pdf')
md = pymupdf4llm.to_markdown(doc)
Error encountered
Traceback (most recent call last):
  File "/opt/ori/lab/show.py", line 7, in <module>
    md = pymupdf4llm.to_markdown(doc)
  File "/usr/local/lib/python3.10/site-packages/pymupdf4llm/__init__.py", line 71, in to_markdown
    parsed_doc = parse_document(
  File "/usr/local/lib/python3.10/site-packages/pymupdf4llm/__init__.py", line 30, in parse_document
    return DL.parse_document(
  File "/usr/local/lib/python3.10/site-packages/pymupdf4llm/helpers/document_layout.py", line 716, in parse_document
    page.layout_information = utils.find_reading_order(
  File "/usr/local/lib/python3.10/site-packages/pymupdf4llm/helpers/utils.py", line 409, in find_reading_order
    min(b[0] for b in body_boxes),
ValueError: min() arg is an empty sequence
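The last frame suggests that `body_boxes` was empty for at least one page, so `min()` had nothing to iterate over. The snippet below is only a minimal sketch of that failure mode and of one possible guard. It does not reproduce the actual code of `find_reading_order`; the helper name `leftmost_x0` and the sample boxes are hypothetical.

```python
def leftmost_x0(body_boxes, default=None):
    """Return the smallest x0 among the boxes, or `default` if there are none.

    Hypothetical helper for illustration; not part of pymupdf4llm.
    """
    # min() accepts a `default` that is returned for an empty iterable
    # instead of raising ValueError.
    return min((b[0] for b in body_boxes), default=default)


body_boxes = []  # assumption: a page on which no body boxes were detected

# Unguarded call, same shape as the expression in the traceback.
raised = False
try:
    min(b[0] for b in body_boxes)
except ValueError:
    raised = True

# Guarded call: returns None instead of raising.
left_edge = leftmost_x0(body_boxes)

# With boxes present, the helper behaves like a plain min().
sample_left = leftmost_x0([(30.0, 10.0, 200.0, 50.0), (12.5, 60.0, 180.0, 90.0)])
```

Whether an empty `body_boxes` should yield a default value or cause the page to be skipped is a decision for the library maintainers; the sketch only shows that the empty case needs explicit handling.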
Notes
The PDF referred to above (file.pdf) is attached.