Description of the bug
The PDF that triggers the error is attached:
part_4.pdf
How can I skip the page that raises this error and continue with the remaining pages? Thanks.
How to reproduce the bug
Log:
Traceback (most recent call last):
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/runpy.py", line 198, in _run_module_as_main
return _run_code(code, main_globals, None,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/runpy.py", line 88, in _run_code
exec(code, run_globals)
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/main.py", line 71, in <module>
cli.main()
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 501, in main
run()
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 351, in run_file
runpy.run_path(target, run_name="__main__")
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 310, in run_path
return _run_module_code(code, init_globals, run_name, pkg_name=pkg_name, script_name=fname)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 127, in _run_module_code
_run_code(code, mod_globals, init_globals, mod_name, mod_spec, pkg_name, script_name)
File "/root/.vscode-server/extensions/ms-python.debugpy-2024.12.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 118, in _run_code
exec(code, run_globals)
File "/usr/mineru_test/mineru_project/mineru_project_python/minerU/magic_pdf_parse_main.py", line 151, in <module>
pdf_parse_main(pdf_path, output_dir="./out")
File "/usr/mineru_test/mineru_project/mineru_project_python/minerU/magic_pdf_parse_main.py", line 108, in pdf_parse_main
pipe.pipe_classify()
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/magic_pdf/pipe/UNIPipe.py", line 25, in pipe_classify
self.pdf_type = AbsPipe.classify(self.pdf_bytes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/magic_pdf/pipe/AbsPipe.py", line 63, in classify
pdf_meta = pdf_meta_scan(pdf_bytes)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/magic_pdf/filter/pdf_meta_scan.py", line 331, in pdf_meta_scan
image_info_per_page, junk_img_bojids = get_image_info(doc, page_width_pts, page_height_pts)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/magic_pdf/filter/pdf_meta_scan.py", line 119, in get_image_info
page_result = process_image(page, junk_img_bojids)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/magic_pdf/filter/pdf_meta_scan.py", line 39, in process_image
recs = page.get_image_rects(img, transform=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/pymupdf/utils.py", line 879, in get_image_rects
pix = pymupdf.Pixmap(page.parent, xref) # make pixmap of the image to compute MD5
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/pymupdf/__init__.py", line 10110, in __init__
img = mupdf.pdf_load_image(pdf, ref)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/mineru_test/python/miniforge3/envs/mineru_project_py311/lib/python3.11/site-packages/pymupdf/mupdf.py", line 50901, in pdf_load_image
return _mupdf.pdf_load_image(doc, obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pymupdf.mupdf.FzErrorSyntax: code=8: Failed to decode JPX image
Is there a way to skip the failing page and continue with the rest? Looking forward to your reply. Thanks.
PyMuPDF version
1.24.14
Operating system
Linux
Python version
3.11
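One possible way to approach the question above is to wrap the per-page work in a `try`/`except` so that a page whose image cannot be decoded is recorded and skipped instead of aborting the whole run. The sketch below is only an illustration under assumptions: `process_pages_skipping_failures` is a hypothetical helper, not part of PyMuPDF or magic_pdf, and it catches `Exception` broadly on the assumption that `FzErrorSyntax` derives from it. In the traceback above the failure is raised inside magic_pdf's `pdf_meta_scan`, so a guard like this would have to be applied there, or used beforehand to find the offending pages.

```python
from typing import Any, Callable, Dict, Iterable, Tuple, Type


def process_pages_skipping_failures(
    pages: Iterable[Any],
    process_page: Callable[[Any], Any],
    errors: Tuple[Type[BaseException], ...] = (Exception,),
) -> Tuple[Dict[int, Any], Dict[int, BaseException]]:
    """Run process_page on every page and keep going when one page fails.

    Hypothetical helper for illustration. Returns two dicts keyed by the
    zero-based page index: successful results and the exceptions raised.
    """
    results: Dict[int, Any] = {}
    failed: Dict[int, BaseException] = {}
    for index, page in enumerate(pages):
        try:
            results[index] = process_page(page)
        except errors as exc:
            # Record the failure (for example "Failed to decode JPX image")
            # and continue with the next page.
            failed[index] = exc
    return results, failed


# Possible hookup with PyMuPDF (an untested assumption, not run here):
#   import pymupdf
#   doc = pymupdf.open("part_4.pdf")
#   rects, failed = process_pages_skipping_failures(
#       doc,
#       lambda page: [page.get_image_rects(img, transform=True)
#                     for img in page.get_images(full=True)],
#   )
#   print("skipped pages:", sorted(failed))
```

The helper keeps the failing page indices so they can be reported or reprocessed later, rather than silently dropping them. Narrowing `errors` to the specific exception type, once confirmed, would avoid hiding unrelated bugs.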