Fix pdf_add_stream metadata error #3112

Merged 1 commit into pymupdf:main on Jan 30, 2024

clpo13 (Contributor) commented Jan 29, 2024

I ran into a problem with the recode_pdf tool from archive-pdf-tools, which uses PyMuPDF. The error occurs with PyMuPDF 1.23.9 and later:

$ recode_pdf --from-imagestack test001.tif --hocr-file test001.hocr --dpi 400 -o test001.pdf
Traceback (most recent call last):
  File "/home/cody/ocrtest/venv/bin/recode_pdf", line 301, in <module>
    res = recode(args.from_pdf, args.from_imagestack, args.dpi, args.hocr_file,
  File "/home/cody/ocrtest/venv/lib/python3.10/site-packages/internetarchivepdf/recode.py", line 749, in recode
    write_metadata(in_pdf, outdoc, extra_metadata=extra_metadata)
  File "/home/cody/ocrtest/venv/lib/python3.10/site-packages/internetarchivepdf/pdfhacks.py", line 516, in write_metadata
    to_pdf.set_xml_metadata(stream)
  File "/home/cody/ocrtest/venv/lib/python3.10/site-packages/fitz/__init__.py", line 5580, in set_xml_metadata
    xml = mupdf.pdf_add_stream( pdf, res, None, 0)
  File "/home/cody/ocrtest/venv/lib/python3.10/site-packages/fitz/mupdf.py", line 45040, in pdf_add_stream
    return _mupdf.pdf_add_stream(doc, buf, obj, compressed)
ValueError: invalid null reference in method 'pdf_add_stream', argument 3 of type 'mupdf::PdfObj const &'

The change in this PR follows the pattern of the other calls to mupdf.pdf_add_stream in src/__init__.py, but if there's a better way to do it, let me know.
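
For readers without the diff at hand, here is a minimal sketch of the failure mode and of the kind of change described above. It does not use PyMuPDF: `PdfObj`, `pdf_add_stream`, and both `set_xml_metadata_*` functions below are hypothetical stand-ins that only mimic the behaviour shown in the traceback, and the "after" variant (passing an empty wrapper object instead of None) is an assumption about the shape of the fix, not a copy of the merged commit.

```python
# Stand-ins only: these mimic the behaviour seen in the traceback.
# They are NOT the real PyMuPDF / MuPDF bindings.


class PdfObj:
    """Hypothetical stand-in for a wrapper object; empty by default."""

    def __init__(self, value=None):
        self.value = value


def pdf_add_stream(doc, buf, obj, compressed):
    """Stand-in for a binding whose third parameter is a C++ reference.

    A C++ reference cannot be null, so None is rejected here,
    mirroring the ValueError in the traceback.
    """
    if obj is None:
        raise ValueError(
            "invalid null reference in method 'pdf_add_stream', argument 3"
        )
    doc.append((bytes(buf), obj.value, compressed))
    return len(doc) - 1  # index of the newly added stream


def set_xml_metadata_before(doc, xml):
    # Pattern from the traceback: None passed for the reference argument.
    return pdf_add_stream(doc, xml.encode(), None, 0)


def set_xml_metadata_after(doc, xml):
    # Assumed shape of the fix: pass an empty wrapper object, not None.
    return pdf_add_stream(doc, xml.encode(), PdfObj(), 0)
```

Calling `set_xml_metadata_before([], "<x/>")` raises the ValueError, while `set_xml_metadata_after` adds the stream and returns its index.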

github-actions bot (Contributor) commented Jan 29, 2024

CLA Assistant Lite bot: All contributors have signed the CLA ✍️ ✅

clpo13 (Contributor, Author) commented Jan 29, 2024

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Jan 29, 2024
clpo13 (Contributor, Author) commented Jan 29, 2024

recheck

julian-smith-artifex-com (Collaborator) commented

This is great, many thanks for the fix. I will merge it.

[I have a new test in my tree that fails without your fix, which I'll push later.]

julian-smith-artifex-com merged commit f2442e6 into pymupdf:main on Jan 30, 2024
2 checks passed
github-actions bot locked and limited conversation to collaborators on Jan 30, 2024
clpo13 deleted the patch-1 branch on January 31, 2024 18:20