subset_fonts error exit without exception/warning #3470
Comments
This post cannot be accepted without a reproducing file.
Running doc.subset_fonts() on the attached file produces an error. With the fallback, doc.subset_fonts() raises the same error. Under the new version (without the fallback), no error is raised, but the file written by doc.save() after doc.subset_fonts() has scrambled words.
The MuPDF team has developed a fix which I am currently testing.
Update: fix developed.
I have a possibly related issue where 1.24.3 leaves some miscellaneous characters on the page, which go away if I stop using subset_fonts. I haven't narrowed it down to an MWE yet, but one difference is that I DO NOT get an error with older PyMuPDF, so it might not be quite the same issue. More to follow. Downstream issue: https://gitlab.com/plom/plom/-/issues/3374
Fixes Issue #3374, by falling back on the deprecated in-python fonttools based technique for doing subsetting. To be removed once the new MuPDF-based code is a little more mature, or at least once [1, 2] are fixed. [1] pymupdf/PyMuPDF#3470 [2] pymupdf/PyMuPDF#3494
Description of the bug
In the new PyMuPDF 1.24.3, if any error occurs in doc.subset_fonts(), the process ends without any warning or error number. In PyMuPDF 1.23.26, the same doc.subset_fonts() call raises an error instead.
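One way to observe a silent exit like this is to run the call in a child interpreter and inspect how it ended. The following is only a sketch: `run_isolated`, `describe`, and the `input.pdf` path are hypothetical names introduced here for illustration, not part of PyMuPDF or of the original report.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=120):
    """Run a Python snippet in a child interpreter and report how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Hypothetical reproduction snippet; "input.pdf" is a placeholder path.
SUBSET_SNIPPET = """
import fitz
doc = fitz.open("input.pdf")
doc.subset_fonts()
print("subset_fonts returned normally")
"""


def describe(returncode, stdout):
    """Classify the child's exit using the marker line printed after the call."""
    if returncode == 0 and "returned normally" in stdout:
        return "ok"
    if returncode == 0:
        return "exited with code 0 before reaching the marker"
    return f"abnormal exit, return code {returncode}"
```

Calling `describe(*run_isolated(SUBSET_SNIPPET)[:2])` would then distinguish a normal return from a process that died before printing the marker, and the captured stderr shows whether any traceback was written at all.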
How to reproduce the bug
In PyMuPDF 1.23.26:
Traceback (most recent call last):
  File "C:_a\PDF_Searchable_v1.py", line 346, in pdfSearhable4
    doc.subset_fonts()
  File "C:\Users\6\AppData\Local\Programs\Python\Python310\lib\site-packages\fitz\utils.py", line 5631, in subset_fonts
    width_table, def_width = get_old_widths(font_xref)
  File "C:\Users\6\AppData\Local\Programs\Python\Python310\lib\site-packages\fitz\utils.py", line 5350, in get_old_widths
    df_xref = int(df[1][1:-1].replace("0 R", ""))
ValueError: invalid literal for int() with base 10: '<</BaseFont/CIDFont+F1/CIDSystemInfo<</Ordering 97 /Registry 98 /Supplement 0>>/CIDToGIDMap/Identity/FontDescriptor<</Ascent 952/CapHeight 631/Descent -268/Flags 6/FontBBox 99 /FontFile2 100 /FontNam
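Judging only from the traceback, the failing expression appears to expect a value of the form `[97 0 R]` (an array holding one indirect reference), while this file seems to supply an inline dictionary inside the array. The sketch below illustrates that difference; `descendant_xref` is a hypothetical helper, not PyMuPDF code, and it assumes the generation number is always 0, as the original expression does.

```python
import re

# Matches an array that contains exactly one indirect reference, e.g. "[97 0 R]".
_INDIRECT_REF = re.compile(r"^\[\s*(\d+)\s+0\s+R\s*\]$")


def descendant_xref(value):
    """Return the xref number from a value like "[97 0 R]".

    Returns None when the array holds something else, such as an
    inline dictionary, instead of raising ValueError.
    """
    match = _INDIRECT_REF.match(value.strip())
    return int(match.group(1)) if match else None


def original_parse(value):
    """The expression from the traceback, applied to the raw string value."""
    return int(value[1:-1].replace("0 R", ""))
```

With an indirect reference both functions return the xref number. With an inline dictionary, `original_parse` raises the `ValueError` seen above, while `descendant_xref` returns `None` so the caller can decide how to handle that case.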
PyMuPDF version
1.24.3
Operating system
Windows
Python version
3.10