Open
Labels: postpone (postpone to a future version)
Description of the bug
Memory usage in my application grows constantly, and I traced the growth to the insert_htmlbox call. If I remove that call, memory stays stable.
How to reproduce the bug
Here is a minimal script that shows the memory growth:
import os
import psutil
import pymupdf
import gc
import fitz
print("Initial memory used (MB):", psutil.Process(os.getpid()).memory_info().rss / 1024**2)
text_block = (
    {
        "text": "Table 19: Human Development Index (HDI)",
        "translated_text": "Table 19: Human Development Index (HDI)",
        "font": "Verdana,Bold",
        "size": 10.979999542236328,
        "bold": True,
        "italic": False,
        "underline": False,
        "span": {
            "size": 10.979999542236328,
            "flags": 16,
            "bidi": 0,
            "char_flags": 24,
            "font": "Verdana,Bold",
            "color": 0,
            "alpha": 255,
            "ascender": 1.0720000267028809,
            "descender": -0.30300000309944153,
            "text": "Table 19: Human Development Index (HDI) ",
            "origin": [90.0, 86.4000244140625],
            "bbox": [90.0, 74.62946319580078, 368.0684814453125, 89.72696685791016],
        },
        "color": 0,
    },
)
rect = pymupdf.Rect(text_block[0]["span"]["bbox"])
doc = pymupdf.open()
page = doc.new_page(width=rect.width, height=rect.height)
for runtime in range(200):
    page.add_redact_annot(rect, fill=None)  # redaction without a fill color
    page.apply_redactions(images=fitz.PDF_REDACT_IMAGE_NONE)  # leave images untouched
    page.insert_htmlbox(rect, f"<div style='font-size: {text_block[0]['size']}px; font-weight: {'bold' if text_block[0]['bold'] else 'normal'}; font-style: {'italic' if text_block[0]['italic'] else 'normal'}; text-decoration: {'underline' if text_block[0]['underline'] else 'none'}; font-family: {text_block[0]['font']}; color: #{text_block[0]['color']:06x};'>{text_block[0]['translated_text']}</div>")
    if runtime % 10 == 0:
        print(f"Memory used (MB) after {runtime} insertions:", psutil.Process(os.getpid()).memory_info().rss / 1024**2)
doc.subset_fonts()
fitz.TOOLS.store_shrink(100)
gc.collect() # Force garbage collection
print("Final memory used (MB):", psutil.Process(os.getpid()).memory_info().rss / 1024**2)
doc.ez_save("leak_memory.pdf")
The memory growth comes from the page.insert_htmlbox(...) call inside the loop in the code above.
Below is the result before/after I commented it out:
| Before removing insert_htmlbox | After removing insert_htmlbox |
|---|---|
| (screenshot not preserved) | (screenshot not preserved) |
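The before/after comparison can also be put into numbers. The following is a minimal sketch, not part of the original report: the helper names `sample_memory_growth` and `growth_per_iteration` are hypothetical, and the memory reader is passed in as a function so the helper does not depend on any particular library. It runs an action repeatedly, samples memory at intervals, and estimates the growth per iteration as a least-squares slope.

```python
import gc
from typing import Callable, List, Tuple


def sample_memory_growth(
    action: Callable[[], None],
    read_memory_mb: Callable[[], float],
    iterations: int = 200,
    sample_every: int = 10,
) -> List[Tuple[int, float]]:
    """Run `action` repeatedly and record (iteration, memory_mb) samples."""
    samples: List[Tuple[int, float]] = []
    for i in range(iterations):
        action()
        if i % sample_every == 0:
            gc.collect()  # drop garbage that Python could still reclaim
            samples.append((i, read_memory_mb()))
    return samples


def growth_per_iteration(samples: List[Tuple[int, float]]) -> float:
    """Least-squares slope of memory (MB) versus iteration index."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    den = sum((x - mean_x) ** 2 for x, _ in samples)
    return num / den
```

With the script above, `read_memory_mb` could be `lambda: psutil.Process(os.getpid()).memory_info().rss / 1024**2`, and `action` a function wrapping the three page calls from the loop. Running it once with and once without `insert_htmlbox` gives two slopes to compare. A slope near zero suggests stable memory. A clearly positive slope in MB per iteration is consistent with a leak, although RSS is noisy and allocator caching can also produce some growth.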
PyMuPDF version
1.26.4
Operating system
macOS
Python version
3.13

