Description of the bug
I have created the reproducible code below for my issue.
In a nutshell, I have a UI that allows a user to edit the form fields (pymupdf.Widget) on a PDF. How they edit them is not important; what matters is that each widget is annotated with its widget index on the page. Each page is converted to an image and rendered in the UI, with every widget labeled by its index inside a red box.
With some PDFs, the rendered image shows the widgets above the annotations on the z-axis, so the widgets cover the annotations, even though the widgets already existed in the PDF and the annotations were added afterwards.
Example:
To overcome this first issue, what I decided to do was:
- create a new blank PDF document, so that the original PDF is not affected when adding/deleting widgets/annotations
- copy the required page from the original PDF to the new empty PDF
- add the annotations for each widget on this copied page
- delete all the widgets on this copied page so that they do not cover the annotations
This works as expected for the first page. Strangely, for every page I subsequently render in the same way, the newly copied page has 0 widgets, even though the corresponding page in the original PDF has widgets.
After calling the function once (expected results):
After calling the function again, on both a different page and the same page (unexpected results: all widgets, and hence all annotations, are missing):
I have been trying to find a solution to either of these problems for the last 2 days, without success.
from typing import Final

import pymupdf

ANNOTATION_FONT_SIZE: Final[float] = 7
ANNOTATION_LINE_WEIGHT: Final[float] = 1.2
ANNOTATION_MARGIN_X: Final[float] = 1
ANNOTATION_PAD: Final[float] = 1

RGB = tuple[float, float, float]
RED: Final[RGB] = (1, 0, 0)
WHITE: Final[RGB] = (1, 1, 1)
BLACK: Final[RGB] = (0, 0, 0)


def generate_annotated_page_image(pdf: pymupdf.Document, page_index: int) -> bytes:
    """Generate an annotated page image from a PyMuPDF document page."""
    # Copy the requested page into a fresh document so the original is untouched.
    new_pdf = pymupdf.open()
    new_pdf.insert_pdf(docsrc=pdf, from_page=page_index, to_page=page_index)
    pdf_page: pymupdf.Page = new_pdf[0]
    for annotation in pdf_page.annots():
        pdf_page.delete_annot(annot=annotation)
    widgets = list(pdf_page.widgets())
    print(f"Page index {page_index}: {len(widgets)} widgets")
    # Label each widget with its index, then remove it so it cannot cover the label.
    for widget_index, widget in enumerate(iterable=widgets):
        annotate_widget(pdf_page=pdf_page, widget=widget, widget_index=widget_index)
        pdf_page.delete_widget(widget=widget)
    return pdf_page.get_pixmap(matrix=pymupdf.Matrix(2, 2)).tobytes()


def annotate_widget(pdf_page: pymupdf.Page, widget: pymupdf.Widget, widget_index: int) -> None:
    """Annotate a field in a PyMuPDF document."""
    annotation: str = str(object=widget_index)
    text_width: float = pymupdf.get_text_length(text=annotation, fontsize=ANNOTATION_FONT_SIZE)
    rect_height: float = ANNOTATION_FONT_SIZE
    rect_width: float = max(rect_height, text_width + (2 * ANNOTATION_MARGIN_X))
    # Vertically center the label within the widget when the widget is taller.
    offset: float = max((widget.rect.height - rect_height) / 2, 0)
    rect: tuple[float, float, float, float] = (
        widget.rect[0],
        widget.rect[1] + offset,
        widget.rect[0] + rect_width,
        widget.rect[1] + rect_height + offset,
    )
    text_rect: tuple[float, float, float, float] = (rect[0], rect[1] + ANNOTATION_PAD / 2, rect[2], rect[3])
    rect_annotation: pymupdf.Annot = pdf_page.add_rect_annot(
        rect=(rect[0] - ANNOTATION_PAD, rect[1] - ANNOTATION_PAD, rect[2] + ANNOTATION_PAD, rect[3] + ANNOTATION_PAD)
    )
    rect_annotation.set_colors(stroke=RED, fill=WHITE)
    rect_annotation.set_border(width=ANNOTATION_LINE_WEIGHT)
    rect_annotation.update()
    pdf_page.add_freetext_annot(
        rect=text_rect,
        text=annotation,
        fontsize=ANNOTATION_FONT_SIZE,
        text_color=BLACK,
        fill_color=WHITE,
        align=pymupdf.TEXT_ALIGN_CENTER,
    )


pdf_path = "/path/to/pdf/with/widgets.pdf"
pdf = pymupdf.open(pdf_path)
for i, page_index in enumerate([0, 1, 0]):
    image = generate_annotated_page_image(pdf=pdf, page_index=page_index)
    with open(f"image{i}.png", "wb") as f:
        f.write(image)

Output:
Page index 0: 63 widgets
Page index 1: 0 widgets
Page index 0: 0 widgets
How to reproduce the bug
Run the script above, replacing pdf_path = "/path/to/pdf/with/widgets.pdf" with the path to a PDF with widgets on at least 2 pages.
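To help narrow down whether the widgets disappear from the original document or only from the copied page, a small diagnostic along these lines might be useful. It is only a sketch: `count_widgets_per_page` is a hypothetical helper name, and it assumes the object passed in behaves like a `pymupdf.Document` (iterating it yields pages, and each page has a `widgets()` generator).

```python
from typing import Any


def count_widgets_per_page(doc: Any) -> list[int]:
    """Return the number of widgets found on each page of ``doc``.

    Assumption: ``doc`` behaves like a pymupdf.Document, i.e. iterating it
    yields page objects and each page exposes a ``widgets()`` generator.
    """
    # Exhaust each page's widget generator and count the items it yields.
    return [sum(1 for _ in page.widgets()) for page in doc]
```

Printing `count_widgets_per_page(pdf)` before and after each call to `generate_annotated_page_image` would show whether the counts for the source document change between calls.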
PyMuPDF version
1.25.3
Operating system
macOS
Python version
3.12



