Hi, I'm not able to detect cropped images using the get_bboxlog() method (fitz version 1.23.7).
I generated the attached PDF with two cropped images (one rotated by 90°), but the extraction gives me the bounding boxes of the non-cropped images:
Image 0 - bbox: Rect(266.25, 157.2283935546875, 328.5, 608.3989868164062)
Image 1 - bbox: Rect(73.5, 73.5, 568.5, 142.5)
1 - Type: 'fill-path', width=595.5 height=842.25 (raw = (0.0, 0.0, 595.5, 842.25))
2 - Type: 'fill-image', width=62.25 height=451.17059326171875 (raw = (266.25, 157.2283935546875, 328.5, 608.3989868164062))
3 - Type: 'fill-image', width=495.0 height=69.0 (raw = (73.5, 73.5, 568.5, 142.5))
Below are the rendered PDF page and the script used to reproduce the result. What am I doing wrong?

import fitz

fn_in = "test_page.pdf"
with open(fn_in, "rb") as f:
    doc = fitz.open(f)
    page = doc.load_page(0)

    # Extract images
    imgs = []
    for i, img in enumerate(page.get_image_info(xrefs=True)):
        xref = img["xref"]
        img["bbox"] = fitz.Rect(img["bbox"])
        print(f"Image {i} - bbox: {img['bbox']}")
        img["transform"] = fitz.Matrix(img["transform"])
        imgs.append(img)

    # Get bbox_log ("kind" avoids shadowing the built-in "type")
    for i, (kind, raw) in enumerate(page.get_bboxlog()):
        rect = fitz.Rect(raw)
        print(f"{i+1} - Type: '{kind}', width={rect.width} height={rect.height} (raw = {raw})")

    # There are three elements:
    # 1) A rectangle occupying the full page (I don't know why it is there)
    # 2) The first image
    # 3) The second image (rotation is detected correctly)
    # PROBLEM: none of the reported image boxes reflect the cropping

    # In the rendered page the images are correctly cropped
    # page.get_pixmap().save('rendered_page.png')
Originally posted by @abe-mxff in #1312 (comment)
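For anyone reading along: the boxes above look like the placement rectangles of the full images, so one possible workaround is to intersect each reported bbox with the clip rectangle that crops it. The sketch below shows only that intersection step, in plain Python. The helper name visible_bbox and the clip rectangle are hypothetical and made up for illustration. Where the real clip rectangle comes from is an assumption: page.get_drawings(extended=True) appears to report clip paths with a "scissor" rectangle, but check that against the documentation for your installed version.

```python
from typing import Optional, Tuple

# (x0, y0, x1, y1), the same layout as the raw tuples printed above
RectTuple = Tuple[float, float, float, float]


def visible_bbox(image_bbox: RectTuple, clip: RectTuple) -> Optional[RectTuple]:
    """Return the part of image_bbox that lies inside clip.

    Returns None when the two rectangles do not overlap.
    Hypothetical helper, not part of PyMuPDF.
    """
    x0 = max(image_bbox[0], clip[0])
    y0 = max(image_bbox[1], clip[1])
    x1 = min(image_bbox[2], clip[2])
    y1 = min(image_bbox[3], clip[3])
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1, y1)


# bbox of "Image 1" as reported in the output above
image_bbox = (73.5, 73.5, 568.5, 142.5)

# ASSUMPTION: a made-up clip rectangle; the real one must come from the PDF
clip = (100.0, 73.5, 400.0, 142.5)

print(visible_bbox(image_bbox, clip))
```

This only handles axis-aligned rectangular clips. A clip path with any other shape would need a different approach, for example comparing against the rendered pixmap.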