[Bug] A specific diagram recognized as significant is not extracted as images by pymupdf4llm.to_markdown #296

@xcpky

Description

The following diagram should be extracted as an image by pymupdf4llm.to_markdown("uart.pdf", write_images=True), but it is not.

Image

This is the original PDF file; the diagram is on page 5.

uart.pdf
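
For anyone trying to reproduce this, here is a minimal sketch that converts only page 5 and lists the image references that end up in the Markdown. The helper names `image_refs` and `check_page`, the `images` output directory, and the regular expression are my own assumptions for illustration and are not part of pymupdf4llm; only `to_markdown` and its `pages`, `write_images` and `image_path` parameters come from the library.

```python
import re
from pathlib import Path

# Matches Markdown image references of the form ![alt](target).
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)]+)\)")


def image_refs(markdown: str) -> list[str]:
    """Return the targets of all Markdown image references in the text."""
    return IMAGE_REF.findall(markdown)


def check_page(pdf_path: str, page_number: int, image_dir: str = "images") -> list[str]:
    """Convert one page (1-based) and return the image targets found in its Markdown.

    Hypothetical helper: it only wraps pymupdf4llm.to_markdown for this report.
    """
    import pymupdf4llm  # imported lazily so image_refs() works without the library

    Path(image_dir).mkdir(exist_ok=True)
    md = pymupdf4llm.to_markdown(
        pdf_path,
        pages=[page_number - 1],  # to_markdown expects 0-based page numbers
        write_images=True,
        image_path=image_dir,
    )
    return image_refs(md)


# Example call (needs uart.pdf in the working directory):
# print(check_page("uart.pdf", 5))
```

If the diagram were extracted, I would expect the returned list to contain at least one entry for page 5.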

I suspected the diagram was not being marked as significant and asked about it on Discord. HaraldLieder replied there that it is marked as significant, but that there are other bugs.
