@JorjMcKie
Collaborator

See file CHANGES.md

@JorjMcKie JorjMcKie requested a review from Copilot June 13, 2025 16:57

Copilot AI left a comment


Pull Request Overview

This PR updates the package to version 0.0.25, bumps the pymupdf and pymupdf4llm dependency versions, and introduces several fixes and enhancements around TOC-based headers, invisible‐text handling, and Markdown output.

  • Bump versions and dependency constraints in setup.py files.
  • Export TocHeaders, add ignore_alpha flag, refine code‐block and Type 3 font handling in Markdown conversion.
  • Document 0.0.25 changes in CHANGES.md.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pymupdf4llm/setup.py | Updated pymupdf>=1.26.1 and version to 0.0.25. |
| pdf4llm/setup.py | Locked pymupdf4llm==0.0.25 and bumped version accordingly. |
| pymupdf4llm/pymupdf4llm/helpers/pymupdf_rag.py | Added ignore_alpha, reworked header detection, code‐block logic, and improved invisible‐text baking. |
| pymupdf4llm/pymupdf4llm/helpers/get_text_lines.py | Enhanced span sanitization to always include Type 3 fonts and ignore underlines. |
| pymupdf4llm/pymupdf4llm/__init__.py | Exported TocHeaders at top level and bumped version. |
| CHANGES.md | Detailed fixes and other changes in version 0.0.25. |

Comments suppressed due to low confidence (2)

pymupdf4llm/pymupdf4llm/helpers/pymupdf_rag.py:325

  • The docstring says ignore_alpha defaults to True, but the signature sets it to False; please align the default value or update the documentation.
ignore_alpha=False,
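
One way to catch this kind of drift is to compare the documented default with the one declared in the signature. The sketch below uses a hypothetical stub, `to_markdown_stub`, with docstring wording invented for illustration; it is not the actual function from pymupdf_rag.py, and the string match is deliberately naive.

```python
import inspect


def to_markdown_stub(doc, ignore_alpha=False):
    """Hypothetical stand-in for the reviewed function.

    ignore_alpha: (bool) illustrative wording only. Default is True.
    """
    return doc


def signature_default(func, name):
    """Return the default value declared for parameter `name` in the signature."""
    return inspect.signature(func).parameters[name].default


def documented_default_matches(func, name):
    """Naive check: does the docstring contain 'Default is <actual default>'?"""
    expected = f"Default is {signature_default(func, name)}"
    return expected in (inspect.getdoc(func) or "")
```

For the stub above, the signature default is `False` while the docstring claims `True`, so `documented_default_matches(to_markdown_stub, "ignore_alpha")` reports a mismatch.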

pymupdf4llm/pymupdf4llm/helpers/pymupdf_rag.py:198

  • Using `if not page:` may treat falsy but non-None page objects incorrectly; consider reverting to `if page is None:` for an explicit None check.
if not page:
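
The difference between the two checks can be shown with a minimal sketch. `EmptyPage` is a hypothetical class that is falsy because its `__len__` returns 0; whether the real page object can ever be falsy is an assumption this example does not verify.

```python
class EmptyPage:
    """Hypothetical stand-in: a valid object that evaluates as falsy."""

    def __len__(self):
        # Python treats objects whose __len__ returns 0 as falsy.
        return 0


def resolve_with_truthiness(page, fallback):
    """Replace `page` with `fallback` whenever `page` is falsy."""
    if not page:
        return fallback
    return page


def resolve_with_none_check(page, fallback):
    """Replace `page` with `fallback` only when `page` is None."""
    if page is None:
        return fallback
    return page
```

With `p = EmptyPage()`, `resolve_with_truthiness(p, "fallback")` discards the valid object and returns the fallback, while `resolve_with_none_check(p, "fallback")` returns `p` unchanged. Both functions return the fallback for `None`.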

@JorjMcKie JorjMcKie merged commit 8208d6e into main Jun 13, 2025
@JorjMcKie JorjMcKie deleted the v0.0.25 branch June 13, 2025 17:08