v1.2.0

realraelrr released this 29 May 10:22 · tag v1.2.0 · commit 0f8b8e1

Highlights

  • Add Docling-native PPTX and common image input support (png, jpg, jpeg, tif, tiff, bmp, webp).
  • Add targeted CJK Markdown normalization for agent-facing source.md while preserving raw source.docling.json.
  • Add text_normalization and text_integrity quality signals for CJK cleanup, replacement characters, formula placeholders, and residual compatibility glyphs.
  • Refine PDF page-quality aggregation so isolated page failures in long documents become medium-risk warnings, while short documents and documents with a high failed-page ratio remain failed_for_agent/high.
  • Refactor non-PDF conversion internals to share the sidecar attempt builder while preserving format-specific routing and PDF remediation behavior.
  • Add lightweight project-owned TypedDict contracts and a ruff dev check for import/unused/syntax hygiene.
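
To illustrate the page-quality aggregation rule described above, here is a minimal sketch, not the project's actual implementation. The function name, the TypedDict shapes, both thresholds, and the "ok", "low", and "warning" labels are assumptions made for the example. Only failed_for_agent, high, and medium come from the release notes.

```python
from typing import TypedDict


class PageQuality(TypedDict):
    """Hypothetical per-page signal: page number and whether it failed."""
    page: int
    failed: bool


class Aggregate(TypedDict):
    """Hypothetical document-level result."""
    status: str
    risk: str


# Illustrative thresholds only; the real values are not stated in the notes.
SHORT_DOC_MAX_PAGES = 5
MAX_FAILED_RATIO = 0.2


def aggregate_page_quality(pages: list[PageQuality]) -> Aggregate:
    total = len(pages)
    failed = sum(1 for p in pages if p["failed"])

    # No failed pages (including an empty document): nothing to escalate.
    if failed == 0:
        return {"status": "ok", "risk": "low"}

    is_short = total <= SHORT_DOC_MAX_PAGES
    high_ratio = failed / total > MAX_FAILED_RATIO

    # Short documents and high failed-page ratios stay hard failures.
    if is_short or high_ratio:
        return {"status": "failed_for_agent", "risk": "high"}

    # Isolated failures in a long document are downgraded to a warning.
    return {"status": "warning", "risk": "medium"}
```

Under these assumed thresholds, a 40-page PDF with one bad page yields a medium-risk warning, while a 3-page PDF with one bad page still fails for the agent.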

Verification

  • conda run -n docling python -m ruff check . -> passed.
  • conda run -n docling python -m pytest -> 126 passed.
  • Root, .codex, and .claude skill validators passed.
  • Subagent review found no Critical, Important, or Minor issues.