v0.6.0

@harumiWeb harumiWeb released this 10 Mar 12:38
· 138 commits to main since this release
1b1853b

v0.6.0 Release Notes

This release adds a new best-effort libreoffice extraction mode for
non-COM environments and extends shape/chart metadata with provenance fields.

Highlights

  • Added mode="libreoffice" across the Python API, CLI, and MCP server.
  • Added early validation for .xls + mode="libreoffice" with a clear error.
  • Added extraction-only validation for mode="libreoffice":
    • rejects PDF/PNG rendering
    • rejects auto page-break export
  • Added FallbackReason.LIBREOFFICE_UNAVAILABLE and
    FallbackReason.LIBREOFFICE_PIPELINE_FAILED.
  • Added backend metadata to shapes/charts:
    • provenance
    • approximation_level
    • confidence
    • serialized output now keeps these fields opt-in via include_backend_metadata
  • Added OOXML-based best-effort reconstruction for:
    • shapes
    • connectors
    • charts
  • Added a LibreOffice runtime helper so server/Linux/macOS environments can
    opt into rich extraction without Excel COM.
  • Added bundled bridge compatibility probing for LibreOffice Python runtime
    selection, including fail-fast handling for incompatible
    EXSTRUCT_LIBREOFFICE_PYTHON_PATH overrides.
  • Added a required Linux GitHub Actions smoke job that installs LibreOffice
    + python3-uno and runs the pytest.mark.libreoffice sample smoke test.
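
The opt-in behavior of the backend metadata fields can be pictured with a small sketch. The Shape dataclass, its to_dict method, and the sample values below are hypothetical stand-ins, not the library's actual models or value vocabulary. Only the field names (provenance, approximation_level, confidence) and the flag name include_backend_metadata come from the highlights above.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional

# Field names as listed in the release notes.
BACKEND_METADATA_FIELDS = ("provenance", "approximation_level", "confidence")


@dataclass
class Shape:
    """Hypothetical stand-in for a shape record; not the library's real model."""

    text: str
    provenance: Optional[str] = None
    approximation_level: Optional[str] = None
    confidence: Optional[float] = None

    def to_dict(self, include_backend_metadata: bool = False) -> Dict[str, Any]:
        """Serialize the record, dropping backend metadata unless opted in."""
        data = asdict(self)
        if not include_backend_metadata:
            for name in BACKEND_METADATA_FIELDS:
                data.pop(name, None)
        return data


# Sample values are illustrative assumptions only.
shape = Shape(
    text="Start",
    provenance="libreoffice",
    approximation_level="partial",
    confidence=0.5,
)

default_output = shape.to_dict()
full_output = shape.to_dict(include_backend_metadata=True)
```

In this sketch the default output carries only the text field, so existing consumers of the serialized output would see no new keys unless they ask for them.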

Notes

  • libreoffice is available for .xlsx/.xlsm only.
  • libreoffice is best-effort and not a strict subset of COM output.
  • v1 does not add LibreOffice PDF/PNG rendering or auto page-break extraction.
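
The restrictions above amount to a pre-flight check that fails before any extraction work starts. The following is a minimal sketch of that idea. The function name validate_libreoffice_request and its keyword arguments are hypothetical and chosen for illustration; the library's real validation, option names, and error types may differ.

```python
from pathlib import Path

# Per the notes: libreoffice mode is available for .xlsx/.xlsm only.
SUPPORTED_SUFFIXES = {".xlsx", ".xlsm"}


def validate_libreoffice_request(
    path: str,
    *,
    render_pdf: bool = False,
    render_png: bool = False,
    auto_page_breaks: bool = False,
) -> None:
    """Hypothetical early validation for an extraction-only libreoffice mode."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"mode='libreoffice' supports .xlsx/.xlsm only, got {suffix!r}"
        )
    if render_pdf or render_png:
        raise ValueError("mode='libreoffice' does not support PDF/PNG rendering")
    if auto_page_breaks:
        raise ValueError("mode='libreoffice' does not support auto page-break export")


def is_valid(path: str, **options: bool) -> bool:
    """Convenience wrapper that reports whether a request passes validation."""
    try:
        validate_libreoffice_request(path, **options)
    except ValueError:
        return False
    return True
```

Checking the file suffix and the requested outputs up front means an unsupported combination fails with a clear message instead of surfacing as a late pipeline failure.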