Skip to content

[codex] Add typed LibreOffice workbook handles#125

Merged
harumiWeb merged 3 commits intomainfrom
feat/typed-soffice
Apr 16, 2026
Merged

[codex] Add typed LibreOffice workbook handles#125
harumiWeb merged 3 commits intomainfrom
feat/typed-soffice

Conversation

@harumiWeb
Copy link
Copy Markdown
Owner

@harumiWeb harumiWeb commented Apr 16, 2026

Summary

  • replace the raw LibreOfficeSession.load_workbook() token with a typed LibreOfficeWorkbookHandle
  • add session-scoped workbook lifecycle validation, idempotent close handling, and workbook cache invalidation
  • preserve backward compatibility for custom session_factory implementations that only support legacy path-based extraction

Why

LibreOfficeSession exposed a public-looking workbook lifecycle API but returned an untyped dict token and treated close_workbook() as a no-op. That made the contract easy to misuse and gave type checkers almost no useful guarantees.

Impact

  • default LibreOfficeSession callers can use a typed workbook handle safely
  • foreign-session and closed-handle misuse now fails explicitly
  • existing custom session_factory stubs that only implement extract_draw_page_shapes() / extract_chart_geometries() continue to work

Root Cause

The original LibreOffice session lifecycle was left as a placeholder after the rich extraction backend landed, so the backend had no typed workbook handle contract and no meaningful close semantics.

Validation

  • uv run pytest tests/core/test_libreoffice_backend.py -q
  • uv run task precommit-run

Summary by CodeRabbit

  • New Features

    • Introduced typed workbook handles for LibreOffice extraction with session-scoped lifecycle tracking and cache management.
  • Bug Fixes

    • Fixed workbook lifecycle integration to properly clear cached payloads and validate handle ownership.
  • Improvements

    • Maintained backward compatibility for legacy session factory implementations without workbook lifecycle hooks.

Closes #77

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 16, 2026

📝 Walkthrough

Walkthrough

Implements typed workbook handles for LibreOffice extraction operations. Introduces LibreOfficeWorkbookHandle dataclass to replace untyped dict tokens, tracks session-scoped workbook lifecycle with ownership validation, clears cached bridge payloads per workbook, and maintains backward compatibility for legacy session_factory implementations that lack lifecycle hooks.

Changes

Cohort / File(s) Summary
Core LibreOffice Session API
src/exstruct/core/libreoffice.py
Added LibreOfficeWorkbookHandle frozen dataclass with session/workbook ownership tracking. Updated load_workbook() to return typed handle instead of dict. Reworked close_workbook() to accept handle, validate ownership, and invalidate cached bridge payloads. Updated extraction methods (extract_draw_page_shapes, extract_chart_geometries) to accept Path | LibreOfficeWorkbookHandle. Added internal path resolution and validation helpers.
Backend Extraction Integration
src/exstruct/core/backends/libreoffice_backend.py
Added helper methods _read_chart_geometries_with_lifecycle() and _read_draw_page_shapes_with_lifecycle() to route extraction calls through typed workbook handle lifecycle when available. Each helper checks session capability and falls back to legacy path-based extraction for incompatible session_factory implementations.
Test Suite & Validation
tests/core/test_libreoffice_backend.py
Updated test doubles (_DummySession, _TrackingSession) to emit and accept LibreOfficeWorkbookHandle. Added backend integration tests validating load/extract/close sequences with handles. Added regression test for legacy sessions accepting only Path. Added ownership validation and idempotency tests (handle closure, repeated closes, post-close access errors).
Documentation & Specs
CHANGELOG.md, tasks/feature_spec.md, tasks/lessons.md, tasks/todo.md
Added changelog entries, feature specification for issue #77 (typed handle contract + ownership validation), compatibility lessons for extension point tightening and regression testing, and dated todo tracking implementation and verification steps.

Sequence Diagram

sequenceDiagram
    participant Backend as LibreOfficeRichBackend
    participant Session as LibreOfficeSession
    participant WbRegistry as Workbook Registry
    participant Cache as Bridge Payload Cache
    participant Bridge as Bridge Subprocess

    rect rgba(100, 150, 200, 0.5)
    Note over Backend,Bridge: New Typed Workbook Lifecycle (with Handle)
    Backend->>Session: load_workbook(file_path)
    Session->>WbRegistry: register workbook
    Session-->>Backend: LibreOfficeWorkbookHandle
    Backend->>Session: extract_chart_geometries(handle)
    Session->>Session: resolve & validate handle
    Session->>Bridge: extract with resolved path
    Bridge-->>Session: geometry data
    Session->>Cache: store with (kind, path) key
    Session-->>Backend: geometries
    Backend->>Session: close_workbook(handle)
    Session->>Session: validate ownership
    Session->>Cache: clear entries for workbook path
    Session->>WbRegistry: unregister workbook
    Session-->>Backend: success
    end

    rect rgba(150, 100, 200, 0.5)
    Note over Backend,Bridge: Legacy Fallback (Path-only, no lifecycle)
    Backend->>Session: extract_chart_geometries(file_path)
    Session->>Session: check for load/close support
    alt Unsupported
        Session->>Bridge: extract_chart_geometries(file_path)
        Bridge-->>Session: geometries
    end
    Session-->>Backend: geometries
    end
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly Related Issues

  • Typed handle for LibreOfficeSession.load_workbook / close_workbook #77 — Directly addresses the core requirement: implementation of typed LibreOfficeWorkbookHandle, load_workbook() return type change, close_workbook() ownership validation and cache invalidation, and extraction method signature updates to accept handle or path with backward compatibility.

Poem

🐰 A handle most typed, now etched in a type,
No longer a dict in the cipher's dark hype,
Load, validate, close—the trilogy's sane,
Cache swept clean, workbooks regained!
Legacy paths still hop through the lane. 🌟

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 79.49% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ⚠️ Warning The PR description does not follow the required template structure with acceptance criteria (AC-01 through AC-06) and validation sections as specified. Restructure the description to follow the template: add explicit acceptance criteria checkboxes (AC-01 through AC-06), organize validation section with checkboxes for precommit-run, tests, and documentation updates.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically identifies the main change: introducing typed workbook handles for LibreOffice sessions.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/typed-soffice

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented Apr 16, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 31 complexity · 0 duplication

Metric Results
Complexity 31
Duplication 0

View in Codacy

TIP This summary will be updated as you push new changes. Give us feedback

@harumiWeb harumiWeb marked this pull request as ready for review April 16, 2026 13:21
@harumiWeb harumiWeb requested a review from Copilot April 16, 2026 13:21
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

Open in Devin Review

This comment was marked as resolved.

chatgpt-codex-connector[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

This comment was marked as resolved.

@harumiWeb harumiWeb merged commit 7ba6f5a into main Apr 16, 2026
17 of 18 checks passed
@harumiWeb harumiWeb deleted the feat/typed-soffice branch April 21, 2026 10:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Typed handle for LibreOfficeSession.load_workbook / close_workbook

2 participants