
v0.3.0


@ferdinandobons ferdinandobons released this 08 Jun 18:47

Native charts across all three formats, a deterministic-cover subtitle fill, a
single source of truth for component-survival, and a round of correctness +
quality fixes from a multi-agent code review. Brand Profiles from 0.1.x/0.2.0 keep
working unchanged.

Added

  • A chart block is now authored as a native chart in both Word and PowerPoint
    (a real DrawingML c:chart: an inline w:drawing on docx, a graphicFrame on
    pptx) and is no longer flattened to body text. bar/column/barh/line/area/pie/
    doughnut map to the matching chart type (unknown -> clustered column, surfaced as
    INFO); series/categories/title come from the block, and the chart inherits the
    document/deck theme's accent colors so it is on-brand by construction. A
    multi-series pie/doughnut surfaces a truncation WARNING; an empty/all-non-numeric
    chart degrades loudly. A shared ooxml.chart builds the docx chart with INLINE
    cached data (no embedded workbook) and is the single data gate both formats use;
    the pptx data workbook's wall-clock timestamps are normalized by
    repack_fixed_timestamps (now recursive over nested OOXML packages), so generation
    stays byte-idempotent on both.
  • Excel charts complete the set: a GridDocument.charts entry
    ({sheet?, type, title?, anchor, data, categories?, data_titles?}) is authored
    as a NATIVE openpyxl chart that REFERENCES the workbook's own cell ranges (the
    grid model is range-based, so the data lives in the sheet). The same type map,
    unknown-type INFO fallback, and loud-degrade contract apply; the chart inherits
    the workbook theme, and generation stays byte-idempotent. All three formats now
    author native charts.
  • Word: the deterministic cover fill (comprehension absent) now also places
    the authored subtitle into the cover slot identified by its resolved
    cover.subtitle style - correct-by-style, never guessed from the template's
    placeholder text - so the output no longer shows the template's stale demo
    subtitle. Role inference resolves cover.subtitle from a custom
    subtitle-named style (preferred) or the builtin Subtitle; a multilingual
    subtitle name-token family backs it. Templates whose subtitle is a databound
    SDT keep working via core-property sync; extra cover fields (date/id/author)
    remain the comprehension path's job and are still surfaced as unplaced.
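
The chart contract described above (type map with an INFO fallback, a WARNING on a multi-series pie/doughnut, a loud degrade on empty or all-non-numeric data) can be pictured with a minimal sketch. This is not the project's code: gate_chart, the block shape ({"type", "title", "categories", "series": [{"name", "values"}]}), the finding tuples, and the use of "column" to stand in for clustered column are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Any

# The kinds named in the release notes; the identifiers are assumed.
KNOWN_KINDS = {"bar", "column", "barh", "line", "area", "pie", "doughnut"}
FALLBACK_KIND = "column"  # assumption: stands in for "clustered column"
SINGLE_SERIES_KINDS = {"pie", "doughnut"}


def _is_number(value: Any) -> bool:
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def gate_chart(block: dict[str, Any]) -> tuple[dict[str, Any] | None, list[tuple[str, str]]]:
    """Hypothetical data gate: returns (chart or None, [(severity, message)])."""
    findings: list[tuple[str, str]] = []

    kind = str(block.get("type", "")).lower()
    if kind not in KNOWN_KINDS:
        findings.append(("INFO", f"unknown chart type {kind!r}; using {FALLBACK_KIND}"))
        kind = FALLBACK_KIND

    # Keep only series that carry at least one numeric value.
    series = [
        {"name": s.get("name", ""), "values": list(s.get("values", []))}
        for s in block.get("series", [])
        if any(_is_number(v) for v in s.get("values", []))
    ]
    if not series:
        # Degrade loudly: report the problem instead of emitting an empty chart.
        findings.append(("WARNING", "chart has no numeric data; degraded"))
        return None, findings

    if kind in SINGLE_SERIES_KINDS and len(series) > 1:
        dropped = len(series) - 1
        findings.append(("WARNING", f"{kind} chart keeps one series; {dropped} truncated"))
        series = series[:1]

    chart = {
        "type": kind,
        "title": block.get("title"),
        "categories": list(block.get("categories", [])),
        "series": series,
    }
    return chart, findings
```

Returning findings alongside the result, rather than raising, mirrors the notes' emphasis on surfacing every fallback instead of failing or silently dropping content.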

Fixed

  • General correctness + quality review (multi-agent, adversarially verified):
    • Word/PPTX tables no longer drop multi-run column headers. A header cell
      authored with rich runs (e.g. plain text + a bold unit) kept only its first
      run through Table.from_dict; every run is now preserved (and the loose
      {"runs": [...]} / {"text": "..."} / run-list / string shapes a body cell
      accepts are accepted for columns too).
    • PowerPoint body text on a placeholderless layout degrades loudly instead
      of vanishing silently (a block_degraded WARNING is now recorded).
    • Word hyperlink runs with empty text no longer emit an empty w:hyperlink.
    • comprehend's skeleton demo/required annotation matched the wrong key
      (always None); it now keys on the region id, so the annotation actually applies.
    • extract wraps its work in error handling (clean ERROR extract: ... + exit 1,
      matching generate) and defaults --scope to auto like the other commands.
    • Idempotency: a non-UTF-8 core.xml no longer crashes the timestamp pin; the
      nested-package dcterms regex uses [^<]* + count=1 so it cannot cross a tag
      boundary on malformed XML.
  • Quality cleanups: consolidated the duplicate docx _apply_*_style helpers,
    removed the unused safe_filename utility, dropped a redundant set() in
    has_part, removed a duplicate visual.no_pages finding, and stopped running
    check_profile twice in the QA gate.
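
To illustrate the idempotency fix, here is a minimal sketch of a timestamp pin that uses [^<]* and count=1. It is an assumption-laden example, not the project's implementation: pin_created and FIXED_TIMESTAMP are hypothetical names, and working on raw bytes is just one way to avoid a decode error on a non-UTF-8 core.xml.

```python
import re

# Hypothetical pinned value; the real constant is not given in the notes.
FIXED_TIMESTAMP = b"2000-01-01T00:00:00Z"

# [^<]* can only consume text up to the next "<", so the match cannot run
# across a tag boundary when the XML is malformed.
_CREATED = re.compile(rb"(<dcterms:created\b[^>]*>)[^<]*(</dcterms:created>)")


def pin_created(core_xml: bytes) -> bytes:
    """Replace the first dcterms:created value with a fixed timestamp.

    Operating on bytes (an assumption of this sketch) means a non-UTF-8
    payload never has to be decoded.
    """
    replacement = rb"\g<1>" + FIXED_TIMESTAMP + rb"\g<2>"
    return _CREATED.sub(replacement, core_xml, count=1)
```

When the element contains anything other than plain text, the pattern does not match and the input is returned unchanged, which is the safe outcome for malformed XML.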

Changed

  • component_survival now has a single source of truth. The pptx generator's
    own pre-reconcile, drop-to-zero variant was removed; the QA gate's
    check_component_survival (which re-reads the shell and output independently,
    for all three formats, on any count decrease) is the sole emitter. This ends the
    duplicate, differently-worded component_survival findings a pptx run produced.
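
The difference between the removed drop-to-zero variant and the "any count decrease" rule can be sketched as follows. survival_findings, the per-component count dictionaries, and the finding shape are hypothetical; the notes do not describe how check_component_survival counts components or reports them.

```python
from __future__ import annotations


def survival_findings(shell: dict[str, int], output: dict[str, int]) -> list[dict]:
    """Hypothetical check: flag every component whose count decreased."""
    findings = []
    for component, before in sorted(shell.items()):
        after = output.get(component, 0)
        if after < before:  # any decrease, not only a drop to zero
            findings.append(
                {
                    "code": "component_survival",
                    "component": component,
                    "shell": before,
                    "output": after,
                }
            )
    return findings


# Three tables in the shell, one in the output: a drop-to-zero check would
# stay silent here, while an any-decrease check reports the table component.
print(survival_findings({"table": 3, "image": 2}, {"table": 1, "image": 2}))
```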