Skip to content

Propose OOXML paragraph index semantics#98

Closed
kiki830621 wants to merge 1 commit into
mainfrom
codex/ooxml-paragraph-index-semantics
Closed

Propose OOXML paragraph index semantics#98
kiki830621 wants to merge 1 commit into
mainfrom
codex/ooxml-paragraph-index-semantics

Conversation

@kiki830621
Copy link
Copy Markdown
Collaborator

Summary

Validation

  • spectra validate ooxml-paragraph-index-semantics
  • spectra analyze ooxml-paragraph-index-semantics

Refs PsychQuant/ooxml-swift#10

@kiki830621
Copy link
Copy Markdown
Collaborator Author

Verify Report — PR #98

Engine

6-AI ensemble: 5 general-purpose Agents (Claude reviewers) + Codex (gpt-5.5 xhigh). All 5/5 findings present + non-empty on first attempt. No Process Gaps.

Aggregate

NEEDS WORK (critical strategic finding) — Spectra structure formally valid (spectra validate --strict passes), but Devil's Advocate + Regression converge on a strategic-level concern: PR #98 may add a 4th index space instead of consolidating the 3 existing ones, AND it conflicts with the active word-aligned-state-sync change which plans to remove positional indices entirely in v1.0.0.

Scope coverage


Cross-repo ooxml-swift#10 — paragraph index semantics

Path selection: PR picks "staged Option A" — new explicit-label API (body-child index) + deprecated legacy at:Int. Both Codex and Logic confirm this is documented; DA argues Option C (functional naming) or Option E (InsertLocation enum) should have been considered.

Findings (merged + deduplicated)

# Severity Finding Source Action
1 CRITICAL (strategic) PR #98 conflicts with active word-aligned-state-sync Decision 3 (ID-based operations, NEVER positional indices). PR #98 adds 6+ positional-index APIs; word-aligned plans v1.0.0 removal of all positional APIs. New APIs would be obsoleted within months. design.md doesn't acknowledge or coordinate. agents:regression+devils-advocate Either abandon PR #98 and close #10 as duplicate of word-aligned Phase 5; OR document explicit coordination with sequencing
2 CRITICAL (technical) spec creates a 4th index space, not consolidates 3 existing ones. spec Requirement 2 excludes SDT children from "top-level paragraph index"; but Document.swift collectTopLevelParagraphs DESCENDS into SDT. AND Non-Goals explicitly forbid changing getParagraphs() behavior. Net result: spec defines a new index space (top-level-paragraph-excluding-SDT) that differs from the existing getParagraphs() result. agents:devils-advocate+codex+requirements+logic Spec must reconcile: either align with getParagraphs() actual SDT-descent OR change getParagraphs() (currently in Non-Goals)
3 BLOCKER (factual) spec.md Scenario describes getParagraphs() returning table-cell paragraph "T" (per Codex). But Document.swift:225 has case .table: break — code does NOT enter tables. Factually wrong scenario. Also violates proposal's Non-Goal "do not change getParagraphs() scope". agents:codex+requirements+logic Fix scenario to use SDT, not table; OR change to getAllParagraphs() variant
4 HIGH insertParagraph rename inconsistency: proposal says deprecate paragraph-only overloads; spec/tasks say also deprecate insertParagraph(_:at:) ambiguous body-child API. Internal contradiction makes implementation ambiguous. design.md doesn't list explicit new API name for body-child insert. agents:requirements+codex Reconcile + add explicit new insertion API name
5 HIGH Deprecation timeline not version-pinned. "One minor release" hand-wave — pre-1.0 lib's minor cadence is unpredictable (1 week to 6 months). Without version pin, deprecation contract is unenforceable. Must lock to specific version (e.g., 0.X.0 + remove in 0.Y.0). agents:devils-advocate Version-pin deprecation lifecycle
6 HIGH che-word-mcp migration cost grossly underestimated. Tasks list "update ~8 callsites in che-word-mcp" but che-word-mcp is a notarized binary plugin (per CLAUDE.md che-mcps-notary pipeline). Migration cost: code change + build-signed + notarize (5-10 min) + marketplace sync + force re-install. ~5-7 ops steps, not 1 task line. agents:devils-advocate Tasks must include full sign+notarize+ship sequence
7 HIGH Index space silent data-corruption window. che-word-mcp v3.15.2 workaround uses body.children.count - 1 for one call-site (Server.swift:7052). If new API rollout migrates some MCP tools but not others (incremental), LLM agents calling mixed-tool sequences in one session get silent data corruption. agents:security Spec MUST require atomic migration OR document non-atomic mitigation
8 HIGH OOB / negative-index / Int.max bounds completely unspecified. Old insertParagraph(_:at:) clamps; new body-child API should throw vs clamp? Spec silent. Implementer could pick wrong choice → silent data corruption. agents:security Spec normative bounds + throw-vs-clamp rule
9 HIGH InsertLocation enum path (Option E) not analyzed. Document.swift already exposes InsertLocation enum used elsewhere; che-word-mcp already uses it. Issue #10's foot-gun could be resolved by routing all mutation APIs through InsertLocation. design.md's Option C/D analysis is incomplete without Option E. agents:devils-advocate Add Option E to design analysis
10 MEDIUM Concurrent mutation / index invalidation not addressed. TOCTOU race: caller gets index, another thread mutates body, original index now points elsewhere. Spec must address index-stability invariant. agents:security Spec Section "Index stability under concurrent mutation"
11 MEDIUM Cross-repo Spectra spec location: ooxml-swift has its own openspec/ (with cross-document-omath-splice in-repo). PR #98 puts pure ooxml-swift spec in macdoc. cross-repo-umbrella.md rule governs umbrella issues, not Spectra spec layout. Either move spec to ooxml-swift OR document convention extension. agents:regression Justify location OR move
12 MEDIUM proposal self-contradiction: Affected code lists mcp/che-word-mcp/Sources as Modified; Non-Goals says "Do not migrate downstream"; task 3.4 explicitly modifies che-word-mcp callers. Three statements inconsistent. agents:regression Pick one model and reconcile
13 MEDIUM Edge case coverage incomplete: empty body, only-tables body, SDT containing paragraphs all not in Scenarios. Spec only covers [paragraph, table, paragraph] case. agents:codex+logic Add Scenarios
14 MEDIUM Idempotency not explicit: delete is non-idempotent (deletes one each call); update is idempotent if index valid. Spec doesn't classify. agents:logic Add idempotency table
15 MEDIUM topLevelParagraphIndex vs atParagraphIndex label ambiguity — design.md shows both labels in different examples. agents:requirements Pick one
16 LOW spec.md missing # Capability Spec Delta header per Spectra convention. agents:logic Add header
17 LOW tasks.md no acceptance criteria per task — only narrative descriptions. agents:logic Add ACs
18 LOW getParagraphs() recursion-into-SDT-but-skip-tables semantic is non-obvious. Spec should explicit doc this. agents:logic Add doc-only Requirement
19 LOW P3 issue → 227-line dual-capability spec proportionality given likely v1.0.0 obsolescence — engineering ROI questionable. agents:devils-advocate+regression Justify OR scale down

Scope Check

Spec scope = body+SDT+table interactions across 3 APIs. The PR also touches che-word-mcp downstream (per Affected Code), which contradicts its own Non-Goal. Scope clarification needed.

Security

HIGH severity at finding #7 (silent data corruption window during partial migration) and #8 (OOB clamping vs throwing unspecified). Both are spec-level — implementation could ship with wrong choice and corrupt user documents silently.

Process Gaps

None — all 5/5 reviewers produced findings on first attempt.


Recommendation

Do NOT merge. PR #98 has the most severe strategic finding of the 5-PR batch: it may be the wrong solution because:

(a) word-aligned-state-sync plans full removal of positional indices in v1.0.0 (Finding #1) — these new APIs obsolesce
(b) The spec creates a 4th index space instead of consolidating 3 (Finding #2) — DOES NOT solve the original issue #10 foot-gun

Three paths:

Path A (recommended): Close PR #98 and close issue ooxml-swift#10 as duplicate of word-aligned-state-sync Phase 5 (ID-based operations replacement). Document the deferral chain.

Path B: Revise PR #98 to (1) acknowledge word-aligned-state-sync coordination, (2) align with getParagraphs() SDT-descent semantics OR explicitly change getParagraphs() (removing Non-Goal), (3) fix factual Scenario error, (4) add 6 missing addressables (#4-#9), (5) add Option E analysis. ~70% rewrite.

Path C: Merge as Phase 0 minimum (lock direction only), then file 8 follow-up issues. NOT recommended — the strategic finding #1 makes Phase 0 spec lock questionable since the lock will be removed in v1.0.0.

Recommended: Path A. Close as duplicate, redirect engineering effort to word-aligned-state-sync.

@kiki830621
Copy link
Copy Markdown
Collaborator Author

Blocked. 2 CRITICAL strategic findings: (1) conflicts with active word-aligned-state-sync Decision 3 (v1.0.0 removes all positional indices — these new APIs would be obsoleted within months); (2) spec accidentally creates a 4th index space instead of consolidating the 3 existing ones (Requirement 2 excludes SDT, but Document.swift descends SDT, AND Non-Goals forbid changing getParagraphs). Recommend Path A: close PR + close ooxml-swift#10 as duplicate of word-aligned Phase 5. Full report at #98 (comment).

@kiki830621
Copy link
Copy Markdown
Collaborator Author

Closing — Superseded by word-aligned-state-sync + ooxml-edit-isomorphism-foundation

Per verify report (#98 (comment)) recommendation Path A: close as duplicate of word-aligned-state-sync v1.0.0 cleanup work.

The two strategic conflicts identified by verify:

  1. word-aligned-state-sync Decision 3 already commits to ID-based operations (no positional indices in v1.0.0) — these new positional-index APIs would be obsoleted within months
  2. This proposal accidentally creates a 4th index space instead of consolidating the 3 existing ones (per Requirement 2's SDT exclusion vs Document.swift SDT-descent reality)

The Edit-type contract pinned in #99 (ooxml-edit-isomorphism-foundation) provides the proper consolidation surface (positional addressing at OOXMLEdit lower-layer, stable IDs at WordEdit upper-layer). Phase 2 implementation tracking issue #105 will handle the runtime when ready.

Issue ooxml-swift#10 (original) should be reframed to cite #99's ADR-003 + ADR-009.

Refs #99 #105 PsychQuant/ooxml-swift#10

@kiki830621 kiki830621 closed this May 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant