
test(docx-core,docx-mcp): final regression suite for canonical emission (#143) #175

Merged
stevenobiajulu merged 2 commits into main from 143-regression-canonical-emission-20260508 on May 8, 2026

@stevenobiajulu
Member

Summary

FINAL sub-PR for #120. Builds the comprehensive regression suite that locks in the canonical-emission contract introduced by #135-#142. After this lands, #120 fully closes out and the umbrella checklist on #118 ticks #120.

Three test groups across two packages:

| Group | Location | Count | Purpose |
|---|---|---|---|
| A: Per-primitive emission | `docx-core/src/integration/canonical-emission-regression.test.ts` | 10 | For every Table A primitive, invoke with `ctx` and assert the expected revision element |
| B: Round-trip with comparison | same file | 4 | Apply primitive, run `compareDocuments`, assert AI revisions survive semantically |
| C: Tool integration through SessionManager | `docx-mcp/src/integration/canonical-emission-mcp.test.ts` | 10 | For every Table A MCP tool, exercise via SessionManager + save + unzip |

Total: 24 new tests locking in the surface.
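
A Group A style check can be sketched roughly as below. This is an illustration only, not the suite's code: `findRevisions` and `isValidRevision` are hypothetical helpers, the regex extraction stands in for whatever XML handling the real tests use, and the ISO-8601 UTC date shape is an assumption. The `w:id`/`w:author`/`w:date` attribute names are the ones the commit message says the tests validate.

```typescript
// Attributes a tracked-change wrapper is expected to carry.
interface RevisionAttrs {
  kind: string; // "ins" or "del" in this sketch
  id: string;
  author: string;
  date: string;
}

// Pull w:ins / w:del opening tags out of raw XML. A regex keeps the sketch
// dependency-free; a real suite would more likely walk a parsed DOM.
function findRevisions(xml: string): RevisionAttrs[] {
  const found: RevisionAttrs[] = [];
  const tagRe = /<w:(ins|del)\b([^>]*)>/g;
  let m: RegExpExecArray | null;
  while ((m = tagRe.exec(xml)) !== null) {
    const attrs = m[2] ?? "";
    // Read one w:-prefixed attribute value, or "" when it is absent.
    const attr = (name: string): string => {
      const hit = new RegExp('w:' + name + '="([^"]*)"').exec(attrs);
      return hit ? (hit[1] ?? "") : "";
    };
    found.push({ kind: m[1] ?? "", id: attr("id"), author: attr("author"), date: attr("date") });
  }
  return found;
}

// Assumed validity rules: numeric id, non-empty author, ISO-8601 UTC date.
function isValidRevision(r: RevisionAttrs): boolean {
  const isoUtc = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$/;
  return /^\d+$/.test(r.id) && r.author.length > 0 && isoUtc.test(r.date);
}

// Hand-written sample standing in for the output of one primitive.
const sampleXml =
  '<w:p><w:ins w:id="7" w:author="SafeDocX" w:date="2026-05-08T00:00:00Z">' +
  "<w:r><w:t>hello</w:t></w:r></w:ins></w:p>";
const sampleRevisions = findRevisions(sampleXml);
const sampleAllValid = sampleRevisions.every(isValidRevision);
```

In a real test the sample string would be replaced by the XML produced after invoking each primitive with `ctx`, with one case per Table A row.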

The litmus test (Group B)

The replace_text round-trip test deliberately locks in two claims:

| Claim | Result |
|---|---|
| AI author identity IS preserved through `compareDocuments` | ✅ Asserted |
| AI date timestamps are NOT preserved (regenerated by comparison) | ✅ Asserted as a gap |

This is a known partial-correctness boundary that #126 (remove comparison from default save path) will close. The test asserts the current gap explicitly so future maintainers get an alert when comparison gains date-preservation:

expect(
  allDatesAreFixed,
  'Comparison currently regenerates dates. If this test now passes (all dates preserved), comparison gained date-preservation — update this test to assert preservation instead. See #126.',
).toBe(false);
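
How a flag like `allDatesAreFixed` could be derived from the pre- and post-comparison tuples is sketched below. The tuple fields are the ones listed in the peer-review notes; everything else (`RevisionTuple`, `authorsPreserved`, `datesAllEqual`, `FIXED_DATE`, and the sample data) is hypothetical and only simulates what a comparison run might return.

```typescript
// Tuple shape named in the peer-review notes; field semantics are assumed.
interface RevisionTuple {
  kind: string;
  id: string;
  author: string;
  date: string;
  textContent: string;
}

// Hypothetical fixed timestamp supplied through ctx when the primitive runs.
const FIXED_DATE = "2026-01-01T00:00:00Z";

// True when every pre-compare revision, matched by kind and text, still has
// the same author after comparison.
function authorsPreserved(pre: RevisionTuple[], post: RevisionTuple[]): boolean {
  return pre.every((b) =>
    post.some(
      (a) => a.kind === b.kind && a.textContent === b.textContent && a.author === b.author,
    ),
  );
}

// True when there is at least one revision and all of them carry the fixed date.
function datesAllEqual(revs: RevisionTuple[], fixed: string): boolean {
  return revs.length > 0 && revs.every((r) => r.date === fixed);
}

const preCompare: RevisionTuple[] = [
  { kind: "ins", id: "1", author: "SafeDocX", date: FIXED_DATE, textContent: "new" },
  { kind: "del", id: "2", author: "SafeDocX", date: FIXED_DATE, textContent: "old" },
];
// Simulated post-comparison state: same authors and text, regenerated ids and dates.
const postCompare: RevisionTuple[] = [
  { kind: "ins", id: "10", author: "SafeDocX", date: "2026-05-08T02:00:00Z", textContent: "new" },
  { kind: "del", id: "11", author: "SafeDocX", date: "2026-05-08T02:00:00Z", textContent: "old" },
];

const authorIdentityKept = authorsPreserved(preCompare, postCompare);
const allDatesAreFixed = datesAllEqual(postCompare, FIXED_DATE);
```

Comparing whole tuples rather than authors alone is what lets a test notice a wrapper that was regenerated with the same author but a new date.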

SUPPORT.md updates

Peer review

| Reviewer | Severity | Finding | Resolution |
|---|---|---|---|
| Codex | Medium | SUPPORT.md verification scope drift (22 Table A rows but only 20 verified) | Confirmed correct: the 2 unverified are comparison-time legacy paths; explicitly noted in their existing notes |
| Codex | High | Group B assertions too weak; wouldn't catch wrapper regeneration with same author | Fixed: strengthened to capture `{kind, id, author, date, textContent}` tuples; the strengthened test caught a real gap (date regeneration), which is now locked in as a known limitation |
| Codex | Low | Group C is tool-integration, not MCP dispatch | Fixed: describe block renamed to "Tool integration through SessionManager: canonical revision emission"; comment clarifies scope |
| Both | Normal | rPrChange and addCommentReply gaps need follow-ups | Filed as #173 (rPrChange) and #174 (addCommentReply); SUPPORT.md cross-references both |
| Gemini | | "Approval Recommended. Coverage maps perfectly to the 20 primary emitters" | Confirmed |

Verification

npm run build -w @usejunior/docx-core   # clean
npm run lint:workspaces                  # clean
npm run test:run -w @usejunior/docx-core
# Test Files  83 passed (83)
# Tests  1208 passed | 1 skipped (1209)
npm run test:run -w @usejunior/docx-mcp
# Test Files  71 passed (71)
# Tests  720 passed (720)

Total impact across the #120 chain

| Sub-PR | Tests added (cumulative) |
|---|---|
| #135 [120.0] foundation | docx-core 805 → 810 |
| #136 [120.1] text | 810 → 1102 (+9) |
| #137 [120.2] document | 1102 → 1110 |
| #138 [120.3] comments | 1118 → 1147 |
| #139 [120.4] footnotes | 1147 → 1155 |
| #140 [120.5] layout | 1155 → 1166 |
| #141 [120.6] clear_formatting | docx-mcp 696 → 696 |
| #142 [120.7] author config | docx-mcp 696 → 710 |
| #143 [120.8] regression (this PR) | +24 across both packages |

Total chain: ~138 new tests covering the canonical-emission shift end-to-end.

Follow-ups filed (out of scope)

Closes

…on (#143)

Closes #143. **FINAL sub-PR for #120** — closes out the umbrella shift.

Builds the comprehensive regression suite that locks in the canonical-
emission contract introduced by #135-#142. Three test groups:

**Group A — Per-primitive emission** (10 tests in docx-core):
For every Table A primitive, invoke with ctx and assert the expected
revision element is emitted with valid w:id/w:author/w:date attributes.
Walks SUPPORT.md as the source of truth. Locks current behavior
including the known rPrChange gap (#173) and addCommentReply Table B
classification gap (#174).

**Group B — Round-trip with comparison** (4 tests in docx-core):
Apply primitive with ctx, run compareDocuments, assert AI revisions
survive semantically through the comparison pipeline. The litmus test
for the umbrella's correctness story.

The replace_text round-trip test deliberately locks in TWO claims:
- AI author identity IS preserved through compareDocuments
- AI date timestamps are NOT preserved (regenerated by comparison)

This is a known partial-correctness boundary that #126 (remove
comparison from default save path) will close. The test asserts the
current gap explicitly so future maintainers get an alert when
comparison gains date-preservation — the failure message tells them to
update the assertion to require preservation, marking #126 progress.

**Group C — Tool integration through SessionManager** (10 tests in
docx-mcp): For every Table A MCP tool, exercise via SessionManager +
save + unzip. NOT full MCP-dispatch end-to-end (those would also
exercise server.ts CallToolRequestSchema handler) — naming clarifies
the scope. Asserts revision elements with w:author='SafeDocX' in the
saved buffer.
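
The saved-buffer assertion in Group C can be sketched as follows,
assuming the unzip step has already produced a map from part name to
XML text. `PartMap` and `collectRevisionAuthors` are hypothetical names
for illustration, the sample parts are hand-written, and the real test
reads the parts from the buffer that SessionManager saves.

```typescript
// Hypothetical stand-in for the unzipped .docx: part name -> XML text.
type PartMap = Record<string, string>;

// Collect the distinct w:author values found on w:ins, w:del and w:comment
// elements across all parts.
function collectRevisionAuthors(parts: PartMap): string[] {
  const seen: string[] = [];
  for (const name of Object.keys(parts)) {
    const xml = parts[name] ?? "";
    const re = /<w:(?:ins|del|comment)\b[^>]*\bw:author="([^"]*)"/g;
    let m: RegExpExecArray | null;
    while ((m = re.exec(xml)) !== null) {
      const author = m[1] ?? "";
      if (seen.indexOf(author) === -1) seen.push(author);
    }
  }
  return seen;
}

// Hand-written parts imitating a saved document with one insertion and one comment.
const savedParts: PartMap = {
  "word/document.xml":
    '<w:p><w:ins w:id="1" w:author="SafeDocX" w:date="2026-05-08T00:00:00Z">' +
    "<w:r><w:t>added</w:t></w:r></w:ins></w:p>",
  "word/comments.xml":
    '<w:comment w:id="0" w:author="SafeDocX" w:date="2026-05-08T00:00:00Z">' +
    "<w:p><w:r><w:t>note</w:t></w:r></w:p></w:comment>",
};

const savedAuthors = collectRevisionAuthors(savedParts);
// The Group C expectation: every emitted revision is attributed to SafeDocX.
const onlySafeDocX = savedAuthors.length === 1 && savedAuthors.indexOf("SafeDocX") === 0;
```
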

SUPPORT.md updates:

- 'Verified by [120.8] (#143) regression test.' note appended to each
  of the 20 write-time emitter rows in Table A.
- compare_documents and save (tracked branch) rows are intentionally
  NOT marked as verified (they're comparison-time, not write-time).
- replaceParagraphTextRange row reflects the rPrChange follow-up #173.
- addCommentReply row reflects the #174 Table B classification.

Peer review: Gemini (LGTM) + Codex (4 findings, all addressed):
1. SUPPORT.md verification scope drift — confirmed only 20 of 22 Table
   A rows are verified; the 2 unverified are comparison-time legacy.
2. Group B too weak — strengthened to capture pre/post-compare
   revision tuples and lock in the date-regeneration gap explicitly.
3. Group C is tool-integration not MCP dispatch — describe block
   renamed for clarity.
4. rPrChange and addCommentReply gaps — filed as #173 and #174 with
   SUPPORT.md cross-references.

Verification:
  npm run build -w @usejunior/docx-core  # clean
  npm run lint:workspaces                # clean
  npm run test:run -w @usejunior/docx-core
    Test Files  83 passed (83)
    Tests  1208 passed | 1 skipped (1209)
  npm run test:run -w @usejunior/docx-mcp
    Test Files  71 passed (71)
    Tests  720 passed (720)

After this lands, #120 fully closes out and the umbrella checklist on
#118 ticks #120. Total tests added across the entire #120 chain:
1102 → 1208 (docx-core, +106) and 688 → 720 (docx-mcp, +32) = +138
new tests covering the canonical-emission shift end-to-end.

Follow-ups filed (out of scope for this PR):
- #165 — accept_changes side-part extension
- #171 — inferStartingRevisionIdState side-part scanning
- #173 — replaceParagraphTextRange rPrChange emission
- #174 — addCommentReply Table B reclassification

No changes to baselines/.

@github-actions github-actions Bot added the test label May 8, 2026
@stevenobiajulu stevenobiajulu enabled auto-merge (squash) May 8, 2026 02:19
- Remove unused W import and ISO_Z_RE constant
- Correct epic label from 'Document Editing' to 'Document Comparison'
- Type-strengthen toPartMap so Record<K, string> destructuring is well-typed

CI's tsc was stricter than local lint cache; these errors only surfaced
on the clean CI build.
@stevenobiajulu stevenobiajulu merged commit 7dc0513 into main May 8, 2026
22 checks passed
@codecov

codecov Bot commented May 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@stevenobiajulu stevenobiajulu deleted the 143-regression-canonical-emission-20260508 branch May 27, 2026 17:07
This was referenced Jun 4, 2026

Development

Successfully merging this pull request may close these issues.

Make tracked changes the canonical representation for the supported surface
