feat(seq-fields): import, resolve, and export SEQ sequence fields (SD-3018) #3636
Add a shared SEQ instruction parser for later import, layout, API, rebuild, and F9 paths. The parser preserves raw instruction text, normalizes only field dispatch, handles quoted and unquoted arguments, parses SEQ-specific switches, records unknown switches, and reuses the existing page-number general-format mapping instead of copying format tables. Parser tests cover the Phase 1 required cases plus pair-reviewer fixes for raw instruction preservation, shared keyword dispatch, and quoted-only unescaping.
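To illustrate the parsing rules this commit describes (case-insensitive keyword dispatch, quoted and unquoted arguments, SEQ-specific switches, unknown switches recorded, raw text preserved), here is a minimal sketch. It is not the contents of seq-instruction.js: the names `ParsedSeqSketch`, `tokenizeSketch`, and `parseSeqInstructionSketch`, and the output field names, are hypothetical, and attached forms such as \r0 are left out.

```typescript
// Hypothetical result shape; the real parser's output fields may differ.
interface ParsedSeqSketch {
  raw: string; // instruction text exactly as received
  identifier: string | null;
  mode: 'next' | 'current';
  hidden: boolean;
  restartNumber: number | null;
  headingLevel: number | null;
  generalFormat: string | null;
  numericPicture: string | null;
  unknownSwitches: string[];
}

// Split on whitespace, keep double-quoted arguments together, and
// unescape \" and \\ only inside quotes.
function tokenizeSketch(text: string): string[] {
  const tokens: string[] = [];
  let i = 0;
  while (i < text.length) {
    if (/\s/.test(text[i])) { i++; continue; }
    let token = '';
    if (text[i] === '"') {
      i++; // opening quote
      while (i < text.length && text[i] !== '"') {
        if (text[i] === '\\' && (text[i + 1] === '"' || text[i + 1] === '\\')) i++;
        token += text[i++];
      }
      i++; // closing quote
    } else {
      while (i < text.length && !/\s/.test(text[i])) token += text[i++];
    }
    tokens.push(token);
  }
  return tokens;
}

function toIntSketch(value: string | undefined): number | null {
  return value !== undefined && /^\d+$/.test(value) ? Number(value) : null;
}

function parseSeqInstructionSketch(raw: string): ParsedSeqSketch | null {
  const [keyword, ...rest] = tokenizeSketch(raw);
  // Only the keyword comparison is normalized; `raw` is stored untouched.
  if (!keyword || keyword.toUpperCase() !== 'SEQ') return null;
  const parsed: ParsedSeqSketch = {
    raw, identifier: null, mode: 'next', hidden: false, restartNumber: null,
    headingLevel: null, generalFormat: null, numericPicture: null, unknownSwitches: [],
  };
  for (let i = 0; i < rest.length; i++) {
    const token = rest[i];
    if (!token.startsWith('\\')) {
      if (parsed.identifier === null) parsed.identifier = token;
      continue;
    }
    switch (token.toLowerCase()) {
      case '\\n': parsed.mode = 'next'; break;
      case '\\c': parsed.mode = 'current'; break;
      case '\\h': parsed.hidden = true; break;
      case '\\r': parsed.restartNumber = toIntSketch(rest[++i]); break;
      case '\\s': parsed.headingLevel = toIntSketch(rest[++i]); break;
      case '\\*': parsed.generalFormat = rest[++i] ?? null; break;
      case '\\#': parsed.numericPicture = rest[++i] ?? null; break;
      default: parsed.unknownSwitches.push(token); // kept, not dropped
    }
  }
  return parsed;
}
```

In this sketch a non-SEQ instruction returns null, so callers can use the same function both as a type check and as a parser.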
Route SEQ field preprocessing case-insensitively and add a v2 sequenceField importer before passthrough so synthetic sd:sequenceField nodes become PM sequenceField nodes instead of being dropped. Reuse the shared SEQ instruction parser in the sequence field translator, carry parsed import attrs onto the PM node, mark imported cached results stale, and keep raw instruction export behavior unchanged. Add focused preprocessing/import/translator tests covering uppercase and lowercase complex and fldSimple SEQ fields plus cached result preservation.
Add a pure SequenceFieldEvaluator next to the shared SEQ parser so layout, API, and update paths can share counter semantics without importing editor or rendering internals. The evaluator owns per-identifier counters, heading serial tracking for \s resets, explicit \r reset precedence, conservative field-argument fallback behavior, hidden-result handling, and shared contract-based number formatting. Add focused coverage for the required Phase 3 numbering, reset, repeat-current, hidden, formatting, empty-identifier, field-argument, and initial-counter cases.
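The counter semantics listed in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the SequenceFieldEvaluator itself: the class and field names are hypothetical, identifiers are compared case-insensitively by assumption, a heading at level L is assumed to restart sequences keyed to level L or deeper, and formatting is reduced to plain decimal.

```typescript
// Hypothetical input shape for one SEQ field occurrence.
interface SeqFieldSketch {
  identifier: string;
  mode: 'next' | 'current';
  hidden?: boolean;
  restartNumber?: number | null; // \r N
  headingLevel?: number | null;  // \s N
}

class SequenceEvaluatorSketch {
  private counters = new Map<string, number>();
  private headingSerials = new Map<number, number>(); // level -> serial
  private seenSerials = new Map<string, number>();    // identifier:level -> serial

  // Call once per heading, in document order.
  noteHeading(level: number): void {
    // Assumption: a level-L heading bumps the serial for L and all deeper levels.
    for (let l = level; l <= 9; l++) {
      this.headingSerials.set(l, (this.headingSerials.get(l) ?? 0) + 1);
    }
  }

  // Call once per SEQ field, in document order. Returns the display text.
  evaluate(field: SeqFieldSketch): string {
    const key = field.identifier.toLowerCase();
    let value = this.counters.get(key) ?? 0;
    const serialKey = `${key}:${field.headingLevel ?? ''}`;
    const serial = field.headingLevel != null
      ? this.headingSerials.get(field.headingLevel) ?? 0
      : null;

    if (field.restartNumber != null) {
      // An explicit \r N wins over a heading-level reset.
      value = field.restartNumber;
    } else {
      // \s N: restart when a qualifying heading appeared since the last use.
      if (serial !== null && this.seenSerials.get(serialKey) !== serial) value = 0;
      if (field.mode === 'next') value += 1; // \c repeats the current value
    }

    if (serial !== null) this.seenSerials.set(serialKey, serial);
    this.counters.set(key, value);
    return field.hidden ? '' : String(value); // \h still updates the counter
  }
}
```

Keeping the class free of editor or rendering imports is what lets a layout pass and a transaction-based updater share it: each creates one instance per linear pass and feeds it headings and fields in order.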
Add SEQ token metadata to TextRun and emit sequence-field layout tokens without relying on a fake zero placeholder. Resolve SEQ display in a post-assembly toFlowBlocks pass using the shared SequenceFieldEvaluator so paragraph, table, list-compatible, and cache-hit blocks are renumbered in document order before layout measurement. Cover numbering modes, formatting, heading-level restarts, story isolation, table traversal, and FlowBlockCache renumbering with focused layout-adapter tests.
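The post-assembly pass described here can be pictured as a single document-order walk that descends into table cells. The sketch below is an assumption-laden illustration: `RunSketch`, `BlockSketch`, and `resolveSeqTokensSketch` are hypothetical stand-ins for the real TextRun and FlowBlock contracts, a bare per-identifier counter replaces the shared evaluator, and runs are mutated in place for brevity.

```typescript
// Hypothetical, simplified stand-ins for the layout contracts.
type RunSketch = { text: string; token?: 'seq'; seqId?: string };
type BlockSketch =
  | { kind: 'paragraph'; runs: RunSketch[] }
  | { kind: 'table'; rows: BlockSketch[][][] }; // rows -> cells -> blocks

function resolveSeqTokensSketch(
  blocks: BlockSketch[],
  counters: Map<string, number> = new Map(),
): void {
  for (const block of blocks) {
    if (block.kind === 'paragraph') {
      for (const run of block.runs) {
        if (run.token !== 'seq' || !run.seqId) continue;
        const next = (counters.get(run.seqId) ?? 0) + 1;
        counters.set(run.seqId, next);
        // Overwrite whatever text the run carried (for example cached text),
        // so blocks reused from a cache are renumbered like fresh ones.
        run.text = String(next);
      }
    } else {
      // Share the same counters so numbering continues through table cells.
      for (const row of block.rows) {
        for (const cell of row) resolveSeqTokensSketch(cell, counters);
      }
    }
  }
}
```

Because the numbering is derived from token metadata on every pass rather than from text baked in at conversion time, a block that comes back from a cache does not carry a stale number forward.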
Add a shared transaction helper that recomputes body SEQ fields in ProseMirror state using the shared parser/evaluator and style-aware heading resolution from the layout adapter when converter style data is available. Wire raw field insertion, field rebuild, caption insertion/configuration, and F9 updates through the helper while preserving existing TOC and document-stat dispatch behavior. Field discovery now reads sequenceField.resolvedNumber as the resolved display text. Cover updater scopes, caption/field wrapper recomputation, field resolver readback, and F9 SEQ refresh behavior with focused tests.
Update sequenceField export so current evaluated resolvedNumber values are authoritative, including empty current results for hidden fields. Preserve imported cached child content when results are not current, with a fallback to non-current resolvedNumber only when no cached child content exists. Add export-routing tests for current result runs, hidden empty output, cached-content preservation, resolvedNumber fallback, and raw instructionTokens preservation. Add PM-updater coverage documenting Phase 7 conservative field-argument behavior: cached text wins, no-cache references repeat the previous counter, and counters do not advance.
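The export precedence in this commit reduces to a three-step choice. The sketch below shows that ordering only; `SeqExportInputSketch` and `pickSeqExportTextSketch` are hypothetical names, and representing cached child content as a nullable string is an assumption made for brevity.

```typescript
// Hypothetical, flattened view of what the exporter knows about one SEQ node.
interface SeqExportInputSketch {
  resolvedNumber: string | null;
  resolvedNumberIsCurrent: boolean;
  cachedChildText: string | null; // imported cached result, if any
}

function pickSeqExportTextSketch(node: SeqExportInputSketch): string {
  // 1. A current evaluated result is authoritative, even when it is empty
  //    (for example a hidden field).
  if (node.resolvedNumberIsCurrent) return node.resolvedNumber ?? '';
  // 2. Otherwise keep the cached content that came in with the document.
  if (node.cachedChildText !== null) return node.cachedChildText;
  // 3. Only with no cached content fall back to a non-current number.
  return node.resolvedNumber ?? '';
}
```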
Handle Word-emitted SEQ switches such as \r0 and \s1 by normalizing attached numeric values through the existing restart parser path. This fixes the NDA browser-test case where hidden level reset fields imported with restartNumber null, causing the first visible level2 SEQ field to render as 2 instead of 1. Adds parser coverage for attached restart switches and a layout regression for hidden restart-zero fields seeding the next visible value.
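One way to picture the normalization is as a token-level split applied before the normal switch handling, so \r0 becomes the same two tokens as \r 0. This is a sketch of the idea rather than the actual change; `splitAttachedSwitchSketch` and `normalizeTokensSketch` are hypothetical helpers.

```typescript
// Split an attached numeric value off \r or \s; leave other tokens alone.
function splitAttachedSwitchSketch(token: string): string[] {
  const match = /^(\\[rs])(\d+)$/i.exec(token);
  return match ? [match[1], match[2]] : [token];
}

// Apply the split across a token list before switch parsing.
function normalizeTokensSketch(tokens: string[]): string[] {
  const out: string[] = [];
  for (const token of tokens) out.push(...splitAttachedSwitchSketch(token));
  return out;
}

// normalizeTokensSketch(['SEQ', 'level2', '\\h', '\\r0'])
// yields ['SEQ', 'level2', '\\h', '\\r', '0']
```

With the value separated, the existing restart path sees a restart number of 0 instead of null, which is what lets a hidden reset field seed the next visible value as 1.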
Build the stat/SEQ update transaction from the current editor state after TOC updates dispatch their own transactions. This preserves TOC edits and shifted SEQ positions when F9 updates both TOC and SEQ fields.
Use the shared case-insensitive SEQ instruction check when caption discovery falls back to sequenceField nodes. This keeps imported lowercase seq captions discoverable by the API.
Keep the sequenceField schema default at ARABIC for backward compatibility while the shared parser continues to default parsed instructions to Arabic.
Remember whether the original F9 selection contained a SEQ field before TOC updates dispatch their own transactions. This lets the fresh post-TOC transaction recompute SEQ fields even when TOC edits shift the selected field position.
Add sequenceFieldAttrsFromParsed beside the SEQ parser and use it from import, raw field insertion, caption insertion/configuration, and PM recompute. This keeps parsed instruction attrs and null/default normalization in one place.
Empty text runs were flowing into the segment/space accounting, where a trailing run boundary could be counted as a justification space and an oversized empty run's font size could inflate the visible line height. Short-circuit empty runs after recording font state so they no longer produce phantom spaces, segments, or height.
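The shape of the guard can be sketched in isolation. The example below is a deliberately simplified stand-in for paragraph measurement: `MeasuredRunSketch`, `LineStatsSketch`, and `measureRunsSketch` are hypothetical, and space counting here just counts literal space characters instead of modelling run-boundary justification.

```typescript
interface MeasuredRunSketch { text: string; fontSize: number; }
interface LineStatsSketch {
  segments: number;
  spaces: number;
  maxFontSize: number;  // drives visible line height in this sketch
  lastFontSize: number; // font state carried forward to later runs
}

function measureRunsSketch(runs: MeasuredRunSketch[]): LineStatsSketch {
  const stats: LineStatsSketch = { segments: 0, spaces: 0, maxFontSize: 0, lastFontSize: 0 };
  for (const run of runs) {
    // Record font state first so later runs still inherit it.
    stats.lastFontSize = run.fontSize;
    // Short-circuit: an empty run adds no segment, space, or height.
    if (run.text.length === 0) continue;
    stats.segments += 1;
    stats.spaces += run.text.split(' ').length - 1;
    stats.maxFontSize = Math.max(stats.maxFontSize, run.fontSize);
  }
  return stats;
}
```

For a 12pt "Hello world" run followed by an empty 48pt run, this sketch reports one segment, one space, and a maximum font size of 12, while still remembering 48 as the last font size.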
…lds (SD-3327) (#3637)

* feat(numpages): add numeric picture field metadata
  Extend the shared page-number field format contract with ordinal and numericPicture metadata for later NUMPAGES parser/formatter steps. Add the hidden total-page-number PM attr, expose it on the typed node attrs, and thread it through the layout adapter format mapper with focused coverage. Widen the layout engine page-number state annotation to the shared PageNumberFormat type so the expanded contract remains compile-compatible.
* feat(numpages): preserve numeric picture switches
  Extend parsePageNumberFieldSwitches so Ordinal general switches map to the new ordinal page-number format and non-empty non-zero numeric pictures are retained in pageNumberNumericPicture instead of being discarded. Keep all-zero numeric pictures on the existing zero-padding path for backwards compatibility, and parse switch arguments from the uncollapsed instruction text so quoted picture literals keep their internal whitespace.
* feat(numpages): format ordinal and picture counts
  Extend the shared page-number formatter with English ordinal output and numeric-picture dispatch for preserved NUMPAGES switches. Numeric pictures now take precedence over enum formats and zero padding, while ordinal and existing page-number formats continue through the shared enum formatter. Focused contracts tests cover ordinal suffix rules, numeric-picture dispatch, and page-number normalization.
* fix(numpages): format rebuild and shape results
  Route shape/textbox NUMPAGES through the existing simple pageNumberFormat enum path, matching the SECTIONPAGES branch without adding shape numeric-picture or zero-padding support. Format fields.rebuild total-page-number output with formatPageNumberFieldValue using preserved pageNumberFormat, pageNumberZeroPadding, and pageNumberNumericPicture attrs, and cover rebuilt resolvedText/text content for enum and numeric-picture cases.
* fix(numpages): preserve inserted field switches
  Parse raw NUMPAGES instructions during fields.insert and pass the resulting page-number attrs into total-page-number creation. This preserves numeric picture and general format switches for header/footer NUMPAGES insertion without changing the existing body-insertion restriction.
* test(numpages): cover switched field evaluation
  Add focused coverage for complex and fldSimple NUMPAGES imports, numeric-picture PM/export round-trip, live total-page-count formatting, and shape NUMPAGES formatting. Also thread numeric-picture attrs through the existing export formatter and align the single-block totalPageCount resolver with the formatted runtime path so the new tests exercise the intended behavior.
* fix(numpages): preserve F9 field formatting
  Format total-page-number refreshes in FieldUpdate with the same preserved page-number metadata used by rebuild/export paths. Adds regression coverage for numeric-picture NUMPAGES updates.
* fix(numpages): expose preserved field instructions
  Prefer stored total-page-number instructions during field discovery so fields.list/get report switched NUMPAGES instructions preserved by insert/import.
* fix(numpages): format legacy SVG shape totals
  Apply simple pageNumberFormat handling to NUMPAGES in the legacy SVG text helper, matching the DomPainter shape path and existing PAGE/SECTIONPAGES behavior.
* fix(numpages): preserve inserted picture whitespace
  Parse NUMPAGES insert instructions from the original raw instruction so quoted numeric-picture literals keep significant internal whitespace.
* fix(numpages): format edit-mode total pages
  Apply preserved NUMPAGES formatting in header/footer edit-mode NodeViews and cached DOM refreshes so switched totals match layout/export behavior.
* fix(numpages): preserve split header field pictures
  Join header/footer instrText fragments verbatim so split quoted NUMPAGES numeric pictures do not gain synthetic spaces before parsing.
* fix(numpages): preserve quoted picture instructions
  Keep significant whitespace inside quoted NUMPAGES numeric-picture instructions while still normalizing field-code whitespace outside quotes.
* refactor(numpages): share field format attrs mapper
  Route NUMPAGES formatting call sites through the existing getPageNumberFieldFormat helper and make zeroPadding handling consistent.
* test(numpages): cover ordinal and split switch edges
  Add regression coverage for no-space split instrText switches, ordinal shape NUMPAGES rendering, and three-digit teen ordinal suffixes.
* fix(converter): preserve split page field switch args
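The English ordinal suffix rule mentioned in these commits, including the three-digit teen case, can be sketched as below. `formatEnglishOrdinalSketch` is a hypothetical name and this is the standard rule rather than a copy of the shared formatter.

```typescript
function formatEnglishOrdinalSketch(n: number): string {
  const lastTwo = Math.abs(n) % 100;
  const lastDigit = lastTwo % 10;
  let suffix = 'th';
  // 11, 12, and 13 (and 111, 212, ...) always take "th".
  if (lastTwo < 11 || lastTwo > 13) {
    if (lastDigit === 1) suffix = 'st';
    else if (lastDigit === 2) suffix = 'nd';
    else if (lastDigit === 3) suffix = 'rd';
  }
  return `${n}${suffix}`;
}

// formatEnglishOrdinalSketch(21)  -> "21st"
// formatEnglishOrdinalSketch(111) -> "111th"
```

Checking the last two digits before the last digit is what keeps 111, 112, and 113 from being rendered as "111st", "112nd", and "113rd".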
Merged commit 2d1d907 into luccas/sd-3007-feature-page-references
…ers (SD-3007) (#3632)

* feat(page-reference): resolve PAGEREF fields to live target page numbers
  Wire PAGEREF field references through to dynamic resolution against the paginated layout, so cross-references display the target's actual page number, format, and relative position.
  - Parse PAGEREF instructions into typed switches (\h, \p, \* general format, \# numeric picture, CHARFORMAT/MERGEFORMAT) in a shared pageref-instruction parser; store results as typed node attrs.
  - Add buildPageRefAnchorMap to locate bookmark targets within the resolved layout (fragment PM ranges, with paragraph/table run-range fallbacks) and expose target display page metadata.
  - Resolve PAGEREF run text in resolveLayout: format the target display number via numeric picture / general format, emit relative-position text ("on page N", "above"/"below"), and bump the paint cache version when the resolved page changes.
  - Add formatIntegerWithNumericPicture for the Word \# picture subset (placeholders, grouping, sign slots, fractional, section selection).
  - Preserve instruction tokens for lossless export and capture the first instrText run rPr for CHARFORMAT field run properties.
  - Gate body page-token resolution behind SD_BODY_PAGE_TOKENS.
* fix(page-reference): locate list item anchors
* fix(page-reference): resolve table cell refs
* fix(page-reference): resolve list item refs
* fix(page-reference): preserve refs on repaint
* fix(page-reference): apply charformat props
* fix(page-reference): export charformat props
* fix(page-reference): scope table anchors to fragments
* fix(page-reference): scope split row anchors
* fix(page-reference): update nested line slices
* fix(page-reference): ignore anchor gaps
* fix(page-reference): avoid split-line duplication
* fix(page-reference): allow nearby bookmark markers
* test(page-reference): cover planned pageref gaps
* docs(page-reference): document pageref constraints
* fix(page-reference): decouple bookmarks from page token flag
* feat(seq-fields): import, resolve, and export SEQ sequence fields (SD-3018) (#3636)
* feat(seq-fields): add shared SEQ instruction parser
* feat(seq-fields): import SEQ fields case-insensitively
* feat(seq-fields): add shared SEQ evaluator
* feat(seq-fields): resolve SEQ display during layout
* feat(seq-fields): recompute SEQ fields for API updates
* fix(seq-fields): export current SEQ results
* fix(seq-fields): parse attached restart switches
* fix(seq-fields): recompute F9 after TOC state changes
* fix(seq-fields): detect lowercase caption SEQ fallback
* fix(seq-fields): restore legacy format default
* fix(seq-fields): preserve F9 SEQ selection across TOC edits
* refactor(seq-fields): share parsed attr projection
* fix(converter): parse attached SEQ format switches
* fix(measuring): skip empty text runs during paragraph measurement
* feat(numpages): apply field-switch formatting to total page count fields (SD-3327) (#3637)
* feat(numpages): add numeric picture field metadata
* feat(numpages): preserve numeric picture switches
* feat(numpages): format ordinal and picture counts
* fix(numpages): format rebuild and shape results
* fix(numpages): preserve inserted field switches
* test(numpages): cover switched field evaluation
* fix(numpages): preserve F9 field formatting
* fix(numpages): expose preserved field instructions
* fix(numpages): format legacy SVG shape totals
* fix(numpages): preserve inserted picture whitespace
* fix(numpages): format edit-mode total pages
* fix(numpages): preserve split header field pictures
* fix(numpages): preserve quoted picture instructions
* refactor(numpages): share field format attrs mapper
* test(numpages): cover ordinal and split switch edges
* fix(converter): preserve split page field switch args
* fix(doc-api): normalize section page numbering format in sections resolver
  Validate the raw numbering format against the supported SectionPageNumberingFormat values before mapping a section range to a section domain, falling back to undefined for unrecognized formats. Return undefined when no page numbering fields are present so empty numbering objects don't shadow the parsed page numbering.
Summary
Adds end-to-end support for Word SEQ (sequence) fields, the mechanism behind caption numbering (Figure 1, Table 2, …) and other auto-incrementing counters. SEQ numbers are now parsed from OOXML on import, resolved live during layout in document order, recomputed through the Document API and F9, and written back on export.

The design centers on two shared, framework-agnostic primitives, an instruction parser and a stateful evaluator, that every code path (import, layout, Document API, export) reuses, so numbering semantics live in exactly one place.

What changed

Shared SEQ primitives (super-converter/field-references/shared/)
- seq-instruction.js: tokenizer + parser for SEQ <id> <switches>. Handles \n (next), \c (current), \h (hide), \r N (restart-at), \s N (restart-at-heading-level), \* FORMAT (general format), \# picture (numeric picture), attached numeric switches (\r0, \s1), quoted/escaped tokens, and general-format mapping. Exposes isSeqInstruction, normalizeSeqIdentifier, and sequenceFieldAttrsFromParsed (projects parsed metadata into PM node attrs).
- seq-evaluator.js: SequenceFieldEvaluator, one instance per linear pass over a story. Maintains per-identifier counters, heading-level serials for \s resets, restart-number/level handling, current-vs-next mode, \h suppression, and value formatting (numeric picture / page-number format / plain). Field-argument (bookmark-reference) SEQ is conservatively stubbed pending bookmark resolution.

Import (v2/importer/, v3/handlers/sd/sequenceField/)
- … sequenceFieldEntity in docxImporter so SEQ fields import via the v3 translator.
- … (extractFieldKeyword), replacing the brittle startsWith('SEQ ') checks.

Layout resolution (layout-adapter/)
- TextRun contract fields: token: 'seq' and seqMetadata.
- sequence-field.ts converter emits a SEQ token (with cached text as fallback) instead of baking in a number at conversion time.
- resolve-sequence-fields.ts runs a single linear pass over fully-assembled FlowBlock[] (paragraphs, tables, lists) after block assembly, resolving each SEQ token in document order. This is cache-safe: cached paragraphs still contribute their preserved token metadata to the pass.

Document API (document-api-adapters/)
- sequence-field-updater.ts: updateSequenceFieldsInTransaction recomputes SEQ nodes within a transaction, scoped to all/range/identifier, resolving heading levels via the converter context. Body-story only by design.
- … insert/configure, field insert/rebuild, and field-resolver (returns resolvedNumber for sequence fields).
- caption-resolver now uses isSeqInstruction.

Export
- … (resolvedNumberIsCurrent) when present, falling back to existing content/cached number, with marks preserved.

F9 / field-update (extensions/field-update/)

Schema (extensions/sequence-field/)
- … fieldArgument, sequenceMode, hideResult, restartNumber, hasGeneralFormat, pageNumberFieldFormat, numericPictureFormat, resolvedNumberIsCurrent. Empty SEQ now renders as '' rather than '0'.

Testing
New unit + integration coverage across the stack: seq-instruction, seq-evaluator, resolve-sequence-fields, sequenceFieldImporter.integration, sequence-field-export-routing, sequence-field-updater, caption/field plan-engine wrappers (*.seq-fields.test.ts), field-update, caption-resolver, and field-resolver. (~1,300 lines of tests added.)