feat: render images inside Word textboxes (SD-2804) by tupizz · Pull Request #3207 · superdoc-dev/superdoc

tupizz · 2026-05-07T23:14:40Z

Summary

Renders inline w:drawing images inside Word textbox content. Previously, the importer silently stripped the image from the textbox, so the textbox rendered as an empty box even though export round-tripped the image untouched.

Linear: SD-2804

Spec basis

ECMA-376 §20.4.2.38 (CT_TxbxContent) defines textbox content as EG_BlockLevelElts (1..unbounded) — i.e. a textbox can hold the same content as the document body, with three exclusions: cross-story refs (comments/footnotes/endnotes), VML, and nested txbxContent. Notably, paragraphs inside w:txbxContent carry the same CT_P content model as body paragraphs, including runs with inline w:drawing images.

The text-only extractor used in extractTextFromTextBox.handleRun only walked w:t / w:tab / w:br / sd:autoPageNumber / sd:totalPageNumber — w:drawing was silently ignored.
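To make the gap concrete, here is a minimal sketch of a run walker of the kind described above. It is not the project's code: the xml-js-style `XmlNode` shape, the `Part` union, and the `walkRun` name are assumptions for illustration, and the `sd:autoPageNumber` / `sd:totalPageNumber` cases are left out for brevity. Only the element names come from this PR.

```typescript
// Hypothetical xml-js-style node shape; the real importer's types may differ.
interface XmlNode {
  name?: string;
  type?: string;
  text?: string;
  attributes?: Record<string, string>;
  elements?: XmlNode[];
}

// Simplified part model for the sketch (the real TextPart contract differs).
type Part =
  | { kind: 'text'; text: string }
  | { kind: 'image'; drawing: XmlNode };

// Walk the direct children of one w:r element and collect renderable parts.
function walkRun(run: XmlNode): Part[] {
  const parts: Part[] = [];
  for (const child of run.elements ?? []) {
    switch (child.name) {
      case 'w:t': {
        const text = (child.elements ?? []).map((n) => n.text ?? '').join('');
        parts.push({ kind: 'text', text });
        break;
      }
      case 'w:tab':
        parts.push({ kind: 'text', text: '\t' });
        break;
      case 'w:br':
        parts.push({ kind: 'text', text: '\n' });
        break;
      case 'w:drawing':
        // Without this case the drawing falls through to `default`
        // and disappears, which is the bug described above.
        parts.push({ kind: 'image', drawing: child });
        break;
      default:
        break; // other run content is still skipped in this sketch
    }
  }
  return parts;
}
```

A text-only walker is the same switch without the `w:drawing` case, so nothing fails loudly: the image simply never reaches the parts list.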

Approach

Minimum surgical change — extend the existing text-parts model with one image part kind:

TextPart contract gains optional kind: 'image' plus src / width / height / alt.
Importer (extractTextFromTextBox.handleRun) branches on w:drawing, reuses the v3 handleImageNode to resolve r:embed → media path, then upgrades the path to a data URI from converter.media (the text-parts model has no downstream hydration step like body ImageRuns do).
Painter (createFallbackTextElement) renders parts with kind: 'image' as inline <img> next to text spans.

No new PM nodes, no new pm-adapter wiring, no schema changes, no NodeHandlerContext threading.
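As a rough illustration of the contract change and the painter branch, the sketch below extends a text part with the optional image fields and renders a mixed list of parts. The field names follow the list above; the optionality of `text`, the `escapeHtml` helper, and the `renderParts` function are assumptions. It returns markup strings rather than DOM nodes so it runs outside a browser, whereas the real `createFallbackTextElement` builds elements.

```typescript
// Field names follow the PR description; the exact types are assumed.
interface TextPart {
  text?: string;
  kind?: 'image';
  src?: string;
  width?: number;
  height?: number;
  alt?: string;
}

// Escape the characters that matter inside text nodes and quoted attributes.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical stand-in for the painter: image parts become inline <img>,
// everything else becomes a text span.
function renderParts(parts: TextPart[]): string {
  return parts
    .map((part) => {
      if (part.kind === 'image' && part.src) {
        const size =
          (part.width != null ? ` width="${part.width}"` : '') +
          (part.height != null ? ` height="${part.height}"` : '');
        return `<img src="${escapeHtml(part.src)}" alt="${escapeHtml(part.alt ?? '')}"${size}>`;
      }
      return `<span>${escapeHtml(part.text ?? '')}</span>`;
    })
    .join('');
}
```

Because `kind` is optional, existing text parts need no changes and only the painter has to learn one new branch, which is what keeps the change small.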

Before / after

Fixture: a DOCX with a textbox-in-header containing a single inline image.

Before: Empty textbox outline; image silently dropped on import
After: Image renders inline inside the textbox, matching what Word shows

Captured via agent-browser against the dev server: see /tmp/sd-2804-final3.png.

Test plan

Unit test: importer emits an image part in textContent.parts for an inline w:drawing inside the textbox (encode-image-node-helpers.test.js)
super-editor full suite: 12,645 tests passing
painter-dom full suite: 1,064 tests passing
pm-adapter full suite: 1,788 tests passing
layout-bridge full suite: 1,206 tests passing
Browser: upload the SD-2804 fixture, confirm the image renders inside the textbox

Out of scope (deferred)

The fixture's image is wp:inline inside a textbox run — the most common case. ECMA-376 also permits richer block-level content inside a textbox: tables, lists, SDTs, hyperlinks, fields, math. The current text-parts model can't represent those; surfacing them would need to flow w:txbxContent through the same body pipeline (handleParagraphNode recursion) and likely a container PM node (shapeTextbox schema already exists for the legacy v:pict path).

That refactor is intentionally deferred: the supplied SD-2804 fixture has only an image, and Option B was over-engineering for the immediate user-visible bug. Content beyond inline-image-in-textbox is left to a tracking issue / future PR.

…804)

ECMA-376 §20.4.2.38 (CT_TxbxContent) lets a textbox hold rich body-level content: paragraphs whose runs can carry inline w:drawing images. The text-only extractor used to silently skip those drawings, so the textbox rendered empty even though export round-tripped the image untouched.

The fix surfaces the inline drawing as a textContent part with kind='image' so the existing shape painter can render it alongside text spans:

- TextPart contract gains optional kind/src/width/height/alt fields.
- extractTextFromTextBox.handleRun branches on w:drawing, reuses the v3 wp drawing handler (handleImageNode) to resolve rId, then upgrades the path-style src to a data URI from converter.media so the painter can drop it straight into <img>.
- DomPainter's createFallbackTextElement renders image parts as inline <img> elements next to existing text spans.

Linked: SD-2745 (header-anchored floating textboxes; positions the box where this content now renders).

linear · 2026-05-07T23:14:43Z

SD-2804

github-actions · 2026-05-07T23:16:53Z

I wasn't granted permissions for the ecma-spec MCP tools, so I reviewed the diff against my knowledge of ECMA-376 (Part 1, §17 WordprocessingML and §20.4 DrawingML-WordprocessingDrawing).

Status: PASS

The OOXML element handling in this PR is spec-compliant:

w:drawing is a valid child of w:r (run inner content per CT_R), so processing it inside the run-element loop is correct.
The handler correctly looks for wp:inline or wp:anchor as the direct child of w:drawing — those are the two valid choices in CT_Drawing.
The test fixture's element nesting wps:wsp → wps:txbx → w:txbxContent → w:p → w:r → w:drawing → wp:inline → a:graphic → a:graphicData[uri] → pic:pic → pic:blipFill → a:blip[r:embed] matches the schemas (CT_WordprocessingShape, CT_TxbxContent, CT_Inline, CT_GraphicalObject, CT_Picture, CT_BlipFillProperties).
Required attributes are present where needed: wp:docPr has id and name; a:graphicData has uri; a:blip has r:embed; wp:extent has cx/cy. The optional dist* attributes on wp:inline are correctly omitted (they have schema defaults of 0).
wps:cNvSpPr@txBox="1" is the correct marker for a text-bearing shape per the wordprocessingShape (wps:) namespace.
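The nesting listed above can be walked with a few child lookups. The following is a sketch under assumptions, not the handler from the diff: the xml-js-style `XmlNode` shape and the `child` / `readInlineImage` names are hypothetical, and it reads only `wp:inline`, returning `null` for `wp:anchor`. The one hard fact it relies on is the unit conversion: 914400 EMU per inch at 96 px per inch gives 9525 EMU per pixel.

```typescript
// Hypothetical xml-js-style node shape used only for this sketch.
interface XmlNode {
  name?: string;
  attributes?: Record<string, string>;
  elements?: XmlNode[];
}

const EMU_PER_PIXEL = 9525; // 914400 EMU per inch / 96 px per inch

// Find the first direct child with the given qualified name.
function child(node: XmlNode | undefined, name: string): XmlNode | undefined {
  return node?.elements?.find((el) => el.name === name);
}

interface InlineImageRef {
  rId: string;
  widthPx: number;
  heightPx: number;
}

// Follow w:drawing > wp:inline > a:graphic > a:graphicData > pic:pic
// > pic:blipFill > a:blip and read r:embed plus the wp:extent size.
function readInlineImage(drawing: XmlNode): InlineImageRef | null {
  const inline = child(drawing, 'wp:inline');
  if (!inline) return null; // wp:anchor is not handled in this sketch

  const extent = child(inline, 'wp:extent');
  const graphicData = child(child(inline, 'a:graphic'), 'a:graphicData');
  const blip = child(child(child(graphicData, 'pic:pic'), 'pic:blipFill'), 'a:blip');
  const rId = blip?.attributes?.['r:embed'];
  if (!rId || !extent) return null;

  return {
    rId,
    widthPx: Math.round(Number(extent.attributes?.['cx'] ?? 0) / EMU_PER_PIXEL),
    heightPx: Math.round(Number(extent.attributes?.['cy'] ?? 0) / EMU_PER_PIXEL),
  };
}
```

The returned `rId` still has to be resolved through the part's relationships to a media path; that step is outside this sketch.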

One minor non-blocking note: the comment cites "ECMA-376 §20.4.2.38" for CT_TxbxContent, but txbxContent lives in the WordprocessingML namespace (Part 1, §17), not in DrawingML-WordprocessingDrawing (§20.4). I couldn't verify the exact section number without spec access, so worth a quick double-check — but it's a comment, not a code issue.

codecov-commenter · 2026-05-07T23:24:48Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

luccas-harbour · 2026-05-08T19:35:46Z

Hey @tupizz! I found a few things and left as inline comments. Ping me if you have any questions. Thanks!

Per Luccas's review on PR #3207:

- (C1) Skip hidden textbox images. handleImageNode flags wp:docPr hidden="1" via attrs.hidden, but the new image-part branch only checked attrs.src and emitted visible <img>s for them. Top-level hidden drawings are filtered later in the pipeline; image parts bypass that filtering. Gate the textParts.push on imagePm.attrs.hidden !== true so hidden textbox drawings stay hidden, matching the body-level behaviour.
- (C2) Drop the duplicated resolveImagePartSrc helper in the importer (it rejected Uint8Array, breaking Y.js binary media). Store the raw path + extension + rId on the image part. pm-adapter's hydrateImageBlocks gains a vectorShape branch that hydrates textContent.parts alongside ImageRuns, so all media path candidates and the Uint8Array → TextDecoder decoding live in a single place.
- (C3) Anchored drawings inside textboxes are out of scope: wrap / position / transform metadata isn't carried into the text-parts model. Restrict the textbox-image branch to wp:inline and document the limit in the code comment so a future fixture can extend it intentionally.
- (C4) Align inserted images to the text baseline like body inline images do (vertical-align: bottom). ECMA-376 §20.4.2.8 specifies that an inline drawing behaves "like a character glyph of similar size", and the body inline image renderer defaults to vertical-align: bottom (renderer.ts ~L5770, L5847). The textbox image part used vertical-align: middle, visibly misaligning text next to the image inside a textbox compared to outside it.
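The C1 gate and the C2 hydration step can be sketched roughly as below. This is an illustration under stated assumptions, not the code in hydrateImageBlocks: the `ImagePart` shape, the `shouldEmitImagePart` and `hydrateImagePart` names, the candidate key list, and the idea that media values hold base64 text (either as a string or as UTF-8 bytes in a Uint8Array) are all hypothetical.

```typescript
type MediaValue = string | Uint8Array;

// Assumed shape of an image part before and after hydration.
interface ImagePart {
  kind: 'image';
  src: string; // raw media path until hydrated, data URI afterwards
  extension?: string;
  rId?: string;
}

// C1: emit a part only for drawings that have a source and are not hidden.
function shouldEmitImagePart(attrs: { src?: string; hidden?: boolean }): boolean {
  return Boolean(attrs.src) && attrs.hidden !== true;
}

// C2: resolve the raw path against the media map in one place.
// Assumes each media value is base64 text, possibly stored as bytes.
function hydrateImagePart(part: ImagePart, media: Record<string, MediaValue>): ImagePart {
  if (part.src.startsWith('data:')) return part; // already hydrated

  // Hypothetical key variants; the real candidate list may differ.
  const candidates = [part.src, `word/${part.src}`, part.src.replace(/^word\//, '')];
  for (const key of candidates) {
    const value = media[key];
    if (value === undefined) continue;

    const base64 = typeof value === 'string' ? value : new TextDecoder().decode(value);
    const ext = (part.extension ?? 'png').toLowerCase();
    const subtype = ext === 'jpg' ? 'jpeg' : ext;
    return { ...part, src: `data:image/${subtype};base64,${base64}` };
  }
  return part; // leave the raw path in place when nothing matches
}
```

Keeping the lookup and decoding in a single function means a binary media entry is handled the same way for textbox parts as for body images, rather than being rejected by a second helper.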

…extbox

tupizz · 2026-05-08T23:46:21Z

@luccas-harbour went through all your points and addressed them, please check it again once you're good

tupizz self-assigned this May 7, 2026

tupizz requested review from caio-pizzol and luccas-harbour May 7, 2026 23:14

superdoc-bot Bot added review: thorough review: careful and removed review: thorough labels May 7, 2026

tupizz marked this pull request as ready for review May 7, 2026 23:18

tupizz requested a review from a team as a code owner May 7, 2026 23:18

luccas-harbour reviewed May 8, 2026

Comment thread ...itor/src/editors/v1/core/super-converter/v3/handlers/wp/helpers/encode-image-node-helpers.js Outdated

luccas-harbour reviewed May 8, 2026

Comment thread ...itor/src/editors/v1/core/super-converter/v3/handlers/wp/helpers/encode-image-node-helpers.js Outdated

luccas-harbour reviewed May 8, 2026

Comment thread ...itor/src/editors/v1/core/super-converter/v3/handlers/wp/helpers/encode-image-node-helpers.js Outdated

luccas-harbour reviewed May 8, 2026

Comment thread packages/layout-engine/painters/dom/src/renderer.ts Outdated

tupizz added 2 commits May 8, 2026 20:20

Merge branch 'main' into tadeu/sd-2804-feature-render-images-inside-t…

0bcee87

…extbox

tupizz requested a review from luccas-harbour May 8, 2026 23:43

tupizz wants to merge 3 commits into main from tadeu/sd-2804-feature-render-images-inside-textbox
