fix(docx-io): drop leading blank paragraphs in exportToDocx #4991

Merged: zbeyens merged 1 commit into udecode:main from WilliamPeralta:fix/docx-export-leading-blank-paragraphs on Jun 3, 2026
@WilliamPeralta WilliamPeralta commented Jun 3, 2026

  • Auto release

Summary

exportToDocx / htmlToDocxBlob produce a DOCX whose first page starts with ~6 empty paragraphs before any content, even when the source has no leading blanks.

Cause

wrapHtmlForDocx emits a <!DOCTYPE html> and indents the document template. @turbodocx/html-to-docx parses the full document with html-to-vdom, which keeps the DOCTYPE node and the whitespace-only text nodes between <html> / <head> / <body> and the content. convertVTreeToXML turns each top-level text node into a paragraph (buildParagraph), so they become empty <w:p> elements at the top of the document.
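To make the source of those nodes concrete, here is a rough sketch that counts whitespace-only runs between adjacent tags. It is a regex approximation written for illustration, not what html-to-vdom does internally, and `countInterTagWhitespace` plus both sample strings are hypothetical. It only counts candidate text nodes; which of them end up as paragraphs depends on the converter (for example, anything inside `<head>` is skipped).

```typescript
// Hypothetical helper: counts whitespace-only gaps between one tag's '>'
// and the next tag's '<'. Each gap is a candidate whitespace-only text node.
function countInterTagWhitespace(html: string): number {
  const gaps = html.match(/>\s+</g);
  return gaps ? gaps.length : 0;
}

// An indented template similar in shape to the one described above (assumed).
const indented = [
  '<!DOCTYPE html>',
  '<html>',
  '  <head>',
  '    <style></style>',
  '  </head>',
  '  <body>',
  '    <p>Hello</p>',
  '  </body>',
  '</html>',
].join('\n');

// The same document with no DOCTYPE and no whitespace between tags.
const tight =
  '<html><head><style></style></head><body><p>Hello</p></body></html>';

console.log(countInterTagWhitespace(indented), countInterTagWhitespace(tight));
```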

Fix

Emit tight markup with no DOCTYPE and no inter-tag whitespace. <head> is skipped by the converter, so the \n inside <style> is harmless.
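A minimal sketch of what "tight markup" could look like, assuming a wrapper that takes the body HTML and a CSS string. `wrapHtmlTight` and its parameters are hypothetical; this is not the actual `wrapHtmlForDocx` implementation, only an illustration of the shape described above.

```typescript
// Hypothetical wrapper illustrating the fix; names and signature are assumed.
function wrapHtmlTight(bodyHtml: string, css: string): string {
  // No DOCTYPE and no whitespace between structural tags, so no stray
  // top-level nodes precede the content. The newlines inside <style> sit
  // under <head>, which the PR notes is skipped by the converter.
  return (
    '<html><head><meta charset="utf-8"><style>\n' +
    css +
    '\n</style></head><body>' +
    bodyHtml +
    '</body></html>'
  );
}

const wrapped = wrapHtmlTight('<p>Hi</p>', 'p { margin: 0; }');
console.log(wrapped);
```

Building the string by concatenation rather than an indented template literal keeps formatting whitespace out of the output by construction.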

Evidence

Converting the same body via htmlToDocxBlob and counting <w:p> in word/document.xml:

| wrapper | paragraphs |
| --- | --- |
| current (<!DOCTYPE…> + indentation) | 9 (2 with text, 7 empty) |
| no inter-tag whitespace, with <!DOCTYPE> | 3 (1 leading empty) |
| no DOCTYPE, no whitespace (this PR) | 2 (clean) |

The <!DOCTYPE html> alone accounts for the last remaining leading empty paragraph: with it present, the converted document still starts with one blank paragraph; removing it (while keeping <head>/<style>) eliminates that paragraph.
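The PR does not show how the paragraphs were counted. One possible way, sketched here under the assumption that the text of word/document.xml has already been extracted from the DOCX archive (the unzip step is not shown), is a regex count. `countParagraphs` is a hypothetical helper, and it ignores nested paragraphs such as those inside text boxes.

```typescript
// Hypothetical helper: counts <w:p> elements in a document.xml string and
// how many of them contain no non-empty <w:t> text run.
function countParagraphs(xml: string): { total: number; empty: number } {
  // Self-closing <w:p/> first, then <w:p ...>...</w:p>. Requiring '>' or
  // whitespace right after "w:p" avoids matching <w:pPr>.
  const paragraph = /<w:p(?:\s[^>]*)?\/>|<w:p(?:\s[^>]*)?>([\s\S]*?)<\/w:p>/g;
  const textRun = /<w:t(?:\s[^>]*)?>[^<]+<\/w:t>/;
  let total = 0;
  let empty = 0;
  let match: RegExpExecArray | null;
  while ((match = paragraph.exec(xml)) !== null) {
    total += 1;
    const inner = match[1] ?? '';
    if (!textRun.test(inner)) empty += 1;
  }
  return { total, empty };
}

// Made-up sample input, not taken from the PR's actual output.
const sampleXml =
  '<w:body><w:p/><w:p><w:pPr/></w:p>' +
  '<w:p><w:r><w:t>Hello</w:t></w:r></w:p>' +
  '<w:p><w:r><w:t xml:space="preserve">World</w:t></w:r></w:p></w:body>';

console.log(countParagraphs(sampleXml));
```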

Closes #4990

Notes

Added a changeset (@platejs/docx-io patch). Styling is unaffected (still inlined via juice from the <style> block).

wrapHtmlForDocx emitted a <!DOCTYPE html> and indented the document
template. html-to-docx parses the full document with html-to-vdom, which
keeps the DOCTYPE and the whitespace-only text nodes between tags and
renders each as a blank paragraph at the top of the output. Emit tight
markup with no DOCTYPE so the document starts with the content.

Closes udecode#4990
@WilliamPeralta WilliamPeralta requested a review from a team June 3, 2026 13:59
changeset-bot Bot commented Jun 3, 2026

🦋 Changeset detected

Latest commit: 863eefa

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @platejs/docx-io | Patch |

@dosubot dosubot Bot added size:S This PR changes 10-29 lines, ignoring generated files. patch Bugfix & documentation PR plugin:docx labels Jun 3, 2026
zbeyens commented Jun 3, 2026

@codex review

@chatgpt-codex-connector
Codex Review: Didn't find any major issues. Delightful!

@zbeyens zbeyens merged commit fe6d160 into udecode:main Jun 3, 2026
3 checks passed
Development

Successfully merging this pull request may close these issues.

@platejs/docx-io: exportToDocx adds ~6 blank paragraphs at the top of the document