fix(docx-io): drop leading blank paragraphs in exportToDocx #4991
Merged
zbeyens merged 1 commit on Jun 3, 2026
wrapHtmlForDocx emitted a <!DOCTYPE html> and indented the document template. html-to-docx parses the full document with html-to-vdom, which keeps the DOCTYPE and the whitespace-only text nodes between tags and renders each as a blank paragraph at the top of the output. Emit tight markup with no DOCTYPE so the document starts with the content. Closes udecode#4990
🦋 Changeset detected. Latest commit: 863eefa. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
@codex review
Codex Review: Didn't find any major issues. Delightful!
Summary
exportToDocx / htmlToDocxBlob produce a DOCX whose first page starts with ~6 empty paragraphs before any content, even when the source has no leading blanks.
Cause
wrapHtmlForDocx emits a <!DOCTYPE html> and indents the document template. @turbodocx/html-to-docx parses the full document with html-to-vdom, which keeps the DOCTYPE node and the whitespace-only text nodes between <html>/<head>/<body> and the content. convertVTreeToXML turns each top-level text node into a paragraph (buildParagraph), so they become empty <w:p> elements at the top of the document.
Fix
Emit tight markup with no DOCTYPE and no inter-tag whitespace.
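As a rough illustration of the difference (a sketch only, not the actual wrapHtmlForDocx source; wrapIndented, wrapTight and countWhitespaceGaps are hypothetical names introduced here), the following compares an indented template with a tight one and counts the whitespace-only gaps between tags, which are the spots where a parser that preserves whitespace would produce text nodes:

```typescript
// Hypothetical helpers for illustration; this is not the real wrapHtmlForDocx.

// Old-style wrapper: a DOCTYPE plus an indented document template.
function wrapIndented(body: string, css: string): string {
  return `<!DOCTYPE html>
<html>
  <head>
    <style>${css}</style>
  </head>
  <body>
    ${body}
  </body>
</html>`;
}

// Tight wrapper: no DOCTYPE and no whitespace between tags.
function wrapTight(body: string, css: string): string {
  return `<html><head><style>${css}</style></head><body>${body}</body></html>`;
}

// Counts whitespace-only runs that sit between a closing '>' and the next '<'.
// This is a plain string check, not a model of how html-to-vdom parses.
function countWhitespaceGaps(html: string): number {
  return (html.match(/>\s+</g) || []).length;
}

const body = "<p>Hello</p>";
const css = "p{margin:0}";
console.log("indented gaps:", countWhitespaceGaps(wrapIndented(body, css)));
console.log("tight gaps:", countWhitespaceGaps(wrapTight(body, css)));
```

The gap count is only a proxy: which of those gaps end up as top-level nodes, and therefore as paragraphs, depends on the converter.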
<head> is skipped by the converter, so the \n inside <style> is harmless.
Evidence
Converting the same body via htmlToDocxBlob and counting <w:p> in word/document.xml:
[Table not recoverable: the surviving row labels are "(<!DOCTYPE…> + indentation)" and "<!DOCTYPE>"; the counts are missing.]
The <!DOCTYPE html> alone accounts for the last remaining leading empty paragraph: with it present, html-to-vdom emits a leading blank paragraph; removing the DOCTYPE (keeping <head>/<style>) removes that paragraph.
Closes #4990
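The paragraph count described under Evidence could be reproduced with something like the sketch below. It assumes word/document.xml has already been extracted from the DOCX as a string (unzipping is out of scope here), and countParagraphs and countLeadingEmptyParagraphs are hypothetical helper names, not part of @platejs/docx-io or html-to-docx. The regexes are a simplification that assumes paragraphs are not nested.

```typescript
// Counts <w:p> elements in a WordprocessingML string.
// The lookahead stops <w:pPr>, <w:pStyle> and similar tags from being counted.
function countParagraphs(documentXml: string): number {
  return (documentXml.match(/<w:p(?=[\s>\/])/g) || []).length;
}

// Counts paragraphs without any <w:r> run that appear before the first
// paragraph that does contain a run.
function countLeadingEmptyParagraphs(documentXml: string): number {
  const paragraphs =
    documentXml.match(/<w:p(?:\s[^>]*)?(?:\/>|>[\s\S]*?<\/w:p>)/g) || [];
  let count = 0;
  for (const paragraph of paragraphs) {
    if (/<w:r[\s>]/.test(paragraph)) break; // first paragraph with content
    count++;
  }
  return count;
}

// Hand-written sample, not output captured from the library.
const sampleXml =
  "<w:body>" +
  "<w:p/>" +
  '<w:p><w:pPr><w:jc w:val="left"/></w:pPr></w:p>' +
  "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>" +
  "</w:body>";

console.log(countParagraphs(sampleXml)); // 3
console.log(countLeadingEmptyParagraphs(sampleXml)); // 2
```

Counting leading empty paragraphs separately from the total makes the before/after comparison independent of how many paragraphs the body itself contains.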
Notes
Added a changeset (@platejs/docx-io patch). Styling is unaffected (still inlined via juice from the <style> block).