First comment node is missing after calling DOMParser.parseFromString() #9861

Closed
psmyrek opened this issue Jun 14, 2021 · 1 comment · Fixed by #9927
Labels: domain:v4-compatibility, package:engine, type:task

Comments


psmyrek commented Jun 14, 2021

Provide a description of the task

This issue was found while working on HTML comments.

The DOM produced by HtmlDataProcessor#_toDom() is missing the first comment node when that comment is the first node in the parsed string.

This happens because the return value of:

new DOMParser().parseFromString( '<!--COMMENT 1--><p>PARAGRAPH 1</p><!--COMMENT 2-->', 'text/html' ).body.childNodes

is a two-element list, NodeList [ p, <!--COMMENT 2--> ], without the first comment node <!--COMMENT 1-->. The first comment is somehow hoisted and inserted into the newly created HTML document, before its <html> element.
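
To see where the leading comment ends up, one can inspect the children of the parsed document itself rather than those of `<body>`. Below is a minimal sketch, assuming a browser-like environment. The helper name `listTopLevelNodes` and its injectable `ParserClass` parameter are hypothetical and exist only for illustration:

```javascript
// Hypothetical helper: reports node names at the document level and inside <body>.
// `ParserClass` defaults to the global DOMParser, which is available in browsers.
function listTopLevelNodes( html, ParserClass = globalThis.DOMParser ) {
  const doc = new ParserClass().parseFromString( html, 'text/html' );
  const names = nodes => Array.from( nodes, node => node.nodeName );

  return {
    documentChildren: names( doc.childNodes ),
    bodyChildren: names( doc.body.childNodes )
  };
}
```

In a browser, calling `listTopLevelNodes( '<!--COMMENT 1--><p>PARAGRAPH 1</p><!--COMMENT 2-->' )` should list `#comment` and `HTML` as the document's children, and `P` and `#comment` as the body's children.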

This issue could be fixed by wrapping the DOM string in a <body> or, even better, in a <div>, as is done in BasicHtmlWriter#getHtml().
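
The wrapping idea can be sketched as follows. This is only an illustration under stated assumptions, not the actual CKEditor 5 implementation: the function name `parseBodyContent` and the injectable `ParserClass` parameter are hypothetical.

```javascript
// Hypothetical helper illustrating the <div> wrapper workaround.
// `ParserClass` defaults to the global DOMParser, which is available in browsers.
function parseBodyContent( html, ParserClass = globalThis.DOMParser ) {
  // With the wrapper in place, every node of `html`, including a leading
  // comment, is parsed as a child of the <div> inside <body>.
  const doc = new ParserClass().parseFromString( `<div>${ html }</div>`, 'text/html' );
  const wrapper = doc.body.firstChild;

  // Return a static array so callers do not depend on the live NodeList.
  return Array.from( wrapper.childNodes );
}
```

With this approach, the nodes are read from the wrapper element instead of from `body.childNodes`, so the leading comment should no longer be lost.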

psmyrek added the type:task, squad:compat and domain:v4-compatibility labels Jun 14, 2021

psmyrek commented Jun 17, 2021

I think I now understand the reason for this behavior, which is described in the "rules for parsing tokens in HTML content" section of the HTML specification.

In short: parsing the tokens of an HTML string starts in the so-called "initial" insertion mode. When the parser is in this mode and encounters a comment token, it inserts the comment node as the last child of the Document object. The parser then moves on to the subsequent insertion modes, creating and appending further nodes (like <html>, <head>, <body>). As a result, the first comment becomes the first node in the Document object and is located before the <html> element.
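
The behavior described above can be illustrated with a toy model. This is a deliberately simplified sketch, not a real HTML parser: the function `buildTree`, the token shape, and the collapsing of all later insertion modes into a single "in body" mode are assumptions made for illustration only.

```javascript
// Toy model of the first steps of HTML tree construction.
// Tokens are plain objects: { type: 'comment' | 'startTag', value: string }.
function buildTree( tokens ) {
  const documentChildren = [];
  const bodyChildren = [];
  let mode = 'initial';

  for ( const token of tokens ) {
    if ( mode === 'initial' ) {
      if ( token.type === 'comment' ) {
        // Before <html> exists, a comment is appended to the Document itself.
        documentChildren.push( `<!--${ token.value }-->` );
        continue;
      }

      // Any other token makes the parser create <html> (with <head> and <body>).
      // The toy model then jumps straight to the "in body" mode.
      documentChildren.push( '<html>' );
      mode = 'in body';
    }

    // In the "in body" mode, every token ends up inside <body>.
    bodyChildren.push(
      token.type === 'comment' ? `<!--${ token.value }-->` : `<${ token.value }>`
    );
  }

  return { documentChildren, bodyChildren };
}
```

Feeding the model the tokens for `<!--COMMENT 1--><p>...</p><!--COMMENT 2-->` places the first comment among the document's children, ahead of `<html>`, while the paragraph and the second comment land in the body. This mirrors the two-element list reported in the issue.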

Mgsy added this to the iteration 44 milestone Jun 17, 2021
psmyrek self-assigned this Jun 18, 2021
ma2ciek added a commit that referenced this issue Jun 25, 2021
Other (engine): Fixed parsing leading HTML comments by `HtmlDataProcessor.toView()`. Closes #9861.