Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple self-closing children causes nested children tree #34

Closed
slaskis opened this issue Sep 6, 2016 · 8 comments
Closed

Multiple self-closing children causes nested children tree #34

slaskis opened this issue Sep 6, 2016 · 8 comments
Labels

Comments

@slaskis
Copy link
Collaborator

slaskis commented Sep 6, 2016

While writing some tests for #33 I found that when using multiple self closing children in a parent it nests them.

In other words this:

<div>
  <div />
  <div />
</div>

Results in a nested tree like this:

<div>
  <div>
    <div></div>
  </div>
</div>

Another find, which is also in the tests, was that the whitelisted self-closing tags (such as <br>) instead causes an exception when written as a open tag, like <br></br>. Which might be correct, but the exception message (Error: multiple root elements must be wrapped in an enclosing tag) was a bit confusing.

@rbiggs
Copy link

rbiggs commented Feb 2, 2017

Just so you know, self-closing tags should always have at least one space character between the tag name and the U+002F ("/") character. Not including a space should result in an error according to the specs from W3C. Same for <img />. See documentation on HTML 5 closing tags.
Standard self-closing tags are:

  • br
  • hr
  • img
  • input
  • meta

These are never closed, you can use the forward slash / to close them or not.

By the way, <div /> is not a self-closing tag and should throw an error. Problem is jQuery does accept this, going against W3C docs. They have examples on their site of using this bad syntax, encouraging junior developers to using this broken HTML syntax:
var div = $('<div/>')

Creates a correct div element, although it is incorrect HTML5.

@jorgebucaran
Copy link

@rbiggs Ah, good to know!

/cc @dodekeract

@slaskis
Copy link
Collaborator Author

slaskis commented Feb 2, 2017

Very interesting @rbiggs! I haven't actually read the spec and was just going by intuition ;)

I do read that spec slightly different though. Regarding start tags it may have one or more space characters unless there are attributes.

And it does only mention self-closing tags for void elements, for which it's optional and "has no effect", and foreign elements (such as svg), for which it is required unless there is an end tag.

As for the normal tags my intuition says that html5 is pretty chill. But you're absolutely right that the validator complains, and it gives me this message:

Self-closing syntax (/>) used on a non-void HTML element. Ignoring the slash and treating as a start tag.

However, the way I've seen this package is less stringified html and more like a "no need to preprocess jsx". And since jsx treats self-closing tags the same as closed tag without children I expected this to work here as well.

@goto-bus-stop
Copy link
Member

At the moment this results in:

<div>div<div></div></div>

So it's still a problem!

@gvergnaud
Copy link

gvergnaud commented Apr 30, 2018

I ran into this issue as well, I'm using hyperx to build a reactive programming framework (evolui) for a few months. I really tried to contribute, but didn't manage to get into the code, so I decided to create a similar package from the ground up. It does pretty much the same thing, but has a focus toward custom components and supports having multiple root elements.

You can find it here https://github.com/gvergnaud/vdom-tag, if anyone is interested.

@mreinstein
Copy link
Contributor

Maybe I'm just confused, but I don't think there's a bug in hyperx here. In a dev console I type:

const d = document.createElement('div')
d.innerHTML = '<div /> <div />'

this is what the console returns for d:

<div>
  <div>
    <div></div>
  </div>
</div>

And that 100% matches what hyperx produces. It seems that hyperx is faithfully applying the same parse logic that the browser runs in this case.

Am I missing something? Where's the actual broken behavior within hyperx?

@mreinstein
Copy link
Contributor

I've also done a similar check with <br></br>, <br>, and <br/>. In all 3 cases hyperx produces a single BR element as expected the same way the browser does.

I guess I'm just wondering if we're having a discussion about what hyperx should be doing in these cases, or if there is something actually wrong with the parser.

From my perspective, hyperx is supposed to be mirroring exactly what the browser parsers will do with html, rather than doing cleanup. e.g., as humans we know what we're after when we type:

<div>
  <div />
  <div />
</div>

but regardless of whether you feed this html string directly to the browser or hyperx, we don't get the value we might expect.

I think cleaning up malformed html is not something that hyperx should be doing. That could happen as a "cleanup step" before this markup makes it to hyperx.

Assuming I haven't missed actual brokenness, we could probably resolve this issue finally by putting a note near the top of the README.md stating something to the effect that "hyperx intends to faithfully parse html by the same rules that browsers use. Custom cleanup of human error in the formatting is outside the scope of this module."

thoughts?

@mreinstein
Copy link
Contributor

Given that this issue is stale and I don't believe the hyperx parser is straying from the in-browser parsers are doing, I'm going to close it. Not trying to silence anyone though. If anyone finds issue with the parsing logic, happy to re-open/re-visit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants