<head> and <body> with whitespace doesn't get parsed correctly #317

Closed
remarkablemark opened this issue Jul 10, 2022 · 0 comments · Fixed by #318 or #319

remarkablemark (Owner) commented Jul 10, 2022

Relates to remarkablemark/html-react-parser#624

Expected Behavior

When I parse HTML with html-dom-parser on the client:

<head></head>
<body
>text</body>

I should get both head and body elements.

Actual Behavior

I get only the head element and no body element:

[
  {
    "parent": null,
    "prev": null,
    "next": null,
    "startIndex": null,
    "endIndex": null,
    "children": [],
    "name": "head",
    "attribs": {},
    "type": "tag"
  }
]

Steps to Reproduce

import parse from 'html-dom-parser'

parse(`
<head></head>
<body
>text</body>
`)

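The bug is caused by the catch-all regex for head and body in domparser: it does not match across newlines, so a tag that contains a line break (like the body tag above) is missed. The regex needs DOTALL behavior so that it also matches newlines.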
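To illustrate the DOTALL point, here is a minimal sketch. The two patterns are hypothetical and chosen only for demonstration; they are not the actual regexes used in html-dom-parser. In JavaScript, `.` does not match line terminators unless the `s` (dotAll) flag is set, whereas `[\s\S]` matches any character, including newlines.

```javascript
// Hypothetical patterns for illustration only; not the library's real regexes.
const html = '<head></head>\n<body\n>text</body>';

// `.` stops at the newline inside `<body\n>`, so the opening tag is not matched.
const withoutDotAll = /<body.*>/i;

// `[\s\S]` matches any character including newlines (same effect as the `s` flag).
const withDotAll = /<body[\s\S]*>/i;

const missed = withoutDotAll.test(html);
const found = withDotAll.test(html);

console.log(missed); // false
console.log(found); // true
```

A pattern written with the `s` flag, such as `/<body.*>/is`, should behave like the second one in engines that support dotAll; `[\s\S]` is the more portable spelling.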

Reproducible Demo

https://codesandbox.io/s/html-react-parsser-624-p0i5pd?file=/src/App.js

Environment

  • Version: 3.0.0