HTML minifier’s elision of <body> next to <noscript> leads to invalid HTML #8876

Closed
andersk opened this issue Apr 18, 2024 · 1 comment · Fixed by #8877

andersk commented Apr 18, 2024

When using swc_html_minifier with CodegenConfig { minify: true, … }, SWC elides the <body> start tag, which is usually fine. However, when the first element inside <body> is <noscript>, the output is invalid HTML, because the parser then treats <noscript> as a child of <head>. This also crashes Farm (farm-fe/farm#1210).

Input source (valid):

<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title>Test</title>
  </head>
  <body>
    <noscript>This app requires JavaScript.</noscript>
  </body>
</html>

Output source (invalid):

<!doctype html><html lang=en><meta charset=utf-8><title>Test</title><noscript>This app requires JavaScript.</noscript>

Validation errors:

1.79-1.107: error: Non-space character inside “noscript” inside “head”.
1.108-1.118: error: Stray end tag “noscript”.

Reproduction code:

use swc_common::{BytePos, FileName, SourceFile};
use swc_html_codegen::{writer::basic::BasicHtmlWriter, CodeGenerator, CodegenConfig, Emit};
use swc_html_minifier::minify_document;
use swc_html_parser::parse_file_as_document;

fn main() {
    let source = "<!doctype html>
<html lang=\"en\">
  <head>
    <meta charset=\"utf-8\" />
    <title>Test</title>
  </head>
  <body>
    <noscript>This app requires JavaScript.</noscript>
  </body>
</html>"
        .to_string();
    let source_file = SourceFile::new(FileName::Anon, false, FileName::Anon, source, BytePos(1));
    let mut errors = vec![];
    let mut document =
        parse_file_as_document(&source_file, Default::default(), &mut errors).unwrap();
    dbg!(errors);
    minify_document(&mut document, &Default::default());
    let mut minified_source = String::new();
    let mut code_generator = CodeGenerator::new(
        BasicHtmlWriter::new(&mut minified_source, None, Default::default()),
        CodegenConfig {
            minify: true,
            ..CodegenConfig::default()
        },
    );
    code_generator.emit(&document).unwrap();
    println!("{}", minified_source);
}
andersk added a commit to andersk/swc that referenced this issue Apr 19, 2024
For example, transforming <body><noscript> to <noscript> would
incorrectly change the meaning so <noscript> is parsed as a child of
<head>.

Fixes swc-project#8876.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
kdy1 pushed a commit that referenced this issue Apr 19, 2024

**Description:**

For example, transforming `<body><noscript>` to `<noscript>` would
incorrectly change the meaning so `<noscript>` is parsed as a child of
`<head>`.

Reference: [§13.2.6.4.4 The "in head" insertion
mode](https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inhead),
[§13.2.6.4.6 The "after head" insertion
mode](https://html.spec.whatwg.org/multipage/parsing.html#the-after-head-insertion-mode).

**Related issue:**

- Closes #8876.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
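
The condition described in the commit message can be sketched as a small predicate. This is a minimal illustration based on my reading of the optional-tags rule in the HTML Standard (a `<body>` start tag may be omitted unless the first thing inside it is whitespace, a comment, or one of a few elements that would otherwise be parsed as part of `<head>`). It is not SWC's actual implementation: the `FirstInBody` type and the `can_omit_body_start_tag` function are hypothetical names invented for this example.

```rust
/// What comes first inside `<body>` (hypothetical type for this sketch).
enum FirstInBody<'a> {
    /// `<body>` has no children at all.
    Empty,
    /// Leading ASCII whitespace or a comment node.
    WhitespaceOrComment,
    /// Text that starts with a non-whitespace character.
    Text,
    /// An element, identified by its tag name.
    Element(&'a str),
}

/// Returns true when dropping the `<body>` start tag should not change
/// how the document is parsed (assumed rule, see the paragraph above).
fn can_omit_body_start_tag(first: &FirstInBody) -> bool {
    match first {
        FirstInBody::Empty | FirstInBody::Text => true,
        FirstInBody::WhitespaceOrComment => false,
        // With `<body>` gone, these elements would be handled by the
        // "in head" rules and end up as children of `<head>`.
        FirstInBody::Element(name) => !matches!(
            name.to_ascii_lowercase().as_str(),
            "meta" | "noscript" | "link" | "script" | "style" | "template"
        ),
    }
}
```

For the input in this issue, the first element inside `<body>` is `<noscript>`, so the predicate returns false and the start tag would be kept.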

swc-bot commented May 20, 2024

This closed issue has been automatically locked because it had no new activity for a month. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you.

@swc-project swc-project locked as resolved and limited conversation to collaborators May 20, 2024