Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(html): preserve case-sensitive tag names #5101

Merged

Conversation

ikatyang
Copy link
Member

Non-HTML tag names are considered case-sensitive.

  • I’ve added tests to confirm my change works.
  • (If changing the API or CLI) I’ve documented the changes I’ve made (in the docs/ directory)
  • I’ve read the contributing guidelines.

@ikatyang ikatyang mentioned this pull request Sep 16, 2018
26 tasks
Copy link
Member

@alexander-akait alexander-akait left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but maybe switch to https://github.com/Prettyhtml/parse5 will be better, built-in support case insensitivity tags

@ikatyang
Copy link
Member Author

ikatyang commented Sep 17, 2018

It seems the fork does not work in the case where html, head, body, script, etc. are not lowercased. I guess there must be some hardcoded tag names in parse5, cc @StarpTech.

Input:

<!DOCTYPE html>
<HTML CLASS="no-js mY-ClAsS">
  <HEAD>
    <META CHARSET="utf-8">
    <TITLE>My tITlE</TITLE>
    <META NAME="description" content="My CoNtEnT">
  </HEAD>
  <body>
    <P>Hello world!<BR> This is HTML5 Boilerplate.</P>
    <SCRIPT>
      window.ga = function () { ga.q.push(arguments) }; ga.q = []; ga.l = +new Date;
      ga('create', 'UA-XXXXX-Y', 'auto'); ga('send', 'pageview')
    </SCRIPT>
    <SCRIPT src="https://www.google-analytics.com/analytics.js" ASYNC DEFER></SCRIPT>
  </body>
</HTML>

Output: before (parse5) / after (@starptech/prettyhtml-parse)

image

@StarpTech
Copy link

@ikatyang I don't understand your issue. What's your input and output and how looks your code?

@StarpTech
Copy link

StarpTech commented Sep 17, 2018

@ikatyang there are two modes in parse5

  • fragment
  • document

in order to check which modes should be used, I'm searching for head, body tags. I don't respect case-sensitive. I have never seen such a scenario in practice but the fix is an easy one. Your content is nested in a separate page because the document mode was detected.

@ikatyang
Copy link
Member Author

ikatyang commented Sep 17, 2018

I thought the mode was determined by parse/parseFragment, isn't it?

Here's the repro steps:

const text = `
<!DOCTYPE html>
<HTML CLASS="no-js mY-ClAsS">
  <HEAD>
    <META CHARSET="utf-8">
    <TITLE>My tITlE</TITLE>
    <META NAME="description" content="My CoNtEnT">
  </HEAD>
  <body>
    <P>Hello world!<BR> This is HTML5 Boilerplate.</P>
    <SCRIPT>
      window.ga = function () { ga.q.push(arguments) }; ga.q = []; ga.l = +new Date;
      ga('create', 'UA-XXXXX-Y', 'auto'); ga('send', 'pageview')
    </SCRIPT>
    <SCRIPT src="https://www.google-analytics.com/analytics.js" ASYNC DEFER></SCRIPT>
  </body>
</HTML>
`;

const parse5 = require("@starptech/prettyhtml-parse");
const htmlparser2TreeAdapter = require("parse5-htmlparser2-tree-adapter");

const ast = parse5.parse(text, {
  treeAdapter: htmlparser2TreeAdapter,
  sourceCodeLocationInfo: true,
  preserveCaseSensitiveTags: true
});

Issues:

  • the HTML/HEAD/BODY tags are not recognized
  • the SCRIPT tags are parsed as type: "tag" instead of type: "script"

@ikatyang
Copy link
Member Author

I think we could switch to the fork in future PRs when the issue is fixed, I'd like to merge this PR first to continue the work.

(@StarpTech Could you enable the issues in your fork? otherwise I don't know how to report bugs other than ping you.)

@ikatyang ikatyang merged commit 00ed49f into prettier:master Sep 19, 2018
@ikatyang ikatyang deleted the fix/html/preserve-case-sensitive-tag-names branch September 19, 2018 04:53
@StarpTech
Copy link

@ikatyang I enabled issues.

@StarpTech
Copy link

@ikatyang this is actually no issue. You have to parse it with parse5.parseFragment. parse5.parse will wrap it into a separate document.

@ikatyang ikatyang added this to the 1.15 milestone Oct 25, 2018
@lock lock bot added the locked-due-to-inactivity Please open a new issue and fill out the template instead of commenting. label Jan 23, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Jan 23, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
locked-due-to-inactivity Please open a new issue and fill out the template instead of commenting.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants