Parser issues around MathML in HTML tag transforms #611

Closed
keller-tophat opened this issue Oct 19, 2023 · 4 comments
Comments

@keller-tophat
keller-tophat commented Oct 19, 2023

Summary

Minifying <math> elements inside an HTML document can produce unexpected attribute and whitespace transforms. The minifier appears to parse them as XML and to apply XML's whitespace and attribute-quoting expectations, which are incorrect for HTML.

Version

# minify --version
minify v2.12.9-12-g76935f3

Example

echo 'foo <math display=inline>hello</math> world' | minify --type html 

Expected:

foo <math display=inline>hello</math> world

Actual:

foo <math display="nlin">hello</math>world
@tdewolff
Owner

Thanks for raising the issue. It is my understanding that <math> and <svg> are XML that can be embedded in HTML, including the tag itself. Thus, using attributes without proper quoting would be an error. But perhaps the tag itself is still HTML and only the content is XML? Or is it all HTML? The specification is a bit vague here, or at least the W3C validator doesn't seem to make a distinction.

Maybe the XML minifier should leave invalid attribute values as-is, which would fix this issue.
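
The mangled value in the report (`inline` becoming `nlin`) is consistent with a quote-stripping step that drops the first and last byte without checking that they are quotes. That is a guess and is not confirmed anywhere in this thread. The Go sketch below contrasts such a step with one that leaves unquoted values as-is, in the spirit of the suggestion above. Both `naiveUnquote` and `safeUnquote` are hypothetical helpers written for illustration, not functions from minify.

```go
package main

import "fmt"

// naiveUnquote assumes val is always wrapped in quotes and drops the first
// and last byte unconditionally. Hypothetical helper, not code from minify.
func naiveUnquote(val string) string {
	if len(val) < 2 {
		return val
	}
	return val[1 : len(val)-1]
}

// safeUnquote strips the delimiters only when val starts and ends with the
// same quote character; any other value is returned unchanged.
// Hypothetical helper, not code from minify.
func safeUnquote(val string) string {
	if len(val) >= 2 && (val[0] == '"' || val[0] == '\'') && val[len(val)-1] == val[0] {
		return val[1 : len(val)-1]
	}
	return val
}

func main() {
	// Compare both helpers on a quoted and an unquoted attribute value.
	for _, val := range []string{`"inline"`, `inline`} {
		fmt.Printf("input=%s naive=%s safe=%s\n", val, naiveUnquote(val), safeUnquote(val))
	}
}
```

For the unquoted input `inline`, the naive helper returns `nlin`, matching the value in the actual output above, while the guarded helper returns `inline` untouched.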

@tdewolff
Owner

I've pushed out a change in the XML minifier.

@chamlis
chamlis commented Dec 22, 2023

I think there's still an issue around whitespace after a closing math tag. For example, the HTML

a <math display="inline">b</math> c

is minified to

a <math display=inline>b</math>c

losing the whitespace afterwards.

Adding omitSpace = false to the MathToken case in html.go fixes this for me and passes all the tests, but I'm not sure if this is the correct approach. It also doesn't account for <math display="block">, in which case I think removing the whitespace is fine.
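
The distinction drawn here between inline and block math could be expressed as a small predicate. The following is only a sketch of that rule: `keepSpaceAfterMath` is a hypothetical name, it is not the code in html.go, and matching the attribute value case-insensitively is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// keepSpaceAfterMath reports whether whitespace following </math> should be
// preserved. Only an explicit display="block" allows dropping it; a missing
// or any other value is treated as inline, where the space is significant.
// Hypothetical helper, not code from minify.
func keepSpaceAfterMath(display string) bool {
	return !strings.EqualFold(strings.TrimSpace(display), "block")
}

func main() {
	// An empty string stands for a <math> tag without a display attribute.
	for _, d := range []string{"inline", "block", ""} {
		fmt.Printf("display=%q keepSpace=%t\n", d, keepSpaceAfterMath(d))
	}
}
```

Under this rule, `a <math display="inline">b</math> c` keeps the space before `c`, while the space after a `display="block"` element may be removed.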

@tdewolff
Owner

tdewolff commented Jan 1, 2024

You're right, should be fixed now.


3 participants