-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify the After attribute value (quoted) state #7430
Conversation
Fix whatwg#1323. As pointed out in the issue, the number of branches in this state can be reduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've stepped through each character's state transitions in the spec and I believe this is equivalent.
Thanks, Fwiw I ran automated equivalence tests + fuzzing to verify the change here: untitaker/html5gum#23 |
@untitaker ok nice! Does the equivalence test care about parse errors? cc @whatwg/html-parser for a second review or chance to object. If we hear nothing by the end of January 2022 I think this can be merged. |
Yes but error positions are not compared |
It's not clear to me what we prefer in the specification. Fewer total steps or fewer branches. @hsivonen thoughts? |
It's no longer January. @hsivonen WDYT? |
I don't see anything semantically wrong with this change. However, I think it's in general not particularly beneficial to tweak the states, since it causes review churn to compare implementations with the spec. Also, there is always a chance for bugs with supposed no-op changes. To the extent implementations don't try to manually inline across these transitions, which would put more distance between what the code looks like and what the spec looks like, it seems to me that this ends up with more branches executed in the common case even if it results in less code. That is, the current character gets checked for being I'm not strongly against this change, but I'm also not convinced that it's a useful change. |
I agree with @hsivonen, when I wrote this I didn't think that this would not be a clear performance win for every impl (it gets particularly complicated when you ask how expensive reconsume is), and also didn't consider the amount of reviews such a change requires. I'll close this PR the way the spec is written, and given the sheer complexity of what it tries to define, there unfortunately is a strong motivator to keep implementations line-by-line in sync with the spec for ease of maintenance, and, well, here it goes both ways. I wish there was a machine-readable version of the spec (essentially a reference implementation running on a custom abstract machine), it would make it significantly easier to do automated tests to check for equivalence of the "optimized state machine" of an implementation vs the spec's state machine. there are too many holes in html5lib-tests atm |
Fix #1323. As pointed out in the issue, the number of branches in this
state can be reduced.
/acknowledgements.html ( diff )
/parsing.html ( diff )