-
-
Notifications
You must be signed in to change notification settings - Fork 621
fix(codegen): escape </script
#11782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
How to use the Graphite Merge QueueAdd either label to this PR to merge it via the merge queue:
You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. |
8d800f6
to
47c7866
Compare
Thanks very much for doing this! I'm tied up next few days, but will endeavor to review (and hopefully merge) early next week. |
CodSpeed Instrumentation Performance ReportMerging #11782 will improve performances by 13.32%Comparing Summary
Benchmarks breakdown
|
47c7866
to
5a2a3a9
Compare
7865eef
to
3223051
Compare
3223051
to
922f770
Compare
922f770
to
1e860a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
-2% on codegen benchmarks, which isn't surprising given the extra checks. I'll try to optimize it a little in a follow-up.
// Compiler condenses these operations to an 8-byte read, u64 AND, and u64 compare. | ||
// https://godbolt.org/z/oGG16fK6v | ||
let mut slice: [u8; 8] = slice.try_into().unwrap(); | ||
for b in slice.iter_mut().skip(2) { | ||
// `| 32` converts ASCII upper case letters to lower case. | ||
*b |= 32; | ||
} | ||
|
||
slice == *b"</script" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! You shaved a branch off my version.
Follow-on after #11782. Clarify comments.
Follow-on after #11782. Add `#[inline(always)]` to `is_script_close_tag`, to ensure compiler can deduce the slice is 8 bytes long, and remove a branch.
Follow-on after #11782. That PR fixed escaping `</script` in strings, but it regressed some codegen benchmarks by 2%. Optimize the string search to win some of that perf back by: 1. Doing a preliminary search for `<` first, and only the more expensive search for `</script` once a `<` is found. 2. Searching longer strings for `<` in chunks of 16 bytes, using SIMD.
Fixes #10334.
Replace #10340 to cover the escape for both template literals and comments. We don’t need to handle regex, which is already covered by existing codegen.
Please let me know anything needed to merge this PR. Thanks.