Simplify NameValidation by using contains and containsOnly#65869
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
18921ee to
df99d30
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
df99d30 to
f3165a8
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
f3165a8 to
2afe329
Compare
|
EWS run on previous version of this PR (hash 2afe329) Details |
rniwa
left a comment
There was a problem hiding this comment.
We should probably perf A/B test this change with at least 20 iterations on macOS and iOS.
darinadler
left a comment
There was a problem hiding this comment.
I agree that we should check performance on this. It seems likely that having a separate branch for 8-bit will make this faster, so it’s possible the new version is faster!
| return true; | ||
| } | ||
| if (isASCIIAlpha(firstCharacter)) | ||
| return !name.contains(isInvalidElementNameCharacterAfterAlphaStart); |
There was a problem hiding this comment.
This re-checks the first character. The old code skipped that first character. That might not have been an important optimization, but it was intentional. We could keep that optimization by using name.substring(1) instead of name.
I see now that you said “using substring seemed more expensive”. What does “seemed” mean here? Is that a guess about performance or about code size?
There was a problem hiding this comment.
It’s a guess. Cost of a StringView vs checking an additional code unit.
| return true; | ||
| } | ||
| if (firstCharacter == ':' || firstCharacter == '_' || firstCharacter >= 0x80) | ||
| return name.containsOnly<isValidElementNameContinuationCharacter>(); |
There was a problem hiding this comment.
This re-checks the first character. The old code skipped that first character. That might not have been an important optimization, but it was intentional. We could keep that optimization by using name.substring(1) instead of name.
I see now that you said “using substring seemed more expensive”. What does “seemed” mean here? Is that a guess about performance or about code size?
| return false; | ||
| } | ||
| return true; | ||
| return name.containsOnly<isValidASCIIXMLNamePart>(); |
There was a problem hiding this comment.
This re-checks the first character. The old code skipped that first character. That might not have been an important optimization, but it was intentional. We could keep that optimization by using name.substring(1) instead of name.
I see now that you said “using substring seemed more expensive”. What does “seemed” mean here? Is that a guess about performance or about code size?
2afe329 to
f37ad0e
Compare
|
EWS run on current version of this PR (hash f37ad0e) Details
|
|
Failed api-ios checks. Please resolve failures and re-apply Rejecting #65869 from merge queue. |
|
Safe-Merge-Queue: Build #94573. |
https://bugs.webkit.org/show_bug.cgi?id=315737 rdar://178698891 Reviewed by Ryosuke Niwa and Darin Adler. This does result in the first code unit being checked twice for the checks that have special rules for the first code unit, but using substring seemed more expensive. And using contains and containsOnly should be more efficient as there's less is8Bit branching. Tested neutral on Speedometer. Canonical link: https://commits.webkit.org/314632@main
f37ad0e to
dd8ad14
Compare
|
Committed 314632@main (dd8ad14): https://commits.webkit.org/314632@main Reviewed commits have been landed. Closing PR #65869 and removing active labels. |
🧪 services
dd8ad14
f37ad0e
🧪 win-tests