Commit
[JSC] Improve String#split for uglify-js-wtb
https://bugs.webkit.org/show_bug.cgi?id=251823
rdar://105104781

Reviewed by Michael Saboff.

This patch improves JetStream2/uglify-js-wtb by 1% with String#split optimizations.

1. Use an ArrayWithContiguous array for the result of String#split, allocated with a capacity of at least `1`. The returned array then always has a capacity of at least one, which avoids a structure transition and a butterfly reallocation.

2. Search for BoyerMoore lookahead character patterns even when the pattern contains a greedy one-character pattern. For example, /\r?\n/ is used very frequently, and in this case we should search for a [\r\n] character to start matching. However, with the /\r?\n/ pattern,

    1. the first character can be [\r\n]
    2. the second character can be null or [\n]

   and this confuses fixed-sized BoyerMoore lookahead generation. So, when we encounter a greedy one-character pattern, we cut the BM prefix length at that point. In the above case, we only consider the first character, [\r\n]. This is still beneficial, since previously we gave up completely when we encountered a greedy fixed-sized one-character pattern.

                              ToT                     Patched

    string-split-space    130.3261+-0.7798    ^    125.6028+-0.4402    ^ definitely 1.0376x faster
    string-split          206.8225+-1.1932    ^    128.5716+-0.7137    ^ definitely 1.6086x faster

* JSTests/microbenchmarks/string-split-space.js: Added.
(split):
* JSTests/microbenchmarks/string-split.js: Added.
(split):
* Source/JavaScriptCore/runtime/RegExpPrototype.cpp:
(JSC::JSC_DEFINE_HOST_FUNCTION):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:

Canonical link: https://commits.webkit.org/259941@main
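The prefix-cutting idea in point 2 can be illustrated with a toy model. This is only a sketch under stated assumptions: the `{ chars, min, max }` term shape and the `lookaheadPrefix` helper are hypothetical and are not YarrJIT's data structures or API. It collects one candidate character set per pattern offset and stops at the first term whose width is not exactly one.

```javascript
// Toy model (assumption): a pattern is a list of one-character terms,
// each with the characters it accepts and its quantifier bounds.
// This is NOT how YarrJIT represents patterns.
function lookaheadPrefix(terms) {
    const prefix = [];
    for (let i = 0; i < terms.length; ++i) {
        const term = terms[i];
        const candidates = new Set(term.chars);
        if (term.min === 1 && term.max === 1) {
            // Fixed-width term: exactly one character at this offset.
            prefix.push(candidates);
            continue;
        }
        // Variable-width term. If it can match zero characters, the
        // characters of the following term may also appear at this offset.
        let j = i;
        while (j < terms.length && terms[j].min === 0) {
            ++j;
            if (j < terms.length)
                terms[j].chars.forEach((c) => candidates.add(c));
        }
        // If every remaining term is optional, this offset is unconstrained.
        if (j < terms.length)
            prefix.push(candidates);
        // Later offsets are ambiguous, so the prefix is cut here.
        break;
    }
    return prefix;
}

// /\r?\n/ modeled as an optional "\r" followed by a mandatory "\n".
const crlfOrLf = [
    { chars: ["\r"], min: 0, max: 1 },
    { chars: ["\n"], min: 1, max: 1 },
];
const prefix = lookaheadPrefix(crlfOrLf);
console.log(prefix.length, [...prefix[0]].map((c) => JSON.stringify(c)).join(" "));
```

In this model the prefix for /\r?\n/ has a single position holding both "\r" and "\n", which matches the commit message's description of considering only the first character.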
1 parent 35591d2, commit 8b7e7ed
Showing 4 changed files with 134 additions and 18 deletions.
JSTests/microbenchmarks/string-split-space.js
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
var string = `Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.`; | ||
 | ||
function split(string, regexp) | ||
{ | ||
return string.split(regexp); | ||
} | ||
noInline(split); | ||
 | ||
var regexp = / /; | ||
for (var i = 0; i < 1e5; ++i) | ||
split(string, regexp); |
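Before comparing timings, it can help to confirm what the benchmarked call returns. The sketch below is an illustration rather than part of the patch: `noInline` is only available in the JSC shell, so `noInlineShim` is a hypothetical fallback for other engines, and `sample` is a shortened stand-in for the benchmark string.

```javascript
// Assumption: outside the JSC shell, noInline is undefined, so fall back
// to a no-op. Inside the JSC shell, the real function is used.
const noInlineShim = typeof noInline === "function" ? noInline : () => {};

function split(string, regexp) {
    return string.split(regexp);
}
noInlineShim(split);

// Shortened stand-in for the benchmark string (assumption).
const sample = "Lorem ipsum dolor sit amet";
const parts = split(sample, / /);
console.log(parts.length, parts);

// Each single space is a separator, so two adjacent spaces produce an
// empty string between them.
const doubled = split("a  b", / /);
console.log(doubled);
```

A single-character pattern such as / / has no quantified term, so this benchmark mainly exercises the result-array allocation path described in point 1 of the commit message.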
JSTests/microbenchmarks/string-split.js
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
var string = `Lorem | ||
ipsum | ||
dolor | ||
sit | ||
amet, | ||
consectetur | ||
adipiscing | ||
elit, | ||
sed | ||
do | ||
eiusmod | ||
tempor | ||
incididunt | ||
ut | ||
labore | ||
et | ||
dolore | ||
magna | ||
aliqua. | ||
Ut | ||
enim | ||
ad | ||
minim | ||
veniam, | ||
quis | ||
nostrud | ||
exercitation | ||
ullamco | ||
laboris | ||
nisi | ||
ut | ||
aliquip | ||
ex | ||
ea | ||
commodo | ||
consequat. | ||
Duis | ||
aute | ||
irure | ||
dolor | ||
in | ||
reprehenderit | ||
in | ||
voluptate | ||
velit | ||
esse | ||
cillum | ||
dolore | ||
eu | ||
fugiat | ||
nulla | ||
pariatur. | ||
Excepteur | ||
sint | ||
occaecat | ||
cupidatat | ||
non | ||
proident, | ||
sunt | ||
in | ||
culpa | ||
qui | ||
officia | ||
deserunt | ||
mollit | ||
anim | ||
id | ||
est | ||
laborum.`; | ||
 | ||
function split(string, regexp) | ||
{ | ||
return string.split(regexp); | ||
} | ||
noInline(split); | ||
 | ||
var regexp = /\r?\n/; | ||
for (var i = 0; i < 1e5; ++i) | ||
split(string, regexp); |
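The /\r?\n/ pattern is common because it splits text into lines regardless of whether the input uses LF or CRLF line endings. A short sketch with made-up sample strings (not the benchmark input) shows the behavior, and also that every match begins at either "\r" or "\n", which is the property the lookahead optimization relies on.

```javascript
const lineBreak = /\r?\n/;

const lfText = "Lorem\nipsum\ndolor";
const crlfText = "Lorem\r\nipsum\r\ndolor";

// Both inputs split into the same three lines.
const fromLf = lfText.split(lineBreak);
const fromCrlf = crlfText.split(lineBreak);
console.log(fromLf, fromCrlf);

// Splitting on a plain "\n" string leaves a trailing "\r" on CRLF input.
const naive = crlfText.split("\n");
console.log(naive.map((line) => JSON.stringify(line)));

// Every match of the pattern starts at "\r" or "\n".
const mixed = "a\r\nb\nc";
const starts = [...mixed.matchAll(/\r?\n/g)].map((m) => mixed[m.index]);
console.log(starts.map((c) => JSON.stringify(c)));
```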