[JSC] Implement String#match in C++ #64600

Open
sosukesuzuki wants to merge 1 commit into WebKit:main from sosukesuzuki:eng/string-prototype-match-cpp

sosukesuzuki commented May 9, 2026

bd01109

[JSC] Implement `String#match` in C++
https://bugs.webkit.org/show_bug.cgi?id=314466

Reviewed by NOBODY (OOPS!).

Move String.prototype.match from a JS builtin to a C++ host function and add
a StringMatch DFG node, mirroring the String#split work in 312843@main.

Notes:

* StringMatch is converted to RegExpMatchFast in fixup so the existing
  RegExpMatchFastGlobal / RegExpExec strength reductions apply.

* addStringMatchPrimordialChecks uses a single CheckStructure instead of the
  TryGetById chain in addStringReplacePrimordialChecks. Fixup-inserted
  TryGetByIds have no bytecode profile and never specialise, so they stay as
  repeated vmCalls. CheckStructure is stricter (RegExp subclasses cause an OSR
  exit), but it matches isSymbolMatchFastAndNonObservable, and the parser bails
  on BadCache.

* isSymbolMatchFastAndNonObservable mirrors the replace variant: lastIndex must
  be a number since RegExp.prototype[@@match] reads it. No species watchpoint
  is needed since @@match does not use SpeciesConstructor.

* stringProtoFuncMatch re-validates after ToString(this), since a user toString
  can mutate the regexp before RegExp.prototype[@@match] reads flags/exec.
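
The two observable behaviours called out in the notes above can be reproduced from plain JavaScript. The following is a minimal sketch, not taken from the patch's tests; the helper names matchWithMutatingReceiver and matchWithObservableLastIndex are hypothetical. It relies only on the spec-defined order of operations in RegExp.prototype[@@match]: ToString on the argument first, then flags, then exec.

```javascript
// Case 1: a user-defined toString on the receiver runs before
// RegExp.prototype[@@match] looks up flags and exec, so it can replace exec.
function matchWithMutatingReceiver() {
    const re = /\d+/;
    const receiver = {
        toString() {
            re.exec = () => null; // mutate the regexp in the middle of the call
            return "abc 123";
        },
    };
    return String.prototype.match.call(receiver, re);
}

// Case 2: lastIndex is read and coerced during matching, so a non-number
// lastIndex makes the call observable. A sticky regexp is used so that the
// coerced value also decides where the match starts.
function matchWithObservableLastIndex() {
    const re = /a/y;
    let reads = 0;
    re.lastIndex = { valueOf() { reads++; return 1; } };
    const result = "banana".match(re);
    return { result, reads };
}

const mutated = matchWithMutatingReceiver();
const observed = matchWithObservableLastIndex();

// The overridden exec is the one that gets called, so no match is reported.
console.log(mutated);
// The match starts at the index produced by valueOf, which was invoked.
console.log(observed.result.index, observed.reads > 0);
```

Any fast path has to fall back to the generic algorithm in both cases, which is why the host function checks the regexp again after ToString(this) and why the fast-path predicate requires a numeric lastIndex.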

                                    TipOfTree                  Patched

regexp-match-digit-literal       64.9662+-0.6089     ^     62.2319+-0.5453        ^ definitely 1.0439x faster
string-match-digit-short        273.6600+-0.7725     ^    269.4541+-1.5833        ^ definitely 1.0156x faster
global-atom-match-one-char        0.7787+-0.0359     ^      0.5617+-0.0209        ^ definitely 1.3863x faster
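
The "x faster" figures are consistent with dividing the TipOfTree mean by the Patched mean. A small sketch that recomputes them from the means reported above (the speedup helper is hypothetical and ignores the +- intervals):

```javascript
// Mean timings copied from the benchmark table above.
const results = [
    { name: "regexp-match-digit-literal", tipOfTree: 64.9662, patched: 62.2319 },
    { name: "string-match-digit-short", tipOfTree: 273.6600, patched: 269.4541 },
    { name: "global-atom-match-one-char", tipOfTree: 0.7787, patched: 0.5617 },
];

// Ratio of baseline mean to patched mean, formatted to four decimals.
function speedup({ tipOfTree, patched }) {
    return (tipOfTree / patched).toFixed(4);
}

const ratios = results.map((r) => ({ name: r.name, ratio: speedup(r) }));
for (const { name, ratio } of ratios)
    console.log(`${name}: ${ratio}x faster`);
```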

Tests: JSTests/microbenchmarks/regexp-match-digit-cached.js
       JSTests/microbenchmarks/regexp-match-digit-literal.js
       JSTests/microbenchmarks/string-match-digit-short.js
       JSTests/microbenchmarks/string-match-letter-short.js
       JSTests/stress/string-prototype-match-edge-cases.js
       JSTests/stress/string-prototype-match-strength-reduction.js
       JSTests/stress/string-prototype-match-watchpoint-invalidation.js

* JSTests/microbenchmarks/regexp-match-digit-cached.js: Added.
(match):
* JSTests/microbenchmarks/regexp-match-digit-literal.js: Added.
(match):
* JSTests/microbenchmarks/string-match-digit-short.js: Added.
(match):
* JSTests/microbenchmarks/string-match-letter-short.js: Added.
(match):
* JSTests/stress/string-prototype-match-edge-cases.js: Added.
(shouldBe):
(shouldThrow):
(shouldBe.String.prototype.match.call.toString):
(shouldBe.string_appeared_here.match.x.a):
(shouldBe.re.Symbol.match):
(shouldThrow.obj3.get Symbol):
(shouldBe.MyRegExp):
(shouldBe.MyRegExp2.prototype.Symbol.match):
(shouldBe.MyRegExp2):
(shouldBe.fake.get flags):
(shouldBe.fake.get global):
(shouldBe.fake.get hasIndices):
(shouldBe.fake.get ignoreCase):
(shouldBe.fake.get multiline):
(shouldBe.fake.get sticky):
(shouldBe.fake.get unicode):
(shouldBe.fake.get unicodeSets):
(shouldBe.fake.get dotAll):
* JSTests/stress/string-prototype-match-strength-reduction.js: Added.
(shouldBe):
(matchGlobalLiteral):
(matchNonGlobalLiteral):
(matchStickyLiteral):
(matchUnicodeGlobal):
(matchCached):
(matchStringPattern):
(matchPoly):
(shouldBe.matchPoly):
* JSTests/stress/string-prototype-match-watchpoint-invalidation.js: Added.
(shouldBe):
(match):
(re.Symbol.match):
(RegExp.prototype.Symbol.match):
(RegExp.prototype.exec):
(value):
(receiver.toString.re.exec):
(receiver.toString.RegExp.prototype.exec):
(get return):
* Source/JavaScriptCore/builtins/StringPrototype.js:
(match): Deleted.
* Source/JavaScriptCore/dfg/DFGAbstractInterpreterInlines.h:
(JSC::DFG::AbstractInterpreter<AbstractStateType>::executeEffects):
* Source/JavaScriptCore/dfg/DFGBackwardsPropagationPhase.cpp:
(JSC::DFG::BackwardsPropagationPhase::propagate):
* Source/JavaScriptCore/dfg/DFGByteCodeParser.cpp:
(JSC::DFG::ByteCodeParser::handleIntrinsicCall):
* Source/JavaScriptCore/dfg/DFGClobberize.h:
(JSC::DFG::clobberize):
* Source/JavaScriptCore/dfg/DFGCloneHelper.h:
* Source/JavaScriptCore/dfg/DFGDoesGC.cpp:
(JSC::DFG::doesGC):
* Source/JavaScriptCore/dfg/DFGFixupPhase.cpp:
(JSC::DFG::FixupPhase::fixupNode):
(JSC::DFG::FixupPhase::addStringMatchPrimordialChecks):
* Source/JavaScriptCore/dfg/DFGGraph.h:
* Source/JavaScriptCore/dfg/DFGJITCode.cpp:
(JSC::DFG::JITData::JITData):
(JSC::DFG::JITData::tryInitialize):
* Source/JavaScriptCore/dfg/DFGJITCode.h:
* Source/JavaScriptCore/dfg/DFGNodeType.h:
* Source/JavaScriptCore/dfg/DFGOperations.cpp:
(JSC::DFG::JSC_DEFINE_JIT_OPERATION):
* Source/JavaScriptCore/dfg/DFGOperations.h:
* Source/JavaScriptCore/dfg/DFGPredictionPropagationPhase.cpp:
* Source/JavaScriptCore/dfg/DFGSafeToExecute.h:
(JSC::DFG::safeToExecute):
* Source/JavaScriptCore/dfg/DFGSpeculativeJIT.cpp:
* Source/JavaScriptCore/dfg/DFGSpeculativeJIT.h:
* Source/JavaScriptCore/dfg/DFGSpeculativeJIT32_64.cpp:
(JSC::DFG::SpeculativeJIT::compile):
* Source/JavaScriptCore/dfg/DFGSpeculativeJIT64.cpp:
(JSC::DFG::SpeculativeJIT::compile):
* Source/JavaScriptCore/ftl/FTLCapabilities.cpp:
(JSC::FTL::canCompile):
* Source/JavaScriptCore/ftl/FTLLowerDFGToB3.cpp:
(JSC::FTL::DFG::LowerDFGToB3::compileNode):
(JSC::FTL::DFG::LowerDFGToB3::compileStringMatch):
* Source/JavaScriptCore/runtime/Intrinsic.h:
* Source/JavaScriptCore/runtime/JSGlobalObject.cpp:
(JSC::JSGlobalObject::init):
* Source/JavaScriptCore/runtime/JSGlobalObject.h:
* Source/JavaScriptCore/runtime/RegExpConstructor.cpp:
(JSC::regExpCreate):
* Source/JavaScriptCore/runtime/RegExpConstructor.h:
* Source/JavaScriptCore/runtime/RegExpObject.h:
* Source/JavaScriptCore/runtime/RegExpObjectInlines.h:
(JSC::RegExpObject::isSymbolMatchFastAndNonObservable):
* Source/JavaScriptCore/runtime/RegExpPrototype.cpp:
(JSC::regExpMatchFast):
(JSC::JSC_DEFINE_HOST_FUNCTION):
* Source/JavaScriptCore/runtime/RegExpPrototype.h:
* Source/JavaScriptCore/runtime/StringPrototype.cpp:
(JSC::StringPrototype::finishCreation):
(JSC::stringMatchSlow):
(JSC::JSC_DEFINE_HOST_FUNCTION):
* Source/JavaScriptCore/runtime/StringPrototype.h:

a7b3161

| Misc | iOS, visionOS, tvOS & watchOS | macOS | Linux | Windows |
| --- | --- | --- | --- | --- |
| ✅ 🧪 style | ✅ 🛠 ios | ✅ 🛠 mac | ✅ 🛠 wpe | ✅ 🛠 win |
| | ✅ 🛠 ios-sim | ✅ 🛠 mac-AS-debug | ✅ 🧪 wpe-wk2 | ❌ 🧪 win-tests |
| ✅ 🧪 webkitperl | ✅ 🧪 ios-wk2 | ✅ 🧪 api-mac | ✅ 🧪 api-wpe | |
| | ✅ 🧪 ios-wk2-wpt | ✅ 🧪 api-mac-debug | ✅ 🛠 gtk3-libwebrtc | |
| ✅ 🛠 🧪 jsc | ✅ 🧪 api-ios | ✅ 🧪 mac-wk1 | ✅ 🛠 gtk | |
| ✅ 🛠 🧪 jsc-debug-arm64 | ✅ 🛠 ios-safer-cpp | ✅ 🧪 mac-wk2 | ✅ 🧪 gtk-wk2 | |
| | ✅ 🛠 vision | ✅ 🧪 mac-AS-debug-wk2 | ✅ 🧪 api-gtk | |
| | ✅ 🛠 vision-sim | ✅ 🧪 mac-wk2-stress | ✅ 🛠 playstation | |
| | ✅ 🧪 vision-wk2 | ✅ 🧪 mac-intel-wk2 | ✅ 🛠 jsc-armv7 | |
| | ✅ 🛠 tv | ✅ 🛠 mac-safer-cpp | ✅ 🧪 jsc-armv7-tests | |
| | ✅ 🛠 tv-sim | | | |
| | ✅ 🛠 watch | | | |
| | ✅ 🛠 watch-sim | | | |

sosukesuzuki self-assigned this May 9, 2026
sosukesuzuki added the JavaScriptCore label ("For bugs in JavaScriptCore, the JS engine used by WebKit, other than kxmlcore issues.") May 9, 2026
sosukesuzuki force-pushed the eng/string-prototype-match-cpp branch from bd01109 to a7b3161 May 9, 2026 09:58
sosukesuzuki marked this pull request as ready for review May 10, 2026 05:23
sosukesuzuki requested a review from a team as a code owner May 10, 2026 05:23