[JSC] Introduce SubjectSampler to heuristically pick BM search in RegExp
https://bugs.webkit.org/show_bug.cgi?id=252065
rdar://105284820

Reviewed by Mark Lam.

The effectiveness of Boyer-Moore (BM) search depends on whether we can pick a good anchor character that rarely appears in the actual text. If we pick a character that appears very frequently in the text, BM search provides little benefit, or even slows down RegExp performance, because it adds extra searching code.

So in this patch, we adopt an idea from V8 Irregexp: sample 128 characters of the text at compile time and use character frequency as a weight to pick a better BM search character. Our weight calculation is simpler than V8's, and it is effective in our benchmarks. This patch improves JetStream2/regex-dna-SP by 5-10%.

* Source/JavaScriptCore/runtime/RegExp.cpp:
(JSC::RegExp::compile):
(JSC::RegExp::compileMatchOnly):
* Source/JavaScriptCore/runtime/RegExp.h:
* Source/JavaScriptCore/runtime/RegExpInlines.h:
(JSC::RegExp::compileIfNecessary):
(JSC::RegExp::matchInline):
(JSC::RegExp::compileIfNecessaryMatchOnly):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:
(JSC::Yarr::SubjectSampler::SubjectSampler):
(JSC::Yarr::SubjectSampler::frequency const):
(JSC::Yarr::SubjectSampler::sample):
(JSC::Yarr::SubjectSampler::dump const):
(JSC::Yarr::SubjectSampler::is8Bit const):
(JSC::Yarr::SubjectSampler::add):
(JSC::Yarr::BoyerMooreInfo::findBestCharacterSequence const):
(JSC::Yarr::BoyerMooreInfo::findWorthwhileCharacterSequenceForLookahead const):
(JSC::Yarr::jitCompile):
* Source/JavaScriptCore/yarr/YarrJIT.h:

Canonical link: https://commits.webkit.org/260142@main
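
For illustration only, the sampling idea in the message above can be sketched roughly as follows. This is not the code from YarrJIT.cpp: the class `SubjectSamplerSketch`, the function `pickAnchorPosition`, the 8-bit-only counting, and the "sum of frequencies" weight are all assumptions made for this sketch. Only the 128-character sample size and the use of character frequency as a weight come from the commit message.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <string_view>
#include <vector>

// Hypothetical, simplified stand-in for a subject sampler (8-bit text only).
class SubjectSamplerSketch {
public:
    // Sample size taken from the commit message.
    static constexpr std::size_t sampleSize = 128;

    // Counts characters from the subject until sampleSize characters
    // have been seen in total across all calls.
    void sample(std::string_view subject)
    {
        std::size_t remaining = sampleSize - m_total;
        std::size_t count = std::min(subject.size(), remaining);
        for (std::size_t i = 0; i < count; ++i)
            ++m_counts[static_cast<unsigned char>(subject[i])];
        m_total += count;
    }

    // Number of times c appeared in the sampled characters.
    unsigned frequency(char c) const { return m_counts[static_cast<unsigned char>(c)]; }
    std::size_t total() const { return m_total; }

private:
    std::array<unsigned, 256> m_counts { };
    std::size_t m_total { 0 };
};

// candidates[i] holds the characters the pattern can match at position i.
// Assumed weight: the sum of sampled frequencies of those characters.
// Returns the position with the lowest weight (first one wins on ties),
// i.e. the position whose characters were rarest in the sample.
std::size_t pickAnchorPosition(const SubjectSamplerSketch& sampler,
    const std::vector<std::string_view>& candidates)
{
    assert(!candidates.empty());
    std::size_t best = 0;
    unsigned bestWeight = std::numeric_limits<unsigned>::max();
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        unsigned weight = 0;
        for (char c : candidates[i])
            weight += sampler.frequency(c);
        if (weight < bestWeight) {
            bestWeight = weight;
            best = i;
        }
    }
    return best;
}
```

With a sample in which `a` dominates, a position that can only match `g` gets a lower weight than one that matches `a`, so the sketch prefers it as the search anchor. The actual patch may weigh candidates differently.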
1 parent da77e20, commit d43e2a2
Showing 5 changed files with 101 additions and 35 deletions.