Consolidate special case regexp_match logic#21486
Conversation
For anchored patterns like `^...(capture)....*$` where the replacement is `\1`, build a shorter regex (stripping trailing `.*$`) and use `captures_read` with `CaptureLocations` for direct extraction — no `expand()`, no `String` allocation. 2.4x improvement.
|
run benchmark regx |
|
run benchmark clickbench_1 |
|
run clickbench_partitioned |
|
🤖 Criterion benchmark running (GKE) | trigger CPU Details (lscpu)Comparing alamb/test_coverage (2117a33) to 603bfb4 (merge-base) diff File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing alamb/test_coverage (2117a33) to 603bfb4 (merge-base) diff using: clickbench_1 File an issue against this benchmark runner |
|
🤖 Criterion benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagebase (merge-base)
branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usageclickbench_1 — base (merge-base)
clickbench_1 — branch
File an issue against this benchmark runner |
regexp_replaceby stripping trailing .* from anchored patterns. 2.4x improvement (ClickBench Q28) #21379Which issue does this PR close?
regexp_replaceby stripping trailing .* from anchored patterns. 2.4x improvement (ClickBench Q28) #21379 from @DandandanRationale for this change
#21379 adds a specific optimization, but it had some non trivial code duplication
I wanted to reduce the duplication and also make the code easier to read(in my opinion)
What changes are included in this PR?
Move the special RegExp logic into its own struct, and add copious comments
Are these changes tested?
By existing tests, and the additional tests added in
Are there any user-facing changes?