[Opt](function) Optimize like function for non-literal modes #59866
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
TPC-H: Total hot run time: 31344 ms |
TPC-DS: Total hot run time: 172554 ms |
ClickBench: Total hot run time: 26.62 s |
|
run buildall |
TPC-H: Total hot run time: 32201 ms |
TPC-DS: Total hot run time: 173885 ms |
ClickBench: Total hot run time: 26.92 s |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
Pull request overview
This PR optimizes the LIKE function for non-literal pattern modes by introducing fast-path pattern analysis to avoid regex compilation for simple patterns. The optimization reduces query time from over 4 minutes to 4.5 seconds (a ~54x speedup) for patterns like LIKE concat('%', SearchPhrase, '%').
Changes:
- Added `LikeFastPath` enum to categorize pattern types (ALLPASS, EQUALS, STARTS_WITH, ENDS_WITH, SUBSTRING, REGEX)
- Introduced `extract_like_fast_path` function for lightweight pattern analysis without regex compilation
- Modified `like_fn_scalar` to use fast-path implementations before falling back to regex
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| be/src/vec/functions/like.h | Added LikeFastPath enum and extract_like_fast_path function for pattern analysis; updated includes to C++ standard headers |
| be/src/vec/functions/like.cpp | Modified like_fn_scalar to implement fast-path matching for simple patterns before falling back to regex |
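The fast-path idea summarized in the overview can be illustrated with a minimal, standalone sketch. This is not the code from the PR: only the `LikeFastPath` category names come from the description above, while `FastPathResult`, `classify_like_pattern`, and `match_fast_path` are hypothetical names, and the rule of sending any pattern containing `_` or `\` to the regex path is a conservative assumption made here for brevity.

```cpp
#include <cstddef>
#include <string_view>

// Category names taken from the PR description; everything else is illustrative.
enum class LikeFastPath { ALLPASS, EQUALS, STARTS_WITH, ENDS_WITH, SUBSTRING, REGEX };

// Hypothetical result type: the category plus the literal text to compare against.
struct FastPathResult {
    LikeFastPath kind;
    std::string_view literal;
};

// Hypothetical classifier: inspects the pattern once, without compiling a regex.
FastPathResult classify_like_pattern(std::string_view pattern) {
    // Assumption: '_' wildcards and '\' escapes always take the regex path.
    if (pattern.find_first_of("_\\") != std::string_view::npos) {
        return {LikeFastPath::REGEX, {}};
    }
    const std::size_t begin = pattern.find_first_not_of('%');
    if (begin == std::string_view::npos) {
        // Empty pattern matches only the empty string; a run of '%' matches anything.
        return {pattern.empty() ? LikeFastPath::EQUALS : LikeFastPath::ALLPASS, {}};
    }
    const std::size_t end = pattern.find_last_not_of('%') + 1;
    const std::string_view body = pattern.substr(begin, end - begin);
    if (body.find('%') != std::string_view::npos) {
        return {LikeFastPath::REGEX, {}};  // '%' in the middle, e.g. "a%b"
    }
    const bool leading = begin > 0;
    const bool trailing = end < pattern.size();
    if (leading && trailing) return {LikeFastPath::SUBSTRING, body};
    if (leading) return {LikeFastPath::ENDS_WITH, body};
    if (trailing) return {LikeFastPath::STARTS_WITH, body};
    return {LikeFastPath::EQUALS, body};
}

// Hypothetical matcher for the non-regex categories.
// Returns false for REGEX; a real caller would fall back to the regex engine instead.
bool match_fast_path(const FastPathResult& fp, std::string_view value) {
    const std::string_view lit = fp.literal;
    switch (fp.kind) {
    case LikeFastPath::ALLPASS:
        return true;
    case LikeFastPath::EQUALS:
        return value == lit;
    case LikeFastPath::STARTS_WITH:
        return value.size() >= lit.size() && value.substr(0, lit.size()) == lit;
    case LikeFastPath::ENDS_WITH:
        return value.size() >= lit.size() && value.substr(value.size() - lit.size()) == lit;
    case LikeFastPath::SUBSTRING:
        return value.find(lit) != std::string_view::npos;
    case LikeFastPath::REGEX:
        return false;
    }
    return false;
}
```

For a non-literal pattern such as `concat('%', SearchPhrase, '%')`, the pattern differs per row, so a classifier like this would run once per row and typically select the SUBSTRING category, replacing a per-row regex compilation with a plain substring search.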
|
run beut |
BE UT Coverage Report: Increment line coverage; Increment coverage report
|
|
run p0 |
|
run external |
|
run vault_p0 |
|
run cloud_p0 |
BE Regression && UT Coverage Report: Increment line coverage; Increment coverage report
|
before:
```sql
mysql> SELECT count(*) FROM hits_100m WHERE URL LIKE concat('%', SearchPhrase, '%');
+----------+
| count(*) |
+----------+
| 90144150 |
+----------+
1 row in set (4 min 5.15 sec)
```
now:
```sql
mysql> SELECT count(*) FROM hits_100m WHERE URL LIKE concat('%', SearchPhrase, '%');
+----------+
| count(*) |
+----------+
| 90144150 |
+----------+
1 row in set (4.50 sec)
```
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
before:
now:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)