-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[feat](map) remove duplicated keys in ColumnMap #54068
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
Possible file(s) that should be tracked in LFS detected: 🚨The following file(s) exceeds the file size limit:
Consider using |
TPC-H: Total hot run time: 33786 ms |
TPC-DS: Total hot run time: 173861 ms |
ClickBench: Total hot run time: 32.59 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
Possible file(s) that should be tracked in LFS detected: 🚨The following file(s) exceeds the file size limit:
Consider using |
FE UT Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 34171 ms |
TPC-DS: Total hot run time: 162234 ms |
ClickBench: Total hot run time: 32.76 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
Possible file(s) that should be tracked in LFS detected: 🚨The following file(s) exceeds the file size limit:
Consider using |
|
run buildall |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 34142 ms |
TPC-DS: Total hot run time: 172407 ms |
ClickBench: Total hot run time: 33.65 s |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
| assert_cast<const ColumnArray&>(*k_arr).get_offsets_ptr()); | ||
| } | ||
|
|
||
| Status ColumnMap::deduplicate_keys(bool recursive) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
两个block合并的时候也需要去重
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
b08eae0 to
24cea7b
Compare
|
run buildall |
Possible file(s) that should be tracked in LFS detected: 🚨The following file(s) exceeds the file size limit:
Consider using |
TPC-H: Total hot run time: 34630 ms |
| result_col_map_offsets[row] = offset; | ||
| } | ||
|
|
||
| RETURN_IF_ERROR(map_column->deduplicate_keys()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
map_column->deduplicate_keys(true)? 比如是:map(2, map(2, 3, 2, 3))
| const auto inner_rows = keys_column->size(); | ||
| const auto rows = offsets_column->size(); | ||
|
|
||
| if (recursive) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里是不是总是检查嵌套的map ,如果是就去去重一遍,这样调用 deduplicate_keys 不用考虑是否嵌套了。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
理论上咱们生成一个 map column 时会就会去重,那么大部分场景下可能不需要递归式的去重。
TPC-DS: Total hot run time: 188693 ms |
ClickBench: Total hot run time: 29.68 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
24cea7b to
23d61a4
Compare
|
run buildall |
Possible file(s) that should be tracked in LFS detected: 🚨The following file(s) exceeds the file size limit:
Consider using |
TPC-H: Total hot run time: 34873 ms |
TPC-DS: Total hot run time: 189719 ms |
ClickBench: Total hot run time: 30.01 s |
FE UT Coverage ReportIncrement line coverage |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
FE Regression Coverage ReportIncrement line coverage |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
FE Regression Coverage ReportIncrement line coverage |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
csun5285
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Possible file(s) that should be tracked in LFS detected: 🚨The following file(s) exceeds the file size limit:
Consider using |
What problem does this PR solve?
Doc: apache/doris-website#2676
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)