[fix](local exchange) Fix BUCKET_HASH_SHUFFLE partition expr#60764

Open
Gabriel39 wants to merge 2 commits into apache:master from Gabriel39:fix_02141

Conversation

@Gabriel39
Contributor

Gabriel39 commented Feb 14, 2026

What problem does this PR solve?

Refactor require_bucket_distribution to be propagated through the plan tree instead of being accumulated as a class member variable.

Problem:
The _require_bucket_distribution flag was accumulated globally across the entire plan tree, which caused incorrect partition expression selection for BUCKET_HASH_SHUFFLE in branches that shouldn't require bucket distribution.

Solution:

  • Remove _require_bucket_distribution member variable from PipelineFragmentContext
  • Add require_bucket_distribution parameter to _create_tree_helper() and _create_operator()
  • Compute current_require_bucket_distribution per-subtree based on whether the current operator is colocated and uses hash exchange
  • Simplify _build_operators_for_set_operation_node() by removing unnecessary parameters

This ensures bucket distribution requirements are correctly propagated down each branch of the plan tree independently, similar to how followed_by_shuffled_operator is handled.
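
The difference between the two approaches can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: `PlanNode`, `AccumulatingBuilder`, `build_propagating`, and the sample plan are stand-ins invented for illustration and are not the actual `PipelineFragmentContext` code. It also assumes that the per-subtree value is the incoming flag OR-ed with "colocated and uses hash exchange", which is one plausible reading of the description above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified plan node (not the Doris plan node type).
struct PlanNode {
    std::string name;
    bool is_colocated = false;        // assumption: operator works on colocated data
    bool uses_hash_exchange = false;  // assumption: operator needs a hash exchange
    std::vector<std::unique_ptr<PlanNode>> children;
};

// (operator name, flag value that operator observed), in pre-order.
using Seen = std::vector<std::pair<std::string, bool>>;

// Old style: the flag lives in a member and is accumulated over the whole
// traversal, so once one branch sets it, every later branch sees it too.
class AccumulatingBuilder {
public:
    void build(const PlanNode* node, Seen& out) {
        assert(node != nullptr);
        _require_bucket_distribution =
                _require_bucket_distribution ||
                (node->is_colocated && node->uses_hash_exchange);
        out.emplace_back(node->name, _require_bucket_distribution);
        for (const auto& child : node->children) build(child.get(), out);
    }

private:
    bool _require_bucket_distribution = false;
};

// New style: the flag is a parameter, so each subtree gets its own value
// and sibling branches cannot influence each other.
void build_propagating(const PlanNode* node, bool require_bucket_distribution,
                       Seen& out) {
    assert(node != nullptr);
    const bool current_require_bucket_distribution =
            require_bucket_distribution ||
            (node->is_colocated && node->uses_hash_exchange);
    out.emplace_back(node->name, current_require_bucket_distribution);
    for (const auto& child : node->children) {
        build_propagating(child.get(), current_require_bucket_distribution, out);
    }
}

std::unique_ptr<PlanNode> make_node(std::string name, bool colocated, bool hash) {
    auto node = std::make_unique<PlanNode>();
    node->name = std::move(name);
    node->is_colocated = colocated;
    node->uses_hash_exchange = hash;
    return node;
}

// Sample plan: union -> { colocated_join -> scan_a, plain_agg -> scan_b }.
std::unique_ptr<PlanNode> make_sample_plan() {
    auto root = make_node("union", false, false);
    auto left = make_node("colocated_join", true, true);
    left->children.push_back(make_node("scan_a", false, false));
    auto right = make_node("plain_agg", false, false);
    right->children.push_back(make_node("scan_b", false, false));
    root->children.push_back(std::move(left));
    root->children.push_back(std::move(right));
    return root;
}

// Returns the flag recorded for `name`, or false if the name is absent.
bool flag_for(const Seen& seen, const std::string& name) {
    for (const auto& entry : seen) {
        if (entry.first == name) return entry.second;
    }
    return false;
}
```

In this sketch, running `AccumulatingBuilder::build` on the sample plan records the flag as set for `plain_agg` and `scan_b`, even though only the `colocated_join` branch asked for it. `build_propagating(plan.get(), false, seen)` keeps the flag confined to `colocated_join` and `scan_a`.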

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Gabriel39
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 28922 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d6be3efbdd35d5462325c5f186bd579085845274, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17665	4512	4284	4284
q2	
q3	10635	808	536	536
q4	4690	364	259	259
q5	7748	1245	1038	1038
q6	216	175	149	149
q7	831	848	692	692
q8	10580	1558	1468	1468
q9	5771	4824	4746	4746
q10	6860	1881	1612	1612
q11	467	258	234	234
q12	746	572	458	458
q13	17778	4216	3419	3419
q14	229	220	216	216
q15	982	797	786	786
q16	742	711	670	670
q17	824	917	456	456
q18	6131	5341	5209	5209
q19	1402	974	633	633
q20	522	509	398	398
q21	4620	1884	1419	1419
q22	349	287	240	240
Total cold run time: 99788 ms
Total hot run time: 28922 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4469	4342	4354	4342
q2	
q3	1748	2165	1734	1734
q4	859	1187	786	786
q5	4001	4350	4342	4342
q6	174	169	142	142
q7	1711	1585	1479	1479
q8	2402	2678	2525	2525
q9	7860	7613	7547	7547
q10	2640	2843	2451	2451
q11	557	426	411	411
q12	522	598	449	449
q13	3985	4401	3696	3696
q14	301	306	280	280
q15	874	830	801	801
q16	719	777	720	720
q17	1195	1507	1307	1307
q18	7043	6795	6602	6602
q19	975	1027	993	993
q20	2093	2165	2089	2089
q21	3956	3680	3384	3384
q22	462	473	405	405
Total cold run time: 48546 ms
Total hot run time: 46485 ms

@doris-robot

TPC-DS: Total hot run time: 183373 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d6be3efbdd35d5462325c5f186bd579085845274, data reload: false

query5	4867	648	505	505
query6	326	220	211	211
query7	4247	476	277	277
query8	349	266	232	232
query9	8766	2756	2753	2753
query10	531	365	339	339
query11	17033	17349	17269	17269
query12	222	135	131	131
query13	1283	545	361	361
query14	7492	3284	3113	3113
query14_1	2947	2974	2867	2867
query15	207	201	182	182
query16	1037	509	479	479
query17	1108	736	643	643
query18	2919	453	375	375
query19	228	222	180	180
query20	136	130	144	130
query21	224	147	136	136
query22	5241	4962	4864	4864
query23	17210	16699	16564	16564
query23_1	16719	16709	16681	16681
query24	7082	1624	1230	1230
query24_1	1209	1248	1213	1213
query25	560	524	425	425
query26	823	266	154	154
query27	2727	526	284	284
query28	4400	1872	1870	1870
query29	797	545	469	469
query30	308	240	204	204
query31	863	728	654	654
query32	83	73	69	69
query33	512	336	278	278
query34	898	916	553	553
query35	640	688	602	602
query36	1064	1112	929	929
query37	133	92	84	84
query38	2922	2923	2857	2857
query39	905	871	861	861
query39_1	830	825	836	825
query40	225	149	132	132
query41	63	63	57	57
query42	105	104	102	102
query43	368	382	353	353
query44	
query45	195	186	183	183
query46	901	996	607	607
query47	2117	2132	2052	2052
query48	314	319	238	238
query49	634	477	383	383
query50	678	273	214	214
query51	4093	4082	4003	4003
query52	103	111	100	100
query53	290	338	281	281
query54	290	277	259	259
query55	87	82	82	82
query56	351	321	307	307
query57	1339	1294	1229	1229
query58	281	273	277	273
query59	2528	2683	2477	2477
query60	332	333	316	316
query61	147	141	145	141
query62	604	593	554	554
query63	314	272	279	272
query64	4140	1246	986	986
query65	
query66	1338	453	349	349
query67	16513	16264	16313	16264
query68	
query69	398	324	283	283
query70	985	978	957	957
query71	346	306	297	297
query72	2757	2667	2433	2433
query73	540	547	320	320
query74	10041	9929	9727	9727
query75	2833	2721	2471	2471
query76	2315	1038	681	681
query77	361	401	309	309
query78	11292	11330	10658	10658
query79	3033	800	589	589
query80	1768	622	531	531
query81	595	297	243	243
query82	957	145	119	119
query83	331	254	235	235
query84	262	118	99	99
query85	880	467	450	450
query86	518	304	299	299
query87	3110	3065	2971	2971
query88	3645	2699	2669	2669
query89	426	363	341	341
query90	2083	177	170	170
query91	158	154	136	136
query92	84	80	68	68
query93	1793	828	508	508
query94	646	321	290	290
query95	583	391	317	317
query96	640	537	234	234
query97	2450	2484	2398	2398
query98	232	218	212	212
query99	994	949	883	883
Total cold run time: 256768 ms
Total hot run time: 183373 ms

yiguolei pushed a commit that referenced this pull request Feb 14, 2026
### What problem does this PR solve?

pick #60764

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/20) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.68% (19537/37083)
Line Coverage 36.25% (182155/502559)
Region Coverage 32.61% (141407/433614)
Branch Coverage 33.64% (61270/182158)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (20/20) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.43% (26682/36336)
Line Coverage 56.59% (283701/501310)
Region Coverage 54.04% (236674/437991)
Branch Coverage 55.83% (102097/182862)

github-actions bot added the approved label Feb 15, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Labels

approved (indicates a PR has been approved by one committer), dev/4.0.4-merged, reviewed

4 participants