
[refine](sort) Remove heap sorter runtime predicate guard #63937

Closed
Mryange wants to merge 1 commit into apache:master from Mryange:Remove-heap-sort-runtime-predicage-guard


@Mryange
Contributor

Mryange commented Jun 1, 2026

What problem does this PR solve?

Problem Summary:

HeapSorter kept a have_runtime_predicate flag and skipped its local TopN filtering path whenever the sort node had a runtime predicate. The heap sorter should no longer branch on whether a runtime predicate exists. This PR removes that constructor argument and member state, and always uses the existing heap filter path once the heap queue is valid and has reached the heap size.

The sort sink no longer queries QueryContext::has_runtime_predicate() when constructing HeapSorter. The heap sorter unit test is also updated so it checks the expected top value directly instead of reading it from an input block that may be modified by the filtering path.
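
The behavior described above can be illustrated with a minimal sketch. It is not the Doris `HeapSorter`: `TopNHeapSketch`, its `append_block` signature, and the use of plain `int` rows are assumptions made for illustration, modeling `ORDER BY v ASC LIMIT n`. The point it shows is that once the heap is full, the filter path erases rows from the caller's block, which is why a test should not read its expected value back from that block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for a heap sorter (ORDER BY v ASC LIMIT n).
// This is an illustrative sketch, not the Doris HeapSorter class.
class TopNHeapSketch {
public:
    explicit TopNHeapSketch(std::size_t limit) : limit_(limit) { assert(limit > 0); }

    // Appends a block. Once the heap already holds `limit_` rows, rows that
    // cannot beat the current heap top are erased from `block` first, so the
    // caller's block may be modified. There is no runtime-predicate branch.
    void append_block(std::vector<int>& block) {
        if (heap_.size() >= limit_) {
            const int bound = heap_.top();
            block.erase(std::remove_if(block.begin(), block.end(),
                                       [bound](int v) { return v >= bound; }),
                        block.end());
        }
        for (int v : block) {
            if (heap_.size() < limit_) {
                heap_.push(v);
            } else if (v < heap_.top()) {
                heap_.pop();
                heap_.push(v);
            }
        }
    }

    // Worst row currently kept (the top-N bound).
    int top() const {
        assert(!heap_.empty());
        return heap_.top();
    }

private:
    std::size_t limit_;
    std::priority_queue<int> heap_;  // max-heap: top() is the largest kept value
};
```

With `limit = 2`, appending `{5, 1, 9}` and then `{7, 3, 8}` leaves the second block holding only `3`, so a test in this style would compare `top()` against a hard-coded expected value rather than against an element of the input block.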

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Mryange
Contributor Author

Mryange commented Jun 1, 2026

/review

Contributor

@github-actions github-actions Bot left a comment


Review result: no blocking findings found in the changed heap-sorter runtime-predicate path.

Critical checkpoint conclusions:

  • Goal and proof: The PR removes the runtime-predicate guard so HeapSorter always uses its local TopN filtering once the heap is full. The updated unit test covers the changed block-mutation expectation, though I did not run the BE unit test in this runner.
  • Scope: The change is small and focused on HeapSorter construction and the append fast path.
  • Concurrency: No new shared state or locking is introduced. Existing runtime predicate update remains in SortSinkOperatorX::sink after append_block.
  • Lifecycle/static state: No new lifecycle management or static/global initialization concerns.
  • Configuration/compatibility: No config, protocol, storage, or serialization compatibility changes.
  • Parallel paths: TOPN_SORT and FULL_SORT paths are unchanged; the change is specific to HEAP_SORT, which is where the runtime predicate source is handled.
  • Tests: Existing heap sorter unit test was adjusted; no regression or unit test execution was performed locally.
  • Observability: Existing SortSink counters/timers remain, including TopNFilterTime/TopNFilterRows and UpdateRuntimePredicateTime.
  • Data correctness: Local heap filtering and runtime predicate updates both use the current heap top bound; I did not find a path where enabling both drops rows that should remain in the final top-N.
  • Performance: Removing the guard enables the existing local filter in more cases and removes one QueryContext lookup during sorter construction.
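
The data-correctness point in the list above can be checked on a toy model. The sketch below is an assumption-laden illustration, not Doris code: `top_n_two_filters`, `top_n_reference`, and `drop_not_below` are hypothetical names, rows are plain `int` values sorted ascending, and the "runtime predicate" is simulated as the heap top published after the previous block. Because the published bound is never tighter than the current heap top, applying both filters should leave the final top-N unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical helper: erase rows that cannot enter an ascending top-N whose
// worst kept value is `bound`.
static void drop_not_below(std::vector<int>& block, int bound) {
    block.erase(std::remove_if(block.begin(), block.end(),
                               [bound](int v) { return v >= bound; }),
                block.end());
}

// Top-N with two filters stacked: a simulated runtime predicate (the bound
// published after the previous block) and the local heap filter.
std::vector<int> top_n_two_filters(std::vector<std::vector<int>> blocks, std::size_t n) {
    assert(n > 0);
    std::priority_queue<int> heap;  // max-heap over the rows kept so far
    bool published = false;
    int published_bound = 0;
    for (auto& block : blocks) {
        if (published) drop_not_below(block, published_bound);  // upstream filter
        if (heap.size() >= n) drop_not_below(block, heap.top());  // local filter
        for (int v : block) {
            if (heap.size() < n) {
                heap.push(v);
            } else if (v < heap.top()) {
                heap.pop();
                heap.push(v);
            }
        }
        if (heap.size() >= n) {  // publish the bound only once the heap is full
            published = true;
            published_bound = heap.top();
        }
    }
    std::vector<int> out;
    while (!heap.empty()) {
        out.push_back(heap.top());
        heap.pop();
    }
    std::reverse(out.begin(), out.end());  // ascending order
    return out;
}

// Reference result: sort everything and keep the first n rows.
std::vector<int> top_n_reference(const std::vector<std::vector<int>>& blocks, std::size_t n) {
    std::vector<int> all;
    for (const auto& b : blocks) all.insert(all.end(), b.begin(), b.end());
    std::sort(all.begin(), all.end());
    if (all.size() > n) all.resize(n);
    return all;
}
```

Comparing the two functions over a handful of block sequences, including ones with duplicate values, is a cheap way to gain confidence in the argument; it does not replace the BE unit test.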

User focus: No additional user-provided review focus was present.

@Mryange
Contributor Author

Mryange commented Jun 1, 2026

run buildall

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 100.00% (5/5) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 54.06% (21042/38922)
Line Coverage 37.59% (199380/530341)
Region Coverage 33.84% (156026/461076)
Branch Coverage 34.83% (67900/194930)

@Mryange
Contributor Author

Mryange commented Jun 1, 2026

run external

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (5/5) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.83% (28142/38118)
Line Coverage 57.80% (305792/529042)
Region Coverage 54.94% (255759/465488)
Branch Coverage 56.46% (110472/195660)

@hello-stephen
Contributor

TPC-H: Total hot run time: 29484 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5ed6aecd3851f71d14fb75144888bb3242d7dc02, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17780	4166	4072	4072
q2	q3	10806	1511	832	832
q4	4695	470	340	340
q5	7543	881	597	597
q6	184	173	138	138
q7	785	842	654	654
q8	9376	1620	1675	1620
q9	5991	4568	4579	4568
q10	6796	1814	1544	1544
q11	431	266	251	251
q12	636	424	288	288
q13	18145	3329	2809	2809
q14	264	257	238	238
q15	q16	789	800	706	706
q17	917	978	938	938
q18	7386	5765	5528	5528
q19	1149	1220	1041	1041
q20	519	409	270	270
q21	6212	2875	2723	2723
q22	464	381	327	327
Total cold run time: 100868 ms
Total hot run time: 29484 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5057	4733	4800	4733
q2	q3	4967	5338	4690	4690
q4	2143	2186	1398	1398
q5	4726	4924	4767	4767
q6	228	170	130	130
q7	1861	1759	1636	1636
q8	2418	2113	2117	2113
q9	8003	7819	7637	7637
q10	4750	4660	4218	4218
q11	530	380	356	356
q12	736	734	522	522
q13	2984	3346	2776	2776
q14	273	284	247	247
q15	q16	682	696	638	638
q17	1280	1259	1252	1252
q18	7308	6881	6917	6881
q19	1127	1086	1108	1086
q20	2227	2229	1955	1955
q21	5287	4600	4527	4527
q22	535	469	437	437
Total cold run time: 57122 ms
Total hot run time: 51999 ms
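
The result rows above carry no column headers. Judging only from the numbers, each row appears to be: query, cold run, two hot runs, and the faster of the two hot runs, all in milliseconds. In both rounds the first numeric column sums to the reported cold total, and the last column sums to the reported hot total. The sketch below reproduces that arithmetic. The struct and function names are hypothetical, and the column interpretation is an inference from the data, not something confirmed against the tpch-tools scripts.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One result row as inferred from the log: a cold run followed by two hot
// runs, in milliseconds. The field names are assumptions for illustration.
struct QueryTiming {
    int cold_ms;
    int hot1_ms;
    int hot2_ms;
};

struct Totals {
    long long cold_ms;
    long long hot_ms;
};

// Cold total = sum of cold runs; hot total = sum of the faster hot run per query.
Totals summarize(const std::vector<QueryTiming>& rows) {
    Totals t{0, 0};
    for (const auto& r : rows) {
        assert(r.cold_ms >= 0 && r.hot1_ms >= 0 && r.hot2_ms >= 0);
        t.cold_ms += r.cold_ms;
        t.hot_ms += std::min(r.hot1_ms, r.hot2_ms);
    }
    return t;
}
```

Feeding in only the Round 1 rows for q1, q6, and q8 (`{17780, 4166, 4072}`, `{184, 173, 138}`, `{9376, 1620, 1675}`) gives a cold total of 27340 ms and a hot total of 5830 ms.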

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171010 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5ed6aecd3851f71d14fb75144888bb3242d7dc02, data reload: false

query5	4324	661	533	533
query6	349	226	205	205
query7	4261	579	310	310
query8	359	228	218	218
query9	8800	4128	4097	4097
query10	450	357	294	294
query11	5781	2343	2194	2194
query12	188	132	133	132
query13	1278	612	441	441
query14	6058	5435	5111	5111
query14_1	4458	4429	4445	4429
query15	216	211	184	184
query16	1034	462	454	454
query17	995	778	626	626
query18	2480	531	353	353
query19	217	209	169	169
query20	153	139	136	136
query21	210	139	121	121
query22	13588	13597	13378	13378
query23	17281	16562	16238	16238
query23_1	16415	16394	16327	16327
query24	7356	1756	1318	1318
query24_1	1311	1325	1312	1312
query25	545	479	434	434
query26	1322	314	177	177
query27	2718	601	343	343
query28	4457	2043	2019	2019
query29	965	629	518	518
query30	315	236	196	196
query31	1135	1099	945	945
query32	92	84	76	76
query33	546	374	315	315
query34	1228	1142	663	663
query35	793	813	692	692
query36	1417	1401	1271	1271
query37	159	110	101	101
query38	3232	3189	3093	3093
query39	940	932	902	902
query39_1	883	889	883	883
query40	252	155	133	133
query41	70	70	71	70
query42	118	121	111	111
query43	332	335	299	299
query44	
query45	225	214	202	202
query46	1057	1217	757	757
query47	2418	2367	2268	2268
query48	388	430	293	293
query49	650	520	404	404
query50	987	353	270	270
query51	4362	4300	4388	4300
query52	119	112	99	99
query53	263	290	211	211
query54	345	305	274	274
query55	97	92	90	90
query56	323	323	333	323
query57	1453	1432	1339	1339
query58	328	293	286	286
query59	1610	1657	1516	1516
query60	343	357	340	340
query61	186	186	180	180
query62	712	676	596	596
query63	264	219	222	219
query64	2478	862	693	693
query65	
query66	1755	506	379	379
query67	29745	29794	29674	29674
query68	
query69	524	331	293	293
query70	1034	1013	1001	1001
query71	311	285	278	278
query72	3019	2686	2466	2466
query73	861	832	411	411
query74	5171	4956	4741	4741
query75	2724	2603	2267	2267
query76	2262	1147	770	770
query77	407	416	343	343
query78	12266	12490	11811	11811
query79	1403	1135	737	737
query80	658	544	477	477
query81	467	279	241	241
query82	1582	155	127	127
query83	361	290	264	264
query84	311	141	117	117
query85	908	559	457	457
query86	417	337	312	312
query87	3452	3367	3221	3221
query88	3672	2791	2760	2760
query89	451	398	347	347
query90	1969	188	196	188
query91	184	176	148	148
query92	81	80	74	74
query93	1596	1432	863	863
query94	528	354	312	312
query95	678	388	344	344
query96	1061	766	342	342
query97	2754	2755	2605	2605
query98	243	232	228	228
query99	1157	1166	1030	1030
Total cold run time: 254474 ms
Total hot run time: 171010 ms

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (5/5) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.77% (28119/38118)
Line Coverage 57.73% (305416/529042)
Region Coverage 54.89% (255529/465488)
Branch Coverage 56.35% (110262/195660)

Mryange closed this Jun 2, 2026


2 participants