[Fix](pipelinex) Fix MaxScannerThreadNum calculation error in file scan operator when turn on pipelinex. #33037

kaka11chen
Contributor

Proposed changes

MaxScannerThreadNum in the file scan operator is calculated incorrectly when pipelinex is turned on, which consumes a lot of memory and degrades performance. This PR fixes it.


@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

Contributor

PR approved by at least one committer and no changes requested.

@github-actions (bot) added the "approved" (indicates a PR has been approved by one committer) and "reviewed" labels on Mar 29, 2024
Contributor

PR approved by anyone and no changes requested.

Contributor

clang-tidy review says "All clean, LGTM! 👍"

Contributor

@morningman left a comment

LGTM

@morningman
Contributor

run buildall

@kaka11chen changed the title from "[Fix] (pipelinex) Fix MaxScannerThreadNum calculation error in file scan operator when turn on pipelinex." to "[Fix](pipelinex) Fix MaxScannerThreadNum calculation error in file scan operator when turn on pipelinex." on Mar 29, 2024
@doris-robot

TPC-H: Total hot run time: 38813 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 355033cefe1130b99b49247b6b7bd89baba6dd31, data reload: false

------ Round 1 ----------------------------------
q1	17776	4151	4092	4092
q2	2103	194	196	194
q3	10464	1196	1393	1196
q4	10202	827	977	827
q5	7467	3003	2960	2960
q6	216	139	136	136
q7	1128	639	612	612
q8	9424	2078	2046	2046
q9	6691	6247	6164	6164
q10	8469	3510	3506	3506
q11	425	246	236	236
q12	385	231	210	210
q13	17773	2864	2913	2864
q14	271	237	244	237
q15	529	485	473	473
q16	500	381	378	378
q17	962	900	880	880
q18	7505	6521	6573	6521
q19	1590	1533	1547	1533
q20	612	314	310	310
q21	3575	3132	3170	3132
q22	367	309	306	306
Total cold run time: 108434 ms
Total hot run time: 38813 ms
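
The column layout of the report is not labelled by the bot. The Round 1 numbers are consistent with the reading "query, cold run, hot run 1, hot run 2, best hot run", where the cold total sums the second column and the hot total sums the last one. The sketch below checks that reading against the rows above; the function name `summarize` and the column interpretation are assumptions made for illustration, not something the bot documents.

```python
# Round 1 rows copied from the report above.
# Assumed columns: query, cold run, hot run 1, hot run 2, best hot run (all in ms).
ROUND_1 = """
q1 17776 4151 4092 4092
q2 2103 194 196 194
q3 10464 1196 1393 1196
q4 10202 827 977 827
q5 7467 3003 2960 2960
q6 216 139 136 136
q7 1128 639 612 612
q8 9424 2078 2046 2046
q9 6691 6247 6164 6164
q10 8469 3510 3506 3506
q11 425 246 236 236
q12 385 231 210 210
q13 17773 2864 2913 2864
q14 271 237 244 237
q15 529 485 473 473
q16 500 381 378 378
q17 962 900 880 880
q18 7505 6521 6573 6521
q19 1590 1533 1547 1533
q20 612 314 310 310
q21 3575 3132 3170 3132
q22 367 309 306 306
"""


def summarize(report: str) -> tuple[int, int]:
    """Return (total cold ms, total hot ms) for a whitespace-separated report."""
    cold_total = 0
    hot_total = 0
    for line in report.strip().splitlines():
        name, cold, hot1, hot2, best = line.split()
        cold, hot1, hot2, best = int(cold), int(hot1), int(hot2), int(best)
        # Check the assumed meaning of the last column for every row.
        assert best == min(hot1, hot2), f"{name}: last column is not min(hot1, hot2)"
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


cold_total, hot_total = summarize(ROUND_1)
print(f"Total cold run time: {cold_total} ms")  # 108434 ms
print(f"Total hot run time: {hot_total} ms")    # 38813 ms
```

Both totals match the figures reported for Round 1, which supports reading the last column as the faster of the two hot runs.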

----- Round 2, with runtime_filter_mode=off -----
q1	4089	4020	4040	4020
q2	333	221	222	221
q3	2984	2936	2962	2936
q4	1872	1895	1856	1856
q5	5255	5232	5248	5232
q6	209	123	126	123
q7	2248	1839	1819	1819
q8	3232	3308	3289	3289
q9	8525	8498	8510	8498
q10	3762	4700	4010	4010
q11	550	462	479	462
q12	785	624	645	624
q13	13400	3227	3118	3118
q14	328	280	280	280
q15	523	486	473	473
q16	496	433	417	417
q17	1783	1753	1746	1746
q18	8186	7855	7657	7657
q19	1725	1686	1664	1664
q20	1980	1823	1846	1823
q21	5188	4936	5001	4936
q22	508	452	452	452
Total cold run time: 67961 ms
Total hot run time: 55656 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.56% (8840/24861)
Line Coverage: 27.30% (72485/265559)
Region Coverage: 26.50% (37504/141540)
Branch Coverage: 23.31% (19122/82048)
Coverage Report: http://coverage.selectdb-in.cc/coverage/355033cefe1130b99b49247b6b7bd89baba6dd31_355033cefe1130b99b49247b6b7bd89baba6dd31/report/index.html

@doris-robot

TPC-DS: Total hot run time: 181625 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 355033cefe1130b99b49247b6b7bd89baba6dd31, data reload: false

query1	1232	1128	361	361
query2	6215	2041	1957	1957
query3	6672	224	214	214
query4	24701	21443	21485	21443
query5	4144	403	396	396
query6	275	200	180	180
query7	4607	305	310	305
query8	224	169	179	169
query9	8475	2202	2218	2202
query10	427	240	257	240
query11	15165	14480	14391	14391
query12	147	103	97	97
query13	1641	388	421	388
query14	8431	6943	6832	6832
query15	204	174	178	174
query16	6727	285	294	285
query17	949	626	604	604
query18	1851	295	291	291
query19	212	171	166	166
query20	104	95	99	95
query21	194	138	140	138
query22	4969	4849	4758	4758
query23	33255	32661	32457	32457
query24	13482	3226	3265	3226
query25	719	431	440	431
query26	1764	171	174	171
query27	3169	388	391	388
query28	6950	1919	1863	1863
query29	1266	632	637	632
query30	305	159	161	159
query31	1030	760	774	760
query32	105	65	68	65
query33	732	261	256	256
query34	1132	515	534	515
query35	894	750	744	744
query36	1019	882	866	866
query37	279	77	82	77
query38	3628	3557	3588	3557
query39	1653	1610	1601	1601
query40	241	156	147	147
query41	50	48	47	47
query42	118	110	113	110
query43	466	418	417	417
query44	1115	744	754	744
query45	297	280	277	277
query46	1113	865	816	816
query47	2038	1898	1924	1898
query48	413	336	331	331
query49	971	394	393	393
query50	846	430	428	428
query51	6966	6931	6841	6841
query52	117	96	104	96
query53	375	304	310	304
query54	302	252	245	245
query55	96	89	80	80
query56	258	242	233	233
query57	1260	1182	1195	1182
query58	262	232	232	232
query59	2851	2583	2379	2379
query60	259	243	230	230
query61	95	87	87	87
query62	664	462	454	454
query63	308	283	285	283
query64	5831	3082	3427	3082
query65	3053	3007	3012	3007
query66	1310	338	326	326
query67	15609	14817	14865	14817
query68	7586	575	570	570
query69	563	335	323	323
query70	1202	1133	1089	1089
query71	520	272	289	272
query72	6281	2593	2423	2423
query73	802	331	334	331
query74	6837	6343	6345	6343
query75	3650	2302	2304	2302
query76	4976	1159	1207	1159
query77	543	268	246	246
query78	11076	10190	10097	10097
query79	9693	547	544	544
query80	1740	424	451	424
query81	528	227	221	221
query82	681	104	103	103
query83	206	169	166	166
query84	263	88	93	88
query85	1387	297	284	284
query86	476	279	248	248
query87	3661	3530	3469	3469
query88	4299	2438	2414	2414
query89	578	367	370	367
query90	1907	181	184	181
query91	141	107	110	107
query92	61	52	52	52
query93	7375	534	538	534
query94	1099	201	201	201
query95	431	328	333	328
query96	628	275	275	275
query97	2708	2472	2499	2472
query98	233	223	216	216
query99	1340	836	861	836
Total cold run time: 299703 ms
Total hot run time: 181625 ms

@doris-robot

ClickBench: Total hot run time: 29.87 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 355033cefe1130b99b49247b6b7bd89baba6dd31, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.23	0.05	0.05
query4	1.67	0.07	0.07
query5	0.48	0.48	0.49
query6	1.14	0.65	0.66
query7	0.02	0.01	0.02
query8	0.05	0.05	0.05
query9	0.56	0.52	0.51
query10	0.56	0.57	0.58
query11	0.14	0.11	0.12
query12	0.13	0.12	0.12
query13	0.60	0.60	0.59
query14	0.78	0.79	0.78
query15	0.87	0.84	0.84
query16	0.34	0.35	0.36
query17	0.97	1.00	0.97
query18	0.25	0.26	0.24
query19	1.83	1.73	1.74
query20	0.01	0.01	0.01
query21	15.54	0.76	0.71
query22	3.00	5.58	1.38
query23	17.50	1.42	1.11
query24	1.35	0.22	0.25
query25	0.13	0.09	0.09
query26	0.29	0.19	0.20
query27	0.08	0.09	0.09
query28	13.87	0.97	0.93
query29	12.72	3.33	3.32
query30	0.28	0.09	0.09
query31	2.80	0.39	0.39
query32	3.29	0.48	0.48
query33	2.81	2.86	2.90
query34	15.50	4.38	4.31
query35	4.40	4.40	4.38
query36	0.67	0.48	0.48
query37	0.19	0.19	0.17
query38	0.19	0.15	0.17
query39	0.04	0.04	0.05
query40	0.17	0.14	0.15
query41	0.09	0.05	0.05
query42	0.07	0.06	0.05
query43	0.04	0.04	0.04
Total cold run time: 105.77 s
Total hot run time: 29.87 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 355033cefe1130b99b49247b6b7bd89baba6dd31 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       16.9 seconds inserted 10000000 Rows, about 591K ops/s
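
The throughput figures above can be reproduced from the byte counts and durations if "MB" is read as 2^20 bytes and the result is truncated rather than rounded. This is an inference from the numbers, not the bot's documented formula, and the helper names below are hypothetical.

```python
MIB = 1024 * 1024  # assumption: the report's "MB" means 2**20 bytes


def stream_load_rate(total_bytes: int, seconds: int) -> int:
    """Throughput in whole MiB/s, truncated (assumed to match the report)."""
    return total_bytes // (seconds * MIB)


def insert_rate_k(rows: int, seconds: float) -> int:
    """Rows per second in thousands, truncated (assumed to match the report)."""
    return int(rows / seconds / 1000)


print(stream_load_rate(2358488459, 19))  # json:    118
print(stream_load_rate(1101869774, 58))  # orc:     18
print(stream_load_rate(861443392, 32))   # parquet: 25
print(insert_rate_k(10_000_000, 16.9))   # insert into select: 591
```

Note that with decimal megabytes (10^6 bytes) the JSON load would come out at about 124 MB/s, so the unit choice matters when comparing these numbers with other tools.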

@morningman merged commit fdd4036 into apache:master on Mar 30, 2024
26 of 31 checks passed
yiguolei pushed a commit that referenced this pull request Apr 1, 2024
…scan operator when turn on pipelinex. (#33037)

morningman pushed a commit to morningman/doris that referenced this pull request Apr 7, 2024
…scan operator when turn on pipelinex. (apache#33037)

Labels
approved Indicates a PR has been approved by one committer. dev/2.1.2-merged reviewed