[improvement](compaction) start 1 cumu compaction thread each disk by default #29430

Merged
dataroaring merged 4 commits into apache:master from dataroaring:compaction_opt on Jan 3, 2024

Conversation

@dataroaring
Contributor

We start one cumulative (cumu) compaction thread and one base compaction thread per disk.
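
For illustration, a per-disk default like the one described above could be resolved as follows. This is a minimal Python sketch, not the Doris implementation: the function name `resolve_compaction_threads` and the convention that a non-positive setting means "use the per-disk default" are assumptions made for the example and are not taken from this thread.

```python
def resolve_compaction_threads(configured: int, num_disks: int) -> int:
    """Return the thread-pool size for one compaction type.

    Assumption (hypothetical, not stated in this thread): a non-positive
    `configured` value means "use the default", i.e. one thread per disk.
    """
    if num_disks < 1:
        raise ValueError("a backend needs at least one data disk")
    if configured <= 0:
        return num_disks  # default: one thread per disk
    return configured     # an explicit setting overrides the default


if __name__ == "__main__":
    for disks in (1, 4, 12):
        print(disks, resolve_compaction_threads(-1, disks))
```

With this kind of default, the pool grows with the number of data disks instead of being a fixed number per backend.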

Proposed changes

Issue Number: close #xxx

@github-actions
Contributor

github-actions bot commented Jan 2, 2024

clang-tidy review says "All clean, LGTM! 👍"

@dataroaring
Contributor Author

run buildall

@github-actions
Contributor

github-actions bot commented Jan 2, 2024

PR approved by anyone and no changes requested.

@dataroaring changed the title from "(compaction) compaction threads is 1 for each disk" to "[improvement](compaction) compaction threads is 1 for each disk" on Jan 2, 2024
@github-actions
Contributor

github-actions bot commented Jan 2, 2024

clang-tidy review says "All clean, LGTM! 👍"

@dataroaring
Contributor Author

run buildall

@github-actions
Contributor

github-actions bot commented Jan 2, 2024

clang-tidy review says "All clean, LGTM! 👍"

@luwei16
Contributor

luwei16 commented Jan 2, 2024

LGTM

@dataroaring changed the title from "[improvement](compaction) compaction threads is 1 for each disk" to "[improvement](compaction) start 1 cumu compaction thread and 1 base compaction thread each disk" on Jan 2, 2024
@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit e471199b8845e6f37f3cf8bf013796b49f3b3396, data reload: false

------ Round 1 ----------------------------------
q1	18661	5465	5158	5158
q2	2044	161	150	150
q3	10611	1065	1119	1065
q4	10248	792	834	792
q5	7802	2990	2925	2925
q6	214	138	135	135
q7	908	515	552	515
q8	9301	2018	2028	2018
q9	6837	6400	6345	6345
q10	8262	3047	2980	2980
q11	425	220	216	216
q12	392	243	239	239
q13	18013	3624	3630	3624
q14	249	215	210	210
q15	599	545	541	541
q16	456	399	420	399
q17	962	492	549	492
q18	7530	6748	6723	6723
q19	1575	1399	1307	1307
q20	727	347	315	315
q21	2814	2392	2420	2392
q22	378	319	360	319
Total cold run time: 109008 ms
Total hot run time: 38860 ms
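
The two totals can be reproduced from the rows above. The sketch below assumes the columns are query name, cold run, first hot run, second hot run, and best hot run (all in ms); that reading is inferred from the numbers, not stated by the bot. Under that assumption, the cold total is the sum of the cold column and the hot total is the sum of the best-hot column. The helper name `summarize` is made up for this example.

```python
# Round 1 rows copied from the table above: query, cold, hot1, hot2, best.
ROUND_1 = """
q1 18661 5465 5158 5158
q2 2044 161 150 150
q3 10611 1065 1119 1065
q4 10248 792 834 792
q5 7802 2990 2925 2925
q6 214 138 135 135
q7 908 515 552 515
q8 9301 2018 2028 2018
q9 6837 6400 6345 6345
q10 8262 3047 2980 2980
q11 425 220 216 216
q12 392 243 239 239
q13 18013 3624 3630 3624
q14 249 215 210 210
q15 599 545 541 541
q16 456 399 420 399
q17 962 492 549 492
q18 7530 6748 6723 6723
q19 1575 1399 1307 1307
q20 727 347 315 315
q21 2814 2392 2420 2392
q22 378 319 360 319
"""


def summarize(table: str) -> tuple[int, int]:
    """Return (total cold ms, total hot ms) for a whitespace-separated table."""
    cold_total = 0
    hot_total = 0
    for line in table.strip().splitlines():
        name, cold, hot1, hot2, best = line.split()
        # Assumed column meaning: the last column is the faster hot run.
        assert int(best) == min(int(hot1), int(hot2)), name
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


if __name__ == "__main__":
    print(summarize(ROUND_1))  # (109008, 38860), matching the reported totals
```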

----- Round 2, with runtime_filter_mode=off -----
q1	5150	5064	5084	5064
q2	339	235	228	228
q3	3344	3308	3275	3275
q4	2227	1995	1977	1977
q5	5806	5756	5803	5756
q6	217	128	125	125
q7	2350	1927	1881	1881
q8	3360	3427	3464	3427
q9	8904	8851	8883	8851
q10	3829	3854	3873	3854
q11	585	482	518	482
q12	805	639	689	639
q13	8256	3157	3192	3157
q14	289	266	272	266
q15	589	537	530	530
q16	553	539	506	506
q17	1935	1749	1755	1749
q18	8733	8489	8324	8324
q19	1620	1594	1584	1584
q20	2210	1971	1969	1969
q21	5590	5227	5305	5227
q22	557	518	500	500
Total cold run time: 67248 ms
Total hot run time: 59371 ms

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit e471199b8845e6f37f3cf8bf013796b49f3b3396, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5507	5123	5070	5070
q2	388	167	155	155
q3	1442	1112	1128	1112
q4	1107	820	801	801
q5	3102	3116	3130	3116
q6	221	144	131	131
q7	969	541	535	535
q8	2166	2293	2212	2212
q9	6667	6667	6679	6667
q10	3148	3161	3117	3117
q11	352	224	216	216
q12	374	243	232	232
q13	4398	3661	3624	3624
q14	247	208	218	208
q15	623	555	553	553
q16	457	419	419	419
q17	1041	438	452	438
q18	7106	6787	6826	6787
q19	1641	1512	1558	1512
q20	604	360	367	360
q21	2862	2465	2492	2465
q22	386	331	331	331
Total cold run time: 44808 ms
Total hot run time: 40061 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	5229	5060	5093	5060
q2	335	251	235	235
q3	3374	3346	3309	3309
q4	2141	2016	2005	2005
q5	5952	5917	5925	5917
q6	231	124	126	124
q7	2384	1935	1937	1935
q8	3549	3650	3681	3650
q9	9052	9012	8997	8997
q10	3898	3894	3938	3894
q11	591	485	476	476
q12	820	705	631	631
q13	3900	3193	3167	3167
q14	298	272	273	272
q15	617	559	545	545
q16	542	520	505	505
q17	2039	1803	1778	1778
q18	8728	8466	8448	8448
q19	1888	1677	1692	1677
q20	2282	2001	1996	1996
q21	5669	5428	5275	5275
q22	556	491	514	491
Total cold run time: 64075 ms
Total hot run time: 60387 ms

@dataroaring changed the title from "[improvement](compaction) start 1 cumu compaction thread and 1 base compaction thread each disk" to "[improvement](compaction) start 1 cumu compaction thread and 1 base compaction thread each disk by default" on Jan 2, 2024
morningman previously approved these changes Jan 2, 2024
Contributor

@morningman morningman left a comment

LGTM

@github-actions
Contributor

github-actions bot commented Jan 2, 2024

PR approved by at least one committer and no changes requested.

@github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Jan 2, 2024
@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.64% (8618/23521)
Line Coverage: 28.70% (70051/244122)
Region Coverage: 27.68% (36260/131000)
Branch Coverage: 24.37% (18522/76012)
Coverage Report: http://coverage.selectdb-in.cc/coverage/e471199b8845e6f37f3cf8bf013796b49f3b3396_e471199b8845e6f37f3cf8bf013796b49f3b3396/report/index.html

@dataroaring changed the title from "[improvement](compaction) start 1 cumu compaction thread and 1 base compaction thread each disk by default" to "[improvement](compaction) start 1 cumu compaction thread each disk by default" on Jan 2, 2024
@github-actions bot removed the "approved" label (Indicates a PR has been approved by one committer.) on Jan 2, 2024
@doris-robot

TPC-DS test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpcds-tools

TPC-DS sf100 test result on commit e471199b8845e6f37f3cf8bf013796b49f3b3396, data reload: false

run tpcds-sf100 query with default conf and session variables
query1	939	343	347	343
query2	6423	1863	1904	1863
query3	6660	215	205	205
query4	26302	22499	22510	22499
query5	5708	510	534	510
query6	275	182	179	179
query7	4570	271	270	270
query8	227	198	201	198
query9	8218	2569	2598	2569
query10	456	276	261	261
query11	16194	15747	15602	15602
query12	127	81	75	75
query13	1624	327	321	321
query14	11651	7152	7124	7124
query15	237	188	203	188
query16	6418	278	276	276
query17	1808	493	486	486
query18	1929	275	262	262
query19	270	139	142	139
query20	84	80	77	77
query21	179	97	97	97
query22	5099	4847	4798	4798
query23	31993	31314	31206	31206
query24	11880	2825	2828	2825
query25	575	337	347	337
query26	1693	146	146	146
query27	2846	279	287	279
query28	7020	1925	1919	1919
query29	2045	401	380	380
query30	298	145	145	145
query31	981	765	763	763
query32	90	57	61	57
query33	737	275	273	273
query34	862	438	432	432
query35	870	749	710	710
query36	1270	1272	1192	1192
query37	186	75	73	73
query38	3379	3305	3246	3246
query39	1336	1313	1301	1301
query40	304	91	93	91
query41	41	35	35	35
query42	97	87	95	87
query43	538	496	493	493
query44	1055	702	710	702
query45	193	188	184	184
query46	1067	632	633	632
query47	1701	1568	1502	1502
query48	327	261	250	250
query49	1201	325	325	325
query50	757	334	328	328
query51	5411	5247	5334	5247
query52	91	88	83	83
query53	218	153	153	153
query54	1400	561	597	561
query55	95	95	86	86
query56	207	195	190	190
query57	1059	952	960	952
query58	216	199	205	199
query59	2877	2685	2651	2651
query60	263	242	250	242
query61	82	81	81	81
query62	643	471	457	457
query63	161	151	154	151
query64	5881	1774	1748	1748
query65	3345	3258	3261	3258
query66	1276	344	330	330
query67	15639	15532	15170	15170
query68	12523	535	559	535
query69	505	252	239	239
query70	1709	1566	1473	1473
query71	497	233	227	227
query72	5548	3539	3529	3529
query73	2838	318	317	317
query74	7031	6381	6487	6381
query75	5031	2246	2280	2246
query76	6229	1074	1102	1074
query77	675	284	283	283
query78	9096	8867	8497	8497
query79	5008	509	509	509
query80	678	361	360	360
query81	450	210	206	206
query82	214	104	102	102
query83	164	138	146	138
query84	242	55	55	55
query85	938	273	271	271
query86	458	387	401	387
query87	3473	3400	3345	3345
query88	3656	2276	2299	2276
query89	346	267	264	264
query90	1875	207	197	197
query91	116	97	91	91
query92	63	51	55	51
query93	3289	507	492	492
query94	808	186	191	186
query95	452	416	397	397
query96	640	324	321	321
query97	4290	4199	4191	4191
query98	209	189	191	189
query99	1124	803	885	803
Total cold run time: 299779 ms
Total hot run time: 179446 ms

@dataroaring
Contributor Author

run buildall

@github-actions
Contributor

github-actions bot commented Jan 2, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit 44fec777dc1d389e5bf5447fbcfd3f3a6309c68e, data reload: false

------ Round 1 ----------------------------------
q1	17616	5120	5136	5120
q2	2024	156	144	144
q3	10531	1128	1077	1077
q4	10195	791	840	791
q5	7800	2968	2944	2944
q6	215	138	134	134
q7	912	522	546	522
q8	9300	2005	2028	2005
q9	7004	6464	6425	6425
q10	8272	3066	3009	3009
q11	437	219	211	211
q12	390	232	234	232
q13	18005	3620	3629	3620
q14	236	210	215	210
q15	577	533	548	533
q16	495	392	423	392
q17	976	482	619	482
q18	7431	6828	6677	6677
q19	1571	1408	1443	1408
q20	712	349	356	349
q21	2803	2408	2423	2408
q22	374	323	336	323
Total cold run time: 107876 ms
Total hot run time: 39016 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5171	5079	5059	5059
q2	332	249	256	249
q3	3313	3293	3264	3264
q4	2109	2012	1995	1995
q5	5829	5810	5784	5784
q6	217	124	127	124
q7	2334	1942	1903	1903
q8	3399	3443	3465	3443
q9	8781	8735	8716	8716
q10	3789	3827	3862	3827
q11	587	460	488	460
q12	795	625	653	625
q13	13452	3180	3188	3180
q14	287	260	266	260
q15	599	534	536	534
q16	557	504	537	504
q17	1938	1763	1758	1758
q18	8709	8369	8304	8304
q19	1634	1591	1583	1583
q20	2224	1972	1951	1951
q21	5688	5348	5303	5303
q22	525	534	471	471
Total cold run time: 72269 ms
Total hot run time: 59297 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.63% (8616/23521)
Line Coverage: 28.68% (70023/244122)
Region Coverage: 27.67% (36245/131000)
Branch Coverage: 24.36% (18519/76012)
Coverage Report: http://coverage.selectdb-in.cc/coverage/44fec777dc1d389e5bf5447fbcfd3f3a6309c68e_44fec777dc1d389e5bf5447fbcfd3f3a6309c68e/report/index.html

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit 44fec777dc1d389e5bf5447fbcfd3f3a6309c68e, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5515	5175	5178	5175
q2	402	173	166	166
q3	1452	1122	1120	1120
q4	1082	861	810	810
q5	3084	3136	3157	3136
q6	223	143	130	130
q7	981	568	513	513
q8	2161	2206	2302	2206
q9	6687	6691	6662	6662
q10	3184	3093	3112	3093
q11	339	221	213	213
q12	383	231	230	230
q13	4389	3628	3627	3627
q14	262	225	213	213
q15	608	557	540	540
q16	477	400	416	400
q17	1058	522	528	522
q18	7057	6777	6780	6777
q19	1645	1532	1473	1473
q20	595	362	355	355
q21	2903	2466	2486	2466
q22	391	315	315	315
Total cold run time: 44878 ms
Total hot run time: 40142 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	5137	5027	5137	5027
q2	331	258	238	238
q3	3360	3321	3308	3308
q4	2134	2009	2025	2009
q5	5937	5944	5924	5924
q6	225	123	124	123
q7	2383	1959	1913	1913
q8	3583	3638	3670	3638
q9	9049	8982	9019	8982
q10	3869	3899	3910	3899
q11	580	476	484	476
q12	807	649	623	623
q13	3909	3205	3197	3197
q14	291	305	277	277
q15	612	544	552	544
q16	582	473	524	473
q17	2043	1820	1823	1820
q18	8760	8354	8417	8354
q19	1750	1659	1664	1659
q20	2284	1994	2001	1994
q21	5656	5270	5335	5270
q22	543	526	487	487
Total cold run time: 63825 ms
Total hot run time: 60235 ms

@doris-robot

TPC-DS test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpcds-tools

TPC-DS sf100 test result on commit 44fec777dc1d389e5bf5447fbcfd3f3a6309c68e, data reload: false

run tpcds-sf100 query with default conf and session variables
query1	891	358	344	344
query2	3186	1895	1842	1842
query3	5101	210	204	204
query4	27223	22372	22367	22367
query5	2707	559	523	523
query6	245	173	176	173
query7	4174	277	275	275
query8	213	201	203	201
query9	6829	2631	2551	2551
query10	398	266	242	242
query11	16257	15435	15461	15435
query12	129	77	76	76
query13	1317	332	339	332
query14	9754	7178	7083	7083
query15	222	191	198	191
query16	5338	275	272	272
query17	1562	494	498	494
query18	1651	274	265	265
query19	247	145	138	138
query20	85	77	81	77
query21	164	98	93	93
query22	5178	4846	4600	4600
query23	32051	31223	31022	31022
query24	12175	2817	2806	2806
query25	619	345	341	341
query26	1420	144	147	144
query27	2869	284	280	280
query28	6855	1938	1934	1934
query29	1474	403	397	397
query30	274	141	157	141
query31	1005	782	787	782
query32	65	61	56	56
query33	727	278	275	275
query34	919	446	428	428
query35	833	739	755	739
query36	1378	1221	1315	1221
query37	141	71	72	71
query38	3417	3282	3270	3270
query39	1324	1297	1270	1270
query40	236	90	90	90
query41	37	36	35	35
query42	96	90	86	86
query43	501	494	498	494
query44	1054	701	720	701
query45	194	187	185	185
query46	1052	630	644	630
query47	1681	1603	1574	1574
query48	349	255	256	255
query49	980	327	321	321
query50	735	313	323	313
query51	5339	5195	5248	5195
query52	90	98	85	85
query53	219	143	152	143
query54	2007	570	568	568
query55	96	90	87	87
query56	218	194	204	194
query57	1036	952	967	952
query58	230	204	210	204
query59	2791	2680	2631	2631
query60	270	258	247	247
query61	82	82	88	82
query62	643	454	474	454
query63	164	151	145	145
query64	5140	1744	1648	1648
query65	3346	3297	3286	3286
query66	1363	337	326	326
query67	15967	15453	15609	15453
query68	10835	540	552	540
query69	447	256	245	245
query70	1699	1600	1523	1523
query71	443	228	223	223
query72	5709	3599	3496	3496
query73	2257	316	310	310
query74	6910	6421	6400	6400
query75	4988	2315	2255	2255
query76	5702	1064	1157	1064
query77	644	276	260	260
query78	9185	8777	8671	8671
query79	1042	502	498	498
query80	557	378	369	369
query81	452	208	209	208
query82	194	104	106	104
query83	170	138	137	137
query84	241	55	52	52
query85	970	272	266	266
query86	423	385	386	385
query87	3631	3360	3330	3330
query88	3106	2263	2267	2263
query89	321	257	248	248
query90	1957	213	192	192
query91	127	88	94	88
query92	60	50	52	50
query93	2300	497	441	441
query94	849	188	190	188
query95	466	421	401	401
query96	636	317	313	313
query97	4266	4147	4147	4147
query98	212	195	190	190
query99	1092	861	839	839
Total cold run time: 278224 ms
Total hot run time: 179035 ms

@doris-robot

(From new machine) TeamCity pipeline, ClickBench performance test result:
the sum of best hot time: 47.36 seconds
stream load tsv: 563 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.4 seconds inserted 10000000 Rows, about 352K ops/s
storage size: 17187670564 Bytes
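
The stream load rates above can be rechecked from the byte counts and durations. The reported figures are consistent with "MB" meaning 2^20 bytes and with the result being truncated rather than rounded. That interpretation is inferred from the numbers, not documented in this thread, and the function name below is made up for the example.

```python
MIB = 2 ** 20  # assumption: the report's "MB" is 1,048,576 bytes

# (format, seconds, bytes loaded) copied from the report above.
LOADS = [
    ("tsv", 563, 74807831229),
    ("json", 19, 2358488459),
    ("orc", 66, 1101869774),
    ("parquet", 32, 861443392),
]


def throughput_mib_per_s(seconds: int, loaded_bytes: int) -> int:
    """Bytes per second converted to MiB/s, truncated toward zero."""
    return loaded_bytes // seconds // MIB


if __name__ == "__main__":
    for fmt, seconds, loaded in LOADS:
        print(fmt, throughput_mib_per_s(seconds, loaded), "MB/s")
```

Under these assumptions the four rates come out as 126, 118, 15 and 25, the same values the bot reports.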

Contributor

@morningman morningman left a comment

LGTM

@github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Jan 3, 2024
@github-actions
Contributor

github-actions bot commented Jan 3, 2024

PR approved by at least one committer and no changes requested.

@dataroaring merged commit d6cb2d6 into apache:master on Jan 3, 2024
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024

Labels

approved (Indicates a PR has been approved by one committer.), dev/3.0.0-merged, kind/behavior-changed, reviewed

5 participants