[improve](load) limit flush thread num proportional to CPU count #33325

Merged on Apr 27, 2024 (1 commit)

@kaijchen kaijchen commented Apr 7, 2024

Proposed changes

Add BE configs max_flush_thread_num_per_cpu and max_high_priority_flush_thread_num_per_cpu.
The default value is 8.

The maximum number of threads in the normal and high-priority flush thread pools, respectively, will be:

number of threads = min(flush_thread_num_per_store * num_store,
                        max_flush_thread_num_per_cpu * num_cpu)
number of threads = min(high_priority_flush_thread_num_per_store * num_store,
                        max_high_priority_flush_thread_num_per_cpu * num_cpu)

This prevents the flush thread pool from growing too large when the machine has many disks.
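
The formula above can be sketched as a small standalone function. This is only an illustration, not the code from this PR: the name capped_flush_threads and the clamping of non-positive inputs to 1 are assumptions made for the example. It needs C++14 or later.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper, not the actual Doris implementation: returns
// min(threads_per_store * num_store, max_threads_per_cpu * num_cpu).
// Non-positive inputs are clamped to 1 (an assumption of this sketch)
// so the result is always at least one thread.
constexpr std::size_t capped_flush_threads(int threads_per_store, int num_store,
                                           int max_threads_per_cpu, int num_cpu) {
    const std::size_t by_store =
            static_cast<std::size_t>(std::max(1, threads_per_store)) *
            static_cast<std::size_t>(std::max(1, num_store));
    const std::size_t by_cpu =
            static_cast<std::size_t>(std::max(1, max_threads_per_cpu)) *
            static_cast<std::size_t>(std::max(1, num_cpu));
    return std::min(by_store, by_cpu);
}

// 12 stores on a 4-core machine: the CPU-based cap (8 * 4) wins over 6 * 12.
static_assert(capped_flush_threads(6, 12, 8, 4) == 32, "CPU cap applies");
// 2 stores on a 32-core machine: the store-based count (6 * 2) stays in effect.
static_assert(capped_flush_threads(6, 2, 8, 32) == 12, "store count applies");
```

With 6 threads per store and a per-CPU limit of 8, the cap only takes effect when the number of stores is large relative to the number of cores, which is the many-disk case described above.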

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.


github-actions bot commented Apr 7, 2024

clang-tidy review says "All clean, LGTM! 👍"

kaijchen commented Apr 7, 2024

run buildall

@kaijchen kaijchen changed the title [config](load) add flush thread num total limit in BE config [improve](load) limit flush thread num by CPU count Apr 9, 2024
@kaijchen kaijchen changed the title [improve](load) limit flush thread num by CPU count [improve](load) limit flush thread num proportional to CPU count Apr 9, 2024

@doris-robot

TPC-H: Total hot run time: 38504 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2028ebf3cd6d324259fbab9d219705823ce64d7a, data reload: false

------ Round 1 ----------------------------------
q1	17593	4493	4308	4308
q2	2011	206	193	193
q3	10406	1203	1139	1139
q4	10193	748	723	723
q5	7493	2707	2654	2654
q6	219	131	133	131
q7	1014	601	584	584
q8	9224	2061	2050	2050
q9	7583	6585	6528	6528
q10	8578	3521	3507	3507
q11	456	243	230	230
q12	541	219	213	213
q13	17757	2919	2975	2919
q14	281	226	234	226
q15	521	478	485	478
q16	516	378	377	377
q17	966	712	654	654
q18	7483	6789	6680	6680
q19	6777	1524	1462	1462
q20	671	318	301	301
q21	3447	2837	2840	2837
q22	368	316	310	310
Total cold run time: 114098 ms
Total hot run time: 38504 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4306	4164	4161	4161
q2	372	278	256	256
q3	2967	2745	2800	2745
q4	1906	1530	1570	1530
q5	5359	5368	5346	5346
q6	212	123	124	123
q7	2255	1903	1877	1877
q8	3233	3358	3308	3308
q9	8651	8617	8732	8617
q10	4138	3935	4086	3935
q11	609	502	493	493
q12	798	629	627	627
q13	16198	3317	3169	3169
q14	303	308	284	284
q15	533	485	484	484
q16	499	433	463	433
q17	1844	1511	1524	1511
q18	8336	7962	7764	7764
q19	1659	1557	1548	1548
q20	2043	1876	1830	1830
q21	5144	4947	4940	4940
q22	574	472	484	472
Total cold run time: 71939 ms
Total hot run time: 55453 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.45% (8907/25127)
Line Coverage: 27.14% (73148/269551)
Region Coverage: 26.27% (37813/143923)
Branch Coverage: 23.07% (19268/83510)
Coverage Report: http://coverage.selectdb-in.cc/coverage/2028ebf3cd6d324259fbab9d219705823ce64d7a_2028ebf3cd6d324259fbab9d219705823ce64d7a/report/index.html

@doris-robot

TPC-DS: Total hot run time: 185557 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2028ebf3cd6d324259fbab9d219705823ce64d7a, data reload: false

query1	899	380	376	376
query2	6228	2629	2303	2303
query3	6655	201	194	194
query4	23893	21274	21263	21263
query5	4111	423	412	412
query6	274	174	171	171
query7	4572	290	284	284
query8	221	174	183	174
query9	8493	2424	2387	2387
query10	431	238	253	238
query11	14697	14115	14163	14115
query12	134	87	84	84
query13	1630	360	366	360
query14	9716	7036	7866	7036
query15	307	189	185	185
query16	8198	263	270	263
query17	1915	570	555	555
query18	2118	280	272	272
query19	303	155	156	155
query20	95	86	91	86
query21	205	129	123	123
query22	5111	4892	4861	4861
query23	33918	33210	33375	33210
query24	11203	3029	3060	3029
query25	642	421	406	406
query26	1193	164	173	164
query27	2540	372	397	372
query28	7299	2142	2171	2142
query29	907	648	625	625
query30	281	200	182	182
query31	986	760	771	760
query32	95	54	59	54
query33	755	266	280	266
query34	1134	504	506	504
query35	843	692	717	692
query36	1068	950	949	949
query37	137	69	71	69
query38	3541	3364	3403	3364
query39	1648	1574	1584	1574
query40	220	126	122	122
query41	45	44	43	43
query42	108	96	96	96
query43	586	540	548	540
query44	1248	737	741	737
query45	289	283	272	272
query46	1112	748	717	717
query47	2050	1960	1950	1950
query48	374	312	319	312
query49	937	391	385	385
query50	783	410	401	401
query51	6888	6811	6725	6725
query52	103	90	88	88
query53	342	276	278	276
query54	292	230	226	226
query55	75	71	68	68
query56	239	219	228	219
query57	1219	1160	1144	1144
query58	220	201	193	193
query59	3321	3166	3121	3121
query60	258	233	243	233
query61	88	88	88	88
query62	608	444	436	436
query63	301	275	280	275
query64	5032	4018	3832	3832
query65	3045	3000	2999	2999
query66	748	335	328	328
query67	15414	15396	15301	15301
query68	6762	548	550	548
query69	531	301	298	298
query70	1275	1204	1182	1182
query71	1484	1261	1263	1261
query72	6533	2628	2482	2482
query73	724	322	322	322
query74	6906	6374	6464	6374
query75	3920	2664	2621	2621
query76	4989	991	990	990
query77	603	260	267	260
query78	10918	10147	10172	10147
query79	9992	522	521	521
query80	1801	430	480	430
query81	525	237	243	237
query82	860	100	91	91
query83	201	163	165	163
query84	263	86	88	86
query85	1318	285	304	285
query86	451	300	298	298
query87	3474	3281	3249	3249
query88	5255	2403	2398	2398
query89	528	365	366	365
query90	2011	183	181	181
query91	123	95	96	95
query92	56	47	45	45
query93	7424	515	503	503
query94	1049	179	179	179
query95	389	294	300	294
query96	620	270	269	269
query97	3132	2942	2925	2925
query98	232	215	211	211
query99	1229	861	865	861
Total cold run time: 299919 ms
Total hot run time: 185557 ms

@doris-robot

ClickBench: Total hot run time: 29.8 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2028ebf3cd6d324259fbab9d219705823ce64d7a, data reload: false

query1	0.04	0.04	0.03
query2	0.08	0.04	0.03
query3	0.23	0.05	0.05
query4	1.68	0.07	0.08
query5	0.50	0.50	0.51
query6	1.48	0.73	0.72
query7	0.02	0.02	0.01
query8	0.05	0.05	0.04
query9	0.54	0.50	0.50
query10	0.54	0.56	0.55
query11	0.16	0.11	0.12
query12	0.13	0.12	0.13
query13	0.61	0.58	0.58
query14	0.75	0.77	0.76
query15	0.82	0.82	0.81
query16	0.35	0.38	0.34
query17	1.02	0.99	0.97
query18	0.20	0.24	0.23
query19	1.77	1.65	1.71
query20	0.02	0.01	0.01
query21	15.44	0.64	0.63
query22	4.29	7.47	1.52
query23	18.28	1.33	1.16
query24	1.74	0.27	0.22
query25	0.14	0.08	0.09
query26	0.27	0.17	0.16
query27	0.08	0.08	0.07
query28	13.28	1.00	0.98
query29	12.59	3.31	3.26
query30	0.26	0.07	0.06
query31	2.84	0.37	0.38
query32	3.30	0.47	0.47
query33	2.80	2.87	2.79
query34	17.09	4.42	4.39
query35	4.58	4.52	4.54
query36	0.63	0.46	0.46
query37	0.18	0.15	0.15
query38	0.15	0.15	0.14
query39	0.05	0.03	0.03
query40	0.17	0.14	0.14
query41	0.09	0.05	0.04
query42	0.06	0.04	0.05
query43	0.04	0.04	0.04
Total cold run time: 109.34 s
Total hot run time: 29.8 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 2028ebf3cd6d324259fbab9d219705823ce64d7a with default session variables
Stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       14.1 seconds inserted 10000000 Rows, about 709K ops/s

@@ -203,15 +203,18 @@ void FlushToken::_flush_memtable(std::unique_ptr<MemTable> memtable_ptr, int32_t

void MemTableFlushExecutor::init(int num_disk) {
    num_disk = std::max(1, num_disk);
    size_t min_threads = std::max(1, config::flush_thread_num_per_store);
    size_t max_threads = num_disk * min_threads;
    int num_cpus = static_cast<int>(std::thread::hardware_concurrency());

If the value is not well defined or not computable, hardware_concurrency may return 0. We should handle this case.
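
One way to handle the zero case raised in this comment is sketched below. It is not the fix that was merged: the helper names cpu_count_or and detect_num_cpus and the fallback of 1 are assumptions made for the example.

```cpp
#include <thread>

// Hypothetical helper for illustration: std::thread::hardware_concurrency()
// is only a hint and returns 0 when the value cannot be determined, so map
// that case to a caller-chosen fallback.
constexpr unsigned cpu_count_or(unsigned reported, unsigned fallback) {
    return reported > 0 ? reported : fallback;
}

// Returns a CPU count that is never 0. The fallback of 1 is an assumption of
// this sketch, not a value taken from the PR.
inline unsigned detect_num_cpus() {
    return cpu_count_or(std::thread::hardware_concurrency(), 1);
}

static_assert(cpu_count_or(0, 1) == 1, "an unknown count falls back");
static_assert(cpu_count_or(32, 1) == 32, "a valid count is kept");
```

Without such a guard, a reported count of 0 would make the per-CPU limit evaluate to 0 and the min() in the formula would yield an empty pool.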

@@ -662,6 +662,12 @@ DEFINE_mInt64(storage_flood_stage_left_capacity_bytes, "1073741824"); // 1GB
DEFINE_Int32(flush_thread_num_per_store, "6");
// number of thread for flushing memtable per store, for high priority load task
DEFINE_Int32(high_priority_flush_thread_num_per_store, "6");
// number of threads = min(flush_thread_num_per_store * num_store,
// max_flush_thread_num_per_cpu * num_cpu)
DEFINE_Int32(max_flush_thread_num_per_cpu, "8");

The default value of 8 is a bit large, and 2 or 4 may be more suitable.

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.46% (8921/25161)
Line Coverage: 27.16% (73323/270004)
Region Coverage: 26.27% (37860/144113)
Branch Coverage: 23.07% (19290/83612)
Coverage Report: http://coverage.selectdb-in.cc/coverage/f7f3f2cd9e15ae8e804f7d3d5ddc008492634bba_f7f3f2cd9e15ae8e804f7d3d5ddc008492634bba/report/index.html

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.20% (8918/25337)
Line Coverage: 26.97% (73323/271867)
Region Coverage: 26.15% (37881/144875)
Branch Coverage: 22.97% (19294/83990)
Coverage Report: http://coverage.selectdb-in.cc/coverage/a74a8bbfa69e8b0abde7ebf33879dc9e5badc1dc_a74a8bbfa69e8b0abde7ebf33879dc9e5badc1dc/report/index.html

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.19% (8918/25342)
Line Coverage: 26.97% (73335/271879)
Region Coverage: 26.15% (37886/144892)
Branch Coverage: 22.97% (19293/83996)
Coverage Report: http://coverage.selectdb-in.cc/coverage/66c0039e6be44f77479ed2ee0002e309fbe87f73_66c0039e6be44f77479ed2ee0002e309fbe87f73/report/index.html

@liaoxin01 liaoxin01 left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 26, 2024

PR approved by at least one committer and no changes requested.

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 41718 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit cb7bba63f12ae2055ec01a19ac601b9215a44db7, data reload: false

------ Round 1 ----------------------------------
q1	17590	4331	4258	4258
q2	2012	189	190	189
q3	10486	1235	1195	1195
q4	10196	849	869	849
q5	7520	2778	2737	2737
q6	219	132	132	132
q7	1078	623	632	623
q8	9215	2182	2104	2104
q9	9262	6854	6854	6854
q10	9190	3886	3879	3879
q11	444	241	247	241
q12	463	236	229	229
q13	17253	3166	3193	3166
q14	287	228	233	228
q15	510	472	461	461
q16	501	391	395	391
q17	1000	681	740	681
q18	8508	7799	7806	7799
q19	5448	1522	1525	1522
q20	638	329	326	326
q21	5228	3584	4174	3584
q22	341	276	270	270
Total cold run time: 117389 ms
Total hot run time: 41718 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4575	4417	4436	4417
q2	359	270	267	267
q3	3167	2979	2924	2924
q4	1896	1597	1544	1544
q5	5519	5545	5547	5545
q6	219	121	121	121
q7	2379	1974	1983	1974
q8	3245	3430	3469	3430
q9	8788	8914	8883	8883
q10	4010	3769	3868	3769
q11	607	492	482	482
q12	810	629	603	603
q13	15923	3173	3182	3173
q14	305	302	284	284
q15	536	487	502	487
q16	491	438	431	431
q17	1814	1477	1457	1457
q18	7754	7548	7611	7548
q19	1678	1567	1556	1556
q20	1967	1762	1802	1762
q21	10262	4866	4880	4866
q22	567	479	492	479
Total cold run time: 76871 ms
Total hot run time: 56002 ms

@doris-robot

TPC-DS: Total hot run time: 185330 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit cb7bba63f12ae2055ec01a19ac601b9215a44db7, data reload: false

query1	901	364	344	344
query2	6442	2322	2389	2322
query3	6632	213	208	208
query4	23496	21160	21285	21160
query5	4109	405	407	405
query6	258	190	181	181
query7	4588	284	289	284
query8	239	182	190	182
query9	8563	2354	2310	2310
query10	421	241	260	241
query11	14791	14283	14298	14283
query12	140	90	86	86
query13	1634	368	371	368
query14	10722	7876	7484	7484
query15	251	171	177	171
query16	8101	262	272	262
query17	1736	591	572	572
query18	2105	282	269	269
query19	207	149	155	149
query20	93	86	84	84
query21	199	132	126	126
query22	4990	4875	4802	4802
query23	33602	32898	33316	32898
query24	6553	2960	2976	2960
query25	572	393	393	393
query26	695	151	148	148
query27	1909	320	332	320
query28	3545	2020	2013	2013
query29	888	594	590	590
query30	251	151	151	151
query31	941	725	733	725
query32	91	51	51	51
query33	478	239	243	239
query34	869	469	474	469
query35	765	681	670	670
query36	1059	917	929	917
query37	102	64	68	64
query38	3146	3073	2979	2979
query39	1580	1542	1525	1525
query40	207	124	125	124
query41	42	39	37	37
query42	101	94	97	94
query43	567	538	535	535
query44	1060	726	751	726
query45	281	267	226	226
query46	1060	712	702	702
query47	1913	1852	1857	1852
query48	364	289	292	289
query49	760	395	396	395
query50	778	384	389	384
query51	6829	6681	6703	6681
query52	99	89	90	89
query53	341	285	281	281
query54	253	231	233	231
query55	76	72	72	72
query56	235	217	217	217
query57	1216	1124	1137	1124
query58	209	196	191	191
query59	3439	3145	3291	3145
query60	255	233	228	228
query61	88	88	109	88
query62	584	449	433	433
query63	302	269	275	269
query64	8160	7247	7087	7087
query65	3064	2997	3058	2997
query66	787	335	329	329
query67	15843	15013	14894	14894
query68	8423	535	541	535
query69	542	309	302	302
query70	1252	1143	1086	1086
query71	469	268	264	264
query72	7540	2588	2403	2403
query73	729	320	325	320
query74	6464	6088	6009	6009
query75	4356	2690	2651	2651
query76	4720	996	1013	996
query77	694	263	262	262
query78	11014	10266	10088	10088
query79	8267	517	520	517
query80	1147	430	426	426
query81	492	218	225	218
query82	777	95	93	93
query83	207	164	161	161
query84	269	80	83	80
query85	939	265	257	257
query86	373	294	304	294
query87	3252	3084	3045	3045
query88	4927	2323	2317	2317
query89	535	386	361	361
query90	2010	182	180	180
query91	125	99	97	97
query92	63	46	50	46
query93	6068	515	489	489
query94	965	184	179	179
query95	386	296	297	296
query96	603	260	262	260
query97	3129	2922	2941	2922
query98	227	217	217	217
query99	1272	864	834	834
Total cold run time: 288813 ms
Total hot run time: 185330 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.18% (8919/25351)
Line Coverage: 26.97% (73347/271982)
Region Coverage: 26.14% (37889/144947)
Branch Coverage: 22.95% (19286/84018)
Coverage Report: http://coverage.selectdb-in.cc/coverage/cb7bba63f12ae2055ec01a19ac601b9215a44db7_cb7bba63f12ae2055ec01a19ac601b9215a44db7/report/index.html

Collaborator

@gavinchou gavinchou left a comment

Nice improvement

@liaoxin01 liaoxin01 merged commit a9040c8 into apache:master Apr 27, 2024
24 of 28 checks passed
Labels
approved Indicates a PR has been approved by one committer. dev/2.0.x reviewed
5 participants