[Fix](Job) Concurrency may result in event loss #29385

Merged: JNSimba merged 1 commit into apache:master from CalvinKirs:master-disruptor on Jan 2, 2024

@CalvinKirs (Member) commented Jan 2, 2024

Proposed changes

Our internal scheduling was originally single-threaded, with no concurrency, so we adopted single-producer mode. We later added an external trigger method, which meant events could occasionally be published concurrently, resulting in message loss. This change switches to multi-producer mode, in which the underlying layer uses CAS to control concurrency.
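
As a rough illustration of why a CAS-controlled claim avoids this kind of loss, here is a minimal, self-contained sketch. It is not the scheduler code from this PR and does not use the actual ring-buffer library (the branch name suggests LMAX Disruptor, whose `ProducerType.SINGLE` and `ProducerType.MULTI` select the two modes). The names `CasClaimSketch`, `MultiProducerBuffer`, and `publishConcurrently` are hypothetical, and `AtomicLong.getAndIncrement` stands in for the CAS-based sequence claim.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch: several producers claim slots through an atomic counter.
class CasClaimSketch {

    /** Buffer whose write position is claimed atomically (illustrative only). */
    static final class MultiProducerBuffer {
        private final AtomicReferenceArray<String> slots;
        private final AtomicLong cursor = new AtomicLong(0);

        MultiProducerBuffer(int capacity) {
            slots = new AtomicReferenceArray<>(capacity);
        }

        void publish(String event) {
            // Atomic claim: no two producers can receive the same sequence,
            // so one producer never overwrites another producer's slot.
            long sequence = cursor.getAndIncrement();
            slots.set((int) sequence, event);
        }

        int published() {
            int count = 0;
            for (int i = 0; i < slots.length(); i++) {
                if (slots.get(i) != null) {
                    count++;
                }
            }
            return count;
        }
    }

    /** Starts the given number of producer threads and returns how many events were stored. */
    static int publishConcurrently(int producers, int eventsPerProducer) {
        MultiProducerBuffer buffer = new MultiProducerBuffer(producers * eventsPerProducer);
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            final int id = p;
            threads[p] = new Thread(() -> {
                try {
                    start.await(); // release all producers together to provoke contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < eventsPerProducer; i++) {
                    buffer.publish("job-" + id + "-" + i);
                }
            });
            threads[p].start();
        }
        start.countDown();
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for producers", e);
        }
        return buffer.published();
    }

    public static void main(String[] args) {
        // 4 producers x 10,000 events each; every event lands in its own slot.
        System.out.println(publishConcurrently(4, 10_000));
    }
}
```

If the claim were instead a plain, non-atomic `long` incremented by several threads (the situation a single-producer sequencer ends up in once a second publisher appears), two threads could read the same sequence and write to the same slot, so one event would silently replace the other.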

@CalvinKirs (Member, Author) commented

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit d96e861aeee41254e10715bf58f891f400cee351, data reload: false

------ Round 1 ----------------------------------
q1	18464	5439	5185	5185
q2	2040	159	147	147
q3	10602	1117	1192	1117
q4	10221	801	853	801
q5	7823	2948	2886	2886
q6	215	140	137	137
q7	940	551	499	499
q8	9308	2042	2054	2042
q9	6894	6454	6443	6443
q10	8240	3100	3093	3093
q11	435	223	218	218
q12	395	241	234	234
q13	18000	3663	3684	3663
q14	252	219	213	213
q15	569	534	538	534
q16	458	432	423	423
q17	1000	488	504	488
q18	7405	6758	6772	6758
q19	1587	1293	1386	1293
q20	735	352	351	351
q21	2864	2301	2479	2301
q22	382	314	324	314
Total cold run time: 108829 ms
Total hot run time: 39140 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5181	5051	5120	5051
q2	339	252	250	250
q3	3329	3279	3306	3279
q4	2144	2033	2034	2033
q5	5822	5837	5818	5818
q6	218	128	126	126
q7	2316	1920	1926	1920
q8	3415	3474	3463	3463
q9	8905	8772	8815	8772
q10	3812	3874	3839	3839
q11	603	487	468	468
q12	797	624	640	624
q13	7621	3239	3252	3239
q14	310	267	272	267
q15	608	543	527	527
q16	561	487	524	487
q17	1985	1771	1776	1771
q18	8786	8508	8406	8406
q19	1647	1573	1592	1573
q20	2225	2007	1955	1955
q21	5619	5248	5226	5226
q22	540	518	518	518
Total cold run time: 66783 ms
Total hot run time: 59612 ms

@doris-robot

TPC-DS test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpcds-tools

TPC-DS sf100 test result on commit d96e861aeee41254e10715bf58f891f400cee351, data reload: false

run tpcds-sf100 query with default conf and session variables
query1	884	363	350	350
query2	5202	1806	2020	1806
query3	5720	213	200	200
query4	25917	22496	22442	22442
query5	3499	559	576	559
query6	254	183	182	182
query7	4413	277	263	263
query8	227	208	198	198
query9	7779	2593	2631	2593
query10	410	249	244	244
query11	16125	15642	15471	15471
query12	131	83	76	76
query13	1510	325	322	322
query14	11286	7334	7068	7068
query15	253	191	194	191
query16	6414	295	277	277
query17	1668	510	498	498
query18	1868	276	261	261
query19	281	136	138	136
query20	81	78	80	78
query21	169	96	96	96
query22	4804	4536	4852	4536
query23	32272	31275	31211	31211
query24	11947	2795	2809	2795
query25	575	345	350	345
query26	1598	148	144	144
query27	2806	278	286	278
query28	6625	1966	1959	1959
query29	1763	398	383	383
query30	292	145	151	145
query31	968	777	795	777
query32	88	61	58	58
query33	709	276	258	258
query34	884	455	438	438
query35	899	820	784	784
query36	1349	1171	1219	1171
query37	164	79	74	74
query38	3376	3257	3273	3257
query39	1313	1313	1279	1279
query40	270	93	89	89
query41	45	38	35	35
query42	99	95	96	95
query43	530	488	470	470
query44	1073	717	730	717
query45	201	188	189	188
query46	1083	650	638	638
query47	1710	1638	1620	1620
query48	338	257	260	257
query49	1081	326	330	326
query50	793	361	334	334
query51	5428	5340	5286	5286
query52	95	84	92	84
query53	225	149	148	148
query54	1147	577	580	577
query55	96	88	92	88
query56	212	200	203	200
query57	1040	924	912	912
query58	222	209	208	208
query59	2808	2604	2546	2546
query60	250	223	242	223
query61	87	81	83	81
query62	679	457	458	457
query63	174	156	159	156
query64	5511	1781	1699	1699
query65	3350	3292	3273	3273
query66	1367	341	332	332
query67	15927	15824	15298	15298
query68	12783	537	527	527
query69	513	253	260	253
query70	1637	1433	1498	1433
query71	484	230	232	230
query72	5648	3523	3493	3493
query73	2991	317	312	312
query74	7018	6408	6441	6408
query75	5528	2277	2250	2250
query76	6274	1160	1116	1116
query77	668	268	270	268
query78	9111	8887	8623	8623
query79	1029	514	509	509
query80	525	378	376	376
query81	468	208	207	207
query82	196	104	107	104
query83	164	136	137	136
query84	248	55	55	55
query85	934	277	273	273
query86	387	368	373	368
query87	3531	3348	3412	3348
query88	3050	2260	2277	2260
query89	337	262	262	262
query90	1952	218	216	216
query91	119	89	89	89
query92	61	58	53	53
query93	1543	518	495	495
query94	840	192	191	191
query95	474	433	405	405
query96	635	316	318	316
query97	4272	4165	4156	4156
query98	209	199	186	186
query99	1111	790	819	790
Total cold run time: 287104 ms
Total hot run time: 179248 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.61 seconds
stream load tsv: 579 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.6 seconds inserted 10000000 Rows, about 349K ops/s
storage size: 17183560407 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit d96e861aeee41254e10715bf58f891f400cee351, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5484	5149	5179	5149
q2	390	173	158	158
q3	1465	1188	1170	1170
q4	1087	837	832	832
q5	3125	3139	3108	3108
q6	227	138	131	131
q7	989	550	521	521
q8	2160	2191	2273	2191
q9	6680	6688	6660	6660
q10	3199	3122	3147	3122
q11	342	237	220	220
q12	390	248	242	242
q13	4392	3623	3633	3623
q14	247	213	210	210
q15	613	551	544	544
q16	458	389	405	389
q17	1054	605	527	527
q18	7064	6727	6720	6720
q19	1646	1522	1480	1480
q20	572	367	355	355
q21	2900	2482	2454	2454
q22	389	307	318	307
Total cold run time: 44873 ms
Total hot run time: 40113 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	5151	5082	5085	5082
q2	345	250	257	250
q3	3367	3350	3316	3316
q4	2142	2017	1994	1994
q5	5928	5916	5899	5899
q6	223	127	125	125
q7	2369	1909	1946	1909
q8	3567	3647	3676	3647
q9	9011	8989	8998	8989
q10	3857	3903	3894	3894
q11	594	508	482	482
q12	803	648	633	633
q13	3901	3189	3226	3189
q14	297	277	292	277
q15	614	547	544	544
q16	530	537	502	502
q17	2036	1787	1804	1787
q18	8773	8457	8354	8354
q19	1742	1687	1711	1687
q20	2292	2006	1974	1974
q21	5713	5200	5391	5200
q22	557	508	517	508
Total cold run time: 63812 ms
Total hot run time: 60242 ms

@morningman (Contributor) left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 2, 2024
@github-actions bot (Contributor) commented Jan 2, 2024

PR approved by at least one committer and no changes requested.

@github-actions bot (Contributor) commented Jan 2, 2024

PR approved by anyone and no changes requested.

@JNSimba (Member) left a comment

LGTM

@JNSimba JNSimba merged commit eac9600 into apache:master Jan 2, 2024
seawinde pushed a commit to seawinde/doris that referenced this pull request Jan 3, 2024
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024

Labels

approved (Indicates a PR has been approved by one committer.), dev/3.0.0-merged, reviewed

6 participants