[fix](injection) don't disturb CLOSE_LOAD message in LoadStream #30097

Merged
dataroaring merged 1 commit into apache:master from kaijchen:fix-load-stream-injection
Jan 19, 2024
@kaijchen (Member) commented on Jan 18, 2024

Proposed changes

Although it is OK to disturb the CLOSE_LOAD message in LoadStream,
doing so makes these cases run very slowly (until the close wait timeout).

This PR removes fault injection for the CLOSE_LOAD message.
A missing CLOSE_LOAD will only be tested in the cases for close_wait timeout.
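
For illustration only, the guard described above can be sketched outside of Doris as follows. This is a minimal sketch under stated assumptions, not the code from this PR: `Opcode`, `Header`, `DebugPoints`, `kUnknownIdForTest` and `maybe_inject_unknown_load_id` are hypothetical stand-ins for `PStreamHeader`, the debug point mechanism behind `DBUG_EXECUTE_IF`, and `UNKNOWN_ID_FOR_TEST`. `APPEND_DATA` is used only as an example of a non-close opcode.

```cpp
#include <cstdint>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the stream header opcodes.
enum class Opcode { APPEND_DATA, CLOSE_LOAD };

// Hypothetical, simplified header: only the fields the sketch needs.
struct Header {
    Opcode opcode;
    int64_t load_id_hi;
    int64_t load_id_lo;
};

// Assumed sentinel value; the real constant is UNKNOWN_ID_FOR_TEST.
constexpr int64_t kUnknownIdForTest = -1;

// Hypothetical registry of enabled debug points (the real mechanism is
// the DBUG_EXECUTE_IF macro, which is not reproduced here).
class DebugPoints {
public:
    void enable(const std::string& name) { enabled_.insert(name); }
    bool is_enabled(const std::string& name) const { return enabled_.count(name) > 0; }

private:
    std::unordered_set<std::string> enabled_;
};

// Overwrites the load id with an unknown value when the debug point is
// enabled, except for CLOSE_LOAD, which is always left untouched so the
// receiver can still close the load instead of waiting for the timeout.
// Returns true if the header was modified.
bool maybe_inject_unknown_load_id(const DebugPoints& points, Header& hdr) {
    if (!points.is_enabled("LoadStream._dispatch.unknown_loadid")) {
        return false;
    }
    if (hdr.opcode == Opcode::CLOSE_LOAD) {
        return false;
    }
    hdr.load_id_hi = kUnknownIdForTest;
    hdr.load_id_lo = kUnknownIdForTest;
    return true;
}
```

With the debug point enabled, a header carrying `APPEND_DATA` gets its load id replaced, while a header carrying `CLOSE_LOAD` passes through unchanged.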

@kaijchen (Member, Author) commented

run buildall

@dataroaring (Contributor) left a comment

LGTM

@github-actions bot added the approved label on Jan 18, 2024
@github-actions (Contributor) commented

PR approved by at least one committer and no changes requested.

@github-actions (Contributor) commented

PR approved by anyone and no changes requested.

@github-actions (Contributor) commented

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 38605 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5f1fdaae0f6cfeeb05f0e5e960abdd6b9fe9ff85, data reload: false

------ Round 1 ----------------------------------
q1	17635	5231	5299	5231
q2	2041	147	131	131
q3	10632	1159	1128	1128
q4	10224	770	844	770
q5	7753	3111	3140	3111
q6	198	121	120	120
q7	867	485	480	480
q8	9212	1884	1929	1884
q9	7295	6385	6317	6317
q10	8189	3026	3060	3026
q11	407	212	197	197
q12	354	192	189	189
q13	17989	3350	3340	3340
q14	249	215	211	211
q15	555	519	498	498
q16	479	384	376	376
q17	931	529	538	529
q18	7418	6993	6715	6715
q19	1573	1341	1372	1341
q20	601	299	316	299
q21	2740	2433	2407	2407
q22	355	305	321	305
Total cold run time: 107697 ms
Total hot run time: 38605 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5598	5498	5438	5438
q2	325	219	213	213
q3	3342	3210	3209	3209
q4	2084	2091	2033	2033
q5	5930	5884	5744	5744
q6	196	118	112	112
q7	2328	1803	1818	1803
q8	3199	3366	3379	3366
q9	8920	8920	8858	8858
q10	3898	3803	3791	3791
q11	543	454	447	447
q12	784	610	605	605
q13	16912	3178	3119	3119
q14	292	243	264	243
q15	546	504	512	504
q16	508	467	473	467
q17	1861	1825	1843	1825
q18	9479	9728	9567	9567
q19	21284	1552	1486	1486
q20	4635	1926	1930	1926
q21	14186	5480	5351	5351
q22	947	543	558	543
Total cold run time: 107797 ms
Total hot run time: 60650 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.69% (8658/23598)
Line Coverage: 28.75% (70697/245884)
Region Coverage: 27.64% (36469/131946)
Branch Coverage: 24.35% (18658/76616)
Coverage Report: http://coverage.selectdb-in.cc/coverage/5f1fdaae0f6cfeeb05f0e5e960abdd6b9fe9ff85_5f1fdaae0f6cfeeb05f0e5e960abdd6b9fe9ff85/report/index.html

@doris-robot

TPC-DS: Total hot run time: 176385 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5f1fdaae0f6cfeeb05f0e5e960abdd6b9fe9ff85, data reload: false

query1	922	334	325	325
query2	6573	1972	1950	1950
query3	6696	201	202	201
query4	34101	22078	22265	22078
query5	6895	541	606	541
query6	259	181	167	167
query7	4600	266	259	259
query8	218	170	173	170
query9	8478	2539	2542	2539
query10	419	222	224	222
query11	17003	15469	15478	15469
query12	123	66	71	66
query13	1728	372	378	372
query14	10442	6814	6939	6814
query15	212	184	181	181
query16	4598	233	225	225
query17	995	478	467	467
query18	1789	256	253	253
query19	176	138	130	130
query20	73	71	64	64
query21	188	132	129	129
query22	4808	4751	4815	4751
query23	31500	30810	30768	30768
query24	11488	2850	2768	2768
query25	567	318	303	303
query26	1622	141	147	141
query27	3211	267	277	267
query28	7242	1826	1819	1819
query29	1474	611	629	611
query30	278	137	138	137
query31	929	746	769	746
query32	74	50	49	49
query33	704	213	205	205
query34	1133	455	460	455
query35	877	778	762	762
query36	1340	1226	1202	1202
query37	94	63	59	59
query38	3377	3250	3235	3235
query39	1313	1277	1268	1268
query40	210	84	80	80
query41	38	38	34	34
query42	92	77	85	77
query43	531	480	497	480
query44	1047	681	693	681
query45	199	178	170	170
query46	1055	635	658	635
query47	1677	1532	1513	1513
query48	391	323	313	313
query49	1126	281	284	281
query50	688	309	302	302
query51	5315	5201	5186	5186
query52	94	79	75	75
query53	318	250	252	250
query54	879	451	440	440
query55	78	76	77	76
query56	171	173	172	172
query57	988	968	957	957
query58	192	163	162	162
query59	3033	2723	2648	2648
query60	216	182	197	182
query61	90	82	86	82
query62	623	390	360	360
query63	274	255	251	251
query64	5061	1772	1769	1769
query65	3340	3227	3242	3227
query66	1282	323	306	306
query67	15578	15063	15077	15063
query68	14141	520	513	513
query69	626	306	294	294
query70	2227	1503	1418	1418
query71	550	214	203	203
query72	5004	2788	2828	2788
query73	3738	318	321	318
query74	7299	6387	6468	6387
query75	5278	2252	2322	2252
query76	6231	1068	1029	1029
query77	738	230	230	230
query78	9428	8574	8847	8574
query79	2458	493	523	493
query80	594	312	307	307
query81	469	202	208	202
query82	204	77	81	77
query83	164	120	121	120
query84	277	74	66	66
query85	1072	320	318	318
query86	383	390	386	386
query87	3569	3372	3306	3306
query88	2909	2156	2145	2145
query89	420	354	350	350
query90	2181	182	182	182
query91	154	130	132	130
query92	57	42	45	42
query93	1333	435	417	417
query94	1284	157	155	155
query95	499	442	456	442
query96	624	311	320	311
query97	4287	4140	4165	4140
query98	203	191	183	183
query99	988	734	667	667
Total cold run time: 302494 ms
Total hot run time: 176385 ms

@doris-robot

ClickBench: Total hot run time: 31.33 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5f1fdaae0f6cfeeb05f0e5e960abdd6b9fe9ff85, data reload: false

query1	0.03	0.03	0.02
query2	0.05	0.02	0.03
query3	0.22	0.05	0.06
query4	1.70	0.08	0.08
query5	0.53	0.52	0.52
query6	1.28	0.62	0.63
query7	0.02	0.01	0.01
query8	0.04	0.02	0.02
query9	0.54	0.51	0.48
query10	0.57	0.56	0.56
query11	0.11	0.08	0.08
query12	0.11	0.10	0.09
query13	0.59	0.60	0.60
query14	0.75	0.82	0.77
query15	0.78	0.78	0.78
query16	0.37	0.36	0.38
query17	1.00	0.99	0.97
query18	0.23	0.26	0.24
query19	1.85	1.76	1.78
query20	0.01	0.01	0.01
query21	15.43	0.59	0.60
query22	2.37	2.32	2.37
query23	17.52	0.73	0.84
query24	2.35	1.22	1.11
query25	0.37	0.34	0.14
query26	0.49	0.13	0.14
query27	0.06	0.06	0.05
query28	11.33	0.81	0.76
query29	12.50	3.19	3.23
query30	0.54	0.48	0.44
query31	2.77	0.34	0.34
query32	3.39	0.48	0.48
query33	3.24	3.24	3.23
query34	15.82	4.34	4.24
query35	4.30	4.24	4.23
query36	1.12	1.07	1.07
query37	0.06	0.04	0.04
query38	0.03	0.02	0.03
query39	0.02	0.02	0.02
query40	0.18	0.12	0.12
query41	0.07	0.01	0.02
query42	0.02	0.01	0.02
query43	0.02	0.02	0.02
Total cold run time: 104.78 s
Total hot run time: 31.33 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 5f1fdaae0f6cfeeb05f0e5e960abdd6b9fe9ff85 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       14.8 seconds inserted 10000000 Rows, about 675K ops/s

});
// CLOSE_LOAD message should not be fault injected,
// otherwise the message will be ignored and causing close wait timeout
if (hdr.opcode() != PStreamHeader::CLOSE_LOAD) {

        DBUG_EXECUTE_IF("LoadStream._dispatch.unknown_loadid", {
            if (hdr.opcode() != PStreamHeader::CLOSE_LOAD) {
                PUniqueId& load_id = const_cast<PUniqueId&>(hdr.load_id());
                load_id.set_hi(UNKNOWN_ID_FOR_TEST);
                load_id.set_lo(UNKNOWN_ID_FOR_TEST);
            }
        });

Placing the if condition into the debug point may be better.

Labels

approved, dev/2.1.0, dev/3.0.0-merged, reviewed

5 participants