[Fix](Variant) add more info before crash in serialization #29344

Merged
xiaokang merged 1 commit into apache:master from eldenmoon:tmp-fix
Dec 31, 2023

@eldenmoon
Member

This is a temporary fix to prevent a crash so that it does not block 2.1, and to add more debug info for tracing the crash.

Proposed changes

Issue Number: close #xxx

@eldenmoon
Member Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit b76691952deeb32a1b31706f20d84d4b88522338, data reload: false

------ Round 1 ----------------------------------
q1	17674	5054	5134	5054
q2	2016	167	151	151
q3	10519	1072	1146	1072
q4	10200	822	811	811
q5	7762	2889	2867	2867
q6	203	131	129	129
q7	915	555	497	497
q8	9255	2003	1996	1996
q9	6773	6351	6382	6351
q10	8222	3031	2972	2972
q11	420	227	213	213
q12	386	232	230	230
q13	18008	3607	3623	3607
q14	246	206	207	206
q15	598	544	535	535
q16	477	396	395	395
q17	968	437	478	437
q18	7272	6620	6667	6620
q19	1572	1381	1427	1381
q20	743	330	352	330
q21	2778	2369	2400	2369
q22	361	324	340	324
Total cold run time: 107368 ms
Total hot run time: 38547 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5082	5383	5042	5042
q2	340	235	255	235
q3	3276	3266	3222	3222
q4	2090	1974	1996	1974
q5	5780	5786	5762	5762
q6	208	123	123	123
q7	2288	1883	1896	1883
q8	3349	3418	3455	3418
q9	8723	8771	8708	8708
q10	3772	3829	3847	3829
q11	586	489	469	469
q12	808	649	655	649
q13	6639	3212	3177	3177
q14	308	276	265	265
q15	609	542	533	533
q16	568	495	511	495
q17	1906	1777	1764	1764
q18	8618	8365	8225	8225
q19	1635	1582	1554	1554
q20	2193	1978	1949	1949
q21	5472	5237	5209	5209
q22	572	531	504	504
Total cold run time: 64822 ms
Total hot run time: 58989 ms
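
The bot output does not label its columns. A minimal sketch, assuming each row reads as query, cold run, two hot runs, and the best hot run, shows that this reading reproduces both Round 1 totals reported above. The column names and the helpers `parse` and `totals` are hypothetical and are not part of the Doris tooling.

```python
# Round 1 rows copied from the bot comment: query name, then four timings in ms.
ROUND_1 = """\
q1 17674 5054 5134 5054
q2 2016 167 151 151
q3 10519 1072 1146 1072
q4 10200 822 811 811
q5 7762 2889 2867 2867
q6 203 131 129 129
q7 915 555 497 497
q8 9255 2003 1996 1996
q9 6773 6351 6382 6351
q10 8222 3031 2972 2972
q11 420 227 213 213
q12 386 232 230 230
q13 18008 3607 3623 3607
q14 246 206 207 206
q15 598 544 535 535
q16 477 396 395 395
q17 968 437 478 437
q18 7272 6620 6667 6620
q19 1572 1381 1427 1381
q20 743 330 352 330
q21 2778 2369 2400 2369
q22 361 324 340 324
"""


def parse(table):
    """Split each row into (query name, list of integer timings)."""
    rows = []
    for line in table.splitlines():
        name, *timings = line.split()
        rows.append((name, [int(t) for t in timings]))
    return rows


def totals(rows):
    """Return (cold total, hot total) under the assumed column meaning:
    [cold, hot run 1, hot run 2, best hot]."""
    cold = sum(t[0] for _, t in rows)
    hot = sum(min(t[1], t[2]) for _, t in rows)
    return cold, hot


rows = parse(ROUND_1)
cold_total, hot_total = totals(rows)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

Under this reading, the last column of every Round 1 row equals the smaller of the two hot runs, so the hot total is the sum of per-query best times rather than the total of a single pass.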

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.63% (8614/23517)
Line Coverage: 28.69% (70033/244125)
Region Coverage: 27.67% (36242/130959)
Branch Coverage: 24.38% (18523/75982)
Coverage Report: http://coverage.selectdb-in.cc/coverage/b76691952deeb32a1b31706f20d84d4b88522338_b76691952deeb32a1b31706f20d84d4b88522338/report/index.html

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit b76691952deeb32a1b31706f20d84d4b88522338, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5460	5151	5114	5114
q2	384	172	150	150
q3	1479	1235	1218	1218
q4	1079	855	843	843
q5	3129	3067	3058	3058
q6	228	129	128	128
q7	994	555	542	542
q8	2163	2261	2219	2219
q9	6722	6685	6636	6636
q10	3150	3125	3062	3062
q11	356	221	221	221
q12	381	240	231	231
q13	4423	3621	3666	3621
q14	261	211	219	211
q15	600	554	550	550
q16	463	398	400	398
q17	1046	592	509	509
q18	7044	6732	6736	6732
q19	1651	1559	1557	1557
q20	570	346	336	336
q21	2919	2446	2468	2446
q22	404	324	338	324
Total cold run time: 44906 ms
Total hot run time: 40106 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	5151	5070	5040	5040
q2	328	249	272	249
q3	3369	3325	3328	3325
q4	2131	2016	2046	2016
q5	5958	5928	5923	5923
q6	226	124	128	124
q7	2377	1908	1932	1908
q8	3543	3663	3703	3663
q9	8995	8956	9008	8956
q10	3878	3946	3903	3903
q11	586	489	484	484
q12	814	697	647	647
q13	3899	3236	3212	3212
q14	300	262	286	262
q15	606	536	553	536
q16	556	518	512	512
q17	2032	1837	1841	1837
q18	8715	8356	8437	8356
q19	1741	1710	1721	1710
q20	2273	2006	1998	1998
q21	5657	5353	5349	5349
q22	569	538	505	505
Total cold run time: 63704 ms
Total hot run time: 60515 ms

@doris-robot

TPC-DS test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpcds-tools

TPC-DS sf100 test result on commit b76691952deeb32a1b31706f20d84d4b88522338, data reload: false

run tpcds-sf100 query with default conf and session variables
query1	927	357	344	344
query2	6405	1923	1953	1923
query3	6646	206	201	201
query4	27864	22497	22462	22462
query5	5293	520	533	520
query6	268	191	190	190
query7	4579	283	285	283
query8	231	207	197	197
query9	8289	2830	2617	2617
query10	454	277	257	257
query11	16324	15523	15619	15523
query12	136	80	78	78
query13	1630	323	328	323
query14	11557	7158	7077	7077
query15	235	196	198	196
query16	6454	277	275	275
query17	1817	491	505	491
query18	1917	279	257	257
query19	272	156	142	142
query20	80	80	78	78
query21	192	96	96	96
query22	5172	4970	4761	4761
query23	32116	31360	31107	31107
query24	12127	2803	2786	2786
query25	585	349	342	342
query26	1702	146	147	146
query27	2926	280	284	280
query28	7052	2042	2026	2026
query29	2006	413	411	411
query30	288	149	145	145
query31	965	772	770	770
query32	87	61	60	60
query33	732	290	276	276
query34	866	450	447	447
query35	883	791	790	790
query36	1290	1223	1196	1196
query37	185	82	77	77
query38	3373	3272	3276	3272
query39	1343	1300	1282	1282
query40	302	93	88	88
query41	36	34	33	33
query42	96	94	95	94
query43	566	486	531	486
query44	1133	726	735	726
query45	198	184	180	180
query46	1058	660	672	660
query47	1691	1628	1606	1606
query48	341	269	271	269
query49	1219	334	325	325
query50	809	374	341	341
query51	5358	5261	5224	5224
query52	94	87	91	87
query53	216	154	151	151
query54	1447	582	592	582
query55	99	100	94	94
query56	213	202	202	202
query57	1046	908	997	908
query58	230	208	216	208
query59	2775	2651	2616	2616
query60	239	235	225	225
query61	85	82	81	81
query62	601	459	476	459
query63	161	152	150	150
query64	5926	1765	1754	1754
query65	3349	3271	3254	3254
query66	1316	340	329	329
query67	15839	15505	15443	15443
query68	12671	524	529	524
query69	503	260	262	260
query70	1701	1542	1505	1505
query71	497	234	237	234
query72	5670	3574	3565	3565
query73	2872	326	317	317
query74	6963	6447	6395	6395
query75	5193	2294	2272	2272
query76	6264	1065	1136	1065
query77	660	267	299	267
query78	9179	8915	8629	8629
query79	1054	523	509	509
query80	614	394	375	375
query81	473	212	223	212
query82	215	105	113	105
query83	167	139	141	139
query84	247	56	56	56
query85	953	281	275	275
query86	409	389	393	389
query87	3529	3412	3360	3360
query88	3382	2282	2283	2282
query89	338	269	263	263
query90	1882	215	209	209
query91	121	95	91	91
query92	65	57	58	57
query93	1782	502	441	441
query94	809	196	194	194
query95	471	423	419	419
query96	628	326	321	321
query97	4307	4116	4189	4116
query98	216	208	197	197
query99	1122	852	845	845
Total cold run time: 296668 ms
Total hot run time: 180167 ms

@doris-robot

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.65 seconds
stream load tsv: 564 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.5 seconds inserted 10000000 Rows, about 350K ops/s
storage size: 17188821234 Bytes

@eldenmoon
Member Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@eldenmoon
Member Author

run buildall

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit 6a733bcb71cdd0dcecae47785faffe969063f810, data reload: false

------ Round 1 ----------------------------------
q1	17647	5038	5078	5038
q2	2036	160	146	146
q3	10545	1078	1134	1078
q4	10173	751	824	751
q5	7811	2949	2977	2949
q6	208	130	131	130
q7	912	544	509	509
q8	9275	2003	2004	2003
q9	6844	6369	6341	6341
q10	8239	3004	2931	2931
q11	411	209	215	209
q12	388	237	238	237
q13	18010	3652	3664	3652
q14	248	213	200	200
q15	603	559	532	532
q16	457	417	402	402
q17	960	519	464	464
q18	7357	6698	6595	6595
q19	1568	1270	1349	1270
q20	694	331	354	331
q21	2858	2313	2393	2313
q22	359	329	323	323
Total cold run time: 107603 ms
Total hot run time: 38404 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5140	5103	5059	5059
q2	338	254	259	254
q3	3332	3269	3320	3269
q4	2153	2071	2034	2034
q5	6111	5990	5790	5790
q6	209	121	121	121
q7	2294	1914	1945	1914
q8	3360	3450	3472	3450
q9	8731	8714	8735	8714
q10	3755	3829	3848	3829
q11	591	481	493	481
q12	798	667	619	619
q13	8082	3236	3216	3216
q14	292	261	263	261
q15	604	531	529	529
q16	553	476	504	476
q17	1928	1762	1736	1736
q18	8649	8322	8243	8243
q19	1603	1544	1568	1544
q20	2181	1988	1981	1981
q21	5618	5286	5334	5286
q22	534	499	509	499
Total cold run time: 66856 ms
Total hot run time: 59305 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.64% (8616/23517)
Line Coverage: 28.69% (70049/244125)
Region Coverage: 27.68% (36255/130961)
Branch Coverage: 24.39% (18530/75984)
Coverage Report: http://coverage.selectdb-in.cc/coverage/6a733bcb71cdd0dcecae47785faffe969063f810_6a733bcb71cdd0dcecae47785faffe969063f810/report/index.html

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools

Tpch sf100 test result on commit 6a733bcb71cdd0dcecae47785faffe969063f810, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5402	5113	5171	5113
q2	390	176	177	176
q3	1445	1211	1176	1176
q4	1086	846	758	758
q5	3133	3107	3044	3044
q6	221	132	129	129
q7	982	520	534	520
q8	2154	2293	2240	2240
q9	6669	6663	6654	6654
q10	3157	3118	3150	3118
q11	338	217	208	208
q12	385	235	230	230
q13	4366	3644	3622	3622
q14	252	230	228	228
q15	593	552	564	552
q16	450	418	402	402
q17	1054	575	541	541
q18	7069	6774	6712	6712
q19	1629	1464	1572	1464
q20	595	364	342	342
q21	2849	2472	2456	2456
q22	401	322	324	322
Total cold run time: 44620 ms
Total hot run time: 40007 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	5126	5056	5044	5044
q2	332	252	242	242
q3	3395	3350	3296	3296
q4	2130	2014	2025	2014
q5	5951	5915	5948	5915
q6	229	126	121	121
q7	2375	1901	1912	1901
q8	3555	3633	3670	3633
q9	8992	8995	8978	8978
q10	3861	3887	3906	3887
q11	600	476	482	476
q12	814	633	665	633
q13	3881	3179	3182	3179
q14	309	278	279	278
q15	609	541	557	541
q16	539	511	523	511
q17	2069	1797	1821	1797
q18	8748	8291	8449	8291
q19	1741	1637	1680	1637
q20	2278	2008	1983	1983
q21	5680	5314	5314	5314
q22	582	536	494	494
Total cold run time: 63796 ms
Total hot run time: 60165 ms

@doris-robot

TPC-DS test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpcds-tools

TPC-DS sf100 test result on commit 6a733bcb71cdd0dcecae47785faffe969063f810, data reload: false

run tpcds-sf100 query with default conf and session variables
query1	929	350	339	339
query2	6440	1961	1931	1931
query3	6657	207	206	206
query4	26396	22494	22268	22268
query5	6033	512	566	512
query6	267	182	185	182
query7	4583	278	279	278
query8	236	225	199	199
query9	8261	2708	2573	2573
query10	442	267	251	251
query11	16211	15721	15393	15393
query12	131	84	77	77
query13	1627	331	317	317
query14	11806	7023	7240	7023
query15	240	183	190	183
query16	6316	273	277	273
query17	1827	502	499	499
query18	1927	273	257	257
query19	271	143	137	137
query20	82	78	80	78
query21	183	96	93	93
query22	5115	4458	4763	4458
query23	31888	31168	31136	31136
query24	11865	2829	2838	2829
query25	579	351	347	347
query26	1708	142	143	142
query27	2866	280	293	280
query28	7076	1983	1973	1973
query29	2073	385	386	385
query30	288	141	147	141
query31	973	767	777	767
query32	85	62	62	62
query33	718	280	264	264
query34	866	440	446	440
query35	892	748	778	748
query36	1297	1203	1222	1203
query37	187	77	75	75
query38	3380	3324	3221	3221
query39	1317	1297	1270	1270
query40	304	90	92	90
query41	39	34	33	33
query42	93	92	95	92
query43	548	487	494	487
query44	1067	713	729	713
query45	192	191	184	184
query46	1070	628	625	625
query47	1694	1557	1529	1529
query48	342	271	265	265
query49	1188	329	324	324
query50	786	326	325	325
query51	5513	5292	5222	5222
query52	99	83	95	83
query53	213	152	143	143
query54	1368	556	586	556
query55	98	89	88	88
query56	209	204	195	195
query57	1035	966	908	908
query58	237	207	211	207
query59	2761	2622	2570	2570
query60	255	226	234	226
query61	85	82	81	81
query62	642	486	452	452
query63	176	158	151	151
query64	5930	1756	1704	1704
query65	3337	3264	3285	3264
query66	1241	328	337	328
query67	15531	15403	15253	15253
query68	12703	545	535	535
query69	521	251	255	251
query70	1748	1500	1508	1500
query71	491	228	233	228
query72	5545	3560	3560	3560
query73	2892	309	317	309
query74	7014	6376	6396	6376
query75	5232	2278	2265	2265
query76	6301	1124	1134	1124
query77	657	273	282	273
query78	9115	8716	8723	8716
query79	1044	516	516	516
query80	538	385	361	361
query81	452	208	207	207
query82	206	104	105	104
query83	183	138	134	134
query84	243	53	52	52
query85	959	281	274	274
query86	388	358	382	358
query87	3571	3370	3381	3370
query88	2946	2284	2272	2272
query89	348	264	274	264
query90	1850	212	211	211
query91	117	89	91	89
query92	63	55	55	55
query93	1385	519	500	500
query94	840	197	189	189
query95	457	423	418	418
query96	626	315	319	315
query97	4295	4165	4157	4157
query98	211	196	193	193
query99	1078	860	881	860
Total cold run time: 294110 ms
Total hot run time: 178944 ms

@doris-robot

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.6 seconds
stream load tsv: 561 seconds loaded 74807831229 Bytes, about 127 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.0 seconds inserted 10000000 Rows, about 357K ops/s
storage size: 17183642281 Bytes

@xiaokang left a comment
Contributor

LGTM

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions bot added the approved and reviewed labels on Dec 31, 2023
@github-actions
Contributor

PR approved by anyone and no changes requested.

@xiaokang merged commit fcc4cfb into apache:master on Dec 31, 2023
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024