[fix](mtmv) Fix getting related partition table wrongly when multi base partition table exists #34781

Merged

@seawinde seawinde commented May 13, 2024

Proposed changes

Fix the related partition table being resolved incorrectly when the query references more than one partitioned base table.
For example, given the following base table definitions:

CREATE TABLE `test1` (
`pre_batch_no` VARCHAR(100) NULL COMMENT 'pre_batch_no',
`batch_no` VARCHAR(100) NULL COMMENT 'batch_no',
`vin_type1` VARCHAR(50) NULL COMMENT 'vin',
`upgrade_day` date COMMENT 'upgrade_day'
) ENGINE=OLAP
unique KEY(`pre_batch_no`,`batch_no`, `vin_type1`, `upgrade_day`)
COMMENT 'OLAP'
PARTITION BY RANGE(`upgrade_day`)
(
FROM ("2024-03-20") TO ("2024-03-31") INTERVAL 1 DAY
)
DISTRIBUTED BY HASH(`vin_type1`) BUCKETS 10
PROPERTIES (
       "replication_num" = "1"
);
CREATE TABLE `test2` (
`batch_no` VARCHAR(100) NULL COMMENT 'batch_no',
`vin_type2` VARCHAR(50) NULL COMMENT 'vin',
`status` VARCHAR(50) COMMENT 'status',
`upgrade_day` date  not null COMMENT 'upgrade_day' 
) ENGINE=OLAP
Duplicate KEY(`batch_no`,`vin_type2`)
COMMENT 'OLAP'
PARTITION BY RANGE(`upgrade_day`)
(
FROM ("2024-01-01") TO ("2024-01-10") INTERVAL 1 DAY
)
DISTRIBUTED BY HASH(`vin_type2`) BUCKETS 10
PROPERTIES (
       "replication_num" = "1"
);

If you create a partitioned materialized view that is partitioned by t1.upgrade_day with the following query, creation will succeed:

select 
  t1.upgrade_day, 
  t1.batch_no, 
  t1.vin_type1 
from 
  (
    SELECT 
      batch_no, 
      vin_type1, 
      upgrade_day 
    FROM 
      test1 
    where 
      batch_no like 'c%' 
    group by 
      batch_no, 
      vin_type1, 
      upgrade_day
  ) t1 
  left join (
    select 
      batch_no, 
      vin_type2, 
      status 
    from 
      test2 
    group by 
      batch_no, 
      vin_type2, 
      status
  ) t2 on t1.vin_type1 = t2.vin_type2;

@seawinde
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41800 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a4ad559018016e0ff58b685f3ce68e57d785bf23, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	6446	4328	4282	4282
q2	653	194	192	192
q3	3250	1129	1153	1129
q4	1026	803	781	781
q5	2664	2719	2629	2629
q6	228	139	140	139
q7	1010	585	582	582
q8	2102	2362	2166	2166
q9	7067	6803	6765	6765
q10	4120	3873	3972	3873
q11	371	249	233	233
q12	410	255	245	245
q13	16664	2971	3088	2971
q14	282	244	224	224
q15	521	488	506	488
q16	491	401	401	401
q17	1002	686	694	686
q18	8316	7710	7738	7710
q19	2139	1565	1596	1565
q20	555	333	329	329
q21	5252	4126	4232	4126
q22	370	284	299	284
Total cold run time: 64939 ms
Total hot run time: 41800 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	4482	4447	4382	4382
q2	374	262	279	262
q3	3182	2927	2875	2875
q4	1887	1603	1585	1585
q5	5274	5301	5298	5298
q6	219	126	128	126
q7	2240	1889	1879	1879
q8	3210	3359	3340	3340
q9	8460	8393	8336	8336
q10	3918	3691	3745	3691
q11	586	477	490	477
q12	778	583	606	583
q13	11428	3014	2972	2972
q14	288	263	278	263
q15	512	462	485	462
q16	485	425	419	419
q17	1782	1489	1469	1469
q18	7626	7703	7537	7537
q19	1679	1509	1551	1509
q20	1959	1801	1729	1729
q21	4950	4846	4821	4821
q22	555	488	504	488
Total cold run time: 65874 ms
Total hot run time: 54503 ms

@doris-robot

TPC-DS: Total hot run time: 187483 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a4ad559018016e0ff58b685f3ce68e57d785bf23, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
query1	902	373	348	348
query2	6445	2549	2395	2395
query3	6642	209	208	208
query4	25361	21251	21269	21251
query5	4119	422	424	422
query6	262	178	175	175
query7	4582	298	305	298
query8	241	188	188	188
query9	8420	2394	2396	2394
query10	437	254	256	254
query11	14756	14182	14235	14182
query12	134	94	92	92
query13	1635	371	374	371
query14	8996	7679	7758	7679
query15	224	184	180	180
query16	7149	266	263	263
query17	1310	565	553	553
query18	1913	277	266	266
query19	202	156	157	156
query20	88	86	82	82
query21	197	132	128	128
query22	4976	4783	4808	4783
query23	34190	33566	33541	33541
query24	5085	2932	2871	2871
query25	471	358	379	358
query26	683	160	159	159
query27	1832	324	332	324
query28	3925	2086	2053	2053
query29	827	628	624	624
query30	228	156	153	153
query31	908	764	747	747
query32	62	55	56	55
query33	425	255	253	253
query34	888	497	491	491
query35	754	678	688	678
query36	1028	917	936	917
query37	105	65	69	65
query38	2891	2734	2780	2734
query39	1637	1621	1567	1567
query40	208	130	130	130
query41	43	38	36	36
query42	102	95	98	95
query43	586	575	559	559
query44	1086	731	745	731
query45	265	256	242	242
query46	1074	744	732	732
query47	1963	1845	1847	1845
query48	382	300	295	295
query49	780	401	390	390
query50	776	391	421	391
query51	6904	6802	6707	6707
query52	106	95	90	90
query53	360	289	287	287
query54	541	422	441	422
query55	76	76	78	76
query56	243	221	231	221
query57	1223	1140	1129	1129
query58	216	204	220	204
query59	3452	3329	3386	3329
query60	257	238	248	238
query61	90	89	86	86
query62	567	505	468	468
query63	320	312	298	298
query64	7751	7378	7386	7378
query65	3168	3114	3109	3109
query66	652	372	353	353
query67	15293	15117	14798	14798
query68	4508	539	545	539
query69	466	359	303	303
query70	1200	1099	1086	1086
query71	403	265	267	265
query72	7912	2615	2336	2336
query73	714	324	326	324
query74	6479	6138	6216	6138
query75	3310	2634	2663	2634
query76	2204	913	935	913
query77	421	266	265	265
query78	10641	10250	10149	10149
query79	1256	517	525	517
query80	995	444	444	444
query81	486	226	228	226
query82	1165	91	96	91
query83	244	166	163	163
query84	245	86	86	86
query85	953	289	269	269
query86	399	329	301	301
query87	3241	3136	3083	3083
query88	3144	2363	2354	2354
query89	467	374	403	374
query90	1942	194	192	192
query91	129	95	98	95
query92	57	51	51	51
query93	1010	531	512	512
query94	1154	183	183	183
query95	394	305	307	305
query96	581	267	269	267
query97	3169	3009	3020	3009
query98	228	223	213	213
query99	1294	923	899	899
Total cold run time: 264225 ms
Total hot run time: 187483 ms

@seawinde seawinde force-pushed the fix_partition_track_fail_when_multi_partition branch from a4ad559 to 12d0109 Compare May 27, 2024 13:31
@seawinde
Contributor Author

run buildall

@seawinde seawinde force-pushed the fix_partition_track_fail_when_multi_partition branch from 12d0109 to 9c1f063 Compare May 27, 2024 13:33
@seawinde
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 40779 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9c1f06372c2cedc74d24cf5d2e4aa08a28b97cf6, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	17594	4359	4205	4205
q2	2018	184	183	183
q3	10467	1217	1189	1189
q4	10193	778	799	778
q5	7508	2697	2707	2697
q6	242	134	134	134
q7	951	602	618	602
q8	9227	2113	2085	2085
q9	9520	6709	6636	6636
q10	9869	3934	3855	3855
q11	443	244	234	234
q12	534	229	221	221
q13	17403	3190	3262	3190
q14	257	215	213	213
q15	507	482	467	467
q16	510	397	405	397
q17	974	687	610	610
q18	8382	7770	7752	7752
q19	3387	1563	1506	1506
q20	619	326	314	314
q21	5236	3244	4112	3244
q22	350	267	290	267
Total cold run time: 116191 ms
Total hot run time: 40779 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	4505	4464	4394	4394
q2	382	270	281	270
q3	3164	2961	2975	2961
q4	1930	1613	1630	1613
q5	5407	5534	5501	5501
q6	217	127	130	127
q7	2176	1863	1872	1863
q8	3250	3411	3394	3394
q9	8573	8685	8667	8667
q10	4002	3745	3876	3745
q11	584	492	505	492
q12	830	659	654	654
q13	15991	3173	3145	3145
q14	307	282	276	276
q15	547	470	481	470
q16	490	441	442	441
q17	1793	1496	1476	1476
q18	7704	7591	7455	7455
q19	3281	1565	1587	1565
q20	2014	1788	1766	1766
q21	8264	4842	4874	4842
q22	552	488	478	478
Total cold run time: 75963 ms
Total hot run time: 55595 ms

@doris-robot

TPC-DS: Total hot run time: 170445 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9c1f06372c2cedc74d24cf5d2e4aa08a28b97cf6, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
query1	929	383	378	378
query2	6465	2447	2262	2262
query3	6641	206	203	203
query4	19087	17274	17395	17274
query5	4157	420	422	420
query6	258	160	154	154
query7	4584	296	291	291
query8	248	180	184	180
query9	8572	2354	2321	2321
query10	446	282	269	269
query11	10711	10184	10046	10046
query12	140	91	91	91
query13	1639	389	371	371
query14	9391	7697	7550	7550
query15	223	168	164	164
query16	7844	262	255	255
query17	1738	516	516	516
query18	1999	272	268	268
query19	196	151	153	151
query20	96	85	82	82
query21	207	149	132	132
query22	4289	4131	4058	4058
query23	33644	33083	33109	33083
query24	6636	2913	2908	2908
query25	524	354	382	354
query26	704	155	157	155
query27	1949	323	318	318
query28	3716	2011	2023	2011
query29	850	605	594	594
query30	231	150	154	150
query31	949	760	723	723
query32	92	52	55	52
query33	501	269	259	259
query34	848	486	471	471
query35	694	601	630	601
query36	1030	929	895	895
query37	104	66	67	66
query38	2896	2783	2750	2750
query39	849	803	779	779
query40	197	128	126	126
query41	45	46	45	45
query42	103	96	98	96
query43	596	553	569	553
query44	1066	735	744	735
query45	178	168	159	159
query46	1044	724	691	691
query47	1838	1784	1787	1784
query48	363	305	310	305
query49	777	388	392	388
query50	838	395	382	382
query51	6858	6933	6808	6808
query52	104	89	91	89
query53	355	285	287	285
query54	546	442	437	437
query55	84	75	72	72
query56	293	235	245	235
query57	1110	1042	1011	1011
query58	231	206	218	206
query59	3523	3498	3375	3375
query60	276	266	260	260
query61	87	90	90	90
query62	546	450	461	450
query63	313	284	282	282
query64	8467	2263	1689	1689
query65	3154	3108	3108	3108
query66	793	340	325	325
query67	15467	15245	14800	14800
query68	4525	525	530	525
query69	438	270	274	270
query70	1157	1072	1167	1072
query71	390	278	271	271
query72	7297	5501	2745	2745
query73	729	328	323	323
query74	6064	5629	5634	5629
query75	3345	2632	2646	2632
query76	2257	1082	961	961
query77	400	264	266	264
query78	10210	9797	9770	9770
query79	1683	511	499	499
query80	902	434	423	423
query81	511	224	222	222
query82	961	91	89	89
query83	259	166	167	166
query84	251	85	83	83
query85	975	268	275	268
query86	446	288	322	288
query87	3280	3126	3086	3086
query88	3543	2442	2436	2436
query89	482	395	398	395
query90	2059	192	194	192
query91	136	172	100	100
query92	64	48	49	48
query93	1836	505	484	484
query94	1160	188	186	186
query95	397	310	319	310
query96	596	271	266	266
query97	3204	2977	3019	2977
query98	251	223	216	216
query99	1136	858	869	858
Total cold run time: 256273 ms
Total hot run time: 170445 ms

@doris-robot

ClickBench: Total hot run time: 30.74 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 9c1f06372c2cedc74d24cf5d2e4aa08a28b97cf6, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.23	0.05	0.05
query4	1.68	0.06	0.08
query5	0.49	0.53	0.50
query6	1.14	0.73	0.72
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.54	0.48	0.49
query10	0.55	0.55	0.54
query11	0.16	0.11	0.11
query12	0.15	0.12	0.12
query13	0.60	0.58	0.59
query14	0.77	0.79	0.78
query15	0.82	0.80	0.81
query16	0.36	0.37	0.37
query17	1.02	1.02	1.04
query18	0.22	0.25	0.26
query19	1.77	1.71	1.70
query20	0.01	0.01	0.01
query21	15.44	0.70	0.67
query22	4.64	6.79	2.12
query23	18.32	1.41	1.20
query24	1.82	0.27	0.19
query25	0.15	0.09	0.08
query26	0.26	0.17	0.18
query27	0.07	0.07	0.08
query28	13.32	1.02	1.01
query29	13.31	3.32	3.26
query30	0.24	0.05	0.07
query31	2.86	0.38	0.39
query32	3.30	0.48	0.46
query33	2.87	2.91	2.88
query34	16.87	4.47	4.44
query35	4.59	4.51	4.61
query36	0.65	0.46	0.48
query37	0.18	0.15	0.15
query38	0.15	0.15	0.14
query39	0.04	0.04	0.03
query40	0.16	0.14	0.15
query41	0.09	0.04	0.04
query42	0.05	0.05	0.05
query43	0.03	0.03	0.04
Total cold run time: 110.11 s
Total hot run time: 30.74 s

@seawinde
Contributor Author

run buildall

@morrySnow
Contributor

run compile

@seawinde
Contributor Author

run buildall

@seawinde seawinde force-pushed the fix_partition_track_fail_when_multi_partition branch from a2688f8 to a280d9e Compare May 28, 2024 07:13
@seawinde
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41141 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a280d9eb37b86c7f166994f8bf3453615de36353, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	17611	4325	4237	4237
q2	2026	199	192	192
q3	10434	1262	1239	1239
q4	10192	836	806	806
q5	7477	2687	2742	2687
q6	223	130	135	130
q7	962	616	628	616
q8	9225	2149	2084	2084
q9	9943	6690	6747	6690
q10	9878	3916	3918	3916
q11	453	245	242	242
q12	529	230	222	222
q13	17196	3137	3191	3137
q14	278	238	229	229
q15	518	477	469	469
q16	484	375	377	375
q17	958	732	697	697
q18	8317	7964	7870	7870
q19	2878	1452	1584	1452
q20	660	338	326	326
q21	5188	3247	3289	3247
q22	365	278	287	278
Total cold run time: 115795 ms
Total hot run time: 41141 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	4486	4443	4449	4443
q2	383	275	274	274
q3	3153	2937	2892	2892
q4	1983	1596	1602	1596
q5	5358	5481	5521	5481
q6	216	125	126	125
q7	2247	1844	1841	1841
q8	3222	3391	3411	3391
q9	8589	8765	8723	8723
q10	4056	3711	3833	3711
q11	593	493	499	493
q12	824	617	641	617
q13	16973	3217	3163	3163
q14	308	269	273	269
q15	532	488	484	484
q16	490	439	449	439
q17	1800	1488	1454	1454
q18	7605	7611	7546	7546
q19	1675	1592	1561	1561
q20	2010	1802	1802	1802
q21	4925	4766	4754	4754
q22	558	475	479	475
Total cold run time: 71986 ms
Total hot run time: 55534 ms

@doris-robot

TPC-DS: Total hot run time: 169590 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a280d9eb37b86c7f166994f8bf3453615de36353, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
query1	932	374	376	374
query2	6452	2330	2252	2252
query3	6647	204	205	204
query4	19661	17453	17314	17314
query5	4111	421	422	421
query6	253	153	148	148
query7	4586	299	292	292
query8	250	186	186	186
query9	8520	2461	2437	2437
query10	451	289	265	265
query11	10729	10176	10083	10083
query12	147	92	90	90
query13	1637	377	376	376
query14	8675	7659	6902	6902
query15	212	168	168	168
query16	7740	272	269	269
query17	1734	535	528	528
query18	1962	338	273	273
query19	191	153	152	152
query20	94	86	84	84
query21	198	135	130	130
query22	4248	3916	3902	3902
query23	33532	33295	33106	33106
query24	6563	2840	2782	2782
query25	563	354	350	350
query26	704	154	159	154
query27	1959	333	327	327
query28	3663	2085	2102	2085
query29	845	629	593	593
query30	260	150	152	150
query31	931	769	732	732
query32	88	52	53	52
query33	512	268	262	262
query34	872	463	480	463
query35	708	631	591	591
query36	1044	907	906	906
query37	101	65	65	65
query38	2924	2781	2781	2781
query39	846	797	801	797
query40	202	126	121	121
query41	46	45	46	45
query42	102	93	96	93
query43	560	549	551	549
query44	1077	728	752	728
query45	181	164	161	161
query46	1063	732	736	732
query47	1858	1775	1762	1762
query48	372	307	295	295
query49	839	373	376	373
query50	774	378	388	378
query51	6783	6783	6802	6783
query52	109	88	91	88
query53	351	289	293	289
query54	547	431	427	427
query55	73	72	74	72
query56	259	235	241	235
query57	1116	1084	1054	1054
query58	226	210	212	210
query59	3248	3151	3081	3081
query60	285	260	248	248
query61	86	84	86	84
query62	604	473	457	457
query63	304	279	281	279
query64	8454	2205	1789	1789
query65	3179	3100	3101	3100
query66	805	326	323	323
query67	15056	14969	15205	14969
query68	4481	574	564	564
query69	441	263	268	263
query70	1179	1127	1109	1109
query71	358	271	263	263
query72	7121	2711	2524	2524
query73	737	336	329	329
query74	6118	5645	5540	5540
query75	3302	2607	2599	2599
query76	2288	987	982	982
query77	393	271	268	268
query78	10237	9818	9702	9702
query79	2099	521	516	516
query80	792	430	425	425
query81	484	221	218	218
query82	1086	96	94	94
query83	190	169	169	169
query84	242	88	84	84
query85	1220	295	264	264
query86	446	309	318	309
query87	3258	3124	3173	3124
query88	4083	2467	2437	2437
query89	494	397	395	395
query90	2090	187	187	187
query91	123	96	99	96
query92	60	47	50	47
query93	2497	531	512	512
query94	1266	194	185	185
query95	408	311	313	311
query96	582	278	273	273
query97	3175	3035	2979	2979
query98	257	236	206	206
query99	1146	845	849	845
Total cold run time: 256757 ms
Total hot run time: 169590 ms

@doris-robot

ClickBench: Total hot run time: 30.42 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a280d9eb37b86c7f166994f8bf3453615de36353, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.04	0.04
query2	0.08	0.04	0.04
query3	0.23	0.05	0.05
query4	1.68	0.07	0.07
query5	0.51	0.47	0.50
query6	1.14	0.73	0.72
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.55	0.51	0.50
query10	0.55	0.54	0.54
query11	0.17	0.11	0.12
query12	0.14	0.13	0.12
query13	0.60	0.59	0.60
query14	0.78	0.78	0.77
query15	0.85	0.84	0.84
query16	0.37	0.38	0.38
query17	0.96	0.97	0.98
query18	0.21	0.28	0.22
query19	1.84	1.78	1.74
query20	0.01	0.01	0.01
query21	15.56	0.70	0.66
query22	4.62	7.23	1.67
query23	18.28	1.33	1.15
query24	2.06	0.22	0.21
query25	0.15	0.08	0.08
query26	0.27	0.16	0.17
query27	0.08	0.08	0.08
query28	13.30	1.04	1.01
query29	13.71	3.35	3.27
query30	0.24	0.06	0.07
query31	2.85	0.39	0.40
query32	3.28	0.48	0.48
query33	2.96	2.97	2.94
query34	17.06	4.42	4.56
query35	4.49	4.56	4.57
query36	0.67	0.47	0.47
query37	0.18	0.16	0.15
query38	0.15	0.14	0.15
query39	0.05	0.03	0.04
query40	0.16	0.14	0.15
query41	0.09	0.05	0.05
query42	0.05	0.04	0.05
query43	0.05	0.03	0.04
Total cold run time: 111.09 s
Total hot run time: 30.42 s

List<Object> catalogRelationObjs = materializedViewPlan.collectToList(
        planTreeNode -> planTreeNode instanceof CatalogRelation);
ImmutableMultimap.Builder<TableIdentifier, CatalogRelation> tableCatalogRelationMultimapBuilder =
        ImmutableMultimap.builder();
Contributor

Use a builder with an expected size.

Contributor Author

ImmutableMultimap.Builder doesn't seem to have an expectedSize builder.

Contributor

ImmutableMap.builderWithExpectedSize()
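
For readers following the thread, here is a minimal sketch, using only the Java standard library, of the kind of grouping the reviewed snippet appears to set up: relations collected from a plan are bucketed by table, so that several relations can be kept under one table key. The names `RelationGrouping`, `Relation`, `groupByTable`, and the plain string table names are hypothetical stand-ins for illustration; they are not Doris or Guava types, and the repeated scan of `test1` is an assumed input, not something taken from the PR query. Pre-sizing the map only mirrors the spirit of the expected-size suggestion above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RelationGrouping {

    /** Hypothetical stand-in for a plan node that reads a catalog table. */
    static final class Relation {
        final String tableName; // stands in for a table identifier
        final int relationId;   // tells apart several relations on one table

        Relation(String tableName, int relationId) {
            this.tableName = tableName;
            this.relationId = relationId;
        }
    }

    /** Groups relations by table name, keeping every relation of each table. */
    static Map<String, List<Relation>> groupByTable(List<Relation> relations) {
        // Size the map up front for at most one key per relation, so it does
        // not need to rehash while it is being filled.
        int capacity = (int) (relations.size() / 0.75f) + 1;
        Map<String, List<Relation>> grouped = new LinkedHashMap<>(capacity);
        for (Relation relation : relations) {
            grouped.computeIfAbsent(relation.tableName, key -> new ArrayList<>())
                    .add(relation);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Assumed input: two base tables, one of them referenced twice.
        List<Relation> relations = Arrays.asList(
                new Relation("test1", 0),
                new Relation("test2", 1),
                new Relation("test1", 2));
        Map<String, List<Relation>> grouped = groupByTable(relations);
        System.out.println(grouped.keySet());            // prints [test1, test2]
        System.out.println(grouped.get("test1").size()); // prints 2
    }
}
```

A multimap-style structure is used here rather than a plain one-value-per-key map because a single table key may need to hold more than one relation; whether the merged change sizes its builder this way is not shown in this conversation.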

@seawinde
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41188 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a497e054abe1b5b0de4e21a152c2ef4fbf3279fb, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	17617	4236	4233	4233
q2	2024	205	194	194
q3	10444	1200	1172	1172
q4	10206	808	788	788
q5	7501	2707	2703	2703
q6	219	145	130	130
q7	957	614	613	613
q8	9219	2103	2094	2094
q9	9840	6615	6656	6615
q10	9209	3856	3899	3856
q11	463	245	240	240
q12	460	218	219	218
q13	17226	3314	3239	3239
q14	270	226	224	224
q15	524	469	479	469
q16	513	398	403	398
q17	972	719	768	719
q18	8206	7873	7817	7817
q19	5448	1549	1550	1549
q20	641	298	306	298
q21	5152	3982	3351	3351
q22	337	283	268	268
Total cold run time: 117448 ms
Total hot run time: 41188 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
q1	4524	4357	4431	4357
q2	395	280	282	280
q3	3101	2882	2940	2882
q4	2019	1731	1743	1731
q5	5364	5389	5471	5389
q6	214	127	127	127
q7	2192	1823	1791	1791
q8	3274	3325	3389	3325
q9	8617	8600	8637	8600
q10	4049	3863	3873	3863
q11	593	481	486	481
q12	774	607	606	606
q13	16146	3148	3172	3148
q14	288	284	278	278
q15	528	484	482	482
q16	485	450	427	427
q17	1832	1505	1533	1505
q18	8228	7424	7364	7364
q19	2758	1580	1579	1579
q20	2031	1759	1822	1759
q21	14022	4633	4837	4633
q22	578	478	509	478
Total cold run time: 82012 ms
Total hot run time: 55085 ms

@doris-robot

TPC-DS: Total hot run time: 167213 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a497e054abe1b5b0de4e21a152c2ef4fbf3279fb, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	min hot (ms)
query1	920	374	365	365
query2	6448	2261	2323	2261
query3	6640	206	204	204
query4	19032	17152	17134	17134
query5	4158	427	411	411
query6	251	153	156	153
query7	4584	304	285	285
query8	243	178	185	178
query9	8455	2372	2349	2349
query10	453	289	277	277
query11	10688	9999	10108	9999
query12	140	89	93	89
query13	1641	376	358	358
query14	9180	6202	6013	6013
query15	216	167	171	167
query16	7227	260	255	255
query17	1355	526	503	503
query18	1927	272	278	272
query19	195	152	156	152
query20	93	89	88	88
query21	193	138	126	126
query22	4140	3950	3826	3826
query23	33604	32945	32994	32945
query24	8840	2793	2826	2793
query25	570	351	356	351
query26	702	156	156	156
query27	2401	317	318	317
query28	5864	2029	2041	2029
query29	851	605	593	593
query30	243	148	150	148
query31	1017	763	726	726
query32	91	52	60	52
query33	611	291	264	264
query34	867	470	479	470
query35	710	619	583	583
query36	1072	917	908	908
query37	101	67	71	67
query38	2846	2773	2753	2753
query39	852	793	777	777
query40	201	129	123	123
query41	46	44	44	44
query42	104	95	96	95
query43	599	535	502	502
query44	1083	726	731	726
query45	177	166	161	161
query46	1065	733	716	716
query47	1814	1737	1736	1736
query48	357	305	295	295
query49	847	385	394	385
query50	772	386	395	386
query51	6884	6710	6740	6710
query52	106	93	92	92
query53	353	294	295	294
query54	671	432	431	431
query55	73	70	74	70
query56	266	242	246	242
query57	1132	1056	1013	1013
query58	220	205	222	205
query59	3503	3292	3045	3045
query60	275	253	261	253
query61	91	90	88	88
query62	598	444	442	442
query63	313	278	294	278
query64	8487	2224	1701	1701
query65	3203	3104	3134	3104
query66	890	343	331	331
query67	15152	14695	14665	14665
query68	4787	541	540	540
query69	499	273	271	271
query70	1146	1088	1111	1088
query71	399	278	267	267
query72	8250	5720	2702	2702
query73	735	325	317	317
query74	5997	5616	5550	5550
query75	3483	2649	2606	2606
query76	3406	915	997	915
query77	590	267	263	263
query78	10106	9915	9648	9648
query79	2384	515	511	511
query80	1728	439	440	439
query81	510	225	217	217
query82	1395	92	93	92
query83	277	179	173	173
query84	264	85	87	85
query85	1412	273	260	260
query86	463	318	304	304
query87	3236	3083	3092	3083
query88	3986	2360	2332	2332
query89	476	401	387	387
query90	2002	188	190	188
query91	124	109	108	108
query92	63	50	52	50
query93	3017	510	491	491
query94	1252	195	259	195
query95	402	303	308	303
query96	587	269	261	261
query97	3198	3032	2984	2984
query98	231	225	217	217
query99	1192	860	836	836
Total cold run time: 266300 ms
Total hot run time: 167213 ms

@doris-robot

ClickBench: Total hot run time: 30.56 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a497e054abe1b5b0de4e21a152c2ef4fbf3279fb, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.04	0.03
query2	0.08	0.04	0.04
query3	0.22	0.05	0.06
query4	1.67	0.08	0.08
query5	0.51	0.50	0.52
query6	1.13	0.72	0.73
query7	0.02	0.01	0.02
query8	0.04	0.04	0.04
query9	0.56	0.49	0.49
query10	0.54	0.55	0.54
query11	0.16	0.11	0.10
query12	0.15	0.12	0.12
query13	0.59	0.58	0.59
query14	0.77	0.78	0.76
query15	0.83	0.81	0.80
query16	0.37	0.35	0.37
query17	0.97	1.00	0.99
query18	0.21	0.23	0.26
query19	1.92	1.67	1.70
query20	0.02	0.01	0.02
query21	15.51	0.69	0.70
query22	4.23	6.86	2.16
query23	18.23	1.39	1.14
query24	1.60	0.24	0.31
query25	0.15	0.09	0.08
query26	0.26	0.16	0.17
query27	0.08	0.07	0.08
query28	13.30	1.02	1.00
query29	13.24	3.33	3.24
query30	0.25	0.06	0.05
query31	2.85	0.38	0.39
query32	3.28	0.47	0.47
query33	2.91	2.84	2.95
query34	17.02	4.37	4.43
query35	4.51	4.51	4.69
query36	0.65	0.46	0.46
query37	0.18	0.16	0.16
query38	0.15	0.14	0.14
query39	0.04	0.04	0.04
query40	0.18	0.14	0.14
query41	0.09	0.04	0.04
query42	0.06	0.05	0.04
query43	0.05	0.04	0.03
Total cold run time: 109.62 s
Total hot run time: 30.56 s

Contributor

@zfr9527 zfr9527 left a comment

LGTM

Contributor

PR approved by anyone and no changes requested.

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 29, 2024
Contributor

@zddr zddr left a comment

LGTM

@morrySnow morrySnow merged commit baf7ea3 into apache:master May 29, 2024
32 of 34 checks passed
yiguolei pushed a commit that referenced this pull request May 29, 2024
…se partition table exists (#34781)

dataroaring pushed a commit that referenced this pull request May 31, 2024
…se partition table exists (#34781)

morrySnow pushed a commit that referenced this pull request Jun 5, 2024
… optimize the fail reason (#35562)

This depends on #34781.

1. Materialized view partition tracking supports date_trunc, and the failure reason reported when tracking fails is improved.

2. It supports creating a partitioned materialized view such as the following, whose partitions are refreshed by day:

CREATE MATERIALIZED VIEW mv_6
BUILD IMMEDIATE REFRESH AUTO ON MANUAL
partition by(date_trunc(date_alias, 'day'))
DISTRIBUTED BY RANDOM BUCKETS 2
PROPERTIES ('replication_num' = '1')
AS
SELECT date_trunc(t1.L_SHIPDATE, 'hour') as date_alias, t2.O_ORDERDATE, t1.L_QUANTITY, t2.O_ORDERSTATUS, 
count(distinct case when t1.L_SUPPKEY > 0 then t2.O_ORDERSTATUS else null end) as cnt_1 
from 
  (select * from 
  lineitem 
  where L_SHIPDATE in ('2017-01-30')) t1 
left join 
  (select * from 
  orders 
  where O_ORDERDATE in ('2017-01-30')) t2 
on t1.L_ORDERKEY = t2.O_ORDERKEY 
group by 
t1.L_SHIPDATE, 
t2.O_ORDERDATE, 
t1.L_QUANTITY, 
t2.O_ORDERSTATUS;
dataroaring pushed a commit that referenced this pull request Jun 7, 2024
… optimize the fail reason (#35562)

seawinde added a commit to seawinde/doris that referenced this pull request Jun 7, 2024
… optimize the fail reason (apache#35562)

seawinde added a commit to seawinde/doris that referenced this pull request Jun 20, 2024
… optimize the fail reason (apache#35562)

seawinde added a commit to seawinde/doris that referenced this pull request Jun 20, 2024
… optimize the fail reason (apache#35562)

seawinde added a commit to seawinde/doris that referenced this pull request Jun 27, 2024
… optimize the fail reason (apache#35562)

@yiguolei yiguolei mentioned this pull request Jul 19, 2024
1 task
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.4-merged dev/3.0.0-merged reviewed
6 participants