[improvement](mtmv) Support to get tables in materialized view when collecting table in plan #32797

Merged: 3 commits, Mar 28, 2024

seawinde
Contributor

Proposed changes

Support collecting the tables inside a materialized view when collecting tables from a plan.

Given the following materialized view definition:

create materialized view mv1
BUILD IMMEDIATE REFRESH COMPLETE ON MANUAL
DISTRIBUTED BY RANDOM BUCKETS 1
PROPERTIES ('replication_num' = '1')
 as 
select 
  t1.c1, 
  t3.c2 
from 
  table1 t1 
  inner join table3 t3 on t1.c1 = t3.c2

If we collect the tables from the plan of the following query, we get [table1, table3, table2]; mv1 is expanded into its base tables:

SELECT 
  mv1.*, 
  uuid() 
FROM 
  mv1 LEFT SEMI 
  JOIN table2 ON mv1.c1 = table2.c1 
WHERE 
  mv1.c1 IN (
    SELECT 
      c1 
    FROM 
      table2
  ) 
  OR mv1.c1 < 10
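
The expansion described above can be sketched in a few lines of Java. This is only a simplified model under stated assumptions, not the Doris implementation: `TableCollectorSketch`, `collectTables`, and the map of view definitions are hypothetical names, and the plan is reduced to an ordered list of scanned relation names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; these are not the Doris Nereids classes.
class TableCollectorSketch {

    /**
     * Collects the table names referenced by a query.
     *
     * @param scannedRelations relations scanned by the query, in plan order
     * @param mvDefinitions    maps a materialized view name to the relations its defining query scans
     * @param expand           whether a materialized view is replaced by its base tables
     */
    static List<String> collectTables(List<String> scannedRelations,
                                      Map<String, List<String>> mvDefinitions,
                                      boolean expand) {
        // A LinkedHashSet drops repeated scans but keeps first-seen order.
        Set<String> collected = new LinkedHashSet<>();
        for (String relation : scannedRelations) {
            List<String> baseRelations = mvDefinitions.get(relation);
            if (expand && baseRelations != null) {
                // Recurse so that a view defined on another view is expanded too.
                collected.addAll(collectTables(baseRelations, mvDefinitions, true));
            } else {
                collected.add(relation);
            }
        }
        return new ArrayList<>(collected);
    }

    public static void main(String[] args) {
        Map<String, List<String>> mvDefinitions = Map.of("mv1", List.of("table1", "table3"));
        // mv1 is joined with table2, and table2 is scanned again in the IN subquery.
        List<String> scanned = List.of("mv1", "table2", "table2");
        System.out.println(collectTables(scanned, mvDefinitions, true));  // [table1, table3, table2]
        System.out.println(collectTables(scanned, mvDefinitions, false)); // [mv1, table2]
    }
}
```

With expansion enabled, mv1 contributes table1 and table3 before table2 is reached, which matches the order given in the description.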

@seawinde
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 37824 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ff0e44599e3dd76876de388dda7916e14094bb0c, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17633	4187	4090	4090
q2	2106	158	150	150
q3	10583	1141	1217	1141
q4	10232	776	779	776
q5	7463	2980	2991	2980
q6	201	123	121	121
q7	1028	585	566	566
q8	9332	1921	1986	1921
q9	7129	6580	6584	6580
q10	8418	3477	3529	3477
q11	438	228	225	225
q12	402	206	197	197
q13	17822	2866	2850	2850
q14	224	200	213	200
q15	507	468	457	457
q16	488	369	370	369
q17	936	508	582	508
q18	7157	6429	6411	6411
q19	1545	1435	1380	1380
q20	562	261	252	252
q21	3703	2935	2876	2876
q22	352	297	304	297
Total cold run time: 108261 ms
Total hot run time: 37824 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4125	4075	4064	4064
q2	325	229	232	229
q3	2907	2845	2814	2814
q4	1846	1505	1525	1505
q5	5294	5308	5339	5308
q6	191	118	115	115
q7	2218	1819	1846	1819
q8	3138	3263	3255	3255
q9	8642	8690	8700	8690
q10	3749	3804	3761	3761
q11	549	449	444	444
q12	711	554	549	549
q13	16928	2839	2854	2839
q14	285	247	257	247
q15	497	462	456	456
q16	472	414	420	414
q17	1733	1497	1482	1482
q18	7442	7109	7154	7109
q19	1610	1472	1505	1472
q20	1906	1726	1683	1683
q21	4698	4805	4745	4745
q22	517	450	453	450
Total cold run time: 69783 ms
Total hot run time: 53450 ms

@doris-robot

TPC-DS: Total hot run time: 181003 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ff0e44599e3dd76876de388dda7916e14094bb0c, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	936	363	351	351
query2	6553	1914	1843	1843
query3	6709	209	214	209
query4	31811	21425	21283	21283
query5	4349	402	392	392
query6	289	186	176	176
query7	4632	286	285	285
query8	226	167	175	167
query9	9417	2277	2282	2277
query10	569	244	249	244
query11	17389	14171	14248	14171
query12	133	93	93	93
query13	1648	429	410	410
query14	10102	7629	7171	7171
query15	238	203	202	202
query16	8043	248	250	248
query17	1971	572	530	530
query18	2052	274	283	274
query19	260	146	150	146
query20	91	88	89	88
query21	205	127	122	122
query22	5018	4854	4801	4801
query23	33237	32874	32981	32874
query24	10802	2837	2892	2837
query25	618	396	383	383
query26	1186	154	156	154
query27	2732	345	344	344
query28	7576	1829	1850	1829
query29	877	645	614	614
query30	315	152	150	150
query31	959	737	746	737
query32	94	62	60	60
query33	777	253	253	253
query34	1128	478	476	476
query35	816	606	610	606
query36	1003	894	912	894
query37	128	64	64	64
query38	3627	3436	3423	3423
query39	1454	1459	1446	1446
query40	208	120	119	119
query41	51	46	50	46
query42	106	97	96	96
query43	476	443	452	443
query44	1136	719	710	710
query45	265	270	262	262
query46	1117	688	686	686
query47	1940	1868	1846	1846
query48	446	364	357	357
query49	1131	338	331	331
query50	763	370	369	369
query51	6707	6618	6634	6618
query52	105	93	87	87
query53	348	275	276	275
query54	314	236	239	236
query55	90	77	83	77
query56	253	226	230	226
query57	1220	1159	1127	1127
query58	229	212	210	210
query59	2711	2667	2659	2659
query60	279	242	251	242
query61	116	115	113	113
query62	671	455	455	455
query63	303	277	279	277
query64	5816	4153	4217	4153
query65	3068	3013	3032	3013
query66	865	384	380	380
query67	15230	14820	14641	14641
query68	6301	517	524	517
query69	613	403	392	392
query70	1164	1154	1218	1154
query71	487	274	268	268
query72	6442	2733	2577	2577
query73	715	308	309	308
query74	8112	6626	6341	6341
query75	3388	2182	2230	2182
query76	4391	900	899	899
query77	624	278	260	260
query78	10942	10120	10100	10100
query79	9961	515	516	515
query80	2052	390	374	374
query81	574	212	221	212
query82	1611	89	84	84
query83	301	141	151	141
query84	285	80	78	78
query85	1641	326	312	312
query86	488	302	296	296
query87	3733	3570	3494	3494
query88	5188	2272	2270	2270
query89	521	364	362	362
query90	1974	178	180	178
query91	181	139	143	139
query92	62	55	48	48
query93	7209	496	484	484
query94	1203	176	176	176
query95	443	333	332	332
query96	598	267	265	265
query97	2705	2457	2469	2457
query98	231	210	202	202
query99	1193	905	919	905
Total cold run time: 311267 ms
Total hot run time: 181003 ms

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit ff0e44599e3dd76876de388dda7916e14094bb0c with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       21.2 seconds inserted 10000000 Rows, about 471K ops/s

Comment on lines +80 to +82
LOG.error(String.format(
"table collector expand fail, mtmv name is %s, targetTableTypes is %s",
mtmv.getName(), context.targetTableTypes), e);
Contributor

Shouldn't the exception be rethrown here?

Contributor Author

Fixed. It should throw the exception.

private final Set<TableType> targetTableTypes;
// if expand the mv or not
private final boolean expand;
Contributor

Is this always true now?

Contributor Author

This method makes it possible to set expand to true or false. In this scenario, it should always be true.
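
To make the two fields under discussion concrete, here is a minimal sketch of a collector context that carries a set of target table types and an expand flag. Everything in it is an assumption made for illustration: `CollectorContextSketch`, its `TableType` values, and the `offer` method are hypothetical and do not reproduce the class in this PR.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: field names mirror the review snippet, but this is not the Doris class.
class CollectorContextSketch {
    // Stand-in for the table types a caller may ask for (assumed values).
    enum TableType { OLAP, MATERIALIZED_VIEW, VIEW }

    private final Set<TableType> targetTableTypes;
    // Whether a materialized view is expanded into its base tables.
    private final boolean expand;
    private final Set<String> collectedTables = new LinkedHashSet<>();

    CollectorContextSketch(Set<TableType> targetTableTypes, boolean expand) {
        // Defensive copy so later changes by the caller do not affect the context.
        this.targetTableTypes = Set.copyOf(targetTableTypes);
        this.expand = expand;
    }

    boolean isExpand() {
        return expand;
    }

    // Records the table only if its type was requested; returns whether it was recorded.
    boolean offer(TableType type, String tableName) {
        if (!targetTableTypes.contains(type)) {
            return false;
        }
        collectedTables.add(tableName);
        return true;
    }

    Set<String> getCollectedTables() {
        return Collections.unmodifiableSet(collectedTables);
    }

    public static void main(String[] args) {
        CollectorContextSketch context =
                new CollectorContextSketch(EnumSet.of(TableType.OLAP), true);
        context.offer(TableType.OLAP, "table1");
        context.offer(TableType.VIEW, "some_view"); // ignored: VIEW was not requested
        System.out.println(context.isExpand() + " " + context.getCollectedTables());
    }
}
```

Keeping the flag as a constructor argument leaves room for a caller that wants the materialized view itself rather than its base tables, even if every current caller passes true.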

@seawinde
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 37694 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a8c31ee43a78e4ad2a026b96f640a6e542295042, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17625	4263	4063	4063
q2	2112	160	150	150
q3	10586	1172	1198	1172
q4	10218	798	783	783
q5	7452	2962	3001	2962
q6	199	124	122	122
q7	1045	575	566	566
q8	9335	1974	1946	1946
q9	7169	6609	6546	6546
q10	8445	3406	3575	3406
q11	426	235	220	220
q12	418	201	197	197
q13	17834	2835	2838	2835
q14	226	204	204	204
q15	508	460	470	460
q16	495	369	367	367
q17	940	616	573	573
q18	7042	6435	6304	6304
q19	2574	1415	1448	1415
q20	552	265	250	250
q21	3546	2862	2985	2862
q22	345	292	291	291
Total cold run time: 109092 ms
Total hot run time: 37694 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4102	4082	4132	4082
q2	328	233	230	230
q3	3002	2865	2811	2811
q4	1849	1558	1575	1558
q5	5298	5271	5276	5271
q6	193	118	117	117
q7	2239	1836	1882	1836
q8	3153	3303	3276	3276
q9	8692	8668	8688	8668
q10	3757	3728	3739	3728
q11	535	451	445	445
q12	728	568	525	525
q13	16927	2846	2831	2831
q14	287	251	250	250
q15	498	454	460	454
q16	478	417	411	411
q17	1727	1507	1464	1464
q18	7437	7209	7075	7075
q19	1585	1543	1540	1540
q20	1904	1704	1709	1704
q21	4748	4714	4678	4678
q22	533	450	474	450
Total cold run time: 70000 ms
Total hot run time: 53404 ms

@doris-robot

TPC-DS: Total hot run time: 181127 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a8c31ee43a78e4ad2a026b96f640a6e542295042, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	939	356	346	346
query2	6543	1943	1848	1848
query3	6699	222	216	216
query4	31814	21289	21313	21289
query5	4254	397	410	397
query6	263	178	171	171
query7	4628	307	304	304
query8	237	173	172	172
query9	9066	2256	2256	2256
query10	545	238	239	238
query11	17463	14292	14191	14191
query12	130	92	85	85
query13	1623	432	420	420
query14	9882	7609	7940	7609
query15	245	196	183	183
query16	8041	264	261	261
query17	1963	582	545	545
query18	2056	293	293	293
query19	238	156	153	153
query20	93	84	91	84
query21	199	129	129	129
query22	5053	4861	4766	4766
query23	33507	32873	32819	32819
query24	10856	2847	2920	2847
query25	632	384	383	383
query26	1216	156	155	155
query27	2828	355	351	351
query28	7265	1900	1850	1850
query29	896	641	631	631
query30	296	155	148	148
query31	997	746	739	739
query32	94	57	56	56
query33	769	248	251	248
query34	1095	477	501	477
query35	827	617	603	603
query36	1052	888	881	881
query37	125	68	67	67
query38	3547	3414	3380	3380
query39	1456	1436	1417	1417
query40	224	126	122	122
query41	52	48	47	47
query42	107	97	105	97
query43	467	435	443	435
query44	1210	758	742	742
query45	292	264	258	258
query46	1120	717	719	717
query47	1925	1864	1868	1864
query48	442	377	366	366
query49	1129	342	313	313
query50	761	379	374	374
query51	6760	6583	6547	6547
query52	108	94	96	94
query53	343	271	267	267
query54	291	235	242	235
query55	82	76	80	76
query56	230	214	224	214
query57	1209	1120	1134	1120
query58	226	195	202	195
query59	2708	2613	2470	2470
query60	267	233	246	233
query61	99	93	92	92
query62	669	442	457	442
query63	295	269	266	266
query64	5672	3951	4021	3951
query65	3019	3016	3012	3012
query66	870	353	353	353
query67	15116	14740	14637	14637
query68	5977	534	530	530
query69	590	374	370	370
query70	1252	1140	1105	1105
query71	475	264	267	264
query72	6935	2758	2677	2677
query73	710	322	326	322
query74	6951	6354	6372	6354
query75	3221	2195	2240	2195
query76	4262	878	896	878
query77	635	286	260	260
query78	10837	10080	10190	10080
query79	6722	526	527	526
query80	1856	400	388	388
query81	565	219	214	214
query82	1768	91	87	87
query83	340	155	155	155
query84	289	82	91	82
query85	1711	361	359	359
query86	477	312	322	312
query87	3684	3493	3538	3493
query88	4927	2401	2394	2394
query89	474	359	364	359
query90	2008	177	174	174
query91	177	136	139	136
query92	59	48	46	46
query93	5076	505	493	493
query94	1179	178	176	176
query95	427	325	324	324
query96	600	268	284	268
query97	2665	2460	2456	2456
query98	224	226	203	203
query99	1174	916	900	900
Total cold run time: 303372 ms
Total hot run time: 181127 ms

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit a8c31ee43a78e4ad2a026b96f640a6e542295042 with default session variables
Stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       13.7 seconds inserted 10000000 Rows, about 729K ops/s

LOG.error(String.format(
"table collector expand fail, mtmv name is %s, targetTableTypes is %s",
mtmv.getName(), context.targetTableTypes), e);
throw new NereidsException(String.format("expand mv and collect table fail, mv name is %s, mv sql is %s",
Contributor

Throw the Nereids AnalysisException instead.

Contributor Author

Fixed.
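
The outcome of the two review threads on this catch block (log the failure, then rethrow it as an analysis-level exception) can be sketched as follows. This is a self-contained illustration under stated assumptions: `ExpandFailureSketch`, `expandOrFail`, and `AnalysisExceptionSketch` are hypothetical stand-ins, and `java.util.logging` replaces whatever logger the real code uses.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of the log-and-rethrow pattern; not the code merged in this PR.
class ExpandFailureSketch {
    private static final Logger LOG = Logger.getLogger(ExpandFailureSketch.class.getName());

    // Stand-in for the Nereids AnalysisException (assumed to be unchecked here).
    static class AnalysisExceptionSketch extends RuntimeException {
        AnalysisExceptionSketch(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Runs the expansion; on failure, logs the error and rethrows it with the cause attached.
    static <T> T expandOrFail(String mvName, Callable<T> expansion) {
        try {
            return expansion.call();
        } catch (Exception e) {
            LOG.log(Level.SEVERE,
                    String.format("table collector expand fail, mtmv name is %s", mvName), e);
            // Rethrow so the caller never continues with a partially collected table set.
            throw new AnalysisExceptionSketch(
                    String.format("expand mv and collect table fail, mv name is %s", mvName), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(expandOrFail("mv1", () -> "expanded"));
        try {
            expandOrFail("mv1", () -> {
                throw new IllegalStateException("plan could not be built");
            });
        } catch (AnalysisExceptionSketch e) {
            System.out.println(e.getMessage() + ", cause: " + e.getCause().getMessage());
        }
    }
}
```

Passing the original exception as the cause keeps the full stack trace available to whoever handles the rethrown exception.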

@seawinde
Contributor Author

run buildall

Contributor

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 38161 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 83937cd5b9f8f9faad41fb5b49f48bc57f420197, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17771	5766	4203	4203
q2	3128	165	160	160
q3	11345	1131	1214	1131
q4	10745	782	831	782
q5	7839	3054	3056	3054
q6	207	128	127	127
q7	1086	609	601	601
q8	9909	2046	2005	2005
q9	7323	6617	6537	6537
q10	8376	3448	3567	3448
q11	431	229	214	214
q12	367	208	197	197
q13	17801	2852	2845	2845
q14	245	197	216	197
q15	510	469	470	469
q16	508	374	374	374
q17	958	603	603	603
q18	7272	6559	6381	6381
q19	1578	1438	1486	1438
q20	547	254	261	254
q21	3531	2901	2836	2836
q22	363	305	309	305
Total cold run time: 111840 ms
Total hot run time: 38161 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4139	4072	4060	4060
q2	330	232	236	232
q3	2974	2819	2817	2817
q4	1827	1548	1579	1548
q5	5318	5317	5359	5317
q6	201	117	120	117
q7	2234	1882	1841	1841
q8	3154	3301	3259	3259
q9	8705	8704	8751	8704
q10	3805	3825	3738	3738
q11	557	445	453	445
q12	702	553	530	530
q13	16561	2844	2846	2844
q14	273	250	248	248
q15	489	461	473	461
q16	461	441	417	417
q17	1722	1497	1462	1462
q18	7345	7147	7008	7008
q19	1617	1525	1529	1525
q20	1916	1733	1695	1695
q21	4873	4634	4612	4612
q22	533	461	445	445
Total cold run time: 69736 ms
Total hot run time: 53325 ms

@doris-robot

TPC-DS: Total hot run time: 181820 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 83937cd5b9f8f9faad41fb5b49f48bc57f420197, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	939	381	358	358
query2	6538	2072	1982	1982
query3	6703	214	220	214
query4	31772	21349	21272	21272
query5	4261	398	395	395
query6	267	176	172	172
query7	4630	301	306	301
query8	229	168	176	168
query9	9297	2345	2331	2331
query10	561	252	244	244
query11	15395	14339	14266	14266
query12	135	88	85	85
query13	1636	413	422	413
query14	9400	7773	7747	7747
query15	277	202	208	202
query16	8130	258	259	258
query17	1936	569	552	552
query18	2023	300	284	284
query19	307	161	152	152
query20	96	89	91	89
query21	206	125	138	125
query22	4942	4839	4763	4763
query23	33517	32953	32757	32757
query24	10928	2823	2854	2823
query25	611	389	389	389
query26	1606	163	164	163
query27	2991	356	359	356
query28	7616	1868	1909	1868
query29	952	619	591	591
query30	306	148	147	147
query31	967	758	742	742
query32	97	58	53	53
query33	756	247	256	247
query34	1060	492	500	492
query35	824	623	598	598
query36	1014	873	886	873
query37	124	69	66	66
query38	3560	3431	3422	3422
query39	1467	1541	1441	1441
query40	300	113	108	108
query41	48	45	52	45
query42	101	98	97	97
query43	509	469	457	457
query44	1165	738	721	721
query45	268	267	264	264
query46	1117	688	696	688
query47	1897	1838	1830	1830
query48	453	352	353	352
query49	1214	330	327	327
query50	787	371	371	371
query51	6798	6678	6625	6625
query52	102	91	92	91
query53	338	278	282	278
query54	292	239	228	228
query55	83	80	83	80
query56	242	211	213	211
query57	1221	1132	1136	1132
query58	230	203	203	203
query59	2902	2592	2598	2592
query60	266	240	239	239
query61	96	96	96	96
query62	686	441	443	441
query63	311	270	282	270
query64	6698	4006	4108	4006
query65	3136	3039	3062	3039
query66	1429	381	372	372
query67	15184	14858	14829	14829
query68	5293	511	531	511
query69	570	383	377	377
query70	1237	1187	1145	1145
query71	440	273	272	272
query72	6418	2884	2708	2708
query73	715	318	326	318
query74	7145	6363	6524	6363
query75	2985	2191	2209	2191
query76	3492	859	908	859
query77	379	280	250	250
query78	10858	10083	10254	10083
query79	8374	534	528	528
query80	1531	367	373	367
query81	518	214	218	214
query82	1624	87	87	87
query83	204	146	148	146
query84	288	84	74	74
query85	1534	323	315	315
query86	462	289	280	280
query87	3686	3524	3544	3524
query88	5086	2308	2306	2306
query89	521	374	367	367
query90	2034	178	178	178
query91	172	140	140	140
query92	61	52	48	48
query93	6283	502	494	494
query94	1151	173	175	173
query95	431	322	330	322
query96	622	268	269	268
query97	2651	2464	2447	2447
query98	230	228	208	208
query99	1261	908	864	864
Total cold run time: 304205 ms
Total hot run time: 181820 ms

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 83937cd5b9f8f9faad41fb5b49f48bc57f420197 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       13.9 seconds inserted 10000000 Rows, about 719K ops/s

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Mar 28, 2024
@morrySnow morrySnow merged commit 92f32db into apache:master Mar 28, 2024
32 of 34 checks passed
Jibing-Li added a commit that referenced this pull request Mar 29, 2024
* [fix](merge cloud) Fix cloud be set be tag map (#32864)

* [chore] Add gavinchou to collaborators (#32881)

* [chore](show) support statement to show views from table (#32358)

MySQL [test]> show views;
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
| t2_view        |
+----------------+
2 rows in set (0.00 sec)

MySQL [test]> show views like '%t1%';
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
+----------------+
1 row in set (0.01 sec)

MySQL [test]> show views where create_time > '2024-03-18';
+----------------+
| Tables_in_test |
+----------------+
| t2_view        |
+----------------+
1 row in set (0.02 sec)

* [Enhancement](ranger) Disable some permission operations when Ranger or LDAP are enabled (#32538)

Disable some permission operations when Ranger or LDAP are enabled.

* [chore](ci) exclude unstable trino_connector case (#32892)

Co-authored-by: stephen <hello-stephen@qq.com>

* [fix](Nereids) NPE when create table with implicit index type (#32893)

* [improvement](mtmv) Support more join types for query rewriting by materialized view (#32685)

This rewriting pattern is supported for multi-table joins. The supported join types are as follows:

INNER JOIN
LEFT OUTER JOIN
RIGHT OUTER JOIN
FULL OUTER JOIN
LEFT SEMI JOIN
RIGHT SEMI JOIN
LEFT ANTI JOIN
RIGHT ANTI JOIN

* [Serde](Variant) support arrow serialization for varint type (#32780)

* [fix](multicatalog) fix no data error when read hive table on cosn (#32815)

Currently, when reading a Hive table on cosn, Doris returns an empty result even though the table has data.
Iceberg on cosn works correctly.
The cause is misuse of cosn's file system: according to the cosn documentation, fs.cosn.impl should be org.apache.hadoop.fs.CosFileSystem.

* [fix](nereids)EliminateGroupByConstant should replace agg's output after removing constant group by keys (#32878)

* [Fix](executor)Fix regression test for test_active_queries/test_backend_active_tasks #32899

* [fix](iceberg) fix iceberg catalog bug and p2 test cases (#32898)

1. Fix iceberg catalog bug

    PR #30198 changed the logic of `IcebergHMSExternalCatalog.java`
    to get locationUrl by calling the Hive metastore's `getCatalog()` method.
    But this method only exists in Hive 3+, so it fails when using Hive 2.x.

    This logic is temporarily removed, because it is only used for Iceberg table writing,
    which is still under development. We will rethink this logic later.

2. Fix test cases

    Some of the P2 test cases were missing `order_qt`. And because the output format of the
    floating-point type has changed, some results in the `out` files need to be regenerated.

* [revert](jni) revert part of #32455 (#32904)

* [fix](spill) Avoid releasing resources while spill tasks are executing (#32783)

* [chore](log) print query id before logging profile in be.INFO (#32922)

* [fix](grace-exit) Stop incorrectly of reportwork cause heap use after free #32929

* [improvement](decommission be) decommission check replica num (#32748)

* [fix](arrow-flight) Fix reach limit of connections error (#32911)

Fix the "Reach limit of connections" error.
In fe.conf, arrow_flight_token_cache_size must be less than qe_max_connection/2. Arrow Flight SQL is a stateless protocol, so connections are usually not actively disconnected; when a bearer token is evicted from the cache, its ConnectContext is unregistered.

Fix ConnectContext.command not being reset to COM_SLEEP in time, which caused connections to be killed frequently after a query timeout.

Fix the bearer token eviction log and exception.

TODO: use arrow flight session: https://mail.google.com/mail/u/0/#inbox/FMfcgzGxRdxBLQLTcvvtRpqsvmhrHpdH

* [bugfix](cloud) few variable not initialized (#32868)

../../cloud/src/recycler/meta_checker.cpp
can cause uninitialised memory read.

* [fix](arrow-flight) Fix arrow flight sql compatible with JDK 17 and upgrade arrow 15.0.2 (#32796)

--add-opens=java.base/java.nio=ALL-UNNAMED, see: https://arrow.apache.org/docs/java/install.html#java-compatibility
When Groovy uses a Flight SQL connection to execute the query SUM(MAX(c1) OVER (PARTITION BY)), it reports the error: AGGREGATE clause must not contain analytic expressions, but executing it in Java with jdbc::arrow-flight-sql works without problems.
Groovy does not support printing the Arrow array type and throws IndexOutOfBoundsException.
"arrow_flight_sql" does not support two-phase read.
./run-regression-test.sh --run --clean -g arrow_flight_sql

* [fix](spill) SpillStream's writer maybe may not have been finalized (#32931)

* [improvement](spill) Disable DistinctStreamingAgg when spill is enabled (#32932)

* [Improve](inverted_index) update clucene and improve array inverted index writer  (#32436)

* [Performance](exec) replace SipHash in function by XXHash (#32919)

* [feature](agg) add aggregate function sum0 (#32541)

* [improvement](mtmv) Support to get tables in materialized view when collecting table in plan (#32797)

* [enhance](mtmv)support olap table partition column is null (#32698)

* [enhancement](cloud) add table version to cloud (#32738)

Add table version to cloud.

In Fe:
Get: If Fe is cloud mode, get table version from meta service.
Update: Op drop/replace temp partition, commit transaction.

In meta service:
Add: create Index. init value is 1.
Remove: by recycler.
Update: commit/drop partition rpc, commit txn rpc. Atomic++.

* [fix](cloud) schema change from not null to null (#32913)

1. Use equals instead of == when comparing types.
2. The null bitmap is resized to the size of the ref column.

* [feature](Nereids): add ColumnPruningPostProcessor. (#32800)

* [case](rowpolicy)fix row policy has been exist (#32880)

* [fix](pipeline) fix use error row desc when origin block clear (#32803)

* [fix](Nereids) support variant column with index when create table (#32948)

* [opt](Nereids) support create table with variant type (#32953)

* [test](insert-overwrite) Add insert overwrite auto detect concurrency cases (#32935)

* [fix](compile) fe cannot compile in idea (#32955)

* [enhancement](plsql) Support select * from routines (#32866)

Support showing PL/SQL procedures using select * from routines.

* [fix](trino-connector) fix `NoClassDefFoundError` of hudi `Utils` class (#32846)

Due to the change in PR #32455, the `trino-connector-scanner` package cannot access the `hudi_scanner` package, so a `NoClassDefFoundError` is thrown.

We need to write a separate Utils class.

* [exec](column) change some complex column move to noexcept (#32954)

* [Enhancement](data skew) extends show data skew (#32732)

* [chore](test) let suite compatible with Nereids (#32964)

* Support identical column name in different index. (#32792)

* Limit the max string length to 1024 while collecting column stats to control BE memory usage. (#32470)

* [fix](merge-iterator) fix NOT_IMPLEMENTED_ERROR when read next block view (#32961)

* [improvement](executor)Add tag property for workload group #32874

* [fix](auth)unified workload and resource permission logic (#32907)

- `Grant resource` can no longer grant global `usage_priv`
-  `grant resource %` instead of `grant resource *`

before change:
```
grant usage_priv on resource * to f;
show grants for f\G
*************************** 1. row ***************************
      UserIdentity: 'f'@'%'
           Comment: 
          Password: No
             Roles: 
       GlobalPrivs: Usage_priv 
      CatalogPrivs: NULL
     DatabasePrivs: internal.information_schema: Select_priv ; internal.mysql: Select_priv 
        TablePrivs: NULL
          ColPrivs: NULL
     ResourcePrivs: NULL
 CloudClusterPrivs: NULL
WorkloadGroupPrivs: normal: Usage_priv 
```
after change
```
grant usage_priv on resource '%' to f;
show grants for f\G
*************************** 1. row ***************************
      UserIdentity: 'f'@'%'
           Comment: 
          Password: No
             Roles: 
       GlobalPrivs: NULL
      CatalogPrivs: NULL
     DatabasePrivs: internal.information_schema: Select_priv ; internal.mysql: Select_priv 
        TablePrivs: NULL
          ColPrivs: NULL
     ResourcePrivs: %: Usage_priv 
 CloudClusterPrivs: NULL
WorkloadGroupPrivs: normal: Usage_priv 

```

---------

Co-authored-by: yujun <yu.jun.reach@gmail.com>
Co-authored-by: Gavin Chou <gavineaglechou@gmail.com>
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
Co-authored-by: yongjinhou <109586248+yongjinhou@users.noreply.github.com>
Co-authored-by: Dongyang Li <hello_stephen@qq.com>
Co-authored-by: stephen <hello-stephen@qq.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: seawinde <149132972+seawinde@users.noreply.github.com>
Co-authored-by: lihangyu <15605149486@163.com>
Co-authored-by: Yulei-Yang <yulei.yang0699@gmail.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: wangbo <wangbo@apache.org>
Co-authored-by: Mingyu Chen <morningman@163.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: zhiqiang <seuhezhiqiang@163.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: Vallish Pai <vallishpai@gmail.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Jensen <czjourney@163.com>
Co-authored-by: zhangdong <493738387@qq.com>
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
Co-authored-by: jakevin <jakevingoo@gmail.com>
Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
Co-authored-by: zclllyybb <zhaochangle@selectdb.com>
Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
yiguolei pushed a commit that referenced this pull request Apr 1, 2024
yiguolei pushed a commit that referenced this pull request Apr 10, 2024
Labels
approved (Indicates a PR has been approved by one committer.), reviewed
4 participants