[test](insert-overwrite) Add insert overwrite auto detect concurrency cases #32935

Merged
merged 2 commits into from
Mar 28, 2024

Conversation

zclllyybb
Contributor

Proposed changes

Issue Number: close #xxx


@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@zclllyybb
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 37478 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 8b34f2230274a66d5dac4679ad539472af794832, data reload: false

------ Round 1 ----------------------------------
q1	17614	4172	4023	4023
q2	2112	166	155	155
q3	10573	1125	1168	1125
q4	10223	748	782	748
q5	7467	2984	2959	2959
q6	204	125	121	121
q7	1034	588	564	564
q8	9333	1975	1945	1945
q9	7271	6594	6533	6533
q10	8475	3393	3571	3393
q11	433	224	216	216
q12	420	193	193	193
q13	17799	2855	2819	2819
q14	225	213	203	203
q15	508	461	463	461
q16	509	372	373	372
q17	935	549	588	549
q18	7051	6419	6374	6374
q19	3301	1423	1456	1423
q20	536	250	256	250
q21	3522	2877	2765	2765
q22	327	287	295	287
Total cold run time: 109872 ms
Total hot run time: 37478 ms
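
The bot output has no column header. A minimal sketch, assuming the columns are query name, cold run, first hot run, second hot run, and the faster of the two hot runs (all in ms): under that reading, the two totals are the sum of the cold column and the sum of the last column. The names `ROUND1` and `totals` are made up for this illustration and are not part of the Doris tooling.

```python
# Round 1 rows copied from the report above.
# Assumed column meaning: query, cold, hot1, hot2, best_hot (milliseconds).
ROUND1 = """\
q1 17614 4172 4023 4023
q2 2112 166 155 155
q3 10573 1125 1168 1125
q4 10223 748 782 748
q5 7467 2984 2959 2959
q6 204 125 121 121
q7 1034 588 564 564
q8 9333 1975 1945 1945
q9 7271 6594 6533 6533
q10 8475 3393 3571 3393
q11 433 224 216 216
q12 420 193 193 193
q13 17799 2855 2819 2819
q14 225 213 203 203
q15 508 461 463 461
q16 509 372 373 372
q17 935 549 588 549
q18 7051 6419 6374 6374
q19 3301 1423 1456 1423
q20 536 250 256 250
q21 3522 2877 2765 2765
q22 327 287 295 287
"""


def totals(report: str) -> tuple[int, int]:
    """Return (total cold ms, total hot ms) for a whitespace-separated report."""
    cold_total = 0
    hot_total = 0
    for line in report.splitlines():
        _name, cold, hot1, hot2, best = line.split()
        # Check the assumed meaning of the last column before using it.
        if int(best) != min(int(hot1), int(hot2)):
            raise ValueError(f"last column is not min(hot1, hot2): {line!r}")
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


print(totals(ROUND1))  # (109872, 37478)
```

Both sums match the totals printed by the bot for this round, which supports the assumed column meaning.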

----- Round 2, with runtime_filter_mode=off -----
q1	4064	4070	4095	4070
q2	338	229	228	228
q3	2945	2875	2827	2827
q4	1818	1553	1535	1535
q5	5299	5300	5319	5300
q6	196	115	116	115
q7	2235	1882	1826	1826
q8	3171	3276	3259	3259
q9	8662	8644	8679	8644
q10	3788	3697	3715	3697
q11	541	440	438	438
q12	718	527	531	527
q13	16920	2836	2922	2836
q14	273	271	263	263
q15	496	458	443	443
q16	466	426	427	426
q17	1751	1517	1490	1490
q18	7581	7227	7046	7046
q19	1597	1501	1504	1501
q20	1871	1735	1726	1726
q21	4646	4609	4576	4576
q22	510	441	453	441
Total cold run time: 69886 ms
Total hot run time: 53214 ms

@doris-robot

TPC-DS: Total hot run time: 181679 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 8b34f2230274a66d5dac4679ad539472af794832, data reload: false

query1	934	369	361	361
query2	6560	1896	1835	1835
query3	6701	208	220	208
query4	31951	21233	21385	21233
query5	4331	392	402	392
query6	268	185	179	179
query7	4680	293	295	293
query8	231	180	167	167
query9	9195	2303	2322	2303
query10	553	232	251	232
query11	15097	14203	14171	14171
query12	143	94	84	84
query13	1618	414	411	411
query14	9681	7756	7647	7647
query15	252	211	196	196
query16	8259	262	261	261
query17	1996	580	557	557
query18	2116	290	289	289
query19	352	156	161	156
query20	92	90	91	90
query21	202	127	126	126
query22	5107	4822	4795	4795
query23	33827	33094	32788	32788
query24	10782	2886	2875	2875
query25	595	393	383	383
query26	848	157	159	157
query27	2306	349	342	342
query28	6692	1889	1880	1880
query29	900	649	627	627
query30	303	150	153	150
query31	1009	727	713	713
query32	95	61	65	61
query33	773	254	268	254
query34	1061	478	490	478
query35	862	612	601	601
query36	1036	928	913	913
query37	116	63	65	63
query38	3547	3479	3468	3468
query39	1491	1466	1453	1453
query40	204	117	111	111
query41	49	48	45	45
query42	101	97	94	94
query43	502	441	447	441
query44	1237	732	740	732
query45	285	266	260	260
query46	1126	712	701	701
query47	1928	1844	1864	1844
query48	454	370	360	360
query49	1078	331	342	331
query50	768	370	378	370
query51	6740	6540	6575	6540
query52	100	92	98	92
query53	340	282	280	280
query54	313	243	239	239
query55	102	78	86	78
query56	246	234	224	224
query57	1201	1147	1146	1146
query58	241	220	212	212
query59	2870	2541	2581	2541
query60	256	234	238	234
query61	107	92	91	91
query62	686	443	438	438
query63	301	284	276	276
query64	5562	3960	3971	3960
query65	3135	3050	3024	3024
query66	864	377	366	366
query67	15423	15009	14896	14896
query68	6524	509	520	509
query69	592	374	374	374
query70	1254	1148	1152	1148
query71	477	256	262	256
query72	6797	2714	2533	2533
query73	713	318	325	318
query74	7870	6463	6419	6419
query75	3273	2182	2179	2179
query76	4017	826	883	826
query77	621	259	259	259
query78	10776	10323	10273	10273
query79	8073	528	530	528
query80	1523	372	378	372
query81	543	216	217	216
query82	1095	90	84	84
query83	203	142	142	142
query84	281	78	81	78
query85	1414	379	363	363
query86	460	289	319	289
query87	3748	3536	3543	3536
query88	5003	2308	2294	2294
query89	538	369	362	362
query90	1942	177	179	177
query91	197	146	150	146
query92	59	48	51	48
query93	6189	499	487	487
query94	1164	181	180	180
query95	440	341	338	338
query96	602	270	267	267
query97	2640	2524	2478	2478
query98	228	222	206	206
query99	1191	908	876	876
Total cold run time: 303228 ms
Total hot run time: 181679 ms

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 8b34f2230274a66d5dac4679ad539472af794832 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       13.8 seconds inserted 10000000 Rows, about 724K ops/s
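
The throughput figures can be reproduced from the byte counts and durations. A rough sketch, assuming the bot reports whole binary megabytes (1024 * 1024 bytes) per second, rounded down, and thousands of rows per second, rounded down; this rounding rule is inferred from the numbers and not taken from the bot's script. The function names are hypothetical.

```python
def mb_per_second(total_bytes: int, seconds: int) -> int:
    """Throughput in whole MB/s, assuming 1 MB = 1024 * 1024 bytes, rounded down."""
    return total_bytes // seconds // 1024 // 1024


def k_rows_per_second(rows: int, seconds: float) -> int:
    """Row rate in whole thousands of rows per second, rounded down."""
    return int(rows / seconds / 1000)


print(mb_per_second(2358488459, 19))      # stream load json    -> 118
print(mb_per_second(1101869774, 59))      # stream load orc     -> 17
print(mb_per_second(861443392, 31))       # stream load parquet -> 26
print(k_rows_per_second(10000000, 13.8))  # insert into select  -> 724
```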

@zclllyybb
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38291 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 00ee6fe54d8a3e59216cee03612aa7066e9d9783, data reload: false

------ Round 1 ----------------------------------
q1	18133	4318	4111	4111
q2	2111	175	157	157
q3	10568	1147	1175	1147
q4	10957	754	750	750
q5	8256	3027	3009	3009
q6	206	125	122	122
q7	1020	594	563	563
q8	9335	1971	2013	1971
q9	7308	6767	6691	6691
q10	8733	3654	3778	3654
q11	1089	245	262	245
q12	1074	218	209	209
q13	19884	2882	2906	2882
q14	247	215	221	215
q15	517	486	459	459
q16	501	375	383	375
q17	969	516	530	516
q18	7183	6531	6481	6481
q19	6324	1435	1466	1435
q20	523	264	243	243
q21	3561	2849	2754	2754
q22	344	302	312	302
Total cold run time: 118843 ms
Total hot run time: 38291 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4118	4077	4053	4053
q2	324	234	230	230
q3	2967	2817	2811	2811
q4	1807	1539	1605	1539
q5	5277	5341	5317	5317
q6	198	116	115	115
q7	2265	1850	1867	1850
q8	3164	3305	3272	3272
q9	8700	8680	8706	8680
q10	3739	3774	3747	3747
q11	548	452	456	452
q12	734	537	520	520
q13	16931	2859	2854	2854
q14	270	255	251	251
q15	504	466	466	466
q16	476	431	447	431
q17	1692	1488	1481	1481
q18	7409	7136	7086	7086
q19	1599	1524	1488	1488
q20	1888	1725	1725	1725
q21	5741	4689	4626	4626
q22	520	460	469	460
Total cold run time: 70871 ms
Total hot run time: 53454 ms

@doris-robot

TPC-DS: Total hot run time: 181889 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 00ee6fe54d8a3e59216cee03612aa7066e9d9783, data reload: false

query1	953	368	352	352
query2	6543	2089	1928	1928
query3	6712	209	209	209
query4	31823	21336	21314	21314
query5	4294	404	404	404
query6	273	179	171	171
query7	4641	291	299	291
query8	225	181	174	174
query9	9358	2376	2364	2364
query10	561	259	255	255
query11	15247	14297	14265	14265
query12	141	102	90	90
query13	1633	429	442	429
query14	11020	8066	7499	7499
query15	279	196	193	193
query16	8177	260	254	254
query17	1987	563	550	550
query18	2105	281	274	274
query19	336	156	153	153
query20	91	85	85	85
query21	209	128	123	123
query22	4970	4787	4791	4787
query23	33663	32826	32552	32552
query24	11027	2859	2888	2859
query25	614	390	409	390
query26	1141	161	163	161
query27	2379	366	357	357
query28	7161	1939	1884	1884
query29	876	635	642	635
query30	309	149	148	148
query31	989	723	752	723
query32	99	59	61	59
query33	768	265	261	261
query34	981	492	503	492
query35	834	611	632	611
query36	1005	901	906	901
query37	121	67	65	65
query38	3560	3464	3454	3454
query39	1511	1456	1426	1426
query40	216	118	120	118
query41	55	50	51	50
query42	106	97	98	97
query43	506	459	456	456
query44	1160	723	731	723
query45	271	263	268	263
query46	1110	711	708	708
query47	1903	1836	1861	1836
query48	443	365	372	365
query49	1120	330	351	330
query50	783	378	377	377
query51	6739	6586	6622	6586
query52	115	93	99	93
query53	348	280	280	280
query54	314	254	248	248
query55	87	84	81	81
query56	253	237	234	234
query57	1213	1144	1118	1118
query58	236	215	214	214
query59	2744	2576	2675	2576
query60	276	247	268	247
query61	132	115	111	111
query62	676	458	440	440
query63	305	293	280	280
query64	5747	3940	4103	3940
query65	3046	3060	3000	3000
query66	882	379	376	376
query67	15278	15035	15006	15006
query68	7118	525	529	525
query69	637	404	394	394
query70	1227	1204	1135	1135
query71	499	277	276	276
query72	6537	2724	2566	2566
query73	724	327	321	321
query74	8057	6416	6419	6416
query75	3404	2198	2232	2198
query76	4398	867	930	867
query77	610	261	263	261
query78	10948	10242	10145	10145
query79	11000	532	532	532
query80	1880	374	367	367
query81	528	227	216	216
query82	708	84	84	84
query83	221	145	144	144
query84	287	84	79	79
query85	1280	330	316	316
query86	422	304	269	269
query87	3771	3583	3535	3535
query88	4815	2389	2391	2389
query89	500	380	381	380
query90	2049	175	176	175
query91	170	138	141	138
query92	61	51	47	47
query93	6858	514	484	484
query94	1280	178	184	178
query95	437	331	335	331
query96	611	273	277	273
query97	2666	2445	2493	2445
query98	227	216	215	215
query99	1160	893	928	893
Total cold run time: 309770 ms
Total hot run time: 181889 ms

@zclllyybb
Contributor Author

run p1 5

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 00ee6fe54d8a3e59216cee03612aa7066e9d9783 with default session variables
Stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       13.8 seconds inserted 10000000 Rows, about 724K ops/s

Contributor

@HappenLee HappenLee left a comment


LGTM

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added approved Indicates a PR has been approved by one committer. reviewed labels Mar 28, 2024
Contributor

PR approved by anyone and no changes requested.

@HappenLee HappenLee merged commit affee13 into apache:master Mar 28, 2024
27 of 30 checks passed
Jibing-Li added a commit that referenced this pull request Mar 29, 2024
* [fix](merge cloud) Fix cloud be set be tag map (#32864)

* [chore] Add gavinchou to collaborators (#32881)

* [chore](show) support statement to show views from table (#32358)

MySQL [test]> show views;
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
| t2_view        |
+----------------+
2 rows in set (0.00 sec)

MySQL [test]> show views like '%t1%';
+----------------+
| Tables_in_test |
+----------------+
| t1_view        |
+----------------+
1 row in set (0.01 sec)

MySQL [test]> show views where create_time > '2024-03-18';
+----------------+
| Tables_in_test |
+----------------+
| t2_view        |
+----------------+
1 row in set (0.02 sec)

* [Enhancement](ranger) Disable some permission operations when Ranger or LDAP are enabled (#32538)

Disable some permission operations when Ranger or LDAP are enabled.

* [chore](ci) exclude unstable trino_connector case (#32892)

Co-authored-by: stephen <hello-stephen@qq.com>

* [fix](Nereids) NPE when create table with implicit index type (#32893)

* [improvement](mtmv) Support more join types for query rewriting by materialized view (#32685)

This rewriting pattern is supported for multi-table joins. The supported join types are as follows:

INNER JOIN
LEFT OUTER JOIN
RIGHT OUTER JOIN
FULL OUTER JOIN
LEFT SEMI JOIN
RIGHT SEMI JOIN
LEFT ANTI JOIN
RIGHT ANTI JOIN

* [Serde](Variant) support arrow serialization for varint type (#32780)

* [fix](multicatalog) fix no data error when read hive table on cosn (#32815)

Currently, when reading a Hive table on cosn, Doris returns an empty result even though the table has data.
Iceberg on cosn is not affected.
The cause is misuse of cosn's file system: according to the cosn documentation, fs.cosn.impl should be org.apache.hadoop.fs.CosFileSystem.

* [fix](nereids)EliminateGroupByConstant should replace agg's output after removing constant group by keys (#32878)

* [Fix](executor)Fix regression test for test_active_queries/test_backend_active_tasks #32899

* [fix](iceberg) fix iceberg catalog bug and p2 test cases (#32898)

1. Fix iceberg catalog bug

    PR #30198 changed the logic of `IcebergHMSExternalCatalog.java`
    to get locationUrl by calling the Hive metastore's `getCatalog()` method.
    But this method only exists in Hive 3+, so it fails when using Hive 2.x.

    I temporarily removed this logic, because it is only used for Iceberg table writing,
    which is still under development. We will rethink this logic later.

2. Fix test cases

    Some of the P2 test cases were missing `order_qt`. And because the output format of the floating point
    type has changed, some results in the `out` files need to be regenerated.

* [revert](jni) revert part of #32455 (#32904)

* [fix](spill) Avoid releasing resources while spill tasks are executing (#32783)

* [chore](log) print query id before logging profile in be.INFO (#32922)

* [fix](grace-exit) Stop incorrectly of reportwork cause heap use after free #32929

* [improvement](decommission be) decommission check replica num (#32748)

* [fix](arrow-flight) Fix reach limit of connections error (#32911)

Fix the "Reach limit of connections" error.
In fe.conf, arrow_flight_token_cache_size must be less than qe_max_connection/2. Arrow Flight SQL is a stateless protocol, so connections are usually not actively disconnected; when a bearer token is evicted from the cache, its ConnectContext is unregistered.

Fix ConnectContext.command not being reset to COM_SLEEP in time, which resulted in connections being killed frequently after a query timeout.

Fix the bearer token eviction log and exception.

TODO: use arrow flight session: https://mail.google.com/mail/u/0/#inbox/FMfcgzGxRdxBLQLTcvvtRpqsvmhrHpdH
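
The cache-size constraint described in this fix can be written as a simple check. This is only an illustration of the stated rule (token cache size strictly less than half of the connection limit), not the FE code; the function name and the sample values are hypothetical.

```python
def is_valid_token_cache_size(arrow_flight_token_cache_size: int,
                              qe_max_connection: int) -> bool:
    """Return True if the token cache size is strictly less than qe_max_connection / 2."""
    # Multiply instead of dividing so the comparison stays in integers.
    return 2 * arrow_flight_token_cache_size < qe_max_connection


# Hypothetical values, chosen only to show both outcomes.
print(is_valid_token_cache_size(500, 1024))  # True
print(is_valid_token_cache_size(512, 1024))  # False: 512 is not less than 1024 / 2
```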

* [bugfix](cloud) few variable not initialized (#32868)

../../cloud/src/recycler/meta_checker.cpp
can cause uninitialised memory read.

* [fix](arrow-flight) Fix arrow flight sql compatible with JDK 17 and upgrade arrow 15.0.2 (#32796)

--add-opens=java.base/java.nio=ALL-UNNAMED, see: https://arrow.apache.org/docs/java/install.html#java-compatibility
When Groovy uses a Flight SQL connection to execute a query containing SUM(MAX(c1) OVER (PARTITION BY)), it reports the error "AGGREGATE clause must not contain analytic expressions", but there is no problem when executing it in Java with jdbc::arrow-flight-sql.
Groovy does not support printing the Arrow array type and throws IndexOutOfBoundsException.
"arrow_flight_sql" does not support two-phase read.
./run-regression-test.sh --run --clean -g arrow_flight_sql

* [fix](spill) SpillStream's writer maybe may not have been finalized (#32931)

* [improvement](spill) Disable DistinctStreamingAgg when spill is enabled (#32932)

* [Improve](inverted_index) update clucene and improve array inverted index writer  (#32436)

* [Performance](exec) replace SipHash in function by XXHash (#32919)

* [feature](agg) add aggregate function sum0 (#32541)

* [improvement](mtmv) Support to get tables in materialized view when collecting table in plan (#32797)

Support to get tables in materialized view when collecting table in plan

The table schema is as follows:

create materialized view mv1
BUILD IMMEDIATE REFRESH COMPLETE ON MANUAL
DISTRIBUTED BY RANDOM BUCKETS 1 
PROPERTIES ('replication_num' = '1')
 as 
select 
  t1.c1, 
  t3.c2 
from 
  table1 t1 
  inner join table3 t3 on t1.c1 = t3.c2

If we collect the tables from the plan of the following query, we get [table1, table3, table2]; mv1 is expanded into its base tables:

SELECT 
  mv1.*, 
  uuid() 
FROM 
  mv1 LEFT SEMI 
  JOIN table2 ON mv1.c1 = table2.c1 
WHERE 
  mv1.c1 IN (
    SELECT 
      c1 
    FROM 
      table2
  ) 
  OR mv1.c1 < 10
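
The expansion described above can be sketched as a recursive walk over relation names. This is a simplified illustration, not the Nereids implementation: `MV_DEFINITIONS` and `collect_base_tables` are hypothetical, and the plan is reduced to the ordered list of relations it references.

```python
# Hypothetical catalog: each materialized view maps to the relations it is built from.
MV_DEFINITIONS = {"mv1": ["table1", "table3"]}


def collect_base_tables(relations, mv_definitions):
    """Collect base tables in first-seen order, expanding materialized views."""
    result = []
    for name in relations:
        if name in mv_definitions:
            # A materialized view is replaced by the tables it selects from.
            candidates = collect_base_tables(mv_definitions[name], mv_definitions)
        else:
            candidates = [name]
        for table in candidates:
            if table not in result:
                result.append(table)
    return result


# Relations referenced by the query above: mv1, table2 (join), table2 (subquery).
print(collect_base_tables(["mv1", "table2", "table2"], MV_DEFINITIONS))
# ['table1', 'table3', 'table2']
```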

* [enhance](mtmv)support olap table partition column is null (#32698)

* [enhancement](cloud) add table version to cloud (#32738)

Add table version to cloud.

In Fe:
Get: If Fe is cloud mode, get table version from meta service.
Update: Op drop/replace temp partition, commit transaction.

In meta service:
Add: create Index. init value is 1.
Remove: by recycler.
Update: commit/drop partition rpc, commit txn rpc. Atomic++.
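
The version lifecycle listed above (initial value 1, atomic increment on update, removal by the recycler) can be modelled with a small in-memory store. This is a toy stand-in for the meta service, with hypothetical class and method names; a lock stands in for whatever atomicity the real service provides.

```python
import threading


class TableVersionStore:
    """Toy in-memory model of per-table versions."""

    def __init__(self):
        self._versions = {}
        self._lock = threading.Lock()

    def create(self, table_id):
        # Add: the version starts at 1 when the index is created.
        with self._lock:
            self._versions[table_id] = 1

    def get(self, table_id):
        # Get: read the current version.
        with self._lock:
            return self._versions[table_id]

    def bump(self, table_id):
        # Update: increment atomically (commit txn, commit/drop partition).
        with self._lock:
            self._versions[table_id] += 1
            return self._versions[table_id]

    def remove(self, table_id):
        # Remove: done by the recycler in the description above.
        with self._lock:
            del self._versions[table_id]


store = TableVersionStore()
store.create("t1")
print(store.get("t1"))   # 1
print(store.bump("t1"))  # 2
```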

* [fix](cloud) schema change from not null to null (#32913)

1. Use equals instead of == for type comparison.
2. The null bitmap is resized to the size of the ref column.

* [feature](Nereids): add ColumnPruningPostProcessor. (#32800)

* [case](rowpolicy)fix row policy has been exist (#32880)

* [fix](pipeline) fix use error row desc when origin block clear (#32803)

* [fix](Nereids) support variant column with index when create table (#32948)

* [opt](Nereids) support create table with variant type (#32953)

* [test](insert-overwrite) Add insert overwrite auto detect concurrency cases (#32935)

* [fix](compile) fe cannot compile in idea (#32955)

* [enhancement](plsql) Support select * from routines (#32866)

Support showing plsql procedures using select * from routines.

* [fix](trino-connector) fix `NoClassDefFoundError` of hudi `Utils` class (#32846)

Due to the change in PR #32455, the `trino-connector-scanner` package cannot access the `hudi_scanner` package, so a `NoClassDefFoundError` exception is thrown.

We need to write a separate Utils class.

* [exec](column) change some complex column move to noexcept (#32954)

* [Enhancement](data skew) extends show data skew (#32732)

* [chore](test) let suite compatible with Nereids (#32964)

* Support identical column name in different index. (#32792)

* Limit the max string length to 1024 while collecting column stats to control BE memory usage. (#32470)

* [fix](merge-iterator) fix NOT_IMPLEMENTED_ERROR when read next block view (#32961)

* [improvement](executor)Add tag property for workload group #32874

* [fix](auth)unified workload and resource permission logic (#32907)

- `Grant resource` can no longer grant global `usage_priv`
- Use `grant resource %` instead of `grant resource *`

before change:
```
grant usage_priv on resource * to f;
show grants for f\G
*************************** 1. row ***************************
      UserIdentity: 'f'@'%'
           Comment: 
          Password: No
             Roles: 
       GlobalPrivs: Usage_priv 
      CatalogPrivs: NULL
     DatabasePrivs: internal.information_schema: Select_priv ; internal.mysql: Select_priv 
        TablePrivs: NULL
          ColPrivs: NULL
     ResourcePrivs: NULL
 CloudClusterPrivs: NULL
WorkloadGroupPrivs: normal: Usage_priv 
```
after change:
```
grant usage_priv on resource '%' to f;
show grants for f\G
*************************** 1. row ***************************
      UserIdentity: 'f'@'%'
           Comment: 
          Password: No
             Roles: 
       GlobalPrivs: NULL
      CatalogPrivs: NULL
     DatabasePrivs: internal.information_schema: Select_priv ; internal.mysql: Select_priv 
        TablePrivs: NULL
          ColPrivs: NULL
     ResourcePrivs: %: Usage_priv 
 CloudClusterPrivs: NULL
WorkloadGroupPrivs: normal: Usage_priv 

```

---------

Co-authored-by: yujun <yu.jun.reach@gmail.com>
Co-authored-by: Gavin Chou <gavineaglechou@gmail.com>
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
Co-authored-by: yongjinhou <109586248+yongjinhou@users.noreply.github.com>
Co-authored-by: Dongyang Li <hello_stephen@qq.com>
Co-authored-by: stephen <hello-stephen@qq.com>
Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
Co-authored-by: seawinde <149132972+seawinde@users.noreply.github.com>
Co-authored-by: lihangyu <15605149486@163.com>
Co-authored-by: Yulei-Yang <yulei.yang0699@gmail.com>
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
Co-authored-by: wangbo <wangbo@apache.org>
Co-authored-by: Mingyu Chen <morningman@163.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: zhiqiang <seuhezhiqiang@163.com>
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
Co-authored-by: Vallish Pai <vallishpai@gmail.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Jensen <czjourney@163.com>
Co-authored-by: zhangdong <493738387@qq.com>
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
Co-authored-by: jakevin <jakevingoo@gmail.com>
Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
Co-authored-by: zclllyybb <zhaochangle@selectdb.com>
Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
Co-authored-by: Xin Liao <liaoxinbit@126.com>
@zclllyybb zclllyybb deleted the overwrite_cases branch April 11, 2024 07:30
zclllyybb added a commit to zclllyybb/doris that referenced this pull request Apr 11, 2024
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.3-merged reviewed
5 participants