[improve](txn insert) txn insert support update stmt #33034

Merged
merged 1 commit into from
Mar 29, 2024

@mymeiyi mymeiyi commented Mar 29, 2024

Proposed changes

Support `update` statements inside a transaction insert, for example:

begin;
insert into t1 select * from t2;
update t2 set ...;
commit;

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@mymeiyi

mymeiyi commented Mar 29, 2024

run buildall

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 38602 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e2af7919693ad4e0a7f7921099ed87cc2be87362, data reload: false

------ Round 1 ----------------------------------
q1	17609	5030	4229	4229
q2	2094	195	194	194
q3	10417	1201	1185	1185
q4	10201	753	744	744
q5	7459	2768	2802	2768
q6	219	129	131	129
q7	1039	591	605	591
q8	9209	1994	2083	1994
q9	9156	6584	6605	6584
q10	8662	3562	3520	3520
q11	452	239	235	235
q12	497	218	217	217
q13	19372	2990	2966	2966
q14	275	240	239	239
q15	520	478	460	460
q16	535	384	391	384
q17	982	611	643	611
q18	7234	6809	6749	6749
q19	1558	1491	1487	1487
q20	687	305	297	297
q21	3450	2863	2719	2719
q22	367	300	304	300
Total cold run time: 111994 ms
Total hot run time: 38602 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4275	4206	4217	4206
q2	373	273	265	265
q3	3006	2726	2688	2688
q4	1906	1510	1523	1510
q5	5311	5243	5288	5243
q6	204	120	120	120
q7	2247	1847	1904	1847
q8	3206	3322	3287	3287
q9	8616	8531	8637	8531
q10	3973	3797	3841	3797
q11	610	505	495	495
q12	790	632	630	630
q13	17689	3153	3183	3153
q14	301	278	281	278
q15	527	483	487	483
q16	517	452	450	450
q17	1813	1460	1483	1460
q18	8018	7955	7752	7752
q19	1636	1521	1569	1521
q20	2057	1818	1828	1818
q21	5162	5015	5068	5015
q22	538	467	456	456
Total cold run time: 72775 ms
Total hot run time: 55005 ms

@doris-robot

TPC-DS: Total hot run time: 182947 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e2af7919693ad4e0a7f7921099ed87cc2be87362, data reload: false

query1	1239	1121	1117	1117
query2	6409	1981	1902	1902
query3	6412	222	224	222
query4	32295	21529	21481	21481
query5	3922	415	420	415
query6	224	176	182	176
query7	3808	304	311	304
query8	227	180	188	180
query9	5434	2319	2298	2298
query10	471	248	268	248
query11	14851	14289	14313	14289
query12	149	106	91	91
query13	932	371	378	371
query14	9890	7072	7006	7006
query15	241	187	177	177
query16	7245	274	272	272
query17	1297	618	575	575
query18	1805	291	294	291
query19	214	169	163	163
query20	95	94	89	89
query21	200	133	135	133
query22	4989	4846	4894	4846
query23	33416	32756	32962	32756
query24	7094	2853	2872	2853
query25	564	394	422	394
query26	717	163	166	163
query27	2718	334	340	334
query28	4142	1873	1858	1858
query29	885	651	635	635
query30	300	156	157	156
query31	965	727	733	727
query32	65	57	59	57
query33	441	264	271	264
query34	858	489	501	489
query35	776	732	708	708
query36	1052	869	904	869
query37	113	72	72	72
query38	3508	3404	3468	3404
query39	1587	1557	1545	1545
query40	203	134	135	134
query41	50	48	49	48
query42	105	98	103	98
query43	488	454	459	454
query44	1075	725	722	722
query45	276	267	286	267
query46	1084	722	702	702
query47	1924	1838	1858	1838
query48	388	306	306	306
query49	876	387	378	378
query50	783	405	398	398
query51	6774	6688	6665	6665
query52	112	94	102	94
query53	351	281	290	281
query54	272	261	267	261
query55	88	84	83	83
query56	256	244	237	237
query57	1201	1152	1138	1138
query58	231	213	222	213
query59	2782	2645	2580	2580
query60	270	259	268	259
query61	118	111	111	111
query62	587	461	434	434
query63	317	291	289	289
query64	4991	4164	4186	4164
query65	3115	3042	3069	3042
query66	836	338	341	338
query67	15732	14705	15091	14705
query68	8983	545	552	545
query69	628	343	338	338
query70	1286	1201	1170	1170
query71	524	275	275	275
query72	6641	2649	2480	2480
query73	850	326	337	326
query74	7164	6495	6293	6293
query75	3956	2351	2364	2351
query76	5529	983	1088	983
query77	665	271	267	267
query78	11039	10299	10251	10251
query79	12496	527	532	527
query80	2245	444	449	444
query81	529	225	1071	225
query82	765	94	97	94
query83	224	174	176	174
query84	261	82	87	82
query85	958	275	272	272
query86	414	286	300	286
query87	3773	3572	3534	3534
query88	6143	2421	2412	2412
query89	522	379	378	378
query90	2087	181	184	181
query91	130	100	101	100
query92	66	48	49	48
query93	7489	519	503	503
query94	1274	183	185	183
query95	416	312	318	312
query96	614	263	276	263
query97	2686	2474	2455	2455
query98	236	226	209	209
query99	1141	821	817	817
Total cold run time: 298569 ms
Total hot run time: 182947 ms

@doris-robot

ClickBench: Total hot run time: 29.75 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e2af7919693ad4e0a7f7921099ed87cc2be87362, data reload: false

query1	0.05	0.04	0.03
query2	0.08	0.04	0.04
query3	0.23	0.06	0.06
query4	1.64	0.10	0.09
query5	0.54	0.49	0.49
query6	1.11	0.66	0.68
query7	0.02	0.01	0.01
query8	0.06	0.04	0.04
query9	0.55	0.49	0.50
query10	0.55	0.54	0.53
query11	0.15	0.09	0.11
query12	0.14	0.11	0.10
query13	0.62	0.58	0.59
query14	0.77	0.79	0.77
query15	0.83	0.82	0.81
query16	0.37	0.39	0.38
query17	1.00	1.03	0.98
query18	0.24	0.23	0.22
query19	1.74	1.68	1.66
query20	0.01	0.02	0.01
query21	15.59	0.67	0.65
query22	2.67	2.46	2.03
query23	16.96	0.94	0.79
query24	1.11	0.23	0.20
query25	0.09	0.08	0.09
query26	0.22	0.17	0.17
query27	0.08	0.07	0.08
query28	14.13	0.94	0.93
query29	12.49	3.25	3.20
query30	0.30	0.08	0.08
query31	2.84	0.38	0.38
query32	3.27	0.44	0.45
query33	2.77	2.88	2.87
query34	16.57	4.37	4.44
query35	4.44	4.47	4.44
query36	0.63	0.46	0.46
query37	0.17	0.14	0.14
query38	0.14	0.13	0.13
query39	0.05	0.03	0.03
query40	0.18	0.16	0.14
query41	0.08	0.04	0.05
query42	0.05	0.04	0.05
query43	0.05	0.04	0.03
Total cold run time: 105.58 s
Total hot run time: 29.75 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit e2af7919693ad4e0a7f7921099ed87cc2be87362 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       13.6 seconds inserted 10000000 Rows, about 735K ops/s

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 29, 2024

PR approved by at least one committer and no changes requested.

@dataroaring dataroaring merged commit 347b048 into apache:master Mar 29, 2024
28 of 31 checks passed
dataroaring pushed a commit that referenced this pull request Jun 5, 2024
## Proposed changes

### Purpose

The user doc:
https://doris.apache.org/zh-CN/docs/dev/data-operate/import/transaction-load-manual

Transaction load already supports insert into select (#31666),
update (#33034) and delete (#33100).

#32980 allows one txn to write more than one rowset to the same
partition.

This PR implements the cloud mode of #32980.

### Implementation

#### sub_txn_id

see #32980

#### Meta service supports commit txn

This process is generally the same as commit_txn. The difference is that
a partition's version is increased by 1 for each sub txn that writes to it.

One example:
Suppose the table, partition, tablet and version info is:
```
--------------------------------------------
| table | partition | tablet    | version |
--------------------------------------------
| t1    | t1_p1     | t1_p1.1   | 1       |
| t1    | t1_p1     | t1_p1.2   | 1       |
| t1    | t1_p2     | t1_p2.1   | 2       |
| t2    | t2_p3     | t2_p3.1   | 3       |
| t2    | t2_p4     | t2_p4.1   | 4       |
--------------------------------------------
```

Now we commit a txn with 3 sub txns and the tablets are:
 *  sub_txn1: t1_p1.1, t1_p1.2, t1_p2.1
 *  sub_txn2: t2_p3.1
 *  sub_txn3: t1_p1.1, t1_p1.2

During commit, the partition versions change as follows:
 *  sub_txn1: t1_p1(1 -> 2), t1_p2(2 -> 3)
 *  sub_txn2: t2_p3(3 -> 4)
 *  sub_txn3: t1_p1(2 -> 3)

After commit, the partition versions are:
 *  t1: t1_p1(3), t1_p2(3)
 *  t2: t2_p3(4), t2_p4(4)

#### Meta service supports generating sub_txn_id via `begin_sub_txn`
dataroaring pushed a commit that referenced this pull request Jun 7, 2024