
branch-3.0: [opt](compaction) Don't check missed rows in cumu compaction if input rowsets are not in tablet #45279 #45303

Merged
dataroaring merged 1 commit into branch-3.0 from auto-pick-45279-branch-3.0
Dec 19, 2024


@github-actions

Cherry-picked from #45279

… rowsets are not in tablet (#45279)

Problem Summary:

Suppose a heavy schema change on a BE is converting tablet A to
tablet B.
1. During the schema change double-write phase, new loads write
versions [X-Y] to tablet B.
2. Rowsets with versions [a],[a+1],...,[b-1],[b] on tablet B are picked
for cumu compaction (X<=a<b<=Y). (Cumu compaction on the new tablet
during schema change double write is allowed after #16470.)
3. Schema change removes all rowsets on tablet B before version
Z (b<=Z<=Y) before it begins to convert historical rowsets.
4. Schema change finishes.
5. Cumu compaction begins on the new tablet with versions [a],...,[b].
If there are duplicate keys between these rowsets, the compaction check
will fail, because delete bitmap calculation was skipped for these
rowsets in both the commit phase and the publish phase: tablet B was in
the NOT_READY state when they were written.
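
The failure in step 5 can be illustrated with a toy model. This is a minimal sketch, not Doris code: `ToyRowset`, `ToyDeleteBitmap`, and all three functions are hypothetical names, and it assumes the check boils down to comparing the number of rows merged away by compaction with the number of rows marked in the delete bitmap, which is a simplification.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types; none of these names come from Doris.
struct ToyRowset {
    int version;
    std::vector<std::string> keys;
};

// (rowset version, row index) pairs marked as deleted.
using ToyDeleteBitmap = std::set<std::pair<int, std::size_t>>;

// Marks the older copy of every duplicate key as deleted, as a load would
// on a ready tablet. For a NOT_READY tablet the calculation is skipped and
// the bitmap stays empty.
ToyDeleteBitmap calc_delete_bitmap(const std::vector<ToyRowset>& rowsets,
                                   bool tablet_ready) {
    assert(std::is_sorted(rowsets.begin(), rowsets.end(),
                          [](const ToyRowset& a, const ToyRowset& b) {
                              return a.version < b.version;
                          }));
    ToyDeleteBitmap bitmap;
    if (!tablet_ready) return bitmap;
    std::map<std::string, std::pair<int, std::size_t>> latest;  // key -> newest row
    for (const auto& rs : rowsets) {
        for (std::size_t i = 0; i < rs.keys.size(); ++i) {
            auto it = latest.find(rs.keys[i]);
            if (it != latest.end()) bitmap.insert(it->second);  // older copy dies
            latest[rs.keys[i]] = {rs.version, i};
        }
    }
    return bitmap;
}

// Number of rows a key-merging compaction drops because a newer copy exists.
std::size_t count_merged_rows(const std::vector<ToyRowset>& rowsets) {
    std::size_t total = 0;
    std::set<std::string> distinct;
    for (const auto& rs : rowsets) {
        total += rs.keys.size();
        distinct.insert(rs.keys.begin(), rs.keys.end());
    }
    return total - distinct.size();
}

// Simplified consistency check: every merged row must have been marked.
bool missed_rows_check_passes(const std::vector<ToyRowset>& rowsets,
                              bool tablet_ready) {
    return count_merged_rows(rowsets) ==
           calc_delete_bitmap(rowsets, tablet_ready).size();
}
```

With two rowsets that share one key, the check passes when `tablet_ready` is true and fails when it is false: one row is merged away, but nothing was ever marked in the bitmap.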

This PR makes cumu compaction skip the missed rows check when it finds
that the input rowsets no longer exist in the tablet.

Note that the compaction will still fail, because `Tablet::modify_rowsets`
checks whether the rowsets in `to_delete` still exist in the tablet's
`_rs_version_map`.
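
As a rough sketch of that behavior: the guard only decides whether the missed rows check runs, while the later rowset swap still rejects the compaction. Everything below is a hypothetical stand-in (`DemoTablet`, `DemoRowset`, `need_missed_rows_check`, and the simplified `modify_rowsets` are not the Doris signatures), and treating "exists" as pointer identity in the version map is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; the real Doris types differ.
using DemoVersion = std::pair<long, long>;  // [start, end]
struct DemoRowset {
    DemoVersion version;
};
using DemoRowsetPtr = std::shared_ptr<DemoRowset>;

struct DemoTablet {
    // Models the tablet's `_rs_version_map`: version -> live rowset.
    std::map<DemoVersion, DemoRowsetPtr> rs_version_map;

    // A rowset "exists" only if the map still holds this exact object.
    bool rowset_exists(const DemoRowsetPtr& rs) const {
        auto it = rs_version_map.find(rs->version);
        return it != rs_version_map.end() && it->second == rs;
    }

    // Models the existence check described for `Tablet::modify_rowsets`:
    // refuse to swap rowsets if any rowset in `to_delete` is gone.
    bool modify_rowsets(const DemoRowsetPtr& to_add,
                        const std::vector<DemoRowsetPtr>& to_delete) {
        assert(to_add != nullptr);
        for (const auto& rs : to_delete) {
            if (!rowset_exists(rs)) return false;
        }
        for (const auto& rs : to_delete) rs_version_map.erase(rs->version);
        rs_version_map[to_add->version] = to_add;
        return true;
    }
};

// Guard in the spirit of this PR: only run the missed rows check when every
// input rowset is still present in the tablet.
bool need_missed_rows_check(const DemoTablet& tablet,
                            const std::vector<DemoRowsetPtr>& inputs) {
    return std::all_of(inputs.begin(), inputs.end(),
                       [&](const DemoRowsetPtr& rs) { return tablet.rowset_exists(rs); });
}
```

If schema change has replaced the input rowsets in the map, `need_missed_rows_check` returns false, so the check is skipped, and `modify_rowsets` returns false without touching the map, so the stale compaction result is discarded.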
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring reopened this Dec 11, 2024
@doris-robot

run buildall

@github-actions

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 40679 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c1c6705be04248c1a7e781750806d61cd83c652a, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17560	7464	7371	7371
q2	2045	172	164	164
q3	10716	1072	1118	1072
q4	10537	752	725	725
q5	7754	2814	2764	2764
q6	236	148	146	146
q7	953	616	623	616
q8	9584	1971	2061	1971
q9	7788	6452	6452	6452
q10	6961	2301	2277	2277
q11	452	266	279	266
q12	402	227	220	220
q13	17762	2989	3023	2989
q14	240	228	207	207
q15	571	528	505	505
q16	655	598	620	598
q17	983	524	585	524
q18	7233	6610	6521	6521
q19	1378	1004	1097	1004
q20	461	195	189	189
q21	3970	3136	3148	3136
q22	1094	962	981	962
Total cold run time: 109335 ms
Total hot run time: 40679 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	7325	7368	7245	7245
q2	314	227	224	224
q3	2987	2887	2887	2887
q4	2059	1771	1837	1771
q5	5684	5673	5674	5673
q6	233	151	149	149
q7	2168	1773	1759	1759
q8	3325	3542	3427	3427
q9	8956	8940	8891	8891
q10	3572	3522	3563	3522
q11	620	502	505	502
q12	802	602	596	596
q13	16474	3131	3113	3113
q14	298	271	271	271
q15	575	526	524	524
q16	746	647	662	647
q17	1869	1663	1603	1603
q18	8315	7828	7521	7521
q19	5863	1602	1394	1394
q20	2072	1874	1844	1844
q21	5319	5230	5156	5156
q22	1173	1043	1042	1042
Total cold run time: 80749 ms
Total hot run time: 59761 ms

@doris-robot

TPC-DS: Total hot run time: 196624 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c1c6705be04248c1a7e781750806d61cd83c652a, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1251	959	902	902
query2	6238	2022	2009	2009
query3	10940	4379	4279	4279
query4	66428	29298	23339	23339
query5	5356	455	435	435
query6	446	169	175	169
query7	5664	308	303	303
query8	309	232	233	232
query9	9356	2686	2672	2672
query10	507	286	265	265
query11	17841	15293	15815	15293
query12	163	97	107	97
query13	1521	432	432	432
query14	10737	6998	7118	6998
query15	222	174	173	173
query16	6951	516	504	504
query17	1270	586	586	586
query18	1775	338	320	320
query19	214	158	155	155
query20	117	113	113	113
query21	66	45	44	44
query22	4799	4631	4827	4631
query23	34626	34162	34051	34051
query24	6276	3009	2808	2808
query25	540	400	390	390
query26	693	173	172	172
query27	1953	293	301	293
query28	4061	2529	2497	2497
query29	683	457	426	426
query30	249	163	156	156
query31	994	821	847	821
query32	65	61	56	56
query33	489	303	271	271
query34	900	513	510	510
query35	829	749	748	748
query36	1074	961	925	925
query37	125	70	77	70
query38	4149	3989	4027	3989
query39	1499	1437	1484	1437
query40	146	81	80	80
query41	48	46	45	45
query42	108	94	102	94
query43	537	490	497	490
query44	1172	804	797	797
query45	183	176	162	162
query46	1138	735	737	735
query47	2026	1896	1897	1896
query48	454	389	376	376
query49	723	373	366	366
query50	835	419	419	419
query51	7298	7253	7222	7222
query52	96	95	96	95
query53	257	185	184	184
query54	575	464	459	459
query55	75	74	78	74
query56	274	265	244	244
query57	1183	1110	1108	1108
query58	210	213	216	213
query59	3155	3109	2932	2932
query60	287	275	266	266
query61	132	129	124	124
query62	766	666	657	657
query63	209	187	201	187
query64	2004	774	724	724
query65	3286	3202	3164	3164
query66	721	301	293	293
query67	15773	15499	15425	15425
query68	4563	557	553	553
query69	457	264	256	256
query70	1135	1089	1140	1089
query71	440	260	255	255
query72	6577	3965	3909	3909
query73	759	348	341	341
query74	10277	8943	8953	8943
query75	3480	2616	2831	2616
query76	2670	1089	1117	1089
query77	517	268	264	264
query78	10781	9834	9531	9531
query79	9289	597	592	592
query80	1982	413	431	413
query81	555	238	234	234
query82	1348	119	118	118
query83	311	143	142	142
query84	287	81	80	80
query85	1720	297	287	287
query86	453	299	284	284
query87	4350	4292	4154	4154
query88	5568	2403	2412	2403
query89	550	290	290	290
query90	2127	185	185	185
query91	175	142	139	139
query92	67	49	48	48
query93	6616	541	528	528
query94	826	290	301	290
query95	341	252	257	252
query96	618	286	283	283
query97	3328	3181	3154	3154
query98	224	213	198	198
query99	1617	1302	1304	1302
Total cold run time: 339055 ms
Total hot run time: 196624 ms

@dataroaring merged commit 8fc26d5 into branch-3.0 Dec 19, 2024
@github-actions bot deleted the auto-pick-45279-branch-3.0 branch December 19, 2024 16:37