[fix](iow) force drop partition in INSERT OVERWRITE #62510

Merged: starocean999 merged 1 commit into apache:master from morrySnow:opt-iow on May 14, 2026

@morrySnow
Contributor

What problem does this PR solve?

Problem Summary:

INSERT OVERWRITE is implemented on top of "replace partition". By default, a replaced partition is moved to the recycle bin. In the INSERT OVERWRITE scenario, however, the recycled partitions are of little use: restoring the table to its previous state from them is difficult, and a large number of recycled partitions significantly increases the load on the recycle bin. With this change, INSERT OVERWRITE therefore force-drops the replaced partitions instead of placing them in the recycle bin.
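
The difference between the two behaviours can be illustrated with a small toy model. This is only a sketch under stated assumptions: `Table`, `replace_partition`, `insert_overwrite`, and the `force` flag are hypothetical names rather than the Doris FE API, and the recycle bin is modelled as a plain list.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy model of a partitioned table; not the Doris catalog classes."""
    partitions: dict = field(default_factory=dict)   # partition name -> data
    recycle_bin: list = field(default_factory=list)  # (name, data) of recoverable partitions

    def replace_partition(self, name, new_data, force=False):
        """Swap in new data; keep the old partition recoverable unless force is set."""
        old = self.partitions.get(name)
        self.partitions[name] = new_data
        if old is not None and not force:
            self.recycle_bin.append((name, old))


def insert_overwrite(table, name, new_data):
    # Assumption: the behaviour described in this PR is modelled by force=True.
    table.replace_partition(name, new_data, force=True)


t = Table(partitions={"p1": "old rows"})
t.replace_partition("p1", "v2")  # plain replace: "old rows" goes to the recycle bin
insert_overwrite(t, "p1", "v3")  # overwrite: "v2" is dropped without being recycled
```

In this model only the partition replaced by the plain `replace_partition` call remains recoverable; the one replaced by `insert_overwrite` never reaches the recycle bin.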

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@morrySnow
Contributor Author

run buildall

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morrySnow changed the title from "[opt](iow) force drop partition in INSERT OVERWRITE" to "[fix](iow) force drop partition in INSERT OVERWRITE" on Apr 15, 2026
@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@morrySnow
Contributor Author

run buildall

@morrySnow
Contributor Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 29256 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9d72474d5512484b711d574832b8bcc9139984a4, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17625	3806	3808	3806
q2
q3	10709	856	606	606
q4	4661	467	347	347
q5	7437	1318	1122	1122
q6	186	169	138	138
q7	932	937	744	744
q8	9362	1367	1278	1278
q9	5614	5376	5312	5312
q10	6299	2074	1795	1795
q11	484	262	257	257
q12	690	412	292	292
q13	18195	3327	2764	2764
q14	292	285	263	263
q15
q16	906	874	791	791
q17	1045	997	768	768
q18	6375	5669	5455	5455
q19	1231	1215	1086	1086
q20	516	396	266	266
q21	4882	2249	1868	1868
q22	417	350	298	298
Total cold run time: 97858 ms
Total hot run time: 29256 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4161	4081	4068	4068
q2
q3	4637	4731	4194	4194
q4	2077	2163	1375	1375
q5	4951	4958	5221	4958
q6	187	164	130	130
q7	2041	1772	1783	1772
q8	3501	3210	3189	3189
q9	8457	8449	8393	8393
q10	4466	4483	4221	4221
q11	577	436	412	412
q12	704	765	538	538
q13	3139	3625	2896	2896
q14	298	297	272	272
q15
q16	778	797	682	682
q17	1349	1283	1235	1235
q18	8204	7183	7044	7044
q19	1180	1162	1145	1145
q20	2215	2269	1944	1944
q21	6049	5392	4850	4850
q22	533	490	427	427
Total cold run time: 59504 ms
Total hot run time: 53745 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171233 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9d72474d5512484b711d574832b8bcc9139984a4, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query5	4316	646	524	524
query6	329	225	195	195
query7	4245	536	295	295
query8	318	222	216	216
query9	8811	4002	4019	4002
query10	475	333	294	294
query11	6091	2432	2180	2180
query12	192	138	131	131
query13	1290	619	452	452
query14	6832	5374	5106	5106
query14_1	4399	4358	4370	4358
query15	208	206	197	197
query16	1041	460	438	438
query17	1386	784	647	647
query18	2755	494	361	361
query19	318	206	171	171
query20	146	138	143	138
query21	226	143	122	122
query22	13675	14112	14310	14112
query23	17344	16545	16187	16187
query23_1	16369	16366	16436	16366
query24	7811	1838	1366	1366
query24_1	1391	1322	1362	1322
query25	552	487	420	420
query26	1280	304	171	171
query27	2638	603	335	335
query28	4296	1951	1938	1938
query29	974	637	513	513
query30	311	232	195	195
query31	1095	1073	936	936
query32	96	72	71	71
query33	529	340	281	281
query34	1148	1127	643	643
query35	785	770	670	670
query36	1335	1374	1152	1152
query37	151	106	88	88
query38	3132	3143	3093	3093
query39	940	918	894	894
query39_1	867	863	871	863
query40	237	157	146	146
query41	67	60	60	60
query42	112	108	105	105
query43	317	329	280	280
query44	
query45	210	201	195	195
query46	1073	1215	735	735
query47	2336	2362	2235	2235
query48	415	427	294	294
query49	731	520	431	431
query50	709	289	213	213
query51	4335	4280	4214	4214
query52	105	102	96	96
query53	254	277	201	201
query54	309	263	248	248
query55	92	90	82	82
query56	304	314	316	314
query57	1421	1392	1306	1306
query58	303	279	266	266
query59	1551	1678	1428	1428
query60	360	328	323	323
query61	160	160	156	156
query62	666	618	575	575
query63	248	203	204	203
query64	2291	834	670	670
query65	
query66	1663	504	392	392
query67	30004	29898	29859	29859
query68	
query69	444	342	305	305
query70	1016	1002	1033	1002
query71	298	278	269	269
query72	3047	2739	2459	2459
query73	803	778	446	446
query74	5070	4903	4740	4740
query75	2773	2670	2339	2339
query76	2289	1140	768	768
query77	411	413	363	363
query78	12847	12937	12343	12343
query79	1530	983	710	710
query80	1355	568	480	480
query81	527	286	247	247
query82	986	161	119	119
query83	322	274	241	241
query84	269	149	111	111
query85	913	518	463	463
query86	440	314	348	314
query87	3421	3342	3237	3237
query88	3527	2655	2626	2626
query89	442	374	340	340
query90	1940	182	182	182
query91	182	166	142	142
query92	78	77	73	73
query93	1238	945	555	555
query94	744	330	297	297
query95	669	379	427	379
query96	995	759	332	332
query97	2697	2681	2594	2594
query98	235	228	224	224
query99	1137	1116	982	982
Total cold run time: 256111 ms
Total hot run time: 171233 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 9.09% (1/11) 🎉
Increment coverage report
Complete coverage report

@morrySnow
Contributor Author

/review

Contributor

@github-actions bot left a comment

Review result: request changes. The PR goal is to ensure INSERT OVERWRITE no longer leaves replaced partitions in the recycle bin, but the change only affects the non-auto-detect path. Auto-detect insert overwrite (PARTITION(*)) still uses InsertOverwriteManager.taskGroupSuccess(), which calls InsertOverwriteUtil.replacePartition(targetTable, oldNames, newNames) without force, so replaced partitions remain recoverable and continue to add load to the recycle bin.

Critical checkpoint conclusions:

  • Goal/test: The stated goal is not fully accomplished; no replacement test was added for the new non-recoverable behavior, and all existing recover-related tests were deleted.
  • Scope/clarity: The code change is small, but too narrow because it misses a parallel commit path.
  • Concurrency/lifecycle/config/compatibility: No new concurrency, lifecycle, config, or serialization compatibility risk found in the edited line.
  • Parallel paths: Blocking issue found in the auto-detect task-group path.
  • Tests: Coverage is insufficient; deleted tests should be replaced with tests that assert recovery fails after INSERT OVERWRITE for both normal and auto-detect paths.
  • Observability/performance/transaction/persistence: No additional issue found beyond the incorrect force-drop propagation.
  • User focus: No additional user-provided review focus was supplied.
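
The parallel-path gap and the kind of check requested above can be sketched in a few lines. Everything here is a hypothetical stand-in, not Doris code: `replace_partition`, `commit_direct`, and `commit_task_group` only mimic the shape of the direct path and the task-group path, and `recover` stands in for a partition recovery attempt.

```python
recycle_bin = {}  # partition name -> old data that is still recoverable


def replace_partition(table, old_names, new_data, force=False):
    """Hypothetical shared helper: replaced partitions stay recoverable unless force is set."""
    for name in old_names:
        old = table.pop(name, None)
        if old is not None and not force:
            recycle_bin[name] = old
        table[name] = new_data


def commit_direct(table, names, data):
    # Direct path: the overwrite always force-drops what it replaces.
    replace_partition(table, names, data, force=True)


def commit_task_group(table, names, data, force=True):
    # Auto-detect path, PARTITION(*). Passing force=False reproduces the reported gap.
    replace_partition(table, names, data, force=force)


def recover(name):
    """Stand-in for a recovery attempt; fails if nothing was recycled."""
    if name not in recycle_bin:
        raise LookupError(f"partition {name} is not in the recycle bin")
    return recycle_bin.pop(name)


def recoverable_after(commit, **kwargs):
    """Run one overwrite through `commit` and report whether the old data can be recovered."""
    recycle_bin.clear()
    table = {"p1": "old"}
    commit(table, ["p1"], "new", **kwargs)
    try:
        recover("p1")
        return True
    except LookupError:
        return False
```

A regression check along these lines would assert that `recoverable_after` is false for every commit path; calling `recoverable_after(commit_task_group, force=False)` shows how a path that omits the flag would be caught.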

@morrySnow
Contributor Author

run buildall

@morrySnow
Contributor Author

/review

Contributor

@github-actions bot left a comment

I found one blocking issue. The code path now propagates forceDropPartition through both direct and auto-detect insert-overwrite replacement paths, so the previously raised auto-detect implementation gap appears addressed. However, the PR removes the recycle-bin recovery regression tests without adding replacement coverage for the new behavior.

Critical checkpoint conclusions:

  • Goal/test proof: the implementation aims to keep INSERT OVERWRITE replaced partitions out of the recycle bin, but the PR does not include a regression test proving recover now fails or is impossible for direct, whole-table, unpartitioned, and auto-detect paths.
  • Scope: the code change is small and focused, but deleting the existing tests leaves the behavior unguarded.
  • Concurrency/lifecycle/config: no new concurrency, special lifecycle, or config concerns found beyond existing insert-overwrite manager behavior.
  • Compatibility/protocol: the added optional Thrift field is wire-compatible; no additional FE-BE variable path found.
  • Parallel paths: the prior direct vs auto-detect path gap is now handled in code, but needs coverage.
  • Test coverage/results: blocking gap; deleted tests are not replaced with negative/positive assertions for the new recycle-bin semantics.
  • Observability/transactions/persistence/performance: no additional issue found in the changed lines.

User focus: no additional user-provided review focus was supplied.

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.03% (1850/2371)
Line Coverage 64.69% (33228/51362)
Region Coverage 65.23% (16448/25214)
Branch Coverage 55.72% (8781/15758)

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.56% (20612/38482)
Line Coverage 37.20% (194865/523865)
Region Coverage 33.60% (152462/453707)
Branch Coverage 34.61% (66496/192116)

@hello-stephen
Contributor

TPC-H: Total hot run time: 29896 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f4c7d37f483af9a775b616508c542b369865baf6, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17632	3868	3840	3840
q2
q3	10654	889	611	611
q4	4657	476	345	345
q5	7428	1345	1161	1161
q6	191	169	143	143
q7	924	999	770	770
q8	9306	1407	1326	1326
q9	5713	5427	5415	5415
q10	6321	2068	1788	1788
q11	473	276	266	266
q12	634	428	294	294
q13	18091	3516	2790	2790
q14	292	281	266	266
q15
q16	907	881	798	798
q17	1012	1059	785	785
q18	6461	5640	5619	5619
q19	1247	1249	1083	1083
q20	516	424	264	264
q21	4588	2411	1988	1988
q22	479	433	344	344
Total cold run time: 97526 ms
Total hot run time: 29896 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4764	4719	4872	4719
q2
q3	4664	4831	4199	4199
q4	2171	2221	1428	1428
q5	5025	5005	5251	5005
q6	192	172	134	134
q7	2093	1830	1612	1612
q8	3348	3195	3157	3157
q9	8438	8479	8491	8479
q10	4467	4493	4277	4277
q11	606	426	393	393
q12	697	759	515	515
q13	3124	3534	2994	2994
q14	299	311	277	277
q15
q16	754	957	691	691
q17	1364	1291	1262	1262
q18	7933	7142	7158	7142
q19	1174	1175	1151	1151
q20	2232	2223	1972	1972
q21	6148	5444	4923	4923
q22	562	511	422	422
Total cold run time: 60055 ms
Total hot run time: 54752 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 172117 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f4c7d37f483af9a775b616508c542b369865baf6, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query5	4322	658	509	509
query6	336	230	211	211
query7	4390	535	308	308
query8	326	232	227	227
query9	8835	4128	4037	4037
query10	457	376	310	310
query11	5833	2387	2224	2224
query12	179	130	121	121
query13	1289	578	439	439
query14	6537	5412	5089	5089
query14_1	4408	4395	4367	4367
query15	217	209	186	186
query16	1028	466	460	460
query17	1360	811	611	611
query18	2729	482	355	355
query19	268	193	165	165
query20	138	131	126	126
query21	212	136	123	123
query22	13873	14817	14255	14255
query23	17501	16502	16278	16278
query23_1	16375	16244	16084	16084
query24	7414	1743	1353	1353
query24_1	1366	1329	1359	1329
query25	562	478	412	412
query26	1291	328	172	172
query27	2684	587	332	332
query28	4326	1973	1968	1968
query29	980	633	522	522
query30	304	246	197	197
query31	1117	1064	945	945
query32	85	75	73	73
query33	532	350	299	299
query34	1158	1115	644	644
query35	811	780	671	671
query36	1333	1311	1128	1128
query37	149	100	84	84
query38	3189	3075	3066	3066
query39	914	926	882	882
query39_1	895	884	890	884
query40	232	154	137	137
query41	65	68	59	59
query42	113	105	107	105
query43	324	335	290	290
query44	
query45	211	205	192	192
query46	1077	1214	744	744
query47	2285	2280	2131	2131
query48	398	433	304	304
query49	624	537	431	431
query50	715	291	222	222
query51	4296	4254	4237	4237
query52	104	106	93	93
query53	254	284	209	209
query54	319	278	258	258
query55	93	89	86	86
query56	290	298	301	298
query57	1396	1374	1309	1309
query58	287	273	272	272
query59	1534	1685	1412	1412
query60	350	335	327	327
query61	162	151	156	151
query62	670	619	565	565
query63	241	201	203	201
query64	2389	834	665	665
query65	
query66	1709	511	400	400
query67	30077	29979	29809	29809
query68	
query69	461	354	322	322
query70	1062	976	1019	976
query71	327	288	281	281
query72	3168	2874	2414	2414
query73	864	741	404	404
query74	5087	4922	4731	4731
query75	2776	2649	2312	2312
query76	2268	1124	768	768
query77	425	431	342	342
query78	13010	12944	12379	12379
query79	1475	1030	752	752
query80	713	605	482	482
query81	461	278	239	239
query82	1363	155	131	131
query83	353	267	248	248
query84	277	138	112	112
query85	841	497	434	434
query86	392	330	311	311
query87	3433	3354	3236	3236
query88	3537	2663	2711	2663
query89	438	379	338	338
query90	1883	191	184	184
query91	181	172	142	142
query92	78	79	70	70
query93	989	949	564	564
query94	559	338	301	301
query95	684	457	372	372
query96	1015	754	321	321
query97	2669	2692	2560	2560
query98	248	228	232	228
query99	1106	1115	976	976
Total cold run time: 255042 ms
Total hot run time: 172117 ms

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100% (0/0) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.70% (27772/37684)
Line Coverage 57.58% (300884/522510)
Region Coverage 54.85% (251308/458141)
Branch Coverage 56.35% (108671/192856)

@starocean999 merged commit b30fbf9 into apache:master on May 14, 2026
30 of 32 checks passed