[fix](paimon) Pin snapshot ID for Paimon external table queries to avoid compaction issues#63406

Draft
suxiaogang223 wants to merge 1 commit into apache:master from suxiaogang223:fix-paimon-compaction-opensource-263

@suxiaogang223
Member

suxiaogang223 commented May 19, 2026

What problem does this PR solve?

Issue Number: close #xxx

Related PR: None

Problem Summary:
For normal queries, Doris previously used an unpinned Paimon table handle and relied on the Paimon SDK to discover the latest snapshot. If a compaction committed a new snapshot while the query was running, the SDK could fail to resolve that snapshot correctly, and the query could return empty results.

This change modifies PaimonExternalTable to always use the pinned table from the snapshot cache, ensuring consistency across all query types (Normal, MTMV, and MVCC).
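
To illustrate the difference between an unpinned and a pinned handle, here is a minimal sketch in plain Java. It is a toy model, not the Doris or Paimon code: `ToyTable`, `unpinnedPhases`, and `pinnedRead` are hypothetical names made up for this example, and the "compaction" is just another commit that keeps the same rows. It only shows that resolving "latest" more than once can yield different snapshot IDs within one query, while resolving it once keeps every phase on the same snapshot.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only: these are NOT Doris or Paimon classes. */
final class SnapshotPinningSketch {

    /** Hypothetical table: maps a snapshot ID to the rows visible at that snapshot. */
    static final class ToyTable {
        private final Map<Long, List<String>> snapshots = new ConcurrentHashMap<>();
        private long latestId = 0;

        /** Commits a new snapshot (a write or a compaction) and returns its ID. */
        synchronized long commit(List<String> rows) {
            latestId++;
            snapshots.put(latestId, List.copyOf(rows));
            return latestId;
        }

        synchronized long latestSnapshotId() {
            return latestId;
        }

        List<String> rowsAt(long snapshotId) {
            List<String> rows = snapshots.get(snapshotId);
            if (rows == null) {
                throw new IllegalArgumentException("unknown snapshot " + snapshotId);
            }
            return rows;
        }
    }

    /** Unpinned: each phase asks for "latest" again, so the two phases can disagree. */
    static long[] unpinnedPhases(ToyTable table, Runnable concurrentCommit) {
        long planned = table.latestSnapshotId();   // planning resolves "latest"
        concurrentCommit.run();                    // e.g. a compaction commits here
        long scanned = table.latestSnapshotId();   // scanning resolves "latest" again
        return new long[] {planned, scanned};
    }

    /** Pinned: the snapshot ID is resolved once and reused for the actual read. */
    static List<String> pinnedRead(ToyTable table, Runnable concurrentCommit) {
        long pinned = table.latestSnapshotId();    // resolve once, then hold on to it
        concurrentCommit.run();                    // later commits do not move the pin
        return table.rowsAt(pinned);
    }

    public static void main(String[] args) {
        ToyTable table = new ToyTable();
        table.commit(List.of("a", "b"));

        // In this toy model a compaction is a new snapshot with the same logical rows.
        long[] ids = unpinnedPhases(table, () -> table.commit(List.of("a", "b")));
        System.out.println("unpinned: plan=" + ids[0] + " scan=" + ids[1]);

        // A concurrent write adds a row; the pinned read still sees the old snapshot.
        List<String> rows = pinnedRead(table, () -> table.commit(List.of("a", "b", "c")));
        System.out.println("pinned read: " + rows);
    }
}
```

In the sketch the pin is taken at the start of the query; in this PR the pinned table comes from the snapshot cache instead, which the toy model does not attempt to reproduce.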

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This logic is consistent with the existing MTMV path which is already verified.
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

…oid compaction issues

Jira: OPENSOURCE-263
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@suxiaogang223
Member Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 30892 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4a00013b0fe76ffd30a57752e54b5dcea7371fa7, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17677	3844	3861	3844
q2	
q3	10815	1339	807	807
q4	4687	471	349	349
q5	7717	2264	2151	2151
q6	236	176	136	136
q7	926	768	657	657
q8	9403	1749	1572	1572
q9	5437	4897	4895	4895
q10	6384	2063	1786	1786
q11	435	269	247	247
q12	667	435	294	294
q13	18116	3381	2781	2781
q14	262	253	233	233
q15	
q16	815	770	705	705
q17	937	990	1003	990
q18	6899	5938	5491	5491
q19	1175	1251	1016	1016
q20	509	398	263	263
q21	5793	2608	2371	2371
q22	434	356	304	304
Total cold run time: 99324 ms
Total hot run time: 30892 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4136	4082	4075	4075
q2	
q3	4537	4916	4292	4292
q4	2088	2184	1408	1408
q5	4402	4245	4285	4245
q6	222	173	128	128
q7	1749	2218	1765	1765
q8	2508	2102	2141	2102
q9	7862	8099	7622	7622
q10	4531	4483	4033	4033
q11	551	411	372	372
q12	732	725	522	522
q13	3254	3593	2897	2897
q14	312	299	283	283
q15	
q16	736	735	621	621
q17	1312	1309	1284	1284
q18	8105	7490	7218	7218
q19	1154	1139	1095	1095
q20	2209	2212	1925	1925
q21	5273	4600	4393	4393
q22	535	462	399	399
Total cold run time: 56208 ms
Total hot run time: 50679 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 170723 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4a00013b0fe76ffd30a57752e54b5dcea7371fa7, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query5	4324	665	514	514
query6	330	234	207	207
query7	4313	570	312	312
query8	335	242	222	222
query9	8855	4148	4107	4107
query10	455	343	306	306
query11	5811	2515	2236	2236
query12	182	132	129	129
query13	1274	585	433	433
query14	5984	5376	5083	5083
query14_1	4386	4396	4355	4355
query15	211	209	185	185
query16	1027	472	413	413
query17	1164	737	620	620
query18	2753	488	369	369
query19	224	205	185	185
query20	141	134	132	132
query21	218	145	124	124
query22	13693	13588	13409	13409
query23	17315	16430	16043	16043
query23_1	16174	16196	16298	16196
query24	7451	1748	1312	1312
query24_1	1322	1317	1335	1317
query25	602	520	447	447
query26	1305	318	178	178
query27	2684	581	341	341
query28	4461	1990	1960	1960
query29	1030	646	529	529
query30	325	236	203	203
query31	1123	1077	947	947
query32	94	79	76	76
query33	556	371	317	317
query34	1180	1148	655	655
query35	798	773	679	679
query36	1320	1313	1146	1146
query37	145	104	92	92
query38	3235	3139	3067	3067
query39	943	928	897	897
query39_1	873	887	873	873
query40	235	146	130	130
query41	65	63	62	62
query42	116	111	108	108
query43	329	324	284	284
query44	
query45	211	207	199	199
query46	1048	1179	727	727
query47	2283	2304	2159	2159
query48	391	388	283	283
query49	640	498	384	384
query50	1023	357	242	242
query51	4345	4278	4251	4251
query52	107	106	95	95
query53	265	301	209	209
query54	312	276	256	256
query55	96	94	86	86
query56	299	295	304	295
query57	1409	1415	1349	1349
query58	301	275	271	271
query59	1545	1638	1418	1418
query60	328	329	318	318
query61	161	162	156	156
query62	677	620	570	570
query63	239	211	214	211
query64	2406	811	661	661
query65	
query66	1715	480	356	356
query67	30109	29844	29788	29788
query68	
query69	471	334	311	311
query70	1040	976	907	907
query71	311	280	270	270
query72	2958	2746	2445	2445
query73	851	749	418	418
query74	5073	4897	4741	4741
query75	2681	2583	2258	2258
query76	2314	1157	803	803
query77	402	414	339	339
query78	12075	12057	11563	11563
query79	1481	1017	763	763
query80	928	536	485	485
query81	507	284	241	241
query82	1364	163	125	125
query83	357	280	263	263
query84	264	142	110	110
query85	934	556	465	465
query86	446	366	320	320
query87	3458	3321	3208	3208
query88	3518	2675	2673	2673
query89	448	396	342	342
query90	1791	182	183	182
query91	185	169	142	142
query92	82	81	78	78
query93	1487	1473	927	927
query94	637	340	303	303
query95	687	494	344	344
query96	1059	767	332	332
query97	2724	2706	2579	2579
query98	241	229	237	229
query99	1110	1152	985	985
Total cold run time: 254079 ms
Total hot run time: 170723 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/105) 🎉
Increment coverage report
Complete coverage report
