
[fix](txn) make TransactionState.publishVersionTasks thread-safe#63170

Open
Larborator wants to merge 1 commit into apache:master from Larborator:fix/parallel-publish-version

@Larborator Larborator commented May 12, 2026

With parallel publish version (enable_parallel_publish_version=true) the per-transaction publishVersionTasks map is touched by both the master daemon thread (addPublishVersionTask, iterate in tryFinishOneTxn) and the PUBLISH_VERSION_EXEC worker threads (iterate in tryFinishTxnSync, clear in pruneAfterVisible). The plain HashMap raises ConcurrentModificationException; the affected publish silently fails and the txn stays in COMMITTED state. Backlog grows each round and eventually trips max_publishing_txn_num_per_table (default 500), blocking stream load on hot tables.

Switch the map to ConcurrentHashMap. The inner List does not need to be thread-safe: addPublishVersionTask runs at most once per txn during the initial dispatch (guarded by hasSendTask) and worker threads only iterate the list later, so there is no read-during-write on the list.

What problem does this PR solve?

Issue Number: close #63169

Related PR: N/A

Problem Summary:

When parallel publish version is enabled (Config.enable_parallel_publish_version, default true since 4.0), the per-transaction publishVersionTasks map is touched concurrently by:

  • the master PUBLISH_VERSION daemon, which iterates the map in tryFinishOneTxn; and
  • the PUBLISH_VERSION_EXEC worker for that txn's db -- routed by dbId % publish_thread_pool_num to a single-thread pool, which iterates the map in tryFinishTxnSync and clears it via pruneAfterVisible after the txn becomes VISIBLE.
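The routing described above can be sketched roughly as follows. This is only an illustration of the `dbId % publish_thread_pool_num` idea, not the Doris implementation: the class `PublishRoutingSketch` and its methods are hypothetical names, and the sketch assumes `dbId` is non-negative. It shows why txns of one db serialize on a single worker while the master daemon, being a separate thread, can still touch the same map at the same time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one single-thread pool per slot, chosen by dbId.
public class PublishRoutingSketch {
    private final List<ExecutorService> pools = new ArrayList<>();

    PublishRoutingSketch(int publishThreadPoolNum) {
        for (int i = 0; i < publishThreadPoolNum; i++) {
            pools.add(Executors.newSingleThreadExecutor());
        }
    }

    // Assumes dbId is non-negative, so plain % yields a valid index.
    static int poolIndex(long dbId, int publishThreadPoolNum) {
        return (int) (dbId % publishThreadPoolNum);
    }

    // All finish work for one db lands on the same single-thread pool,
    // so two workers never race on that db's txns. The daemon thread
    // that calls this method is still a second, concurrent party.
    Future<?> submitFinish(long dbId, Runnable finishTxn) {
        return pools.get(poolIndex(dbId, pools.size())).submit(finishTxn);
    }

    void shutdown() {
        for (ExecutorService pool : pools) {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        PublishRoutingSketch router = new PublishRoutingSketch(4);
        Future<?> done = router.submitFinish(10007L, () ->
                System.out.println("finish ran on " + Thread.currentThread().getName()));
        done.get();
        router.shutdown();
    }
}
```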

The plain HashMap throws ConcurrentModificationException. The CME is caught at the outer layer so FE does not crash, but that publish round aborts and the txn stays in COMMITTED. Backlog grows each daemon round and eventually blocks new stream load via max_running_txn_num_per_db (default 10000) or max_publishing_txn_num_per_table (default 500), whichever trips first.

Sample stack trace from a production FE:

java.util.ConcurrentModificationException
    at java.util.HashMap$HashIterator.nextNode(HashMap.java:1597)
    at java.util.HashMap$EntryIterator.next(HashMap.java:1630)
    at org.apache.doris.transaction.PublishVersionDaemon.tryFinishOneTxn(PublishVersionDaemon.java:191)
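The mechanism behind this stack trace can be reproduced in isolation. The sketch below is not Doris code: `IterationDuringClearDemo` is a hypothetical class, the task type is simplified to `String`, and the clear happens on the same thread so that the outcome is deterministic. A clear arriving from another thread trips the same fail-fast check in `HashMap`, just not reliably on every run.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo: clear a map while an iteration over it is in progress.
public class IterationDuringClearDemo {
    // Fills the map with three entries, then clears it after visiting the
    // first entry. Returns true if the iteration hit
    // ConcurrentModificationException.
    static boolean iterateWhileClearing(Map<Long, List<String>> tasks) {
        for (long backendId = 1; backendId <= 3; backendId++) {
            List<String> perBackend = new ArrayList<>();
            perBackend.add("task-" + backendId);
            tasks.put(backendId, perBackend);
        }
        try {
            int seen = 0;
            for (Map.Entry<Long, List<String>> entry : tasks.entrySet()) {
                seen++;
                if (seen == 1) {
                    tasks.clear(); // structural modification mid-iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap threw CME: "
                + iterateWhileClearing(new HashMap<>()));
        System.out.println("ConcurrentHashMap threw CME: "
                + iterateWhileClearing(new ConcurrentHashMap<>()));
    }
}
```

`HashMap` iterators are fail-fast and check the modification count on every `next()`, whereas `ConcurrentHashMap` iterators are weakly consistent and are documented never to throw `ConcurrentModificationException`.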

Fix: switch publishVersionTasks to ConcurrentHashMap. The inner List<PublishVersionTask> does not need to be thread-safe -- addPublishVersionTask is invoked by the master daemon only during the initial dispatch in traverseReadyTxnAndDispatchPublishVersionTask (guarded by hasSendTask) and runs strictly before any worker iteration, so there is no read-during-write on the list.
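The shape of the fix might look roughly like the sketch below. It is a simplified stand-in rather than the actual diff: `TxnStateSketch`, `dispatchOnce`, and `countTasks` are hypothetical names, the task type is reduced to `String`, and only the field names `publishVersionTasks` and `hasSendTask` and the methods `addPublishVersionTask` and `pruneAfterVisible` are taken from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for TransactionState.
public class TxnStateSketch {
    // Outer map is concurrent; inner lists stay plain ArrayLists because
    // they are only written during the one-time dispatch.
    private final Map<Long, List<String>> publishVersionTasks = new ConcurrentHashMap<>();
    // In this sketch only the dispatching (daemon) thread touches the flag.
    private boolean hasSendTask = false;

    void addPublishVersionTask(long backendId, String task) {
        publishVersionTasks.computeIfAbsent(backendId, k -> new ArrayList<>()).add(task);
    }

    // Dispatches at most once per txn; later calls are no-ops.
    boolean dispatchOnce(Map<Long, String> taskByBackend) {
        if (hasSendTask) {
            return false;
        }
        for (Map.Entry<Long, String> e : taskByBackend.entrySet()) {
            addPublishVersionTask(e.getKey(), e.getValue());
        }
        hasSendTask = true;
        return true;
    }

    // Read path used by both the daemon and the worker in this sketch.
    int countTasks() {
        int total = 0;
        for (List<String> perBackend : publishVersionTasks.values()) {
            total += perBackend.size();
        }
        return total;
    }

    // Worker-side cleanup once the txn is VISIBLE.
    void pruneAfterVisible() {
        publishVersionTasks.clear();
    }

    public static void main(String[] args) {
        TxnStateSketch txn = new TxnStateSketch();
        Map<Long, String> tasks = new HashMap<>();
        tasks.put(1L, "publish-on-be-1");
        tasks.put(2L, "publish-on-be-2");
        System.out.println("first dispatch: " + txn.dispatchOnce(tasks));
        System.out.println("second dispatch: " + txn.dispatchOnce(tasks));
        System.out.println("tasks: " + txn.countTasks());
    }
}
```

The design choice is to pay for thread safety only where concurrent mutation actually happens (the outer map), and to rely on the dispatch-before-iterate ordering for the inner lists.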

Release note:

Fix ConcurrentModificationException on TransactionState.publishVersionTasks under parallel publish version, which could stall publish and eventually block stream load on hot dbs/tables.

Check List (For Author)

  • Test

    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.

    Manual test steps:

    1. Start a cluster with enable_parallel_publish_version = true (default in 4.0).
    2. Sustain concurrent stream load across many dbs/tables so that multiple txns enter tryFinishOneTxn in the same daemon round while previous-round workers are still running pruneAfterVisible.
    3. Before the fix: fe.warn.log eventually shows ConcurrentModificationException with the stack java.util.HashMap$HashIterator.nextNode -> PublishVersionDaemon.tryFinishOneTxn (the iteration aborts for that round; the txn is normally re-published in a later round).
    4. After the fix: under the same workload the CME no longer appears in fe.warn.log.
  • Behavior changed:

    • No.
  • Does this need documentation?

    • No.

@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

When parallel publish version is enabled (Config.enable_parallel_publish_version), the per-transaction publishVersionTasks map is touched by both the master PUBLISH_VERSION daemon (which iterates the map in tryFinishOneTxn) and the PUBLISH_VERSION_EXEC worker for that txn's db -- routed by `dbId % publish_thread_pool_num` to a single-thread pool, which iterates the map in tryFinishTxnSync and clears it via pruneAfterVisible after the txn becomes VISIBLE. The plain HashMap raises ConcurrentModificationException; that publish round aborts and the txn stays in COMMITTED. Backlog grows each round and eventually blocks new stream load via max_running_txn_num_per_db (default 10000) or max_publishing_txn_num_per_table (default 500), whichever trips first.

Switch the map to ConcurrentHashMap. The inner List does not need to be thread-safe: addPublishVersionTask is invoked by the master daemon only during the initial dispatch (guarded by hasSendTask) and runs strictly before any worker iteration, so there is no read-during-write on the list.
@Larborator Larborator force-pushed the fix/parallel-publish-version branch from 5783ff6 to 9ca2282 Compare May 13, 2026 08:36
@liaoxin01

run buildall


@liaoxin01 liaoxin01 left a comment


LGTM

@liaoxin01

/review

@github-actions github-actions Bot added the approved label (Indicates a PR has been approved by one committer.) May 15, 2026
@github-actions

PR approved by at least one committer and no changes requested.

@github-actions

PR approved by anyone and no changes requested.


@github-actions github-actions Bot left a comment


Review result: no blocking issues found.

Critical checkpoint conclusions:

  • Goal and proof: The PR targets ConcurrentModificationException on TransactionState.publishVersionTasks during parallel publish. Changing the map to ConcurrentHashMap addresses the observed concurrent iteration/clear path. The PR includes manual test evidence, but no automated regression/unit test was added.
  • Scope: The modification is small and focused on the shared map; no unrelated code is changed in the authoritative PR diff.
  • Concurrency: The changed state is shared between the PUBLISH_VERSION daemon and PUBLISH_VERSION_EXEC workers. ConcurrentHashMap protects map iteration and clear from CME. The inner List remains non-concurrent, which is consistent with the current dispatch flow where additions happen before hasSendTask prevents redispatch and before worker-side finish iterations.
  • Lifecycle/static initialization: No new static/global lifecycle risk introduced. Field initialization is safe for constructors; publishVersionTasks is not serialized because fields without @SerializedName are excluded during Gson serialization, so deserialization does not appear to overwrite this initializer from normal persisted images.
  • Configuration: No new config added.
  • Compatibility/persistence: No storage format, thrift, or edit-log format change. The field remains non-persisted runtime state.
  • Parallel paths: Reviewed the publish generation, finish, and prune paths in PublishVersionDaemon and DatabaseTransactionMgr; no additional map instance needing the same change was found.
  • Error handling/data correctness: Publish-finish semantics and transaction state transitions are unchanged; the change only prevents failure while observing task state.
  • Observability: Existing publish error logs remain sufficient for this fix.
  • Tests: Manual test only. An automated concurrency test would be useful but is not required for approval given the minimal container change.
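An automated check along the lines suggested in the last bullet could look something like the sketch below. It is not part of the PR and does not use Doris classes: `PublishTasksRaceCheck` and `countIterationFailures` are hypothetical names, and the task type is simplified to `String`. One thread iterates the map repeatedly while another keeps adding entries and clearing them, and the method reports how many iteration rounds failed with a runtime exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical race check: reader iterates while writer puts and clears.
public class PublishTasksRaceCheck {
    static int countIterationFailures(Map<Long, List<String>> tasks, int rounds) {
        AtomicInteger failures = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);

        Thread reader = new Thread(() -> {
            awaitQuietly(start);
            for (int i = 0; i < rounds; i++) {
                try {
                    for (List<String> perBackend : tasks.values()) {
                        perBackend.size(); // touch each entry, like a finish check
                    }
                } catch (RuntimeException e) {
                    failures.incrementAndGet(); // e.g. a CME from a plain HashMap
                }
            }
        });

        Thread writer = new Thread(() -> {
            awaitQuietly(start);
            for (int i = 0; i < rounds; i++) {
                for (long backendId = 0; backendId < 8; backendId++) {
                    // Each list is fully built before it is published via put.
                    List<String> perBackend = new ArrayList<>();
                    perBackend.add("task");
                    tasks.put(backendId, perBackend);
                }
                tasks.clear(); // stands in for pruneAfterVisible
            }
        });

        reader.start();
        writer.start();
        start.countDown(); // release both threads together
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return failures.get();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int failures = countIterationFailures(new ConcurrentHashMap<>(), 2000);
        System.out.println("iteration failures with ConcurrentHashMap: " + failures);
    }
}
```

With a `ConcurrentHashMap` the failure count is always zero, because its iterators never throw `ConcurrentModificationException`. Passing a plain `HashMap` instead is unsafe and its behavior is nondeterministic, so a real regression test would assert only on the concurrent variant.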

User focus: No additional user-provided review focus was supplied.

@hello-stephen

TPC-H: Total hot run time: 29636 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9ca228248f86ad8382442ed2e62fa28a82981443, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17712	3973	4008	3973
q2	q3	10717	871	642	642
q4	4659	447	341	341
q5	7438	1329	1126	1126
q6	188	172	146	146
q7	915	951	759	759
q8	9295	1380	1237	1237
q9	5632	5385	5329	5329
q10	6298	2056	1818	1818
q11	467	259	248	248
q12	682	414	294	294
q13	18219	3268	2767	2767
q14	291	283	262	262
q15	q16	900	853	786	786
q17	899	1014	725	725
q18	6463	5664	5560	5560
q19	1223	1259	1103	1103
q20	515	383	255	255
q21	5037	2323	1937	1937
q22	451	384	328	328
Total cold run time: 98001 ms
Total hot run time: 29636 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4859	4564	4588	4564
q2	q3	4679	4779	4189	4189
q4	2124	2170	1385	1385
q5	4953	5033	5274	5033
q6	203	171	139	139
q7	2030	1845	1604	1604
q8	3301	3055	3088	3055
q9	8493	8438	8360	8360
q10	4458	4463	4247	4247
q11	601	429	423	423
q12	697	765	526	526
q13	3260	3588	3010	3010
q14	296	312	276	276
q15	q16	764	825	686	686
q17	1370	1346	1264	1264
q18	7883	7074	7128	7074
q19	1163	1171	1157	1157
q20	2248	2225	1990	1990
q21	6051	5397	4876	4876
q22	522	468	399	399
Total cold run time: 59955 ms
Total hot run time: 54257 ms

@hello-stephen

TPC-DS: Total hot run time: 170509 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9ca228248f86ad8382442ed2e62fa28a82981443, data reload: false

query5	4313	640	527	527
query6	342	213	201	201
query7	4273	542	308	308
query8	325	227	223	223
query9	8859	3980	3961	3961
query10	445	346	305	305
query11	5859	2423	2204	2204
query12	180	125	122	122
query13	1277	573	437	437
query14	6646	5305	5023	5023
query14_1	4335	4324	4303	4303
query15	205	208	182	182
query16	982	399	425	399
query17	1119	737	620	620
query18	2759	488	350	350
query19	217	207	170	170
query20	138	131	131	131
query21	215	141	116	116
query22	13606	13466	13332	13332
query23	17108	16382	15937	15937
query23_1	16131	16166	16109	16109
query24	7493	1759	1359	1359
query24_1	1341	1351	1364	1351
query25	586	533	464	464
query26	1311	309	170	170
query27	2720	559	343	343
query28	4463	1918	1966	1918
query29	1005	665	545	545
query30	307	232	201	201
query31	1120	1076	966	966
query32	93	77	77	77
query33	561	366	295	295
query34	1156	1099	639	639
query35	759	792	654	654
query36	1313	1355	1179	1179
query37	153	108	95	95
query38	3211	3196	3046	3046
query39	923	940	916	916
query39_1	925	919	881	881
query40	241	162	142	142
query41	79	67	68	67
query42	114	116	112	112
query43	333	335	300	300
query44	
query45	212	208	202	202
query46	1109	1250	751	751
query47	2320	2429	2192	2192
query48	383	430	303	303
query49	646	563	437	437
query50	703	291	223	223
query51	4349	4260	4198	4198
query52	107	108	94	94
query53	250	275	208	208
query54	323	292	272	272
query55	101	96	99	96
query56	330	309	327	309
query57	1420	1420	1347	1347
query58	327	290	270	270
query59	1556	1612	1423	1423
query60	398	338	317	317
query61	160	151	159	151
query62	669	615	560	560
query63	237	201	208	201
query64	2358	806	675	675
query65	
query66	1693	519	395	395
query67	29977	29940	29739	29739
query68	
query69	451	333	304	304
query70	1026	1019	990	990
query71	297	262	264	262
query72	2945	2690	2450	2450
query73	829	766	424	424
query74	5060	4867	4723	4723
query75	2763	2710	2339	2339
query76	2287	1106	781	781
query77	413	424	346	346
query78	12938	12916	12238	12238
query79	1488	956	731	731
query80	666	582	471	471
query81	450	278	240	240
query82	1268	156	121	121
query83	352	282	253	253
query84	265	146	111	111
query85	842	527	436	436
query86	385	343	319	319
query87	3451	3350	3198	3198
query88	3527	2671	2657	2657
query89	434	381	334	334
query90	1941	183	192	183
query91	175	160	141	141
query92	74	78	73	73
query93	955	974	562	562
query94	524	342	293	293
query95	667	464	366	366
query96	1064	794	351	351
query97	2714	2723	2561	2561
query98	233	226	225	225
query99	1116	1128	993	993
Total cold run time: 253727 ms
Total hot run time: 170509 ms

@hello-stephen

FE UT Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@hello-stephen

FE Regression Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report


Labels

approved (Indicates a PR has been approved by one committer.), dev/4.0.x, dev/4.1.x, reviewed


Development

Successfully merging this pull request may close these issues.

[Bug] ConcurrentModificationException on TransactionState.publishVersionTasks under parallel publish version

3 participants