[fix](txn) make TransactionState.publishVersionTasks thread-safe#63170
Open
Larborator wants to merge 1 commit into
Open
[fix](txn) make TransactionState.publishVersionTasks thread-safe#63170Larborator wants to merge 1 commit into
Larborator wants to merge 1 commit into
Conversation
Contributor
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
When parallel publish version is enabled (Config.enable_parallel_publish_version), the per-transaction publishVersionTasks map is touched by both the master PUBLISH_VERSION daemon (iterate in tryFinishOneTxn) and the PUBLISH_VERSION_EXEC worker for that txn's db -- routed by `dbId % publish_thread_pool_num` to a single-thread pool, which iterates the map in tryFinishTxnSync and clears it via pruneAfterVisible after the txn becomes VISIBLE. The plain HashMap raises ConcurrentModificationException; that publish round aborts and the txn stays in COMMITTED. Backlog grows each round and eventually blocks new stream load vi max_running_txn_num_per_db (default 10000) or max_publishing_txn_num_per_table (default 500), whichever trips first. Switch the map to ConcurrentHashMap. The inner List does not need to be thread-safe: addPublishVersionTask is invoked by the master daemon only during the initial dispatch (guarded by hasSendTask) and runs strictly before any worker iteration, so there is no read-during-write on the list.
5783ff6 to
9ca2282
Compare
Contributor
|
run buildall |
Contributor
|
/review |
Contributor
|
PR approved by at least one committer and no changes requested. |
Contributor
|
PR approved by anyone and no changes requested. |
Contributor
There was a problem hiding this comment.
Review result: no blocking issues found.
Critical checkpoint conclusions:
- Goal and proof: The PR targets ConcurrentModificationException on TransactionState.publishVersionTasks during parallel publish. Changing the map to ConcurrentHashMap addresses the observed concurrent iteration/clear path. The PR includes manual test evidence, but no automated regression/unit test was added.
- Scope: The modification is small and focused on the shared map; no unrelated code is changed in the authoritative PR diff.
- Concurrency: The changed state is shared between the PUBLISH_VERSION daemon and PUBLISH_VERSION_EXEC workers. ConcurrentHashMap protects map iteration and clear from CME. The inner List remains non-concurrent, which is consistent with the current dispatch flow where additions happen before hasSendTask prevents redispatch and before worker-side finish iterations.
- Lifecycle/static initialization: No new static/global lifecycle risk introduced. Field initialization is safe for constructors; publishVersionTasks is not serialized because fields without @SerializedName are excluded during Gson serialization, so deserialization does not appear to overwrite this initializer from normal persisted images.
- Configuration: No new config added.
- Compatibility/persistence: No storage format, thrift, or edit-log format change. The field remains non-persisted runtime state.
- Parallel paths: Reviewed the publish generation, finish, and prune paths in PublishVersionDaemon and DatabaseTransactionMgr; no additional map instance needing the same change was found.
- Error handling/data correctness: Publish-finish semantics and transaction state transitions are unchanged; the change only prevents failure while observing task state.
- Observability: Existing publish error logs remain sufficient for this fix.
- Tests: Manual test only. An automated concurrency test would be useful but is not required for approval given the minimal container change.
User focus: No additional user-provided review focus was supplied.
Contributor
TPC-H: Total hot run time: 29636 ms |
Contributor
TPC-DS: Total hot run time: 170509 ms |
Contributor
FE UT Coverage ReportIncrement line coverage |
Contributor
FE Regression Coverage ReportIncrement line coverage |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
With parallel publish version (enable_parallel_publish_version=true) the per-transaction publishVersionTasks map is touched by both the master daemon thread (addPublishVersionTask, iterate in tryFinishOneTxn) and the PUBLISH_VERSION_EXEC worker threads (iterate in tryFinishTxnSync, clear in pruneAfterVisible). The plain HashMap raises ConcurrentModificationException; the affected publish silently fails and the txn stays in COMMITTED state. Backlog grows each round and eventually trips max_publishing_txn_num_per_table (default 500), blocking stream load on hot tables.
Switch the map to ConcurrentHashMap. The inner List does not need to be thread-safe: addPublishVersionTask runs at most once per txn during the initial dispatch (guarded by hasSendTask) and worker threads only iterate the list later, so there is no read-during-write on the list.
What problem does this PR solve?
Issue Number: close #63169
Related PR: N/A
Problem Summary:
When parallel publish version is enabled (
Config.enable_parallel_publish_version, defaulttruesince 4.0), the per-transactionpublishVersionTasksmap is touched concurrently by:PUBLISH_VERSIONdaemon, which iterates the map intryFinishOneTxn; andPUBLISH_VERSION_EXECworker for that txn's db -- routed bydbId % publish_thread_pool_numto a single-thread pool, which iterates the map intryFinishTxnSyncand clears it viapruneAfterVisibleafter the txn becomesVISIBLE.The plain
HashMapthrowsConcurrentModificationException. The CME is caught at the outer layer so FE does not crash, but that publish round aborts and the txn stays inCOMMITTED. Backlog grows each daemon round and eventually blocks new stream load viamax_running_txn_num_per_db(default 10000) ormax_publishing_txn_num_per_table(default 500),whichever trips first.
Sample stack trace from a production FE:
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1597)
at java.util.HashMap$EntryIterator.next(HashMap.java:1630)
at org.apache.doris.transaction.PublishVersionDaemon.tryFinishOneTxn(PublishVersionDaemon.java:191)
Fix: switch
publishVersionTaskstoConcurrentHashMap. The innerList<PublishVersionTask>does not need to be thread-safe --addPublishVersionTaskis invoked by the master daemon only during the initial dispatch intraverseReadyTxnAndDispatchPublishVersionTask(guarded byhasSendTask) and runs strictly before any worker iteration, so there is noread-during-write on the list.
Release note:
Fix ConcurrentModificationException on
TransactionState.publishVersionTasksunder parallel publish version, which could stall publish and eventually block stream load on hot dbs/tables.Check List (For Author)
Test
Manual test steps:
enable_parallel_publish_version = true(default in 4.0).tryFinishOneTxnin the same daemon round while previous-round workers are still runningpruneAfterVisible.fe.warn.logeventually showsConcurrentModificationExceptionwith the stackjava.util.HashMap$HashIterator.nextNode->PublishVersionDaemon.tryFinishOneTxn(the iteration aborts for that round; the txn is normally re-published in a later round).fe.warn.log.Behavior changed:
Does this need documentation?