
2.27.0.0-b361

@spolitov spolitov tagged this 22 Jul 13:55
Summary:
The transaction status resolver could get stuck in FailOnConflict/SnapshotTxnTest.DeleteOnErroredLoad/0 for two reasons.

1) When TransactionCoordinator::DoAbort is invoked with a callback specified, it forgets to execute postponed leader actions on exit.
Fixed by introducing CoordinatorLock, which should be used whenever postponed leader actions could be added. It automatically executes them during unlock.

2) During cluster shutdown, the transaction status resolver is not able to resolve statuses, because other nodes have already been shut down and there is no leader for some particular coordinator tablet.
Fixed by starting shutdown of `Rpcs` before the resolver, so RPCs used by the resolver are aborted in a non-retryable way.
Jira: DB-17651
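
Illustration of fix 1 (not the actual YugabyteDB code): a minimal sketch of a lock guard that runs postponed actions when it is released, so no return path can skip them. The names `Coordinator`, `Lock`, `Postpone`, `Abort`, and `executed` are hypothetical and chosen only for this example; the real CoordinatorLock may differ in structure and behavior.

```cpp
#include <atomic>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a coordinator; not the YugabyteDB class.
class Coordinator {
 public:
  using Action = std::function<void()>;

  // RAII guard: holds the mutex and, on release, runs every action that was
  // postponed while the mutex was held.
  class Lock {
   public:
    explicit Lock(Coordinator* coordinator)
        : coordinator_(coordinator), lock_(coordinator->mutex_) {}

    ~Lock() {
      // Take ownership of the queue while still holding the mutex.
      std::vector<Action> actions;
      actions.swap(coordinator_->postponed_);
      // Release the mutex first so that actions may re-acquire it.
      lock_.unlock();
      for (auto& action : actions) {
        action();  // Assumed not to throw in this sketch.
      }
    }

    // Must only be called while this guard is alive (mutex held).
    void Postpone(Action action) {
      coordinator_->postponed_.push_back(std::move(action));
    }

   private:
    Coordinator* coordinator_;
    std::unique_lock<std::mutex> lock_;
  };

  // Every return path, including the early one, runs the postponed action,
  // because the guard's destructor does it.
  int Abort(bool early_exit) {
    Lock lock(this);
    lock.Postpone([this] { ++executed_; });
    if (early_exit) {
      return 1;  // Early return: postponed action still runs.
    }
    return 0;
  }

  int executed() const { return executed_.load(); }

 private:
  std::mutex mutex_;
  std::vector<Action> postponed_;  // Guarded by mutex_.
  std::atomic<int> executed_{0};
};
```

Tying execution to the guard's destructor means a function with several exit points cannot forget the postponed work, which is the failure mode described in item 1.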

Test Plan: ./yb_build.sh tsan --cxx-test snapshot-txn-test --gtest_filter FailOnConflict/SnapshotTxnTest.DeleteOnErroredLoad/0 -n 100 -- -p 16

Reviewers: bkolagani

Reviewed By: bkolagani

Subscribers: ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D45493