docdb: fix deadlock issue with RepartitionTable #10304
Comments
Actually, holding
Here is an outline of the deadlock. t1 is RepartitionTable; t2 is ProcessTabletReportBatch.
t1: table->LockForWrite
I'm not sure how to solve this. Grabbing
t2 could LockForWrite instead, but that's only for deadlock purposes (since there can only be one writer at a time), so it's hacky.
Summary: Since commit b14485a ([#8229] backup: repartition table if needed on YSQL restore), there is a rare deadlock between ProcessTabletReportBatch and RepartitionTable. It can be hit when:
- thread 1 (RepartitionTable): imports a YSQL snapshot where the number of tablets for a table mismatches between the cluster and the external snapshot
- thread 2 (ProcessTabletReportBatch): processes tablets of that table from a heartbeat

A more in-depth sequence of steps follows:
1. t1: table->LockForWrite (WriteLock)
2. t2: table->LockForRead (ReadLock)
3. t1: tablet->StartMutation (WriteLock)
4. t1: table_lock.Commit (UpgradeToCommitLock; blocks on t2 ReadLock)
5. t2: tablet->LockForWrite (WriteLock; blocks on t1 WriteLock)

To fix, have ProcessTabletReportBatch take the table write lock instead of the read lock. The table metadata isn't mutated, so this is purely for deadlock avoidance (since only one writer is allowed at a time). Bogdan thinks we should expect the table write lock to be taken whenever the tablet write lock is taken.

Original Commit: cc90f01
Original Differential Revision: https://phabricator.dev.yugabyte.com/D13459

Test Plan:
./yb_build.sh \
  --cxx-test tools_yb-backup-test_ent \
  --gtest_filter YBBackupTest.TestYSQLChangeDefaultNumTablets \
  -n 1000 \
  --tp 1 \
  fastdebug

Reviewers: nicolas
Reviewed By: nicolas
Subscribers: bogdan
Differential Revision: https://phabricator.dev.yugabyte.com/D13730
Summary: Since commit b14485a ([#8229] backup: repartition table if needed on YSQL restore), there is a rare deadlock between ProcessTabletReportBatch and RepartitionTable. It can be hit when:
- thread 1 (RepartitionTable): imports a YSQL snapshot where the number of tablets for a table mismatches between the cluster and the external snapshot
- thread 2 (ProcessTabletReportBatch): processes tablets of that table from a heartbeat

A more in-depth sequence of steps follows:
1. t1: table->LockForWrite (WriteLock)
2. t2: table->LockForRead (ReadLock)
3. t1: tablet->StartMutation (WriteLock)
4. t1: table_lock.Commit (UpgradeToCommitLock; blocks on t2 ReadLock)
5. t2: tablet->LockForWrite (WriteLock; blocks on t1 WriteLock)

To fix, have ProcessTabletReportBatch take the table write lock instead of the read lock. The table metadata isn't mutated, so this is purely for deadlock avoidance (since only one writer is allowed at a time). Bogdan thinks we should expect the table write lock to be taken whenever the tablet write lock is taken.

For the backport to 2.6, since commit afd8775 ([#9182] Fix return from CatalogManager::ProcessTabletReport) is not present, make the changes to ProcessTabletReport instead.

Depends on D13851

Original Commit: cc90f01
Original Differential Revision: https://phabricator.dev.yugabyte.com/D13459

Test Plan:
./yb_build.sh \
  --cxx-test tools_yb-backup-test_ent \
  --gtest_filter YBBackupTest.TestYSQLChangeDefaultNumTablets \
  -n 1000 \
  --tp 1 \
  fastdebug

Reviewers: nicolas
Reviewed By: nicolas
Subscribers: bogdan, ybase
Differential Revision: https://phabricator.dev.yugabyte.com/D13852
RepartitionTable may get into a deadlock on the table rwc lock if ProcessTabletReportBatch takes the read lock before the write lock held in RepartitionTable is committed. This can probably be fixed by holding the catalog manager mutex_ for longer (unfortunately). The following logs are from in-progress changes: