[DocDB] Fix the deadlock issue between CreateTable and ProcessTabletReportBatch #15346

Closed
yifanguan opened this issue Dec 15, 2022 · 1 comment
Labels: area/docdb (YugabyteDB core features), kind/bug (This issue is a bug), priority/medium (Medium priority issue)

Comments

@yifanguan
Contributor

yifanguan commented Dec 15, 2022

Jira Link: DB-4483

Description

CatalogManager::CreateTable and CatalogManager::ProcessTabletReportBatch can deadlock with each other. The deadlock was uncovered by two tests: org.yb.pgsql.TestYbBackup#testLegacyColocatedDBMigration and org.yb.pgsql.TestYbBackup#testColocatedDBWithColocationIdAlreadySet.

Test failure output for org.yb.pgsql.TestYbBackup#testLegacyColocatedDBMigration is attached.
org.yb.pgsql.TestYbBackup#testLegacyColocatedDBMigration.txt

Note: org.yb.pgsql.TestYbBackup#testLegacyColocatedDBMigration is a test included in an in-progress diff for #14887.

@yifanguan added the area/docdb and status/awaiting-triage labels Dec 15, 2022
@yifanguan self-assigned this Dec 15, 2022
@yugabyte-ci added the kind/bug and priority/medium labels and removed the status/awaiting-triage label Dec 15, 2022
yifanguan added a commit that referenced this issue Dec 21, 2022
…sTabletReportBatch

Summary:
ProcessTabletReportBatch and CreateTable can, on rare occasions, deadlock while a colocated index is being created.

Symbols used in the following example:
Thread running CreateTable to create the colocated index: `t1`
Thread running ProcessTabletReportBatch: `t2`
Colocated table: `tbl_col`
Colocated parent table: `parent_table`
Colocated tablet of `parent_table`: `colocated_tablet`
Colocated index of `tbl_col`, created by `t1`: `tbl_col_idx`

One possible sequence of steps to produce the deadlock is:
t1: `tbl_col_idx`->StartMutation
t2: `parent_table`->LockForWrite (first pass)
t2: `tbl_col`->LockForWrite
t1: `colocated_tablet`->StartMutation
t2: `colocated_tablet`->LockForWrite (second pass; blocked on t1 WriteLock)
t1: `tbl_col`->LockForWrite (blocked on t2 WriteLock)
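
The six steps above can be replayed in a small model to see why neither thread can make progress. This is only a sketch: `LockSim`, `Acquire`, `Deadlocked` and `ReplayReportedInterleaving` are hypothetical names, not YugabyteDB code, and the model assumes every lock is exclusive and held until the end, which simplifies what StartMutation and LockForWrite actually do.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model of exclusive locks (hypothetical, not the CatalogManager API).
struct LockSim {
  std::map<std::string, std::string> owner;       // lock name -> thread holding it
  std::map<std::string, std::string> waiting_on;  // blocked thread -> lock it wants

  // Returns true if `thread` obtained `lock`, false if it is now blocked.
  // Re-acquiring a lock the thread already owns is not modeled.
  bool Acquire(const std::string& thread, const std::string& lock) {
    assert(waiting_on.count(thread) == 0 && "a blocked thread cannot request locks");
    auto it = owner.find(lock);
    if (it == owner.end()) {
      owner[lock] = thread;
      return true;
    }
    waiting_on[thread] = lock;
    return false;
  }

  // Walks the wait-for chain: thread -> lock it wants -> owner of that lock.
  // Returns true only if the chain leads back to `start`.
  bool Deadlocked(const std::string& start) const {
    std::set<std::string> seen;
    std::string cur = start;
    while (true) {
      auto w = waiting_on.find(cur);
      if (w == waiting_on.end()) return false;  // cur is not blocked
      auto o = owner.find(w->second);
      if (o == owner.end()) return false;       // wanted lock is free
      cur = o->second;
      if (cur == start) return true;            // chain closed on start
      if (!seen.insert(cur).second) return false;  // cycle that excludes start
    }
  }
};

// Replays the interleaving listed in the commit summary.
bool ReplayReportedInterleaving() {
  LockSim sim;
  sim.Acquire("t1", "tbl_col_idx");
  sim.Acquire("t2", "parent_table");      // first pass
  sim.Acquire("t2", "tbl_col");
  sim.Acquire("t1", "colocated_tablet");
  sim.Acquire("t2", "colocated_tablet");  // second pass: blocked, t1 owns it
  sim.Acquire("t1", "tbl_col");           // blocked, t2 owns it
  return sim.Deadlocked("t1");
}
```

In this model `ReplayReportedInterleaving()` returns true: `t1` waits on `tbl_col`, which `t2` owns, while `t2` waits on `colocated_tablet`, which `t1` owns.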

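This diff fixes the deadlock by changing `CreateTable` so that, when creating a colocated index, it acquires the write lock on the colocated indexed table before the write lock on the colocated tablet. Both threads then take these two locks in the same order, matching the order ProcessTabletReportBatch uses in the sequence above.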
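
One way to check a lock-ordering fix like this is to build a lock-order graph (an edge `a -> b` whenever some thread takes `a` before `b`) and look for a cycle. The sketch below does that for the acquisition orders before and after the change. All names (`Order`, `BuildOrderGraph`, `OrdersCanDeadlock`, `BeforeFix`, `AfterFix`) are hypothetical. It also assumes each thread keeps every lock until it finishes, and that `tbl_col_idx` is still taken first after the fix, neither of which is stated in the commit summary.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using Order = std::vector<std::string>;  // locks in the order one thread takes them
using Graph = std::map<std::string, std::set<std::string>>;

// Adds an edge a -> b for every pair where a thread takes a before b.
Graph BuildOrderGraph(const std::vector<Order>& threads) {
  Graph g;
  for (const Order& order : threads) {
    for (std::size_t i = 0; i < order.size(); ++i) {
      g[order[i]];  // make sure the node exists even without outgoing edges
      for (std::size_t j = i + 1; j < order.size(); ++j) {
        g[order[i]].insert(order[j]);
      }
    }
  }
  return g;
}

// Depth-first search; state: 0 = unvisited, 1 = on the current path, 2 = done.
bool HasCycleFrom(const Graph& g, const std::string& node,
                  std::map<std::string, int>& state) {
  state[node] = 1;
  auto it = g.find(node);
  if (it != g.end()) {
    for (const std::string& next : it->second) {
      int s = state[next];
      if (s == 1) return true;  // back edge: two threads disagree on order
      if (s == 0 && HasCycleFrom(g, next, state)) return true;
    }
  }
  state[node] = 2;
  return false;
}

// True if the acquisition orders contain a cycle, i.e. a deadlock is possible.
bool OrdersCanDeadlock(const std::vector<Order>& threads) {
  Graph g = BuildOrderGraph(threads);
  std::map<std::string, int> state;
  for (const auto& entry : g) {
    if (state[entry.first] == 0 && HasCycleFrom(g, entry.first, state)) return true;
  }
  return false;
}

// Acquisition orders taken from the commit summary (t1 first, then t2).
std::vector<Order> BeforeFix() {
  return {{"tbl_col_idx", "colocated_tablet", "tbl_col"},
          {"parent_table", "tbl_col", "colocated_tablet"}};
}

// Assumed order after the fix: t1 takes tbl_col before colocated_tablet.
std::vector<Order> AfterFix() {
  return {{"tbl_col_idx", "tbl_col", "colocated_tablet"},
          {"parent_table", "tbl_col", "colocated_tablet"}};
}

// Small self-check of the two cases.
void CheckCommitExample() {
  assert(OrdersCanDeadlock(BeforeFix()));
  assert(!OrdersCanDeadlock(AfterFix()));
}
```

Under these assumptions the graph for `BeforeFix()` contains both `colocated_tablet -> tbl_col` and `tbl_col -> colocated_tablet`, while the graph for `AfterFix()` has no cycle. A static check of this kind only covers the orders it is given. The repeated test runs in the Test Plan remain the actual validation.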

Test Plan:
./yb_build.sh --java-test 'org.yb.pgsql.TestYbBackup#testColocatedDBWithColocationIdAlreadySet' -n 100 --tp 1
./yb_build.sh --java-test 'org.yb.pgsql.TestYbBackup#testLegacyColocatedDBMigration' -n 100 --tp 1

Reviewers: tverona, alex, skedia, nicolas, sergei

Reviewed By: sergei

Subscribers: ybase, yql, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D21781
@yifanguan
Contributor Author

Issue resolved by commit 0da9a37
