sql: resolve deadlock in waitForCacheState #29291

Merged
merged 1 commit into from
Sep 12, 2018

vivekmenezes
Contributor

Change #25313 introduced getDatabaseID as the callback
used in waitForCacheState. However, waitForCacheState holds
a lock while calling its callback, and the callback can
block on an intent written by another transaction. That
other transaction may in turn be trying to acquire the
databaseCacheHolder lock, leaving the system stuck in a
deadlock.
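
The hazard can be illustrated with a minimal sketch. It is not the CockroachDB code: `cacheHolder`, `waitForState`, and `lockFreeDuringCallback` are hypothetical names, and a `TryLock` probe (Go 1.18+) stands in for the other transaction so that the example terminates instead of hanging. It only shows that the lock is unavailable to anyone else for as long as the callback runs.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheHolder is a hypothetical stand-in for a lock-protected cache holder.
type cacheHolder struct {
	mu sync.Mutex
}

// waitForState mimics the hazardous shape: the callback runs while mu is held.
func (h *cacheHolder) waitForState(cond func() bool) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	return cond()
}

// lockFreeDuringCallback reports whether another goroutine could acquire
// the holder's lock while the callback is executing.
func lockFreeDuringCallback(h *cacheHolder) bool {
	free := false
	h.waitForState(func() bool {
		done := make(chan bool)
		// Stand-in for the other transaction that needs the same lock.
		go func() {
			ok := h.mu.TryLock()
			if ok {
				h.mu.Unlock()
			}
			done <- ok
		}()
		free = <-done
		return true
	})
	return free
}

func main() {
	h := &cacheHolder{}
	// If the callback blocked on work that itself needs h.mu, neither
	// side could make progress.
	fmt.Println("lock available during callback:", lockFreeDuringCallback(h)) // false
}
```

With a blocking `Lock` in place of `TryLock`, and a callback that waits for that goroutine to finish, both sides would wait on each other forever.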

Originally, the deadlock was thought to be caused by another
bug, which was fixed in #28381, but we now know of
a legitimate situation in which the deadlock
can happen.

The fix is to use a different callback that still addresses
#25313 but never goes to the store.
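
As a rough sketch of what a cache-only callback could look like (the type `dbCache` and the method `cachedDatabaseID` are hypothetical names for illustration, not the identifiers used in pkg/sql/database.go), the lookup below consults only in-memory maps and returns 0 when either the name-to-ID mapping or the descriptor is missing. Because it performs no reads against the store, it cannot block on another transaction's intent while the lock is held.

```go
package main

import "fmt"

// dbCache is a hypothetical in-memory cache; field names are illustrative.
type dbCache struct {
	nameToID map[string]uint32
	descs    map[uint32]struct{} // set of IDs whose descriptors are cached
}

// cachedDatabaseID returns the database ID only if both the name-to-ID
// mapping and the descriptor are already cached. It returns 0 otherwise
// and never falls back to a store lookup.
func (c *dbCache) cachedDatabaseID(name string) uint32 {
	id, ok := c.nameToID[name]
	if !ok {
		return 0
	}
	if _, ok := c.descs[id]; !ok {
		return 0
	}
	return id
}

func main() {
	c := &dbCache{
		nameToID: map[string]uint32{"a": 52, "b": 53},
		descs:    map[uint32]struct{}{52: {}},
	}
	fmt.Println(c.cachedDatabaseID("a"))       // 52: mapping and descriptor cached
	fmt.Println(c.cachedDatabaseID("b"))       // 0: descriptor not cached
	fmt.Println(c.cachedDatabaseID("missing")) // 0: name not cached
}
```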

fixes #29090

Release note: None

@vivekmenezes vivekmenezes requested review from andreimatei and a team August 29, 2018 19:13

@vivekmenezes
Contributor Author

@andreimatei do you have any comments for this PR? Thanks

Contributor

@andreimatei andreimatei left a comment

LGTM

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/database.go, line 257 at r1 (raw file):

This method never goes to the store to resolve
// the name to id mapping.

Returns 0 if the name to id mapping or the database descriptor are not in the cache.

Contributor Author

@vivekmenezes vivekmenezes left a comment

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/database.go, line 257 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…
This method never goes to the store to resolve
// the name to id mapping.

Returns 0 if the name to id mapping or the database descriptor are not in the cache.

Done.

@vivekmenezes
Contributor Author

bors r=andreimatei

@craig
Contributor

craig bot commented Sep 12, 2018

👎 Rejected by code reviews

@vivekmenezes
Contributor Author

bors r+

@craig
Contributor

craig bot commented Sep 12, 2018

Build failed (retrying...)

craig bot pushed a commit that referenced this pull request Sep 12, 2018
29291: sql: resolve deadlock in waitForCacheState r=vivekmenezes a=vivekmenezes

Co-authored-by: Vivek Menezes <vivek@cockroachlabs.com>
@craig
Contributor

craig bot commented Sep 12, 2018

Build succeeded

@craig craig bot merged commit 3297375 into cockroachdb:master Sep 12, 2018
Successfully merging this pull request may close these issues.

sql: combination of DATABASE and USER statements wedges server
3 participants