-
Notifications
You must be signed in to change notification settings - Fork 24.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Never block on key in LiveVersionMap#pruneTombstones
#28736
Conversation
Pruning tombstones is best effort and should not block if a key is currently locked. This can cause a deadlock in rare situations if we switch of append only optimization while heavily updating the same key in the engine while the LiveVersionMap is locked. This is very rare since this code patch only executed every 15 seconds by default since that is the interval we try to prune the deletes in the version map. Closes elastic#28714
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine overall. I left a few comments and suggestions.
} | ||
if (perNodeLock.tryLock()) { // ok we got it - make sure we increment it accordingly otherwise release it again | ||
int i; | ||
while ((i = perNodeLock.count.get()) > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? Shouldn't we be able to return the ReleasableLock
once we have acquired perNodeLock
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a comment regarding this
if (decrementAndGet == 0) { | ||
map.remove(key, lock); | ||
} | ||
assert decrementAndGet >= 0 : decrementAndGet + " must be >= 0 but wasn't"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
removeTombstoneUnderLock(uid); | ||
Releasable lock = keyedLock.tryAcquire(uid); | ||
if (lock != null) { | ||
try (Releasable ignored = lock) { // can we do it without this lock on each value? maybe batch to a set and get |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to call tryAcquire
in the try-with-resources block and do the null check in the block? i.e.:
try (Releasable lock = keyedLock.tryAcquire(uid)) {
if (lock != null) {
DeleteVersionValue versionValue = tombstones.get(uid);
// ...
}
}
That implementation would not need a dummy ignored
variable and instead actually use the returned value (i.e. lock
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does I didn't know you can do that with a null value.
@@ -125,6 +153,7 @@ public AcquireAndReleaseThread(CountDownLatch startLatch, KeyedLock<String> conn | |||
this.names = names; | |||
this.counter = counter; | |||
this.safeCounter = safeCounter; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Empty line
final boolean isNotTrackedByCurrentMaps = versionValue.time < maps.getMinDeleteTimestamp(); | ||
if (isNotTrackedByCurrentMaps) { | ||
removeTombstoneUnderLock(uid); | ||
Releasable lock = keyedLock.tryAcquire(uid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to admit that it took me a moment to understand why this is avoiding the deadlock as there need to be two locks involved that are acquired by two threads in opposite order. The reason why this fix works is that this is the inner of the two involved locks. Then there are two cases:
- We get this lock, remove the tombstone and return the lock. If another thread is trying to acquire this lock, it can do so after we have left the try-with-resources block.
- We do not get the lock. Then we will not wait but rather give up (this time).
(just wrote this down for my own reference.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct
@danielmitterdorfer I pushed some changes to make things more clear. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @s1monw. LGTM!
I will merge this once the build is green |
@ywelsch @danielmitterdorfer I added a new test that reproduces the issue on the engine level and fails 100% of the time without the fix. |
// holding the lock that pruneTombstones needs and we have a deadlock | ||
CountDownLatch awaitStared = new CountDownLatch(1); | ||
Thread thread = new Thread(() -> { | ||
awaitStared.countDown(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: awaitStarted
engine.index(new Engine.Index(newUid(document2), document2, SequenceNumbers.UNASSIGNED_SEQ_NO, 0, | ||
Versions.MATCH_ANY, VersionType.INTERNAL, Engine.Operation.Origin.PRIMARY, System.nanoTime(), 0, false)); | ||
engine.refresh("test"); | ||
ParsedDocument dummyDocument = testParsedDocument(Integer.toString(3), null, testDocumentWithTextField(), SOURCE, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: should be named document3
for consistency?
Thanks for the test case @s1monw! I left two very minor comments but otherwise the test looks fine to me. |
@bleskes ping just FYI |
Pruning tombstones is best effort and should not block if a key is currently locked. This can cause a deadlock in rare situations if we switch of append only optimization while heavily updating the same key in the engine while the LiveVersionMap is locked. This is very rare since this code patch only executed every 15 seconds by default since that is the interval we try to prune the deletes in the version map. Closes #28714
Pruning tombstones is best effort and should not block if a key is currently locked. This can cause a deadlock in rare situations if we switch of append only optimization while heavily updating the same key in the engine while the LiveVersionMap is locked. This is very rare since this code patch only executed every 15 seconds by default since that is the interval we try to prune the deletes in the version map. Closes #28714
@s1monw thanks for the ping. Good catch. |
Pruning tombstones is best effort and should not block if a key is currently locked. This can cause a deadlock in rare situations if we switch of append only optimization while heavily updating the same key in the engine while the LiveVersionMap is locked. This is very rare since this code patch only executed every 15 seconds by default since that is the interval we try to prune the deletes in the version map. Closes elastic#28714
Pruning tombstones is best effort and should not block if a key is currently
locked. This can cause a deadlock in rare situations if we switch of append
only optimization while heavily updating the same key in the engine
while the LiveVersionMap is locked. This is very rare since this code
patch only executed every 15 seconds by default since that is the interval
we try to prune the deletes in the version map.
Closes #28714