Never block on key in LiveVersionMap#pruneTombstones #28736

Merged Feb 20, 2018 (5 commits)

s1monw (Contributor) commented Feb 19, 2018

Pruning tombstones is best effort and should not block if a key is currently
locked. Blocking here can cause a deadlock in rare situations if we switch off
the append-only optimization while heavily updating the same key in the engine
while the LiveVersionMap is locked. This is very rare since this code
path is only executed every 15 seconds by default, which is the interval
at which we try to prune the deletes in the version map.

Closes #28714

@s1monw s1monw added >bug v7.0.0 v6.3.0 v6.2.3 :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. labels Feb 19, 2018
danielmitterdorfer (Member) left a comment

Looks fine overall. I left a few comments and suggestions.

}
if (perNodeLock.tryLock()) { // ok we got it - make sure we increment it accordingly otherwise release it again
int i;
while ((i = perNodeLock.count.get()) > 0) {
Member

Is this necessary? Shouldn't we be able to return the ReleasableLock once we have acquired perNodeLock?

Contributor Author

I left a comment regarding this
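
For readers following the question about the loop, here is a simplified, self-contained sketch of a reference-counted keyed lock. It is written for illustration only and is not the Elasticsearch `KeyedLock`; the class name `RefCountedKeyedLock`, the nested `Releasable` interface, and the exact ordering inside `release` are assumptions. The point it tries to show: winning `tryLock()` does not by itself register the caller in the count, and a concurrent blocking `acquire` may bump the count at the same moment, so the compare-and-set can fail and has to be retried while the count is still positive. A count of zero means the entry is on its way out of the map, so the caller unlocks and gives up.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified keyed lock; not the Elasticsearch implementation.
class RefCountedKeyedLock<T> {
    interface Releasable extends AutoCloseable {
        @Override
        void close();
    }

    private static final class KeyLock extends ReentrantLock {
        // Threads holding or waiting for this lock; 0 means the entry is being removed.
        final AtomicInteger count = new AtomicInteger(1);
    }

    private final ConcurrentHashMap<T, KeyLock> map = new ConcurrentHashMap<>();

    // Blocking acquire: registers interest in the count first, then waits for the lock.
    Releasable acquire(T key) {
        while (true) {
            KeyLock lock = map.get(key);
            if (lock == null) {
                KeyLock fresh = new KeyLock();
                fresh.lock();
                if (map.putIfAbsent(key, fresh) == null) {
                    return () -> release(key, fresh);
                }
            } else {
                int i = lock.count.get();
                if (i > 0 && lock.count.compareAndSet(i, i + 1)) {
                    lock.lock(); // may block until the current holder releases
                    return () -> release(key, lock);
                }
            }
        }
    }

    // Non-blocking acquire: returns null instead of waiting.
    Releasable tryAcquire(T key) {
        KeyLock lock = map.get(key);
        if (lock == null) {
            KeyLock fresh = new KeyLock();
            fresh.lock();
            if (map.putIfAbsent(key, fresh) == null) {
                return () -> release(key, fresh);
            }
            return null; // another thread registered a lock for this key first
        }
        if (lock.tryLock()) {
            int i;
            // Retry the CAS: a blocking acquire may change the count concurrently.
            while ((i = lock.count.get()) > 0) {
                if (lock.count.compareAndSet(i, i + 1)) {
                    return () -> release(key, lock);
                }
            }
            lock.unlock(); // count is 0: the entry is being removed, so give up
        }
        return null;
    }

    private void release(T key, KeyLock lock) {
        int remaining = lock.count.decrementAndGet();
        lock.unlock();
        if (remaining == 0) {
            map.remove(key, lock);
        }
    }
}
```

In this sketch `tryAcquire` may return null even when the key has just become free (the count-is-zero case), which is acceptable for a best-effort caller such as a periodic prune.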

if (decrementAndGet == 0) {
map.remove(key, lock);
}
assert decrementAndGet >= 0 : decrementAndGet + " must be >= 0 but wasn't";
Member

++

removeTombstoneUnderLock(uid);
Releasable lock = keyedLock.tryAcquire(uid);
if (lock != null) {
try (Releasable ignored = lock) { // can we do it without this lock on each value? maybe batch to a set and get
Member

Does it make sense to call tryAcquire in the try-with-resources block and do the null check in the block? i.e.:

try (Releasable lock = keyedLock.tryAcquire(uid)) {
    if (lock != null) {
        DeleteVersionValue versionValue = tombstones.get(uid);
        // ...
    }
}

That implementation would not need a dummy ignored variable and instead actually use the returned value (i.e. lock).
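
A small standalone check of the language behavior this suggestion relies on: a try-with-resources statement accepts a resource that evaluates to null and simply skips the `close()` call for it. The `Handle` class and the `tryAcquire(boolean)` helper below are made up for the demonstration and only stand in for `keyedLock.tryAcquire(uid)`.

```java
// Demonstrates that try-with-resources tolerates a null resource.
class NullResourceDemo {
    // Hypothetical stand-in for a lock handle.
    static final class Handle implements AutoCloseable {
        @Override
        public void close() {
            // a real handle would release the lock here
        }
    }

    // Returns a handle when the "lock" is free, or null when it is busy.
    static Handle tryAcquire(boolean available) {
        return available ? new Handle() : null;
    }

    // Returns true if the guarded work ran, false if the lock was busy.
    static boolean runIfAcquired(boolean available) {
        try (Handle lock = tryAcquire(available)) {
            if (lock != null) {
                return true; // guarded work would go here
            }
            return false; // null resource: no close() is attempted, no exception
        }
    }

    public static void main(String[] args) {
        System.out.println(runIfAcquired(true));
        System.out.println(runIfAcquired(false));
    }
}
```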

Contributor Author

It does. I didn't know you could do that with a null value.

@@ -125,6 +153,7 @@ public AcquireAndReleaseThread(CountDownLatch startLatch, KeyedLock<String> conn
this.names = names;
this.counter = counter;
this.safeCounter = safeCounter;

Member

Nit: Empty line

final boolean isNotTrackedByCurrentMaps = versionValue.time < maps.getMinDeleteTimestamp();
if (isNotTrackedByCurrentMaps) {
removeTombstoneUnderLock(uid);
Releasable lock = keyedLock.tryAcquire(uid);
Member

I have to admit that it took me a moment to understand why this is avoiding the deadlock as there need to be two locks involved that are acquired by two threads in opposite order. The reason why this fix works is that this is the inner of the two involved locks. Then there are two cases:

  1. We get this lock, remove the tombstone and return the lock. If another thread is trying to acquire this lock, it can do so after we have left the try-with-resources block.
  2. We do not get the lock. Then we will not wait but rather give up (this time).

(just wrote this down for my own reference.)
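
The second case above can be reproduced in isolation. The sketch below uses two plain `ReentrantLock`s as stand-ins: `outer` for whatever lock the pruning thread already holds and `inner` for the per-key lock. Both names, and the class `TryLockOrderingDemo`, are assumptions for the example and not the engine's actual fields. A writer thread takes the locks in the opposite order; because the pruner only calls `tryLock()` on the inner lock, it gives up instead of waiting, releases the outer lock, and the writer finishes.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

class TryLockOrderingDemo {
    // Returns true if the pruner skipped the busy key and the writer then completed.
    static boolean run() {
        ReentrantLock outer = new ReentrantLock(); // stand-in for the lock held while pruning
        ReentrantLock inner = new ReentrantLock(); // stand-in for the per-key lock
        AtomicBoolean writerDone = new AtomicBoolean(false);

        Thread writer = new Thread(() -> {
            inner.lock(); // writer takes the inner lock first ...
            try {
                outer.lock(); // ... then waits for the outer lock (opposite order)
                try {
                    writerDone.set(true);
                } finally {
                    outer.unlock();
                }
            } finally {
                inner.unlock();
            }
        });

        boolean pruned = false;
        outer.lock(); // pruner takes the outer lock first
        try {
            writer.start();
            // Wait until the writer holds inner and is queued on outer.
            while (!outer.hasQueuedThreads()) {
                Thread.sleep(1);
            }
            // A blocking inner.lock() here would deadlock; tryLock() returns immediately.
            pruned = inner.tryLock();
            if (pruned) {
                inner.unlock();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            outer.unlock(); // giving up lets the writer proceed
        }

        try {
            writer.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !pruned && writerDone.get();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Replacing `inner.tryLock()` with `inner.lock()` in this sketch would leave both threads waiting on each other, which is the shape of the deadlock described in the PR.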

Contributor Author

correct

s1monw (Contributor Author) commented Feb 20, 2018

@danielmitterdorfer I pushed some changes to make things more clear.

danielmitterdorfer (Member) left a comment

Thanks @s1monw. LGTM!

s1monw (Contributor Author) commented Feb 20, 2018

I will merge this once the build is green

s1monw (Contributor Author) commented Feb 20, 2018

@ywelsch @danielmitterdorfer I added a new test that reproduces the issue on the engine level and fails 100% of the time without the fix.

// holding the lock that pruneTombstones needs and we have a deadlock
CountDownLatch awaitStared = new CountDownLatch(1);
Thread thread = new Thread(() -> {
awaitStared.countDown();
Member

Nit: awaitStarted

engine.index(new Engine.Index(newUid(document2), document2, SequenceNumbers.UNASSIGNED_SEQ_NO, 0,
Versions.MATCH_ANY, VersionType.INTERNAL, Engine.Operation.Origin.PRIMARY, System.nanoTime(), 0, false));
engine.refresh("test");
ParsedDocument dummyDocument = testParsedDocument(Integer.toString(3), null, testDocumentWithTextField(), SOURCE, null);
Member

Nit: should be named document3 for consistency?

danielmitterdorfer (Member)

Thanks for the test case @s1monw! I left two very minor comments but otherwise the test looks fine to me.

@s1monw s1monw merged commit b008706 into elastic:master Feb 20, 2018
@s1monw s1monw deleted the issues/28714 branch February 20, 2018 15:35
s1monw (Contributor Author) commented Feb 20, 2018

@bleskes ping just FYI

s1monw added a commit that referenced this pull request Feb 20, 2018
s1monw added a commit that referenced this pull request Feb 20, 2018
bleskes (Contributor) commented Feb 28, 2018

@s1monw thanks for the ping. Good catch.

sebasjm pushed a commit to sebasjm/elasticsearch that referenced this pull request Mar 10, 2018
Labels
>bug :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. v6.2.3 v6.3.0 v7.0.0-beta1
4 participants