Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISPN-8533 Deadlock in pessimistic transaction #6075

Merged
merged 2 commits into from Jun 22, 2018

Conversation

pruivo
Copy link
Member

@pruivo pruivo commented Jun 19, 2018

@pruivo
Copy link
Member Author

pruivo commented Jun 19, 2018

run performance tests please

@ghost
Copy link

ghost commented Jun 19, 2018

Performance tests didn't finish successfully. @diegolovison, can you review it?

Additional info:
Commit: ea8dd12
Build number: #254
Comment body: run performance tests please

@diegolovison
Copy link
Contributor

run performance tests please

@ghost
Copy link

ghost commented Jun 19, 2018

Performance tests didn't finish successfully. @diegolovison, can you review it?

Additional info:
Commit: ea8dd12
Build number: #255
Comment body: run performance tests please

@diegolovison
Copy link
Contributor

diegolovison commented Jun 19, 2018

@pruivo, @mgencur is running some tests in our perf lab.
We will need to wait a little bit

@pruivo
Copy link
Member Author

pruivo commented Jun 19, 2018

@diegolovison ok! thanks!

@diegolovison
Copy link
Contributor

run performance tests please

@ghost
Copy link

ghost commented Jun 19, 2018

Performance tests run successfully. Link to the results here.

Additional info:
Commit: ea8dd12
Build number: #256
Comment body: run performance tests please

@tristantarrant
Copy link
Member

Perf looks good.

@galderz
Copy link
Member

galderz commented Jun 22, 2018

Some connection here too (like in #6045), restarted build

@gustavocoding
Copy link

Gotta love the CI stability...

Copy link
Member

@danberindei danberindei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry Pedro, I forgot about this yesterday.

I'm still not sure what this change does TBH, I see you've split backupLockReleased into multiple CFs, but it's not clear how that fixes the problem :)

// TODO: should we check the write skew configuration here?
// TODO: version seen or looked up remote version?
if (lastVersion != null && lastVersion.compareTo(version) != InequalVersionComparisonResult.EQUAL) {
throw log.writeSkewOnRead(key, key, lastVersion, version);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do? The JIRA only talks about pessimistic transactions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lastVersion is always null since getLookedUpRemoteVersion() is deprecated since 9.0 and always return null.

@@ -88,6 +88,7 @@ private KeyAwareLockPromise acquireLocalLock(InvocationContext ctx, DataCommand
final TxInvocationContext txContext = (TxInvocationContext) ctx;
Object key = command.getKey();
txContext.addAffectedKey(key);
((TxInvocationContext) ctx).getCacheTransaction().cleanupBackupLocksForKey(key);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

txContext

//clear the backup locks
context.getCacheTransaction().cleanupBackupLocks();
keysToLock.removeAll(context.getLockedKeys());
}
if (command instanceof LockControlCommand) {
//the lock command is only issue if the transaction doesn't have the lock for the keys.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issued?

And I'd like to see a comment above these 2 blocks explaining why we need to remove backup locks first

}
//avoids the warning about synchronizing in a local variable.
//and allows us to change the CacheTransaction internals without having to worry about it
tx.collectBackupLockKeysForSegments(segments, keyPartitioner, filteredLockedKeys);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here instead I don't think a comment isn't necessary, it's just common sense. However, I would have liked the extracted method to handle both regular locks and backup locks, and I would have preferred a more generic method like forEachLock.

// we need a synchronized collection to be able to get a valid snapshot from another thread during state transfer
final Set<Object> keys = backupKeyLocks.updateAndGet((value) -> value == null ? Collections.synchronizedSet(new HashSet<>(INITIAL_LOCK_CAPACITY)) : value);
keys.add(key);
if (backupKeyLocks == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above is no longer true

/**
* testing purpose only!
*/
@Deprecated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should deprecate the implementation as well, and replace the javadoc with something like @deprecated Since 9.3, please use ...

public class PessimisticDeadlockTest extends MultipleCacheManagersTest {

public void testDeadlock() throws Exception {
assertEquals(4, managers().length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, maybe you should move createCacheManagers() so it's obvious without having to assert :)

ConfigurationBuilder builder = getDefaultClusteredCacheConfig(CacheMode.DIST_SYNC, true);
builder.transaction().lockingMode(LockingMode.PESSIMISTIC);
builder.clustering().hash().consistentHashFactory(new ControlledConsistentHashFactory.Default(1, 2))
.numSegments(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't really need MagicKey if you only have 1 segment. But this got me thinking, should this test have more scenarios to test merging locks into an existing transaction (if the originator was backup/non-owner and becomes primary owner).

Iterator<Address> iterator = writeOwners.iterator();
assertEquals(address(1), iterator.next());
assertEquals(address(2), iterator.next());
assertFalse(writeOwners.contains(address(0)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about assertEquals(Arrays.asList(address(1), address(2)), writeOwners)?

@@ -59,7 +61,9 @@
private final AtomicReference<Set<Object>> lockedKeys = new AtomicReference<>();

/** Holds all the locks for which the local node is a secondary data owner. */
private final AtomicReference<Set<Object>> backupKeyLocks = new AtomicReference<>();
@GuardedBy("this")
private Map<Object, CompletableFuture<Void>> backupKeyLocks;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you modify the comment to say what these CompletableFutures actually represent?

if (entry instanceof RepeatableReadEntry) {
return ((RepeatableReadEntry) entry).isRead();
} else {
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When possible, I prefer keeping deprecated methods working as before instead of keeping only a skeleton for compilation reasons.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I didn't understand what you mean :(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean if the method still works, I don't see any reason to replace it with an empty method.

//clear the backup locks
context.getCacheTransaction().cleanupBackupLocks();
keysToLock.removeAll(context.getLockedKeys());
}
if (command instanceof LockControlCommand) {
//the lock command is only issued if the transaction doesn't have the lock for the keys.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't know why we have to remove the backup locks here...

}
//avoids the warning about synchronizing in a local variable.
//and allows us to change the CacheTransaction internals without having to worry about it
tx.forEachBackupLock(key -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not forEachLock?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because it isn't related to this PR (?) Do you want to replace all the getLockedKeys() with a forEachLock() ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll wait for https://issues.jboss.org/browse/ISPN-3927 then :)

* <p>
* A {@link CompletableFuture} is created for each key and it is completed when the backup lock is release for that
* key. A transaction, before acquiring the locks, must wait for all the backup locks (i.e. the {@link
* CompletableFuture}) for all transaction created in the previous topology.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking: is released, all transactions

@danberindei danberindei merged commit 682e6f0 into infinispan:master Jun 22, 2018
@danberindei
Copy link
Member

Integrated, thanks Pedro!

@pruivo pruivo deleted the t_8533 branch July 5, 2022 13:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants