Commit
Ensures all lock clients closed during role switch
Observed issue:
After an instance switched role in HA, some transactions would no longer respect the locks that other transactions held. This led to some transactions making changes on top of stale data, effectively overwriting the changes of other transactions.

Cause(s):
Among other things, a new lock manager instance is created during role switching. Transaction instances are pooled, and each is created with a locks client instance which it keeps throughout its life. While there may be transactions still executing at the time of switching role, all open transactions are marked for termination and the transaction pool is cleared. Transactions marked for termination cannot commit, but when closed they were happily returned to the transaction pool and would be reused again and again. These transactions would have their termination flag reset at the point of being reused and would still contain a locks client for the previous lock manager. These transaction instances would forever disrespect the locks that all other transactions respected, and would continue to do so until the next role switch, where they would at least have a chance of being properly disposed of. Although the next role switch would have a chance of introducing new rogue transactions as well...

Solution(s):
Many small issues here and there caused this to happen. There are a number of changes that will prevent this from happening. Some of them are belt-and-suspenders additions, although at virtually no cost:
- Dispose of kernel transactions as part of switching to pending. Previously they were disposed of as late as after creating the new ones, which increased the chance that transactions holding locks clients from the old lock manager would be used and cause harm.
- Mark _all_ transactions, not just open transactions, for termination when switching role. No transactions referring to the previous lock manager can be around after the role switch.
- Do not return transaction (KernelTransactionImplementation) instances that are marked for termination to the pool.
- Have Locks instances know when they are closed and refuse to hand out new clients if closed.

On top of those changes, with accompanying unit tests, there's an added stress test `TransactionThroughMasterSwitchStressIT` which could reproduce the problem quite deterministically. It can be set to run for a longer period of time (default 30s in the main build) using a system property, like:
`-Dorg.neo4j.kernel.ha.transaction.TransactionThroughMasterSwitchStressIT.duration=10m`
Currently that stress test provokes only one role switch scenario: master --> pending --> master, although more scenarios could be added later on for an even broader net to catch these sorts of problems.
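The pooling fix described in the commit message can be illustrated with a minimal sketch. The classes `PooledTx` and `TxPool` are hypothetical stand-ins written for this illustration and are not the Neo4j classes; the sketch only assumes what the message states: a reused instance has its termination flag reset, so the pool must refuse instances that were marked for termination.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a pooled transaction; NOT KernelTransactionImplementation.
final class PooledTx
{
    // Written by the role-switching thread, read by the thread releasing the transaction.
    private volatile boolean markedForTermination;

    void markForTermination()
    {
        markedForTermination = true;
    }

    boolean isMarkedForTermination()
    {
        return markedForTermination;
    }

    // Called when a pooled instance is handed out again.
    void reset()
    {
        markedForTermination = false;
    }
}

// Hypothetical pool that drops terminated instances instead of recycling them.
final class TxPool
{
    private final Deque<PooledTx> idle = new ArrayDeque<>();

    synchronized PooledTx acquire()
    {
        PooledTx tx = idle.poll();
        if ( tx == null )
        {
            return new PooledTx();
        }
        tx.reset();
        return tx;
    }

    // Returns true if the instance was pooled, false if it was dropped.
    synchronized boolean release( PooledTx tx )
    {
        if ( tx.isMarkedForTermination() )
        {
            // The instance may still hold a locks client for the old lock manager.
            return false;
        }
        idle.push( tx );
        return true;
    }

    synchronized int idleCount()
    {
        return idle.size();
    }
}
```

Without the check in `release`, the terminated instance would come back out of `acquire` with its flag cleared, which is the rogue-transaction behaviour the message describes.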
Showing 16 changed files with 523 additions and 65 deletions.
61 changes: 61 additions & 0 deletions
community/kernel/src/test/java/org/neo4j/kernel/impl/locking/CloseCompatibility.java
@@ -0,0 +1,61 @@
/*
 * Copyright (c) 2002-2015 "Neo Technology,"
 * Network Engine for Objects in Lund AB [http://neotechnology.com]
 *
 * This file is part of Neo4j.
 *
 * Neo4j is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program. If not, see <http://www.gnu.org/licenses/>.
 */
package org.neo4j.kernel.impl.locking;

import org.junit.Ignore;
import org.junit.Test;

import org.neo4j.kernel.impl.locking.Locks.Client;

import static org.junit.Assert.fail;

@Ignore("Not a test. This is a compatibility suite, run from LockingCompatibilityTestSuite.")
public class CloseCompatibility extends LockingCompatibilityTestSuite.Compatibility
{
    public CloseCompatibility( LockingCompatibilityTestSuite suite )
    {
        super( suite );
    }

    @Test
    public void shouldNotBeAbleToHandOutClientsIfShutDown() throws Throwable
    {
        // GIVEN a lock manager and working clients
        try ( Client client = locks.newClient() )
        {
            client.acquireExclusive( ResourceTypes.NODE, 0 );
        }

        // WHEN
        locks.stop();
        locks.shutdown();

        // THEN
        try
        {
            locks.newClient();
            fail( "Should fail" );
        }
        catch ( IllegalStateException e )
        {
            // Good
        }
    }
}
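
The behaviour this compatibility test asserts could be satisfied by a lock manager along the following lines. This is only a sketch: `SketchLocks` and its nested `Client` are hypothetical names invented here, and the actual Neo4j `Locks` implementations changed by this commit are not shown on this page and may differ.

```java
// Hypothetical lock manager that remembers it was shut down; NOT the Neo4j Locks API.
final class SketchLocks
{
    // Hypothetical client handle; real clients would also acquire and release locks.
    static final class Client implements AutoCloseable
    {
        @Override
        public void close()
        {
            // Nothing to release in this sketch.
        }
    }

    // Set once on shutdown, read by every thread asking for a client.
    private volatile boolean closed;

    Client newClient()
    {
        if ( closed )
        {
            // Same exception type the compatibility test expects.
            throw new IllegalStateException( this + " is closed" );
        }
        return new Client();
    }

    void shutdown()
    {
        closed = true;
    }
}
```

The check and the client creation are not atomic with respect to `shutdown()` in this sketch, so a client could still be handed out concurrently with closing; that is in keeping with the belt-and-suspenders role the commit message gives this change.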