Introduce shard level locks to prevent concurrent shard modifications #8436
Conversation
* @param index the index to delete
* @throws Exception if any of the shards data directories can't be locked or deleted
*/
public void deleteIndexDirecotrySafe(Index index) throws Exception {
"Directory" is misspelled as "Direcotry" here
pushed more changes @dakrone
@s1monw quick question regarding the path of the lock file: does it make sense to have it encapsulated in the shard location itself (similar to write.lock in Lucene)? I think it's nicer since then there is a single place that holds the shard data. Update: I think I get why it's a different directory: the ability to delete while holding the lock. If that's the case, then it makes sense. On crappy internet, will try to complete the review later tonight....
@@ -246,7 +245,7 @@ public void clusterChanged(ClusterChangedEvent event) {
logger.debug("[{}] deleting index that is no longer part of the metadata (indices: [{}])", current.index(), newMetaData.indices().keys());
if (nodeEnv.hasNodeFile()) {
try {
IOUtils.rm(FileSystemUtils.toPaths(nodeEnv.indexLocations(new Index(current.index()))));
nodeEnv.deleteIndexDirectorySafe(new Index(current.index()));
should we warn on this failure now? Because if we have a bug and we can't obtain the lock, we won't delete anything, which is different semantics from deleting the files and letting the OS handle dangling open file handles. In theory this should not happen because a lock is there.
yeah I made it a warn... btw. I think it's a pre-existing bug, i.e. on Windows you might have this behaviour already since somebody can hold a ref to the files, preventing the deletion. I wonder if we should actually schedule a deletion task, or use a list of pending deletions that are executed before we apply new cluster states and before we import dangling indices?
@Override
public void onClose(ShardId shardId) {
assertFalse(called.get());
called.set(true);
This can be `assertTrue(called.compareAndSet(false, true))` to avoid the race condition between the assert and the set.
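To illustrate the reviewer's point, here is a minimal sketch of the difference between the racy check-then-set and the atomic compare-and-set (class and method names are illustrative, not the actual test code):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class OnCloseOnce {
    private final AtomicBoolean called = new AtomicBoolean(false);

    // Racy version: between `called.get()` returning false and `called.set(true)`
    // running, another thread can slip in, so two callers may both pass the assert.
    //
    //     assertFalse(called.get());
    //     called.set(true);
    //
    // Atomic version: compareAndSet flips false -> true exactly once; any second
    // invocation sees `true` and the CAS fails.
    public boolean onCloseOnce() {
        return called.compareAndSet(false, true);
    }

    public static void main(String[] args) {
        OnCloseOnce listener = new OnCloseOnce();
        System.out.println(listener.onCloseOnce()); // true: first call wins
        System.out.println(listener.onCloseOnce()); // false: already called
    }
}
```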
Left lots of comments, @s1monw does it make sense to have
thanks for the comments - I think I addressed them all... Regarding
synchronized (shardLocks) {
assert waitCount > 0 : "waitCount is " + waitCount + " but should be > 0";
if (--waitCount == 0) {
shardLocks.remove(shardId);
Do we want to assert that `.remove` didn't return null here?
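For context, the release path under discussion can be sketched roughly like this: each shard maps to a lock object carrying a waitCount of interested threads, and the map entry is removed only when the last waiter releases it. This is a simplified, hypothetical model, not the actual NodeEnvironment implementation (names and the String-keyed map are stand-ins):

```java
import java.util.HashMap;
import java.util.Map;

public class ShardLockRegistry {
    private final Map<String, InternalShardLock> shardLocks = new HashMap<>();

    static class InternalShardLock {
        int waitCount = 1; // the thread that created the entry counts as a waiter
    }

    // Register interest in the shard's lock, creating the entry on first use.
    InternalShardLock acquire(String shardId) {
        synchronized (shardLocks) {
            InternalShardLock lock = shardLocks.get(shardId);
            if (lock == null) {
                lock = new InternalShardLock();
                shardLocks.put(shardId, lock);
            } else {
                lock.waitCount++;
            }
            return lock;
        }
    }

    // Drop one waiter; the last one out removes the map entry.
    void decWaitCount(String shardId) {
        synchronized (shardLocks) {
            InternalShardLock lock = shardLocks.get(shardId);
            assert lock != null && lock.waitCount > 0 : "waitCount must be > 0";
            if (--lock.waitCount == 0) {
                InternalShardLock removed = shardLocks.remove(shardId);
                // the reviewer's suggestion: assert the entry was actually present
                assert removed != null : "lock for " + shardId + " was already removed";
            }
        }
    }

    int size() {
        synchronized (shardLocks) {
            return shardLocks.size();
        }
    }

    public static void main(String[] args) {
        ShardLockRegistry registry = new ShardLockRegistry();
        registry.acquire("[index][0]");
        registry.acquire("[index][0]"); // second waiter on the same shard
        registry.decWaitCount("[index][0]");
        System.out.println(registry.size()); // 1: one waiter still holds a reference
        registry.decWaitCount("[index][0]");
        System.out.println(registry.size()); // 0: last release removes the entry
    }
}
```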
I fixed all comments - I think it's ready... refactorings can happen later; this really goes out of date quickly.
} catch (Exception ex) {
logger.debug("[{}] failed to delete index", ex, current.index());
logger.warn("[{}] failed to delete index", ex, current.index());
I have second thoughts about this warn - it might be "normal" in the case that an ongoing recovery is holding a reference to the store while the index is deleted. With the new recovery refcount and locking this is expected behavior. Should we log debug if LockObtainFailedException is caught (expected) and only warn otherwise?
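The split proposed here boils down to choosing a log level by exception type: expected lock contention (an ongoing recovery still holds the shard lock) at debug, everything else at warn. A minimal, self-contained sketch, assuming a stand-in exception class rather than the real Lucene/elasticsearch types:

```java
public class DeleteLogging {
    // stand-in for org.apache.lucene.store.LockObtainFailedException
    static class LockObtainFailedException extends Exception {}

    // Expected contention on the shard lock -> debug; real failures -> warn.
    static String logLevelFor(Exception ex) {
        return ex instanceof LockObtainFailedException ? "debug" : "warn";
    }

    public static void main(String[] args) {
        System.out.println(logLevelFor(new LockObtainFailedException())); // debug
        System.out.println(logLevelFor(new RuntimeException()));          // warn
    }
}
```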
Left two comments about log levels. LGTM otherwise. Agreed we need to push it and continue from here.
LGTM!
LGTM, I love the move to in memory locks.
…ications Today it's possible that the data directory for a single shard is used by more than one IndexShard->Store instance. When one shard is already closed but still has a concurrent recovery running while a new shard is being created, engine files can conflict and data can potentially be lost. We also remove shard data without checking if there are still users of the files or if files are still open, which can cause pending writes / flushes or the delete operation to fail. If the latter is the case the index might be treated as a dangling index and is brought back to life at a later point in time. This commit introduces a shard level lock that prevents modifications to the shard data while it's still in use. Locks are created per shard and maintained in NodeEnvironment.java. In contrast to most Java concurrency primitives those locks are not reentrant. This commit also adds infrastructure that checks if all shard locks are released after tests.
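The non-reentrancy the commit message calls out can be demonstrated with a `Semaphore(1)`, which (unlike `ReentrantLock`) refuses a second acquire even from the owning thread. This is a minimal sketch of that property, not the actual shard lock in NodeEnvironment (class and method names are illustrative):

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class NonReentrantShardLock {
    // A single-permit semaphore behaves like a non-reentrant mutex: once the
    // permit is taken, every further tryAcquire fails until release(), even
    // when it is the same thread asking.
    private final Semaphore mutex = new Semaphore(1);

    boolean tryLock(long timeoutMillis) throws InterruptedException {
        return mutex.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    void unlock() {
        mutex.release();
    }

    public static void main(String[] args) throws InterruptedException {
        NonReentrantShardLock lock = new NonReentrantShardLock();
        System.out.println(lock.tryLock(0));  // true: lock acquired
        System.out.println(lock.tryLock(10)); // false: not reentrant, same thread is refused
        lock.unlock();
    }
}
```

The practical consequence for callers is that a code path must never try to re-acquire a shard lock it already holds, or it will deadlock (or time out).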
thanks everybody for the intense & time consuming reviews; it's a pretty low level change and lots of places are involved.
elastic#8436 introduced shard level locks in order to prevent directory reuse during fast deletion & creation of indices. As part of the change, close listeners were introduced to delete the folders once all outstanding references were released. The change created race conditions causing shard folders not to be deleted (causing test failures due to left over corruption markers). This commit removes the listeners in favour of a simple timeout based solution to be used until a better listener based solution is ready (elastic#8608). Closes elastic#9009