Ensure shards are deleted under lock on close #8579

Merged
merged 1 commit into from Nov 21, 2014

Projects

None yet

3 participants

@s1monw
Contributor
s1monw commented Nov 20, 2014

Today there is a race condition between the actual deletion of
the shard and the release of the lock in the store. This race can cause
rare imports of dangeling indices if the cluster state update loop
tires to import the dangeling index in that particular windonw. This commit
adds more safety to the import of dangeling indices and removes the race
condition by holding on to the lock on store closing while the listener
is notified.

@dakrone dakrone commented on the diff Nov 21, 2014
...org/elasticsearch/indices/InternalIndicesService.java
@@ -357,7 +358,8 @@ public void onAllShardsClosed(Index index, List<Throwable> failures) {
@Override
public void onShardClosed(ShardId shardId) {
try {
- nodeEnv.deleteShardDirectorySafe(shardId);
+ // this is called under the shard lock - we can safely delete it
@dakrone
dakrone Nov 21, 2014 Member

Can you add that this method is called while the shard lock is held to the javadocs for IndexCloseListener.onShardClosed? Currently it says "Invoked once the last resource using the given shard ID is released" which isn't accurate since the lock is still held

@dakrone dakrone and 1 other commented on an outdated diff Nov 21, 2014
...h/gateway/local/state/meta/LocalGatewayMetaState.java
@@ -287,6 +288,19 @@ public void clusterChanged(ClusterChangedEvent event) {
}
final IndexMetaData indexMetaData = loadIndexState(indexName);
if (indexMetaData != null) {
+ final Index index = new Index(indexName);
+ try { // the index deletion might not have worked due to shards still being locked
+ final List<ShardLock> shardLocks = nodeEnv.lockAllForIndex(index);
+ if (shardLocks.isEmpty()) {
+ // no shards - try to remove the directory
+ nodeEnv.deleteIndexDirectorySafe(index);
@dakrone
dakrone Nov 21, 2014 Member

I'm not sure I follow here, if there are no locks for an index and it's not part of the index metadata, isn't it supposed to be a candidate for auto-import?

Or, will there always be locks in the shardLocks list if the candidate is a dangling index we should import?

@s1monw
s1monw Nov 21, 2014 Contributor

I will add a comment to the code

@dakrone
Member
dakrone commented Nov 21, 2014

@s1monw left a couple of comments

@s1monw
Contributor
s1monw commented Nov 21, 2014

@dakrone updated based on your comments

@dakrone
Member
dakrone commented Nov 21, 2014

LGTM, thanks for adding the documentation

@s1monw s1monw [CORE] Ensure shards are deleted under lock on close
Today there is a race condition between the actual deletion of
the shard and the release of the lock in the store. This race can cause
rare imports of dangeling indices if the cluster state update loop
tires to import the dangeling index in that particular windonw. This commit
adds more safety to the import of dangeling indices and removes the race
condition by holding on to the lock on store closing while the listener
is notified.
a6e6c4e
@s1monw s1monw merged commit a6e6c4e into elastic:master Nov 21, 2014
@clintongormley clintongormley changed the title from [CORE] Ensure shards are deleted under lock on close to Ensure shards are deleted under lock on close Jun 6, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment