Primary relocation handoff #15900

Merged
merged 2 commits into from Feb 2, 2016

@ywelsch
Contributor
ywelsch commented Jan 11, 2016 edited

When primary relocation completes, a cluster state is propagated that deactivates the old primary and marks the new primary as active. As cluster state changes are not applied synchronously on all nodes, there can be a time interval where the relocation target has processed the cluster state and believes itself to be the active primary while the relocation source has not yet processed the cluster state update and still believes itself to be the active primary. This PR ensures that, before completing the relocation, the relocation source deactivates writing to its store and delegates requests to the relocation target.

The change is motivated as follows:

  1. We need to ensure that we only start writing data into the new primary once all the writes into the old primary have been completely replicated (among others to the new primary). This ensures that the new primary operates on the proper document version numbers. Document versions are increased when writing to the primary and then used on the replica to make sure that newer documents are not overwritten by older documents (in the presence of concurrent replication). A scenario for this would be: Write a document with id "K" and value "X" to the old primary (gets version 1) and replicate it to the new primary as well as the replica. Assume that another document with id "K" but value "Y" is written on the new primary before the new primary gets the replicated write of "K" with value "X". Unaware of the other write, the new primary will then assign the same version number (namely 1) to the document with value "Y" and replicate it to the replica. Depending on the order in which replicated writes from old and new primary arrive at the replica, it will then either store "X" or "Y", which means that the new primary and the replica can become out of sync.

  2. We have to ensure that no new writes are done on the old primary once we start writing into the new primary. This helps with the following scenario. Assume primary relocation completes and the master broadcasts a cluster state which now only contains the new primary. Due to the distributed nature of Elasticsearch, cluster states are not applied in full synchrony on all nodes. For a brief moment, nodes in the cluster have a different view of which node is the primary. In particular, it's possible that the node holding the old primary (node A) still believes itself to be the primary whereas the node holding the new primary (node B) believes itself to be the primary as well. If we send a document to node B, it will get indexed into the new primary and acknowledged (but won't exist on the old primary). If we then issue a delete request for the same document to node A (which can happen if we send requests round-robin to nodes), then that node will not find the document in its old primary and fail the request.
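
The version collision in scenario 1 can be reproduced with a small stand-alone sketch. Everything here is hypothetical (`VersionRaceSketch`, `Doc`, `Replica`, `replay`); it is not Elasticsearch code, and it assumes a replica that keeps an incoming write only when its version is strictly newer than the stored one. Under that assumption, two writes that both carry version 1 leave the replica with whichever arrived first:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in Elasticsearch.
final class VersionRaceSketch {

    /** A document value together with the version assigned by a primary. */
    static final class Doc {
        final String value;
        final long version;

        Doc(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    /** Replica that accepts a write only if its version is strictly newer (assumed rule). */
    static final class Replica {
        private final Map<String, Doc> docs = new HashMap<>();

        void apply(String id, Doc incoming) {
            Doc current = docs.get(id);
            if (current == null || incoming.version > current.version) {
                docs.put(id, incoming);
            }
        }

        String valueOf(String id) {
            Doc doc = docs.get(id);
            return doc == null ? null : doc.value;
        }
    }

    /** Applies two writes for id "K" in the given order and returns what the replica stores. */
    static String replay(Doc first, Doc second) {
        Replica replica = new Replica();
        replica.apply("K", first);
        replica.apply("K", second);
        return replica.valueOf("K");
    }

    public static void main(String[] args) {
        Doc fromOldPrimary = new Doc("X", 1); // indexed on the old primary
        Doc fromNewPrimary = new Doc("Y", 1); // indexed on the new primary before "X" arrived there
        System.out.println(replay(fromOldPrimary, fromNewPrimary)); // X
        System.out.println(replay(fromNewPrimary, fromOldPrimary)); // Y
    }
}
```

The new primary itself holds "Y" in both cases, so the first arrival order leaves primary and replica out of sync.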

This PR (in combination with #19013) implements the following solution:

Before completing the relocation, node A (holding the primary relocation source) deactivates writing to its shard copy (and temporarily puts all new incoming requests for that shard into a queue), then waits for all ongoing operations to be fully replicated. Once that is done, it delegates all new incoming requests to node B (holding the new primary) and also sends all the elements in the queue there. It uses a special action to delegate requests to node B, which bypasses the standard reroute phase when accepting requests, as standard rerouting is based on the current cluster state on the node. At that moment, indexing requests that go directly to node B will still be rerouted back to node A, which holds the old primary. This means that node A is still in charge of indexing, but will use the physical shard copy on node B to do so. Node B finally asks the master to activate the new primary (finish the relocation). The master then broadcasts a new cluster state where the old primary on node A is removed and the new primary on node B is active. It doesn't matter now in which order the cluster state is applied on nodes A and B:

  1. If the cluster state is first applied on node B, both nodes will send their index requests to the shard copy that is on node B.
  2. If the cluster state is first applied on node A, requests to node A will be rerouted to node B and requests to node B will be rerouted to node A. To prevent redirect loops during the time period where cluster states on node A and node B differ, #16274 makes requests that are coming from node A wait on node B until B has processed the cluster state where relocation is completed.
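
The "stop new writes, drain in-flight operations, then delegate" step on node A can be sketched in a single process as follows. This is only an illustration under assumptions: `HandoffSketch`, `write` and `handOff` are made-up names, a counting `java.util.concurrent.Semaphore` stands in for whatever mechanism the PR uses, and two in-memory lists stand in for the local shard copy and for the transport call to node B.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Semaphore;

// Hypothetical sketch; not the PR's implementation.
final class HandoffSketch {
    private static final int ALL_PERMITS = Integer.MAX_VALUE;

    // Each in-flight write holds one permit; the handoff needs all of them.
    private final Semaphore permits = new Semaphore(ALL_PERMITS);
    private volatile boolean relocated = false;
    private final List<String> indexedLocally = Collections.synchronizedList(new ArrayList<>());
    private final List<String> delegatedToTarget = Collections.synchronizedList(new ArrayList<>());

    /** Handles one write: index locally before the handoff, delegate afterwards. */
    void write(String doc) {
        permits.acquireUninterruptibly(); // waits here while a handoff is draining
        try {
            if (relocated) {
                delegatedToTarget.add(doc); // stands in for sending the request to node B
            } else {
                indexedLocally.add(doc);    // stands in for indexing into the local shard copy
            }
        } finally {
            permits.release();
        }
    }

    /** Blocks new writes, waits for in-flight ones to finish, then flips the flag. */
    void handOff() {
        permits.acquireUninterruptibly(ALL_PERMITS);
        try {
            relocated = true;
        } finally {
            permits.release(ALL_PERMITS);
        }
    }

    List<String> localWrites() {
        return new ArrayList<>(indexedLocally);
    }

    List<String> delegatedWrites() {
        return new ArrayList<>(delegatedToTarget);
    }

    public static void main(String[] args) {
        HandoffSketch shard = new HandoffSketch();
        shard.write("doc-1");
        shard.handOff();
        shard.write("doc-2");
        System.out.println("local: " + shard.localWrites());
        System.out.println("delegated: " + shard.delegatedWrites());
    }
}
```

Writes that arrive while `handOff` holds all permits simply wait and are delegated once it returns, which plays the role of the queue mentioned above.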

supersedes #15532

@bleskes bleskes was assigned by ywelsch Jan 11, 2016
@bleskes
Member
bleskes commented Jan 11, 2016

Thanks @ywelsch . Can we open a PR with just the operation counter, extracted to its own Java file? We can work on the primary relocation afterwards. Also I would love it if we can keep the counter naming to the universe out of IndexShard (for now). Let's try to keep the change small.

@bleskes bleskes commented on an outdated diff Jan 21, 2016
...n/support/replication/TransportReplicationAction.java
@@ -995,18 +995,17 @@ protected boolean shouldExecuteReplication(Settings settings) {
static class IndexShardReference implements Releasable {
- final private IndexShard counter;
+ private final Releasable operationLock;
@bleskes
bleskes Jan 21, 2016 Member

can we keep the shard reference? I know it's not needed here - but it serves as a proxy to the shard allowing to do more things - see https://github.com/elastic/elasticsearch/pull/15485/files#diff-a8aefbf42f29dc0fcc7c0a144863948eR1104

@bleskes bleskes commented on an outdated diff Jan 21, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
@@ -388,6 +388,16 @@ public void updateRoutingEntry(final ShardRouting newRouting, final boolean pers
}
}
}
+
+ if (state == IndexShardState.RELOCATED && newRouting.relocating() == false) {
@bleskes
bleskes Jan 21, 2016 Member

this is tricky - it potentially leaves us in a scenario where there are two active primaries - the target and the source. I don't have a clean solution for this. My suggestion is to fail the shard and let the master promote another replica.

@bleskes bleskes commented on an outdated diff Jan 21, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
}
} else {
// for replicas, we allow to write also while recovering, since we index also during recovery to replicas
// and rely on version checks to make sure its consistent
- if (state != IndexShardState.STARTED && state != IndexShardState.RELOCATED && state != IndexShardState.RECOVERING && state != IndexShardState.POST_RECOVERY) {
+ if (state != IndexShardState.STARTED && state != IndexShardState.RECOVERING && state != IndexShardState.POST_RECOVERY) {
@bleskes
bleskes Jan 21, 2016 Member

why is relocated removed here?

@bleskes bleskes commented on an outdated diff Jan 21, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
}
-
- @Override
- protected void alreadyClosed() {
- throw new IndexShardClosedException(shardId, "could not increment operation counter. shard is closed.");
+ // <-- position X
+ Releasable releasable;
+ try {
+ releasable = suspendableRefContainer.tryAcquire();
+ } catch (InterruptedException e) {
+ logger.warn("operation lock was interrupted");
@bleskes
bleskes Jan 21, 2016 Member

can you add the exception?

@bleskes bleskes commented on an outdated diff Jan 21, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
}
-
- @Override
- protected void alreadyClosed() {
- throw new IndexShardClosedException(shardId, "could not increment operation counter. shard is closed.");
+ // <-- position X
+ Releasable releasable;
+ try {
+ releasable = suspendableRefContainer.tryAcquire();
+ } catch (InterruptedException e) {
+ logger.warn("operation lock was interrupted");
+ Thread.currentThread().interrupt(); // Mark thread as interrupted again
@bleskes
bleskes Jan 21, 2016 Member

we throw the exception and thus take care of the interrupt. We don't need to set it...

@bleskes bleskes commented on an outdated diff Jan 21, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
}
-
- @Override
- protected void alreadyClosed() {
- throw new IndexShardClosedException(shardId, "could not increment operation counter. shard is closed.");
+ // <-- position X
+ Releasable releasable;
+ try {
+ releasable = suspendableRefContainer.tryAcquire();
+ } catch (InterruptedException e) {
+ logger.warn("operation lock was interrupted");
+ Thread.currentThread().interrupt(); // Mark thread as interrupted again
+ throw new IllegalIndexShardStateException(shardId, state, "operation lock was interrupted");
@bleskes
bleskes Jan 21, 2016 Member

can you add the suppressed interrupted exception? Also, I'm not sure about using IllegalIndexShardStateException here. It's ignored by the replication logic, assuming the shard is not yet started or has finished relocating and was shut down. Interruption is a bigger problem which should never happen?
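
One way to keep the original interrupt visible when converting it into an unchecked exception is to pass it along as the cause. The sketch below assumes a hypothetical `OperationInterruptedException` and a plain `Semaphore` in place of the shard's operation lock; whether the interrupt flag should also be restored is the open question in the comments above, so the sketch leaves that out.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch; the exception type and method name are made up for illustration.
final class InterruptSketch {

    /** Unchecked wrapper that keeps the InterruptedException as its cause. */
    static final class OperationInterruptedException extends RuntimeException {
        OperationInterruptedException(String message, InterruptedException cause) {
            super(message, cause);
        }
    }

    /** Acquires one permit, converting an interrupt into an unchecked exception with the cause attached. */
    static void acquire(Semaphore permits) {
        try {
            permits.acquire();
        } catch (InterruptedException e) {
            // The cause travels with the exception, so whoever logs or handles it sees the interrupt.
            throw new OperationInterruptedException("operation lock was interrupted", e);
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt(); // simulate an interrupt arriving before the acquire
        try {
            acquire(new Semaphore(1));
        } catch (OperationInterruptedException e) {
            System.out.println(e.getMessage() + ", cause: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```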

@ywelsch
Contributor
ywelsch commented Jan 25, 2016

@bleskes I've updated the PR according to our discussion and considered the following 4 scenarios:

  • source node and target node on cluster state before relocation target is marked as started and shard on source node not yet marked as RELOCATED. This means that source node knows it is active primary but not yet relocated and target node knows it is primary and relocation target. Index requests to source node are indexed on source node and replicated to target node. Index requests to target node are rerouted to source node.
  • source node and target node on cluster state before relocation target is marked as started and shard on source node marked as RELOCATED. This means that source node knows it is relocated and target node knows it is a primary relocation target. Index requests to source node are sent in primary phase to target node and replicated back to source node. Index requests to target are rerouted back to source node.
  • source node on cluster state before shard of relocation target is marked as started and target node on cluster state with shard marked as started. This means that source node knows it is still active primary, but its shard has been marked as RELOCATED. Target node knows it is active primary as well. Index requests to source node are sent in primary phase to target node but not replicated back to source node. Index requests to target node are indexed directly on target node and not replicated to source node.
  • source node on cluster state where relocation target is marked as started and target node on cluster state where it is not yet started. This means that source node has closed its shard. Requests to source node are rerouted to target node. Requests to target node are rerouted to source node, which reroutes them to target node, and so on back and forth. This is addressed by a subsequent patch.
@bleskes bleskes commented on an outdated diff Jan 28, 2016
...n/support/replication/TransportReplicationAction.java
try {
- indexShardReference = getIndexShardOperationsCounter(shardId);
- Tuple<Response, ReplicaRequest> primaryResponse = shardOperationOnPrimary(state.metaData(), request);
- if (logger.isTraceEnabled()) {
- logger.trace("action [{}] completed on shard [{}] for request [{}] with cluster state version [{}]", transportPrimaryAction, shardId, request, state.version());
+ indexShardReference = getIndexShardOperationsCounter(shardId, true);
@bleskes
bleskes Jan 28, 2016 Member

Can we call the method getIndexShardReference? Also I like explicit naming here better than a primary boolean. See seq_no branch for an example

@bleskes bleskes and 1 other commented on an outdated diff Jan 28, 2016
...n/support/replication/TransportReplicationAction.java
@@ -617,7 +639,44 @@ protected void doRun() throws Exception {
finishAsFailed(e);
return;
}
- finishAndMoveToReplication(replicationPhase);
+
@bleskes
bleskes Jan 28, 2016 Member

This method is becoming a bit of a monster. How about simplifying it like this?

https://gist.github.com/bleskes/84c46a548d2782e6064a

Note that I removed the non-existent nodes check - I really wonder when this can happen these days. Also, I'd rather take the shardrouting from the indexShard and know it's consistent with the IndexShard#relocated()

@ywelsch
ywelsch Jan 28, 2016 Contributor

I have added your changes but kept the exception type of AbstractRunnable.doRun() as before (Exception, not Throwable).

@bleskes bleskes and 1 other commented on an outdated diff Jan 28, 2016
...n/support/replication/TransportReplicationAction.java
@@ -765,7 +824,7 @@ public ReplicationPhase(ReplicaRequest replicaRequest, Response finalResponse, S
if (shard.currentNodeId().equals(nodes.localNodeId()) == false) {
numberOfPendingShardInstances++;
}
- if (shard.relocating()) {
+ if (shard.relocating() && shard.relocatingNodeId().equals(nodes.localNodeId()) == false) {
@bleskes
bleskes Jan 28, 2016 Member

can we change this block to use the same flow as the doRun (it's equivalent, but I have to double check every time)

    if (shard.primary() == false && executeOnReplica == false) {
        numberOfIgnoredShardInstances++;
        continue;
    }
    if (shard.unassigned()) {
        numberOfIgnoredShardInstances++;
        continue;
    }
    if (nodes.localNodeId().equals(shard.currentNodeId()) == false) {
        numberOfPendingShardInstances++;
    }
    if (shard.relocating() && shard.relocatingNodeId().equals(nodes.localNodeId()) == false) {
        numberOfPendingShardInstances++;
    }
@ywelsch
ywelsch Jan 28, 2016 Contributor

done.

alternatively, the list of shards to replicate to could be built in the constructor.
We would then only iterate over the list here.
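
The alternative mentioned here, computing the replication targets once instead of repeating the checks, could look roughly like the following. This is a sketch with made-up types: `Shard` is a stripped-down stand-in for a shard routing entry (a `null` current node means unassigned, a `null` relocating node means not relocating), and `targets` mirrors the four checks quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; Shard is not the real ShardRouting class.
final class ReplicationTargetsSketch {

    static final class Shard {
        final boolean primary;
        final String currentNodeId;    // null when the shard is unassigned
        final String relocatingNodeId; // null when the shard is not relocating

        Shard(boolean primary, String currentNodeId, String relocatingNodeId) {
            this.primary = primary;
            this.currentNodeId = currentNodeId;
            this.relocatingNodeId = relocatingNodeId;
        }
    }

    /** Returns the node ids an operation executed on localNodeId still has to be sent to. */
    static List<String> targets(List<Shard> shards, String localNodeId, boolean executeOnReplica) {
        List<String> targets = new ArrayList<>();
        for (Shard shard : shards) {
            if (!shard.primary && !executeOnReplica) {
                continue; // replicas are ignored for this action
            }
            if (shard.currentNodeId == null) {
                continue; // unassigned copy, nothing to send
            }
            if (!localNodeId.equals(shard.currentNodeId)) {
                targets.add(shard.currentNodeId);
            }
            // The local node can itself be the relocation target of the primary; skip it then.
            if (shard.relocatingNodeId != null && !shard.relocatingNodeId.equals(localNodeId)) {
                targets.add(shard.relocatingNodeId);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<Shard> shards = Arrays.asList(
                new Shard(true, "A", "B"),    // primary on A, relocating to B
                new Shard(false, "C", null)); // replica on C
        System.out.println(targets(shards, "B", true)); // [A, C]
        System.out.println(targets(shards, "A", true)); // [B, C]
    }
}
```

With node B as the local node, the operation is replicated back to the relocation source A and to the replica on C, but not to B itself.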

@bleskes bleskes commented on an outdated diff Jan 28, 2016
...n/support/replication/TransportReplicationAction.java
@@ -840,7 +899,7 @@ protected void doRun() {
performOnReplica(shard);
}
// send operation to relocating shard
- if (shard.relocating()) {
+ if (shard.relocating() && shard.relocatingNodeId().equals(nodes.localNodeId()) == false) {
@bleskes
bleskes Jan 28, 2016 Member

can we also change the comment to say that the local shard can be a relocation target of the primary ?

@bleskes bleskes and 2 others commented on an outdated diff Jan 28, 2016
...h/common/util/concurrent/SuspendableRefContainer.java
+ */
+
+package org.elasticsearch.common.util.concurrent;
+
+import org.elasticsearch.common.lease.Releasable;
+
+import java.util.concurrent.Semaphore;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Container that represents a resource with reference counting capabilities. Provides operations to suspend acquisition of new references.
+ * This is useful for resource management when resources are intermittently unavailable.
+ *
+ * Assumes less than Integer.MAX_VALUE references are concurrently being held at one point in time.
+ */
+public class SuspendableRefContainer {
@bleskes
bleskes Jan 28, 2016 Member

Did you change anything here from the dedicated PR? (we should have committed the other one, so it will be clear)

@ywelsch
ywelsch Jan 28, 2016 Contributor

I added the method acquireUninterruptibly. It's in the second commit ;-) As said before, I prefer to commit data structures when we know how to use them.

@bleskes
bleskes Jan 28, 2016 Member

Thanks. I'm with you, but this is just a big change; I would have preferred to get the counter in to mimic the old behavior first and then build on it (and change it) here. No big one - water under the bridge :)

@s1monw
s1monw Feb 1, 2016 Contributor

this class can be final no?

@bleskes bleskes commented on the diff Jan 28, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
@@ -408,8 +418,17 @@ public IndexShard relocated(String reason) throws IndexShardNotStartedException
if (state != IndexShardState.STARTED) {
@bleskes
bleskes Jan 28, 2016 Member

we can remove the mutex here, but tbh this whole check can go away - It's just an extra safety mechanism. We can keep it simple

@ywelsch
ywelsch Jan 28, 2016 Contributor

removed it.

@bleskes
bleskes Jan 30, 2016 Member

For future readers - this comment refers to another iteration - github got confused. things look good to me now.

@bleskes bleskes and 1 other commented on an outdated diff Jan 28, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
} else {
- // for replicas, we allow to write also while recovering, since we index also during recovery to replicas
- // and rely on version checks to make sure its consistent
- if (state != IndexShardState.STARTED && state != IndexShardState.RELOCATED && state != IndexShardState.RECOVERING && state != IndexShardState.POST_RECOVERY) {
- throw new IllegalIndexShardStateException(shardId, state, "operation only allowed when started/recovering, origin [" + origin + "]");
+ assert origin == Engine.Operation.Origin.REPLICA;
+
+ // replication is also allowed while recovering, since we index also during recovery to replicas and rely on version checks to make sure its consistent
+ // a relocated shard can also be target of a replication if the relocation target has not been marked as active yet and is syncing it's changes back to the relocation source
+ if (state != IndexShardState.STARTED && state != IndexShardState.RECOVERING && state != IndexShardState.POST_RECOVERY && state != IndexShardState.RELOCATED ) {
@bleskes
bleskes Jan 28, 2016 Member

can we make this an EnumSet called writeAllowedStateForReplica (and the same for primary)? It's getting out of hand :) also please write them in order recovering, post_recovery, started, relocated

@ywelsch
ywelsch Jan 28, 2016 Contributor

done
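
For readers following along, the EnumSet suggestion amounts to something like the sketch below. `ShardState` is a local stand-in enum, not the real `IndexShardState`, and the set members are the four replica states from the condition in the diff above; an analogous set for the primary is not spelled out in this thread, so it is left out.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; ShardState stands in for the real shard state enum.
final class WriteAllowedSketch {

    enum ShardState { CREATED, RECOVERING, POST_RECOVERY, STARTED, RELOCATED, CLOSED }

    // States in which a replica may accept writes, listed in the order requested in the review.
    static final Set<ShardState> WRITE_ALLOWED_STATES_FOR_REPLICA = Collections.unmodifiableSet(
            EnumSet.of(ShardState.RECOVERING, ShardState.POST_RECOVERY, ShardState.STARTED, ShardState.RELOCATED));

    /** Replaces the chain of state != X && state != Y comparisons with one membership test. */
    static boolean replicaWriteAllowed(ShardState state) {
        return WRITE_ALLOWED_STATES_FOR_REPLICA.contains(state);
    }

    public static void main(String[] args) {
        for (ShardState state : ShardState.values()) {
            System.out.println(state + " -> " + replicaWriteAllowed(state));
        }
    }
}
```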

@bleskes bleskes and 1 other commented on an outdated diff Jan 28, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
-
- public IndexShardOperationCounter(ESLogger logger, ShardId shardId) {
- super("index-shard-operations-counter");
- this.logger = logger;
- this.shardId = shardId;
- }
-
- @Override
- protected void closeInternal() {
- logger.debug("operations counter reached 0, will not accept any further writes");
- }
-
- @Override
- protected void alreadyClosed() {
- throw new IndexShardClosedException(shardId, "could not increment operation counter. shard is closed.");
+ public Releasable newPrimaryPhaseOperationLock() {
@bleskes
bleskes Jan 28, 2016 Member

we don't really use the term phases here. how about acquirePrimaryOperationLock?

@ywelsch
ywelsch Jan 28, 2016 Contributor

done

@bleskes bleskes commented on an outdated diff Jan 28, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
}
- public void incrementOperationCounter() {
- indexShardOperationCounter.incRef();
- }
-
- public void decrementOperationCounter() {
- indexShardOperationCounter.decRef();
+ public Releasable newReplicationPhaseOperationLock() {
@bleskes
bleskes Jan 28, 2016 Member

acquireReplicaOperationLock

@bleskes bleskes and 1 other commented on an outdated diff Jan 28, 2016
...earch/indices/cluster/IndicesClusterStateService.java
@@ -493,7 +494,12 @@ private void applyNewOrUpdatedShards(final ClusterChangedEvent event) {
// shadow replicas do not support primary promotion. The master would reinitialize the shard, giving it a new allocation, meaning we should be there.
assert (shardRouting.primary() && currentRoutingEntry.primary() == false) == false || indexShard.allowsPrimaryPromotion() :
"shard for doesn't support primary promotion but master promoted it with changing allocation. New routing " + shardRouting + ", current routing " + currentRoutingEntry;
- indexShard.updateRoutingEntry(shardRouting, event.state().blocks().disableStatePersistence() == false);
+ try {
+ indexShard.updateRoutingEntry(shardRouting, event.state().blocks().disableStatePersistence() == false);
+ } catch (IndexShardRelocatedException e) {
@bleskes
bleskes Jan 28, 2016 Member

I think we can fail the shard on any failure here. failAndRemoveShard already does logging under WARN

@ywelsch
ywelsch Jan 28, 2016 Contributor

agree.

In the future, we should also fail the shard if the shardstatemetadata cannot be written. Data without metadata is useless ;-)

@bleskes bleskes and 1 other commented on an outdated diff Jan 28, 2016
...ticsearch/indices/recovery/RecoverySourceHandler.java
@@ -408,7 +407,11 @@ public void run() throws InterruptedException {
}
stopWatch.stop();
logger.trace("[{}][{}] finalizing recovery to {}: took [{}]",
- indexName, shardId, request.targetNode(), stopWatch.totalTime());
+ indexName, shardId, request.targetNode(), stopWatch.totalTime());
+ }
+
+ protected boolean isPrimaryRelocation() {
+ return request.recoveryType() == RecoveryState.Type.RELOCATION && shard.routingEntry().primary();
@bleskes
bleskes Jan 28, 2016 Member

the shard.routingEntry is always primary. How about renaming the RecoveryState.Type to PRIMARY_RELOCATION and just check that?

@ywelsch
ywelsch Jan 28, 2016 Contributor

I like this change a lot! done

@bleskes bleskes commented on an outdated diff Jan 28, 2016
...port/replication/TransportReplicationActionTests.java
+ PlainActionFuture<Response> listener = new PlainActionFuture<>();
+ TransportReplicationAction.PrimaryPhase primaryPhase = action.new PrimaryPhase(request, createTransportChannel(listener));
+ isRelocated.set(true);
+ primaryPhase.run();
+ assertThat("request was processed on primary", request.processedOnPrimary.get(), equalTo(false));
+ final String relocatingNodeId = clusterService.state().getRoutingTable().shardRoutingTable(index, shardId.id()).primaryShard().relocatingNodeId();
+ final List<CapturingTransport.CapturedRequest> requests = transport.capturedRequestsByTargetNode().get(relocatingNodeId);
+ assertThat(requests, notNullValue());
+ assertThat(requests.size(), equalTo(1));
+ assertThat("primary request was not sent", requests.get(0).action, equalTo("testAction[p]"));
+ }
+
+ public void testPrimaryPhaseExecutesDelegatedRequest() throws InterruptedException, ExecutionException {
+ final String index = "test";
+ final ShardId shardId = new ShardId(index, 0);
+ ClusterState state = state(index, true, ShardRoutingState.RELOCATING, ShardRoutingState.STARTED);
@bleskes
bleskes Jan 28, 2016 Member

can we add some randomization to the replica states?

@bleskes bleskes and 1 other commented on an outdated diff Jan 28, 2016
...port/replication/TransportReplicationActionTests.java
+ state = ClusterState.builder(state).nodes(DiscoveryNodes.builder(state.nodes()).localNodeId(primaryTargetNodeId)).build();
+ clusterService.setState(state);
+ Request request = new Request(shardId).timeout("1ms");
+ PlainActionFuture<Response> listener = new PlainActionFuture<>();
+ TransportReplicationAction.PrimaryPhase primaryPhase = action.new PrimaryPhase(request, createTransportChannel(listener));
+ primaryPhase.run();
+ assertThat("request was not processed on primary", request.processedOnPrimary.get(), equalTo(true));
+
+ // check that normal replication to replicas works
+ List<CapturingTransport.CapturedRequest> requests = transport.capturedRequestsByTargetNode().get(replicaNodeId);
+ assertThat(requests, notNullValue());
+ assertThat(requests.size(), equalTo(1));
+ assertThat("replica request was not sent to replica", requests.get(0).action, equalTo("testAction[r]"));
+
+ // check that we also replicate back to primary relocation source
+ requests = transport.capturedRequestsByTargetNode().get(primarySourceNodeId);
@bleskes
bleskes Jan 28, 2016 Member

this should be part of the tests for the replication phase - with random cluster state and all..

@ywelsch
ywelsch Jan 28, 2016 Contributor

done

@bleskes bleskes commented on an outdated diff Jan 28, 2016
...va/org/elasticsearch/index/shard/IndexShardTests.java
+ Settings.builder().put("index.number_of_shards", 1).put("index.number_of_replicas", 0)
+ ).get());
+ ensureGreen();
+ IndicesService indicesService = getInstanceFromNode(IndicesService.class);
+ IndexService test = indicesService.indexService("test");
+ final IndexShard shard = test.getShardOrNull(0);
+ CountDownLatch latch = new CountDownLatch(1);
+ Thread recoveryThread = new Thread(() -> {
+ latch.countDown();
+ shard.relocated("simulated recovery");
+ });
+
+ try (Releasable ignored = shard.newPrimaryPhaseOperationLock()) {
+ recoveryThread.start();
+ latch.await();
+ assertThat(shard.state(), equalTo(IndexShardState.STARTED));
@bleskes
bleskes Jan 28, 2016 Member

can we add some comments like // shard should be relocated until we are done?

@bleskes bleskes commented on an outdated diff Jan 28, 2016
...va/org/elasticsearch/index/shard/IndexShardTests.java
+ throw new RuntimeException(e);
+ }
+ }
+ };
+ indexThreads[i].start();
+ }
+ AtomicBoolean success = new AtomicBoolean();
+ final Thread recoveryThread = new Thread(() -> {
+ shard.relocated("simulated recovery");
+ success.set(true);
+ });
+ recoveryThread.start();
+ // ensure we block for pending operations to complete
+ assertBusy(() -> {
+ assertThat(success.get(), equalTo(false));
+ assertThat(shard.getActiveOperationsCount(), greaterThan(0));
@bleskes
bleskes Jan 28, 2016 Member

why do we need assertBusy here? I'd rather use two signaling methods - the one you have now and one signaling that some/all of the threads have acquired the ops lock. Note that I think you currently have a race condition where the recovery thread can sneak in first.

@bleskes bleskes commented on an outdated diff Jan 28, 2016
...va/org/elasticsearch/index/shard/IndexShardTests.java
+ success.set(true);
+ });
+ recoveryThread.start();
+ // ensure we block for pending operations to complete
+ assertBusy(() -> {
+ assertThat(success.get(), equalTo(false));
+ assertThat(shard.getActiveOperationsCount(), greaterThan(0));
+ });
+ // ensure we only transition to RELOCATED state after pending operations completed
+ assertBusy(() -> {
+ assertThat(shard.state(), equalTo(IndexShardState.STARTED));
+ });
+ // complete pending operations
+ barrier.await();
+ // ensure relocated successfully once pending operations are done
+ assertBusy(() -> {
@bleskes
bleskes Jan 28, 2016 Member

use recoveryThread.join()?
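
A latch-based version of this test flow, without polling, might look like the sketch below. It is an assumption-laden illustration: `LatchSignalingSketch` and `run` are made-up names, and a `Semaphore` plus an `AtomicBoolean` stand in for the shard's operation lock and its RELOCATED state. One latch signals that every worker holds the lock before the handoff thread starts, a second latch lets the workers finish, and `join()` replaces the final busy-wait.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the signaling pattern; not the actual IndexShardTests code.
final class LatchSignalingSketch {

    /** Returns { relocated while workers still hold the lock, relocated after join }. */
    static boolean[] run(int workers) {
        final Semaphore permits = new Semaphore(Integer.MAX_VALUE);
        final CountDownLatch allHoldLock = new CountDownLatch(workers);
        final CountDownLatch mayFinish = new CountDownLatch(1);
        final AtomicBoolean relocated = new AtomicBoolean();

        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                permits.acquireUninterruptibly();
                allHoldLock.countDown(); // signal: this worker holds the lock
                try {
                    mayFinish.await();   // keep the operation "in flight"
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    permits.release();
                }
            });
            threads[i].start();
        }
        try {
            allHoldLock.await(); // the handoff cannot sneak in before the workers
            Thread handoff = new Thread(() -> {
                permits.acquireUninterruptibly(Integer.MAX_VALUE); // waits for all workers
                relocated.set(true);
                permits.release(Integer.MAX_VALUE);
            });
            handoff.start();
            boolean whileHeld = relocated.get(); // workers still hold permits at this point
            mayFinish.countDown();
            handoff.join();
            for (Thread t : threads) {
                t.join();
            }
            return new boolean[] { whileHeld, relocated.get() };
        } catch (InterruptedException e) {
            mayFinish.countDown(); // let the workers exit before giving up
            Thread.currentThread().interrupt();
            throw new IllegalStateException("sketch was interrupted", e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = run(3);
        System.out.println("relocated while operations pending: " + result[0]);
        System.out.println("relocated after join: " + result[1]);
    }
}
```

The first check only shows that the flag has not flipped while permits are held; it does not by itself prove that the handoff thread has already reached the blocking call.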

@bleskes bleskes commented on an outdated diff Jan 28, 2016
...search/indices/recovery/IndexPrimaryRelocationIT.java
+import org.elasticsearch.test.disruption.MultiServiceDisruptionScheme;
+import org.elasticsearch.test.disruption.ServiceDisruptionScheme;
+import org.junit.Before;
+
+import java.util.concurrent.ExecutionException;
+
+import static org.elasticsearch.common.settings.Settings.settingsBuilder;
+import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked;
+import static org.hamcrest.Matchers.equalTo;
+import static org.hamcrest.Matchers.notNullValue;
+
+/**
+ * This class tests 3 scenarios for primary relocation (each a test method)
+ */
+@ClusterScope(scope = Scope.TEST, numDataNodes = 0)
+public class IndexPrimaryRelocationIT extends ESIntegTestCase {
@bleskes
bleskes Jan 28, 2016 Member

can we turn on debug logging for this one?

@bleskes bleskes commented on an outdated diff Jan 28, 2016
...search/indices/recovery/IndexPrimaryRelocationIT.java
+ @Before
+ public void setUp() throws Exception {
+ super.setUp();
+
+ logger.info("--> start master node");
+ internalCluster().startMasterOnlyNode(Settings.EMPTY);
+ logger.info("--> start node A");
+ nodeA = internalCluster().startDataOnlyNode(Settings.EMPTY);
+
+ createAndPopulateIndex(INDEX_NAME, 1, SHARD_COUNT, REPLICA_COUNT);
+
+ logger.info("--> start node B");
+ nodeB = internalCluster().startDataOnlyNode(Settings.EMPTY);
+ ensureGreen();
+
+ client().admin().cluster().prepareState().get().getState().getNodes().forEach(node -> {
@bleskes
bleskes Jan 28, 2016 Member

you can do : internalCluster().getInstance(ClusterService.class, nodeA).localNode().id();

@bleskes bleskes and 1 other commented on an outdated diff Jan 28, 2016
...search/indices/recovery/IndexPrimaryRelocationIT.java
+ *
+ * Scenario 1:
+ *
+ * source node and target node on cluster state before relocation target is marked as started and shard on source node not yet marked as
+ * RELOCATED. This means that source node knows it is active primary but not yet relocated and target node knows it is primary and
+ * relocation target. Index requests to source node are indexed on source node and replicated to target node. Index requests to target
+ * node are rerouted to source node.
+ *
+ * Scenario 2:
+ *
+ * source node and target node on cluster state before relocation target is marked as started and shard on source node marked as
+ * RELOCATED. This means that source node knows it is relocated and target node knows it is a primary relocation target. Index
+ * requests to source node are sent in primary phase to target node and replicated back to source node. Index requests to
+ * target are rerouted back to source node.
+ */
+ public void testPrimaryRecoveryBothNodesOnOldClusterState() throws Exception {
@bleskes
bleskes Jan 28, 2016 Member

can we replace these tests with a simpler-to-understand test, paying the price of things being less targeted? experience has shown that this type of test is very hard to maintain and often doesn't reproduce exactly what was intended anyway (because it's so hard)..

@ywelsch
ywelsch Jan 28, 2016 Contributor

done

@bleskes
Member
bleskes commented Jan 28, 2016

Thanks @ywelsch . this looks solid. I left some comments, mostly around simplifying things...

@ywelsch
Contributor
ywelsch commented Jan 28, 2016

@bleskes I've updated the PR again. Can you have another look? Reviewing this time might be easier by commenting directly on the newly added commits.

After all is done, I will put SuspendableRefContainer and its tests into a standalone commit.

@areek areek commented on the diff Jan 29, 2016
...n/support/replication/TransportReplicationAction.java
- logger.debug("failed to execute [{}] on [{}]", e, request, shardId);
- }
- }
- finishAsFailed(e);
- return;
+ ReplicationPhase replicationPhase = new ReplicationPhase(primaryResponse.v2(), primaryResponse.v1(), shardId, channel, indexShardReference);
+ finishAndMoveToReplication(replicationPhase);
+ } else {
+ // delegate primary phase to relocation target
+ // it is safe to execute primary phase on relocation target as there are no more in-flight operations where primary
+ // phase is executed on local shard and all subsequent operations are executed on relocation target as primary phase.
+ final ShardRouting primary = indexShardReference.routingEntry();
+ indexShardReference.close();
+ assert primary.relocating() : "indexShard is marked as relocated but routing isn't" + primary;
+ DiscoveryNode relocatingNode = state.nodes().get(primary.relocatingNodeId());
+ transportService.sendRequest(relocatingNode, transportPrimaryAction, request, transportOptions,
@areek
areek Jan 29, 2016 Contributor

++ I like how this turned out but doesn't it mean the primary phase retry on the relocating node would be non-local (from the source primary)?

@ywelsch
ywelsch Jan 29, 2016 Contributor

yes, exactly. As retry is handled by reroute phase and that one occurs only on source primary, retry can only be done there. This makes it (imo) easier to reason about the system, as less code paths are explored. Do you have any specific concerns with that? For example, a situation where this might hurt us?

@areek
areek Jan 29, 2016 Contributor

Agreed on the simplicity and think it is the right trade-off. I think delegating the primary phase to the relocation target and then having a retryable exception would be relatively rare, so the non-local retry is ok. @bleskes thoughts on this?

@bleskes
bleskes Jan 30, 2016 Member

++ on this. The ReroutePhase owns retrying across the board.

@areek areek and 1 other commented on an outdated diff Jan 29, 2016
...rch/test/disruption/MultiServiceDisruptionScheme.java
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.elasticsearch.test.disruption;
+
+import org.elasticsearch.common.unit.TimeValue;
+import org.elasticsearch.test.InternalTestCluster;
+
+public class MultiServiceDisruptionScheme implements ServiceDisruptionScheme {
@areek
areek Jan 29, 2016 Contributor

Was this intended? It's not used anywhere.

@ywelsch
ywelsch Jan 29, 2016 Contributor

oops, that was used in a test from an earlier revision. I will remove it.

@bleskes bleskes commented on an outdated diff Jan 30, 2016
...n/support/replication/TransportReplicationAction.java
@@ -701,10 +716,16 @@ void finishBecauseUnavailable(ShardId shardId, String message) {
}
}
- protected Releasable getIndexShardOperationsCounter(ShardId shardId) {
+ protected IndexShardReference getIndexShardReferenceOnPrimary(ShardId shardId) {
@bleskes
bleskes Jan 30, 2016 Member

nit: can we add some Javadoc about the life cycle of this reference (closed when the operation is done and replication has completed)?

@bleskes bleskes commented on the diff Jan 30, 2016
...n/support/replication/TransportReplicationAction.java
@@ -759,15 +780,17 @@ public ReplicationPhase(ReplicaRequest replicaRequest, Response finalResponse, S
for (ShardRouting shard : shards) {
if (shard.primary() == false && executeOnReplica == false) {
numberOfIgnoredShardInstances++;
- } else if (shard.unassigned()) {
+ continue;
+ }
+ if (shard.unassigned()) {
@bleskes
bleskes Jan 30, 2016 Member

Can we add a comment telling people that this mimics the logic in doRun and that they should look there for more information?

@bleskes bleskes commented on an outdated diff Jan 30, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
- private final EnumSet<IndexShardState> readAllowedStates = EnumSet.of(IndexShardState.STARTED, IndexShardState.RELOCATED, IndexShardState.POST_RECOVERY);
+ private static final EnumSet<IndexShardState> readAllowedStates = EnumSet.of(IndexShardState.STARTED, IndexShardState.RELOCATED, IndexShardState.POST_RECOVERY);
+ // for primaries, we only allow to write when actually started (so the cluster has decided we started)
+ // in case we have a relocation of a primary, we also allow to write after phase 2 completed, where the shard may be
+ // in state RECOVERING or POST_RECOVERY.
@bleskes
bleskes Jan 30, 2016 Member

I think it's also important to explain here why we don't allow writes on RELOCATED.

@bleskes bleskes commented on an outdated diff Jan 30, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
@@ -958,16 +977,17 @@ private void ensureWriteAllowed(Engine.Operation op) throws IllegalIndexShardSta
IndexShardState state = this.state; // one time volatile read
if (origin == Engine.Operation.Origin.PRIMARY) {
- // for primaries, we only allow to write when actually started (so the cluster has decided we started)
- // otherwise, we need to retry, we also want to still allow to index if we are relocated in case it fails
- if (state != IndexShardState.STARTED && state != IndexShardState.RELOCATED) {
+ if (writeAllowedStatesForPrimary.contains(state) == false) {
throw new IllegalIndexShardStateException(shardId, state, "operation only allowed when started/recovering, origin [" + origin + "]");
@bleskes
bleskes Jan 30, 2016 Member

we can just use the enumSet.toString here (assuming it gives nice output). This will make sure we are never out of sync and I think it's OK to have a technical message here.
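
A minimal sketch of the suggestion above: the error message is rendered from the `EnumSet` itself, so it cannot drift from the set of allowed states. The enum, the set contents, and the exception type are assumptions made for illustration (the set contents are inferred from the comments in the diff), not the code that was merged.

```java
import java.util.EnumSet;

// Hypothetical stand-in for the real shard state enum.
enum SketchShardState { CREATED, RECOVERING, POST_RECOVERY, STARTED, RELOCATED, CLOSED }

class WriteAllowedSketch {
    // Assumed contents, inferred from the comments in the diff above.
    static final EnumSet<SketchShardState> WRITE_ALLOWED_FOR_PRIMARY =
        EnumSet.of(SketchShardState.RECOVERING, SketchShardState.POST_RECOVERY, SketchShardState.STARTED);

    static void ensurePrimaryWriteAllowed(SketchShardState state) {
        if (WRITE_ALLOWED_FOR_PRIMARY.contains(state) == false) {
            // EnumSet.toString() lists the members in declaration order,
            // e.g. [RECOVERING, POST_RECOVERY, STARTED].
            throw new IllegalStateException(
                "current state [" + state + "], operation only allowed in states " + WRITE_ALLOWED_FOR_PRIMARY);
        }
    }

    public static void main(String[] args) {
        ensurePrimaryWriteAllowed(SketchShardState.STARTED); // allowed, no exception
        try {
            ensurePrimaryWriteAllowed(SketchShardState.RELOCATED);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```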

@bleskes bleskes commented on an outdated diff Jan 30, 2016
...in/java/org/elasticsearch/index/shard/IndexShard.java
} else {
- // for replicas, we allow to write also while recovering, since we index also during recovery to replicas
- // and rely on version checks to make sure its consistent
- if (state != IndexShardState.STARTED && state != IndexShardState.RELOCATED && state != IndexShardState.RECOVERING && state != IndexShardState.POST_RECOVERY) {
- throw new IllegalIndexShardStateException(shardId, state, "operation only allowed when started/recovering, origin [" + origin + "]");
+ assert origin == Engine.Operation.Origin.REPLICA;
+ if (writeAllowedStatesForReplica.contains(state) == false) {
+ throw new IllegalIndexShardStateException(shardId, state, "operation only allowed when started/recovering/relocated, origin [" + origin + "]");
@bleskes
bleskes Jan 30, 2016 Member

same here re enumSet.toString

@bleskes bleskes commented on an outdated diff Jan 30, 2016
...ticsearch/indices/recovery/RecoverySourceHandler.java
@@ -395,9 +395,8 @@ public void run() throws InterruptedException {
}
});
-
- if (request.markAsRelocated()) {
- // TODO what happens if the recovery process fails afterwards, we need to mark this back to started
+ if (isPrimaryRelocation()) {
+ // if the recovery process fails after setting the shard state to RELOCATED, both relocation source and target are failed
@bleskes
bleskes Jan 30, 2016 Member

can we link to IndexShard.updateRoutingEntry ? it's not trivial to find.

@bleskes bleskes commented on an outdated diff Jan 30, 2016
...port/replication/TransportReplicationActionTests.java
+ isRelocated.set(true);
+ indexShardRouting.set(primaryShard);
+ executeOnPrimary = false;
+ }
+ primaryPhase.run();
+ assertThat(request.processedOnPrimary.get(), equalTo(executeOnPrimary));
+ assertThat(movedToReplication.get(), equalTo(executeOnPrimary));
+ if (executeOnPrimary == false) {
+ final List<CapturingTransport.CapturedRequest> requests = transport.capturedRequestsByTargetNode().get(primaryShard.relocatingNodeId());
+ assertThat(requests, notNullValue());
+ assertThat(requests.size(), equalTo(1));
+ assertThat("primary request was not delegated", requests.get(0).action, equalTo("testAction[p]"));
+ }
+ }
+
+ public void testPrimaryPhaseExecutesDelegatedRequest() throws InterruptedException, ExecutionException {
@bleskes
bleskes Jan 30, 2016 Member

I'm confused by the test name. It seems we test the execution of the primary phase on the relocation target? if so can we name the test correctly?

@bleskes bleskes commented on an outdated diff Jan 30, 2016
...port/replication/TransportReplicationActionTests.java
@@ -726,12 +782,28 @@ private void assertIndexShardCounter(int expected) {
private final AtomicInteger count = new AtomicInteger(0);
+ private final AtomicBoolean isRelocated = new AtomicBoolean(false);
+
+ private final AtomicReference<ShardRouting> indexShardRouting = new AtomicReference();
@bleskes
bleskes Jan 30, 2016 Member

can we use new AtomicReference<>()?

@bleskes bleskes and 1 other commented on an outdated diff Jan 30, 2016
...port/replication/TransportReplicationActionTests.java
@@ -518,7 +573,7 @@ protected void runReplicateTest(IndexShardRoutingTable shardRoutingTable, int as
if (shard.unassigned()) {
continue;
}
- if (shard.primary() == false) {
+ if (shard.primary() == false || shard.relocating()) { // for relocating primaries, we replica from target to source if source is marked as relocated
nodesSentTo.remove(shard.currentNodeId());
}
if (shard.relocating()) {
@bleskes
bleskes Jan 30, 2016 Member

can we add an assertion that we actually removed something? I think this will also mean that we need a tighter control over what we remove. For example, if the primary is relocating, but we did not move to "relocated" mode, the currentNodeId should not be in the list (and shouldn't be removed).

@ywelsch
ywelsch Feb 1, 2016 Contributor

I'm not sure I understand this comment correctly. I pushed a new commit where I have added assertions: ywelsch@473464c#diff-3988756cc764d148f763c9d0cb740c61R565

@bleskes
bleskes Feb 1, 2016 Member

++. Looks good.

@bleskes bleskes commented on an outdated diff Jan 30, 2016
...search/indices/recovery/IndexPrimaryRelocationIT.java
+import org.elasticsearch.action.delete.DeleteResponse;
+import org.elasticsearch.action.index.IndexResponse;
+import org.elasticsearch.cluster.ClusterState;
+import org.elasticsearch.cluster.node.DiscoveryNode;
+import org.elasticsearch.cluster.routing.allocation.command.MoveAllocationCommand;
+import org.elasticsearch.common.Priority;
+import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.index.shard.ShardId;
+import org.elasticsearch.test.ESIntegTestCase;
+
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static org.hamcrest.Matchers.equalTo;
+
+@ESIntegTestCase.ClusterScope(scope = ESIntegTestCase.Scope.TEST)
+public class IndexPrimaryRelocationIT extends ESIntegTestCase {
@bleskes
bleskes Jan 30, 2016 Member

can we put debug logging on this one?

@bleskes bleskes commented on the diff Jan 30, 2016
...search/indices/recovery/IndexPrimaryRelocationIT.java
+ ClusterState initialState = client().admin().cluster().prepareState().get().getState();
+ DiscoveryNode[] dataNodes = initialState.getNodes().dataNodes().values().toArray(DiscoveryNode.class);
+ DiscoveryNode relocationSource = initialState.getNodes().dataNodes().get(initialState.getRoutingTable().shardRoutingTable("test", 0).primaryShard().currentNodeId());
+ for (int i = 0; i < RELOCATION_COUNT; i++) {
+ DiscoveryNode relocationTarget = randomFrom(dataNodes);
+ while (relocationTarget.equals(relocationSource)) {
+ relocationTarget = randomFrom(dataNodes);
+ }
+ logger.info("--> [iteration {}] relocating from {} to {} ", i, relocationSource.getName(), relocationTarget.getName());
+ client().admin().cluster().prepareReroute()
+ .add(new MoveAllocationCommand(new ShardId("test", 0), relocationSource.getId(), relocationTarget.getId()))
+ .execute().actionGet();
+ ClusterHealthResponse clusterHealthResponse = client().admin().cluster().prepareHealth().setWaitForEvents(Priority.LANGUID).setWaitForRelocatingShards(0).execute().actionGet();
+ assertThat(clusterHealthResponse.isTimedOut(), equalTo(false));
+ logger.info("--> [iteration {}] relocation complete", i);
+ relocationSource = relocationTarget;
@bleskes
bleskes Jan 30, 2016 Member

Can we check and stop if the background thread had any issues? Otherwise we will have to dig through more than needed.
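
One way to read this request, as a rough sketch: the background thread records its first failure in a shared `AtomicReference`, and the relocation loop checks it before each iteration so the test stops early. The class and method names (`BackgroundFailureSketch`, `runIterations`) are hypothetical, and the relocation and indexing work are replaced by placeholders.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

class BackgroundFailureSketch {
    /**
     * Runs up to maxIterations, stopping as soon as the background task has
     * reported a failure. Returns the number of iterations that were run.
     */
    static int runIterations(int maxIterations, AtomicReference<Throwable> failure, Runnable iteration) {
        int completed = 0;
        for (int i = 0; i < maxIterations; i++) {
            if (failure.get() != null) {
                break; // stop early instead of producing more logs to dig through
            }
            iteration.run();
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        AtomicBoolean finished = new AtomicBoolean(false);
        Thread indexer = new Thread(() -> {
            try {
                while (finished.get() == false) {
                    Thread.sleep(1); // placeholder for index/delete requests
                }
            } catch (Throwable t) {
                failure.set(t); // remember the failure for the main loop
            }
        });
        indexer.start();
        int completed = runIterations(5, failure, () -> { /* placeholder for one relocation */ });
        finished.set(true);
        indexer.join();
        if (failure.get() != null) {
            throw new AssertionError("background indexing failed", failure.get());
        }
        System.out.println("completed " + completed + " iterations");
    }
}
```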

@bleskes
Member
bleskes commented Jan 30, 2016

Thanks @ywelsch . Looks great. I left very minor comments and one important one on Replication Action tests. I'd love it if @s1monw gives this a look as well (mostly around the indexshard universe).

@bleskes
Member
bleskes commented Jan 30, 2016

Reviewing this time might be easier by commenting directly on the newly added commits.

Sadly GitHub loses those when you delete the branch, and they are not a first-class citizen in the UI.

After all is done, I will put SuspendableRefContainer and its tests into a standalone commit.

No need now. Thanks. Would have just made this PR (slightly) easier to review

@ywelsch ywelsch removed the WIP label Feb 1, 2016
@ywelsch
Contributor
ywelsch commented Feb 1, 2016

squashed / rebased on current master.

@s1monw s1monw commented on the diff Feb 1, 2016
...n/support/replication/TransportReplicationAction.java
@@ -995,22 +1028,43 @@ protected boolean shouldExecuteReplication(Settings settings) {
return IndexMetaData.isIndexUsingShadowReplicas(settings) == false;
}
- static class IndexShardReference implements Releasable {
+ interface IndexShardReference extends Releasable {
@s1monw
s1monw Feb 1, 2016 Contributor

Since we only have one impl of this, can we make it a concrete class and drop the interface?

@ywelsch
ywelsch Feb 1, 2016 Contributor

The interface helps us to mock the implementation in the unit tests.

@s1monw
s1monw Feb 1, 2016 Contributor

wait how can you not override a concrete class?

@ywelsch
ywelsch Feb 1, 2016 Contributor

Technically it is possible, but not nice to do in this particular case. The implementation class acquires the lock in its constructor (yes, one can argue that this can be done in a separate method, but then the lock cannot be a final variable in the class anymore). Also the constructor takes an IndexShard, which is irrelevant for the class that we mock. I still prefer having a small interface as the current implementation class does not share any code / behavior with the mocked version in the tests.

@s1monw s1monw and 1 other commented on an outdated diff Feb 1, 2016
...h/common/util/concurrent/SuspendableRefContainer.java
+ * @return reference holder
+ * @throws InterruptedException if the current thread is interrupted
+ */
+ public Releasable acquire() throws InterruptedException {
+ semaphore.acquire();
+ return () -> semaphore.release();
+ }
+
+ /**
+ * Acquires a reference. Blocks if reference acquisition is blocked at the time of invocation.
+ *
+ * @return reference holder
+ */
+ public Releasable acquireUninterruptibly() {
+ semaphore.acquireUninterruptibly();
+ return () -> semaphore.release();
@s1monw
s1monw Feb 1, 2016 Contributor

This is dangerous: Releasable implements Closeable, and that explicitly allows for double closing. I think we have to protect from double closing on this level! Can we have a simple Releasable impl that does that?

@ywelsch
ywelsch Feb 1, 2016 Contributor

agree. Not complying with the contract in the interface is dangerous.
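
A minimal sketch of the kind of guard being asked for, assuming a plain `java.util.concurrent.Semaphore` underneath: the permit is released at most once, so a second `close()` is a no-op as the `Closeable` contract permits. `OnceOnlyRelease` is a hypothetical name, not the implementation that ended up in the PR.

```java
import java.io.Closeable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper; releases one permit on the first close() only.
final class OnceOnlyRelease implements Closeable {
    private final Semaphore semaphore;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    OnceOnlyRelease(Semaphore semaphore) {
        this.semaphore = semaphore;
    }

    @Override
    public void close() {
        // compareAndSet succeeds for exactly one caller, even under concurrent closes.
        if (closed.compareAndSet(false, true)) {
            semaphore.release();
        }
    }
}

class OnceOnlyReleaseDemo {
    public static void main(String[] args) {
        Semaphore semaphore = new Semaphore(2);
        semaphore.acquireUninterruptibly();
        OnceOnlyRelease ref = new OnceOnlyRelease(semaphore);
        ref.close();
        ref.close(); // ignored; without the guard this would add a permit that was never acquired
        System.out.println("available permits: " + semaphore.availablePermits());
    }
}
```

Without the guard, each extra `close()` would hand back a permit that was never taken, which silently raises the number of operations the semaphore admits.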

@s1monw s1monw and 1 other commented on an outdated diff Feb 1, 2016
...n/support/replication/TransportReplicationAction.java
private final AtomicBoolean closed = new AtomicBoolean();
- IndexShardReference(IndexShard counter) {
- counter.incrementOperationCounter();
- this.counter = counter;
+ IndexShardReferenceImpl(IndexShard indexShard, boolean primaryAction) {
+ this.indexShard = indexShard;
+ if (primaryAction) {
+ operationLock = indexShard.acquirePrimaryOperationLock();
+ } else {
+ operationLock = indexShard.acquireReplicaOperationLock();
+ }
}
@Override
public void close() {
if (closed.compareAndSet(false, true)) {
@s1monw
s1monw Feb 1, 2016 Contributor

I think we don't need the AtomicBoolean here. We should protect from double closing on the level below

@ywelsch
ywelsch Feb 1, 2016 Contributor

agree, will move the double closing check to level below.

@s1monw
Contributor
s1monw commented Feb 1, 2016

@ywelsch this looks awesome. I left some comments

@s1monw s1monw commented on the diff Feb 1, 2016
...n/support/replication/TransportReplicationAction.java
- IndexShardReference(IndexShard counter) {
- counter.incrementOperationCounter();
- this.counter = counter;
+ IndexShardReferenceImpl(IndexShard indexShard, boolean primaryAction) {
@s1monw
s1monw Feb 1, 2016 Contributor

I am curious: isn't the boolean implicit from the IndexShard? I mean, its shard routing should say primary if the boolean is true, no?

@ywelsch
ywelsch Feb 1, 2016 Contributor

yes, but the converse does not hold. If we replicate an index request to a primary (e.g. a relocation target that is still recovering), then we take the replica operation lock. Currently, taking the primary or replica operation lock is pretty much the same thing. In the seq_no branch, this is not the case anymore. @bleskes suggested separating the methods in this PR already as this will make it easier to merge subsequent changes into the seq_no branch.

@s1monw s1monw commented on the diff Feb 1, 2016
...csearch/index/shard/IndexShardRelocatedException.java
@@ -29,10 +29,14 @@
public class IndexShardRelocatedException extends IllegalIndexShardStateException {
@s1monw
s1monw Feb 1, 2016 Contributor

Maybe while we are at it, can we remove this exception entirely and just use new IllegalIndexShardStateException(shardId, IndexShardState.RELOCATED, "Already relocated") where it's used? I think there is only one place, really.

@ywelsch
ywelsch Feb 1, 2016 Contributor

used in two places. I'm fine with changing it though.

@ywelsch
ywelsch Feb 1, 2016 Contributor

I had another look and think we should keep it as is for this PR. I looked at other subclasses of IllegalIndexShardStateException, and most of them could be removed as well if we follow the argument given here.
I will leave that for another PR (where this can be discussed then).

@s1monw s1monw commented on the diff Feb 1, 2016
...earch/indices/cluster/IndicesClusterStateService.java
@@ -492,7 +492,11 @@ private void applyNewOrUpdatedShards(final ClusterChangedEvent event) {
// shadow replicas do not support primary promotion. The master would reinitialize the shard, giving it a new allocation, meaning we should be there.
assert (shardRouting.primary() && currentRoutingEntry.primary() == false) == false || indexShard.allowsPrimaryPromotion() :
"shard for doesn't support primary promotion but master promoted it with changing allocation. New routing " + shardRouting + ", current routing " + currentRoutingEntry;
- indexShard.updateRoutingEntry(shardRouting, event.state().blocks().disableStatePersistence() == false);
+ try {
+ indexShard.updateRoutingEntry(shardRouting, event.state().blocks().disableStatePersistence() == false);
+ } catch (Throwable e) {
+ failAndRemoveShard(shardRouting, indexService.indexUUID(), indexService, true, "failed updating shard routing entry", e);
@s1monw
s1monw Feb 1, 2016 Contributor

++

@s1monw
Contributor
s1monw commented Feb 1, 2016

I did another round with minor comments LGTM otherwise

ywelsch added some commits Jan 12, 2016
@ywelsch ywelsch Add operation counter for IndexShard
Adds a container that represents a resource with reference counting capabilities. Provides operations to suspend acquisition of new references. Useful for resource management when resources are intermittently unavailable.

Closes #15956
e1006ea
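
The container described in the commit message above (reference counting plus the ability to suspend acquisition of new references) can be sketched roughly as follows. This is an assumption-laden illustration built on a semaphore, in line with the `semaphore.acquire()` calls visible in the reviewed diff; `SuspendableRefsSketch`, `Ref`, `tryAcquire`, and `blockAcquisition` are hypothetical names and the body is not the committed class.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; names and behaviour are assumptions, not the merged code.
class SuspendableRefsSketch {
    /** A handle whose close() never throws and is safe to call more than once. */
    interface Ref extends AutoCloseable {
        @Override
        void close();
    }

    private static final int TOTAL_PERMITS = Integer.MAX_VALUE;
    private final Semaphore semaphore = new Semaphore(TOTAL_PERMITS);

    /** Takes a reference, or returns null while acquisition is suspended. */
    Ref tryAcquire() {
        if (semaphore.tryAcquire() == false) {
            return null;
        }
        return releaseOnce(1);
    }

    /**
     * Waits until all outstanding references are released, then suspends
     * acquisition until the returned handle is closed.
     */
    Ref blockAcquisition() {
        semaphore.acquireUninterruptibly(TOTAL_PERMITS);
        return releaseOnce(TOTAL_PERMITS);
    }

    private Ref releaseOnce(int permits) {
        AtomicBoolean closed = new AtomicBoolean(false);
        return () -> {
            if (closed.compareAndSet(false, true)) {
                semaphore.release(permits);
            }
        };
    }

    public static void main(String[] args) {
        SuspendableRefsSketch refs = new SuspendableRefsSketch();
        Ref operation = refs.tryAcquire();
        operation.close();                      // operation finished
        Ref suspended = refs.blockAcquisition(); // e.g. during primary handoff
        System.out.println(refs.tryAcquire() == null); // no new references while suspended
        suspended.close();
    }
}
```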
@ywelsch ywelsch Add proper handoff between old and new copy of relocating primary shard
When primary relocation completes, a cluster state is propagated that deactivates the old primary and marks the new primary as active.
As cluster state changes are not applied synchronously on all nodes, there can be a time interval where the relocation target has processed
the cluster state and believes to be the active primary and the relocation source has not yet processed the cluster state update and
still believes itself to be the active primary. This commit ensures that, before completing the relocation, the relocation source deactivates
writing to its store and delegates requests to the relocation target.

Closes #15900
10b5ffc
@ywelsch ywelsch merged commit 10b5ffc into elastic:master Feb 2, 2016

1 check passed

CLA Commit author is a member of Elasticsearch
@ywelsch ywelsch added a commit to ywelsch/elasticsearch that referenced this pull request Feb 2, 2016
@ywelsch ywelsch [TEST] Fail test if dummy doc is not found
Reverts back 7d3da91 after fix in #15900

Closes #8706
33624c3
@ywelsch ywelsch added a commit that referenced this pull request Feb 2, 2016
@ywelsch ywelsch [TEST] Fail test if dummy doc is not found
Reverts back 7d3da91 after fix in #15900

Closes #8706
cf28d62
@bleskes bleskes added a commit that referenced this pull request Apr 7, 2016
@bleskes bleskes Update resliency page
#14252 , #7572 , #15900, #12573, #14671, #15281 and #9126 have all been closed/merged and will be part of 5.0.0.
557a3d1
@bleskes bleskes referenced this pull request Apr 7, 2016
Merged

Update resliency page #17586

@bleskes bleskes added a commit that referenced this pull request Apr 7, 2016
@bleskes bleskes Update resiliency page (#17586)
#14252 , #7572 , #15900, #12573, #14671, #15281 and #9126 have all been closed/merged and will be part of 5.0.0.
8eee28e
@brwe brwe added a commit to brwe/elasticsearch that referenced this pull request May 23, 2016
@brwe brwe [TEST] disable indexing while relocating
This was fixed in elastic#15900
but the fix will not be backported.
44b2da7
@brwe brwe added a commit that referenced this pull request May 26, 2016
@brwe brwe [TEST] disable indexing while relocating
There is a bug (document loss) with this which should be fixed by
#15900
but it will not be backported so we should not test this.
ffec1fc
@brwe brwe added a commit that referenced this pull request May 26, 2016
@brwe brwe [TEST] disable indexing while relocating
There is a bug (document loss) with this which should be fixed by
#15900
but it will not be backported so we should not test this.
df0c0d8
@brwe brwe added a commit that referenced this pull request May 26, 2016
@brwe brwe [TEST] disable indexing while relocating
There is a bug (document loss) with this which should be fixed by
#15900
but it will not be backported so we should not test this.
b92d539
@brwe brwe added a commit that referenced this pull request May 26, 2016
@brwe brwe [TEST] disable indexing while relocating
There is a bug (document loss) with this which should be fixed by
#15900
but it will not be backported so we should not test this.
8177907