Add explain flag support to the reroute API
By specifying the `explain` flag, an explanation for the reason a
command can or cannot be executed is returned. No allocation commands
are actually performed.

Returns a response similar to:

{
  "state": {...cluster state...},
  "acknowledged": true,
  "explanations" : [ {
    "command" : "cancel",
    "parameters" : {
      "index" : "decide",
      "shard" : 0,
      "node" : "IvpoKRdtRiGrQ_WKtt4_4w",
      "allow_primary" : false
    },
    "decisions" : [ {
      "decider" : "cancel_allocation_command",
      "decision" : "YES",
      "explanation" : "..."
    } ]
  }, {
    "command" : "move",
    "parameters" : {
      "index" : "decide",
      "shard" : 0,
      "from_node" : "IvpoKRdtRiGrQ_WKtt4_4w",
      "to_node" : "IvpoKRdtRiGrQ_WKtt4_4w"
    },
    "decisions" : [ {
      "decider" : "same_shard",
      "decision" : "NO",
      "explanation" : "shard cannot be allocated on same node [IvpoKRdtRiGrQ_WKtt4_4w] it already exists on"
    },
    etc
    ]
  } ]
}

Also removes AllocationExplanation from cluster state.

Closes #2483
Closes #5169
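
As a rough illustration of how a client might ask for explanations, the sketch below builds (but does not send) a reroute request with the `explain` and `dry_run` parameters set. It assumes both flags can be passed as URI query parameters, as the REST spec and YAML test in this commit suggest. The class name `RerouteExplainRequest`, the host, and the index and node names are placeholders, and it uses the JDK 11+ `java.net.http` API rather than any Elasticsearch client.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class RerouteExplainRequest {
    // One "move" command, mirroring the example in reroute.asciidoc.
    // Index and node names are placeholders.
    static final String BODY =
        "{ \"commands\" : [ { \"move\" : { \"index\" : \"test\", \"shard\" : 0, "
        + "\"from_node\" : \"node1\", \"to_node\" : \"node2\" } } ] }";

    // Builds a POST to the reroute endpoint with explain and dry_run enabled.
    // Assumption: both flags are accepted as URI query parameters.
    static HttpRequest build(String hostAndPort) {
        URI uri = URI.create(
            "http://" + hostAndPort + "/_cluster/reroute?explain=true&dry_run=true");
        return HttpRequest.newBuilder(uri)
            .POST(HttpRequest.BodyPublishers.ofString(BODY))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("localhost:9200");
        // Only prints the request; sending it would need an HttpClient
        // and a running cluster.
        System.out.println(request.method() + " " + request.uri());
        System.out.println(BODY);
    }
}
```

If the request were sent, the response would be expected to carry an `explanations` array shaped like the one in the commit message above.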
dakrone committed Feb 27, 2014
1 parent 8ceb987 commit e53a438
Showing 36 changed files with 744 additions and 201 deletions.
18 changes: 10 additions & 8 deletions docs/reference/cluster/reroute.asciidoc
@@ -11,12 +11,11 @@ Here is a short example of how a simple reroute API call:

[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands" : [ {
"move" :
"move" :
{
"index" : "test", "shard" : 0,
"index" : "test", "shard" : 0,
"from_node" : "node1", "to_node" : "node2"
}
},
@@ -45,15 +44,18 @@ the request body). This will cause the commands to apply to the current
cluster state, and return the resulting cluster after the commands (and
re-balancing) has been applied.

The commands supported are:
If the `explain` parameter is specified, a detailed explanation of why the
commands could or could not be executed is returned.

`move`::
The commands supported are:

`move`::
Move a started shard from one node to another node. Accepts
`index` and `shard` for index name and shard number, `from_node` for the
node to move the shard `from`, and `to_node` for the node to move the
shard to.
shard to.

`cancel`::
`cancel`::
Cancel allocation of a shard (or recovery). Accepts `index`
and `shard` for index name and shard number, and `node` for the node to
cancel the shard allocation on. It also accepts `allow_primary` flag to
@@ -62,7 +64,7 @@ The commands supported are:
from the primary shard by cancelling them and allowing them to be
reinitialized through the standard reallocation process.

`allocate`::
`allocate`::
Allocate an unassigned shard to a node. Accepts the
`index` and `shard` for index name and shard number, and `node` to
allocate the shard to. It also accepts `allow_primary` flag to
4 changes: 4 additions & 0 deletions rest-api-spec/api/cluster.reroute.json
@@ -12,6 +12,10 @@
"type" : "boolean",
"description" : "Simulate the operation only and return the resulting state"
},
"explain": {
"type" : "boolean",
"description" : "Return an explanation of why the commands can or cannot be executed"
},
"filter_metadata": {
"type" : "boolean",
"description" : "Don't return cluster state metadata (default: false)"
49 changes: 49 additions & 0 deletions rest-api-spec/test/cluster.reroute/11_explain.yaml
@@ -0,0 +1,49 @@
setup:
- do:
indices.create:
index: test_index
body:
settings:
number_of_shards: "1"
number_of_replicas: "0"

- do:
cluster.health:
wait_for_status: green

---
"Explain API with empty command list":

- do:
cluster.reroute:
explain: true
dry_run: true
body:
commands: []

- match: {explanations: []}

---
"Explain API for non-existant node & shard":

- do:
cluster.reroute:
explain: true
dry_run: true
body:
commands:
- cancel:
index: test_index
shard: 9
node: node_0

- match: {explanations.0.command: cancel}
- match:
explanations.0.parameters:
index: test_index
shard: 9
node: node_0
allow_primary: false
- match: {explanations.0.decisions.0.decider: cancel_allocation_command}
- match: {explanations.0.decisions.0.decision: "NO"}
- is_true: explanations.0.decisions.0.explanation
8 changes: 0 additions & 8 deletions rest-api-spec/test/cluster.state/20_filtering.yaml
@@ -20,7 +20,6 @@ setup:
- is_false: metadata
- is_false: routing_table
- is_false: routing_nodes
- is_false: allocations
- length: { blocks: 0 }

---
@@ -41,7 +40,6 @@ setup:
- is_false: metadata
- is_false: routing_table
- is_false: routing_nodes
- is_false: allocations
- length: { blocks: 1 }

---
@@ -55,7 +53,6 @@ setup:
- is_false: metadata
- is_false: routing_table
- is_false: routing_nodes
- is_false: allocations

---
"Filtering the cluster state by metadata only should work":
@@ -68,7 +65,6 @@ setup:
- is_true: metadata
- is_false: routing_table
- is_false: routing_nodes
- is_false: allocations


---
@@ -82,7 +78,6 @@ setup:
- is_false: metadata
- is_true: routing_table
- is_true: routing_nodes
- is_true: allocations


---
@@ -120,7 +115,6 @@ setup:
- is_true: metadata
- is_false: routing_table
- is_false: routing_nodes
- is_false: allocations
- is_true: metadata.templates.test1
- is_true: metadata.templates.test2
- is_false: metadata.templates.foo
@@ -160,5 +154,3 @@ setup:
- is_true: metadata
- is_true: routing_table
- is_true: routing_nodes
- is_true: allocations

@@ -20,6 +20,7 @@
package org.elasticsearch.action.admin.cluster.reroute;

import org.elasticsearch.ElasticsearchParseException;
import org.elasticsearch.Version;
import org.elasticsearch.action.ActionRequestValidationException;
import org.elasticsearch.action.support.master.AcknowledgedRequest;
import org.elasticsearch.cluster.routing.allocation.command.AllocationCommand;
@@ -39,6 +40,7 @@ public class ClusterRerouteRequest extends AcknowledgedRequest<ClusterRerouteReq

AllocationCommands commands = new AllocationCommands();
boolean dryRun;
boolean explain;

public ClusterRerouteRequest() {
}
@@ -69,6 +71,23 @@ public boolean dryRun() {
return this.dryRun;
}

/**
* Sets the explain flag, which will collect information about the reroute
* request without executing the actions. Similar to dryRun,
* but human-readable.
*/
public ClusterRerouteRequest explain(boolean explain) {
this.explain = explain;
return this;
}

/**
* Returns the current explain flag
*/
public boolean explain() {
return this.explain;
}

/**
* Sets the source for the request.
*/
@@ -110,6 +129,11 @@ public void readFrom(StreamInput in) throws IOException {
super.readFrom(in);
commands = AllocationCommands.readFrom(in);
dryRun = in.readBoolean();
if (in.getVersion().onOrAfter(Version.V_1_1_0)) {
explain = in.readBoolean();
} else {
explain = false;
}
readTimeout(in);
}

@@ -118,6 +142,9 @@ public void writeTo(StreamOutput out) throws IOException {
super.writeTo(out);
AllocationCommands.writeTo(commands, out);
out.writeBoolean(dryRun);
if (out.getVersion().onOrAfter(Version.V_1_1_0)) {
out.writeBoolean(explain);
}
writeTimeout(out);
}
}
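
The `readFrom`/`writeTo` changes above gate the new `explain` boolean on the peer's wire version, so that nodes older than 1.1.0 neither receive nor expect the extra byte. The following is a minimal, self-contained sketch of that pattern using plain `java.io` streams. The class name `VersionGatedFlag` and the numeric version constants are hypothetical stand-ins, not Elasticsearch's `StreamInput`, `StreamOutput`, or `Version` types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class VersionGatedFlag {
    // Hypothetical wire versions, used only for comparison in this sketch.
    static final int V_1_0_0 = 100;
    static final int V_1_1_0 = 110;

    // Writes dryRun always; writes explain only if the peer understands it.
    static byte[] write(boolean dryRun, boolean explain, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeBoolean(dryRun);
            if (peerVersion >= V_1_1_0) {
                out.writeBoolean(explain);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads {dryRun, explain}; explain defaults to false for older peers.
    static boolean[] read(byte[] data, int peerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            boolean dryRun = in.readBoolean();
            boolean explain = peerVersion >= V_1_1_0 ? in.readBoolean() : false;
            return new boolean[] {dryRun, explain};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] newPeer = read(write(true, true, V_1_1_0), V_1_1_0);
        boolean[] oldPeer = read(write(true, true, V_1_0_0), V_1_0_0);
        System.out.println("new peer explain=" + newPeer[1]); // true
        System.out.println("old peer explain=" + oldPeer[1]); // false
    }
}
```

The key property is that writer and reader apply the same version check, so the byte stream stays aligned in a mixed-version cluster and the flag silently falls back to `false` on older peers.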
@@ -53,6 +53,15 @@ public ClusterRerouteRequestBuilder setDryRun(boolean dryRun) {
return this;
}

/**
* Sets the explain flag (defaults to <tt>false</tt>). If true, the
* response will include an explanation in addition to the cluster state.
*/
public ClusterRerouteRequestBuilder setExplain(boolean explain) {
request.explain(explain);
return this;
}

/**
* Sets the source for the request
*/
@@ -19,8 +19,11 @@

package org.elasticsearch.action.admin.cluster.reroute;

import org.elasticsearch.Version;
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.routing.allocation.RoutingExplanations;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;

@@ -32,14 +35,16 @@
public class ClusterRerouteResponse extends AcknowledgedResponse {

private ClusterState state;
private RoutingExplanations explanations;

ClusterRerouteResponse() {

}

ClusterRerouteResponse(boolean acknowledged, ClusterState state) {
ClusterRerouteResponse(boolean acknowledged, ClusterState state, RoutingExplanations explanations) {
super(acknowledged);
this.state = state;
this.explanations = explanations;
}

/**
@@ -49,17 +54,29 @@ public ClusterState getState() {
return this.state;
}

public RoutingExplanations getExplanations() {
return this.explanations;
}

@Override
public void readFrom(StreamInput in) throws IOException {
super.readFrom(in);
state = ClusterState.Builder.readFrom(in, null);
readAcknowledged(in);
if (in.getVersion().onOrAfter(Version.V_1_1_0)) {
explanations = RoutingExplanations.readFrom(in);
} else {
explanations = new RoutingExplanations();
}
}

@Override
public void writeTo(StreamOutput out) throws IOException {
super.writeTo(out);
ClusterState.Builder.writeTo(state, out);
writeAcknowledged(out);
if (out.getVersion().onOrAfter(Version.V_1_1_0)) {
RoutingExplanations.writeTo(explanations, out);
}
}
}
@@ -26,6 +26,7 @@
import org.elasticsearch.cluster.ClusterService;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.node.DiscoveryNode;
import org.elasticsearch.cluster.routing.allocation.RoutingExplanations;
import org.elasticsearch.cluster.routing.allocation.AllocationService;
import org.elasticsearch.cluster.routing.allocation.RoutingAllocation;
import org.elasticsearch.common.Nullable;
@@ -75,6 +76,7 @@ protected void masterOperation(final ClusterRerouteRequest request, final Cluste
clusterService.submitStateUpdateTask("cluster_reroute (api)", Priority.URGENT, new AckedClusterStateUpdateTask() {

private volatile ClusterState clusterStateToSend;
private volatile RoutingExplanations explanations;

@Override
public boolean mustAck(DiscoveryNode discoveryNode) {
@@ -83,12 +85,12 @@ public boolean mustAck(DiscoveryNode discoveryNode) {

@Override
public void onAllNodesAcked(@Nullable Throwable t) {
listener.onResponse(new ClusterRerouteResponse(true, clusterStateToSend));
listener.onResponse(new ClusterRerouteResponse(true, clusterStateToSend, explanations));
}

@Override
public void onAckTimeout() {
listener.onResponse(new ClusterRerouteResponse(false, clusterStateToSend));
listener.onResponse(new ClusterRerouteResponse(false, clusterStateToSend, new RoutingExplanations()));
}

@Override
@@ -109,9 +111,10 @@ public void onFailure(String source, Throwable t) {

@Override
public ClusterState execute(ClusterState currentState) {
RoutingAllocation.Result routingResult = allocationService.reroute(currentState, request.commands, true);
RoutingAllocation.Result routingResult = allocationService.reroute(currentState, request.commands, request.explain());
ClusterState newState = ClusterState.builder(currentState).routingResult(routingResult).build();
clusterStateToSend = newState;
explanations = routingResult.explanations();
if (request.dryRun) {
return currentState;
}
@@ -90,7 +90,6 @@ protected void masterOperation(final ClusterStateRequest request, final ClusterS
} else {
builder.routingTable(currentState.routingTable());
}
builder.allocationExplanation(currentState.allocationExplanation());
}
if (request.blocks()) {
builder.blocks(currentState.blocks());