Use Explain in AllocationDecider's Decisions #2483

Closed
s1monw opened this issue Dec 13, 2012 · 2 comments

@s1monw
Contributor

s1monw commented Dec 13, 2012

The Decision class already supports an explain parameter. Logging the actual reasoning behind each decision would help with development and debugging, and would improve certain error messages, such as those in MoveAllocationCommand.
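
To illustrate the idea of a decision that carries its own reasoning, here is a minimal sketch in Python. It is not Elasticsearch's Java implementation: the names `DecisionType`, `Decision`, `same_shard_decision`, and `rejection_message`, as well as the explanation strings, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class DecisionType(Enum):
    YES = "YES"
    NO = "NO"
    THROTTLE = "THROTTLE"


@dataclass(frozen=True)
class Decision:
    """A verdict plus the reasoning behind it (hypothetical model)."""
    type: DecisionType
    decider: str
    explanation: str


def same_shard_decision(target_node, nodes_with_copy):
    """Hypothetical decider: refuse a node that already holds a copy of the shard."""
    if target_node in nodes_with_copy:
        return Decision(DecisionType.NO, "same_shard",
                        f"shard already exists on node [{target_node}]")
    return Decision(DecisionType.YES, "same_shard",
                    f"no copy of the shard on node [{target_node}]")


def rejection_message(decisions):
    """Join the explanations of all NO decisions into one loggable message."""
    reasons = [f"{d.decider}: {d.explanation}"
               for d in decisions if d.type is DecisionType.NO]
    return "; ".join(reasons) if reasons else "allowed"
```

With a structure like this, a command such as a shard move could log or raise `rejection_message(decisions)` instead of a bare "not allowed" error.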

@ghost ghost assigned s1monw Dec 13, 2012
s1monw added a commit to s1monw/elasticsearch that referenced this issue Dec 13, 2012
@ghost ghost assigned dakrone Jan 28, 2014
dakrone added a commit to dakrone/elasticsearch that referenced this issue Jan 31, 2014
dakrone added a commit that referenced this issue Jan 31, 2014
dakrone added a commit that referenced this issue Jan 31, 2014
dakrone added a commit that referenced this issue Jan 31, 2014
Relates to #4380
Relates to #2483

Conflicts:
	src/main/java/org/elasticsearch/cluster/routing/allocation/AllocationService.java
	src/main/java/org/elasticsearch/cluster/routing/allocation/decider/EnableAllocationDecider.java
	src/main/java/org/elasticsearch/cluster/routing/allocation/decider/SameShardAllocationDecider.java
	src/main/java/org/elasticsearch/cluster/routing/allocation/decider/SnapshotInProgressAllocationDecider.java
@dakrone
Member

dakrone commented Feb 4, 2014

We should add an explain parameter so people can get detailed feedback about why a shard can or cannot be allocated to a node. Something like this:

For these commands:

curl -XPOST 'localhost:9200/_cluster/reroute?explain&pretty' -d '{
  "commands" : [
    {
      "cancel" : {
        "index" : "decide", "shard" : 0, "node": "IvpoKRdtRiGrQ_WKtt4_4w"
      }
    },
    {
      "move" : {
        "index" : "decide", "shard" : 0,
        "from_node" : "IvpoKRdtRiGrQ_WKtt4_4w", "to_node" : "IvpoKRdtRiGrQ_WKtt4_4w"
      }
    }
  ]
}'
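
The same request can also be assembled from a script. The following is a minimal sketch using only Python's standard library. It assumes a node listening on localhost:9200 as in the curl call above; the helper name `build_explain_request` is hypothetical, and the `Content-Type` header is an assumption that the curl call does not set. Nothing is sent until the request object is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_explain_request(commands, host="http://localhost:9200"):
    """Build (but do not send) a reroute request with the explain flag set."""
    body = json.dumps({"commands": commands}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/_cluster/reroute?explain&pretty",
        data=body,
        # Assumption: the curl example sends no explicit content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


node = "IvpoKRdtRiGrQ_WKtt4_4w"
commands = [
    {"cancel": {"index": "decide", "shard": 0, "node": node}},
    # from_node and to_node are deliberately identical, as in the curl example.
    {"move": {"index": "decide", "shard": 0,
              "from_node": node, "to_node": node}},
]
request = build_explain_request(commands)
```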

For the commands above, the response would look like this:

{
  "explanations" : [ {
    "command" : "cancel",
    "parameters" : {
      "index" : "decide",
      "shard" : 0,
      "node" : "IvpoKRdtRiGrQ_WKtt4_4w",
      "allow_primary" : false
    },
    "decisions" : [ {
      "decider" : "CancelAllocationCommand",
      "decision" : "NO",
      "explanation" : "can't cancel [decide][0] on node [Wysper][IvpoKRdtRiGrQ_WKtt4_4w][Xanadu.local][inet[/172.16.1.8:9300]], shard is primary and started"
    } ]
  }, {
    "command" : "move",
    "parameters" : {
      "index" : "decide",
      "shard" : 0,
      "from_node" : "IvpoKRdtRiGrQ_WKtt4_4w",
      "to_node" : "IvpoKRdtRiGrQ_WKtt4_4w"
    },
    "decisions" : [ {
      "decider" : "SameShard",
      "decision" : "NO",
      "explanation" : "shard cannot be allocated on same node [IvpoKRdtRiGrQ_WKtt4_4w] it already exists on"
    }, {
      "decider" : "Filter",
      "decision" : "YES",
      "explanation" : "node passes include/exclude/require filters"
    }, {
      "decider" : "ReplicaAfterPrimaryActive",
      "decision" : "YES",
      "explanation" : "shard is primary"
    }, {
      "decider" : "Throttling",
      "decision" : "YES",
      "explanation" : "below shard recovery limit of [2]"
    }, {
      "decider" : "Enable",
      "decision" : "YES",
      "explanation" : "allocation disabling is ignored"
    }, {
      "decider" : "Disable",
      "decision" : "YES",
      "explanation" : "allocation disabling is ignored"
    }, {
      "decider" : "Awareness",
      "decision" : "YES",
      "explanation" : "no allocation awareness enabled"
    }, {
      "decider" : "ShardsLimit",
      "decision" : "YES",
      "explanation" : "total shard limit disabled: [-1] <= 0"
    }, {
      "decider" : "NodeVersion",
      "decision" : "YES",
      "explanation" : "target node version [2.0.0-SNAPSHOT] is same or newer than source node version [2.0.0-SNAPSHOT]"
    }, {
      "decider" : "DiskThreshold",
      "decision" : "YES",
      "explanation" : "disk threshold decider disabled"
    }, {
      "decider" : "SnapshotInProgress",
      "decision" : "YES",
      "explanation" : "no snapshots are currently running"
    } ]
  } ]
}
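
A client would usually care most about the deciders that answered NO. As a rough sketch, assuming the response has the shape proposed above, the blocking decisions can be pulled out per command like this. The function name `blocking_decisions` is hypothetical, and `sample` is an abbreviated copy of the response with only one YES decider kept for contrast.

```python
def blocking_decisions(response):
    """Return (command, [(decider, explanation), ...]) for every command,
    keeping only the decisions whose verdict is NO."""
    result = []
    for entry in response.get("explanations", []):
        blockers = [
            (d["decider"], d["explanation"])
            for d in entry.get("decisions", [])
            if d["decision"] == "NO"
        ]
        result.append((entry["command"], blockers))
    return result


sample = {
    "explanations": [
        {
            "command": "cancel",
            "decisions": [
                {"decider": "CancelAllocationCommand", "decision": "NO",
                 "explanation": "can't cancel [decide][0] on node "
                                "[Wysper][IvpoKRdtRiGrQ_WKtt4_4w][Xanadu.local]"
                                "[inet[/172.16.1.8:9300]], "
                                "shard is primary and started"},
            ],
        },
        {
            "command": "move",
            "decisions": [
                {"decider": "SameShard", "decision": "NO",
                 "explanation": "shard cannot be allocated on same node "
                                "[IvpoKRdtRiGrQ_WKtt4_4w] it already exists on"},
                {"decider": "Filter", "decision": "YES",
                 "explanation": "node passes include/exclude/require filters"},
            ],
        },
    ]
}

for command, blockers in blocking_decisions(sample):
    for decider, explanation in blockers:
        print(f"{command}: blocked by {decider}: {explanation}")
```

A list of pairs is used rather than a dictionary keyed by command name, because a single reroute request may contain several commands of the same type.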

@dakrone
Member

dakrone commented Feb 4, 2014

Related: #4380

dakrone added a commit to dakrone/elasticsearch that referenced this issue Feb 18, 2014
By specifying the `explain` flag, an explanation for the reason a
command can or cannot be executed is returned. No allocation commands
are actually performed.

Returns a response similar to:

{
  "explanations" : [ {
    "command" : "cancel",
    "parameters" : {
      "index" : "decide",
      "shard" : 0,
      "node" : "IvpoKRdtRiGrQ_WKtt4_4w",
      "allow_primary" : false
    },
    "decisions" : [ {
      "decider" : "CancelAllocationCommand",
      "decision" : "YES",
      "explanation" : "..."
    } ]
  }, {
    "command" : "move",
    "parameters" : {
      "index" : "decide",
      "shard" : 0,
      "from_node" : "IvpoKRdtRiGrQ_WKtt4_4w",
      "to_node" : "IvpoKRdtRiGrQ_WKtt4_4w"
    },
    "decisions" : [ {
      "decider" : "same_shard",
      "decision" : "NO",
      "explanation" : "shard cannot be allocated on same node [IvpoKRdtRiGrQ_WKtt4_4w] it already exists on"
    },
    etc
    ]
  } ]
}

Closes elastic#2483
dakrone added a commit that referenced this issue Feb 27, 2014
By specifying the `explain` flag, an explanation for the reason a
command can or cannot be executed is returned.

Returns a response similar to:

{
  "state": {...cluster state...},
  "acknowledged": true,
  "explanations" : [ {
    "command" : "cancel",
    "parameters" : {
      "index" : "decide",
      "shard" : 0,
      "node" : "IvpoKRdtRiGrQ_WKtt4_4w",
      "allow_primary" : false
    },
    "decisions" : [ {
      "decider" : "cancel_allocation_command",
      "decision" : "YES",
      "explanation" : "..."
    } ]
  }, {
    "command" : "move",
    "parameters" : {
      "index" : "decide",
      "shard" : 0,
      "from_node" : "IvpoKRdtRiGrQ_WKtt4_4w",
      "to_node" : "IvpoKRdtRiGrQ_WKtt4_4w"
    },
    "decisions" : [ {
      "decider" : "same_shard",
      "decision" : "NO",
      "explanation" : "shard cannot be allocated on same node [IvpoKRdtRiGrQ_WKtt4_4w] it already exists on"
    },
    etc
    ]
  } ]
}

Also removes AllocationExplanation from the cluster state.

Closes #2483
Closes #5169
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015
Relates to elastic#4380
Relates to elastic#2483

Conflicts:
	src/main/java/org/elasticsearch/cluster/routing/allocation/AllocationService.java
	src/main/java/org/elasticsearch/cluster/routing/allocation/decider/EnableAllocationDecider.java
	src/main/java/org/elasticsearch/cluster/routing/allocation/decider/SameShardAllocationDecider.java
	src/main/java/org/elasticsearch/cluster/routing/allocation/decider/SnapshotInProgressAllocationDecider.java
mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015