Add an API to locate unrecovered shards and their state #11545
@s1monw This is still a WIP in terms of documentation and testing, would appreciate a review.
this.metaData = metaData;
this.listener = listener;
this.expectedOps = expectedOps;
this.opsCount = new AtomicInteger(0);
We have a class called CountDown.java for this. Check it out, it might make things simpler here.
Thanks for the pointer, switched to using CountDown
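For readers following along, the pattern being suggested can be sketched roughly as below. This is a hand-rolled stand-in written for illustration, not the CountDown class from the Elasticsearch code base; the class name `SimpleCountDown` and its methods are assumptions. The idea is that the counter itself reports which call completed the last expected operation, so the caller does not have to compare an AtomicInteger against expectedOps by hand.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for illustration only; not Elasticsearch's CountDown.
final class SimpleCountDown {
    private final AtomicInteger remaining;

    SimpleCountDown(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be >= 0 but was " + count);
        }
        this.remaining = new AtomicInteger(count);
    }

    /** Decrements the counter; returns true only for the call that reaches zero. */
    boolean countDown() {
        while (true) {
            int current = remaining.get();
            if (current == 0) {
                return false; // already counted down, nothing left to do
            }
            if (remaining.compareAndSet(current, current - 1)) {
                return current == 1; // true exactly once, for the final operation
            }
        }
    }

    /** Returns true once all expected operations have been counted. */
    boolean isCountedDown() {
        return remaining.get() == 0;
    }
}
```

With something like this, a response handler could call `countDown()` for every shard response and invoke the listener only when it returns true.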
Wow @areek, this looks pretty awesome. I left some comments.
Thanks for the review @s1monw! Addressed all the comments. Was wondering if there are any tests that I can look at to get the cluster to have a bunch of unassigned nodes (currently just stopping random nodes)?
Set<String> requestedIndices = new HashSet<>();
requestedIndices.addAll(Arrays.asList(request.indices()));
List<ShardId> shardIdsToFetch = new ArrayList<>();
for (MutableShardRouting shard : Iterables.concat(routingNodes.unassigned(), routingNodes.ignoredUnassigned())) {
you should be able to use state.routingTable().shardsWithState(ShardRoutingState.UNASSIGNED);
changed to using state.routingTable().shardsWithState(ShardRoutingState.UNASSIGNED)
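As a rough illustration of what this selection amounts to, here is a self-contained sketch that filters shards by state and by the requested indices. The types `Shard` and `State` are simplified stand-ins invented for this example, not the Elasticsearch routing classes, and treating an empty index list as "all indices" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-ins for illustration; not the Elasticsearch routing classes.
final class UnassignedShardFilter {
    enum State { UNASSIGNED, INITIALIZING, STARTED, RELOCATING }

    static final class Shard {
        final String index;
        final int id;
        final State state;

        Shard(String index, int id, State state) {
            this.index = index;
            this.id = id;
            this.state = state;
        }
    }

    /** Returns the unassigned shards that belong to one of the requested indices. */
    static List<Shard> unassignedFor(List<Shard> all, String... indices) {
        Set<String> requested = new HashSet<>(Arrays.asList(indices));
        List<Shard> result = new ArrayList<>();
        for (Shard shard : all) {
            // Assumption: an empty request means "all indices".
            boolean indexMatches = requested.isEmpty() || requested.contains(shard.index);
            if (shard.state == State.UNASSIGNED && indexMatches) {
                result.add(shard);
            }
        }
        return result;
    }
}
```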
@areek I think this is an awesome API. Left some comments here and there. I think we need to beef up the tests to check the actual content of the shard responses (check that it finds stuff, that it detects corruption, etc.). I'll respond to the naming part on the ticket.
indexShardsBuilder.put(res.shardId.id(), shardStatuses);
shardsResponseBuilder.put(res.shardId.getIndex(), indexShardsBuilder.build());
for (FailedNodeException failure : res.failures) {
failureBuilder.add(failure);
I think we lose the information about which shard has failed. Should we wrap it in DefaultShardOperationFailedException?
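To make the concern concrete, a minimal sketch of the wrapping idea follows. The `ShardFailure` type is a hypothetical stand-in for whatever wrapper the PR ends up using (the comment above proposes DefaultShardOperationFailedException); the point is only that each node-level failure is paired with the index and shard id it belongs to before being collected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types for illustration; not the Elasticsearch exceptions.
final class ShardFailures {
    static final class ShardFailure {
        final String index;
        final int shardId;
        final Throwable cause;

        ShardFailure(String index, int shardId, Throwable cause) {
            this.index = index;
            this.shardId = shardId;
            this.cause = cause;
        }

        @Override
        public String toString() {
            return "[" + index + "][" + shardId + "] failed: " + cause.getMessage();
        }
    }

    /** Attaches the shard identity to every node failure so it is not lost. */
    static List<ShardFailure> wrap(String index, int shardId, List<? extends Throwable> nodeFailures) {
        List<ShardFailure> wrapped = new ArrayList<>(nodeFailures.size());
        for (Throwable failure : nodeFailures) {
            wrapped.add(new ShardFailure(index, shardId, failure));
        }
        return wrapped;
    }
}
```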
@areek change looks good. Did you see my comment about beefing up the testing?
3848ce1 to 9f8d224
@bleskes @clintongormley, I have updated the description with the new API, thoughts? It turned out to be a bit different from what we have discussed before, in terms of default behaviour. It would be good to have this reviewed.
/**
* Status used to choose shards to get store information on
*/
public enum Status {
Do we still need this? Can't we use ClusterHealthStatus? Now that we have the EnumSet, we don't need ALL anymore.
Thanks for the suggestion, we now use ClusterHealthStatus
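A sketch of why the EnumSet makes a separate ALL value unnecessary: the request can simply carry the set of health statuses it is interested in. The `Health` enum below is a stand-in for ClusterHealthStatus, and accepting a comma-separated parameter value is an assumption made for this example; the default of yellow and red is taken from the PR description.

```java
import java.util.EnumSet;
import java.util.Locale;

// Stand-in for illustration; Health mirrors the values of ClusterHealthStatus.
final class StatusParam {
    enum Health { GREEN, YELLOW, RED }

    /**
     * Parses a status parameter into a set of health statuses.
     * Assumption: multiple values are comma-separated.
     * Throws IllegalArgumentException for an unknown status name.
     */
    static EnumSet<Health> parse(String param) {
        if (param == null || param.trim().isEmpty()) {
            // Default from the PR description: yellow and red.
            return EnumSet.of(Health.YELLOW, Health.RED);
        }
        EnumSet<Health> result = EnumSet.noneOf(Health.class);
        for (String token : param.split(",")) {
            result.add(Health.valueOf(token.trim().toUpperCase(Locale.ROOT)));
        }
        return result;
    }
}
```

Requesting every status is then just `EnumSet.allOf(Health.class)`, with no dedicated enum constant.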
I left some final minor comments. I think we are getting close!
@bleskes Thanks for the review, addressed all your comments
@@ -0,0 +1,62 @@
/*
Left over IndicesShard_S_ ..
Left some final suggestions. Thx @areek
@bleskes Thanks for the review, updated the PR addressing all your comments.
LGTM. Left some very minor comments. Thx for all the hard work 👍
merged to master 7a21d84
This API provides store information for shard copies of indices.
Store information reports on which nodes shard copies exist, the shard
copy version, indicating how recent they are, and any exceptions
encountered while opening the shard index or from earlier engine failure.
By default, it only lists store information for shards that have at least one
unallocated copy. When the cluster health status is yellow, this will list
store information for shards that have at least one unassigned replica.
When the cluster health status is red, this will list store information
for shards that have unassigned primaries.
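The per-shard rule described above can be sketched as follows. This is an illustrative reading of the description, not the code from this PR: the `Health` enum stands in for ClusterHealthStatus, and the method names are hypothetical.

```java
import java.util.EnumSet;

// Illustrative sketch of the selection rule; names are hypothetical.
final class ShardHealth {
    enum Health { GREEN, YELLOW, RED }

    /** Red: primary unassigned. Yellow: some replica unassigned. Green: all copies assigned. */
    static Health of(boolean primaryAssigned, int unassignedReplicas) {
        if (!primaryAssigned) {
            return Health.RED;
        }
        return unassignedReplicas > 0 ? Health.YELLOW : Health.GREEN;
    }

    /** A shard is listed when its health is one of the requested statuses. */
    static boolean shouldList(EnumSet<Health> requested, boolean primaryAssigned, int unassignedReplicas) {
        return requested.contains(of(primaryAssigned, unassignedReplicas));
    }
}
```

With the default set of yellow and red, a fully assigned shard is skipped, while a shard with a missing replica or a missing primary is reported.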
Endpoints include shard stores information for a specific index, several
indices, or all:
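The three scopes can be sketched as a small path builder. Only the all-indices form `/_shard_stores` appears in this description; the index-scoped form used below follows the usual Elasticsearch convention of prefixing the endpoint with a comma-separated index list and should be treated as an assumption, as should the helper name `ShardStoresPath`.

```java
// Hypothetical helper for illustration; the index-scoped path form is an assumption.
final class ShardStoresPath {
    /** Builds the request path for all indices (no arguments) or the given indices. */
    static String build(String... indices) {
        if (indices == null || indices.length == 0) {
            return "/_shard_stores";
        }
        return "/" + String.join(",", indices) + "/_shard_stores";
    }
}
```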
The scope of shards to list store information for can be changed through the
status param. Defaults to 'yellow' and 'red'. 'yellow' lists store information of
shards with at least one unassigned replica and 'red' lists shards with an unassigned
primary shard.
Use 'green' to list store information for shards with all assigned copies.
curl -XGET 'http://localhost:9200/_shard_stores?status=green'
Response:
The shard stores information is grouped by indices and shard ids.
<1> The key is the corresponding shard id for the store information
<2> A list of store information for all copies of the shard
<3> The node information that hosts a copy of the store, the key
is the unique node id.
<4> The version of the store copy
<5> The status of the store copy, whether it is used as a
primary, replica or not used at all
<6> Any exception encountered while opening the shard index or
from earlier engine failure
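The grouping described by the callouts can be pictured with a small data model: index name, then shard id, then a list of store copies, each carrying the node id, version, allocation status and an optional exception. The class and field names below are hypothetical and chosen for this sketch; they are not the response classes from this PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model for illustration; not the response classes from the PR.
final class ShardStoresModel {
    enum Allocation { PRIMARY, REPLICA, UNUSED }

    static final class StoreCopy {
        final String nodeId;          // unique id of the node hosting the copy
        final long version;           // version of the store copy
        final Allocation allocation;  // primary, replica, or not used at all
        final String storeException;  // null when the store opened cleanly

        StoreCopy(String nodeId, long version, Allocation allocation, String storeException) {
            this.nodeId = nodeId;
            this.version = version;
            this.allocation = allocation;
            this.storeException = storeException;
        }
    }

    // index name -> shard id -> store copies
    private final Map<String, Map<Integer, List<StoreCopy>>> stores = new TreeMap<>();

    void add(String index, int shardId, StoreCopy copy) {
        stores.computeIfAbsent(index, k -> new TreeMap<>())
              .computeIfAbsent(shardId, k -> new ArrayList<>())
              .add(copy);
    }

    /** Returns the store copies recorded for a shard, or an empty list if none. */
    List<StoreCopy> copies(String index, int shardId) {
        Map<Integer, List<StoreCopy>> shards = stores.get(index);
        if (shards == null || !shards.containsKey(shardId)) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(shards.get(shardId));
    }
}
```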
closes #10952