Recovery API
Adds a new API endpoint at /_recovery, along with a corresponding Java API.
The recovery API shows the recovery status of all shards in the cluster. It
reports percent complete, recovery type, and which files have been copied.

Closes #4637
Andrew Selden committed Mar 20, 2014
1 parent 6977479 commit 91627fa
Showing 35 changed files with 2,448 additions and 501 deletions.
3 changes: 3 additions & 0 deletions docs/reference/indices.asciidoc
@@ -46,6 +46,7 @@ and warmers.
* <<indices-status>>
* <<indices-stats>>
* <<indices-segments>>
* <<indices-recovery>>

[float]
[[status-management]]
@@ -94,6 +95,8 @@ include::indices/stats.asciidoc[]

include::indices/segments.asciidoc[]

include::indices/recovery.asciidoc[]

include::indices/clearcache.asciidoc[]

include::indices/flush.asciidoc[]
194 changes: 194 additions & 0 deletions docs/reference/indices/recovery.asciidoc
@@ -0,0 +1,194 @@
[[indices-recovery]]
== Indices Recovery

The indices recovery API provides insight into on-going shard recoveries.
Recovery status may be reported for specific indices, or cluster-wide.

For example, the following command would show recovery information for the indices "index1" and "index2".

[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/index1,index2/_recovery?pretty=true'
--------------------------------------------------

To see cluster-wide recovery status simply leave out the index names.

[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/_recovery?pretty=true'
--------------------------------------------------

Response:

[source,js]
--------------------------------------------------
{
  "index1" : {
    "shards" : [ {
      "id" : 0,
      "type" : "snapshot",
      "stage" : "index",
      "primary" : true,
      "start_time" : "2014-02-24T12:15:59.716",
      "stop_time" : 0,
      "total_time_in_millis" : 175576,
      "source" : {
        "repository" : "my_repository",
        "snapshot" : "my_snapshot",
        "index" : "index1"
      },
      "target" : {
        "id" : "ryqJ5lO5S4-lSFbGntkEkg",
        "hostname" : "my.fqdn",
        "ip" : "10.0.1.7",
        "name" : "my_es_node"
      },
      "index" : {
        "files" : {
          "total" : 73,
          "reused" : 0,
          "recovered" : 69,
          "percent" : "94.5%"
        },
        "bytes" : {
          "total" : 79063092,
          "reused" : 0,
          "recovered" : 68891939,
          "percent" : "87.1%"
        },
        "total_time_in_millis" : 0
      },
      "translog" : {
        "recovered" : 0,
        "total_time_in_millis" : 0
      },
      "start" : {
        "check_index_time" : 0,
        "total_time_in_millis" : 0
      }
    } ]
  }
}
--------------------------------------------------

The above response shows a single index recovering a single shard. In this case, the source of the recovery is a snapshot repository
and the target of the recovery is the node with name "my_es_node".

Additionally, the output shows the number and percent of files recovered, as well as the number and percent of bytes recovered.
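
As an illustration only, and not the server's actual implementation, the `percent` values in the response above can be reproduced from the `recovered` and `total` counters. The helper name `percent_recovered`, the one-decimal formatting, and the handling of a zero total are assumptions inferred from the sample output.

```python
def percent_recovered(recovered: int, total: int) -> str:
    """Format recovered/total as a one-decimal percentage string.

    Hypothetical helper; returning "0.0%" for total == 0 is an assumption.
    """
    if total == 0:
        return "0.0%"
    return f"{100.0 * recovered / total:.1f}%"


# Counters copied from the sample response above.
files_percent = percent_recovered(69, 73)
bytes_percent = percent_recovered(68891939, 79063092)
print(files_percent, bytes_percent)
```

The two values differ because progress by file count and progress by byte count are tracked separately: a few large files can remain outstanding even when most files have been copied.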

In some cases a higher level of detail may be preferable. Setting "detailed=true" will present a list of physical files in recovery.

[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/_recovery?pretty=true&detailed=true'
--------------------------------------------------

Response:

[source,js]
--------------------------------------------------
{
  "index1" : {
    "shards" : [ {
      "id" : 0,
      "type" : "gateway",
      "stage" : "done",
      "primary" : true,
      "start_time" : "2014-02-24T12:38:06.349",
      "stop_time" : "2014-02-24T12:38:08.464",
      "total_time_in_millis" : 2115,
      "source" : {
        "id" : "RGMdRc-yQWWKIBM4DGvwqQ",
        "hostname" : "my.fqdn",
        "ip" : "10.0.1.7",
        "name" : "my_es_node"
      },
      "target" : {
        "id" : "RGMdRc-yQWWKIBM4DGvwqQ",
        "hostname" : "my.fqdn",
        "ip" : "10.0.1.7",
        "name" : "my_es_node"
      },
      "index" : {
        "files" : {
          "total" : 26,
          "reused" : 26,
          "recovered" : 26,
          "percent" : "100.0%",
          "details" : [ {
            "name" : "segments.gen",
            "length" : 20,
            "recovered" : 20
          }, {
            "name" : "_0.cfs",
            "length" : 135306,
            "recovered" : 135306
          }, {
            "name" : "segments_2",
            "length" : 251,
            "recovered" : 251
          },
          ...
          ]
        },
        "bytes" : {
          "total" : 26001617,
          "reused" : 26001617,
          "recovered" : 26001617,
          "percent" : "100.0%"
        },
        "total_time_in_millis" : 2
      },
      "translog" : {
        "recovered" : 71,
        "total_time_in_millis" : 2025
      },
      "start" : {
        "check_index_time" : 0,
        "total_time_in_millis" : 88
      }
    } ]
  }
}
--------------------------------------------------

This response shows a detailed listing (truncated for brevity) of the actual files recovered and their sizes.

Also shown are the timings in milliseconds of the various stages of recovery: index retrieval, translog replay, and index start time.
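
A small sketch of how a client might read those per-stage timings, using the values from the detailed sample response above. The helper `stage_timings` is hypothetical. In this sample the three stage timings happen to add up to the shard's `total_time_in_millis`; that should not be assumed to hold in general.

```python
# Timings (milliseconds) copied from the detailed sample response above.
shard = {
    "total_time_in_millis": 2115,
    "index": {"total_time_in_millis": 2},
    "translog": {"total_time_in_millis": 2025},
    "start": {"total_time_in_millis": 88},
}

STAGES = ("index", "translog", "start")


def stage_timings(shard: dict) -> dict:
    """Return the per-stage time in milliseconds for one shard entry."""
    return {stage: shard[stage]["total_time_in_millis"] for stage in STAGES}


timings = stage_timings(shard)
slowest = max(timings, key=timings.get)
print(timings, "slowest:", slowest, "sum:", sum(timings.values()))
```

Here translog replay accounts for almost all of the recovery time, which is plausible for a gateway recovery in which every index file was reused.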

Note that the above listing indicates that the recovery is in stage "done". All recoveries, whether on-going or complete, are kept in
cluster state and may be reported on at any time. Setting "active_only=true" will cause only on-going recoveries to be reported.

Here is a complete list of options:

[horizontal]
`detailed`:: Display a detailed view. This is primarily useful for viewing the recovery of physical index files. Default: false.
`active_only`:: Display only those recoveries that are currently on-going. Default: false.
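
The options combine as ordinary query-string parameters. The following is a minimal sketch of building request URLs in Python; the helper `recovery_url` is hypothetical, and only the `/_recovery` paths and the `detailed` and `active_only` parameters are taken from the documentation above.

```python
from urllib.parse import quote, urlencode


def recovery_url(base, indices=(), detailed=False, active_only=False):
    """Build a request URL for the recovery endpoint (hypothetical helper)."""
    path = "/_recovery"
    if indices:
        # Index names are joined with commas, as in /index1,index2/_recovery.
        path = "/" + ",".join(quote(name, safe="") for name in indices) + path
    params = {}
    if detailed:
        params["detailed"] = "true"
    if active_only:
        params["active_only"] = "true"
    query = "?" + urlencode(params) if params else ""
    return base.rstrip("/") + path + query


print(recovery_url("http://localhost:9200", ["index1", "index2"], detailed=True))
```

When such a URL is pasted into a shell command it should be quoted, since an unquoted `&` is interpreted by the shell rather than passed to curl.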

Description of output fields:

[horizontal]
`id`:: Shard ID
`type`:: Recovery type:
* gateway
* snapshot
* replica
* relocating
`stage`:: Recovery stage:
* init: Recovery has not started
* index: Reading index meta-data and copying bytes from source to destination
* start: Starting the engine; opening the index for use
* translog: Replaying transaction log
* finalize: Cleanup
* done: Complete
`primary`:: True if shard is primary, false otherwise
`start_time`:: Timestamp of recovery start
`stop_time`:: Timestamp of recovery finish
`total_time_in_millis`:: Total time to recover shard in milliseconds
`source`:: Recovery source:
* repository description if recovery is from a snapshot
* description of source node otherwise
`target`:: Destination node
`index`:: Statistics about physical index recovery
`translog`:: Statistics about translog recovery
`start`:: Statistics about time to open and start the index
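
As a sketch of how these fields might be consumed, the function below walks a parsed response and lists the shards that have not yet reached the "done" stage. The function name and the trimmed-down sample data are hypothetical. The stage is compared case-insensitively, which is an assumption made because the examples above show lower-case values while the REST test in this commit matches upper-case ones.

```python
def ongoing_shards(response: dict) -> list:
    """Return (index_name, shard_id, stage) for shards not yet in stage "done"."""
    result = []
    for index_name, index_info in response.items():
        for shard in index_info.get("shards", []):
            # Case-insensitive comparison is an assumption (see lead-in).
            if shard["stage"].lower() != "done":
                result.append((index_name, shard["id"], shard["stage"]))
    return result


# Trimmed-down, hypothetical response modelled on the field list above.
sample = {
    "index1": {"shards": [{"id": 0, "type": "snapshot", "stage": "index"}]},
    "index2": {"shards": [{"id": 0, "type": "gateway", "stage": "done"}]},
}
print(ongoing_shards(sample))
```

Requesting `active_only=true` lets the server do this filtering instead; the client-side version is useful mainly when a full response is already in hand.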
4 changes: 0 additions & 4 deletions rest-api-spec/api/cat.recovery.json
@@ -17,10 +17,6 @@
"description" : "The unit in which to display byte values",
"options": [ "b", "k", "m", "g" ]
},
"local": {
"type" : "boolean",
"description" : "Return local information, do not retrieve the state from master node (default: false)"
},
"master_timeout": {
"type" : "time",
"description" : "Explicit operation timeout for connection to master node"
34 changes: 34 additions & 0 deletions rest-api-spec/api/indices.recovery.json
@@ -0,0 +1,34 @@
{
  "indices.recovery" : {
    "documentation": "http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/indices-recovery.html",
    "methods": ["GET"],
    "url": {
      "path": "/_recovery",
      "paths": ["/_recovery", "/{index}/_recovery"],
      "parts": {
        "index": {
          "type" : "list",
          "description" : "A comma-separated list of index names; use `_all` or empty string to perform the operation on all indices"
        }
      },
      "params": {
        "detailed" : {
          "type": "boolean",
          "description": "Whether to display detailed information about shard recovery",
          "default": false
        },
        "active_only" : {
          "type": "boolean",
          "description": "Display only those recoveries that are currently on-going",
          "default": false
        },
        "human": {
          "type": "boolean",
          "description": "Whether to return time and byte values in human-readable format.",
          "default": false
        }
      }
    },
    "body": null
  }
}
26 changes: 26 additions & 0 deletions rest-api-spec/test/cat.recovery/10_basic.yaml
@@ -0,0 +1,26 @@
---
"Test cat recovery output":

  - do:
      cat.recovery: {}

  - match:
      $body: >
        /^$/
  - do:
      index:
        index: index1
        type: type1
        id: 1
        body: { foo: bar }
        refresh: true
  - do:
      cluster.health:
        wait_for_status: yellow
  - do:
      cat.recovery: {}
  - match:
      $body: >
        /^(index1\s+\d+\s+\d+\s+(gateway|replica|snapshot|relocating)\s+(init|index|start|translog|finalize|done)\s+([a-zA-Z_0-9/.])+\s+([a-zA-Z_0-9/.])+\s+([a-zA-Z_0-9/.])+\s+([a-zA-Z_0-9/.])+\s+\d+\s+\d+\.\d+\%\s+\d+\s+\d+\.\d+\%\s+\n?){1,}$/
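
To see what the final pattern in this test accepts, the sketch below applies it with Python's `re` module. The sample row is hypothetical: it was written to fit the pattern and is not captured output from a real cluster. It is also an assumption that Python's regex semantics match those of the test runner closely enough for this pattern.

```python
import re

# Pattern copied from the test above, without the surrounding slashes.
CAT_RECOVERY_LINE = re.compile(
    r"^(index1\s+\d+\s+\d+\s+(gateway|replica|snapshot|relocating)\s+"
    r"(init|index|start|translog|finalize|done)\s+"
    r"([a-zA-Z_0-9/.])+\s+([a-zA-Z_0-9/.])+\s+"
    r"([a-zA-Z_0-9/.])+\s+([a-zA-Z_0-9/.])+\s+"
    r"\d+\s+\d+\.\d+\%\s+\d+\s+\d+\.\d+\%\s+\n?){1,}$"
)

# Hypothetical row: index, two numbers, type, stage, four host-like tokens,
# then a file count, file percent, byte count, and byte percent.
sample = "index1 0 2115 gateway done my.fqdn my.fqdn n/a n/a 26 100.0% 26001617 100.0%\n"

matches = CAT_RECOVERY_LINE.match(sample) is not None
rejects_unknown_stage = CAT_RECOVERY_LINE.match(sample.replace("done", "paused")) is None
print(matches, rejects_unknown_stage)
```

Note that the pattern requires whitespace after the last percentage, so a row must end with a trailing space or newline to match.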
32 changes: 32 additions & 0 deletions rest-api-spec/test/indices.recovery/10_basic.yaml
@@ -0,0 +1,32 @@
---
"Indices recovery test":

  - skip:
      features: gtelte

  - do:
      indices.create:
        index: test_1

  - do:
      indices.recovery:
        index: [test_1]

  - match: { test_1.shards.0.type: "GATEWAY" }
  - match: { test_1.shards.0.stage: "DONE" }
  - match: { test_1.shards.0.primary: true }
  - match: { test_1.shards.0.target.ip: /^\d+\.\d+\.\d+\.\d+$/ }
  - gte: { test_1.shards.0.index.files.total: 0 }
  - gte: { test_1.shards.0.index.files.reused: 0 }
  - gte: { test_1.shards.0.index.files.recovered: 0 }
  - match: { test_1.shards.0.index.files.percent: /^\d+\.\d\%$/ }
  - gte: { test_1.shards.0.index.bytes.total: 0 }
  - gte: { test_1.shards.0.index.bytes.reused: 0 }
  - gte: { test_1.shards.0.index.bytes.recovered: 0 }
  - match: { test_1.shards.0.index.bytes.percent: /^\d+\.\d\%$/ }
  - gte: { test_1.shards.0.translog.recovered: 0 }
  - gte: { test_1.shards.0.translog.total_time_in_millis: 0 }
  - gte: { test_1.shards.0.start.check_index_time_in_millis: 0 }
  - gte: { test_1.shards.0.start.total_time_in_millis: 0 }


3 changes: 3 additions & 0 deletions src/main/java/org/elasticsearch/action/ActionModule.java
@@ -95,6 +95,8 @@
import org.elasticsearch.action.admin.indices.optimize.TransportOptimizeAction;
import org.elasticsearch.action.admin.indices.refresh.RefreshAction;
import org.elasticsearch.action.admin.indices.refresh.TransportRefreshAction;
import org.elasticsearch.action.admin.indices.recovery.RecoveryAction;
import org.elasticsearch.action.admin.indices.recovery.TransportRecoveryAction;
import org.elasticsearch.action.admin.indices.segments.IndicesSegmentsAction;
import org.elasticsearch.action.admin.indices.segments.TransportIndicesSegmentsAction;
import org.elasticsearch.action.admin.indices.settings.get.GetSettingsAction;
@@ -284,6 +286,7 @@ protected void configure() {
registerAction(MultiPercolateAction.INSTANCE, TransportMultiPercolateAction.class, TransportShardMultiPercolateAction.class);
registerAction(ExplainAction.INSTANCE, TransportExplainAction.class);
registerAction(ClearScrollAction.INSTANCE, TransportClearScrollAction.class);
registerAction(RecoveryAction.INSTANCE, TransportRecoveryAction.class);

// register Name -> GenericAction Map that can be injected to instances.
MapBinder<String, GenericAction> actionsBinder
@@ -0,0 +1,46 @@
/*
 * Licensed to Elasticsearch under one or more contributor
 * license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright
 * ownership. Elasticsearch licenses this file to you under
 * the Apache License, Version 2.0 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.elasticsearch.action.admin.indices.recovery;

import org.elasticsearch.client.IndicesAdminClient;
import org.elasticsearch.action.admin.indices.IndicesAction;

/**
 * Recovery information action
 */
public class RecoveryAction extends IndicesAction<RecoveryRequest, RecoveryResponse, RecoveryRequestBuilder> {

    public static final RecoveryAction INSTANCE = new RecoveryAction();
    public static final String NAME = "indices/recovery";

    private RecoveryAction() {
        super(NAME);
    }

    @Override
    public RecoveryRequestBuilder newRequestBuilder(IndicesAdminClient client) {
        return new RecoveryRequestBuilder(client);
    }

    @Override
    public RecoveryResponse newResponse() {
        return new RecoveryResponse();
    }
}
