Add elasticsearch health API (#83119)
Add an API to return information about Elasticsearch health status.

Relates to #83303.
Tim-Brooks committed Feb 3, 2022
1 parent d2e7b4c commit ea96bfe
Showing 14 changed files with 499 additions and 33 deletions.
5 changes: 5 additions & 0 deletions docs/changelog/83119.yaml
@@ -0,0 +1,5 @@
pr: 83119
summary: Add elasticsearch health API
area: Distributed
type: enhancement
issues: []
69 changes: 69 additions & 0 deletions docs/reference/health/health.asciidoc
@@ -0,0 +1,69 @@
[[health-api]]
=== Health API
++++
<titleabbrev>Health</titleabbrev>
++++

An experimental API that returns the health status of an {es} cluster.

This API is currently experimental and intended for internal use by Elastic software only.

NOTE: {cloud-only}

[[health-api-request]]
==== {api-request-title}

`GET /_internal/_health`

[[health-api-prereqs]]
==== {api-prereq-title}

* If the {es} {security-features} are enabled, you must have the `monitor` or
`manage` <<privileges-list-cluster,cluster privilege>> to use this API.

[[health-api-desc]]
==== {api-description-title}

The health API returns the health status of an Elasticsearch cluster. It
returns a list of components that compose Elasticsearch functionality. Each
component's health is determined by health indicators associated with the
component.

Each indicator has a health status of `green`, `yellow`, or `red`. The indicator
provides an explanation and metadata describing the reason for its current health status.

A component's status is controlled by the worst indicator status. The cluster's status
is controlled by the worst component status.

[[health-api-query-params]]
==== {api-query-parms-title}

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=local]

include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=timeoutparms]

[[health-api-response-body]]
==== {api-response-body-title}

`cluster_name`::
(string) The name of the cluster.

`status`::
include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=cluster-health-status]

`impacts`::
(list) A list of current health impacts to the cluster.

`components`::
(object) Information about the health of the cluster components.

[[cluster-health-api-example]]
==== {api-examples-title}

[source,console]
--------------------------------------------------
GET _internal/_health
--------------------------------------------------

The API returns a response with all the components and indicators regardless
of current status.
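
As a rough illustration of the request documented above, the following sketch issues `GET /_internal/_health` with the JDK's built-in HTTP client. The class name `HealthRequestSketch`, the helper `buildRequest`, the base URL `http://localhost:9200`, and the 10-second timeout are assumptions for illustration only. It also assumes a local node without security enabled; a secured cluster would additionally need credentials with the `monitor` or `manage` cluster privilege.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class HealthRequestSketch {

    // Builds a GET request for the internal health endpoint.
    // baseUrl is supplied by the caller; nothing in the commit fixes a host or port.
    static HttpRequest buildRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_internal/_health"))
            .header("Accept", "application/json") // the REST spec lists application/json as the accepted type
            .timeout(Duration.ofSeconds(10))      // client-side timeout, an arbitrary choice for this sketch
            .GET()
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumed default: an unsecured node listening on localhost:9200.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:9200";
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(buildRequest(baseUrl), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Keeping request construction in its own method lets the URL and headers be checked without a running cluster.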
29 changes: 29 additions & 0 deletions rest-api-spec/src/main/resources/rest-api-spec/api/health.json
@@ -0,0 +1,29 @@
{
  "health":{
    "documentation":{
      "url": null,
      "description":"Returns the health of the cluster."
    },
    "stability":"experimental",
    "visibility":"private",
    "headers":{
      "accept": [ "application/json"]
    },
    "url":{
      "paths":[
        {
          "path":"/_internal/_health",
          "methods":[
            "GET"
          ]
        }
      ]
    },
    "params":{
      "timeout":{
        "type":"time",
        "description":"Explicit operation timeout"
      }
    }
  }
}
@@ -0,0 +1,20 @@
---
"cluster health basic test":
  - skip:
      version: "- 8.1.99"
      reason: "health was only added in 8.2.0"

  - do:
      health: {}

  - is_true: cluster_name
  - match: { status: "GREEN" }
  - match: { impacts: [] }
  - match: { components.cluster_coordination.status: "GREEN" }
  - match: { components.cluster_coordination.indicators.instance_has_master.status: "GREEN" }
  - match: { components.cluster_coordination.indicators.instance_has_master.summary: "Health coordinating instance has a master node." }
  - is_true: components.cluster_coordination.indicators.instance_has_master.details.coordinating_node.node_id
  - is_true: components.cluster_coordination.indicators.instance_has_master.details.coordinating_node.name
  - is_true: components.cluster_coordination.indicators.instance_has_master.details.master_node.node_id
  - is_true: components.cluster_coordination.indicators.instance_has_master.details.master_node.name
  - match: { components.snapshots.status: "GREEN" }
@@ -0,0 +1,99 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

package org.elasticsearch.health;

import org.elasticsearch.client.internal.Client;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.coordination.NoMasterBlockService;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.health.components.controller.ClusterCoordination;
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.test.ESIntegTestCase;
import org.elasticsearch.test.disruption.NetworkDisruption;
import org.elasticsearch.test.transport.MockTransportService;

import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;

@ESIntegTestCase.ClusterScope(scope = ESIntegTestCase.Scope.SUITE)
public class GetHealthActionIT extends ESIntegTestCase {

    @Override
    protected Collection<Class<? extends Plugin>> nodePlugins() {
        return Collections.singletonList(MockTransportService.TestPlugin.class);
    }

    @Override
    protected Settings nodeSettings(int nodeOrdinal, Settings otherSettings) {
        return Settings.builder()
            .put(super.nodeSettings(nodeOrdinal, otherSettings))
            .put(NoMasterBlockService.NO_MASTER_BLOCK_SETTING.getKey(), "all")
            .build();
    }

    public void testGetHealth() throws Exception {
        GetHealthAction.Response response = client().execute(GetHealthAction.INSTANCE, new GetHealthAction.Request()).get();
        assertEquals(cluster().getClusterName(), response.getClusterName().value());
        assertEquals(HealthStatus.GREEN, response.getStatus());

        assertEquals(2, response.getComponents().size());

        for (HealthComponentResult component : response.getComponents()) {
            assertEquals(HealthStatus.GREEN, component.status());
        }

        HealthComponentResult controller = response.getComponents()
            .stream()
            .filter(c -> c.name().equals("cluster_coordination"))
            .findAny()
            .orElseThrow();
        assertEquals(1, controller.indicators().size());
        HealthIndicatorResult nodeDoesNotHaveMaster = controller.indicators().get(ClusterCoordination.INSTANCE_HAS_MASTER_NAME);
        assertEquals(ClusterCoordination.INSTANCE_HAS_MASTER_NAME, nodeDoesNotHaveMaster.name());
        assertEquals(HealthStatus.GREEN, nodeDoesNotHaveMaster.status());
        assertEquals(ClusterCoordination.INSTANCE_HAS_MASTER_GREEN_SUMMARY, nodeDoesNotHaveMaster.summary());
    }

    public void testGetHealthInstanceNoMaster() throws Exception {
        Client client = internalCluster().coordOnlyNodeClient();

        final NetworkDisruption disruptionScheme = new NetworkDisruption(
            new NetworkDisruption.IsolateAllNodes(new HashSet<>(Arrays.asList(internalCluster().getNodeNames()))),
            NetworkDisruption.DISCONNECT
        );

        internalCluster().setDisruptionScheme(disruptionScheme);
        disruptionScheme.startDisrupting();

        try {
            assertBusy(() -> {
                ClusterState state = client.admin().cluster().prepareState().setLocal(true).execute().actionGet().getState();
                assertTrue(state.blocks().hasGlobalBlockWithId(NoMasterBlockService.NO_MASTER_BLOCK_ID));

                GetHealthAction.Response response = client().execute(GetHealthAction.INSTANCE, new GetHealthAction.Request()).get();
                assertEquals(HealthStatus.RED, response.getStatus());
                assertEquals(2, response.getComponents().size());
                HealthComponentResult controller = response.getComponents()
                    .stream()
                    .filter(c -> c.name().equals("cluster_coordination"))
                    .findAny()
                    .orElseThrow();
                assertEquals(1, controller.indicators().size());
                HealthIndicatorResult instanceHasMaster = controller.indicators().get(ClusterCoordination.INSTANCE_HAS_MASTER_NAME);
                assertEquals(ClusterCoordination.INSTANCE_HAS_MASTER_NAME, instanceHasMaster.name());
                assertEquals(HealthStatus.RED, instanceHasMaster.status());
                assertEquals(ClusterCoordination.INSTANCE_HAS_MASTER_RED_SUMMARY, instanceHasMaster.summary());
            });
        } finally {
            internalCluster().clearDisruptionScheme(true);
        }
    }
}
@@ -255,6 +255,8 @@
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.settings.SettingsFilter;
import org.elasticsearch.gateway.TransportNodesListGatewayStartedShards;
import org.elasticsearch.health.GetHealthAction;
import org.elasticsearch.health.RestGetHealthAction;
import org.elasticsearch.index.seqno.GlobalCheckpointSyncAction;
import org.elasticsearch.index.seqno.RetentionLeaseActions;
import org.elasticsearch.indices.SystemIndices;
@@ -534,6 +536,7 @@ public <Request extends ActionRequest, Response extends ActionResponse> void reg
actions.register(ListTasksAction.INSTANCE, TransportListTasksAction.class);
actions.register(GetTaskAction.INSTANCE, TransportGetTaskAction.class);
actions.register(CancelTasksAction.INSTANCE, TransportCancelTasksAction.class);
actions.register(GetHealthAction.INSTANCE, GetHealthAction.TransportAction.class);

actions.register(AddVotingConfigExclusionsAction.INSTANCE, TransportAddVotingConfigExclusionsAction.class);
actions.register(ClearVotingConfigExclusionsAction.INSTANCE, TransportClearVotingConfigExclusionsAction.class);
@@ -739,6 +742,7 @@ public void initRestHandlers(Supplier<DiscoveryNodes> nodesInCluster) {
registerHandler.accept(new RestCloseIndexAction());
registerHandler.accept(new RestOpenIndexAction());
registerHandler.accept(new RestAddIndexBlockAction());
registerHandler.accept(new RestGetHealthAction());

registerHandler.accept(new RestUpdateSettingsAction());
registerHandler.accept(new RestGetSettingsAction());
126 changes: 126 additions & 0 deletions server/src/main/java/org/elasticsearch/health/GetHealthAction.java
@@ -0,0 +1,126 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

package org.elasticsearch.health;

import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.ActionRequest;
import org.elasticsearch.action.ActionRequestValidationException;
import org.elasticsearch.action.ActionResponse;
import org.elasticsearch.action.ActionType;
import org.elasticsearch.action.support.ActionFilters;
import org.elasticsearch.cluster.ClusterName;
import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.service.ClusterService;
import org.elasticsearch.common.inject.Inject;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.health.components.controller.ClusterCoordination;
import org.elasticsearch.tasks.Task;
import org.elasticsearch.transport.TransportService;
import org.elasticsearch.xcontent.ToXContent;
import org.elasticsearch.xcontent.ToXContentObject;
import org.elasticsearch.xcontent.XContentBuilder;

import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class GetHealthAction extends ActionType<GetHealthAction.Response> {

    public static final GetHealthAction INSTANCE = new GetHealthAction();
    public static final String NAME = "cluster:monitor/health_api";

    private GetHealthAction() {
        super(NAME, GetHealthAction.Response::new);
    }

    public static class Response extends ActionResponse implements ToXContentObject {

        private final ClusterName clusterName;
        private final HealthStatus status;
        private final List<HealthComponentResult> components;

        public Response(StreamInput in) {
            throw new AssertionError("GetHealthAction should not be sent over the wire.");
        }

        public Response(final ClusterName clusterName, final List<HealthComponentResult> components) {
            this.clusterName = clusterName;
            this.components = components;
            this.status = HealthStatus.merge(components.stream().map(HealthComponentResult::status));
        }

        public ClusterName getClusterName() {
            return clusterName;
        }

        public HealthStatus getStatus() {
            return status;
        }

        public List<HealthComponentResult> getComponents() {
            return components;
        }

        @Override
        public void writeTo(StreamOutput out) throws IOException {
            throw new AssertionError("GetHealthAction should not be sent over the wire.");
        }

        @Override
        public XContentBuilder toXContent(XContentBuilder builder, ToXContent.Params params) throws IOException {
            builder.startObject();
            builder.field("status", status);
            builder.field("cluster_name", clusterName.value());
            builder.array("impacts");
            builder.startObject("components");
            for (HealthComponentResult component : components) {
                builder.field(component.name(), component, params);
            }
            builder.endObject();
            return builder.endObject();
        }
    }

    public static class Request extends ActionRequest {

        @Override
        public ActionRequestValidationException validate() {
            return null;
        }
    }

    public static class TransportAction extends org.elasticsearch.action.support.TransportAction<Request, Response> {

        private final ClusterService clusterService;

        @Inject
        public TransportAction(
            final ActionFilters actionFilters,
            final TransportService transportService,
            final ClusterService clusterService
        ) {
            super(NAME, actionFilters, transportService.getTaskManager());
            this.clusterService = clusterService;
        }

        @Override
        protected void doExecute(Task task, Request request, ActionListener<Response> listener) {
            final ClusterState clusterState = clusterService.state();
            final HealthComponentResult controller = ClusterCoordination.createClusterCoordinationComponent(
                clusterService.localNode(),
                clusterState
            );
            final HealthComponentResult snapshots = new HealthComponentResult("snapshots", HealthStatus.GREEN, Collections.emptyMap());
            final ClusterName clusterName = clusterService.getClusterName();
            listener.onResponse(new Response(clusterName, Arrays.asList(controller, snapshots)));
        }
    }
}
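
The `Response` constructor derives the cluster status by calling `HealthStatus.merge` over the component statuses, and the documentation states that the worst status wins at each level. `HealthStatus` itself is not shown in this part of the diff, so the following is only a minimal, self-contained sketch of one way such a worst-status rollup could work. The enum `Status`, the record `Component`, the indicator name `example_indicator`, and the choice to reject an empty stream are all hypothetical and are not taken from the commit.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

public class WorstStatusSketch {

    // Hypothetical stand-in for HealthStatus, declared from best to worst
    // so that ordinal() ranks severity.
    enum Status { GREEN, YELLOW, RED }

    // Returns the worst status in the stream. Rejecting an empty stream is an
    // assumption of this sketch, not documented behavior of the real class.
    static Status merge(Stream<Status> statuses) {
        return statuses.max(Comparator.comparingInt(Status::ordinal))
            .orElseThrow(() -> new IllegalArgumentException("cannot merge an empty set of statuses"));
    }

    // Hypothetical component: a name plus the status of each of its indicators.
    record Component(String name, Map<String, Status> indicators) {
        // A component is as unhealthy as its worst indicator.
        Status status() {
            return merge(indicators.values().stream());
        }
    }

    // The cluster is as unhealthy as its worst component.
    static Status clusterStatus(List<Component> components) {
        return merge(components.stream().map(Component::status));
    }

    public static void main(String[] args) {
        Component coordination = new Component("cluster_coordination", Map.of("instance_has_master", Status.RED));
        Component snapshots = new Component("snapshots", Map.of("example_indicator", Status.GREEN));
        System.out.println(clusterStatus(List.of(coordination, snapshots))); // prints RED
    }
}
```

Ordering the enum constants by severity keeps the rollup to a single `max` call at both the component and the cluster level.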
