Added wait_for_metadata_version parameter to cluster state api. #35535

martijnvg · 2018-11-14T11:06:59Z

The wait_for_metadata_version parameter instructs the cluster state
api to hold its response until the metadata's version is equal to or
greater than the version specified in wait_for_metadata_version, or
until the specified wait_for_timeout has expired. In the latter case the
cluster state is returned and the time_out field is set to true.

If the metadata's version is already equal to or higher than
wait_for_metadata_version, the api returns immediately.

This feature saves external components from constantly polling the
cluster state to check whether something has changed in the cluster
state's metadata.
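
As a rough illustration of how an external component might use the two parameters, the sketch below only builds the request URI. The parameter names come from this PR and `/_cluster/state` is the cluster state REST endpoint; the helper class, its method name, and the host are hypothetical.

```java
import java.net.URI;

// Hypothetical helper, not part of Elasticsearch: builds the URI for a
// cluster state request that waits for a given metadata version.
final class ClusterStateUriSketch {

    static URI waitForMetadataVersion(String baseUrl, long metadataVersion, String waitForTimeout) {
        return URI.create(baseUrl + "/_cluster/state"
            + "?wait_for_metadata_version=" + metadataVersion
            + "&wait_for_timeout=" + waitForTimeout);
    }

    public static void main(String[] args) {
        // A caller that last saw metadata version 41 asks for version 42 or newer
        // and is willing to wait up to 30 seconds for it.
        System.out.println(waitForMetadataVersion("http://localhost:9200", 42, "30s"));
    }
}
```

A caller would typically remember the metadata version from the previous response and ask for the next one, instead of polling on a fixed interval.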

I'm not sure what the best feature label for this issue is, so I labelled it as :Core/Core. I'm happy to change the label.


elasticmachine · 2018-11-14T11:07:01Z

Pinging @elastic/es-core-infra


jasontedor

The change looks good overall. I left a few minor comments.

The only thing I am doubting is whether or not to return the last known state after a timeout. If we don't do that, we don't need to change the ClusterStateObserver.Listener interface. What is your thinking behind sending back a cluster state in this case as opposed to an empty response that would more clearly indicate the request timed out?

I'll do another quick pass after we come to a decision on this point.

jasontedor · 2018-11-21T11:16:02Z

server/src/main/java/org/elasticsearch/action/admin/cluster/state/ClusterStateResponse.java

+     * Returns whether waiting for a cluster state wait a metadata version equal or higher than the specified
+     * metadata version timed out.
+     */
+    public boolean isTimedOut() {


The Javadocs are clear but I think the name could be clearer and reflect the name of the parameter on the request: isWaitForTimedOut

jasontedor · 2018-11-21T11:16:20Z

server/src/main/java/org/elasticsearch/action/admin/cluster/state/ClusterStateResponse.java


    public ClusterStateResponse() {
    }

-    public ClusterStateResponse(ClusterName clusterName, ClusterState clusterState, long sizeInBytes) {
+    public ClusterStateResponse(ClusterName clusterName, ClusterState clusterState, long sizeInBytes, boolean timedOut) {


timedOut -> waitForTimedOut (to clarify that it's not a generic timeout)

jasontedor · 2018-11-21T11:16:30Z

server/src/main/java/org/elasticsearch/action/admin/cluster/state/ClusterStateResponse.java

@@ -40,14 +41,16 @@
    // the total compressed size of the full cluster state, not just
    // the parts included in this response
    private ByteSizeValue totalCompressedSize;
+    private boolean timedOut = false;


timedOut -> waitForTimedOut

jasontedor · 2018-11-21T11:16:56Z

server/src/main/java/org/elasticsearch/action/admin/cluster/state/ClusterStateResponse.java

+
+    @Override
+    public int hashCode() {
+        // Best effort for testing. Left out cluster state, because it doesn't implement equals() and hashcode()


Maybe hash the version and the master node ID (that's what an observer considers to be the same for purposes of comparing cluster states)?

jasontedor · 2018-11-21T11:17:02Z

server/src/main/java/org/elasticsearch/action/admin/cluster/state/ClusterStateResponse.java

+        ClusterStateResponse response = (ClusterStateResponse) o;
+        return timedOut == response.timedOut &&
+            Objects.equals(clusterName, response.clusterName) &&
+            // Best effort. Left out cluster state, because it doesn't implement equals() and hashcode()


Maybe compare the version and the master node ID (that's what an observer considers to be the same for purposes of comparing cluster states)?
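
A minimal sketch of what this suggestion could look like, using simplified stand-in classes rather than the real `ClusterState` and `ClusterStateResponse`. All class and field names here are hypothetical; only the idea of comparing the version and the master node ID comes from the review comment.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for a cluster state: only the two
// fields the review comment mentions are modelled.
final class StateSketch {
    final long version;
    final String masterNodeId;

    StateSketch(long version, String masterNodeId) {
        this.version = version;
        this.masterNodeId = masterNodeId;
    }
}

// Hypothetical stand-in for the response class.
final class ResponseSketch {
    final String clusterName;
    final StateSketch state; // may be null
    final boolean waitForTimedOut;

    ResponseSketch(String clusterName, StateSketch state, boolean waitForTimedOut) {
        this.clusterName = clusterName;
        this.state = state;
        this.waitForTimedOut = waitForTimedOut;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        ResponseSketch other = (ResponseSketch) o;
        // The state itself is not compared; its version and master node ID stand in for it.
        return waitForTimedOut == other.waitForTimedOut
            && Objects.equals(clusterName, other.clusterName)
            && Objects.equals(version(state), version(other.state))
            && Objects.equals(masterNodeId(state), masterNodeId(other.state));
    }

    @Override
    public int hashCode() {
        // Hash exactly the values that equals() compares.
        return Objects.hash(clusterName, version(state), masterNodeId(state), waitForTimedOut);
    }

    private static Long version(StateSketch s) {
        return s == null ? null : s.version;
    }

    private static String masterNodeId(StateSketch s) {
        return s == null ? null : s.masterNodeId;
    }
}
```

Keeping `equals()` and `hashCode()` on the same set of values preserves the usual contract that equal objects have equal hash codes.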

jasontedor · 2018-11-21T11:17:23Z

.../src/main/java/org/elasticsearch/action/admin/cluster/state/TransportClusterStateAction.java

+
+    private void buildResponse(final ClusterStateRequest request,
+                               final ClusterState currentState,
+                               final boolean timedOut,


waitForTimedOut


martijnvg · 2018-11-21T13:18:48Z

Thanks for reviewing @jasontedor

> The only thing I am doubting is whether or not to return the last known state after a timeout. If we don't do that, we don't need to change the ClusterStateObserver.Listener interface. What is your thinking behind sending back a cluster state in this case as opposed to an empty response that would more clearly indicate the request timed out?

I wanted it to be similar to the shard changes api. There, when a request times out waiting for new ops, it returns a response with potentially updated mapping / settings versions and no ops. For the cluster state api, other things outside the metadata may have changed, so it made sense to me to return the cluster state with a timed out flag.


martijnvg · 2018-11-21T22:12:43Z

After talking to @jasontedor, it made sense to not return a cluster state when wait for metadata version has timed out, so I made that change. This also means the change to ClusterStateObserver.Listener has been reverted.
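
The agreed behaviour (return immediately if the version is already reached, otherwise wait, and on timeout return no state plus a timed out flag) can be modelled roughly as below. This is a self-contained sketch built on plain Java monitors, not the `ClusterStateObserver` based implementation in the PR; every name in it is hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical model, not Elasticsearch code: tracks a metadata version and
// lets callers block until it reaches a wanted value or a timeout expires.
final class MetadataVersionWaiter {

    /** Outcome of a wait: observedVersion is null when the wait timed out. */
    static final class Result {
        final Long observedVersion;
        final boolean waitForTimedOut;

        Result(Long observedVersion, boolean waitForTimedOut) {
            this.observedVersion = observedVersion;
            this.waitForTimedOut = waitForTimedOut;
        }
    }

    private long metadataVersion;

    MetadataVersionWaiter(long initialVersion) {
        this.metadataVersion = initialVersion;
    }

    /** Records a newer metadata version and wakes up all waiters. */
    synchronized void publish(long newVersion) {
        metadataVersion = Math.max(metadataVersion, newVersion);
        notifyAll();
    }

    /** Returns at once if the version is already reached, otherwise waits up to the timeout. */
    synchronized Result await(long wantedVersion, long timeout, TimeUnit unit) {
        final long deadline = System.nanoTime() + unit.toNanos(timeout);
        try {
            while (metadataVersion < wantedVersion) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    // Timed out: no state in the result, only the flag.
                    return new Result(null, true);
                }
                TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new Result(metadataVersion, false);
    }
}
```

Returning an empty result on timeout makes the two outcomes easy to tell apart for the caller, which is the point raised in the review.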

jasontedor

Looking good. I left a few more minors.

jasontedor · 2018-11-21T22:11:43Z

server/src/main/java/org/elasticsearch/action/admin/cluster/state/ClusterStateResponse.java

    }

    @Override
    public void writeTo(StreamOutput out) throws IOException {
        super.writeTo(out);
        clusterName.writeTo(out);
-        if (out.getVersion().onOrAfter(Version.V_6_3_0)) {
-            clusterState.writeTo(out);
+        if (out.getVersion().onOrAfter(Version.V_7_0_0)) {


TODO: 6.6.0

jasontedor · 2018-11-21T22:11:47Z

server/src/main/java/org/elasticsearch/action/admin/cluster/state/ClusterStateResponse.java

    @Override
    public void readFrom(StreamInput in) throws IOException {
        super.readFrom(in);
        clusterName = new ClusterName(in);
-        clusterState = ClusterState.readFrom(in, null);
+        if (in.getVersion().onOrAfter(Version.V_7_0_0)) {


TODO: 6.6.0
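
The version checks in these hunks gate the new wire fields so that nodes on older versions never receive bytes they cannot parse, which is why the constant has to match the first release that actually ships the change. A minimal sketch of that pattern with plain `java.io` streams follows; the integer wire version, the string "state", and all names are hypothetical simplifications, not the Elasticsearch `StreamOutput`/`StreamInput` API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Hypothetical sketch of version-gated serialization.
final class VersionGatedWireSketch {
    // Assumed first wire version that understands the optional state and the flag.
    static final int FIRST_VERSION_WITH_WAIT_FOR = 2;

    static final class Decoded {
        final String state; // null when no state was sent
        final boolean waitForTimedOut;

        Decoded(String state, boolean waitForTimedOut) {
            this.state = state;
            this.waitForTimedOut = waitForTimedOut;
        }
    }

    static byte[] write(int peerVersion, String state, boolean waitForTimedOut) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (peerVersion >= FIRST_VERSION_WITH_WAIT_FOR) {
                // New format: presence marker, optional state, then the flag.
                out.writeBoolean(state != null);
                if (state != null) {
                    out.writeUTF(state);
                }
                out.writeBoolean(waitForTimedOut);
            } else {
                // Old format (assumption of this sketch): the state is mandatory, no flag.
                out.writeUTF(Objects.requireNonNull(state, "older peers always expect a state"));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Decoded read(int peerVersion, byte[] payload) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
            if (peerVersion >= FIRST_VERSION_WITH_WAIT_FOR) {
                String state = in.readBoolean() ? in.readUTF() : null;
                return new Decoded(state, in.readBoolean());
            }
            // Old format carries no flag, so it defaults to false.
            return new Decoded(in.readUTF(), false);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Reader and writer must branch on the same version boundary, otherwise the two sides disagree about the byte layout.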

jasontedor · 2018-11-21T22:12:04Z

server/src/test/java/org/elasticsearch/action/admin/cluster/state/ClusterStateRequestTests.java

@@ -56,6 +67,10 @@ public void testSerialization() throws Exception {
            assertThat(deserializedCSRequest.blocks(), equalTo(clusterStateRequest.blocks()));
            assertThat(deserializedCSRequest.indices(), equalTo(clusterStateRequest.indices()));
            assertOptionsMatch(deserializedCSRequest.indicesOptions(), clusterStateRequest.indicesOptions());
+            if (testVersion.onOrAfter(Version.V_7_0_0)) {


TODO: 6.6.0

jasontedor · 2018-11-21T22:13:55Z

server/src/main/java/org/elasticsearch/action/admin/cluster/state/ClusterStateRequest.java

+    }
+
+    public ClusterStateRequest waitForMetaDataVersion(long expectedMetaDataVersion) {
+        this.waitForMetaDataVersion = expectedMetaDataVersion;


Should we require this be >= 1?

jasontedor · 2018-11-21T22:15:15Z

server/src/main/java/org/elasticsearch/rest/action/admin/cluster/RestClusterStateAction.java

@@ -94,6 +98,7 @@ public RestChannelConsumer prepareRequest(final RestRequest request, final NodeC
            @Override
            public RestResponse buildResponse(ClusterStateResponse response, XContentBuilder builder) throws Exception {
                builder.startObject();
+                builder.field(Fields.TIMED_OUT, response.isWaitForTimedOut());


I wonder if this field should be called wait_for_timed_out? And I wonder if we should only include it if wait_for_metadata_version was set on the request?
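
One way to read the second question is sketched below: the flag is only rendered when the caller asked to wait. The thread does not say which form was finally adopted, so this is an illustration of the suggestion only. The field name `wait_for_timed_out` is the one proposed in the comment, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: renders a response body as an ordered map.
final class ResponseRenderingSketch {

    static Map<String, Object> render(String clusterName, boolean waitForRequested, boolean waitForTimedOut) {
        Map<String, Object> body = new LinkedHashMap<>();
        if (waitForRequested) {
            // Only callers that set wait_for_metadata_version see the flag.
            body.put("wait_for_timed_out", waitForTimedOut);
        }
        body.put("cluster_name", clusterName);
        return body;
    }

    public static void main(String[] args) {
        System.out.println(render("my-cluster", false, false));
        System.out.println(render("my-cluster", true, true));
    }
}
```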

martijnvg · 2018-11-22T06:50:34Z

@jasontedor I've updated the PR.

jasontedor · 2018-11-22T12:57:53Z

server/src/main/java/org/elasticsearch/action/admin/cluster/state/ClusterStateRequest.java

+
+    public ClusterStateRequest waitForMetaDataVersion(long waitForMetaDataVersion) {
+        if (waitForMetaDataVersion < 1) {
+            throw new IllegalArgumentException("waitForMetaDataVersion should be >= 1");


Include the invalid value?

@martijnvg What do you think about including the default value here?
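
A small sketch of the validation with the rejected value included in the message. The request class here is a hypothetical stand-in and the exact message wording is an assumption, not the text that was merged.

```java
// Hypothetical stand-in for the request class; only the setter is modelled.
final class WaitForMetaDataVersionRequestSketch {
    private Long waitForMetaDataVersion; // null means "do not wait"

    WaitForMetaDataVersionRequestSketch waitForMetaDataVersion(long waitForMetaDataVersion) {
        if (waitForMetaDataVersion < 1) {
            // Echo the offending value so the caller can see what was rejected.
            throw new IllegalArgumentException(
                "waitForMetaDataVersion should be >= 1, but was [" + waitForMetaDataVersion + "]");
        }
        this.waitForMetaDataVersion = waitForMetaDataVersion;
        return this;
    }

    Long waitForMetaDataVersion() {
        return waitForMetaDataVersion;
    }
}
```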

jasontedor · 2018-11-23T19:11:12Z

rest-api-spec/src/main/resources/rest-api-spec/api/cluster.state.json

+        },
+        "wait_for_timeout" : {
+          "type": "time",
+          "description": "The maximum time to wait for wait_for_metadata_version before stop waiting and then the last observed cluster state is returned"


This documentation comment is stale now.

jasontedor · 2018-11-23T19:12:49Z

server/src/main/java/org/elasticsearch/action/admin/cluster/state/ClusterStateResponse.java

@@ -75,11 +79,24 @@ public ByteSizeValue getTotalCompressedSize() {
        return totalCompressedSize;
    }

+    /**
+     * Returns whether waiting for a cluster state wait a metadata version equal or higher than the specified


Returns whether -> Returns whether the request timed out waiting... and remove timed out from the end of the sentence.

jasontedor · 2018-11-23T19:16:49Z

server/src/test/java/org/elasticsearch/action/admin/cluster/state/ClusterStateApiTests.java

+            assertThat(future2.isDone(), is(true));
+        });
+        ClusterStateResponse response = future2.actionGet();
+        assertThat(response.isWaitForTimedOut(), is(false));


On a really slow or overloaded machine, this can time out in a second and cause spurious build failures. How about setting the timeout really high here (e.g., one hour)? Then, where you test that the timeout works, set the timeout low again.
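
The idea can be illustrated outside the Elasticsearch test framework with plain `CompletableFuture`s: a generous timeout where the result is expected to arrive, a short one only where the timeout itself is under test. The helper and class names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of choosing timeouts per test path.
final class TimeoutChoiceSketch {

    /** Returns true if the future did not complete within the given timeout. */
    static boolean timedOut(CompletableFuture<?> future, long timeout, TimeUnit unit) {
        try {
            future.get(timeout, unit);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        // Success path: the timeout is only a safety net, so make it very large.
        CompletableFuture<String> completes = CompletableFuture.supplyAsync(() -> "state");
        System.out.println(timedOut(completes, 1, TimeUnit.HOURS));

        // Timeout path: nothing will ever complete this future, so a short timeout is safe.
        CompletableFuture<String> never = new CompletableFuture<>();
        System.out.println(timedOut(never, 100, TimeUnit.MILLISECONDS));
    }
}
```

A large timeout costs nothing when the future completes promptly, while a short one on the success path turns machine slowness into a test failure.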

jasontedor · 2018-11-23T19:17:32Z

I left a few comments, no need for another round.


The `wait_for_metadata_version` parameter instructs the cluster state api to return a cluster state only once the metadata's version is equal to or greater than the version specified in `wait_for_metadata_version`. If the specified `wait_for_timeout` expires first, a timed out response is returned (a response with no cluster state and the wait for timed out flag set to true). If the metadata's version is already equal to or higher than `wait_for_metadata_version`, the api returns immediately. This feature saves external components from constantly polling the cluster state to check whether something has changed in the cluster state's metadata.

martijnvg added >enhancement :Core/Infra/Core Core issues without another label v7.0.0 v6.6.0 labels Nov 14, 2018

martijnvg requested a review from jasontedor November 14, 2018 11:07

martijnvg added 10 commits November 14, 2018 12:13

iter

e1dc9c6

Merge remote-tracking branch 'es/master' into cluster_state_api_metad…

7c69668

…ata_version

Merge remote-tracking branch 'es/master' into cluster_state_api_metad…

96d1aa2

…ata_version

Merge remote-tracking branch 'es/master' into cluster_state_api_metad…

d92a1b3

…ata_version

Merge remote-tracking branch 'es/master' into cluster_state_api_metad…

4c70c07

…ata_version

Merge remote-tracking branch 'es/master' into cluster_state_api_metad…

1690eef

…ata_version

fixed binary protocol versions

7327c1f

Merge remote-tracking branch 'es/master' into cluster_state_api_metad…

ce7d03d

…ata_version

fixed test

a55b501

fixed versioning in response class...

1607c16

jasontedor reviewed Nov 21, 2018


martijnvg added 2 commits November 21, 2018 13:56

Merge remote-tracking branch 'es/master' into cluster_state_api_metad…

a9adee2

…ata_version

iter

5051dd1

martijnvg added 2 commits November 21, 2018 15:38

fixed checkstyle violation

6feabd6

Is waitForMetadataVersion timed out then don't include a cluster stat…

b38f43c

…e in the response and set timed out field to true.

jasontedor reviewed Nov 21, 2018


iter

e061319

jasontedor approved these changes Nov 23, 2018


Merge remote-tracking branch 'es/master' into cluster_state_api_metad…

9f448eb

…ata_version

martijnvg added 2 commits November 23, 2018 21:08

iter

1609833

Merge remote-tracking branch 'es/master' into cluster_state_api_metad…

3c03777

…ata_version

martijnvg merged commit 7624734 into elastic:master Nov 26, 2018

martijnvg added the backport pending label Nov 26, 2018

martijnvg added a commit that referenced this pull request Nov 27, 2018

Changed versions in serialization code after backporting #35535

447e5d2

danielmitterdorfer removed the backport pending label Jan 31, 2019

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

joegallo mentioned this pull request Oct 30, 2023

Drop very old version check from this ClusterStateRequestTests #101570

Merged
