
[CCR] Follower info API doesn't return paused status #37738

Closed · jen-huang opened this issue Jan 23, 2019 · 4 comments · Fixed by #37752

Labels: >bug, :Distributed/CCR (Issues around the Cross Cluster State Replication features)

@jen-huang

I have two active follower indices: copy-of-kibana_sample_data_logs, copy-of-kibana_sample_data_flights.

Then I send a request to pause the second one:

POST /copy-of-kibana_sample_data_flights/_ccr/pause_follow
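
(For reference, replication for this index could later be restarted with the counterpart endpoint, e.g. POST /copy-of-kibana_sample_data_flights/_ccr/resume_follow.)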

The follower stats API does not show the paused index (correct behavior):

GET /_ccr/stats

{
  "auto_follow_stats" : {
    "number_of_failed_follow_indices" : 0,
    "number_of_failed_remote_cluster_state_requests" : 0,
    "number_of_successful_follow_indices" : 2,
    "recent_auto_follow_errors" : [ ],
    "auto_followed_clusters" : [
      {
        "cluster_name" : "prod2",
        "time_since_last_check_millis" : 35108,
        "last_seen_metadata_version" : 32
      }
    ]
  },
  "follow_stats" : {
    "indices" : [
      {
        "index" : "copy-of-kibana_sample_data_logs",
        "shards" : [
          {
            "remote_cluster" : "prod2",
            "leader_index" : "kibana_sample_data_logs",
            "follower_index" : "copy-of-kibana_sample_data_logs",
            "shard_id" : 0,
            "leader_global_checkpoint" : 14004,
            "leader_max_seq_no" : 14004,
            "follower_global_checkpoint" : 14004,
            "follower_max_seq_no" : 14004,
            "last_requested_seq_no" : 14004,
            "outstanding_read_requests" : 1,
            "outstanding_write_requests" : 0,
            "write_buffer_operation_count" : 0,
            "write_buffer_size_in_bytes" : 0,
            "follower_mapping_version" : 2,
            "follower_settings_version" : 1,
            "total_read_time_millis" : 6209,
            "total_read_remote_exec_time_millis" : 6086,
            "successful_read_requests" : 29,
            "failed_read_requests" : 0,
            "operations_read" : 14005,
            "bytes_read" : 13606642,
            "total_write_time_millis" : 3231,
            "successful_write_requests" : 29,
            "failed_write_requests" : 0,
            "operations_written" : 14005,
            "read_exceptions" : [ ],
            "time_since_last_read_millis" : 39676
          }
        ]
      }
    ]
  }
}
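
As an aside, the same stats can also be requested for a single follower index via the per-index form of the follower stats API (shown here only as another way to check):

GET /copy-of-kibana_sample_data_logs/_ccr/stats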

However, the follower info API still returns "status" : "active" for the paused index:

GET /_all/_ccr/info

{
  "follower_indices" : [
    {
      "follower_index" : "copy-of-kibana_sample_data_logs",
      "remote_cluster" : "prod2",
      "leader_index" : "kibana_sample_data_logs",
      "status" : "active",
      "parameters" : {
        "max_read_request_operation_count" : 5120,
        "max_read_request_size" : "32mb",
        "max_outstanding_read_requests" : 12,
        "max_write_request_operation_count" : 5120,
        "max_write_request_size" : "9223372036854775807b",
        "max_outstanding_write_requests" : 9,
        "max_write_buffer_count" : 2147483647,
        "max_write_buffer_size" : "512mb",
        "max_retry_delay" : "500ms",
        "read_poll_timeout" : "1m"
      }
    },
    {
      "follower_index" : "copy-of-kibana_sample_data_flights",
      "remote_cluster" : "prod2",
      "leader_index" : "kibana_sample_data_flights",
      "status" : "active",
      "parameters" : {
        "max_read_request_operation_count" : 5120,
        "max_read_request_size" : "32mb",
        "max_outstanding_read_requests" : 12,
        "max_write_request_operation_count" : 5120,
        "max_write_request_size" : "9223372036854775807b",
        "max_outstanding_write_requests" : 9,
        "max_write_buffer_count" : 2147483647,
        "max_write_buffer_size" : "512mb",
        "max_retry_delay" : "500ms",
        "read_poll_timeout" : "1m"
      }
    }
  ]
}
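
For the paused index, the expected entry would instead report a paused status and omit the replication parameters, matching the paused responses later in this thread; roughly:

{
  "follower_index" : "copy-of-kibana_sample_data_flights",
  "remote_cluster" : "prod2",
  "leader_index" : "kibana_sample_data_flights",
  "status" : "paused"
}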

cc @martijnvg

@jen-huang jen-huang added the :Distributed/CCR Issues around the Cross Cluster State Replication features label Jan 23, 2019
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed

@jen-huang (Author)

This does not seem to be an issue if both indices are paused:

GET /_all/_ccr/info

{
  "follower_indices" : [
    {
      "follower_index" : "copy-of-kibana_sample_data_logs",
      "remote_cluster" : "prod2",
      "leader_index" : "kibana_sample_data_logs",
      "status" : "paused"
    },
    {
      "follower_index" : "copy-of-kibana_sample_data_flights",
      "remote_cluster" : "prod2",
      "leader_index" : "kibana_sample_data_flights",
      "status" : "paused"
    }
  ]
}

@jen-huang jen-huang added the >bug label Jan 23, 2019
@jen-huang (Author)

Hmm, if I request just the paused index (copy-of-kibana_sample_data_flights), both indices are returned, each with a paused status. This behavior seems related, but maybe not?

GET /copy-of-kibana_sample_data_flights/_ccr/info

{
  "follower_indices" : [
    {
      "follower_index" : "copy-of-kibana_sample_data_logs",
      "remote_cluster" : "prod2",
      "leader_index" : "kibana_sample_data_logs",
      "status" : "paused"
    },
    {
      "follower_index" : "copy-of-kibana_sample_data_flights",
      "remote_cluster" : "prod2",
      "leader_index" : "kibana_sample_data_flights",
      "status" : "paused"
    }
  ]
}

@martijnvg (Member)

@jen-huang That is related to the bug in the issue description. Index filtering is just completely broken. I will have a PR open soon.
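
With the filtering fixed, requesting a single follower index should return only that entry; based on the responses above, the expected shape would be roughly:

GET /copy-of-kibana_sample_data_flights/_ccr/info

{
  "follower_indices" : [
    {
      "follower_index" : "copy-of-kibana_sample_data_flights",
      "remote_cluster" : "prod2",
      "leader_index" : "kibana_sample_data_flights",
      "status" : "paused"
    }
  ]
}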

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Jan 23, 2019

The filtering by follower index was completely broken.
Also the wrong persistent tasks were selected, causing the
wrong status to be reported.

Closes elastic#37738

martijnvg added a commit that referenced this issue Jan 24, 2019

The filtering by follower index was completely broken.
Also the wrong persistent tasks were selected, causing the
wrong status to be reported.

Closes #37738

martijnvg added a commit that referenced this issue Jan 24, 2019

The filtering by follower index was completely broken.
Also the wrong persistent tasks were selected, causing the
wrong status to be reported.

Closes #37738
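
For background, the shard-follow persistent tasks referenced in the commit message live in the cluster state metadata, and the info API derives the active/paused status from their presence. They can be inspected directly, e.g. (filter_path is just one way to narrow the output):

GET /_cluster/state/metadata?filter_path=metadata.persistent_tasks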