skip_unavailable changes from true to false when remote connection fails #107125

Open
asmith-elastic opened this issue Apr 4, 2024 · 1 comment
Labels
>bug, :Distributed/Network, Team:Distributed

Comments

@asmith-elastic
Contributor

Elasticsearch Version

8.12.2

Installed Plugins

No response

Java Version

bundled

OS Version

linux

Problem Description

The skip_unavailable setting for a remote cluster connection is automatically reset to false when an incorrect remote cluster address is configured. After the connection details are corrected, skip_unavailable does not revert to true, even though it was previously set to that value. It has to be reconfigured explicitly to set it back to true.

Steps to Reproduce

  1. Configure a remote cluster with skip_unavailable set to true:
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.ccs.mode": "proxy",
    "cluster.remote.ccs.proxy_address": "ccs.es.us-central1.gcp.cloud.es.io:9400",
    "cluster.remote.ccs.proxy_socket_connections": "18",
    "cluster.remote.ccs.server_name": "ccs.es.us-central1.gcp.cloud.es.ioo",
    "cluster.remote.ccs.skip_unavailable": "true"
  }
}
  2. Verify the configuration and note that skip_unavailable is true.

  3. Introduce an error by setting an incorrect remote cluster address:

PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.ccs.proxy_address": "ccs-broken.es.us-central1.gcp.cloud.es.io:9400"
  }
}
  4. Observe that the remote connection fails and skip_unavailable is automatically set to false:
{
  "ccs": {
    "connected": false,
    "mode": "proxy",
    "proxy_address": "ccs-broken.es.us-central1.gcp.cloud.es.io:9400",
    "server_name": "ccs-broken.es.us-central1.gcp.cloud.es.ioo",
    "num_proxy_sockets_connected": 0,
    "max_proxy_socket_connections": 18,
    "initial_connect_timeout": "30s",
    "skip_unavailable": false
  }
}
  5. Correct the server address back to its initial value.

  6. Notice that skip_unavailable remains false and does not revert to true:

{
  "ccs": {
    "connected": true,
    "mode": "proxy",
    "proxy_address": "ccs.es.us-central1.gcp.cloud.es.io:9400",
    "server_name": "ccs.es.us-central1.gcp.cloud.es.ioo",
    "num_proxy_sockets_connected": 18,
    "max_proxy_socket_connections": 18,
    "initial_connect_timeout": "30s",
    "skip_unavailable": false
  }
}
  7. Manually attempt to set skip_unavailable to true again:
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.ccs.skip_unavailable": "true"
  }
}
  8. Observe that skip_unavailable does not change to true and remains false.

  9. Set skip_unavailable to false even though it is already false:

PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.ccs.skip_unavailable": "false"
  }
}
  10. Manually attempt to set skip_unavailable to true once more:
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.ccs.skip_unavailable": "true"
  }
}
  11. The setting now updates successfully. Verify that the remote connection works and that skip_unavailable is set back to true.
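
For anyone trying to spot this state on their own cluster, a small check along the following lines may help. It is only a sketch under assumptions: it compares the body of GET _cluster/settings?flat_settings=true with the body of GET _remote/info for one remote alias. This report only shows output in the _remote/info shape, so the idea that the persisted value still reads "true" while the connection reports false is an assumption, not something shown above. The function names are hypothetical helpers, not part of any Elasticsearch client.

```python
from typing import Optional


def persisted_skip_unavailable(cluster_settings: dict, alias: str) -> Optional[bool]:
    """Return the stored skip_unavailable value for a remote alias, or None if unset.

    Expects the JSON body of GET _cluster/settings?flat_settings=true.
    """
    key = f"cluster.remote.{alias}.skip_unavailable"
    # Transient settings take precedence over persistent ones.
    for scope in ("transient", "persistent"):
        value = cluster_settings.get(scope, {}).get(key)
        if value is not None:
            # Settings may be stored as strings ("true") or as JSON booleans.
            return str(value).lower() == "true"
    return None


def skip_unavailable_mismatch(cluster_settings: dict, remote_info: dict, alias: str) -> bool:
    """True when the stored setting and the value reported by GET _remote/info differ."""
    persisted = persisted_skip_unavailable(cluster_settings, alias)
    effective = remote_info.get(alias, {}).get("skip_unavailable")
    if persisted is None or effective is None:
        return False  # nothing to compare
    return persisted != bool(effective)


# Sample data: the settings body is assumed, the remote info body mirrors step 6.
sample_settings = {
    "persistent": {"cluster.remote.ccs.skip_unavailable": "true"},
    "transient": {},
}
sample_remote_info = {"ccs": {"connected": True, "mode": "proxy", "skip_unavailable": False}}

print(skip_unavailable_mismatch(sample_settings, sample_remote_info, "ccs"))  # prints: True
```

If the check returns True, the sequence in steps 9 and 10 (set the value to false, then to true) is the workaround reported here.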

Logs (if relevant)

No response

@asmith-elastic added the >bug and needs:triage labels Apr 4, 2024
@demjened added the :Distributed/Cluster Coordination label Apr 5, 2024
@elasticsearchmachine added the Team:Distributed label and removed the needs:triage label Apr 5, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner added the :Distributed/Network label and removed the :Distributed/Cluster Coordination label Apr 8, 2024