
allow discovery by label as well as service #1

Closed
wants to merge 1 commit

Conversation

wozniakjan
Owner

@wozniakjan wozniakjan commented Jun 23, 2017

We have stumbled upon an issue with an Elasticsearch cluster of size larger than one, its master discovery, and its readiness probe. I think the sequence of events turns into a deadlock because of the way the elasticsearch-cloud-kubernetes plugin uses services for addressing the pods.

  1. ES pods are started and wait for the readiness probe to succeed before being included in the ES service
  2. ES containers attempt master discovery, which requires network communication through the ES service
  3. the ES service has no pods available to forward traffic to, so master discovery cannot happen and the readiness probe never pronounces the pods healthy
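
The circular dependency above can be illustrated with a minimal, hypothetical manifest: the Service only routes to pods that pass the readiness probe, while the probe effectively requires cluster formation through that same Service. All names and ports below are illustrative, not taken from any shipped deployment.

```yaml
# Hypothetical manifest sketching the deadlock; names are examples only.
apiVersion: v1
kind: Service
metadata:
  name: es-discovery
spec:
  selector:
    component: es        # only READY pods become endpoints of this service
  ports:
  - port: 9300           # ES transport port used for master discovery
---
apiVersion: v1
kind: Pod
metadata:
  name: es-0
  labels:
    component: es
spec:
  containers:
  - name: es
    image: elasticsearch
    readinessProbe:      # succeeds only once the node has joined a cluster,
      httpGet:           # which requires discovery through es-discovery above
        path: /_cluster/health
        port: 9200
```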

This proposed change allows defining pod_label and pod_port for ES master discovery, bypassing the service completely and negotiating the protocol based on the pods being correctly labeled.

So the configuration file would still support service-based discovery, but it could also look like:

cloud:
  kubernetes:
    pod_label: component=es
    pod_port: 9300
    namespace: ${NAMESPACE}
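
For label-based discovery to work, each ES pod would simply need to carry the matching label and expose the transport port. A fragment of a hypothetical pod template (the label and port values here are examples that happen to match the configuration above):

```yaml
# Hypothetical pod template fragment; values chosen to match pod_label/pod_port.
metadata:
  labels:
    component: es          # matches cloud.kubernetes.pod_label
spec:
  containers:
  - name: es
    ports:
    - containerPort: 9300  # matches cloud.kubernetes.pod_port (ES transport)
```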

While currently only this is accepted:

cloud:
  kubernetes:
    service: ${SERVICE_DNS}
    namespace: ${NAMESPACE}

@wozniakjan
Owner Author

The proposed implementation should currently serve only as a proof of concept; it requires quite a lot of polishing.

@portante

It seems we are missing the spirit of what a readiness probe means for Elasticsearch. An OpenShift readiness probe refers to the readiness of a particular pod to operate. It does not refer to the readiness of the global distributed service that pod is part of when running.

The readiness check of an Elasticsearch pod should make sure the low-level services needed by the ES pod are in place: disk space is available (/elasticsearch/persistent/logging-es/...), the configuration files pass a sanity check (/usr/share/elasticsearch/elasticsearch/config/elasticsearch.yml), and so on.
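
A pod-local probe along those lines might look like the sketch below. The paths in the comments mirror the ones mentioned above, but the `check_ready` helper, its arguments, and the free-space threshold are assumptions for illustration, not part of any shipped image.

```shell
#!/bin/sh
# Sketch of a pod-local readiness check: verify the config file is readable
# and the data directory has some free space. Helper name, arguments, and
# threshold are hypothetical examples.
check_ready() {
    conf="$1"      # e.g. /usr/share/elasticsearch/elasticsearch/config/elasticsearch.yml
    data_dir="$2"  # e.g. /elasticsearch/persistent/logging-es
    min_free_kb="$3"

    # sanity check: configuration file must exist and be readable
    [ -r "$conf" ] || { echo "config file missing: $conf" >&2; return 1; }

    # disk space check: 4th column of df -Pk is available KB
    free_kb=$(df -Pk "$data_dir" | awk 'NR==2 {print $4}')
    [ "$free_kb" -ge "$min_free_kb" ] || { echo "low disk: ${free_kb}KB free" >&2; return 1; }

    return 0
}

# demonstrate against paths that exist on any system
conf_file=$(mktemp)
check_ready "$conf_file" /tmp 1 && echo "ready"
```

Because every check is local to the pod, this probe can succeed before cluster formation, breaking the service/readiness cycle described in the original issue.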

@portante

First, can you post this as a PR against the upstream repository?

Pod "readiness" probes cannot be tied to cluster readiness in any way, since pod readiness is not the same notion as "service" readiness.

Since Elasticsearch immediately starts reaching out to find the other members of the Elasticsearch cluster as defined in the elasticsearch.yml file, and we don't have a way to "pause" Elasticsearch startup until the "readiness" state has been acknowledged, it does not seem viable to tie "readiness" to some running state of the ES process.

Also, we don't want to emit warnings or errors in the logs of Elasticsearch because the networking is not enabled for some duration.

This commit appears to use the OpenShift APIs, requiring communication with the master API service for a pod to inspect itself. That seems like a readiness dependency we would not want to have.

Instead, since this plugin is responsible for discovering the other members of the cluster, is there a way for it to bring the ES pod to a state where it is functional and responsive to local ES API queries, respond to a request saying that it is "ready" to operate, and only then begin to discover the other members of the ES cluster?

@wozniakjan
Owner Author

Let's not scatter the conversation and keep it on issue fabric8io#90. As stated above, this should only serve testing purposes and act as soft proof that discovery by label is possible (not that it is a good idea).

@wozniakjan wozniakjan closed this Aug 3, 2017