Statefulset pods disappearing after initial correct start #41012
@kubernetes/sig-apps-bugs
This is correct. If pod-0 restarts, the StatefulSet controller will bring it back up, but it will not affect pod-1 and pod-2, which are already running. The more likely event here is that one of your node deletions took down pod-1 and pod-2 after pod-0 went unhealthy. In that case, we do not attempt to recreate pod-1 or pod-2 until pod-0 becomes healthy again. The rationale is that users rely on the deterministic initialization order and write logic around that guarantee; bringing up the pods in arbitrary order would violate it. I would recommend studying the particular application you're running to ensure that it does not enter an unhealthy state and can indeed tolerate failures and come back up successfully.
And we have had feedback that users would like to opt in to a non-deterministic order; that may very well be reasonable to experiment with in 1.6 if we can reach agreement on the form.
I imagine that such non-determinism would apply at initialization time as well? Can you expand some more on the potential use cases of what you mention? If we want to do this, we should open up an issue and start collecting concrete use cases which point to this need.
Discussed in #39363
I'm running a modified version of Elasticsearch - https://github.com/pires/docker-elasticsearch-kubernetes - that uses the headless service's DNS records for discovery and doesn't depend on the start order. It could of course use a Deployment instead of a StatefulSet, but I need the persistent disks to be provisioned, hence the StatefulSet. The dependency on the first pod lowers total availability a lot, especially since we're running on preemptibles that do not last longer than 24 hours. For a lot of other applications I would expect the order to be important the very first time the set starts up, but after that it would be better to join the cluster via a service. Is there an annotation that can modify this behaviour for scenarios like these where it isn't needed?
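For readers arriving later: Kubernetes releases after the 1.5 cluster in this thread added an opt-in for exactly this, podManagementPolicy: Parallel on the StatefulSet spec, which launches and replaces pods without waiting on ordinal order while still creating a PersistentVolumeClaim per pod. A minimal sketch only; the name, image, labels, and storage size below are placeholders, not the configuration from this issue:

```yaml
# Sketch: podManagementPolicy is not available on the 1.5 cluster discussed here.
# Image, labels, and sizes are placeholders.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
spec:
  serviceName: elasticsearch-discovery      # headless Service used for DNS-based discovery
  replicas: 3
  podManagementPolicy: Parallel             # opt out of ordered startup/teardown
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: pires/elasticsearch-kubernetes   # placeholder image reference
        ports:
        - containerPort: 9300
          name: transport
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:                      # still provisions one disk per pod
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
```

With Parallel, a preempted pod-1 or pod-2 is recreated immediately even while pod-0 is crash-looping; the trade-off is that nothing can rely on strict ordinal startup order anymore.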
When I opened this ticket I didn't use pod anti-affinity in the StatefulSet. Running on top of preemptible VMs, this led to the failure described above. Since adding anti-affinity we haven't run into this issue. Closing the ticket.
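For anyone hitting the same thing: the fix here is a pod anti-affinity rule in the StatefulSet's pod template so that no two replicas land on the same node, which keeps a single preemption from taking out more than one pod. A rough sketch of the field form (label values are illustrative; on a 1.5 cluster like the one in this thread anti-affinity was still expressed through an alpha annotation rather than this field):

```yaml
# Sketch: goes under the StatefulSet's spec.template.spec; labels are illustrative.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: elasticsearch              # must match the pod template's labels
      topologyKey: kubernetes.io/hostname # at most one matching pod per node
```

If the cluster can have fewer nodes than replicas, preferredDuringSchedulingIgnoredDuringExecution is the softer variant that spreads pods when possible instead of refusing to schedule.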
BUG REPORT
I'm running Elasticsearch in a StatefulSet with 3 replicas, and within roughly 36 hours all pods except the first one (-0) disappeared. The first one is in a CrashLoopBackOff state.
I would expect that once a StatefulSet has started correctly, the individual pods are no longer dependent on the first one running correctly.
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:57:25Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:52:01Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Environment:
What happened:
All StatefulSet replicas disappeared, except for the first one, which was in a CrashLoopBackOff state.
What you expected to happen:
The second and third pods to stay operational even if the first one fails.
How to reproduce it (as minimally and precisely as possible):
Haven't been able to do so, but it has happened a couple of times in a week.
Anything else we need to know:
The StatefulSet runs in a GKE cluster on top of preemptible nodes. To keep the preemptibles from all expiring at once, I'm stopping and deleting them randomly before their 24 hours are up. This spreads the deletions out and makes it less likely that more than one host is deleted at a time.
There's also