This repository has been archived by the owner on Apr 28, 2022. It is now read-only.

Mark cluster status failed when no pods are running or quorum is lost #29

Merged
Wouter0100 merged 1 commit into master from quorum-lost-failed on Nov 23, 2019

Marlinc commented Nov 12, 2019

Clusters that have had all their pods killed, or that have lost
quorum, stay in the Running phase even though they are completely dead.

This change marks the corresponding errors as fatal, which
lets the operator mark those clusters as failed.

Fixes coreos#2067, coreos#1973

Example status:

Status:
  Client Port:  2379
  Conditions:
    Last Transition Time:  2019-11-11T20:03:43Z
    Last Update Time:      2019-11-11T20:03:43Z
    Reason:                Cluster available
    Status:                True
    Type:                  Available
  Current Version:         3.2.25
  Members:
    Ready:
      etcd-cluster-64687hxm2r
      etcd-cluster-klb4cjqbbb
      etcd-cluster-qcfpqm495v
  Phase:           Failed
  Reason:          all etcd pods are dead.
  Service Name:    etcd-cluster-client
  Size:            3
  Target Version:  

Please read https://github.com/coreos/etcd-operator/blob/master/CONTRIBUTING.md#contribution-flow

Marlinc self-assigned this Nov 12, 2019
Wouter0100 merged commit 542657e into master Nov 23, 2019
Wouter0100 deleted the quorum-lost-failed branch November 23, 2019 22:13
Development

Successfully merging this pull request may close these issues.

etcd-operator should add status to etcdclusters/<clustername> when lost quorum