[Feature] Liveness Probe on multi-node etcd #280

Open

ishan16696 opened this issue Dec 22, 2021 · 20 comments · Fixed by #357
Assignees
Labels
kind/enhancement Enhancement, improvement, extension lifecycle/stale Nobody worked on this for 6 months (will further age) priority/2 Priority (lower number equals higher priority) release/ga Planned for GA(General Availability) release of the Feature status/on-hold Issue on hold (e.g. because work was suspended)

Comments

@ishan16696
Member

ishan16696 commented Dec 22, 2021

Feature (What you would like to be added):
Configure the correct livenessProbe in multi-node etcd.

Motivation (Why is this needed?):
Right now I found 2 ways to configure the livenessProbe:

  1. /health endpoint (a probe sketch follows below)
    The /health endpoint returns false if one of the following conditions is met (source):
    • there is no etcd leader or a leader election is currently going on.
    • the latency of a QGET request exceeds 1s

Disadvantages of Method 1 (/health endpoint):

  • If there is no quorum, the etcd follower will get restarted by the kubelet, but restarting etcd isn't the right behavior here.
  • If the etcd cluster is overloaded, which is already bad, a restart will generate even more load.
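For reference, a minimal sketch of Method 1 as a Kubernetes probe (the port, scheme, and timings are assumptions, not values from our charts; note that an httpGet probe cannot present a client certificate, so with client-cert auth this would have to be wrapped in an exec probe with curl):

```yaml
# Method 1 (sketch): probe etcd's /health endpoint directly.
# Port, scheme, and timings are illustrative.
livenessProbe:
  httpGet:
    path: /health
    port: 2379
    scheme: HTTPS
  initialDelaySeconds: 15
  periodSeconds: 5
  failureThreshold: 5
```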
  2. ETCDCTL_API=3 etcdctl endpoint health --endpoints=${ENDPOINTS} --command-timeout=Xs (a probe sketch follows below)
    The etcdctl endpoint health command performs a GET on the "health" key (source):
    • fails if there is no etcd leader or a leader election is going on, as I think the GET request will fail if there is no etcd leader.

Disadvantage of Method 2 (etcdctl endpoint health):

  • If there is no quorum, the etcd follower will get restarted by the kubelet, but restarting etcd isn't the right behavior here.
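And a minimal sketch of Method 2 as an exec probe (the endpoint, certificate paths, and timeout are assumptions):

```yaml
# Method 2 (sketch): run `etcdctl endpoint health` inside the etcd container.
# Endpoint, certificate paths, and timeout are illustrative.
livenessProbe:
  exec:
    command:
      - /bin/sh
      - -ec
      - >-
        ETCDCTL_API=3 etcdctl endpoint health
        --endpoints=https://etcd-main-local:2379
        --cacert=/var/etcd/ssl/ca/ca.crt
        --cert=/var/etcd/ssl/client/tls.crt
        --key=/var/etcd/ssl/client/tls.key
        --command-timeout=3s
  periodSeconds: 5
  failureThreshold: 5
```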

This liveness issue has already been reported on etcd here, and it has been fixed by this PR, but the fix will only ship with etcd v3.6, and we don't know the timeline for its release.

Options we have:

  1. Go with Method 2 (etcdctl endpoint health) and, whenever etcd releases v3.6, switch to that version and configure the correct liveness probe.
  2. Alternatively, fork etcd v3.4.13, make the necessary changes, and use a custom image.
@ishan16696 ishan16696 added the kind/enhancement Enhancement, improvement, extension label Dec 22, 2021
@ishan16696
Member Author

cc @abdasgupta @aaronfern @timuthy

@timuthy
Member

timuthy commented May 30, 2022

Reading this I wonder about the option of keeping the current liveness probe but setting --consistency="s"? This mimics the implementation of the linked PR etcd-io/etcd#13399. WDYT @ishan16696?

@ishan16696
Member Author

Reading this I wonder about the option keeping the current liveness probe but setting --consistency="s"

yes, this makes sense to me ... I don't see any major issue with this.
We can configure the livenessProbe like this, making the consistency serializable instead of linearizable:

      - /bin/sh
      - -ec
      # the env assignment and the etcdctl flags must form a single shell
      # string, otherwise `sh -ec` treats only the first list item as the script
      - >-
        ETCDCTL_API=3 etcdctl
        --cert= ...
        --key= ...
        --cacert= ...
        --endpoints=https://etcd-main-local:2379
        get foo
        --consistency="s"

@ishan16696
Member Author

Hi @timuthy,
Unfortunately this way of using --consistency="s" does not work in the following scenario:

  1. Start an etcd cluster with 3 members.
  2. Add some dummy data.
  3. Stop all the members.
  4. Start only one member.
  5. Then try etcdctl get foo --endpoints=http://127.0.0.1:2379 --consistency="s".
    Unfortunately --consistency="s" won't help us in this case until quorum is satisfied.

@timuthy
Member

timuthy commented Jun 1, 2022

As discussed, the assumption wasn't that all members are down, but rather that if e.g. 1/3 members are available, this last running pod is not killed by the liveness probe as well, because that wouldn't make sense.

@ishan16696
Member Author

this last running pod is not killed by the liveness probe as well, because that wouldn't make sense.

I'm not saying the last pod is killed due to the liveness probe ... I'm saying the last pod can be killed/crash for some other reason; why are you assuming that the last pod can't be killed?

@ishan16696
Member Author

ishan16696 commented Jun 1, 2022

The point I'm making is: suppose all pods/members of the etcd cluster crash/go down for some reason. They will not be able to come up due to this livenessProbe, because etcdctl get foo --consistency="s" will fail even for 1/3 members until quorum is achieved.

@timuthy
Member

timuthy commented Jun 2, 2022

they will not be able to come up due to this livenessProbe

I assume that they will come up fast enough that the livenessProbes don't fail beyond the expected threshold (initialDelaySeconds + periodSeconds * failureThreshold), and thus the containers aren't restarted before they're up and running.
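For illustration, hypothetical values plugged into that formula (a sketch, not our actual configuration):

```yaml
# With these values the container is only restarted if the probe still
# fails 15 + 5 * 6 = 45 seconds after container start.
livenessProbe:
  exec:
    command: ["/bin/sh", "-ec", "ETCDCTL_API=3 etcdctl get foo --consistency=\"s\""]
  initialDelaySeconds: 15
  periodSeconds: 5
  failureThreshold: 6
```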

@ishan16696
Member Author

(initialDelaySeconds + periodSeconds * failureThreshold)

yeah, exactly, but then we have to depend on this configuration and set the values accordingly.

@unmarshall
Contributor

unmarshall commented Jun 8, 2022

IMO using --consistency=s is non-deterministic, as the time it takes to start the etcd process cannot be guaranteed. The correct thing to do would be to create a /livez or /healthz endpoint which is served out of the backup-restore container. If this endpoint has come up and responds with a 200 OK, then the etcd pod should be considered LIVE, which prevents it from getting restarted. For the readiness probe you can probably use etcdctl with --consistency=s, which ensures that traffic can be redirected to this etcd pod. For redirecting traffic to a single etcd instance it is not important to check whether the cluster is healthy (i.e. whether quorum is lost); if quorum is lost, the request will be rejected by the etcd member anyway if it has to be. A sketch of this split follows below.
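For illustration, a sketch of that split (the port, path, and endpoint are assumptions, not the actual etcd-druid configuration):

```yaml
# liveness: answered by the backup-restore sidecar, independent of quorum
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
    scheme: HTTPS
# readiness: serializable read served by the local member only
readinessProbe:
  exec:
    command:
      - /bin/sh
      - -ec
      - ETCDCTL_API=3 etcdctl get foo --consistency="s" --endpoints=https://localhost:2379
```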

@timuthy
Member

timuthy commented Jun 8, 2022

IMO using --consistency=s is non-deterministic, as the time it takes to start the etcd process cannot be guaranteed. The correct thing to do would be to create a /livez or /healthz endpoint which is served out of the backup-restore container.

Can you sketch what an implementation of /livez would look like to make it deterministic?

@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Jun 14, 2022
@ishan16696
Member Author

Reopening this issue, as the livenessProbe can be improved to resolve this kind of scenario: #280 (comment)

@ishan16696 ishan16696 reopened this Jun 20, 2022
@gardener-robot gardener-robot removed the status/closed Issue is closed (either delivered or triaged) label Jun 20, 2022
@timuthy
Member

timuthy commented Jun 20, 2022

I'd be interested in knowing what is planned here. Do you already have something in mind?

@ishan16696
Member Author

I reopened it for future reference: whenever we upgrade our etcd version (maybe to 3.6), we can also update the livenessProbe for etcd.

@timuthy
Member

timuthy commented Jun 20, 2022

TBH, I don't get the motivation for having an open issue if there is no issue today (as claimed by #280 (comment)). If you want to keep this as a future reference, you can still take the information from this issue even if it's closed.
At the moment, this issue represents an open item in #107 and gives the impression that work needs to be done on our end.

Either we see a problem with the readiness/liveness probes as they are configured today and discuss/implement a solution, or we open a ☂️ issue for upgrading to etcd 3.6 and list all the requirements/changes once we plan to do so.

@ishan16696
Member Author

ok, then feel free to close.

@timuthy timuthy closed this as completed Jun 22, 2022
@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Jun 22, 2022
@ishan16696
Member Author

Found the upstream issue, which is also described in this comment: #280 (comment)
Mentioning this for future reference for whenever we want to move to etcd v3.6.

@ashwani2k ashwani2k added the release/beta Planned for Beta release of the Feature label Jul 6, 2022
@ishan16696 ishan16696 reopened this Sep 22, 2022
@gardener-robot gardener-robot removed the status/closed Issue is closed (either delivered or triaged) label Sep 22, 2022
@ishan16696
Member Author

We had introduced the livenessProbe in multi-node etcd with this PR: #396
But we found that the livenessProbe was causing more harm than good, so we decided to remove it from the etcd cluster with this PR: #423
So, I'm keeping this issue open just to track the liveness probe for the etcd cluster; whenever we upgrade our etcd version, we can reintroduce the livenessProbe. Please refer to this upstream PR: etcd-io/etcd#13525
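For future reference, a sketch of what the probe could look like once the upstream serializable health check is available (the ?serializable=true query parameter and the port are assumptions based on the linked upstream work, not a tested configuration):

```yaml
# Possible future livenessProbe: a serializable health check is answered by
# the local member alone, so a lone surviving member is not restarted for
# lack of quorum.
livenessProbe:
  httpGet:
    path: /health?serializable=true
    port: 2379
    scheme: HTTPS
```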

/on-hold

@gardener-robot gardener-robot added the status/on-hold Issue on hold (e.g. because work was suspended) label Sep 22, 2022
@ashwani2k ashwani2k added release/ga Planned for GA(General Availability) release of the Feature and removed release/beta Planned for Beta release of the Feature labels Oct 11, 2022
@abdasgupta abdasgupta added the priority/4 Priority (lower number equals higher priority) label Jan 6, 2023
@ishan16696
Member Author

This issue needs to be picked up after the etcd version upgrade (#445).

@shreyas-s-rao shreyas-s-rao added priority/1 Priority (lower number equals higher priority) and removed priority/4 Priority (lower number equals higher priority) labels May 3, 2023
@shreyas-s-rao
Contributor

/assign @ishan16696

@ashwani2k ashwani2k added priority/2 Priority (lower number equals higher priority) and removed priority/1 Priority (lower number equals higher priority) labels May 26, 2023
@gardener-robot gardener-robot added the lifecycle/stale Nobody worked on this for 6 months (will further age) label Feb 2, 2024