Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Readiness Probe in multi-node etcd #7

Open
Tracked by #107
ishan16696 opened this issue Jan 31, 2022 · 9 comments
Open
Tracked by #107

[Feature] Readiness Probe in multi-node etcd #7

ishan16696 opened this issue Jan 31, 2022 · 9 comments
Labels
kind/enhancement Enhancement, improvement, extension lifecycle/rotten Nobody worked on this for 12 months (final aging stage) priority/4 Priority (lower number equals higher priority) status/on-hold Issue on hold (e.g. because work was suspended)

Comments

@ishan16696
Copy link
Member

ishan16696 commented Jan 31, 2022

Feature (What you would like to be added):
Currently, readinessProbe of etcd is set to an endpoint /healthz of HTTP server running in a backup sidecar.
This behaviour needed to be updated or improved as readinessProbe of clustered-etcd should depend on whether there is etcd-leader present or not then only it should serve the incoming write requests.

Motivation (Why is this needed?):

Approach/Hint to the implement solution (optional):
Approaches :

  1. ETCDCTL_API=3 etcdctl endpoint health --endpoints=${ENDPOINTS} --command-timeout=Xs
    etcdctl endpoint health command performs a GET on the "health" key(source)

    • fails when there is no etcd leader or when Quorum is lost as GET request will fail if there is no etcd leader present.

    Advantages of this Method (etcdctl endpoint health).

    • We don't have to worry about such scenarios of causing outage as now snapshotter failure won't fails the readinessProbe of etcd.
    • If there is no Quorum present, kubelet will mark the etcd-members as NotReady and they won't able to serve the write as well as read requests.

    Disadvantages of this Method (etcdctl endpoint health).

    • Owner check feature depends on endpoint /healthz of HTTP server because when Owner check fails it fails the readinessProbe of etcd by setting the HTTP status to 503 but this Owner check in multi-node scenario is already being discussed here.
    • It completely decouples the snapshotter of backup sidecar and readinessProbe of etcd, backup sidecar won't able to control when to let the traffic come in.
  2. /health endpoint of etcd.
    /health endpoint returns false if one of the following conditions is met (source):

    • there is no etcd leader or leader-election is currently going on.
    • the latency of a QGET request exceeds 1sec

    Advantages and Disadvantage of Method 2 (/health endpoint).

    • similar to method 1.
  3. Use endpoint /healthz of HTTP server running in backup sidecar with modifications in such a way that whenever backup-restore leader is elected it should set HTTP server status to 200 for itself as well for all backup-restore followers and set the HTTP server status to 503 when there is no etcd-leader present.
    Advantages of this Method (/healthz).

    • We still have some coupling between snapshotter of backup sidecar and readinessProbe of etcd, backup sidecar will able to control when to let the traffic come in for etcd.

    Disadvantages of this Method (/healthz).

    • It will takes time to implement as well as to handle edge cases.

    Future Scope:

    • Go with method 2 as it give us flexibility to set the readinessProbe from backup-sidecar and switch to gRPC instead of sending REST requests.
@ishan16696
Copy link
Member Author

ishan16696 commented Jan 31, 2022

I'm open to change if we found new way to implement the ReadinessProbe for etcd.

cc @timuthy @abdasgupta @aaronfern @shreyas-s-rao

@timuthy
Copy link
Member

timuthy commented Jan 31, 2022

Thanks for the summary @ishan16696. I have two comments or suggestions in general:

  1. We should introduce a /readyz endpoint for readiness probes. It can have whatever implementation we think is the best w/o touching the pod template, i.e. we are very open to tweak the checks in the future.

  2. In general I prefer your first approach because it executes a "real" check (w/o being influenced or redirected to another member) in the member pod which makes the check simple and independent. The disadvantages you mentioned can be eliminated by covering everything relevant behind a /readyz endpoint.

@ishan16696
Copy link
Member Author

We should introduce a /readyz endpoint for readiness probes

I didn’t get how /readyz will be different from /healthz.
Do you want to execute etcdctl endpoint health from backup-restore and then set /readyz accordingly or something else?

@ishan16696
Copy link
Member Author

added One more method: /health endpoint of etcd
Although it is quite similar to Method 1 etcdctl endpoint health but I thought it is worth to mention.

@timuthy
Copy link
Member

timuthy commented Jan 31, 2022

My suggestion is just to hide the readiness checks (whatever they are) behind an endpoint because then we are and will be very flexible about the implementation behind it w/o changing the pod spec. Please lets not write something like etcdctl endpoint health directly to pod spec.

That endpoint can be called /readyz or we change /healthz which is also fine for me.

@ishan16696
Copy link
Member Author

Please lets not write something like etcdctl endpoint health directly to pod spec

yes, that’s why I updated the description of issue with one more approach of /health endpoint provided by etcd.

My suggestion is just to hide the readiness checks (whatever they are) behind an endpoint because then we are and will be very flexible about the implementation behind it w/o changing the pod spec

make sense

@timuthy
Copy link
Member

timuthy commented Jan 31, 2022

Discussed with @ishan16696, @shreyas-s-rao, @aaronfern the following plan to proceed:

  1. Etcd-Druid configures the following readiness probe in the StatefulSet if multi-node is used (.spec.replicas > 1):
spec:
  template:
    spec:
      containers:
        name: etcd
        readinessProbe:
          exec:
            command:
            - curl
            - --cert
            - /var/etcd/ssl/client/tls.crt
            - --key
            - /var/etcd/ssl/client/tls.key
            - --cacert
            - /var/etcd/ssl/ca/ca.crt
            - https://0.0.0.0:2379/health
  1. Once we support owner checks for the multi-node scenario, we modify the /healthz endpoint of the etcd-backup-restore sidecar to aggregate:
  • Calling etcd's built-in /health check.
  • Executing owner check.
  1. Etcd-Druid reverts readinessProbe to use /healthz again.

@ishan16696
Copy link
Member Author

ishan16696 commented May 24, 2022

As discussed in our internal sync that right now we have many unknowns since we don't know the what edge cases might arise in production w.r.t readinessProbe in multi-node etcd as well as multi-AZ etcd so, we can go ahead with the above approach(#7) for time being (unless current way of readiness probe show some serious issue) and later we can pick this up with better Knowledge of edge cases, testing scenarios as well as better Knowledge of features requirements.

cc @timuthy @abdasgupta @aaronfern

/on-hold

@gardener-robot gardener-robot added the status/on-hold Issue on hold (e.g. because work was suspended) label May 24, 2022
@gardener-robot gardener-robot added the lifecycle/stale Nobody worked on this for 6 months (will further age) label Jan 2, 2023
@abdasgupta abdasgupta added the priority/4 Priority (lower number equals higher priority) label Jan 5, 2023
@ishan16696
Copy link
Member Author

This issue needs to be revisited after gardener/etcd-druid#445 etcd version upgrade.

@ishan16696 ishan16696 transferred this issue from gardener/etcd-backup-restore Jun 27, 2023
@gardener-robot gardener-robot added lifecycle/rotten Nobody worked on this for 12 months (final aging stage) and removed lifecycle/stale Nobody worked on this for 6 months (will further age) labels Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement Enhancement, improvement, extension lifecycle/rotten Nobody worked on this for 12 months (final aging stage) priority/4 Priority (lower number equals higher priority) status/on-hold Issue on hold (e.g. because work was suspended)
Projects
None yet
Development

No branches or pull requests

4 participants