
☂️ [GEP-20] Highly Available Seed and Shoot Clusters #6529

Closed
56 tasks done
shreyas-s-rao opened this issue Aug 18, 2022 · 14 comments
Assignees
Labels
area/high-availability High availability related kind/enhancement Enhancement, improvement, extension

Comments


shreyas-s-rao commented Aug 18, 2022

How to categorize this issue?

/area high-availability
/kind enhancement

What would you like to be added:

This is an umbrella issue to track the implementation of GEP-20 Highly Available Shoot Control Planes.

Tasks

@shreyas-s-rao (Contributor, Author)

/assign
/assign @timuthy


ashwani2k commented Aug 22, 2022

  • Introduce shoot spec field for enabling HA control planes
  • Add validations for updating the shoot.spec.controlPlanes field
    • Allow non-HA shoot -> HA shoot
    • Only allow non-HA -> multi-zone if assigned seed is multi-zonal
    • Single-zone HA shoot <-> multi-zone HA shoot must not be allowed
    • HA shoot -> non-HA shoot must not be allowed (until etcd scale-down is implemented)

This needs some modifications, along with a change required for the Seed.
The change needs to go into the GEP via a GEP enhancement PR; the implementation then follows and goes through review.

We reprioritised in discussion with @timuthy

  • Support zone-pinning for single-zone HA control planes (via GRM mutating webhook)
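As a rough illustration of the update-validation rules listed earlier in this comment, the transitions could be sketched as follows. This is not the actual Gardener code; the type and function names are simplified stand-ins:

```go
// Illustrative sketch of the shoot.spec.controlPlane update validation rules;
// types and names are assumptions, not the actual Gardener API.
package main

import "fmt"

type FailureToleranceType string

const (
	NonHA FailureToleranceType = ""     // no failureTolerance defined
	Node  FailureToleranceType = "node" // single-zone HA
	Zone  FailureToleranceType = "zone" // multi-zone HA
)

// validateToleranceUpdate checks whether changing a shoot's failure tolerance
// from oldTol to newTol is allowed, given whether the assigned seed is multi-zonal.
func validateToleranceUpdate(oldTol, newTol FailureToleranceType, seedMultiZonal bool) error {
	switch {
	case oldTol == newTol:
		return nil // no change
	case oldTol == NonHA && newTol == Zone && !seedMultiZonal:
		return fmt.Errorf("non-HA -> multi-zone requires a multi-zonal seed")
	case oldTol == NonHA:
		return nil // non-HA shoot -> HA shoot is allowed
	case newTol == NonHA:
		return fmt.Errorf("HA -> non-HA is not allowed until etcd scale-down is implemented")
	default:
		return fmt.Errorf("single-zone HA <-> multi-zone HA must not be allowed")
	}
}

func main() {
	fmt.Println(validateToleranceUpdate(NonHA, Node, false)) // allowed (nil error)
	fmt.Println(validateToleranceUpdate(Node, Zone, true))   // forbidden transition
}
```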


ashwani2k commented Sep 8, 2022

The bullet points below outline the API contract for introducing HA control planes via:

  controlPlane:
    highAvailability:
      failureTolerance:
        type:  <node|zone>
  1. non-HA shoots can be scheduled on non-HA or HA (multi-zone) seeds.
  2. single-zone shoots can be scheduled on non-HA or HA (multi-zone) seeds.
  3. multi-zone shoots can be scheduled only on HA (multi-zone) seeds.
  4. non-HA shoots can be upgraded to single-zone on non-HA or HA seeds. **
  5. non-HA shoots can be upgraded to multi-zone only on HA seeds. **
  6. single-zone shoots shall not be allowed to upgrade to multi-zone and shall be rejected by admission plugins.

** this can lead to a short disruption/downtime when the etcd StatefulSet is rolled


Legend:
  • non-HA shoot: any shoot with no failureTolerance defined.
  • single-zone shoot: any shoot with failureTolerance of type node.
  • multi-zone shoot: any shoot with failureTolerance of type zone.
  • non-HA seed: any seed whose worker pools for etcd/control-plane components run in only a single availability zone.
  • HA seed: any seed whose worker pools for etcd/control-plane components are spread across 3 availability zones and which has the label seed.gardener.cloud/multi-zonal: "true".
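Scheduling rules 1-3 above boil down to a single condition: only multi-zone shoots require a multi-zonal seed. A minimal sketch (not the actual gardener-scheduler code; names are illustrative):

```go
// Illustrative sketch of scheduling rules 1-3: non-HA and single-zone shoots
// fit on any seed, multi-zone shoots only on HA (multi-zone) seeds.
package main

import "fmt"

// canSchedule reports whether a shoot with the given failure tolerance
// ("", "node" or "zone") may be scheduled on a seed.
func canSchedule(failureTolerance string, seedMultiZonal bool) bool {
	if failureTolerance == "zone" {
		return seedMultiZonal // rule 3: multi-zone shoots only on HA (multi-zone) seeds
	}
	return true // rules 1 and 2: non-HA and single-zone shoots fit on any seed
}

func main() {
	fmt.Println(canSchedule("zone", false)) // false
	fmt.Println(canSchedule("node", false)) // true
}
```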


ialidzhikov commented Sep 21, 2022

Regarding Enhance Pod eviction in case of zone outage (delete Pods in Terminating state): for Deployments, the kube-controller-manager behaviour is to create new Pods right away while the old Pods are Terminating.

Example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: nginx
        image: centos
        command: ["/bin/sh"]
        args: ["-c", "sleep 3600"]
        ports:
        - containerPort: 80

The Deployment above runs a container that does not handle SIGTERM, so it hangs in Terminating until it is force-killed after terminationGracePeriodSeconds.

$ k get po
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-746759f465-z95lj   1/1     Running   0          11m

$ k delete po nginx-deployment-746759f465-z95lj
pod "nginx-deployment-746759f465-z95lj" deleted

$ k get po
NAME                                READY   STATUS              RESTARTS   AGE
nginx-deployment-746759f465-z95lj   1/1     Terminating         0          11m
nginx-deployment-746759f465-ztz65   0/1     ContainerCreating   0          2s

$ k get po
NAME                                READY   STATUS        RESTARTS   AGE
nginx-deployment-746759f465-z95lj   1/1     Terminating   0          12m
nginx-deployment-746759f465-ztz65   1/1     Running       0          54s

As shown above, when the old replica is deleted, the new one is created right away.


I suspect that in the experiments of @unmarshall (ref #6287 (comment)) either a webhook was preventing creation of new Pods (for some unknown reason) or kube-controller-manager was down (for some unknown reason). These are the two potential things that could explain #6287 (comment).

Anyway, I will try to simulate a zone outage and check why KCM does not create the new Pods while the old ones are terminating.


ialidzhikov commented Sep 22, 2022

We had a sync with @unmarshall and we are able to confirm that in a simulation of zone outage (simulated via network acl that denies all ingress and egress traffic for a zone) the recovery for a (multi-zone) control plane worked well as outlined in #6529 (comment):

  • For Deployments kube-controller-manager creates new replicas right away when the old replicas are terminating. The new replicas start successfully on a healthy zone.
  • I think during @unmarshall's simulations kube-controller-manager was down for some reason. I also reviewed the webhooks we deploy to check whether a deadlock situation could block new Pod creation, but I didn't see anything abnormal.

PS: We also found that the existing garbage-collector (shoot-care-controller of gardenlet) already deletes Terminating pods in the Shoot's control plane after 5min.

// performGarbageCollectionSeed performs garbage collection in the Shoot namespace in the Seed cluster.
func (g *GarbageCollection) performGarbageCollectionSeed(ctx context.Context) error {
	podList := &corev1.PodList{}
	if err := g.seedClient.List(ctx, podList, client.InNamespace(g.shoot.SeedNamespace)); err != nil {
		return err
	}
	return g.deleteStalePods(ctx, g.seedClient, podList)
}

But this is not a recovery mechanism and does not by itself lead to recovery. For Deployments, kube-controller-manager already creates the new replicas. For StatefulSets, even when the old Terminating replicas are forcefully deleted, this does not lead to recovery: the new StatefulSet Pods fail to be scheduled because they have scheduling requirements that cannot be satisfied during the zone outage (the etcd Pod must run in the outage zone, and the loki/prometheus Pods must run there because their volumes are already provisioned in that zone).
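Why force-deleting the Terminating StatefulSet Pods does not help can be seen from the volume's topology constraint: a zonal PersistentVolume carries a node-affinity term that pins any consuming Pod to the volume's zone. A minimal, illustrative manifest (name, driver, and zone are assumptions, not taken from a real cluster):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-etcd-main-0            # illustrative name
spec:
  capacity:
    storage: 10Gi
  accessModes: ["ReadWriteOnce"]
  csi:
    driver: example.csi.driver    # illustrative driver
    volumeHandle: vol-0123456789  # illustrative handle
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - europe-west1-a        # the outage zone; the scheduler cannot place the Pod elsewhere
```

As long as the volume stays pinned to the unavailable zone, a replacement Pod that mounts it remains Pending regardless of how the old Pod was removed.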

TL;DR: We will resolve the corresponding item as completed as nothing has to be done. Let us know if you have additional comments on this topic. We have to update GEP-20 with the new learnings.


vlerenc commented Sep 23, 2022

Great to hear! Thank you!


timuthy commented Sep 30, 2022

I added another item Support control-plane migration for HA shoots since this doesn't seem to work out of the box. We should create a separate issue once we have more certainty and details and find a proper way to support this use-case.

cc @plkokanov @vlerenc

@plkokanov (Contributor)

I added another item Support control-plane migration for HA shoots since this doesn't seem to work out of the box. We should create a separate issue once we have more certainty and details and find a proper way to support this use-case.

cc @plkokanov @vlerenc

Should we (for now) add validation that forbids migration for HA shoots?


timuthy commented Nov 22, 2022

/assign @plkokanov @ishan16696
for tasks related to

Support control-plane migration for HA shoots

@ishan16696 (Member)

/assign @plkokanov @ishan16696
for tasks related to

Please see the approaches possible to achieve CPM in multi-node etcd: gardener/etcd-druid#479 (comment)

@rfranzke (Member)

All tasks have been completed.
/close

@gardener-prow gardener-prow bot closed this as completed May 16, 2023

gardener-prow bot commented May 16, 2023

@rfranzke: Closing this issue.

In response to this:

All tasks have been completed.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
