Gardener Controller Manager

The Gardener Controller Manager (often referred to as "GCM") is a component that runs next to the Gardener API server, similar to the Kubernetes Controller Manager. It runs several control loops that do not require talking to any seed or shoot cluster. In addition, as of today it exposes an HTTPS server that serves several webhook endpoints for certain resources.

This document explains the various functionalities of the Gardener Controller Manager and their purpose.

Control Loops

Project Controller

This controller consists of three reconciliation loops: the main loop reconciles Project resources, the second loop handles the necessary actions for stale projects, and the third loop tracks activity in projects.

"Main" Reconciler

This reconciler creates a dedicated Namespace prefixed with garden- for each Project resource. The name of the namespace can either be specified in .spec.namespace, or it is auto-generated by the reconciler. If .spec.namespace is set and the namespace does not exist yet, the reconciler creates it. Otherwise, it tries to adopt the existing namespace, which only succeeds if the Namespace was previously labeled with gardener.cloud/role=project and project.gardener.cloud/name=<project-name>. This prevents end-users from adopting arbitrary namespaces (e.g., kube-system) and thereby escalating their privileges.

After the namespace has been created or adopted, the reconciler creates several ClusterRoles and ClusterRoleBindings that allow the project members to access related resources based on their roles. These RBAC resources are prefixed with gardener.cloud:system:project{-member,-viewer}:<project-name>. Gardener administrators and extension developers can define their own roles, see this document for more information.

In addition, operators can configure the Project controller to maintain a default ResourceQuota for project namespaces. Quotas can limit the creation of user-facing resources in particular (e.g., Shoots, SecretBindings, Secrets). They thus protect the Garden cluster from massive resource exhaustion and also enable operators to align quotas with their enterprise policies.

⚠️ Gardener itself is not exempted from configured quotas. For example, Gardener creates Secrets for every shoot cluster in the project namespace, and these count against the configured quota. Please mind this additional resource consumption.

The GCM configuration provides a template section controllers.project.quotas where such a ResourceQuota (see example below) can be specified.

controllers:
  project:
    quotas:
    - config:
        apiVersion: v1
        kind: ResourceQuota
        spec:
          hard:
            count/shoots.core.gardener.cloud: "100"
            count/secretbindings.core.gardener.cloud: "10"
            count/secrets: "800"
      projectSelector: {}

The Project controller takes the shown config and creates a ResourceQuota with the name gardener in the project namespace. If a ResourceQuota resource with the name gardener already exists, the controller only adds those fields to spec.hard that are not yet present, so values that are already set are left untouched. Labels and annotations on the ResourceQuota config get merged with the respective fields on existing ResourceQuotas. An optional projectSelector narrows down the set of projects that are equipped with the given config. If multiple configs match a project, then only the first match in the list is applied to the project namespace.
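The selection and merge behavior described above can be illustrated with a minimal Python sketch. It is not the controller's implementation: the function names are hypothetical, only matchLabels selectors are handled, and treating a missing projectSelector like an empty one is an assumption made for this example.

```python
def selector_matches(selector, labels):
    """Return True if every matchLabels entry is present in labels.

    An empty selector ({}) matches every project. matchExpressions are
    deliberately ignored to keep the sketch short.
    """
    wanted = selector.get("matchLabels", {})
    return all(labels.get(key) == value for key, value in wanted.items())


def pick_quota_config(quota_configs, project_labels):
    """Return the config of the first list entry whose selector matches."""
    for entry in quota_configs:
        # Assumption: a missing projectSelector behaves like an empty one.
        if selector_matches(entry.get("projectSelector") or {}, project_labels):
            return entry["config"]
    return None


def merge_hard_limits(existing_hard, configured_hard):
    """Add configured limits that are missing; keep values already set."""
    merged = dict(configured_hard)
    merged.update(existing_hard)  # existing values win over configured ones
    return merged
```

With this sketch, a manually raised limit such as count/secrets: "1000" on an existing ResourceQuota stays in place, while a limit that is only present in the configured template is added.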

The .status.phase of the Project resources will be set to Ready or Failed by the reconciler to indicate whether the reconciliation loop was performed successfully. Also, it will generate Events to provide further information about its operations.

"Stale Projects" Reconciler

As Gardener is a large-scale Kubernetes-as-a-Service offering, it is designed to be used by a large number of end-users. Over time, it is likely that some of the hundreds or thousands of Project resources are no longer actively used.

Gardener offers the "stale projects" reconciler, which takes care of identifying such stale projects, marking them with a "warning", and eventually deleting them after a certain time period. This reconciler is enabled by default and works as follows:

  1. Projects are considered "stale"/not actively used when all of the following conditions apply:
    1. The namespace associated with the Project does not have any Shoot resources.
    2. The namespace does not have any Plant resources.
    3. The namespace does not have any BackupEntry resources.
    4. The namespace does not have any Secret resources that are referenced by a SecretBinding that is in use by a Shoot (not necessarily in the same namespace).
    5. The namespace does not have any Quota resources that are referenced by a SecretBinding that is in use by a Shoot (not necessarily in the same namespace).
    6. The time since the project was last used (status.lastActivityTimestamp) is longer than the configured minimumLifetimeDays.

If a project is considered "stale" then its .status.staleSinceTimestamp will be set to the time when it was first detected to be stale. If it gets actively used again this timestamp will be removed. After some time the .status.staleAutoDeleteTimestamp will be set to a timestamp after which Gardener will auto-delete the Project resource if it still is not actively used.

The component configuration of the Gardener Controller Manager offers the following options:

  • minimumLifetimeDays: Don't consider newly created Projects as "stale" too early, to give end-users some time to onboard and get familiar with the system. The "stale project" reconciler won't set any timestamp for Projects younger than minimumLifetimeDays. When you change this value, projects marked as "stale" may no longer be marked as "stale" if they are young enough, or vice versa.
  • staleGracePeriodDays: Don't compute auto-delete timestamps for stale Projects that have been unused for less than staleGracePeriodDays. This avoids making end-users unnecessarily nervous "just because" they haven't actively used their Project for a given amount of time. When you change this value, already assigned auto-delete timestamps may be removed again if the new grace period is not yet exceeded.
  • staleExpirationTimeDays: Expiration time after which stale Projects are finally auto-deleted (counted from .status.staleSinceTimestamp). If this value is changed and an auto-delete timestamp has already been assigned to a project, then the new value only takes effect if it was increased. Hence, decreasing staleExpirationTimeDays will not decrease already assigned auto-delete timestamps.
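How the three options interact can be sketched as follows. This is a simplified illustration under assumptions, not the reconciler's code: the function name and parameters are hypothetical, and the in_use flag stands for the resource checks listed above.

```python
from datetime import datetime, timedelta, timezone


def stale_timestamps(now, *, created, last_activity, in_use,
                     stale_since, auto_delete,
                     minimum_lifetime_days, stale_grace_period_days,
                     stale_expiration_time_days):
    """Return the new (staleSinceTimestamp, staleAutoDeleteTimestamp) pair."""
    min_lifetime = timedelta(days=minimum_lifetime_days)
    too_young = now - created < min_lifetime
    recently_active = now - last_activity < min_lifetime
    if in_use or too_young or recently_active:
        # Not stale: both timestamps are removed.
        return None, None

    # Keep the time of first detection if it was already recorded.
    stale_since = stale_since or now
    if now - stale_since < timedelta(days=stale_grace_period_days):
        # Still within the grace period: no auto-delete timestamp yet.
        return stale_since, None

    computed = stale_since + timedelta(days=stale_expiration_time_days)
    if auto_delete is not None and computed < auto_delete:
        # A decreased expiration time never pulls an assigned timestamp forward.
        computed = auto_delete
    return stale_since, computed
```

For example, with staleGracePeriodDays set to 14 and staleExpirationTimeDays set to 90, a project that has been stale for 20 days gets an auto-delete timestamp 90 days after its staleSinceTimestamp.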

Gardener administrators/operators can exclude specific Projects from the stale check by annotating the related Namespace resource with project.gardener.cloud/skip-stale-check=true.

"Activity" Reconciler

Since the other two reconcilers are unable to actively monitor the relevant objects that are used in a Project (Shoot, Plant, etc.), a user could create and delete objects within a short period of time. In that case, the Stale Project Reconciler would not see any activity on that project and would still mark it as stale, even though it is actively used.

The Project Activity Reconciler is implemented to take care of such cases. An event handler notifies the reconciler about any activity, and the reconciler then updates the status.lastActivityTimestamp. This update also triggers the Stale Project Reconciler.

Event Controller

With the Gardener Event Controller you can prolong the lifespan of events related to Shoot clusters. This is an optional controller which will become active once you provide the below mentioned configuration.

All events in K8s are deleted after a configurable time-to-live, which is controlled via the kube-apiserver argument --event-ttl (defaulting to 1 hour). The need to prolong the time-to-live for Shoot cluster events frequently arises when debugging customer issues on live systems. This controller leaves events involving Shoots untouched while deleting all other events after a configured time. In order to activate it, provide the following configuration:

  • concurrentSyncs: The number of goroutines scheduled for reconciling events.
  • ttlNonShootEvents: When an event reaches this time-to-live, it gets deleted unless it is a Shoot-related event (defaults to 1h, equivalent to the --event-ttl default).

⚠️ In addition, you should also configure the --event-ttl for the kube-apiserver to define an upper-limit of how long Shoot-related events should be stored. The --event-ttl should be larger than the ttlNonShootEvents or this controller will have no effect.
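The deletion rule can be sketched in a few lines of Python. The sketch assumes that an event counts as Shoot-related when its involvedObject.kind is Shoot and that its age is measured from lastTimestamp. Both are assumptions for illustration, as is the function name.

```python
from datetime import datetime, timedelta, timezone


def should_delete_event(event, now, ttl_non_shoot_events):
    """Decide whether the controller would delete the given event."""
    involved = event.get("involvedObject", {})
    if involved.get("kind") == "Shoot":
        # Shoot-related events are left to the kube-apiserver's --event-ttl.
        return False
    # Assumption: the event's age is measured from its last occurrence.
    return now - event["lastTimestamp"] >= ttl_non_shoot_events
```

Because Shoot events are only ever removed by the kube-apiserver, the effective retention for them is whatever --event-ttl is set to, which is why that value needs to exceed ttlNonShootEvents.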

Shoot Reference Controller

Shoot objects may specify references to further objects in the Garden cluster which are required for certain features. For example, users can configure various DNS providers via .spec.dns.providers and usually need to refer to a corresponding secret with valid DNS provider credentials inside. Such objects need a special protection against deletion requests as long as they are still being referenced by one or multiple shoots.

Therefore, the Shoot Reference Controller scans Shoot objects for referenced objects and adds the finalizer gardener.cloud/reference-protection to their .metadata.finalizers list. The scanned shoot also gets this finalizer to enable proper garbage collection in case the Gardener Controller Manager is offline at the moment of an incoming deletion request. When an object is no longer actively referenced, because the shoot specification has changed or all related shoots were deleted (or are in deletion), the controller removes the added finalizer again so that the object can safely be deleted or garbage collected.

The Shoot Reference Controller inspects the following references:

  • DNS provider secrets (.spec.dns.providers)
  • Audit policy configmaps (.spec.kubernetes.kubeAPIServer.auditConfig.auditPolicy.configMapRef)

Further checks might be added in the future.
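The finalizer handling for DNS provider secrets can be sketched as below. The sketch works on plain dictionaries instead of API objects, and it assumes that each entry in .spec.dns.providers names its secret in a secretName field. The function names are hypothetical.

```python
FINALIZER = "gardener.cloud/reference-protection"


def referenced_secret_names(shoot):
    """Collect secret names referenced by the shoot's DNS providers."""
    providers = shoot.get("spec", {}).get("dns", {}).get("providers", [])
    # Assumption: the secret is referenced via a `secretName` field.
    return {p["secretName"] for p in providers if p.get("secretName")}


def sync_finalizers(shoots, secret_finalizers):
    """Add or remove the protection finalizer on each secret in place.

    secret_finalizers maps a secret name to its list of finalizers.
    """
    in_use = set()
    for shoot in shoots:
        if shoot.get("metadata", {}).get("deletionTimestamp"):
            continue  # shoots in deletion no longer protect their references
        in_use |= referenced_secret_names(shoot)

    for name, finalizers in secret_finalizers.items():
        if name in in_use and FINALIZER not in finalizers:
            finalizers.append(FINALIZER)
        elif name not in in_use and FINALIZER in finalizers:
            finalizers.remove(FINALIZER)
```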

Shoot Retry Controller

The Shoot Retry Controller is responsible for retrying certain failed Shoots. Currently the controller retries only failed Shoots with error code ERR_INFRA_RATE_LIMITS_EXCEEDED.

Seed Controller

The Seed controller in the Gardener Controller Manager reconciles Seed objects with the help of the following reconcilers.

"Main" Reconciler

This reconciliation loop takes care of seed-related operations in the Garden cluster. When a new Seed object is created, the reconciler creates a new Namespace named seed-<seed-name> in the garden cluster. Namespaces dedicated to single seed clusters allow us to segregate access permissions, i.e., a Gardenlet must not have permissions to access objects in all Namespaces in the Garden cluster. Some objects in a Garden environment are created once by the operator (e.g., default domain secret, alerting credentials) and are required for operations happening in the Gardenlet. Therefore, we not only need a seed-specific Namespace but also a copy of these "shared" objects.

The "main" reconciler takes care of this replication:

| Kind   | Namespace | Label Selector     |
|--------|-----------|--------------------|
| Secret | garden    | gardener.cloud/role |

"Backup Bucket" Reconciler

Every time a BackupBucket object is created or updated, the referenced Seed object is enqueued for reconciliation. It's the reconciler's task to check the status subresource of all existing BackupBuckets that belong to this seed. If at least one BackupBucket has .status.lastError, the seed condition BackupBucketsReady will turn false and consequently the seed is considered as NotReady. Once the BackupBucket is healthy again, the seed will be re-queued and the condition will turn true.
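A minimal sketch of this check is shown below. It assumes that a BackupBucket references its seed via .spec.seedName and represents the condition as a plain dictionary. The function name and the message text are hypothetical.

```python
def backup_buckets_ready(seed_name, backup_buckets):
    """Compute the BackupBucketsReady condition for one seed."""
    failing = sorted(
        bucket["metadata"]["name"]
        for bucket in backup_buckets
        # Assumption: the seed is referenced via `.spec.seedName`.
        if bucket.get("spec", {}).get("seedName") == seed_name
        and bucket.get("status", {}).get("lastError")
    )
    return {
        "type": "BackupBucketsReady",
        "status": "False" if failing else "True",
        "message": ("erroneous buckets: " + ", ".join(failing)) if failing else "",
    }
```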

"Lifecycle" Reconciler

The "Lifecycle" reconciler processes Seed objects, which are enqueued every 10 seconds, in order to check whether the responsible Gardenlet is still responding and operable. To do so, it checks the Lease objects of the seed in the garden cluster, which are renewed regularly by the Gardenlet.

In case a Lease is not renewed for the duration configured in config.controllers.seed.monitorPeriod.duration:

  1. The reconciler assumes that the Gardenlet stopped operating and updates the GardenletReady condition to Unknown.
  2. Additionally, conditions and constraints of all Shoot resources scheduled on the affected seed are set to Unknown as well, because an unresponsive Gardenlet won't be able to maintain these conditions any more.
  3. If the gardenlet's client certificate has expired (identified based on the .status.clientCertificateExpirationTimestamp field in the Seed resource) and if the seed is managed by a ManagedSeed, then the ManagedSeed is triggered for a reconciliation. This triggers the bootstrapping process again and allows the gardenlet to obtain a fresh client certificate.
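The three steps can be summarized in a short Python sketch. The function name, its parameters, and the returned dictionary are hypothetical and only serve to illustrate the decision logic.

```python
from datetime import datetime, timedelta, timezone


def check_seed(now, lease_renew_time, monitor_period,
               client_cert_expiration=None, managed_seed=False):
    """Return the actions the lifecycle check would take for one seed."""
    actions = {
        "gardenlet_ready": None,          # None means: leave the condition as is
        "shoot_conditions": None,
        "reconcile_managed_seed": False,
    }
    lease_fresh = (lease_renew_time is not None
                   and now - lease_renew_time <= monitor_period)
    if lease_fresh:
        return actions

    # Steps 1 and 2: the gardenlet is presumed gone, so nothing it maintains
    # can be trusted any longer.
    actions["gardenlet_ready"] = "Unknown"
    actions["shoot_conditions"] = "Unknown"

    # Step 3: an expired client certificate can only be fixed by bootstrapping
    # again, which is possible if a ManagedSeed manages the gardenlet.
    cert_expired = (client_cert_expiration is not None
                    and client_cert_expiration < now)
    actions["reconcile_managed_seed"] = cert_expired and managed_seed
    return actions
```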

ControllerRegistration Controller

The ControllerRegistration controller makes sure that the required Gardener extensions specified by the ControllerRegistration resources are present in the seed clusters. It also takes care of the creation and deletion of ControllerInstallation objects for a given seed cluster. The controller has three reconciliation loops.

"Main" Reconciler

This reconciliation loop watches the Seed objects, determines which ControllerRegistrations are required for them, and creates/deletes the corresponding ControllerInstallations to reach the determined state. To begin with, it computes the kind/type combinations of extensions required for the seed. For this, the controller examines a live list of ControllerRegistrations, ControllerInstallations, BackupBuckets, BackupEntries, Shoots, and Secrets from the garden cluster. For example, it examines the shoots running on the seed and deduces kind/type combinations like Infrastructure/gcp. It also decides whether they should always be deployed based on the .spec.deployment.policy. For the configuration options, please see this section.

Each of these required combinations is mapped to ControllerRegistration objects and then to their corresponding ControllerInstallation objects (if existing). The controller then creates or updates the required ControllerInstallation objects for the given seed. It also deletes every existing ControllerInstallation whose referenced ControllerRegistration is not part of the required list. For example, if the shoots in the seed are no longer using the DNS provider aws-route53, then the controller proceeds to delete the respective ControllerInstallation object.
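The mapping from required kind/type combinations to ControllerInstallations can be sketched as follows. The sketch assumes that a ControllerRegistration lists its supported combinations under .spec.resources and that the policy value for "always deploy" is Always. The function names and the dictionary-based data model are hypothetical.

```python
def required_registrations(registrations, wanted_kind_types):
    """Return names of ControllerRegistrations needed for a seed.

    wanted_kind_types is a set of (kind, type) tuples, e.g.
    {("Infrastructure", "gcp")}, derived from the objects on the seed.
    """
    required = set()
    for reg in registrations:
        spec = reg.get("spec", {})
        supported = {(r["kind"], r["type"]) for r in spec.get("resources", [])}
        # Assumption: "Always" is the policy value for unconditional deployment.
        always = spec.get("deployment", {}).get("policy") == "Always"
        if always or supported & wanted_kind_types:
            required.add(reg["metadata"]["name"])
    return required


def plan_installations(required, existing):
    """Diff required registrations against existing installations.

    existing maps a ControllerInstallation name to the name of the
    ControllerRegistration it references.
    """
    installed = set(existing.values())
    to_create = sorted(required - installed)
    to_delete = sorted(name for name, reg in existing.items() if reg not in required)
    return to_create, to_delete
```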

"ControllerRegistration" Reconciler

This reconciliation loop watches ControllerRegistration resources and adds a finalizer to each of them when it is created. In case a deletion request comes in for a resource, i.e., if a .metadata.deletionTimestamp is set, it actively scans for ControllerInstallation resources using this ControllerRegistration and decides whether the deletion can be allowed. If no related ControllerInstallation is present, it removes the finalizer and thereby allows the deletion.

"Seed" Reconciler

This loop also watches the Seed object and adds finalizers to it at creation. If a .metadata.deletionTimestamp is set for the seed then the controller checks for existing ControllerInstallation objects which reference this seed. If no such objects exist then it removes the finalizer and allows the deletion.

"CertificateSigningRequest" Controller

After the gardenlet gets deployed on the Seed cluster it needs to establish itself as a trusted party to communicate with the Gardener API server. It runs through a bootstrap flow similar to the kubelet bootstrap process.

On startup the gardenlet uses a kubeconfig with a bootstrap token which authenticates it as being part of the system:bootstrappers group. This kubeconfig is used to create a CertificateSigningRequest (CSR) against the Gardener API server.

The controller in gardener-controller-manager checks whether the CertificateSigningRequest has the expected organisation, common name and usages which the gardenlet would request.

It only auto-approves the CSR if the client making the request is allowed to "create" the certificatesigningrequests/seedclient subresource. Clients in the system:bootstrappers group are bound to the gardener.cloud:system:seed-bootstrapper ClusterRole and hence have this privilege. As the bootstrap kubeconfig for the gardenlet contains a bootstrap token which is authenticated as being part of the system:bootstrappers group, its created CSR gets auto-approved.

"Bastion" Controller

Bastion resources have a limited lifetime, which can be extended up to a certain amount by performing a heartbeat on them. The Bastion controller is responsible for deleting expired or rotten Bastions.

  • "expired" means a Bastion has exceeded its status.expirationTimestamp.
  • "rotten" means a Bastion is older than the configured maxLifetime.

The maxLifetime is an option on the Bastion controller and defaults to 24 hours.
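The two deletion criteria can be expressed in a small Python sketch. The function name and parameters are hypothetical, and the 24-hour default mirrors the maxLifetime default mentioned above.

```python
from datetime import datetime, timedelta, timezone


def bastion_deletion_reason(now, creation_timestamp, expiration_timestamp,
                            max_lifetime=timedelta(hours=24)):
    """Return "expired", "rotten", or None if the Bastion may stay."""
    if expiration_timestamp is not None and now > expiration_timestamp:
        return "expired"  # heartbeats stopped extending the lifetime
    if now - creation_timestamp > max_lifetime:
        return "rotten"   # too old, regardless of heartbeats
    return None
```

A Bastion that keeps receiving heartbeats therefore never becomes "expired", but it is still deleted as "rotten" once it is older than maxLifetime.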

The deletion triggers the gardenlet to perform the necessary cleanups in the Seed cluster, so some time can pass between deletion and the Bastion actually disappearing. Clients like gardenctl are advised to not re-use Bastions whose deletion timestamp has been set already.

Refer to GEP-15 for more information on the lifecycle of Bastion resources.

"Plant" Controller

Using the Plant resource, an external Kubernetes cluster (not managed by Gardener) can be registered to Gardener. Gardener Controller Manager is the component that is responsible for the Plant resource reconciliation. As part of the reconciliation loop, the Gardener Controller Manager performs health checks on the external Kubernetes cluster and gathers more information about it - all of this information serves for monitoring purposes of the external Kubernetes cluster.

The component configuration of the Gardener Controller Manager offers the following options for the Plant controller:

  • syncPeriod: The duration of how often the Plant resource is reconciled, i.e., how often health checks are performed. The default value is 30s.
  • concurrentSyncs: The number of goroutines scheduled for reconciling events, i.e., the number of possible parallel reconciliations. The default value is 5.

The Plant resource reports the following information for the external Kubernetes cluster:

  • Cluster information
    • Cloud provider information - the cloud provider type and region are maintained in the Plant status (.status.clusterInfo.cloud).
    • Kubernetes version - the Kubernetes version is maintained in the Plant status (.status.clusterInfo.kubernetes.version).
  • Cluster status
    • API Server availability - maintained as condition with type APIServerAvailable.
    • Cluster Nodes healthiness - maintained as condition with type EveryNodeReady.