feat(lifecycle-operator): automatically decide for scheduler installation based on k8s version (#2212)

Signed-off-by: realanna <anna.reale@dynatrace.com>
Signed-off-by: RealAnna <89971034+RealAnna@users.noreply.github.com>
Co-authored-by: Giovanni Liva <giovanni.liva@dynatrace.com>
Co-authored-by: Moritz Wiesinger <moritz.wiesinger@dynatrace.com>
Co-authored-by: Meg McRoberts <meg.mcroberts@dynatrace.com>
Co-authored-by: odubajDT <93584209+odubajDT@users.noreply.github.com>
5 people committed Oct 3, 2023
1 parent 6945069 commit 25976ea
Showing 16 changed files with 87 additions and 24 deletions.
6 changes: 6 additions & 0 deletions docs/assets/scheduler-gates/gate-removed.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  annotations:
    keptn.sh/scheduling-gate-removed: "true"
7 changes: 7 additions & 0 deletions docs/assets/scheduler-gates/gated.yaml
@@ -0,0 +1,7 @@
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  schedulingGates:
    - name: "keptn-prechecks-gate"
6 changes: 6 additions & 0 deletions docs/assets/scheduler-gates/scheduler.yaml
@@ -0,0 +1,6 @@
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  schedulerName: keptn-scheduler
1 change: 0 additions & 1 deletion docs/content/en/docs/architecture/_index.md
@@ -3,7 +3,6 @@ title: Architecture
linktitle: Architecture
description: Understand the details of how Keptn works
weight: 50
cascade:
---

### Keptn Components
1 change: 0 additions & 1 deletion docs/content/en/docs/architecture/components/_index.md
@@ -3,7 +3,6 @@ title: Keptn Components
linktitle: Components
description: Basic understanding of Keptn Components
weight: 20
cascade:
---

### Keptn Components
@@ -3,7 +3,6 @@ title: Keptn Lifecycle Operator
linktitle: Lifecycle Operator
description: Basic understanding of the Keptn Lifecycle Operator
weight: 80
cascade:
---


@@ -3,7 +3,6 @@ title: Keptn Metrics Operator
linktitle: Metrics Operator
description: Basic understanding of Keptn's Metrics Operator
weight: 80
cascade:
---

The Keptn Metrics Operator collects, processes,
70 changes: 60 additions & 10 deletions docs/content/en/docs/architecture/components/scheduler/_index.md
@@ -1,13 +1,51 @@
---
title: Keptn Lifecycle Scheduler
linktitle: Scheduler
description: Basic understanding of the Keptn Scheduler
title: Keptn integration with Scheduling
linktitle: Scheduler and Scheduling Gates
description: Basic understanding of how Keptn integrates with Kubernetes Pod Scheduling
weight: 80
cascade:
---

The **Keptn Scheduler** is an integral component of Keptn that orchestrates
the deployment process.
Keptn needs to integrate with Kubernetes scheduling to block
the deployment of applications that do not satisfy Keptn-defined pre-deployment checks.

On Kubernetes versions 1.26 and older,
Keptn uses the **Keptn Scheduler** to block application deployment when appropriate
and orchestrate the deployment process.

If the Keptn Helm chart value `schedulingGatesEnabled` is set to `true` and Keptn is running on a Kubernetes version
greater than 1.26, Keptn does not install a scheduler plugin.
Instead, it uses
the [Pod Scheduling Readiness K8s API](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-scheduling-readiness)
to gate Pods until the required deployment checks pass.
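
The selection rule described above can be sketched in a few lines of Python.
This is only an illustration of the documented behavior, not the chart's template logic:
the function name `install_scheduler` is hypothetical,
and the handling of minor versions with a trailing suffix (for example `27+`) is an assumption.

```python
import re


def install_scheduler(kube_minor: str, scheduling_gates_enabled: bool) -> bool:
    """Return True when the Keptn Scheduler should be installed.

    Documented rule: scheduling gates are used only when
    `schedulingGatesEnabled` is true AND Kubernetes is newer than 1.26.
    In every other case the scheduler is installed.
    """
    # Assumption: the minor version may carry a suffix such as "27+",
    # so only the leading digits are compared, and compared numerically.
    match = re.match(r"\d+", kube_minor)
    if match is None:
        raise ValueError(f"unrecognised minor version: {kube_minor!r}")
    return int(match.group()) <= 26 or not scheduling_gates_enabled


for minor, gates in [("26", True), ("27", True), ("27", False), ("27+", True)]:
    mode = "scheduler" if install_scheduler(minor, gates) else "scheduling gates"
    print(f"minor={minor!r} schedulingGatesEnabled={gates} -> {mode}")
```

Comparing the minor version as a number rather than as text keeps single-digit and suffixed minors on the expected side of the 1.26 boundary.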

## Keptn Scheduling Gates for K8s 1.27 and above

When you apply a workload to a K8s cluster,
the Mutating Webhook checks each Pod
to see if it is annotated with
[Keptn specific annotations](../../../implementing/integrate/#basic-annotations).
If the annotations are present, the Webhook adds a gate to the Pod called `keptn-prechecks-gate`.
This spec tells the Kubernetes scheduling framework
to wait for the Keptn checks before assigning the pod to a node.

For instance, a Pod gated by Keptn looks like the following:

{{< embed path="/docs/assets/scheduler-gates/gated.yaml" >}}

The **WorkloadInstance CRD** contains information about the `pre-deployment` checks that
need to be performed before the Pod can be scheduled.
If the `pre-deployment` checks have finished successfully, the WorkloadInstance Controller removes the gate from the
Pod.
The scheduler can then allow the Pod to be scheduled to a node.
If the `pre-deployment` checks have not yet finished, the gate stays and the Pod remains in the pending state.
When removing the gate, the WorkloadInstance controller also adds the following annotation so that,
if the spec is updated,
the Pod is not gated again:

{{< embed path="/docs/assets/scheduler-gates/gate-removed.yaml" >}}
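
The gate lifecycle above can be modeled on plain dictionaries.
The following is a minimal sketch, not the operator's code:
the function names and the `keptn_annotated` flag are hypothetical,
and the assumption that the webhook skips Pods already carrying the gate-removed annotation
is inferred from the statement that the annotation prevents the Pod from being gated again.

```python
GATE_NAME = "keptn-prechecks-gate"
GATE_REMOVED = "keptn.sh/scheduling-gate-removed"


def webhook_mutate(pod: dict, keptn_annotated: bool) -> dict:
    """Add the Keptn gate to an annotated Pod unless it was removed before."""
    annotations = pod.get("metadata", {}).get("annotations", {})
    if not keptn_annotated or annotations.get(GATE_REMOVED) == "true":
        return pod  # not a Keptn Pod, or its checks already passed
    gates = pod.setdefault("spec", {}).setdefault("schedulingGates", [])
    if all(gate.get("name") != GATE_NAME for gate in gates):
        gates.append({"name": GATE_NAME})
    return pod


def remove_gate(pod: dict) -> dict:
    """Drop the Keptn gate and record that it was removed."""
    spec = pod.setdefault("spec", {})
    spec["schedulingGates"] = [
        gate for gate in spec.get("schedulingGates", [])
        if gate.get("name") != GATE_NAME
    ]
    metadata = pod.setdefault("metadata", {})
    metadata.setdefault("annotations", {})[GATE_REMOVED] = "true"
    return pod


pod = {"apiVersion": "v1", "kind": "Pod", "metadata": {"name": "test-pod"}}
webhook_mutate(pod, keptn_annotated=True)   # Pod is now gated
remove_gate(pod)                            # pre-deployment checks succeeded
webhook_mutate(pod, keptn_annotated=True)   # a later update does not re-gate it
print(pod)
```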

## Keptn Scheduler for K8s 1.26 and earlier

The **Keptn Scheduler** works by registering itself as a Permit plugin within the Kubernetes
scheduling cycle that ensures that Pods are not scheduled to a node until and unless the
pre-deployment checks have finished successfully.
@@ -20,14 +58,18 @@ scheduler has (typically CPU and memory values).
The Keptn Scheduler uses the Kubernetes
[Scheduler Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/) and is based on the
[Scheduler Plugins Repository](https://github.com/kubernetes-sigs/scheduler-plugins/tree/master).
Additionally it registers itself as a [Permit plugin](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/#permit).
Additionally, it registers itself as
a [Permit plugin](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/#permit).

## How does the Keptn Scheduler work
### How does the Keptn Scheduler work

Firstly, the Mutating Webhook checks each Pod to see if it is annotated with
[Keptn specific annotations](https://main.lifecycle.keptn.sh/docs/implementing/integrate/#basic-annotations).
[Keptn specific annotations](../../../implementing/integrate/#basic-annotations).
If the annotations are present, the Webhook assigns the **Keptn Scheduler** to the Pod.
This ensures that the Keptn Scheduler only gets Pods that have been annotated for it.
A Pod `test-pod` modified by the Mutating Webhook looks as follows:

{{< embed path="/docs/assets/scheduler-gates/scheduler.yaml" >}}

If the Pod is annotated with Keptn specific annotations, the Keptn Scheduler retrieves
the WorkloadInstance CRD that is associated with the Pod.
@@ -53,7 +95,8 @@ The Keptn Scheduler processes the following information from the WorkloadInstanc
- The status of the pre-deployment checks.
- The deadline for the pre-deployment checks to be completed.
- The Keptn Scheduler checks the status of the `pre-deployment` checks every 10 seconds.
If the checks have not finished successfully within 5 minutes, the Keptn Scheduler will not allow the Pod to be scheduled.
If the checks have not finished successfully within 5 minutes,
the Keptn Scheduler does not allow the Pod to be scheduled.

If all of the `pre-deployment` checks have finished successfully and the deadline has not been reached,
the Keptn Scheduler allows the Pod to be scheduled.
@@ -63,3 +106,10 @@ been reached, the Keptn Scheduler tells Kubernetes to check again later.
Also the Keptn Scheduler will not schedule Pods to nodes that have failed `pre-deployment`
checks in the past.
This helps to prevent Pods from being scheduled to nodes that are not ready for them.
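
The polling behavior described in this section (a status check every 10 seconds, with a 5 minute limit)
can be sketched as a simple loop.
This is a simplified model under stated assumptions, not the Permit plugin itself:
`wait_for_prechecks` and `FakeClock` are hypothetical names,
and the clock is injected only so that the demo finishes without waiting.

```python
import time
from typing import Callable

CHECK_INTERVAL_S = 10   # the text states the status is checked every 10 seconds
DEADLINE_S = 5 * 60     # the text states a 5 minute limit


def wait_for_prechecks(
    checks_succeeded: Callable[[], bool],
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the checks pass before the deadline, else False."""
    start = now()
    while now() - start < DEADLINE_S:
        if checks_succeeded():
            return True          # allow the Pod to be scheduled
        sleep(CHECK_INTERVAL_S)  # check again later
    return False                 # deadline reached: do not schedule the Pod


class FakeClock:
    """Hypothetical test clock so the demo finishes instantly."""

    def __init__(self) -> None:
        self.t = 0.0

    def now(self) -> float:
        return self.t

    def sleep(self, seconds: float) -> None:
        self.t += seconds


clock = FakeClock()
attempts = []


def succeeds_on_fourth_poll() -> bool:
    attempts.append(clock.now())
    return len(attempts) >= 4


allowed = wait_for_prechecks(succeeds_on_fourth_poll, clock.now, clock.sleep)
print(allowed, attempts)
```

With these two constants, a Pod whose checks never succeed is polled at most 30 times before the deadline is reached.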

## Integrating Keptn with your custom scheduler

Keptn's scheduling logic is compatible with
the [Scheduler Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/).
Keptn does not work with a custom scheduler unless it is implemented as
a [scheduler plugin](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/#plugin-configuration).
3 changes: 1 addition & 2 deletions docs/content/en/docs/architecture/keptn-apps/_index.md
@@ -3,7 +3,6 @@ title: KeptnApp and KeptnWorkload resources
linktitle: Keptn Applications and Keptn Workloads
description: How Keptn applications work
weight: 50
cascade:
---

## Keptn Workloads
@@ -19,7 +18,7 @@ and run pre- and post-deployment tasks.
In its state, it tracks the currently active `Workload Instances`,
(`Pod`, `DaemonSet`, `StatefulSet`, and `ReplicaSet` resources),
as well as the overall state of the Pre Deployment phase,
which the scheduler can use to determine
which Keptn can use to determine
whether the pods belonging to a workload
should be created and assigned to a node.
When it detects that the referenced object has reached its desired state
1 change: 0 additions & 1 deletion docs/content/en/docs/architecture/working/_index.md
@@ -3,7 +3,6 @@ title: How Keptn works
linktitle: How Keptn works
description: Understand How Keptn Works
weight: 30
cascade:
---

### How does Keptn Work?
2 changes: 1 addition & 1 deletion docs/content/en/docs/implementing/otel.md
@@ -173,7 +173,7 @@ kubectl edit configmap otel-collector-conf \
```

When the `otel-collector` pod is up and running,
restart the `keptn-scheduler` and `lifecycle-operator`
restart the `keptn-scheduler` (if installed) and `lifecycle-operator`
so they can pick up the new configuration:

```shell
2 changes: 1 addition & 1 deletion docs/content/en/docs/migrate/strategy/_index.md
@@ -77,7 +77,7 @@ Some key points:
* Keptn provides an operator
that can observe and orchestrate application-aware workload life cycles.
This operator leverages Kubernetes webhooks
and extends the Kubernetes scheduler
and the Kubernetes scheduler
to support pre- and post-deployment hooks.
When the operator detects a new version of a service
(implemented as a Kubernetes
4 changes: 2 additions & 2 deletions lifecycle-operator/chart/templates/deployment.yaml
@@ -1,4 +1,4 @@
{{- if not .Values.schedulingGatesEnabled }}
{{- if or (le .Capabilities.KubeVersion.Minor "26") (not .Values.schedulingGatesEnabled) }}
---
apiVersion: v1
kind: ServiceAccount
@@ -167,7 +167,7 @@ spec:
tolerations: {{- include "tplvalues.render" (dict "value" .Values.tolerations "context" .) | nindent 8 }}
{{- end }}

{{- if not .Values.schedulingGatesEnabled }}
{{- if or (le .Capabilities.KubeVersion.Minor "26") (not .Values.schedulingGatesEnabled) }}
---
apiVersion: apps/v1
kind: Deployment
@@ -1,4 +1,4 @@
{{- if not .Values.schedulingGatesEnabled }}
{{- if or (le .Capabilities.KubeVersion.Minor "26") (not .Values.schedulingGatesEnabled) }}
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
@@ -1,4 +1,4 @@
{{- if not .Values.schedulingGatesEnabled }}
{{- if or (le .Capabilities.KubeVersion.Minor "26") (not .Values.schedulingGatesEnabled) }}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
2 changes: 1 addition & 1 deletion lifecycle-operator/chart/templates/scheduler-config.yaml
@@ -1,4 +1,4 @@
{{- if not .Values.schedulingGatesEnabled }}
{{- if or (le .Capabilities.KubeVersion.Minor "26") (not .Values.schedulingGatesEnabled) }}
apiVersion: v1
kind: ConfigMap
metadata:
