docs: updated deprovisioning docs (#2943)
njtran committed Nov 30, 2022
1 parent 874c5f8 commit bb1a437
Showing 32 changed files with 149 additions and 219 deletions.
16 changes: 2 additions & 14 deletions website/content/en/preview/provisioner.md
@@ -116,19 +116,7 @@ spec:

## Node deprovisioning

If neither of these values is set, Karpenter will *not* delete instances. It is recommended to set the `ttlSecondsAfterEmpty` value to enable scale-down of the cluster.

### spec.ttlSecondsAfterEmpty

Setting a value here enables Karpenter to delete empty or unnecessary instances. DaemonSet pods do not count when determining whether a node is "empty". This value is in seconds.

### spec.ttlSecondsUntilExpired

Setting a value here enables node expiry. After nodes reach the defined age in seconds, they will be deleted, even if they are in use. This effectively lets nodes be periodically "upgraded" by replacing them with newly provisioned instances.

Note that Karpenter does not automatically add jitter to this value. If multiple instances are created in a small amount of time, they will expire at very similar times. Consider defining a [pod disruption budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) to prevent excessive workload disruption.


You can configure Karpenter to deprovision instances through your Provisioner in multiple ways: `spec.ttlSecondsAfterEmpty`, `spec.ttlSecondsUntilExpired`, or `spec.consolidation.enabled`. Read [Deprovisioning](../tasks/deprovisioning/) for more.
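For reference, a minimal Provisioner sketch wiring up the TTL-based fields might look like the following — values are illustrative, the `karpenter.sh/v1alpha5` API is assumed for this docs version, and `ttlSecondsAfterEmpty` cannot be combined with consolidation:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # Delete nodes that have been empty (ignoring DaemonSet pods) for 30 seconds.
  # Mutually exclusive with spec.consolidation.enabled.
  ttlSecondsAfterEmpty: 30
  # Expire and replace nodes after 30 days, even if they are still in use.
  ttlSecondsUntilExpired: 2592000
```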

## spec.requirements

@@ -271,7 +259,7 @@ For more information on the default `--system-reserved` and `--kube-reserved` co

### Eviction Thresholds

The kubelet supports eviction thresholds by default. When enough memory or file system pressure is exerted on the node, the kubelet will begin to evict pods to ensure that system daemons and other system processes can continue to run in a healthy manner.

Kubelet has the notion of [hard evictions](https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/#hard-eviction-thresholds) and [soft evictions](https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/#soft-eviction-thresholds). In hard evictions, pods are evicted as soon as a threshold is met, with no grace period to terminate. Soft evictions, on the other hand, provide an opportunity for pods to be terminated gracefully. They do so by sending a termination signal to pods that are planning to be evicted and allowing those pods to terminate up to their grace period.
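As a point of reference, eviction thresholds live in the kubelet's own configuration. A minimal standalone `KubeletConfiguration` sketch — values are illustrative and not Karpenter-specific — could look like:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Hard thresholds: pods are evicted immediately once crossed, with no grace period.
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
# Soft thresholds: pods receive a termination signal and may shut down gracefully...
evictionSoft:
  memory.available: "500Mi"
# ...for up to the matching grace period.
evictionSoftGracePeriod:
  memory.available: "1m30s"
```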

8 changes: 5 additions & 3 deletions website/content/en/preview/tasks/deprovisioning.md
@@ -35,6 +35,8 @@ example, if a cluster brings up all nodes at once, all the pods on those nodes would fall into
the same batching window on expiration.

- Pods without an ownerRef (also called "controllerless" or "naked" pods) will be evicted during voluntary node disruption, such as expiration or consolidation. A pod with the annotation `karpenter.sh/do-not-evict: true` will cause its node to be opted out of voluntary node disruption workflows.

- Using preferred anti-affinity and topology spreads can reduce the effectiveness of consolidation. At node launch, Karpenter attempts to satisfy affinity and topology spread preferences. To reduce node churn, consolidation must also attempt to satisfy these constraints, so that nodes are not consolidated immediately after they launch. This means that consolidation may decline to deprovision a node in order to avoid violating preferences, even if kube-scheduler could fit that node's pods elsewhere (a sketch of such a preferred constraint follows this note).
{{% /alert %}}
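To illustrate, a "preferred" spread is one with `whenUnsatisfiable: ScheduleAnyway`, which the scheduler and consolidation treat as a preference rather than a hard rule. The Deployment below is a hypothetical sketch; the name and image are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  # hypothetical workload name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        # ScheduleAnyway makes this a preference, not a hard rule; Karpenter
        # tries to satisfy it at node launch and again during consolidation.
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: nginx:1.23  # illustrative image
```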

* **Node deleted**: You could use `kubectl` to manually remove a single Karpenter node:
@@ -64,8 +66,8 @@ All the pod objects get deleted by a garbage collection process later, because t
## Consolidation

Karpenter has two mechanisms for cluster consolidation:
- Deletion - A node is eligible for deletion if all of its pods can run on free capacity of other nodes in the cluster.
- Replace - A node can be replaced if all of its pods can run on a combination of free capacity of other nodes in the cluster and a single cheaper replacement node.

Consolidation has three mechanisms that are performed in order to attempt to identify a consolidation action:
1) Empty Node Consolidation - Delete any entirely empty nodes in parallel
@@ -81,7 +83,7 @@ When there are multiple nodes that could be potentially deleted or replaced, Karpenter
* nodes with lower priority pods

{{% alert title="Note" color="primary" %}}
For spot nodes, Karpenter only uses the deletion consolidation mechanism. It will not replace a spot node with a cheaper spot node. Spot instance types are selected with the `price-capacity-optimized` strategy and often the cheapest spot instance type is not launched due to the likelihood of interruption. Consolidation would then replace the spot instance with a cheaper instance, negating the `price-capacity-optimized` strategy entirely and increasing the interruption rate.
{{% /alert %}}

## Interruption
20 changes: 4 additions & 16 deletions website/content/en/v0.16.0/provisioner.md
@@ -19,14 +19,14 @@ spec:
  # that can't be removed. Mutually exclusive with the ttlSecondsAfterEmpty parameter.
  consolidation:
    enabled: true

  # If omitted, the feature is disabled and nodes will never expire. If set to less time than it requires for a node
  # to become ready, the node may expire before any pods successfully start.
  ttlSecondsUntilExpired: 2592000 # 30 Days = 60 * 60 * 24 * 30 Seconds;

  # If omitted, the feature is disabled, and nodes will never scale down due to low utilization
  ttlSecondsAfterEmpty: 30

  # Priority given to the provisioner when the scheduler considers which provisioner
  # to select. Higher weights indicate higher priority when comparing provisioners.
  # Specifying no weight is equivalent to specifying a weight of 0.
@@ -91,19 +91,7 @@ spec:

## Node deprovisioning

If neither of these values is set, Karpenter will *not* delete instances. It is recommended to set the `ttlSecondsAfterEmpty` value to enable scale-down of the cluster.

### spec.ttlSecondsAfterEmpty

Setting a value here enables Karpenter to delete empty or unnecessary instances. DaemonSet pods do not count when determining whether a node is "empty". This value is in seconds.

### spec.ttlSecondsUntilExpired

Setting a value here enables node expiry. After nodes reach the defined age in seconds, they will be deleted, even if they are in use. This effectively lets nodes be periodically "upgraded" by replacing them with newly provisioned instances.

Note that Karpenter does not automatically add jitter to this value. If multiple instances are created in a small amount of time, they will expire at very similar times. Consider defining a [pod disruption budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) to prevent excessive workload disruption.


You can configure Karpenter to deprovision instances through your Provisioner in multiple ways: `spec.ttlSecondsAfterEmpty`, `spec.ttlSecondsUntilExpired`, or `spec.consolidation.enabled`. Read [Deprovisioning](./tasks/deprovisioning.md) for more.

## spec.requirements

@@ -191,7 +179,7 @@ Karpenter also allows `karpenter.sh/capacity-type` to be used as a topology key

## spec.weight

Karpenter allows you to describe provisioner preferences through a `weight` mechanism similar to how weight is described with [pod and node affinities](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity).

For more information on weighting provisioners, see the [Weighting Provisioners section](../tasks/scheduling#weighting-provisioners) in the scheduling details.
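A hedged sketch of a weighted provisioner follows; the name, weight, and instance-type constraint are illustrative, and the `karpenter.sh/v1alpha5` API is assumed for this docs version:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: preferred            # hypothetical name
spec:
  # Provisioners with higher weight are considered before lower-weighted ones.
  weight: 50
  requirements:
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["c5.large"]   # illustrative constraint
```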

12 changes: 8 additions & 4 deletions website/content/en/v0.16.0/tasks/deprovisioning.md
@@ -32,6 +32,10 @@ default values for them and will not terminate nodes for that purpose.
- Keep in mind that a small node expiry value results in higher churn in cluster activity. So, for
example, if a cluster brings up all nodes at once, all the pods on those nodes would fall into
the same batching window on expiration.

- Note that Karpenter does not automatically add jitter to this value. If multiple instances are created in a small amount of time, they will expire at very similar times. Consider defining a [pod disruption budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) to prevent excessive workload disruption; a minimal sketch follows this note.

- Using preferred anti-affinity and topology spreads can reduce the effectiveness of consolidation. At node launch, Karpenter attempts to satisfy affinity and topology spread preferences. To reduce node churn, consolidation must also attempt to satisfy these constraints, so that nodes are not consolidated immediately after they launch. This means that consolidation may decline to deprovision a node in order to avoid violating preferences, even if kube-scheduler could fit that node's pods elsewhere.
{{% /alert %}}
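A minimal PodDisruptionBudget sketch for the jitter note above — the name and selector are hypothetical:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb              # hypothetical name
spec:
  # Keep at least 2 pods available during voluntary disruptions such as expiry.
  minAvailable: 2
  selector:
    matchLabels:
      app: web               # placeholder selector
```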

* **Node deleted**: You could use `kubectl` to manually remove a single Karpenter node:
@@ -62,8 +66,8 @@ All the pod objects get deleted by a garbage collection process later, because t


Karpenter has two mechanisms for cluster consolidation:
- Deletion - A node is eligible for deletion if all of its pods can run on free capacity of other nodes in the cluster.
- Replace - A node can be replaced if all of its pods can run on a combination of free capacity of other nodes in the cluster and a single cheaper replacement node.

When there are multiple nodes that could be potentially deleted or replaced, Karpenter chooses to consolidate the node that overall disrupts your workloads the least by preferring to terminate:

@@ -72,7 +76,7 @@ When there are multiple nodes that could be potentially deleted or replaced, Karpenter
* nodes with lower priority pods

{{% alert title="Note" color="primary" %}}
For spot nodes, Karpenter only uses the deletion mechanism for consolidation. It will not replace a spot node with a cheaper spot node. Spot instance types are selected with the `capacity-optimized-prioritized` strategy and often the cheapest spot instance type is not launched due to the likelihood of interruption. Consolidation would then replace the spot instance with a cheaper instance, negating the `capacity-optimized-prioritized` strategy entirely and increasing the interruption rate.
{{% /alert %}}

## What can cause deprovisioning to fail?
@@ -104,7 +108,7 @@ Review what [disruptions are](https://kubernetes.io/docs/concepts/workloads/pods

### Pod set to do-not-evict

If a pod exists with the annotation `karpenter.sh/do-not-evict: true` on a node, and a request is made to delete the node, Karpenter will not drain any pods from that node or otherwise try to delete the node. Nodes that have pods with a `do-not-evict` annotation are not considered for consolidation, though their unused capacity is considered for the purposes of running pods from other nodes which can be consolidated. This annotation will have no effect for static pods, pods that tolerate `NoSchedule`, or pods terminating past their graceful termination period.

This is useful for pods that you want to run from start to finish without interruption.
Examples might include a real-time, interactive game that you don't want to interrupt or a long batch job (such as you might have with machine learning) that would need to start over if it were interrupted.
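A hypothetical pod carrying the annotation might look like this — the name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-training-job                 # hypothetical name
  annotations:
    # Opts this pod's node out of voluntary disruption (expiry, consolidation).
    karpenter.sh/do-not-evict: "true"
spec:
  containers:
    - name: train
      image: registry.example.com/trainer:latest   # placeholder image
```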
6 changes: 4 additions & 2 deletions website/content/en/v0.16.0/tasks/provisioning.md
@@ -101,7 +101,8 @@ kind: Provisioner
metadata:
  name: gpu
spec:
  ttlSecondsAfterEmpty: 60
  consolidation:
    enabled: true
  requirements:
    - key: node.kubernetes.io/instance-type
      operator: In
@@ -123,7 +124,8 @@ kind: Provisioner
metadata:
  name: cilium-startup
spec:
  ttlSecondsAfterEmpty: 60
  consolidation:
    enabled: true
  startupTaints:
    - key: node.cilium.io/agent-not-ready
      value: "true"
14 changes: 1 addition & 13 deletions website/content/en/v0.16.1/provisioner.md
@@ -91,19 +91,7 @@ spec:

## Node deprovisioning

If neither of these values is set, Karpenter will *not* delete instances. It is recommended to set the `ttlSecondsAfterEmpty` value to enable scale-down of the cluster.

### spec.ttlSecondsAfterEmpty

Setting a value here enables Karpenter to delete empty or unnecessary instances. DaemonSet pods do not count when determining whether a node is "empty". This value is in seconds.

### spec.ttlSecondsUntilExpired

Setting a value here enables node expiry. After nodes reach the defined age in seconds, they will be deleted, even if they are in use. This effectively lets nodes be periodically "upgraded" by replacing them with newly provisioned instances.

Note that Karpenter does not automatically add jitter to this value. If multiple instances are created in a small amount of time, they will expire at very similar times. Consider defining a [pod disruption budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) to prevent excessive workload disruption.


You can configure Karpenter to deprovision instances through your Provisioner in multiple ways: `spec.ttlSecondsAfterEmpty`, `spec.ttlSecondsUntilExpired`, or `spec.consolidation.enabled`. Read [Deprovisioning](../tasks/deprovisioning/) for more.

## spec.requirements

