[WIP] Add blog article about minReadySeconds for StatefulSets and maxSurge for DaemonSets #35440
/retitle [WIP] Add blog article about minReadySeconds for StatefulSets and maxSurge for DaemonSets
Related to #35538
and related to #35539
Hi from the Comms team! Just a reminder that the Ready to Review deadline for feature blogs is Tuesday, August 16. You will also be assigned a publication date. Is there anything we can do to help you right now?
a96b5fc to 794299f
### MinReadySeconds for StatefulSets
`minReadySeconds` ensures that the statefulset workload is `Ready` for the given number of seconds before calling the
pod `Available`. The notion of being `Ready` and `Available` is quiet important for workloads. For example, some workloads like Prometheus with multiple instances of Alertmanager should be considered `Available` only when the Alertmanager's state transfer is complete. `minReadySeconds` also helps when using loadbalancers with cloud providers. Since the pod should be `Ready` for the given number of seconds, it provides buffer time to prevent killing pods in rotation before new pods show up.
- pod `Available`. The notion of being `Ready` and `Available` is quiet important for workloads. For example, some workloads like Prometheus with multiple instances of Alertmanager should be considered `Available` only when the Alertmanager's state transfer is complete. `minReadySeconds` also helps when using loadbalancers with cloud providers. Since the pod should be `Ready` for the given number of seconds, it provides buffer time to prevent killing pods in rotation before new pods show up.
+ pod `Available`. The notion of being `Ready` and `Available` is quite important for workloads. For example, some workloads like Prometheus with multiple instances of Alertmanager should be considered `Available` only when the Alertmanager's state transfer is complete. `minReadySeconds` also helps when using loadbalancers with cloud providers. Since the pod should be `Ready` for the given number of seconds, it provides buffer time to prevent killing pods in rotation before new pods show up.
I know the last sentence says it, but maybe we should mention specifically rollouts? We discussed something similar in the docs PR: #35539 (comment)
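For illustration, a minimal sketch of a StatefulSet manifest that sets `minReadySeconds` might look like the following. The names `web` and `nginx`, the image, and the 10-second value are assumptions chosen for this example and do not come from the draft.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                # hypothetical name
spec:
  serviceName: nginx       # hypothetical headless Service governing this StatefulSet
  replicas: 3
  # A pod must stay Ready for this many seconds before it is counted as Available.
  minReadySeconds: 10      # example value only
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx       # assumed image, any container image works here
        ports:
        - containerPort: 80
```

During a rollout, the controller would then wait for each updated pod to be `Ready` for 10 seconds before treating it as `Available` and moving on to the next pod.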
**Authors:** Ravi Gudimetla (Apple), Filip Krepensky (Red Hat), Maciej Szulik (Red Hat)

This blog describes the two features namely `minReadySeconds for StatefulSets` and `maxSurge for Daemonsets` that sig-apps is happy to graduate to stable in 1.25
- This blog describes the two features namely `minReadySeconds for StatefulSets` and `maxSurge for Daemonsets` that sig-apps is happy to graduate to stable in 1.25
+ This blog describes the two features namely `minReadySeconds for StatefulSets` and `maxSurge for DaemonSets` that sig-apps is happy to graduate to stable in 1.25
You are required to download and install a kubectl greater than v1.22.0 version

Specify a value for `minReadySeconds` for any StatefulSet and you check if pods are available or not by checking
- Specify a value for `minReadySeconds` for any StatefulSet and you check if pods are available or not by checking
+ Specify a value for `minReadySeconds` for any StatefulSet and check if pods are available or not by inspecting
### MaxSurge for DaemonSets
`MaxSurge` allows a daemonset workload to run multiple instances of the same pod on a node during rollout to minimize the downtime of the daemonset to other consumers. Kubernetes system-level components like CNI, CSI are typically run as daemonsets. These components can have impact on the availablity of the workloads if those daemonsets go down momentarily during the upgrades. The feature allows daemonset pods to surge, there by ensuring zero-downtime for the daemonsets.
- `MaxSurge` allows a daemonset workload to run multiple instances of the same pod on a node during rollout to minimize the downtime of the daemonset to other consumers. Kubernetes system-level components like CNI, CSI are typically run as daemonsets. These components can have impact on the availablity of the workloads if those daemonsets go down momentarily during the upgrades. The feature allows daemonset pods to surge, there by ensuring zero-downtime for the daemonsets.
+ `MaxSurge` allows a daemonset workload to run multiple instances of the same pod on a node during rollout to minimize the downtime of the daemonset to other consumers. Kubernetes system-level components like CNI, CSI are typically run as daemonsets. These components can have impact on the availablity of the workloads if those daemonsets go down momentarily during the upgrades. The feature allows daemonset pods to surge, thereby ensuring zero-downtime for the daemonsets.
### MaxSurge for DaemonSets

Specify the update strategy to `RollingUpdate` and set `.spec.updateStrategy.rollingUpdate.maxSurge`
and observe a faster rollout and higher number of pods running at the same time in the next rollout
what about adding the observe part as well in some form?
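As a rough sketch of what the usage step could show, a DaemonSet that opts into surging during a rolling update might be declared as below. The name `node-agent`, its label, and the image are hypothetical placeholders. The values `maxSurge: 1` and `maxUnavailable: 0` are example choices, picked so that the new pod on a node starts before the old one is removed.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent                 # hypothetical name
spec:
  selector:
    matchLabels:
      app: node-agent
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      # Allow one extra pod per node while the rollout is in progress.
      maxSurge: 1                  # example value only
      # Keep the old pod running until its replacement is available.
      maxUnavailable: 0            # maxSurge and maxUnavailable cannot both be 0
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      containers:
      - name: agent
        image: registry.example/node-agent:v2   # hypothetical image
```

With a configuration along these lines, the next rollout should briefly show two pods of this DaemonSet on a node, which is the "observe" part raised in the comment above.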
slug: "sig-apps features graduating to stable in 1.25"
---

**Authors:** Ravi Gudimetla (Apple), Filip Krepensky (Red Hat), Maciej Szulik (Red Hat)
- **Authors:** Ravi Gudimetla (Apple), Filip Krepensky (Red Hat), Maciej Szulik (Red Hat)
+ **Authors:** Ravi Gudimetla (Apple), Filip Krepinsky (Red Hat), Maciej Szulik (Red Hat)
Hi from Comms! Your assigned publication date is September 16. Thank you!
@ravisantoshgudimetla it'd be great to get this PR ready for review. Would you like help with that?
I'd start with a simple use case describing the necessity for both of these features, and also add that this effort aligns the higher-level workload controllers with each other, since Deployments already support both of these features. Only then go into the details of how this works and how it is implemented.
## What problems does these features solve?

### MinReadySeconds for StatefulSets
`minReadySeconds` ensures that the statefulset workload is `Ready` for the given number of seconds before calling the
Keep the resource names always upper case:
- `minReadySeconds` ensures that the statefulset workload is `Ready` for the given number of seconds before calling the
+ `minReadySeconds` ensures that the StatefulSet workload is `Ready` for the given number of seconds before calling the
### MaxSurge for DaemonSets
`MaxSurge` allows a daemonset workload to run multiple instances of the same pod on a node during rollout to minimize the downtime of the daemonset to other consumers. Kubernetes system-level components like CNI, CSI are typically run as daemonsets. These components can have impact on the availablity of the workloads if those daemonsets go down momentarily during the upgrades. The feature allows daemonset pods to surge, there by ensuring zero-downtime for the daemonsets.
- `MaxSurge` allows a daemonset workload to run multiple instances of the same pod on a node during rollout to minimize the downtime of the daemonset to other consumers. Kubernetes system-level components like CNI, CSI are typically run as daemonsets. These components can have impact on the availablity of the workloads if those daemonsets go down momentarily during the upgrades. The feature allows daemonset pods to surge, there by ensuring zero-downtime for the daemonsets.
+ `MaxSurge` allows a DaemonSet workload to run multiple instances of the same pod on a node during rollout to minimize the downtime of the DaemonSet to other consumers. Kubernetes system-level components like CNI, CSI are typically run as DaemonSets. These components can have impact on the availability of the workloads if those DaemonSets go down momentarily during the upgrades. The feature allows DaemonSet pods to surge, there by ensuring zero-downtime for the DaemonSets.
### MaxSurge for DaemonSets
`MaxSurge` allows a daemonset workload to run multiple instances of the same pod on a node during rollout to minimize the downtime of the daemonset to other consumers. Kubernetes system-level components like CNI, CSI are typically run as daemonsets. These components can have impact on the availablity of the workloads if those daemonsets go down momentarily during the upgrades. The feature allows daemonset pods to surge, there by ensuring zero-downtime for the daemonsets.

Please note that the usage of `HostPort` in conjunction with `MaxSurge` in daemonsets is not allowed as daemonset pods are tied to a single node and two active pods cannot share the same port on the same node.
- Please note that the usage of `HostPort` in conjunction with `MaxSurge` in daemonsets is not allowed as daemonset pods are tied to a single node and two active pods cannot share the same port on the same node.
+ Please note that the usage of `HostPort` in conjunction with `MaxSurge` in DaemonSets is not allowed as DaemonSet pods are tied to a single node and two active pods cannot share the same port on the same node.
And through the rest of the docs as well...
### MaxSurge for DaemonSets

The `DaemonSet` controller creates the additional pods based on the value given in `.spec.strategy.rollingUpdate.maxSurge`. The additional pods would run on the same node where the old daemonset pod is running till the old pod gets killed. This value cannot be `0` when `MaxUnavailable` is 0. The default value is 0.
- The `DaemonSet` controller creates the additional pods based on the value given in `.spec.strategy.rollingUpdate.maxSurge`. The additional pods would run on the same node where the old daemonset pod is running till the old pod gets killed. This value cannot be `0` when `MaxUnavailable` is 0. The default value is 0.
+ The DaemonSet controller creates the additional pods (above the desired number resulting from DaemonSet spec) based on the value given in `.spec.strategy.rollingUpdate.maxSurge`. The additional pods would run on the same node where the old daemonset pod is running till the old pod gets killed. This value cannot be `0` when `MaxUnavailable` is 0. The default value is 0.
or something similar where we'd explicitly call out the fact that `.maxSurge` is above the usual replicas.
### MinReadySeconds for StatefulSets

You are required to download and install a kubectl greater than v1.22.0 version
Irrelevant, I'd drop it.
After a discussion with @ravisantoshgudimetla, I have started a new PR, #36763, that tries to address the changes suggested here, so that hopefully we can publish it soon.
/close
@soltysh: Closed this PR.
sig-apps is excited to promote 2 features to GA this release: