Introduce Node Lifecycle WG #8396

Merged 1 commit on Jun 24, 2025

atiratree
Member

No description provided.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/community-management area/slack-management Issues or PRs related to the Slack Management subproject labels Mar 24, 2025
@k8s-ci-robot k8s-ci-robot requested review from ahg-g and ardaguclu March 24, 2025 12:17
@k8s-ci-robot k8s-ci-robot added committee/steering Denotes an issue or PR intended to be handled by the steering committee. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience. do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. labels Mar 24, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in SIG Scheduling Mar 24, 2025
@atiratree atiratree changed the title Introduce Node Lifecycle WG WIP: Introduce Node Lifecycle WG Mar 24, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 24, 2025
@atiratree
Member Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 24, 2025
@rthallisey
Contributor

Looks like I'm not a member of the Kubernetes org anymore. I was a few years back, but I haven't kept up with contributions recently. You can remove me as a lead, and I can reapply after making some contributions to this WG.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label Mar 24, 2025
@atiratree
Member Author

We have had impactful conversations with Ryan about this group and its goals. He has experience with cluster maintenance and I look forward to his participation in the WG.

@marquiz
Contributor

marquiz commented Mar 25, 2025

/cc

@k8s-ci-robot k8s-ci-robot requested a review from marquiz March 25, 2025 17:09
@atiratree atiratree force-pushed the wg-node-lifecycle branch from 3aed2af to 39e3bde Compare June 10, 2025 10:13
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2025
@atiratree atiratree force-pushed the wg-node-lifecycle branch 2 times, most recently from f920fdf to ddcdc26 Compare June 16, 2025 08:23
@BenTheElder
Member

I think we finally have all SIG +1s now?
cc @kubernetes/steering-committee

This group will have a lot to do :-)
+1, thanks for pulling this together.

Contributor

@soltysh soltysh left a comment

/lgtm
/approve

/hold
to get sufficient majority from steering

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 19, 2025
* Humble Chirammal (**[@humblec](https://github.com/humblec)**), VMware
* Lucy Sweet (**[@intUnderflow](https://github.com/intUnderflow)**), Uber
* Krzysztof Wilczyński (**[@kwilczynski](https://github.com/kwilczynski)**), Independent
* Ryan Hallisey (**[@rthallisey](https://github.com/rthallisey)**), NVIDIA
Contributor

Being a lead is a certain responsibility. It goes beyond "is interested in the topic". But I'm okay with letting you figure out among yourselves who is really showing up consistently to keep the WG moving and then perhaps do some pruning.

## Timelines and Disbanding

The working group will disband once the features and core APIs defined in the following
KEPs/Features have reached a stable state (GA) and ongoing maintenance ownership is established
Contributor

There are no "following KEPs/Features"... so your work is already done? 😛

You probably had this in a different order initially and lost them during some reshuffling. I think this refers to the features under "Prioritization" now?

Member Author

Right, updated!

Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 24, 2025
@pohly
Contributor

pohly commented Jun 24, 2025

/approve

For Steering.

https://github.com/kubernetes/community/pull/8396/files#r2161652316 should be resolved before merging. Also, this needs a rebase...

@atiratree atiratree force-pushed the wg-node-lifecycle branch from ddcdc26 to 90f9a50 Compare June 24, 2025 10:18
@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 24, 2025
Co-authored-by: Ryan Hallisey <rhallisey@nvidia.com>
@atiratree atiratree force-pushed the wg-node-lifecycle branch from 90f9a50 to 149f04c Compare June 24, 2025 10:22
@atiratree
Member Author

Updated and rebased. Thanks everyone!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 24, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: atiratree, pacoxu, pohly, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [pacoxu,pohly,soltysh]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@soltysh
Contributor

soltysh commented Jun 24, 2025

With 4 steering members +1-ing (Paco, Patrick, Ben and myself) this is good to go as is based on the rules.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 24, 2025
@k8s-ci-robot k8s-ci-robot merged commit a324cc8 into kubernetes:master Jun 24, 2025
2 of 3 checks passed
@github-project-automation github-project-automation bot moved this from Needs Triage to Done in SIG Scheduling Jun 24, 2025
@evrardjp evrardjp left a comment

A few extra comments for the sake of history/documentation. I just hope someone will read them and keep them in mind when we produce solutions.

- Consider improving the pod lifecycle of DaemonSets and static pods during a node maintenance.
- Explore the cloud provider use cases and how they can hook into the node lifecycle. So that the
users can use the same APIs or configurations across the board.
- Migrate users of the eviction based kubectl-like drain (kubectl, cluster autoscaler, karpenter,

I am a bit sad that kured was removed here, while it was in the initial comments on a previous issue.

I would like to adapt kured to this framework at least.

Member Author

@evrardjp no worries! This list is not supposed to be exhaustive. We will properly explore the migration topic when the time comes.

FYI, kured is still present in https://github.com/kubernetes/community/blob/master/wg-node-lifecycle/charter.md#relevant-projects

- Explore a unified way of draining the nodes and managing node maintenance by introducing new APIs
and extending the current ones. This includes exploring extension to or interactions with the Node
object.
- Analyze the node lifecycle, the Node API, and possible interactions. We want to explore augmenting

I think we could have been more specific here. For example, analyze the possibility of setting new conditions on nodes.

Member Author

The WG charter does not replace a proper KEP; it only indicates the direction in which we would like to proceed. So we will even explore options that are not explicitly listed here.

We expect to provide reference implementations of the new APIs including but not limited to
controllers (kube-controller-manager), API validation, integration with existing core components and
extension points for the ecosystem. This should be accompanied by E2E / Conformance tests.

And CLI...

Member Author

CLI might be too broad. Nevertheless, we mention kubectl above and will analyze it further in our KEPs.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/community-management area/slack-management Issues or PRs related to the Slack Management subproject cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. committee/steering Denotes an issue or PR intended to be handled by the steering committee. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Status: Done