REQUEST: Migrate github.com/aws/karpenter-core #4258

Closed
2 tasks done
ellistarn opened this issue May 31, 2023 · 53 comments

Labels
sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling.

Comments

@ellistarn

ellistarn commented May 31, 2023

Describe the issue

Karpenter is an open-source Kubernetes node autoscaling and management solution created by AWS. It can help improve application availability, reduce operational overhead, and lower compute costs in Kubernetes clusters. The Karpenter community has expressed interest in additional cloud provider implementations as well as a vendor neutral home for the project. We believe that SIG Autoscaling is the right home for the project and are looking for feedback from the community.

AWS approached SIG Autoscaling in 2019 and later in 2020, described the challenges our customers were facing with the Cluster Autoscaler, and proposed changes to better meet their needs. The SIG expressed reasonable concerns about how to prove out these ideas while maintaining backwards compatibility with the broadly adopted Kubernetes Cluster Autoscaler, and recommended we explore the ideas in a separate project, so we made github.com/aws/karpenter.

Karpenter improves application availability, reduces operational overhead, and increases cluster compute cost efficiency. It observes the pods in your cluster and launches, updates, or terminates nodes so that they always have the compute resources they request. Karpenter evaluates the compute and scheduling requirements of these pods to determine the types and locations of compute resources required to satisfy them. Karpenter is designed to work with any Kubernetes cluster running in any environment through a cloud provider interface.

In November 2021, we announced Karpenter v0.5 as ready for production. Since then, the team at AWS and the broader community have worked diligently to build features and fix bugs, and many AWS customers have found Karpenter to be a good fit for their operational, performance, and economic requirements. We’ve earned 4.4k GitHub stars, merged code from 200 contributors, and discussed Karpenter with 1.5k community members on the Kubernetes Slack.

Kubernetes SIGs are at the core of the Kubernetes community and Karpenter fits most closely with the charter of SIG Autoscaling. From the beginning, we built Karpenter as an open-source, vendor-neutral solution. Contribution to the SIG will help Karpenter meet the needs of users operating in multiple environments and enable the community to build additional cloud provider implementations. Karpenter will benefit from the community and well-established governance of the SIG, while bringing new energy, contributors, and ideas to the SIG.

We’re looking for feedback from the Kubernetes community on the following proposal:

Approval Request

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label May 31, 2023
@ellistarn
Author

cc: @akestner, @dims, @lachie83

@ellistarn
Author

/sig autoscaling

@k8s-ci-robot k8s-ci-robot added sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 31, 2023
@mrbobbytables
Member

/transfer org

@k8s-ci-robot k8s-ci-robot transferred this issue from kubernetes/community May 31, 2023
@mrbobbytables mrbobbytables changed the title Request: Karpenter to join SIG Autoscaling REQUEST: Migrate github.com/aws/karpenter May 31, 2023
@mrbobbytables
Member

/assign @mwielgus @gjtempleton

@mrbobbytables mrbobbytables changed the title REQUEST: Migrate github.com/aws/karpenter REQUEST: Migrate github.com/aws/karpenter-core May 31, 2023
@jrsapi

jrsapi commented May 31, 2023

@ellistarn thanks for opening this issue. As an end user member, +1 to all 4 points of this proposal. The community will benefit from having this under SIG Autoscaling.

@tuananh

tuananh commented Jun 1, 2023

Is there an implementation of cloud provider other than aws? Or do azure/gcp plan to adopt karpenter core?

@olemarkus
Member

This is a great move. +1 from me!

@palnabarun
Member

I will take care of the transfer once there's ack from SIG Autoscaling leads.

/assign

@bplasmeijer

"Climate change is not a distant future; it is the defining challenge of our present. Today is the day for communities to unite, embrace sustainable practices, and become the architects of a better tomorrow. Every action taken today is a step towards safeguarding our planet, ensuring a vibrant future for future generations. Let us act now, for in communities' collective efforts lies the power to create a resilient and thriving world where nature and humanity coexist in harmony."

Please CNCF today.

@PixelRobots

The proposal for Karpenter to become a SIG Autoscaling Subproject is highly beneficial to the broader Kubernetes community.

By joining forces with SIG Autoscaling, Karpenter will become more accessible and inclusive for users operating in multiple environments, including AKS. This collaboration will facilitate the development of additional cloud provider implementations, further enhancing the flexibility and versatility of Karpenter. Moreover, integrating Karpenter as a subproject under SIG Autoscaling's well-established governance will provide the necessary structure and support for its growth, while fostering an environment for new contributors to join and contribute their ideas and expertise.

In addition to the operational, performance, and economic benefits already demonstrated by Karpenter, this move towards a vendor-neutral home will also contribute to green computing by enabling efficient resource utilization and optimization. Overall, embracing Karpenter within SIG Autoscaling is a significant step towards advancing autoscaling capabilities in Kubernetes and empowering users with enhanced scalability, cost-efficiency, and sustainability in their cloud-native deployments.

This request has my backing.

@ehrnst

ehrnst commented Jun 1, 2023

Karpenter as part of CNCF sounds like the most reasonable idea. Sustainability and cost-efficiency won't be less important going forward.

@PaulusTM

PaulusTM commented Jun 1, 2023

Let's save 🌳 and the planet. This project should be part of the CNCF and usable by all providers to enable sustainability.

@MBOps

MBOps commented Jun 1, 2023

This would be a great move

@gbaeke

gbaeke commented Jun 1, 2023

This would be great indeed. Looking forward to integration with other cloud providers and beyond.

@mbevc1

mbevc1 commented Jun 1, 2023

Sounds good @ellistarn and having this as part of SIG would send a positive signal for wider ecosystem adoption 👍

@Aditya-Narayan-Nayak

That would be a great move. +1 from my side.

@Vlaaaaaaad

A massive +1 from me too. Karpenter dramatically improved scaling for Kubernetes clusters on AWS and it's something that is strongly missed when running Kubernetes outside of AWS. Adoption of this project by SIG Autoscaling will open up a whole new chapter for scaling k8s workers.

Is there an implementation of cloud provider other than aws? Or do azure/gcp plan to adopt karpenter core?

Azure had an announcement that got pulled (you can see the Wayback Machine snapshot here) and they did mention that "Karpenter specifically it's a single-entity governed project at the moment so we cannot commit one way or another to supporting its interfaces and the project itself in AKS. We are happy to consider it if it ever becomes a community governed project."

@sam-cogan

This would definitely be a good thing to be able to expand usage to other cloud providers.

@AdminTurnedDevOps

This is very much needed. Ironically enough, one of the biggest reasons that I recommend EKS is specifically because of Karpenter. I've done several benchmarks and Karpenter is certainly faster than Cluster Autoscaler. In many cases, getting a new Worker Node up and running as fast as possible is the difference between Pods handling scale properly and Pods sitting in pending while customers are experiencing timeouts.

@awdelyea

awdelyea commented Jun 1, 2023

I wholeheartedly support this proposal. Incorporating Karpenter into SIG Autoscaling promotes diversity and vendor neutrality, while enhancing Kubernetes autoscaling capabilities. I'm particularly excited for the broader community involvement and the potential for increased cloud provider implementations. Looking forward to Karpenter's continued evolution!

@whaakman

whaakman commented Jun 1, 2023

Definitely supporting this!

@dims
Member

dims commented Jun 2, 2023

@gjtempleton Will let @ellistarn respond formally. But I do have a quick question here that will help with discussion.

Wearing my personal hat:
Who knows how any project progresses, so any commitment made at a point in time may end up depending on who shows up to do the work and what they bring to the table. We don't know what the composition of the maintainers of Karpenter or any other project will be a year from now. What we can do is set up processes to make sure there is active discussion and back-and-forth between projects, so that the drift between them gets smaller over time. How about a WG composed of team members from the different projects that can discuss and coordinate how things evolve? After all, both can learn from each other as well. It would only benefit everyone if there is cross-pollination of ideas, implementation, etc. over time. We can't just expect CA to be static / as-is and Karpenter to only follow what CA does, right?

So how about we start a WG to help with this with active members from the SIG? Can you think of any other techniques we can apply?

@jackfrancis

@vadasambar

I don't see a clear path on how we are going to make Karpenter vendor neutral.

I think this request is laying out the path. Opting into CNCF governance is pretty definitionally a vendor neutral path.

I have seen a lot of resources around this, but do we see any plans to clearly lay out pros and cons for both CA and Karpenter side-by-side as a part of some official documentation in a vendor neutral way to help users choose better if Karpenter becomes part of SIG-autoscaling (can be a separate README)? My impression as a user (and I think I am not the only one) is that Karpenter is publicized as the better autoscaling solution. As someone interested in contributing to Karpenter and as an existing contributor to CA, I am personally not a fan of one sub-project giving the impression that it is better than the other sub-project.

Agree 100%. Is there official documentation for karpenter that describes it as just "better", rather than outlining the functional differences? Because cluster-autoscaler is the more mature and widely adopted solution for node autoscaling, it makes sense that karpenter documentation might include some self-advocacy, as its potential user community includes folks who are already using cluster-autoscaler and might need some reasons to consider an alternative. But "better than cluster-autoscaler" is not a tractable value proposition, and I'm skeptical that this is the reason why karpenter has so many AWS users, and why so many non-AWS users are eager to see implementations in their environments.

Outlining exactly what cluster-autoscaler does and what karpenter does and why some use-cases might be better solved by one or the other should be a goal.

My impression of sig-autoscaling is, we don't have as many contributors in the project (feel free to correct me). My concern is Karpenter is going to add more work (especially for chairs).

Adding more project participation should have the side-effect of increasing contributors, no?

@dtzar
Contributor

dtzar commented Jun 2, 2023

@gjtempleton Will let @ellistarn respond formally. But I do have a quick question here that will help with discussion.

Wearing my personal hat: Who knows how any project progresses, so any commitment made at a point in time may end up depending on who shows up to do the work and what they bring to the table. We don't know what the composition of the maintainers of Karpenter or any other project will be a year from now. What we can do is set up processes to make sure there is active discussion and back-and-forth between projects, so that the drift between them gets smaller over time. How about a WG composed of team members from the different projects that can discuss and coordinate how things evolve? After all, both can learn from each other as well. It would only benefit everyone if there is cross-pollination of ideas, implementation, etc. over time. We can't just expect CA to be static / as-is and Karpenter to only follow what CA does, right?

So how about we start a WG to help with this with active members from the SIG? Can you think of any other techniques we can apply?

I agree with dims. I believe bringing karpenter into the sig-autoscaling group is the WG to help sort this out and it will take time. There is a significant engineering cost to converging the projects into one, which needs to be justified not just because they are "similar projects" and/or the desire for an easy switch between projects. It will be clearer over time the direction to take (combining the projects or not, having clearer areas of separation, alignment to make it easier to switch between the two, etc.) and having the shared knowledge and collaboration in the community in this space under a single WG will be valuable with whatever the direction.

+1 on the value proposition / decision making for people on which one to choose today. Both projects have unique advantages. People are going to do this analysis anyways with the projects under the same WG or not.

@ellistarn
Author

Thanks @gjtempleton and @mwielgus for the response.

Users rely on Karpenter for not just node autoscaling, but also node configuration, and node lifecycle management. Cluster Autoscaler delegates these other concerns (e.g. launch, upgrade, repair, interruption handling) to external systems. These differences in scope introduce ambiguity into what alignment means, but we’re absolutely interested in exploring this topic more concretely.

Per @dims’s suggestion, we propose a meeting series to more deeply understand the details of the SIG’s view of convergence and to identify concrete opportunities for technical alignment between the projects — both as they are today and how they could evolve in the future. If you agree, let’s align over Slack on a weekly one hour meeting time (PST and CEST friendly) and share the invite with the community.

@guidemetothemoon

I would also like to share my support for the proposal of putting Karpenter under CNCF governance. It is a great tool that contributes significantly to important areas of the Kubernetes ecosystem like cost optimization and sustainability. Choosing this path in order to make the tool vendor neutral and to gain wider community adoption, support, further development, and maintenance is undoubtedly beneficial.

I do share the same concerns regarding getting community maintainers and having the project follow the same processes and guidelines as any other CNCF subproject. But all in all I fully support the outlined strategy of including Karpenter as part of the CNCF landscape, and I will be happy to engage and contribute in further discussions and development of this initiative.

@hugobarona

Fully support this proposal, +1 👏

@elmiko
Contributor

elmiko commented Jun 5, 2023

So how about we start a WG to help with this with active members from the SIG? Can you think of any other techniques we can apply?

+1 to organizing a working group to help answer the questions around what does inclusion in the SIG mean, what are the points of alignment, and what does the future look like with a SIG that has multiple similar projects (e.g. how to organize, cross-promote, etc.)

@faermanj

faermanj commented Jun 8, 2023

@elmiko Let's get this working on okd.io :)

@gjtempleton
Member

Thanks @dims and @ellistarn for the responses.
We had some productive conversations on the weekly SIG call last week, and community members had a number of areas they agreed are ripe for discussion by the working group to work on convergence of the configuration where the two projects provide the same functionality (e.g. the ability to mark pods as safe/unsafe to evict).

Given this, we'd like to set up the meeting series for those interested and agree on these areas before we commit to take karpenter-core under the SIG's governance. To also clarify a point we believe has led to some confusion: we are not pushing for a merge of the two projects or anything that could inhibit innovation on either side. We just want to follow a principle of interoperability (to the extent that is possible and reasonable) and least surprise for users of both projects.

Let's work on agreeing some times that work for as many people as possible via the SIG Slack and invite along the members of the community with an interest (especially given the interest this proposal has shown people have).
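
As an illustration of the safe/unsafe-to-evict overlap mentioned above, here is a minimal sketch of a check that honors both projects' pod annotations. The annotation keys are taken from each project's documentation around the time of this thread, not from the thread itself, and may have been renamed in later releases. The `blocks_eviction` helper is hypothetical; neither project is claimed to implement a combined check like this.

```python
# Annotation keys as documented by each project at the time (assumption:
# they are not quoted in this thread and may differ in newer releases).
CA_SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"
KARPENTER_DO_NOT_EVICT = "karpenter.sh/do-not-evict"


def blocks_eviction(annotations: dict) -> bool:
    """Hypothetical helper: True if either annotation marks the pod as unsafe to evict."""
    # Cluster Autoscaler convention: safe-to-evict set to "false" opts the pod out.
    if annotations.get(CA_SAFE_TO_EVICT, "").lower() == "false":
        return True
    # Karpenter convention: do-not-evict set to "true" opts the pod out.
    if annotations.get(KARPENTER_DO_NOT_EVICT, "").lower() == "true":
        return True
    return False


if __name__ == "__main__":
    print(blocks_eviction({CA_SAFE_TO_EVICT: "false"}))
    print(blocks_eviction({KARPENTER_DO_NOT_EVICT: "true"}))
    print(blocks_eviction({}))
```

The two keys express the same intent with opposite polarity, which is the kind of user-facing difference a convergence effort could smooth over.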

@dims
Member

dims commented Jun 12, 2023

Thanks @gjtempleton! I will watch and participate as needed. Over to @ellistarn @jonathan-innis @njtran @jackfrancis and others who are hands-on to lead the way!

@jackfrancis

@gjtempleton +1 to those sentiments, I think defining well known areas of re-use, interoperability, roadmap (insert other important dimensions of collaborative software engineering) between existing SIG Autoscaling projects and karpenter will better predict for best-case long-term outcomes.

Adding some formality with a working group and documenting the progress of consensus can be a model for other SIGs, so we'll be doing good work here for the k8s ecosystem. :)

@tehlers320

Karpenter seems to clash with Kubernetes core concepts, such as making "preferred" basically "required" for pod affinity or has that changed?

@jonathan-innis
Contributor

jonathan-innis commented Jun 28, 2023

such as making "preferred" basically "required" for pod affinity or has that changed

Hey @tehlers320, I think this has changed at this point. Karpenter attempts to satisfy "preferred" pod constraints (such as topologySpreadConstraints or antiAffinity) on its first pass of scheduling. If it isn't able to satisfy these preferred requirements, it will then relax them until it is able to schedule the pod, so it effectively has a fallback method for attempting preferred constraints.
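
The fallback described above can be sketched roughly as follows. This is an illustrative model only, not Karpenter's actual scheduler code: the `Pod` type, the `schedule_with_relaxation` function, the `can_schedule` callback, and the choice to drop preferred constraints one at a time from the end of the list are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Pod:
    """Hypothetical pod model with constraints reduced to opaque strings."""
    name: str
    required: List[str] = field(default_factory=list)   # hard constraints, never dropped
    preferred: List[str] = field(default_factory=list)  # soft constraints, may be relaxed


def schedule_with_relaxation(
    pod: Pod, can_schedule: Callable[[List[str]], bool]
) -> Optional[List[str]]:
    """Try all constraints first, then relax preferred ones until the pod fits.

    Returns the preferred constraints that were still honored, or None if the
    pod cannot be scheduled even with only its required constraints.
    """
    preferred = list(pod.preferred)
    while True:
        if can_schedule(pod.required + preferred):
            return preferred
        if not preferred:
            return None
        # Relaxation order is an assumption: drop the last preferred constraint.
        preferred.pop()


if __name__ == "__main__":
    pod = Pod("web", required=["arch=amd64"],
              preferred=["pod-anti-affinity", "zone-spread"])
    # Pretend no capacity exists that can satisfy the zone spread.
    print(schedule_with_relaxation(pod, lambda cs: "zone-spread" not in cs))
```

Required constraints are never relaxed in this sketch, which matches the distinction between "preferred" and "required" raised in the question.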

Karpenter seems to clash with Kubernetes core concepts

As far as I am aware, Karpenter should respect all Kubernetes core scheduling concepts, but feel free to correct me if I'm wrong and there are areas of Karpenter that don't properly conform to the Kubernetes core scheduling specification. We definitely want to correct any deviation from this spec as soon as possible.

@roberthstrand

Just adding a +1 to this, makes a lot of sense.

@jonathan-innis
Contributor

Added some approvers boxes to mark when @mwielgus and @gjtempleton approve this request.

@ellistarn
Author

FYI, from the Kubernetes SIG Autoscaling Slack Channel:

mwielgus 2:56 AM
Hello. We have reached an agreement regarding the alignment and adoption of Karpenter to SIG Autoscaling.
The details are here: https://docs.google.com/document/d/1rHhltfLV5V1kcnKr_mKRKDC4ZFPYGP4Tde2Zy-LE72w/edit#heading=h.iof64m6gewln
Please take a look and let us know if you have any objections, questions or comments. We are waiting for your feedback until Friday, Oct 27th.

🎉

@gjtempleton
Member

Lazy consensus period has now passed without any blocking objections, happy to give this my full backing.

🚀

@dims
Member

dims commented Oct 29, 2023

[GIF image]

@jonathan-innis
Contributor

jonathan-innis commented Oct 30, 2023

Linking a separate issue for the actual repo move since that issue has all of the relevant information around permissions, template files, security contacts, etc: #4562

@ellistarn
Author

🎉 https://github.com/kubernetes-sigs/karpenter
