Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add windows support #3698

Merged
merged 10 commits into from
Jun 20, 2023
Merged

feat: Add windows support #3698

merged 10 commits into from
Jun 20, 2023

Conversation

topikachu
Copy link
Contributor

@topikachu topikachu commented Apr 4, 2023

Fixes #1131

Description
This change try to integrate with the latest EKS window support.

Add two new Windows AMI Family, Windows2019 and Windows2022.
Only Core are supported OOTB.
Windows relevant code, "kubernetes.io/os" and "vpc.amazonaws.com/PrivateIPv4Address" is only active when Windows AMI Family defined in the AwsNodeTemplate

Users can use amiSelector to choose other AMIs.
Test on a Linux and Windows mixed system for provision and de-provision.

How was this change tested?

  • Test on a live system with mixed Windows and Linux workload.
  • Both provision and de-provision are tested

Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: #
  • No

New document is required after this change is approved.
Release Note

Add Windows support. A new Windows AMI Family is introduced.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@topikachu topikachu requested a review from a team as a code owner April 4, 2023 14:21
@netlify
Copy link

netlify bot commented Apr 4, 2023

Deploy Preview for karpenter-docs-prod canceled.

Name Link
🔨 Latest commit 4435349
🔍 Latest deploy log https://app.netlify.com/sites/karpenter-docs-prod/deploys/6490a06a4c745b00081c1da7

@ellistarn
Copy link
Contributor

Wow, thanks for writing this up. @jonathan-innis can you review?

@jonathan-innis jonathan-innis self-assigned this Apr 4, 2023
@topikachu topikachu force-pushed the feat_windows branch 2 times, most recently from 5c9cc40 to 99af8e5 Compare April 5, 2023 00:42
@topikachu
Copy link
Contributor Author

@jonathan-innis @ellistarn
Please tell me if more tests are required.
Not sure what kind of tests to add.

pkg/apis/v1alpha1/register.go Outdated Show resolved Hide resolved
pkg/providers/amifamily/bootstrap/windows.go Show resolved Hide resolved
pkg/providers/amifamily/windows.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/amifamily/windows.go Outdated Show resolved Hide resolved
pkg/providers/amifamily/resolver.go Outdated Show resolved Hide resolved
@topikachu
Copy link
Contributor Author

topikachu commented Apr 6, 2023

@jonathan-innis @rawahars
I need your suggestion about these:

  1. Is it OK to give four new Windows AMI Family names? If yes, shall we use the same long name as eksctl, like WindowsServer2019CoreContainer? I add 4 new Windows AMI families in short name, like WindowsServer2019Core .
  2. Shall we use a single value of the node requirement "kubernetes.io/os" or two values that have both Linux and Windows? If we choose two values, how to handle the default node labels created by schedule? This change may break the test should apply labels to the node Fixed.
  3. Is there a clear rule about the ec2 instances supported for Windows workload?

Please review the new change.

Thanks

@jonathan-innis
Copy link
Contributor

Is it OK to give four new Windows AMI Family names

Our general feeling is that an AMIFamily should be centered around the userData that is used to startup the node and attach it to the cluster. An AMIFamily should be a grouping of AMIs that have effectively the same configuration. With that in mind, I think it would be preferable to have Windows be a single AMIFamily and we can differentiate the different versions using a different mechanism.

Musing: I wonder if we can do this through the windows build number that we know will get populated to the node when the node registers with the cluster. We can support different AMI SSM aliases and have them be compatible with different sets of node.kubernetes.io/windows-build requirements

@jonathan-innis
Copy link
Contributor

Is there a clear rule about the ec2 instances supported for Windows workload

@rawahars Can you comment on this one?

@topikachu
Copy link
Contributor Author

topikachu commented Apr 7, 2023

Our general feeling is that an AMIFamily should be centered around the userData that is used to startup the node and attach it to the cluster. An AMIFamily should be a grouping of AMIs that have effectively the same configuration. With that in mind, I think it would be preferable to have Windows be a single AMIFamily and we can differentiate the different versions using a different mechanism.

It's reasonable but we need to first decide on the source of AMI. There are two options

  1. The AMI can be resolved by the AWSNodeTemplate. In this case, all parameters must be defined in this file. This can be implemented by multiple AMIFamilies, annotations, or new fields in AWSNodeTemplate
  2. The AMI can be resolved by the AWSNodeTemplate and Provisioner together. A user only defines a single AMI Family and declares the OS requirement in the provisioner. This needs to change both the code of provision and drift.

All are doable, and please shed some light on this.

@jonathan-innis
Copy link
Contributor

declares the OS requirement in the provisioner. This needs to change both the code of provision and drift.

I'd argue either the Provisioner or the workload is the best place. At a high-level, the philosophy behind the different components is:

  1. Provisioner defines a "world of possibilities" at the Cluster Admin level for a group of workloads
  2. Workload-based scheduling defines the "world of possibilities" at the App Dev level for single workloads. I think in general, most users are going to be nodeSelecting on a version of windows if their workloads only support a specific version of windows.

You can imagine a scenario where a version of Windows isn't defined in the Provisioner at all and its just defined at the workload level, which will be passed into the launched machine requirements and be used to select the correct Windows AMI version for that workload

@jonathan-innis
Copy link
Contributor

Referencing the Upstream Kubernetes Windows User Guide for more info on the windows build version and windows best-practices

@topikachu
Copy link
Contributor Author

Referencing the Upstream Kubernetes Windows User Guide for more info on the windows build version and windows best-practices

I'm afraid I have no idea to implement the Windows build version without AWS EKS team support.
There's no enough information to resolve the matched AMI until AWS can provide more metadata on each AMI.

I notice that the new AMIFamily DefaultAMIs method returns multiple AMIs and each AMI output has its own requirements. https://github.com/aws/karpenter/blob/629859c790b8410b78d9b9c134b02fe02e32165f/pkg/providers/amifamily/resolver.go#L75
It's possible that a single Windows AMI Family returns all the Windows AMIs with 2019/2022, core/full as part of the requirements.
However, it still needs some code change around https://github.com/aws/karpenter/blob/629859c790b8410b78d9b9c134b02fe02e32165f/pkg/cloudprovider/cloudprovider.go#L224 and https://github.com/aws/karpenter/blob/629859c790b8410b78d9b9c134b02fe02e32165f/pkg/providers/amifamily/resolver.go#L122

@jonathan-innis Any idea?

@jonathan-innis
Copy link
Contributor

It's possible that a single Windows AMI Family returns all the Windows AMIs with 2019/2022, core/full as part of the requirements.

@topikachu Yes, this is the idea, that the DefaultAMIs that are returned from the AMIFamily would contain the windows build numbers as requirements. Happy to help you along this path. Also, I'm not sure GH is the easiest place to have ad-hoc convos like this so feel free to reach out to me on the Kubernetes slack (I'm Jonathan Innis) on there as well or in our "karpenter-dev" channel on the Kubernetes slack.

@topikachu topikachu force-pushed the feat_windows branch 2 times, most recently from ff013a8 to a3b59a5 Compare April 21, 2023 10:10
@topikachu
Copy link
Contributor Author

@jonathan-innis Please check this new change.
It uses two ami families with core as default.

@jonathan-innis
Copy link
Contributor

@topikachu Can you rebase the change? It looks like you have a few conflicts in a few files in the PR

pkg/apis/v1alpha1/register.go Outdated Show resolved Hide resolved
pkg/providers/amifamily/windows.go Show resolved Hide resolved
pkg/providers/amifamily/windows.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Show resolved Hide resolved
Copy link
Contributor

@jonathan-innis jonathan-innis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Just some high-level nits

pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/amifamily/bootstrap/windows.go Show resolved Hide resolved
pkg/providers/amifamily/bootstrap/windows.go Outdated Show resolved Hide resolved
pkg/providers/amifamily/bootstrap/windows.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/suite_test.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/suite_test.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/suite_test.go Outdated Show resolved Hide resolved
pkg/providers/amifamily/windows.go Show resolved Hide resolved
pkg/apis/v1alpha1/register.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/suite_test.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Show resolved Hide resolved
pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Outdated Show resolved Hide resolved
pkg/providers/instancetype/types.go Show resolved Hide resolved
Copy link
Contributor

@jonathan-innis jonathan-innis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/karpenter snapshot

@github-actions
Copy link
Contributor

Snapshot successfully published to oci://public.ecr.aws/karpenter/karpenter:v0-1b4fec14fcf13723a6cf9ecbf386894b1e0d653c. Find the image tag and installation instructions at https://gallery.ecr.aws/karpenter/karpenter/

topikachu and others added 10 commits June 19, 2023 11:37
Fixes aws#1131

This change try to integrate with the latest EKS window support.

Add two new Windows AMI Family, Windows2019 and Windows2022.
Only Core are supported OOTB.
Windows relevant code, "kubernetes.io/os" and "vpc.amazonaws.com/PrivateIPv4Address" is only active when Windows AMI Family defined in the AwsNodeTemplate

Users can use amiSelector to choose other AMIs.
Test on a Linux and Windows mixed system for provision and de-provision.

Test on a live system with mixed Windows and Linux workload.
Both provision and de-provision are tested
1. Remove variant in windows family since we only support "Core" now.
2. Follow the test case convention to test labels.
1. Remove unused consts
2. Add SupportsENILimitedPodDensity feature flag.
3. Make kubeletExtraArgs as a generic method of amifamily.Options
4. Add ContainerRuntime parameter in windows bootstrap
5. Change ENILimitedPods, maxPods, privateIPv4Address
1. Fix the issue that double ContainerRuntime is configured in Windows
2. Refactor getOSRequirement
3. Refine test and one new to test pods number on windows node
Copy link
Contributor

@jonathan-innis jonathan-innis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/karpenter snapshot

@github-actions
Copy link
Contributor

Snapshot successfully published to oci://public.ecr.aws/karpenter/karpenter:v0-4435349baab8eb6cee4d04469a1524da6237bdd6. Find the image tag and installation instructions at https://gallery.ecr.aws/karpenter/karpenter/

Copy link
Contributor

@jonathan-innis jonathan-innis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/karpenter snapshot

@github-actions
Copy link
Contributor

Snapshot successfully published to oci://public.ecr.aws/karpenter/karpenter:v0-4435349baab8eb6cee4d04469a1524da6237bdd6. Find the image tag and installation instructions at https://gallery.ecr.aws/karpenter/karpenter/

Copy link
Contributor

@jonathan-innis jonathan-innis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀

Copy link
Contributor

@bwagner5 bwagner5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link

@rawahars rawahars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚢

@jonathan-innis jonathan-innis enabled auto-merge (squash) June 20, 2023 17:11
@jonathan-innis jonathan-innis merged commit 1fef30f into aws:main Jun 20, 2023
17 checks passed
@topikachu
Copy link
Contributor Author

Thank you for accepting my code change. I'm excited that the addition of Windows support to Karpeneter will benefit all users.

If there are any further modifications needed, please let me know. I'm eager to contribute more to the Karpeneter project and assist in any way I can.

Thanks again for the opportunity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Windows support
7 participants