Windows Managed Node Group support #2350

Closed
bryantbiggs opened this issue Dec 17, 2022 · 19 comments · Fixed by #2858

@bryantbiggs
Member

Is your request related to a new offering from AWS?

Is your request related to a problem? Please describe.

Describe the solution you'd like.

  • Ability to create EKS managed node groups with Windows based nodes

Describe alternatives you've considered.

Additional context

@bryantbiggs
Member Author

Requires the AWS SDK version used by the Terraform AWS provider to be updated: hashicorp/terraform-provider-aws#28438

@chandrasekharkolla

Any update on this?

@bryantbiggs
Member Author

There aren't any code changes required, so in theory you can use it today. We will be adding an example and checking how it aligns with the existing Linux AL2 and Bottlerocket OS usage.

@sebas-w

sebas-w commented Feb 6, 2023

To create a Windows managed node group using this module, I can confirm that on version 18.31.2 you can specify the configuration below, as long as the following requirements are fulfilled.

Requirements

  • Your AWS Terraform provider is at least version v4.48.0, which allows you to pass in the correct AMI_TYPE for Windows EKS managed node group instances.
  • You already have a Linux EKS node group and nodes on your cluster. I confirmed with AWS Support that you cannot run a Windows-only EKS cluster, so a Linux node must already be in place before you launch any Windows nodes via the managed node group option.
  • Your EKS node role has the policy AmazonEKSVPCResourceController, which it should if you use this module, since the module already includes it.
  • You have enabled Windows support by adding the ConfigMap:
apiVersion: v1
data:
  enable-windows-ipam: "true"
immutable: false
kind: ConfigMap
metadata:
  name: amazon-vpc-cni
  namespace: kube-system
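
The provider version requirement in the first bullet can be enforced in Terraform itself. The following is a minimal sketch, assuming the standard `hashicorp/aws` provider source; only the `>= 4.48.0` constraint comes from the requirement above.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Minimum version stated above as needed for the Windows AMI types.
      version = ">= 4.48.0"
    }
  }
}
```

With this in place, `terraform init` fails early if an older provider would otherwise be selected, instead of failing later on an unrecognized `ami_type`.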

Example

eks_managed_node_groups = {
  windows = {
    min_size          = 1
    desired_size      = 1
    max_size          = 5
    platform          = "windows"
    ami_type          = "WINDOWS_CORE_2019_x86_64"
    capacity_type     = "SPOT"
    enable_monitoring = true
    disk_size         = "100"
    use_name_prefix   = true
    cluster_version   = var.aws_eks_cluster_version
    instance_types    = ["m5d.xlarge", "m5ad.xlarge"]
    taints = [
      {
        key    = "os"
        value  = "windows"
        effect = "NO_SCHEDULE"
      }
    ]
  },
}

@bryantbiggs
Member Author

thank you for sharing @sebas-w !

@enver

enver commented Feb 10, 2023

@sebas-w Thank you for sharing an example!
I was able to create a Windows managed node group as you described above and run a test pod on it. However, I'm unable to connect to any pod over the cluster's internal network. Access to other resources in the VPC or on the internet works without issue (apart from the obvious DNS resolution problems). Did you run into this problem?

@aamoctz
Contributor

aamoctz commented Feb 13, 2023

@sebas-w This does indeed work unless you set var.manage_aws_auth_configmap = true.
If that var is enabled then the module overwrites aws-auth configmap values set by EKS and in the process removes the eks:kube-proxy-windows line from the Windows node group in the aws-auth configmap.

local.node_iam_role_arns_windows currently does not look at module.eks_managed_node_group to determine whether platform == "windows". As a result, the module assumes all MNGs are Linux or Bottlerocket, and that line is removed from the config.

When var.manage_aws_auth_configmap = false:

mapRoles: |
  - "groups":
    - "eks:kube-proxy-windows"
    - "system:bootstrappers"
    - "system:nodes"
    "rolearn": "<windows_mng_role_arn>"
    "username": "system:node:{{EC2PrivateDNSName}}"

When var.manage_aws_auth_configmap = true:

mapRoles: |
  - "groups":
    - "system:bootstrappers"
    - "system:nodes"
    "rolearn": "<windows_mng_role_arn>"
    "username": "system:node:{{EC2PrivateDNSName}}"

@aamoctz
Contributor

aamoctz commented Feb 14, 2023

Has any work started related to this issue? I have some changes I can contribute to at least resolve the issue with manage_aws_auth_configmap removing eks:kube-proxy-windows, but if there's already work in progress I would rather not step on anyone's toes on this.

@noamgreen

See PR #2477. If someone can help push it forward, please do.

@trippinnik

I'm following @sebas-w's example from Feb 6, 2023 above, but the vpc-admission controller is not created. I see the AmazonEKSVPCResourceController role on the cluster role that was created.

Am I missing something else?

@robertobandini

Hi, I want to thank @sebas-w and @aamoctz; I was facing the same problems.

I started from version 18.31.2 with Linux managed node groups already in place, on EKS 1.22, platform version eks.10.
Then I set the AWS Terraform provider to version 4.48 and created the amazon-vpc-cni ConfigMap.

resource "kubernetes_config_map" "amazon_vpc_cni" {
  metadata {
    name      = "amazon-vpc-cni"
    namespace = "kube-system"
  }

  data = {
    enable-windows-ipam = "true"
  }
}

In the definition of the node group I just specified the platform and the AMI type:

myManagedNodeGroup =  {
      name         = "my-managed-node-group"
      platform     = "windows"
      ami_type     = "WINDOWS_CORE_2019_x86_64"
      ...
}

The node group was created. I then changed the module that builds EKS so that it correctly updates the aws-auth ConfigMap.
I later saw that @aamoctz had already proposed the same changes in #2477.

In main.tf

  ...
  node_iam_role_arns_non_windows = distinct(
    compact(
      concat(
        [for group in module.eks_managed_node_group : group.iam_role_arn if group.platform != "windows"],
        [for group in module.self_managed_node_group : group.iam_role_arn if group.platform != "windows"],
        var.aws_auth_node_iam_role_arns_non_windows,
      )
    )
  )

  node_iam_role_arns_windows = distinct(
    compact(
      concat(
        [for group in module.eks_managed_node_group : group.iam_role_arn if group.platform == "windows"],
        [for group in module.self_managed_node_group : group.iam_role_arn if group.platform == "windows"],
        var.aws_auth_node_iam_role_arns_windows,
      )
    )
  )
  ...

In modules/eks-managed-node-group/outputs.tf

output "platform" {
  description = "Identifies if the OS platform is `bottlerocket`, `linux`, or `windows` based"
  value       = var.platform
}
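
To make the effect of the platform filter concrete, here is a minimal standalone sketch of how the two ARN lists could be turned into aws-auth role mappings. It is not the module's actual code: the `node_groups` map, the role ARNs, and the `map_roles` local are hypothetical stand-ins, and only the filter expressions and the group names are taken from this thread.

```hcl
locals {
  # Hypothetical stand-in for the node group module outputs (iam_role_arn, platform).
  node_groups = {
    linux   = { iam_role_arn = "arn:aws:iam::111122223333:role/example-linux-mng", platform = "linux" }
    windows = { iam_role_arn = "arn:aws:iam::111122223333:role/example-windows-mng", platform = "windows" }
  }

  # Same filtering idea as above: split role ARNs by platform.
  role_arns_non_windows = distinct(compact(
    [for group in local.node_groups : group.iam_role_arn if group.platform != "windows"]
  ))
  role_arns_windows = distinct(compact(
    [for group in local.node_groups : group.iam_role_arn if group.platform == "windows"]
  ))

  # Only the Windows roles receive the extra eks:kube-proxy-windows group.
  map_roles = concat(
    [for arn in local.role_arns_non_windows : {
      rolearn  = arn
      username = "system:node:{{EC2PrivateDNSName}}"
      groups   = ["system:bootstrappers", "system:nodes"]
    }],
    [for arn in local.role_arns_windows : {
      rolearn  = arn
      username = "system:node:{{EC2PrivateDNSName}}"
      groups   = ["eks:kube-proxy-windows", "system:bootstrappers", "system:nodes"]
    }],
  )
}

# Renders the mappings as YAML, comparable to the mapRoles value of aws-auth.
output "map_roles_yaml" {
  value = yamlencode(local.map_roles)
}
```

Running `terraform apply` on this file alone (it needs no providers) and inspecting `map_roles_yaml` shows whether the Windows role keeps the `eks:kube-proxy-windows` group.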

In case it is useful: to avoid the "failed to parse Kubernetes args: pod does not have label vpc.amazonaws.com/PrivateIPv4Address" error when scheduling a pod, it is also important to set the appropriate nodeSelector:

nodeSelector:
     kubernetes.io/os: windows
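
For readers managing workloads through the Terraform Kubernetes provider (as with the ConfigMap above), a test deployment with this nodeSelector might look like the sketch below. The resource name, labels, and container image are assumptions for illustration, and the toleration is only needed if the node group carries the `os=windows` taint from the earlier example.

```hcl
resource "kubernetes_deployment" "windows_test" {
  metadata {
    name = "windows-test" # hypothetical name
  }

  spec {
    replicas = 1

    selector {
      match_labels = { app = "windows-test" }
    }

    template {
      metadata {
        labels = { app = "windows-test" }
      }

      spec {
        # Schedule only onto Windows nodes, as described above.
        node_selector = {
          "kubernetes.io/os" = "windows"
        }

        # Matches the NO_SCHEDULE taint (key "os", value "windows") from the
        # earlier node group example; omit if the nodes are not tainted.
        toleration {
          key      = "os"
          operator = "Equal"
          value    = "windows"
          effect   = "NoSchedule"
        }

        container {
          name = "iis"
          # Assumed test image; any Windows Server 2019 based image should do.
          image = "mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019"
        }
      }
    }
  }
}
```

This assumes the kubernetes provider is already configured against the cluster.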

I can confirm that with this setup I was able to create a Windows node group, apply a test deployment, and automatically scale the replicas and therefore the number of nodes.

It will be very useful once the module supports the modifications mentioned above.

@davidedmondsMPG

Is there anything that can be done to help get the associated PR reviewed and merged? It looks like it should solve this issue, which is a reasonably big impediment to working with Windows nodes in EKS.

@mlschindler

Bump for updates... Can we get this PR merged?


@bryantbiggs
Member Author

#2477 (comment)

@mlschindler

Now that #2477 is merged, is it possible to have the module provision EKS managed Windows nodes?

@bryantbiggs
Member Author

You can deploy Windows nodes with this module, but you will need to either use the default launch template provided by EKS, or provide your own launch template or user data when using a custom launch template. As I stated in the comment linked above, #2477 only addresses one small part of this: maintaining the IAM role mapping in the aws-auth configmap.

The Windows node support currently does not match that of AL2 and Bottlerocket in terms of native custom launch template and user data support.
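
As an illustration of the first option (the EKS-provided default launch template), a node group definition might look like the sketch below. It assumes a 19.x release of the module and its `use_custom_launch_template` input; the cluster name, Kubernetes version, instance types, and the two input variables are hypothetical, so check them against the module version in use.

```hcl
variable "vpc_id" { type = string }
variable "subnet_ids" { type = list(string) }

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0" # assumed module version

  cluster_name    = "example" # hypothetical
  cluster_version = "1.27"    # hypothetical
  vpc_id          = var.vpc_id
  subnet_ids      = var.subnet_ids

  eks_managed_node_groups = {
    # A Linux group must exist alongside the Windows group.
    linux = {
      instance_types = ["m5.large"]
      min_size       = 1
      max_size       = 3
      desired_size   = 1
    }

    windows = {
      platform = "windows"
      ami_type = "WINDOWS_CORE_2019_x86_64"

      # Assumption: fall back to the launch template that EKS provides
      # instead of the module's custom launch template.
      use_custom_launch_template = false

      instance_types = ["m5.xlarge"]
      min_size       = 1
      max_size       = 5
      desired_size   = 1
    }
  }
}
```

The trade-off is that settings normally carried by a custom launch template and user data are not available for the Windows group in this configuration.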

@antonbabenko
Member

This issue has been resolved in version 20.0.0 🎉
