When var.manage_aws_auth_configmap = true, the Windows Managed Node Group reports AccessDenied health issues in the AWS Console #2471

Closed
aamoctz opened this issue Feb 13, 2023 · 2 comments

Comments

@aamoctz
Contributor

aamoctz commented Feb 13, 2023

Description

When creating a Windows Managed Node Group and var.manage_aws_auth_configmap = true, the AWS Console reports an AccessDenied error for worker nodes.

The instance role is present and configured correctly; however, its entry in the aws-auth configmap is not. It is missing the eks:kube-proxy-windows group, which leads to the AccessDenied issues reported in the Console.

When var.manage_aws_auth_configmap = false:

mapRoles: |
  - "groups":
    - "eks:kube-proxy-windows"
    - "system:bootstrappers"
    - "system:nodes"
    "rolearn": "<windows_mng_role_arn>"
    "username": "system:node:{{EC2PrivateDNSName}}"

When var.manage_aws_auth_configmap = true:

mapRoles: |
  - "groups":
    - "system:bootstrappers"
    - "system:nodes"
    "rolearn": "<windows_mng_role_arn>"
    "username": "system:node:{{EC2PrivateDNSName}}"

The root cause appears to be that local.node_iam_role_arns_windows currently does not look at module.eks_managed_node_groups to determine whether platform == "windows". As a result, the module assumes all managed node groups are Linux or Bottlerocket, and eks:kube-proxy-windows is dropped from the Windows MNG's entry in the configmap.
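
The module's internals are not reproduced in this issue, so the following is only a rough, hypothetical sketch (names approximate, based on the locals referenced above) of how the managed node groups' platform could be taken into account:

locals {
  # Hypothetical sketch, not the module's actual code: collect the IAM role
  # ARNs of EKS managed node groups whose platform is "windows" so they can
  # be mapped with the eks:kube-proxy-windows group in aws-auth.
  eks_managed_node_group_role_arns_windows = [
    for key, group in module.eks_managed_node_groups :
    group.iam_role_arn
    if try(var.eks_managed_node_groups[key].platform, "linux") == "windows"
  ]

  # This list would then be merged into node_iam_role_arns_windows alongside
  # the Windows roles already collected from self-managed node groups.
}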

If your request is for a new feature, please use the Feature request template.

  • [x] ✋ I have searched the open/closed issues and my issue is not listed.

⚠️ Note

Before you submit an issue, please perform the following first:

  1. Remove the local .terraform directory (ONLY if state is stored remotely, which is hopefully the best practice you are following): rm -rf .terraform/
  2. Re-initialize the project root to pull down modules: terraform init
  3. Re-attempt your terraform plan or apply and check if the issue still persists

Versions

  • Module version [Required]: v18.31.2

  • Terraform version: 1.2.2

  • Provider version(s): 4.53.0

Reproduction Code [Required]

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "18.31.2"

  manage_aws_auth_configmap = true

  eks_managed_node_groups = {
    bottlerocket-test-group = {
      platform  = "bottlerocket"
      ami_type = "BOTTLEROCKET_x86_64"

      subnet_ids = <subnet_ids>

      instance_types = ["m5.xlarge"]

      desired_size = 2
      max_size     = 3
      min_size     = 2

      block_device_mappings = {
        xvda = {
          device_name = "/dev/xvda"
          ebs = {
            volume_size           = 2
            volume_type           = "gp3"
            iops                  = 3000
            throughput            = 125
            encrypted             = true
            delete_on_termination = true
          }
        }
        xvdb = {
          device_name = "/dev/xvdb"
          ebs = {
            volume_size           = 100
            volume_type           = "gp3"
            iops                  = 3000
            throughput            = 125
            encrypted             = true
            delete_on_termination = true
          }
        }
      }
    }

    windows-test-group = {
      platform = "windows"
      ami_type = "WINDOWS_CORE_2019_x86_64"

      subnet_ids = <subnet_ids>
    
      instance_types = ["r5.xlarge"]

      desired_size = 2
      max_size     = 3
      min_size     = 2

      block_device_mappings = {
        sda1 = {
          device_name = "/dev/sda1"
          ebs = {
            volume_size           = 100
            volume_type           = "gp3"
            iops                  = 3000
            throughput            = 125
            encrypted             = true
            delete_on_termination = true
          }
        }
      }
    }
  }
}

Steps to reproduce the behavior:

We make heavy use of Terragrunt and wrap this module inside another module, but all that is needed to reproduce is to set manage_aws_auth_configmap = true and create two eks_managed_node_groups, where one is platform = "linux" or "bottlerocket" and the other is platform = "windows".

Expected behavior

A Windows Managed Node Group with no connectivity issues.

Actual behavior

Windows nodes join the cluster but have connectivity issues.
The AWS Console reports AccessDenied errors under the node group's Health Issues.

Terminal Output Screenshot(s)

Additional context

This was tested using v18.32.1, but it most likely affects the latest release as well.
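
As a possible interim workaround (only a sketch, not verified in this report), the module's aws_auth_roles input could be used to map the Windows role explicitly so that the missing group is present:

module "eks" {
  # ...same configuration as the reproduction above...

  # Sketch of a possible workaround (not verified): explicitly map the Windows
  # MNG role with the eks:kube-proxy-windows group via aws_auth_roles.
  aws_auth_roles = [
    {
      rolearn  = "<windows_mng_role_arn>"
      username = "system:node:{{EC2PrivateDNSName}}"
      groups   = [
        "eks:kube-proxy-windows",
        "system:bootstrappers",
        "system:nodes",
      ]
    },
  ]
}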

@bryantbiggs
Member

This will be addressed in #2350.

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators on Mar 16, 2023