NodeCreationFailure: Instances failed to join the kubernetes cluster. This is happening on a fresh cluster. #2149
Comments
Maybe you need to raise the fleet quota.
@tanvp112 I'm not sure this is about any quota increase, as I'm creating just one node. Have you faced the same issue?
I have the same issue. Anybody know why?
The following config worked for me. I still don't know why it worked, though; there seems to be some race condition.

```hcl
terraform {
  required_version = "~> 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  region  = "us-east-1"
  profile = "ADD NAME OF AWS PROFILE OR SET CREDS EXPLICITLY"
}

data "aws_eks_cluster" "default" {
  name = module.eks_default.cluster_id

  depends_on = [
    module.eks_default.cluster_id,
  ]
}

data "aws_eks_cluster_auth" "default" {
  name = module.eks_default.cluster_id

  depends_on = [
    module.eks_default.cluster_id,
  ]
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.default.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.default.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.default.token
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.default.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.default.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.default.token
  }
}

################################################################################
# Common Locals
################################################################################

locals {
  # Used to determine correct partition (i.e. - `aws`, `aws-gov`, `aws-cn`, etc.)
  partition = data.aws_partition.current.partition
}

################################################################################
# Common Data
################################################################################

data "aws_partition" "current" {}
data "aws_caller_identity" "current" {}

################################################################################
# Common Modules
################################################################################

module "tags" {
  # tflint-ignore: terraform_module_pinned_source
  source = "github.com/clowdhaus/terraform-tags"

  application = "someclustername"
  environment = "nonprod"
  repository  = "https://github.com/clowdhaus/eks-reference-architecture"
}

################################################################################
# EKS Modules
################################################################################

module "vpc" {
  # https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/latest
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.12"

  name = "someclustername"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway     = true
  single_nat_gateway     = true
  one_nat_gateway_per_az = false
  enable_dns_hostnames   = true

  manage_default_network_acl = true
  default_network_acl_tags   = { Name = "someclustername-default" }

  manage_default_route_table = true
  default_route_table_tags   = { Name = "someclustername-default" }

  manage_default_security_group = true
  default_security_group_tags   = { Name = "someclustername-default" }

  public_subnet_tags = {
    "kubernetes.io/cluster/someclustername-default" = "shared"
    "kubernetes.io/role/elb"                        = 1
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/someclustername-default" = "shared"
    "kubernetes.io/role/internal-elb"               = 1
  }

  tags = module.tags.tags
}

module "eks_default" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 18.26"

  cluster_name    = "someclustername-default"
  cluster_version = "1.22"

  # EKS Addons
  cluster_addons = {
    coredns = {
      resolve_conflicts = "OVERWRITE"
    }
    kube-proxy = {}
    vpc-cni = {
      resolve_conflicts = "OVERWRITE"
    }
  }

  # Encryption key
  create_kms_key = true
  cluster_encryption_config = [{
    resources = ["secrets"]
  }]

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  eks_managed_node_groups = {
    default = {
      # By default, the module creates a launch template to ensure tags are propagated to instances, etc.,
      # so we need to disable it to use the default template provided by the AWS EKS managed node group service
      create_launch_template = false
      launch_template_name   = ""

      # List of pods per instance type: https://github.com/awslabs/amazon-eks-ami/blob/master/files/eni-max-pods.txt
      # or run: kubectl get node -o yaml | grep pods
      instance_types = ["t2.xlarge"]
      disk_size      = 50

      # Is deprecated and will be removed in v19.x
      create_security_group = false

      min_size     = 1
      max_size     = 3
      desired_size = 1

      update_config = {
        max_unavailable_percentage = 33
      }
    }
  }

  tags = module.tags.tags
}
```
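Worth noting in the config above: `create_launch_template = false` makes the managed node group fall back to the default launch template supplied by the EKS service. Given the `iamInstanceProfile.name is invalid` CloudTrail error reported further down in this thread, the race may be between the module-created launch template and the instance profile it references, though that is only a guess based on what is posted here.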
@arnav13081994 Below is my code:

```hcl
# IAM Role for EKS to have access to the appropriate resources
resource "aws_iam_role" "eks-iam-role" {
  path = "/"

  assume_role_policy = <<EOF
# ...
EOF
}

# Attach the IAM policy to the IAM role
resource "aws_iam_role_policy_attachment" "AmazonEKSClusterPolicy" {
  # ...
}

# Create the EKS cluster
resource "aws_eks_cluster" "devopsthehardway-eks" {
  vpc_config {
    # ...
  }

  depends_on = [
    # ...
  ]
}

# Worker Nodes
resource "aws_iam_role" "workernodes" {
  assume_role_policy = jsonencode({
    # ...
  })
}

resource "aws_iam_role_policy_attachment" "AmazonEKSWorkerNodePolicy" {
  # ...
}

resource "aws_iam_role_policy_attachment" "AmazonEKS_CNI_Policy" {
  # ...
}

resource "aws_iam_role_policy_attachment" "EC2InstanceProfileForImageBuilderECRContainerBuilds" {
  # ...
}

resource "aws_iam_role_policy_attachment" "AmazonEC2ContainerRegistryReadOnly" {
  # ...
}

resource "aws_eks_node_group" "worker-node-group" {
  scaling_config {
    # ...
  }

  depends_on = [
    # ...
  ]
}
```
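For anyone comparing against the skeleton above: a minimal, self-contained version of the worker-node role and node group might look like the sketch below. The trust policy, the managed-policy ARNs, and the resource wiring are standard for EKS managed node groups; the role name, subnet IDs, and sizing values are illustrative placeholders, not taken from the original comment.

```hcl
# Hypothetical reconstruction -- names and values are illustrative only.
resource "aws_iam_role" "workernodes" {
  name = "eks-node-group-role"

  # Standard trust policy so EC2 instances can assume the node role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

# Managed policies every EKS worker node role needs
resource "aws_iam_role_policy_attachment" "AmazonEKSWorkerNodePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = aws_iam_role.workernodes.name
}

resource "aws_iam_role_policy_attachment" "AmazonEKS_CNI_Policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = aws_iam_role.workernodes.name
}

resource "aws_iam_role_policy_attachment" "AmazonEC2ContainerRegistryReadOnly" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = aws_iam_role.workernodes.name
}

resource "aws_eks_node_group" "worker-node-group" {
  cluster_name    = aws_eks_cluster.devopsthehardway-eks.name
  node_group_name = "worker-node-group"
  node_role_arn   = aws_iam_role.workernodes.arn
  subnet_ids      = ["subnet-aaaa", "subnet-bbbb"] # placeholders

  scaling_config {
    desired_size = 1
    max_size     = 3
    min_size     = 1
  }

  # Ensure permissions exist before instances try to join the cluster
  depends_on = [
    aws_iam_role_policy_attachment.AmazonEKSWorkerNodePolicy,
    aws_iam_role_policy_attachment.AmazonEKS_CNI_Policy,
    aws_iam_role_policy_attachment.AmazonEC2ContainerRegistryReadOnly,
  ]
}
```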
Same error. It's a new AWS account with very few EC2 instances, so something else is wrong when this is done via Terraform automation or eksctl:

```
unexpected state 'CREATE_FAILED', wanted target 'ACTIVE'. last error: 2 errors occurred:
```
Same error - tried on
Anyone figured this out?
I am receiving this error as well. In the CloudTrail logs for the failed event, I see:

```json
{
  "errorCode": "Client.InvalidParameterValue",
  "errorMessage": "Value (eks-xxxx) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name"
}
```

A possible workaround is creating your own EC2 launch template and then using that in the node_group definition; however, you would need to replicate the launch template EKS uses by default: https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html

I have not yet been able to do this.
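To make that workaround concrete, a sketch using plain `aws_launch_template` and `aws_eks_node_group` resources follows. The resource names, instance type, and disk size are illustrative assumptions; the AWS docs linked above list which settings EKS's default template normally supplies.

```hcl
# Hypothetical custom launch template for the managed node group.
resource "aws_launch_template" "eks_nodes" {
  name_prefix   = "eks-nodes-"
  instance_type = "t2.xlarge"

  # Mirror the root volume the default EKS template would configure
  block_device_mappings {
    device_name = "/dev/xvda"
    ebs {
      volume_size = 50
      volume_type = "gp3"
    }
  }
}

resource "aws_eks_node_group" "with_custom_lt" {
  cluster_name    = "someclustername-default"     # placeholder
  node_group_name = "with-custom-lt"
  node_role_arn   = aws_iam_role.workernodes.arn  # assumes the node role exists
  subnet_ids      = module.vpc.private_subnets

  # When a launch template is supplied, EKS uses it instead of its default;
  # instance_types/disk_size must then live in the template, not here.
  launch_template {
    id      = aws_launch_template.eks_nodes.id
    version = aws_launch_template.eks_nodes.latest_version
  }

  scaling_config {
    desired_size = 1
    max_size     = 3
    min_size     = 1
  }
}
```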
Getting the same error as well today. Currently looking into it.
Be sure you're not creating in a private subnet; that was the issue for me.
[FIXED] Run the automated runbook to see the actual issue. In our case, it was an issue with the security group and the user data script.
The issue was that I had restricted the
@danvau7 I'm getting the error even after setting `cluster_endpoint_private_access` to true. Can anyone help out here? It's really frustrating.
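For reference, a sketch of the relevant endpoint toggles in the terraform-aws-eks v18 module is below. The variable names match the module's v18 inputs, but the combination shown (public endpoint open plus private access for nodes in private subnets) is just one plausible setup, not a confirmed fix from this thread:

```hcl
module "eks_default" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 18.26"

  cluster_name    = "someclustername-default"
  cluster_version = "1.22"

  # Nodes in private subnets reach the control plane via the private endpoint;
  # if the public endpoint is restricted, this must be enabled for them to join.
  cluster_endpoint_private_access = true

  # Keep the public endpoint reachable (optionally restricted by CIDR)
  cluster_endpoint_public_access       = true
  cluster_endpoint_public_access_cidrs = ["0.0.0.0/0"]

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets
}
```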
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
Description

I followed the docs and have exhausted all the resources online but am still not able to create an EKS cluster with EKS Managed Nodes. I always get the following error:

Versions

Reproduction Code [Required]

Steps to reproduce the behavior:

Just run `terraform apply --auto-approve`, and after waiting for about 20 minutes you will see the aforementioned error.

Expected behavior

The EKS cluster with 1 EKS managed node group gets created.

Actual behavior

The following error is thrown:

Additional context

I have read other similar issues and have experimented with `iam_role_attach_cni_policy = true` but still get the same issue. Any help would be greatly appreciated. This has been extremely frustrating for me.
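As a point of comparison, this is where that flag sits in the v18 module. A minimal sketch, assuming the node group layout from the config earlier in the thread; the flag attaches `AmazonEKS_CNI_Policy` to the node IAM role, which nodes need for pod networking before they can join:

```hcl
module "eks_default" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 18.26"

  # ...cluster settings as above...

  eks_managed_node_groups = {
    default = {
      # Attach AmazonEKS_CNI_Policy to the node role (module default is true);
      # without CNI permissions, nodes never become Ready and fail to join.
      iam_role_attach_cni_policy = true

      min_size     = 1
      max_size     = 3
      desired_size = 1
    }
  }
}
```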