
Can't add nodegroup to an existing cluster created with old eksctl #518

Closed
kycfeel opened this Issue Feb 7, 2019 · 16 comments


kycfeel commented Feb 7, 2019


What help do you need?

Hey there.

So I recently updated my local eksctl to the latest release (0.1.20-rc.3) and tried to add a new nodegroup to an existing cluster that was created with a much, much older version of eksctl (I'm not sure exactly which version I used), but at first I got this error:

cluster compatibility check failed: shared node security group missing, to fix this run 'eksctl utils update-cluster-stack --name=testcluster --region=eu-west-1'

and after I ran eksctl utils update-cluster-stack --name=testcluster --region=eu-west-1 to solve that problem, I got this:

2019-02-07T18:48:11+09:00 [✖]  inssuficient number of subnets, at least 2x public and/or 2x private subnets are required

I already have 3 subnets in my cluster, but it's still telling me there aren't enough subnets.

I believe this is caused by a difference between the templates of the older version and the latest version.

Is there any way to resolve this conflict (or bug) without disrupting everything? I don't want to create a new cluster and migrate everything just for this.

Thanks a lot.

errordeveloper commented Feb 11, 2019

Could you please run 'eksctl utils describe-cluster-stacks --name=testcluster --region=eu-west-1' and share the output here?

nesv commented Feb 11, 2019

@errordeveloper I'm running into the same issue; I initialized my cluster using v0.1.8 (I think) and recently upgraded to 0.1.20-rc4. Here's my output from eksctl utils describe-stacks:

[ℹ]  stack/eksctl-testing-nodegroup-0 = {
  Capabilities: ["CAPABILITY_IAM"],
  CreationTime: 2018-10-30 21:59:54.974 +0000 UTC,
  Description: "EKS nodes (Amazon Linux 2 with SSH)  [created and managed by eksctl]",
  DisableRollback: false,
  EnableTerminationProtection: false,
  LastUpdatedTime: 2019-01-22 16:22:00.103 +0000 UTC,
  Outputs: [{
      ExportName: "eksctl-testing-nodegroup-0::InstanceRoleARN",
      OutputKey: "InstanceRoleARN",
      OutputValue: "arn:aws:iam::805157463669:role/eksctl-testing-nodegroup-0-NodeInstanceRole-UG6NPE8RO1OB"
    }],
  RollbackConfiguration: {

  },
  StackId: "arn:aws:cloudformation:us-east-1:805157463669:stack/eksctl-testing-nodegroup-0/22591e60-dc8f-11e8-b0f0-500c5cc81217",                                                                                                            
  StackName: "eksctl-testing-nodegroup-0",
  StackStatus: "UPDATE_COMPLETE",
  Tags: [{
      Key: "eksctl.cluster.k8s.io/v1alpha1/cluster-name",
      Value: "testing"
    }]
}
[ℹ]  stack/eksctl-testing-cluster = {
  Capabilities: ["CAPABILITY_IAM"],
  CreationTime: 2018-10-30 21:48:09.618 +0000 UTC,
  Description: "EKS cluster (with dedicated VPC & IAM role)  [created and managed by eksctl]",
  DisableRollback: false,
  EnableTerminationProtection: false,
  Outputs: [
    {
      ExportName: "eksctl-testing-cluster::SubnetsPrivate",
      OutputKey: "SubnetsPrivate",
      OutputValue: "subnet-017b0b7d9f2e5dcc4,subnet-0ff3aff13e3116fe1,subnet-05a6a77a3cd2490db"
    },
    {
      ExportName: "eksctl-testing-cluster::SubnetsPublic",
      OutputKey: "SubnetsPublic",
      OutputValue: "subnet-015fc34e1ba60ce71,subnet-01f0a996581cc5a8b,subnet-0144d8640a31167df"
    },
    {
      ExportName: "eksctl-testing-cluster::Endpoint",
      OutputKey: "Endpoint",
      OutputValue: "https://86107AFF7F84DC204487F23211E3A852.sk1.us-east-1.eks.amazonaws.com"
    },
    {
      ExportName: "eksctl-testing-cluster::VPC",
      OutputKey: "VPC",
      OutputValue: "vpc-0fbeb540a8ed33d54"
    },
    {
      OutputKey: "ClusterStackName",
      OutputValue: "eksctl-testing-cluster"
    },
    {
      OutputKey: "CertificateAuthorityData",
      OutputValue: "...redacted..."
    },
    {
      ExportName: "eksctl-testing-cluster::SecurityGroup",
      OutputKey: "SecurityGroup",
      OutputValue: "sg-00ef3c2edb0ac9bcd"
    },
    {
      ExportName: "eksctl-testing-cluster::ARN",
      OutputKey: "ARN",
      OutputValue: "arn:aws:eks:us-east-1:805157463669:cluster/testing"
    }
  ],
  RollbackConfiguration: {

  },
  StackId: "arn:aws:cloudformation:us-east-1:805157463669:stack/eksctl-testing-cluster/7de13210-dc8d-11e8-80c1-50d5ca6326f2",
  StackName: "eksctl-testing-cluster",
  StackStatus: "CREATE_COMPLETE",
  Tags: [{
      Key: "eksctl.cluster.k8s.io/v1alpha1/cluster-name",
      Value: "testing"
    }]
}

errordeveloper commented Feb 11, 2019

Just so you know, there is a workaround.
If the cluster in question is not running any critical workloads, you should be able to update the cluster stack, then delete the old nodegroup via the CloudFormation console, and add a new nodegroup after that; you will have downtime, but your workloads should eventually find a new home on the new nodegroup.
If it is critical to make this work without downtime, we can find a solution, but it's best to connect on Slack.
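For reference, a minimal dry-run sketch of those three steps, assuming the default eksctl stack naming (eksctl-<cluster>-nodegroup-<id>) and using the AWS CLI in place of the CloudFormation console the maintainer suggested; the cluster name, region, and nodegroup index are placeholders:

```shell
#!/bin/sh
# Dry-run sketch of the downtime-accepting workaround above.
# Adjust CLUSTER/REGION/NODEGROUP_STACK to match your setup.
CLUSTER="${CLUSTER:-testcluster}"
REGION="${REGION:-eu-west-1}"
NODEGROUP_STACK="eksctl-${CLUSTER}-nodegroup-0"
RUN="echo"   # set RUN="" to actually execute the commands

# 1. Bring the cluster stack up to date (adds the shared node security group).
$RUN eksctl utils update-cluster-stack --name="$CLUSTER" --region="$REGION"

# 2. Delete the old nodegroup stack (the CloudFormation console works too).
$RUN aws cloudformation delete-stack --stack-name "$NODEGROUP_STACK" --region "$REGION"

# 3. Create a replacement nodegroup with the current eksctl.
$RUN eksctl create nodegroup --cluster="$CLUSTER" --region="$REGION"
```

With RUN="echo" the script only prints the commands, so it can be reviewed safely before anything is deleted.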


nesv commented Feb 11, 2019

@errordeveloper I was running into the exact same issue as the original poster: running eksctl utils update-cluster-stack ... resulted in the following error:

[ℹ]  creating cluster stack "eksctl-testing-cluster"
[✖]  inssuficient number of subnets, at least 2x public and/or 2x private subnets are required

errordeveloper added this to the 0.1.20 - v1alpha4 milestone Feb 12, 2019


errordeveloper commented Feb 12, 2019

@nesv I can see this is a regression introduced in #513; I'll fix it and cut 0.1.20-rc.5 within the next two hours.


errordeveloper commented Feb 12, 2019

Looks like we will be able to cut 0.1.20 instead of 0.1.20-rc.5.


nesv commented Feb 12, 2019

Awesome! Thank you very much, @errordeveloper!

EDIT: Works like a charm. 😄


kycfeel commented Feb 12, 2019

@errordeveloper Wanna say thank you too! I'll try the new version very soon.


kycfeel commented Feb 13, 2019

@errordeveloper I'm experiencing an issue on eksctl 0.1.20 (upgraded via brew) while running eksctl utils update-cluster-stack 😭.

$ eksctl utils update-cluster-stack --name=testcluster --region=eu-west-1

[ℹ]  creating cluster stack "eksctl-testcluster-cluster"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x1c67139]

goroutine 1 [running]:
github.com/weaveworks/eksctl/pkg/apis/eksctl.io/v1alpha4.(*ClusterConfig).PublicSubnetIDs(0xc000638780, 0x48, 0x3114200, 0xc000bbe200)
	/go/src/github.com/weaveworks/eksctl/pkg/apis/eksctl.io/v1alpha4/vpc.go:87 +0x59
github.com/weaveworks/eksctl/pkg/apis/eksctl.io/v1alpha4.(*ClusterConfig).HasSufficientSubnets(0xc000638780, 0x0, 0xc000061000)
	/go/src/github.com/weaveworks/eksctl/pkg/apis/eksctl.io/v1alpha4/vpc.go:158 +0x2f
github.com/weaveworks/eksctl/pkg/cfn/builder.(*ClusterResourceSet).AddAllResources(0xc0009f7770, 0xc0009f7770, 0xc000b1f878)
	/go/src/github.com/weaveworks/eksctl/pkg/cfn/builder/cluster.go:38 +0x55
github.com/weaveworks/eksctl/pkg/cfn/manager.(*StackCollection).AppendNewClusterStackResource(0xc000686540, 0xc000638701, 0xc000686540, 0xd)
	/go/src/github.com/weaveworks/eksctl/pkg/cfn/manager/cluster.go:91 +0x66a
github.com/weaveworks/eksctl/pkg/ctl/utils.doUpdateClusterStacksCmd(0xc000074d00, 0xc000638780, 0x0, 0x0, 0x0, 0x32ab2e9)
	/go/src/github.com/weaveworks/eksctl/pkg/ctl/utils/update_cluster_stack.go:85 +0x1c4
github.com/weaveworks/eksctl/pkg/ctl/utils.updateClusterStackCmd.func1(0xc00063d680, 0xc00076d560, 0x0, 0x2)
	/go/src/github.com/weaveworks/eksctl/pkg/ctl/utils/update_cluster_stack.go:28 +0x87
github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra.(*Command).execute(0xc00063d680, 0xc00076d520, 0x2, 0x2, 0xc00063d680, 0xc00076d520)
	/go/src/github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra/command.go:766 +0x2cc
github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x516bce0, 0x1, 0x5b, 0x2b)
	/go/src/github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra/command.go:852 +0x2fd
github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra.(*Command).Execute(0x516bce0, 0x0, 0xc00062a1f8)
	/go/src/github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra/command.go:800 +0x2b
main.main()
	/go/src/github.com/weaveworks/eksctl/cmd/eksctl/main.go:65 +0x2d

Can you help me to solve this problem?


errordeveloper commented Feb 13, 2019

@kycfeel I found the root cause, but it'd be very helpful if you could also provide the output of:

  1. eksctl utils describe-cluster-stacks --name=testcluster --region=eu-west-1
  2. eksctl utils update-cluster-stack --name=testcluster --region=eu-west-1 --verbose=4

If you could provide these, I'll add a fix very shortly.

errordeveloper self-assigned this Feb 13, 2019


kycfeel commented Feb 13, 2019

Output from eksctl utils describe-stacks --name=testcluster --region=eu-west-1:

[ℹ]  stack/eksctl-testcluster-nodegroup-0 = {
  Capabilities: ["CAPABILITY_IAM"],
  CreationTime: 2019-01-03 02:57:38.155 +0000 UTC,
  Description: "EKS nodes (Amazon Linux 2 with SSH)  [created and managed by eksctl]",
  DisableRollback: false,
  EnableTerminationProtection: false,
  LastUpdatedTime: 2019-01-24 08:40:07.012 +0000 UTC,
  Outputs: [{
      ExportName: "eksctl-testcluster-nodegroup-0::NodeInstanceRoleARN",
      OutputKey: "NodeInstanceRoleARN",
      OutputValue: "arn:aws:iam::702434047870:role/eksctl-testcluster-nodegroup-0-NodeInstanceRole-O3CJX0MM98KZ"
    }],
  RollbackConfiguration: {

  },
  StackId: "arn:aws:cloudformation:eu-west-1:702434047870:stack/eksctl-testcluster-nodegroup-0/5414ab30-0f03-11e9-8488-0646713dba72",
  StackName: "eksctl-testcluster-nodegroup-0",
  StackStatus: "UPDATE_COMPLETE",
  Tags: [{
      Key: "eksctl.cluster.k8s.io/v1alpha1/cluster-name",
      Value: "testcluster"
    },{
      Key: "eksctl.cluster.k8s.io/v1alpha1/nodegroup-id",
      Value: "0"
    }]
}
[ℹ]  stack/eksctl-testcluster-cluster = {
  Capabilities: ["CAPABILITY_IAM"],
  CreationTime: 2019-01-03 02:47:59.055 +0000 UTC,
  Description: "EKS cluster (with dedicated VPC & IAM role)  [created and managed by eksctl]",
  DisableRollback: false,
  EnableTerminationProtection: false,
  Outputs: [
    {
      ExportName: "eksctl-testcluster-cluster::Subnets",
      OutputKey: "Subnets",
      OutputValue: "subnet-08615af6d19b539d7,subnet-01b1e00bdebe805f3,subnet-08c2f9be709c0bc68"
    },
    {
      ExportName: "eksctl-testcluster-cluster::Endpoint",
      OutputKey: "Endpoint",
      OutputValue: "https://9D0970526A71313E11147ED0409C4071.sk1.eu-west-1.eks.amazonaws.com"
    },
    {
      ExportName: "eksctl-testcluster-cluster::VPC",
      OutputKey: "VPC",
      OutputValue: "vpc-0d4a17337019a3366"
    },
    {
      OutputKey: "ClusterStackName",
      OutputValue: "eksctl-testcluster-cluster"
    },
    {
      OutputKey: "CertificateAuthorityData",
      OutputValue: "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRFNU1ERXdNekF5TlRZeE0xb1hEVEk0TVRJek1UQXlOVFl4TTFvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBS25JCjFSV09LbTE4bXBSRHcrc01abmI3U1ZuenVXL3N1bWR0SUFOTG43YVhMQlRNZ2dLV2p3dzNBUUxBTGQ4aDNLM0gKQmZoOENaV2d1a1VncGU2RG41ckhsM1hkWVNFNjB4ZEdYeFRQN25VT2VzQ1NCMEFvZ3NCMVNYQ1d5dEhyQmdHdwpObjdmRytoQWlzb3BmeExXblU1MmdBUzNjdUNpS3V0WkUwMzd6NkxhcGhGbzZqQmZROEErVUY3R3pmMmN0MGxCCmViMFYzZHBuOHYwSmpoQk4yemJscE1MNlY2cTNrSUoxcW80RnA0Q3Ayck9SRjdTV0Fabnc5ZUxMbmxZZm1DQWgKL2dIQThCbE42dEF3b05lVE81K1gxSDA5L2xraGlleEJuMmlDUVpwSisxNHlPdU9neVkwZW1rVzBSbWVobE9uQQpMdkNHSWRxZVdNRkFXUTZOTStFQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFCdkJKL25kVzJwOWxjR0hkczVNQWE5cXRnMncKdURTQWg0UlRLQVc2am9NREZrTk1Pb0U1NkJob1RSK3ZwWHV4K2VJNlFiV2xJWTlmQ3I2N2JqTC9QbGcveFlRSQptcFhSUnF4TG9TT25aMjFpWEY5akJhTTgzTjZ4MUFtTW1Wd1dPQ0VsYXBYSVIyZEtuV0hQSzhqMWlQQkhlakRhCnVGQVZ0SlI5eW9QWWtJRmUyL09mMDBTTUxOdngrb0w0L3BVbHlKRjgvV0dndi9nd0NITU53akVGWi9tRjhENjcKaENCQmhibjFxeHFXYVNkeTJxWFc5enZMV1I5RHlZdGQ1M0RZRVdQNVNUNXRONlZycFhMc216dE83UEJnRkt1eQpuNnpHRkZhUUZ5VXlrR2hNV2RzVHlDWTBNVitDUEsvNXpkMUdKUWRMa3N0U1REcW85azNxcXp3S3hNND0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo="
    },
    {
      ExportName: "eksctl-testcluster-cluster::SecurityGroup",
      OutputKey: "SecurityGroup",
      OutputValue: "sg-02b27112c5d0b83d3"
    },
    {
      ExportName: "eksctl-testcluster-cluster::ARN",
      OutputKey: "ARN",
      OutputValue: "arn:aws:eks:eu-west-1:702434047870:cluster/testcluster"
    }
  ],
  RollbackConfiguration: {

  },
  StackId: "arn:aws:cloudformation:eu-west-1:702434047870:stack/eksctl-testcluster-cluster/fae89040-0f01-11e9-bbff-503abe701c8d",
  StackName: "eksctl-testcluster-cluster",
  StackStatus: "CREATE_COMPLETE",
  Tags: [{
      Key: "eksctl.cluster.k8s.io/v1alpha1/cluster-name",
      Value: "testcluster"
    }]
}

Also, the output from eksctl utils update-cluster-stack --name=testcluster --region=eu-west-1 --verbose=4:

2019-02-13T18:21:24+09:00 [▶]  role ARN for the current session is "arn:aws:iam::702434047870:user/yechan"
2019-02-13T18:21:26+09:00 [▶]  cfg.json = \
{
    "kind": "ClusterConfig",
    "apiVersion": "eksctl.io/v1alpha4",
    "metadata": {
        "name": "testcluster",
        "region": "eu-west-1",
        "version": "1.11"
    },
    "iam": {},
    "vpc": {
        "id": "vpc-0d4a17337019a3366",
        "securityGroup": "sg-02b27112c5d0b83d3"
    }
}
2019-02-13T18:21:26+09:00 [ℹ]  creating cluster stack "eksctl-testcluster-cluster"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x1c67139]

goroutine 1 [running]:
github.com/weaveworks/eksctl/pkg/apis/eksctl.io/v1alpha4.(*ClusterConfig).PublicSubnetIDs(0xc000638980, 0x48, 0x3114200, 0xc00075cd80)
	/go/src/github.com/weaveworks/eksctl/pkg/apis/eksctl.io/v1alpha4/vpc.go:87 +0x59
github.com/weaveworks/eksctl/pkg/apis/eksctl.io/v1alpha4.(*ClusterConfig).HasSufficientSubnets(0xc000638980, 0x0, 0x517ab80)
	/go/src/github.com/weaveworks/eksctl/pkg/apis/eksctl.io/v1alpha4/vpc.go:158 +0x2f
github.com/weaveworks/eksctl/pkg/cfn/builder.(*ClusterResourceSet).AddAllResources(0xc000be1c20, 0xc000be1c20, 0xc000b0d878)
	/go/src/github.com/weaveworks/eksctl/pkg/cfn/builder/cluster.go:38 +0x55
github.com/weaveworks/eksctl/pkg/cfn/manager.(*StackCollection).AppendNewClusterStackResource(0xc00075c510, 0xc000638901, 0xc00075c510, 0xd)
	/go/src/github.com/weaveworks/eksctl/pkg/cfn/manager/cluster.go:91 +0x66a
github.com/weaveworks/eksctl/pkg/ctl/utils.doUpdateClusterStacksCmd(0xc0002cf2c0, 0xc000638980, 0x0, 0x0, 0x0, 0x32ab2e9)
	/go/src/github.com/weaveworks/eksctl/pkg/ctl/utils/update_cluster_stack.go:85 +0x1c4
github.com/weaveworks/eksctl/pkg/ctl/utils.updateClusterStackCmd.func1(0xc00063f400, 0xc0003ee9c0, 0x0, 0x3)
	/go/src/github.com/weaveworks/eksctl/pkg/ctl/utils/update_cluster_stack.go:28 +0x87
github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra.(*Command).execute(0xc00063f400, 0xc0003ee570, 0x3, 0x3, 0xc00063f400, 0xc0003ee570)
	/go/src/github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra/command.go:766 +0x2cc
github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x516bce0, 0x1, 0x5b, 0x2b)
	/go/src/github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra/command.go:852 +0x2fd
github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra.(*Command).Execute(0x516bce0, 0x0, 0xc0000e6260)
	/go/src/github.com/weaveworks/eksctl/vendor/github.com/spf13/cobra/command.go:800 +0x2b
main.main()
	/go/src/github.com/weaveworks/eksctl/cmd/eksctl/main.go:65 +0x2d

Thanks a lot!


errordeveloper commented Feb 13, 2019

@kycfeel OK, it looks like you have an old cluster that has only one set of subnets (public), instead of both public and private. I'll fix the panic, but I won't be able to make eksctl utils update-cluster-stack work correctly in this particular case. It's best if you migrate to a new cluster. If this is critical for you, please let me know and we can discuss how it can be solved.
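(The describe-stacks output earlier in the thread shows the old cluster stack exporting a single Subnets output, where newer stacks export SubnetsPublic and SubnetsPrivate. As a rough illustration of the kind of guard such a fix needs — using hypothetical types, not the actual eksctl v1alpha4 API — the subnet lookup has to tolerate a per-topology map that was never populated from the stack outputs:)

```go
package main

import "fmt"

// ClusterVPC is an illustrative stand-in for eksctl's VPC config. On
// clusters created by old eksctl versions, only a combined "Subnets"
// output exists, so the per-topology map below stays nil.
type ClusterVPC struct {
	Subnets map[string][]string // topology ("Public"/"Private") -> subnet IDs
}

// publicSubnetIDs returns nil instead of crashing when the VPC config
// or its subnet map is missing. (Indexing a nil map is safe in Go; the
// guard on vpc itself is what prevents a nil pointer dereference like
// the panic above.)
func publicSubnetIDs(vpc *ClusterVPC) []string {
	if vpc == nil || vpc.Subnets == nil {
		return nil
	}
	return vpc.Subnets["Public"]
}

func main() {
	// Old-style cluster: no per-topology subnets were discovered.
	fmt.Println(len(publicSubnetIDs(&ClusterVPC{})))
	// New-style cluster: public subnets present.
	v := &ClusterVPC{Subnets: map[string][]string{
		"Public": {"subnet-015fc34e1ba60ce71", "subnet-01f0a996581cc5a8b"},
	}}
	fmt.Println(len(publicSubnetIDs(v)))
}
```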


kycfeel commented Feb 13, 2019

@errordeveloper My production cluster, which was created with the same version of eksctl that I used to create testcluster, runs very critical workloads, so I really want to avoid migrating everything to a new cluster. I hope we can find a way to upgrade the existing one.


errordeveloper commented Feb 13, 2019

@kycfeel I've sent you an email.


errordeveloper commented Feb 13, 2019

The panic was fixed in ee18cd9. I'll be looking into #539 soon, so that we can set expectations earlier.


Aspekt112 commented Mar 17, 2019

@errordeveloper can you send that e-mail to me too? I ran into the same issue.
