Description
What were you trying to accomplish?
I was trying to recreate a nodegroup that had been deleted earlier. eksctl discovered that the previous CloudFormation stack had not been cleaned up properly (it was in DELETE_FAILED status) and recommended running eksctl delete nodegroup again for the failed nodegroup, so I did.
What happened?
eksctl delete nodegroup --region=us-east-2 --cluster=sdg-production --name=ng-1 panicked:
2021-12-20 16:14:04 [ℹ] eksctl version 0.77.0
2021-12-20 16:14:04 [ℹ] using region us-east-2
2021-12-20 16:14:05 [ℹ] 1 nodegroup (ng-1) was included (based on the include/exclude rules)
2021-12-20 16:14:06 [!] stack's status of nodegroup named eksctl-sdg-production-nodegroup-ng-1 is DELETE_FAILED
2021-12-20 16:14:06 [!] continuing with deletion, error occurred: error getting instance role ARN for nodegroup "ng-1": stack not found for nodegroup "ng-1"
2021-12-20 16:14:06 [ℹ] will drain 1 nodegroup(s) in cluster "sdg-production"
2021-12-20 16:14:07 [!] no nodes found in nodegroup "ng-1" (label selector: "alpha.eksctl.io/nodegroup-name=ng-1")
2021-12-20 16:14:07 [ℹ] will delete 1 nodegroups from cluster "sdg-production"
2021-12-20 16:14:08 [!] stack's status of nodegroup named eksctl-sdg-production-nodegroup-ng-1 is DELETE_FAILED
2021-12-20 16:14:08 [ℹ] 1 task: { no tasks }
2021-12-20 16:14:08 [ℹ] will delete 1 nodegroups from auth ConfigMap in cluster "sdg-production"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x1e2e22b]
goroutine 1 [running]:
github.com/weaveworks/eksctl/pkg/authconfigmap.RemoveNodeGroup({0x343b8f0, 0xc0001269a0}, 0xc0004b9c20)
	github.com/weaveworks/eksctl/pkg/authconfigmap/authconfigmap.go:319 +0x4b
github.com/weaveworks/eksctl/pkg/ctl/delete.doDeleteNodeGroup(0xc0001c2160, 0xc000c0cdc0, 0x1, 0x1, 0x0, 0x15, 0x0)
	github.com/weaveworks/eksctl/pkg/ctl/delete/nodegroup.go:142 +0xa4d
github.com/weaveworks/eksctl/pkg/ctl/delete.deleteNodeGroupCmd.func1(0xc000a3f8f0, 0xc000cbfce0, 0x7, 0x0, 0x0, 0xc0001c2160, 0x39)
	github.com/weaveworks/eksctl/pkg/ctl/delete/nodegroup.go:21 +0x19
github.com/weaveworks/eksctl/pkg/ctl/delete.deleteNodeGroupWithRunFunc.func1(0xc000c0e780, {0xc000a3f8f0, 0x3, 0x3})
	github.com/weaveworks/eksctl/pkg/ctl/delete/nodegroup.go:38 +0xdd
github.com/spf13/cobra.(*Command).execute(0xc000c0e780, {0xc000a3f8c0, 0x3, 0x3})
	github.com/spf13/cobra@v1.2.1/command.go:856 +0x60e
github.com/spf13/cobra.(*Command).ExecuteC(0xc000b27400)
	github.com/spf13/cobra@v1.2.1/command.go:974 +0x3bc
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.2.1/command.go:902
main.main()
	github.com/weaveworks/eksctl/cmd/eksctl/main.go:97 +0x525
I guess this is because this nodegroup was already removed from the auth ConfigMap by a previous invocation of eksctl delete nodegroup.
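If that guess is right, a nil/empty guard before touching the ConfigMap would turn the panic into a graceful skip. Below is a minimal runnable sketch of such a guard, assuming the crash comes from dereferencing an IAM/instance-role field that was never populated after the stack lookup failed; all types and names are illustrative stand-ins, not the actual eksctl source:

```go
package main

import "fmt"

// Illustrative stand-ins that mirror the shape of eksctl's nodegroup
// config; they are assumptions, not eksctl's real type definitions.
type NodeGroupIAM struct {
	InstanceRoleARN string
}

type NodeGroup struct {
	Name string
	IAM  *NodeGroupIAM
}

// removeFromAuthConfigMap sketches the guard: when the earlier stack
// lookup failed, IAM was never populated, and dereferencing it blindly
// segfaults exactly like the trace above.
func removeFromAuthConfigMap(ng *NodeGroup) error {
	if ng.IAM == nil || ng.IAM.InstanceRoleARN == "" {
		return fmt.Errorf("nodegroup %q has no instance role ARN; nothing to remove", ng.Name)
	}
	fmt.Printf("removing %s from aws-auth\n", ng.IAM.InstanceRoleARN)
	return nil
}

func main() {
	// Models the failure mode from the log: the role ARN lookup failed,
	// so IAM stays nil, and the guard skips the update instead of panicking.
	ng := &NodeGroup{Name: "ng-1"}
	if err := removeFromAuthConfigMap(ng); err != nil {
		fmt.Println("skipping auth ConfigMap update:", err)
	}
}
```

Skipping with a warning, rather than erroring out, would also match how eksctl already continues past the failed role-ARN lookup earlier in the run.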
How to reproduce it?
Have a nodegroup whose CloudFormation stack was somehow left in DELETE_FAILED status, then try to delete the nodegroup.
Anything else we need to know?
Adding --update-auth-configmap=false bypasses the panic, but since the CloudFormation stack is still in DELETE_FAILED status, running eksctl delete nodegroup again still crashes with the same error.
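For context on why the flag only sidesteps the crash rather than fixing the deletion, here is a rough, hypothetical model of the ordering (not the actual doDeleteNodeGroup source): the stack delete fails first and independently, and the nil dereference lives in the auth ConfigMap step that the flag disables.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical model of the delete flow, for illustration only.
var errStackDeleteFailed = errors.New("stack is in DELETE_FAILED")

// deleteStack stands in for the CloudFormation deletion, which keeps
// failing, so the stack stays in DELETE_FAILED on every run.
func deleteStack() error {
	return errStackDeleteFailed
}

// updateAuthConfigMap stands in for the step that panics; with
// --update-auth-configmap=false it is skipped entirely.
func updateAuthConfigMap(roleARN string) {
	fmt.Println("removing", roleARN, "from aws-auth")
}

func main() {
	const skipAuthUpdate = true // models --update-auth-configmap=false
	if err := deleteStack(); err != nil {
		fmt.Println("stack deletion failed:", err) // happens regardless of the flag
	}
	if !skipAuthUpdate {
		updateAuthConfigMap("") // empty ARN: the nil-deref territory shown above
	}
}
```

In other words, the flag only avoids the panic on a given run; it does nothing about the stuck stack itself.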
Versions
$ eksctl info
eksctl version: 0.77.0
kubectl version: v1.22.4
OS: linux