cluster/aws/util.sh: kube-down doesn't respect custom VPC_ID #17219

Closed
peterbourgon opened this Issue Nov 13, 2015 · 24 comments

Comments

@peterbourgon
  1. export custom VPC_ID
  2. Invoke kube-up on AWS
  3. Verify cluster works
  4. Invoke kube-down

Expected behavior: all created k8s resources are removed.

Actual behavior: the script exits almost immediately, having not deleted anything.

It appears that kube-down blindly invokes get_vpc_id, but that should only happen if VPC_ID isn't already set; kube-up has the correct behavior.

peterbourgon commented Nov 13, 2015

This patch seems to resolve it, modulo style concerns.

diff --git a/cluster/aws/util.sh b/cluster/aws/util.sh
index 0c3f3a3..115abb0 100755
--- a/cluster/aws/util.sh
+++ b/cluster/aws/util.sh
@@ -1149,7 +1149,10 @@ function check-cluster() {
 }

 function kube-down {
-  local vpc_id=$(get_vpc_id)
+  local vpc_id=${VPC_ID}
+  if [[ "${vpc_id}" == "" ]]; then
+    vpc_id=$(get_vpc_id)
+  fi
   if [[ -n "${vpc_id}" ]]; then
     local elb_ids=$(get_elbs_in_vpc ${vpc_id})
     if [[ -n "${elb_ids}" ]]; then
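
For reference, the same guard can be written more compactly; this is a sketch against the same function, not the change that was merged:

function kube-down {
  # Prefer a caller-supplied VPC_ID; fall back to the tag-based lookup only when
  # it is unset or empty. ${VPC_ID:-...} also avoids an unbound-variable error
  # under `set -u`.
  local vpc_id=${VPC_ID:-$(get_vpc_id)}
  if [[ -n "${vpc_id}" ]]; then
    local elb_ids=$(get_elbs_in_vpc "${vpc_id}")
    # ... rest of the teardown unchanged ...
  fi
}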

peterbourgon commented Nov 13, 2015

...but with that, it still tries to delete the VPC, which it shouldn't do. And several resources, like route tables, cannot be deleted.

peterbourgon commented Nov 16, 2015

Additionally, when I create a k8s cluster with a declared VPC_ID, I can't create LoadBalancers (ELBs) with my services anymore:

$ kubectl describe svc helloworld
 . . .
CreatingLoadBalancerFailed      Error creating load balancer (will retry): 
  Failed to create load balancer for service default/helloworld: ValidationError: 
    Either AvailabilityZones or SubnetIds must be specified
 . . . 
CreatingLoadBalancerFailed      Error creating load balancer (will retry): 
  Failed to create load balancer for service default/helloworld: Throttling: 
    Rate exceeded
 . . .
CreatingLoadBalancerFailed      Error creating load balancer (will retry): 
  Failed to create load balancer for service default/helloworld: Throttling: 
    Rate exceeded
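
The ValidationError above suggests the cloud provider found no usable subnets for the ELB in the custom VPC. A quick way to list the subnets it could consider, and whether they carry the KubernetesCluster tag discussed later in this thread (a sketch; the VPC id is a placeholder):

# List subnets in the custom VPC, with their tags (vpc-12345678 is a placeholder id):
aws ec2 describe-subnets \
  --filters "Name=vpc-id,Values=vpc-12345678" \
  --query "Subnets[].[SubnetId,AvailabilityZone,Tags]" \
  --output json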

davidopp (Member) commented Nov 27, 2015

@justinsb any thoughts on this?

royklopper commented Dec 21, 2015

@peterbourgon Kubernetes checks the Amazon API for resources with the tag named "KubernetesCluster"; its value must match the name of the cluster. Hope this solves your problem.
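
For example, the tag can be inspected or added with the AWS CLI (a sketch; the resource id and cluster name are placeholders):

# Show any KubernetesCluster tag on a resource (subnet-12345678 is a placeholder):
aws ec2 describe-tags \
  --filters "Name=resource-id,Values=subnet-12345678" "Name=key,Values=KubernetesCluster"

# Add the tag so Kubernetes associates the resource with the cluster named "kubernetes":
aws ec2 create-tags \
  --resources subnet-12345678 \
  --tags Key=KubernetesCluster,Value=kubernetes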

peterbourgon commented Dec 22, 2015

@royklopper That's fine, and thank you, but it doesn't address the original issue...

justinsb modified the milestones: v1.2, v1.2-candidate Feb 10, 2016

davidopp modified the milestones: v1.2, v1.2-candidate Feb 19, 2016

bgrant0607 added the sig/aws label Mar 9, 2016

justinsb modified the milestones: v1.2, v1.3 Apr 1, 2016

yissachar commented Apr 7, 2016

I was able to get a little further by commenting out the lines that delete the VPC itself. Unfortunately, the script deletes all ELBs in the VPC regardless of whether they were created by Kubernetes or not. I ended up in a weird state with some resources deleted that I didn't want deleted, and some Kubernetes resources left around which I had to manually clean up.
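
A way to see which classic ELBs kube-down would sweep up for a given VPC before running it (a sketch; the VPC id is a placeholder):

# List every classic ELB in the VPC, Kubernetes-created or not:
aws elb describe-load-balancers \
  --query "LoadBalancerDescriptions[?VPCId=='vpc-12345678'].LoadBalancerName" \
  --output text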

thenayr commented Apr 26, 2016

This seems slightly concerning at the very least. I'm sure there are more than a few people who use custom VPCs (myself included) that would like to be able to tear down a cluster without clobbering any existing infrastructure inside of that VPC.

I just experienced this same problem: brought up a second cluster in a custom VPC, proceeded to run kube-down to destroy the cluster, and ended up completely clobbering an existing cluster in a different VPC. Thought I missed some env vars, but double-checked and they were all present.

davidopp (Member) commented May 27, 2016

@justinsb Can you give an update on this? It is marked with the 1.3 milestone.

activars (Contributor) commented Jun 9, 2016

This seems to be a critical issue... If there is no warning up front, it is likely to cause pain for users.

erictune (Member) commented Jun 15, 2016

@davidopp Is this a release blocker?

erictune (Member) commented Jun 15, 2016

The key question is whether this worked in 1.2 and stopped working, or whether it is just super annoying but never worked.

activars (Contributor) commented Jun 15, 2016

I suspect the execution in certain scenarios could be dangerous:

  • what if people want to leave the VPC untouched, but only destroy resources created by kube-up?
  • what if we want to keep the network settings?

After a few days of investigation, I believe a proper kube-down never worked (including security group cleanup, DHCP cleanup, routing table cleanup, and master deletion when a VPC id is specified).

Since I have stuff to ship, my short-term solution was an Ansible wrapper around the current kube-up that destroys resources correctly.

I can share the Ansible if it helps - let me know.

davidopp (Member) commented Jun 15, 2016

I discussed with @justinsb just now.

As indicated in the discussion thread here, this never worked. However, it's a pretty bad problem.

@justinsb said he would look into it tonight. We discussed three possible approaches for the 1.3 timeframe:

  • actually fix it (the long-term solution)
  • partially fix it, so we don't over-kill, but possibly leak resources
  • total hack: just add something to kube-down.sh that, if you're running on AWS, prints a warning like "if you are using custom VPC_ID, this may ", and asks you to press "y" to continue (a sketch of this option follows below).
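
A minimal sketch of that third option, assuming it would sit near the top of kube-down in cluster/aws/util.sh (the prompt text is illustrative, not merged code):

if [[ -n "${VPC_ID:-}" ]]; then
  echo "Warning: VPC_ID is set; kube-down may touch resources in a shared VPC." >&2
  read -r -p "Press 'y' to continue: " answer
  if [[ "${answer}" != "y" ]]; then
    echo "Aborting kube-down." >&2
    exit 1
  fi
fi
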
activars (Contributor) commented Jun 15, 2016

Can we make sure that it only destroys resources labelled/tagged as KubernetesCluster with the specified prefix?

There's an edge case: if tagged resources already exist before creation, it should not allow creating another cluster with the same tag.

The AWS role name seems to be a hard-coded value, so it should probably be left untouched or refactored to be named with the cluster prefix.
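
Following up on the tag-scoping question above, a sketch of how a teardown could restrict itself to tagged resources (the cluster name is a placeholder; this is not necessarily what util.sh does for every resource type):

CLUSTER_TAG="kubernetes"  # placeholder value of the KubernetesCluster tag

# Only instances carrying the cluster's tag would be candidates for deletion:
aws ec2 describe-instances \
  --filters "Name=tag:KubernetesCluster,Values=${CLUSTER_TAG}" \
  --query "Reservations[].Instances[].InstanceId" \
  --output text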

erictune (Member) commented Jun 15, 2016

@activars

Is there any way you could send a PR?

activars (Contributor) commented Jun 15, 2016

@erictune Yes, definitely.

I have most of the logic in the hacky Ansible script - it's just a matter of translating it into bash. I've been creating and destroying clusters in a pre-configured VPC 10 times a day without issues, so it should be straightforward to reproduce.

e.g. PR #27278 is a small change relevant to the topic (a preconfigured VPC already has a DHCP options set, and we shouldn't change its settings).

Do you mind giving me a week or two? I have some priorities at work, but I can sort this out in my free time. Looking at the 1.3 milestone due date of 24th June, I will try to make a PR before that date.

justinsb (Member) commented Jun 16, 2016

Sorry for the delay. But I've reviewed this in detail now, and I no longer see it as a particularly critical problem.

  • In normal operation, kube-up owns the VPC, and will delete resources correctly.
  • We have an experimental and undocumented option to reuse a VPC, by setting VPC_ID. In that case we will not tag the VPC, and kube-down won't do anything. That's not great, but it's not dangerous (other than to your EC2 bills).
  • If you change the kube-down script, you can get it to delete other resources.
  • If you manually tag the VPC with the KubernetesCluster tag (indicating it is k8s owned), then kube-down will try to tear it down, including resources in that VPC whether or not they are tagged with the KubernetesCluster annotation. Again, not great, but arguably not incorrect.
  • If you give two clusters the same id (using KUBE_AWS_INSTANCE_PREFIX) in the same region, then kube-down will likely try to delete both. This applies even if you specify a custom VPC_ID for one of them. If you don't use the VPC_ID option, it isn't possible to create two clusters with the same id. @thenayr I believe this is what you hit.

I'm inclined to say that the best fix here is for kube-down to print "we didn't find a VPC for that cluster. If you want to remove a shared cluster, please use X". Given kube-up & kube-down are now in maintenance mode, I think we should point people to just run upup delete cluster --name <clusterid> --region <myregion>, but kube-deploy is a "big tent" and we can have many options.

Downgrading to P2, on the basis that @thenayr's problem caused the upgrade to P0, but I now hypothesize this to be caused by using the same cluster-id in conjunction with VPC_ID. @thenayr do you think this was the problem? If not, do you have any idea what could have happened in your scenario?

I don't think this is a great situation, but I think there are two failure scenarios:

  1. we don't delete anything (and we should point users to try a more capable tool)
  2. if users specify VPC_ID it is possible to create two clusters with the same name, which will cause problems on kube-down (and I'd say that we've always planned to address these use cases in a supported way with kube-up v2). See the tag-lookup sketch after this list.
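
A sketch of the tag lookup kube-down relies on to find a cluster's VPC; if two clusters share a cluster id, the same query can surface the collision (the tag value is a placeholder, and the real script may filter on additional tags):

# VPCs tagged as owned by the cluster named "kubernetes" (placeholder value);
# more than one result suggests two clusters share an id:
aws ec2 describe-vpcs \
  --filters "Name=tag:KubernetesCluster,Values=kubernetes" \
  --query "Vpcs[].VpcId" \
  --output text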

justinsb added a commit to justinsb/kubernetes that referenced this issue Jun 16, 2016

justinsb (Member) commented Jun 16, 2016

And I opened #27518, which warns in the first scenario, and links to some task-based docs I am proposing adding to the kube-deploy repo (kubernetes/kube-deploy#114)

thenayr commented Jun 16, 2016

@justinsb Thanks for the lengthy update.

I do believe you are correct in your assumption that the KUBE_AWS_INSTANCE_PREFIX was the same across both clusters. I didn't realize at the time the extent to which the instance prefix is referenced throughout the scripts.

I still think the default behavior if a user specifies a custom VPC_ID to the kube-down script should be to retain the VPC and not attempt to tear it down.

erictune (Member) commented Jun 17, 2016

Thanks for the update, Justin.

justinsb (Member) commented Jun 18, 2016

> I still think the default behavior if a user specifies a custom VPC_ID to the kube-down script should be to retain the VPC and not attempt to tear it down.

I believe that is the behaviour. kube-down ignores VPC_ID, doesn't find a tagged VPC, and doesn't destroy anything. With #27518 it prints a hint as to why it isn't doing anything and points users to the more capable tools being developed in kube-deploy.

k8s-merge-robot added a commit that referenced this issue Jun 20, 2016

Merge pull request #27518 from justinsb/kubedown_warn_if_no_vpc
Automatic merge from submit-queue

AWS kube-down: Issue warning if VPC not found

To address issue #17219
goltermann (Contributor) commented Jun 20, 2016

Moving P2 code bugs out of v1.3 milestone. Please comment if you disagree and think a fix is required for 1.3.

goltermann modified the milestones: next-candidate, v1.3 Jun 20, 2016

justinsb (Member) commented Nov 15, 2016

We're not going to be making further changes to kube-up. The behaviour of kube-up is now to fail-safe, I believe. kops can delete a cluster safely (even with a shared VPC), including the case where it did not create a cluster (using the --external flag). Not entirely sure whether there's something I'm missing that remains to-do on this issue, but I'm sure someone will let me know if so ;-) Closing

justinsb closed this Nov 15, 2016
