Delete failed #129

Closed
vlerenc opened this issue Apr 16, 2018 · 3 comments
Comments

@vlerenc
Member

vlerenc commented Apr 16, 2018

I created a couple of clusters for testing and deleted them. I did not use the programmatic interface, did not touch the IaaS resources, and did not open the console, but I still got this error (canary: core/vlerencaw9):

Delete Processing
Currently executing (CloudBotanist).DestroyInfrastructure
Last Error
Operation failed as there are dependent objects on cloud provider level
Failed to delete Shoot cluster: Errors occurred during flow execution: '(CloudBotanist).DestroyInfrastructure' returned 'Terraform execution job 'vlerencaw9.infra.tf-job' could not be completed. The following issues have been found in the logs:

-> Pod 'vlerencaw9.infra.tf-job-zpldk' reported:
* aws_vpc.vpc (destroy): 1 error(s) occurred:
* aws_vpc.vpc: DependencyViolation: The vpc 'vpc-3264fd59' has dependencies and cannot be deleted.
	status code: 400, request id: <omitted>'
@vlerenc vlerenc added the kind/bug Bug label Apr 16, 2018
@rfranzke
Member

Hm, that's the same issue we have occasionally seen in the past. The kube-controller-manager sometimes fails to delete load balancers or security groups from AWS (more often security groups). Generally, it deletes the load balancers first and the corresponding security groups second. However, AWS takes a while to detach the network interface from the load balancer and blocks the deletion of the security group until then.
The only thing we can do about it is to delete those resources on our own... let's discuss.
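
To illustrate the timing problem described above, here is a minimal sketch of retrying a security group deletion while AWS still reports a dependency. It is not Gardener's implementation: `DependencyViolation`, `delete_with_retry`, and `FakeSecurityGroup` are hypothetical names, and the fake group only simulates a network interface that stays attached for a few calls.

```python
import time


class DependencyViolation(Exception):
    """Hypothetical stand-in for the AWS 'DependencyViolation' error code."""


def delete_with_retry(delete, attempts=5, delay=1.0, sleep=time.sleep):
    """Call delete() until it succeeds or the attempts run out.

    Returns the number of attempts used. Re-raises the last
    DependencyViolation if the resource is still blocked.
    """
    for attempt in range(1, attempts + 1):
        try:
            delete()
            return attempt
        except DependencyViolation:
            if attempt == attempts:
                raise
            # Exponential backoff: delay, 2*delay, 4*delay, ...
            sleep(delay * 2 ** (attempt - 1))


class FakeSecurityGroup:
    """Simulates a group that stays blocked while an interface is attached."""

    def __init__(self, blocked_calls):
        self.blocked_calls = blocked_calls
        self.deleted = False

    def delete(self):
        if self.blocked_calls > 0:
            self.blocked_calls -= 1
            raise DependencyViolation("resource has a dependent object")
        self.deleted = True
```

In a real cleanup routine, `delete` would wrap the actual cloud API call, and the attempt count and delay would need to be tuned to how long the interface detachment usually takes.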

@vlerenc
Member Author

vlerenc commented Apr 16, 2018

Let's wait for the smoke tests. If they fail, we should do something. The only thing we can do is delete the security groups ourselves rather than wait for the controller manager to (not) do it.

@rfranzke
Member

FYI: I've prepared an implementation that deletes the ELB security groups, so we can roll it out quickly if we need it.

@vlerenc vlerenc added the component/gardener Gardener label Jun 27, 2018
@gardener-robot-ci-1 gardener-robot-ci-1 added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 27, 2018
richardyuwen pushed a commit to richardyuwen/gardener that referenced this issue Mar 26, 2019
This is a mitigation for kubernetes/kubernetes#17626 and the not yet merged pull requests kubernetes/kubernetes#54569 and kubernetes/kubernetes#65912.
The kube-controller-manager does not add a finalizer to `Service` objects and may forget that it has created resources in the infrastructure due to this.

closes gardener#129

```improvement user
Gardener now explicitly deletes the load balancers and security groups belonging to Kubernetes services for AWS Shoots. This mitigates the issue that the kube-controller-manager does not use finalizers on Service objects and may forget that it has created resources in the underlying infrastructure.
```
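
The explicit cleanup described in this commit message can be sketched roughly as follows. This is an illustration only, not the code from the referenced commit: `plan_cleanup` and the resource dictionaries are hypothetical, and the sketch assumes that resources created for the cluster carry a `kubernetes.io/cluster/<name>` tag with the value `owned`. It selects the cluster's resources and orders load balancers before security groups, matching the deletion order mentioned earlier in the thread.

```python
def plan_cleanup(resources, cluster_name):
    """Return IDs of cluster-owned resources, load balancers first."""
    # Assumed ownership tag; adjust to however resources are actually marked.
    tag_key = f"kubernetes.io/cluster/{cluster_name}"
    order = {"load_balancer": 0, "security_group": 1}
    owned = [
        r for r in resources
        if r["kind"] in order and r["tags"].get(tag_key) == "owned"
    ]
    # sort() is stable, so resources of the same kind keep their input order.
    owned.sort(key=lambda r: order[r["kind"]])
    return [r["id"] for r in owned]
```

Resources that belong to other clusters or carry no ownership tag are left alone, which matters when several clusters share one account.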