[bug] Qhub fails to destroy on AWS instance #1081

viniciusdc · 2022-02-19T00:09:56Z

Describe the bug

Fresh install of qhub (version 0.4.0.dev86+g9fb62c8 based on main:9fb62c812e), failed to destroy instance:

[terraform]: module.network.aws_subnet.main[1]: Still destroying... [id=subnet-0e67818d0783e1d0c, 20m1s elapsed]
[terraform]: module.network.aws_subnet.main[0]: Still destroying... [id=subnet-057575eebaf199725, 20m1s elapsed]
[terraform]: 2022-02-18T20:24:16.347-0300 [INFO]  provider.terraform-provider-aws_v3.73.0_x5: 2022/02/18 20:24:16 [WARN] WaitForState timeout after 20m0s: timestamp=2022-02-18T20:24:16.347-0300
[terraform]: 2022-02-18T20:24:16.347-0300 [INFO]  provider.terraform-provider-aws_v3.73.0_x5: 2022/02/18 20:24:16 [WARN] WaitForState starting 30s refresh grace period: timestamp=2022-02-18T20:24:16.347-0300
[terraform]: 2022-02-18T20:24:16.371-0300 [INFO]  provider.terraform-provider-aws_v3.73.0_x5: 2022/02/18 20:24:16 [WARN] WaitForState timeout after 20m0s: timestamp=2022-02-18T20:24:16.371-0300
[terraform]: 2022-02-18T20:24:16.371-0300 [INFO]  provider.terraform-provider-aws_v3.73.0_x5: 2022/02/18 20:24:16 [WARN] WaitForState starting 30s refresh grace period: timestamp=2022-02-18T20:24:16.371-0300
[terraform]: ╷
[terraform]: │ Error: error deleting EC2 Subnet (subnet-0e67818d0783e1d0c): DependencyViolation: The subnet 'subnet-0e67818d0783e1d0c' has dependencies and cannot be deleted.
[terraform]: │ 	status code: 400, request id: a936bad1-4605-4b70-9fc4-25c0dae06131
[terraform]: │ 
[terraform]: │ 
[terraform]: ╵
[terraform]: ╷
[terraform]: │ Error: error detaching EC2 Internet Gateway (igw-00f5928d9219a02fa) from VPC (vpc-027cf7b4d3f134cc1): DependencyViolation: Network vpc-027cf7b4d3f134cc1 has some mapped public address(es). Please unmap those public address(es) before detaching the gateway.
[terraform]: │ 	status code: 400, request id: ae39e552-b365-45e6-a900-0304447bd733
[terraform]: │ 
[terraform]: │ 
[terraform]: ╵
[terraform]: ╷
[terraform]: │ Error: error deleting EC2 Subnet (subnet-057575eebaf199725): DependencyViolation: The subnet 'subnet-057575eebaf199725' has dependencies and cannot be deleted.
[terraform]: │ 	status code: 400, request id: 88945487-86a3-4f29-8ebd-8ab98ed4d638
[terraform]: │ 
[terraform]: │ 
[terraform]: ╵

I have hit this error before and was able to fix it by manually deleting the ELB (Elastic Load Balancer) attached to the VPC, then deleting the VPC itself (for some reason qhub does not delete these resources on its own).

How to reproduce

Steps to recreate:

1. qhub init aws --project=qhubstages --domain awsqhubstages.qhub.dev --auth-provider=password --terraform-state=local --ci-provider=github-actions
2. qhub deploy -c qhub-config.yaml --disable-prompt --dns-provider cloudflare --dns-auto-provision (omit --dns-auto-provision if this is a redeployment)
3. qhub destroy

Expected behavior

The destroy command completes successfully and leaves no trace of qhub resources in the AWS console.

Observations

I am not sure why this is happening. Maybe the ELB is created by the provider rather than by Terraform, so Terraform has no control over it? Or maybe we just need to change the order of deletion (just first thoughts here).


viniciusdc · 2022-02-19T23:38:13Z

This seems to be related to the ELB and to some internal behavior of AWS when removing resources; see this comment for reference.
There are several related points:

The ELB is created automatically during deployment and Terraform has no record of it, so Terraform cannot reliably remove it on destroy.
The security group is also auto-generated, so the same situation applies.
The AWS cluster might need to depend on its security roles; see the example in the reference.
During destroy, Terraform attempts to remove both the internet_gateway and the VPC it is attached to, which might create a dependency loop (the gateway depends on the VPC, but to destroy a VPC you first need to remove all of its internet attachments).

Possible solutions?

Add an explicit dependency between the gateway and the VPC (using depends_on).
Import data resources for the security groups and the load balancer. The second is harder because the LB is created by the Ingress, which means adding provider-specific code to stage 06...
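
As a rough illustration of the two ideas above, here is a minimal Terraform sketch. It is not taken from the qhub codebase: only aws_subnet.main appears in the destroy log, while aws_vpc.main, aws_internet_gateway.main, the data source name, and the CIDR values are assumptions made for the example, and it has not been verified against this failure.

```hcl
# Hypothetical network module fragment. Only aws_subnet.main is named in the
# destroy log; every other name and value here is an assumption.
resource "aws_vpc" "main" {
  cidr_block = "10.10.0.0/16" # placeholder CIDR
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  # Explicit ordering, as proposed in the list above. The vpc_id reference
  # already implies this dependency, so the line mainly documents intent.
  depends_on = [aws_vpc.main]
}

resource "aws_subnet" "main" {
  count      = 2
  vpc_id     = aws_vpc.main.id
  cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 4, count.index)

  # Destroy runs in reverse dependency order, so the subnets are removed
  # before Terraform tries to detach the gateway.
  depends_on = [aws_internet_gateway.main]
}

# Read-only lookup of the security groups in the VPC, including ones that
# Terraform did not create. A data source does not bring them under management.
data "aws_security_groups" "in_vpc" {
  filter {
    name   = "vpc-id"
    values = [aws_vpc.main.id]
  }
}
```

Note that depends_on only orders resources Terraform already manages. An ELB created through the Kubernetes Ingress stays outside the state, so it would likely still need to be removed separately before the subnets and gateway can go.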


viniciusdc added the type: bug 🐛, provider: AWS, needs: investigation 🔍, and area: terraform 💾 labels Feb 19, 2022

costrouc added a commit that referenced this issue Feb 22, 2022

Reorganizing render, deploy, destroy to unify stages input_vars, tf_objects, checks, and state_imports Closes #1081 (8c4715d)

costrouc mentioned this issue Feb 22, 2022

Reorganizing render, deploy, destroy to unify stages input_vars, tf_objects, checks, and state_imports #1091

Merged

costrouc added a commit that referenced this issue Feb 22, 2022

Reorganizing render, deploy, destroy to unify stages input_vars, tf_objects, checks, and state_imports Closes #1081 (6360b87)

danlester closed this as completed in 5ba9746 Feb 22, 2022
