This repository has been archived by the owner on Jan 12, 2022. It is now read-only.

cli(destroy): handle gcp global address tear down #1320

Merged
merged 2 commits into from
Aug 24, 2021

sreis
Contributor

@sreis sreis commented Aug 24, 2021

Workaround for #976. I think this is some undocumented behavior where a second request to delete the global address returns a 400.

The log output of a successful destroy operation shows the global desired state is reached immediately after the delete.

[2021-08-24T00:53:20Z] 2021-08-24T00:53:20.930Z info: Ensure Address deletion
[2021-08-24T00:53:21Z] 2021-08-24T00:53:21.124Z info: Global Address is RESERVED
[2021-08-24T00:53:23Z] 2021-08-24T00:53:23.753Z info: Global Address teardown: desired state reached

A failed run shows it tried to issue a delete twice.

[2021-08-23T19:04:24Z] 2021-08-23T19:04:24.724Z info: Global Address is RESERVED
[2021-08-23T19:04:27Z] 2021-08-23T19:04:27.761Z info: Global Address is RESERVED
[2021-08-23T19:04:28Z] 2021-08-23T19:04:28.185Z error: error during cluster teardown (attempt 1):
[2021-08-23T19:04:28Z] GaxiosError: The resource 'projects/ci-shard-bbb/global/addresses/google-managed-services-schedul-bk-2936-3b3-ug' is not ready

This was not caught earlier in the upgrade tests because the teardown script was swallowing the error. This is also fixed in this PR.

Signed-off-by: Simão Reis <sreis@opstrace.com>
Workaround for #976 to avoid hitting an undocumented
API limitation where a second request to delete a global address is
returning a 400.

Signed-off-by: Simão Reis <sreis@opstrace.com>
yield call(deleteAddress, {
  name: addressName
});
deleteIssued = true;
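
The hunk above guards the delete call with a `deleteIssued` flag. As a minimal sketch of that idea outside redux-saga, assuming a hypothetical `AddressClient` interface (the names `getStatus`, `ensureAddressDeleted`, and the polling parameters are illustrative and are not taken from the Opstrace code or the Google Cloud API), the loop could look roughly like this:

```typescript
// Hypothetical client interface; these names are illustrative only and are
// not the Opstrace or Google Cloud API.
interface AddressClient {
  // Resolves to a status such as "RESERVED", or undefined once the address is gone.
  getStatus(name: string): Promise<string | undefined>;
  // Requests deletion. Assumed to fail with a 400 if called again while the
  // first deletion is still in progress.
  deleteAddress(name: string): Promise<void>;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>(resolve => setTimeout(resolve, ms));

// Poll until the address is gone, issuing the delete request at most once.
// Returns the number of delete requests that were sent (0 or 1).
async function ensureAddressDeleted(
  client: AddressClient,
  name: string,
  pollIntervalMs = 3000,
  maxPolls = 100
): Promise<number> {
  let deleteIssued = false;
  let deleteCalls = 0;

  for (let i = 0; i < maxPolls; i++) {
    const status = await client.getStatus(name);
    if (status === undefined) {
      // Desired state reached: the address no longer exists.
      return deleteCalls;
    }
    if (!deleteIssued) {
      // First time the address is seen: request deletion exactly once.
      await client.deleteAddress(name);
      deleteIssued = true;
      deleteCalls++;
    }
    // The address still exists; wait before checking again.
    await sleep(pollIntervalMs);
  }
  throw new Error(`address ${name} still present after ${maxPolls} polls`);
}
```

With this shape, a status check that still reports RESERVED after the delete was sent only triggers another wait, not another delete request.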
Contributor

@jgehrcke jgehrcke Aug 24, 2021

ok! :)

For future: another option is to issue the DELETE API call in a loop, i.e. potentially more than once and catch-log all resulting API errors (getting said 400 response is absolutely fine -- we just crashed too much). And to then have a definite 'desired state reached' criterion for leaving said loop.

That's the paradigm we have implemented for most of AWS resource deletion via

private async tryDestroyWrapper(): TryDestroyResultType {

public async teardown(): Promise<void> {
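
The alternative described in this comment (retry the delete, catch and log every API error, and leave the loop only on a definite "desired state reached" check) can be sketched as follows. This is a rough illustration under stated assumptions: the `Destroyable` interface, `teardownLoop`, and its parameters are hypothetical, and the bodies of the `tryDestroyWrapper` and `teardown` methods referenced above are not shown in this conversation, so this is not their implementation.

```typescript
// Hypothetical resource interface; names are illustrative only.
interface Destroyable {
  // Attempts deletion; may throw, for example on a 400 "not ready" response.
  tryDestroy(): Promise<void>;
  // Resolves to true once the resource is confirmed gone.
  isGone(): Promise<boolean>;
}

// Retry deletion until the resource is confirmed gone.
// Returns the number of delete attempts that were made.
async function teardownLoop(
  resource: Destroyable,
  log: (msg: string) => void,
  intervalMs = 5000,
  maxAttempts = 60
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // The exit criterion is checked first and is the only way to succeed.
    if (await resource.isGone()) {
      log("teardown: desired state reached");
      return attempt - 1;
    }
    try {
      await resource.tryDestroy();
    } catch (err) {
      // API errors are expected while a deletion is in flight: log and retry.
      log(`teardown attempt ${attempt} failed: ${String(err)}`);
    }
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
  throw new Error(`teardown did not converge after ${maxAttempts} attempts`);
}
```

The trade-off against the single-delete guard is that repeated delete requests are tolerated rather than avoided, so success depends only on the state check and not on how the API responds to a duplicate request.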

if [ "${EXITCODE_DESTROY}" -ne 0 ]; then
    echo "teardown() not yet finished, destroy failed. Exit with exitcode of destroy"
    exit "${EXITCODE_DESTROY}"
fi
Contributor

ouch! :) thanks.

@sreis
Contributor Author

sreis commented Aug 24, 2021

Looks like it worked. It issued only one delete request while checking the state of the Global Address during teardown:

[2021-08-24T15:11:41Z] 2021-08-24T15:11:41.457Z info: Ensure Address deletion
[2021-08-24T15:11:41Z] 2021-08-24T15:11:41.741Z info: Global Address is RESERVED
[2021-08-24T15:11:41Z] 2021-08-24T15:11:41.741Z info: Deleting Global Address
[2021-08-24T15:11:44Z] 2021-08-24T15:11:44.769Z info: Global Address is RESERVED
[2021-08-24T15:11:47Z] 2021-08-24T15:11:47.066Z info: Global Address teardown: desired state reached
[2021-08-24T15:11:47Z] 2021-08-24T15:11:47.066Z info: Destroying CloudNat and Router
[2021-08-24T15:12:18Z] 2021-08-24T15:12:18.376Z info: Destroying Subnet

@sreis sreis merged commit cb7764a into main Aug 24, 2021
@sreis sreis deleted the sreis/gcp-teardown branch August 24, 2021 15:27