pkg/destroy/aws: Set 'matched' on tag-pagination errors #1129

wking · 2019-01-25T19:08:40Z

Fixing a bug from e24c7dc (pkg/destroy/aws: Use the resource-groups service for tag->ARN lookup, 2019-01-10, #1039). Before this commit, a fetch error set loopError, so we'd still take another pass through the loop. But we weren't setting matched, because a failed fetch (e.g. because the caller lacked tag:GetResources) would not have returned *any* resources. The deletion code would wrongly assume that there were no matching resources behind that tagClient and remove the client from tagClients. Run would end up exiting non-zero despite having abandoned the resources behind that tagClient.

With this commit, we no longer prune the tagClient. And since we don't distinguish between fatal and non-fatal errors, we'll just loop forever until the caller notices the problem and kills us. That's not great, but with permission pre-checks in the pipe via install-time credential operator calls, I don't know if it's worth putting in a fatal/nonfatal distinction now.
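
To make the before/after behavior concrete, here is a minimal, self-contained sketch of one pass over the tag clients. It is not the installer's actual code: `tagClient`, `deniedFetch`, `emptyFetch`, `onePass`, and the `matchOnError` switch are hypothetical names made up for illustration. Only `matched`, `loopError`, and the pruning of clients without matches come from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// tagClient is a hypothetical stand-in for one tagging-API client.
type tagClient struct {
	name  string
	fetch func() ([]string, error)
}

// deniedFetch simulates a caller that lacks tag:GetResources.
func deniedFetch() ([]string, error) {
	return nil, errors.New("access denied")
}

// emptyFetch simulates a client with no matching resources left.
func emptyFetch() ([]string, error) {
	return nil, nil
}

// onePass makes a single pass over the clients. It returns the clients to
// keep for the next pass and the last fetch error seen, if any.
// matchOnError=false mimics the pre-fix behavior; true mimics the fix.
func onePass(clients []tagClient, matchOnError bool) (keep []tagClient, loopError error) {
	for _, c := range clients {
		matched := false
		arns, err := c.fetch()
		if err != nil {
			loopError = err
			// A failed fetch tells us nothing about what is behind this
			// client, so with the fix we treat it as "may still match".
			matched = matchOnError
		}
		if len(arns) > 0 {
			matched = true // the real code would attempt deletion here
		}
		if matched {
			keep = append(keep, c)
		}
	}
	return keep, loopError
}

func main() {
	clients := []tagClient{
		{name: "denied", fetch: deniedFetch},
		{name: "empty", fetch: emptyFetch},
	}
	for _, fixed := range []bool{false, true} {
		keep, err := onePass(clients, fixed)
		fmt.Printf("matchOnError=%v kept=%d err=%v\n", fixed, len(keep), err)
	}
}
```

With `matchOnError` false, the "denied" client is pruned even though an error was recorded, which is the abandoned-resources case. With it true, the client survives to the next pass, and a caller that loops while any clients remain would keep retrying indefinitely.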

CC @dgoodwin

dgoodwin · 2019-01-25T19:10:10Z

Thanks, I won't LGTM... but LGTM.

abhinavdahiya · 2019-01-25T19:25:45Z

Can an error check for Unauthorized help us short-circuit?

wking · 2019-01-25T19:28:03Z

> Can an error check for Unauthorized help us short-circuit?

Yes, but see the last paragraph in my topic post for why I didn't bother ;). Did you want me to bother?
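
For reference, a rough sketch of what such a short-circuit might look like. This PR does not implement it. Everything here is an assumption for illustration: `codedError`, `apiError`, `fatalCodes`, `isFatal`, and `runUntilDone` are hypothetical names, and the error-code strings are guesses at what an authorization failure could look like rather than values taken from the installer or verified against the tagging API. aws-sdk-go's `awserr.Error` exposes a similar `Code()` method, which is what the local interface stands in for.

```go
package main

import (
	"errors"
	"fmt"
)

// codedError is a local stand-in for an SDK error that carries a service
// error code.
type codedError interface {
	error
	Code() string
}

// apiError is a hypothetical concrete error used only in this sketch.
type apiError struct {
	code string
	msg  string
}

func (e apiError) Error() string { return e.code + ": " + e.msg }
func (e apiError) Code() string  { return e.code }

// fatalCodes lists codes we assume retrying cannot fix.
// The exact strings are assumptions, not taken from the PR.
var fatalCodes = map[string]bool{
	"AccessDenied":          true,
	"AccessDeniedException": true,
	"UnauthorizedOperation": true,
}

// isFatal reports whether err carries one of the assumed fatal codes.
func isFatal(err error) bool {
	var ce codedError
	if errors.As(err, &ce) {
		return fatalCodes[ce.Code()]
	}
	return false
}

// runUntilDone calls attempt until it succeeds, returns a fatal error, or
// maxPasses is reached. It returns the number of passes made and the last error.
func runUntilDone(attempt func() error, maxPasses int) (passes int, err error) {
	for passes = 1; passes <= maxPasses; passes++ {
		err = attempt()
		if err == nil || isFatal(err) {
			return passes, err
		}
	}
	return maxPasses, err
}

func main() {
	denied := func() error {
		return apiError{code: "AccessDeniedException", msg: "missing tag:GetResources"}
	}
	passes, err := runUntilDone(denied, 5)
	fmt.Println("denied:", passes, err)

	calls := 0
	flaky := func() error {
		calls++
		if calls < 3 {
			return apiError{code: "Throttling", msg: "slow down"}
		}
		return nil
	}
	passes, err = runUntilDone(flaky, 5)
	fmt.Println("flaky:", passes, err)
}
```

The `maxPasses` bound exists only so the sketch terminates. The destroy loop discussed here has no such cap and, without a fatal/non-fatal distinction, keeps retrying.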

wking · 2019-01-25T19:29:29Z

Also linking #1100, ~~which is the cred pre-checker~~. There would still be a possibility for "destroy run with different permissions than create", but it seems like a low probability.

Edit: never mind, #1100 is something else. I hear @joelddiaz is working on cred-operator pre-checks.

wking · 2019-01-25T21:06:24Z

e2e-aws:

Flaky tests:

[sig-auth] ServiceAccounts should allow opting out of API token automount  [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]

Failing tests:

[Feature:DeploymentConfig] deploymentconfigs when run iteratively [Conformance] should immediately start a new deployment [Suite:openshift/conformance/parallel/minimal]

/retest

wking · 2019-01-25T22:56:16Z

e2e-aws:


Flaky tests:

[Feature:DeploymentConfig] deploymentconfigs with failing hook [Conformance] should get all logs from retried hooks [Suite:openshift/conformance/parallel/minimal]
[sig-storage] In-tree Volumes [Driver: hostPath] [Testpattern: Inline-volume (default fs)] subPath should support readOnly directory specified in the volumeMount [Suite:openshift/conformance/parallel] [Suite:k8s]
[sig-storage] In-tree Volumes [Driver: nfs] [Testpattern: Pre-provisioned PV (default fs)] subPath should support non-existent path [Suite:openshift/conformance/parallel] [Suite:k8s]

Failing tests:

[sig-storage] Volume limits should verify that all nodes have volume limits [Suite:openshift/conformance/parallel] [Suite:k8s]

/retest

wking · 2019-01-26T00:17:57Z

e2e-aws:

Failing tests:

[sig-storage] Volume limits should verify that all nodes have volume limits [Suite:openshift/conformance/parallel] [Suite:k8s]

So... close... :p

/retest

crawford · 2019-01-26T00:26:15Z

/lgtm

openshift-ci-robot · 2019-01-26T00:26:23Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: crawford, wking

Needs approval from an approver in each of these files:

~~OWNERS~~ [crawford,wking]

openshift-bot · 2019-01-26T01:44:25Z

/retest