[OCPCLOUD-1385] Ensure removal of placement-groups during cluster destroy on AWS #5528

Merged


JoelSpeed
Contributor

As part of the effort to introduce AWS EFA into OpenShift, we also need to include placement groups (openshift/enhancements#995).

As part of this, Machine API will create placement groups for users if they do not exist. During cluster teardown, these currently block the destroy operation because the installer does not know how to handle placement groups.

This PR teaches the destroy logic how to handle placement groups, with an implementation modeled on the handling of other resources.

I tested this manually by creating a cluster, adding two placement groups (created by MAPI), and then destroying the cluster. The destroy no longer blocks indefinitely: it removes the placement groups and completes successfully.
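
For readers unfamiliar with the destroy flow, the idea described above can be sketched roughly as follows. This is not the installer's code. The names `deleteFunc`, `errGone`, `errInUse`, and the bounded `maxAttempts` loop are all assumptions made for illustration; the real change detects an already-deleted group through the SDK error code discussed in the review threads.

```go
package main

import "errors"

// Hypothetical sentinel errors for this sketch. The real EC2 client reports
// these conditions through coded SDK errors instead.
var (
	errGone  = errors.New("placement group no longer exists")
	errInUse = errors.New("placement group still has instances")
)

// deleteFunc is a hypothetical stand-in for the call that deletes one group.
type deleteFunc func(id string) error

// deleteAll makes up to maxAttempts passes over ids. A group counts as
// removed when the delete succeeds or reports errGone. The IDs that still
// fail after the last pass are returned so the caller can report them.
func deleteAll(ids []string, del deleteFunc, maxAttempts int) []string {
	pending := ids
	for attempt := 0; attempt < maxAttempts && len(pending) > 0; attempt++ {
		var failed []string
		for _, id := range pending {
			err := del(id)
			if err != nil && !errors.Is(err, errGone) {
				failed = append(failed, id)
			}
		}
		pending = failed
	}
	return pending
}
```

Treating "already gone" as success is what keeps a repeated destroy run from getting stuck on a group that an earlier pass removed.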

    })
    if err != nil {
        if err != nil {
            if err.(awserr.Error).Code() == "InvalidPlacementGroup.Unknown" {
Contributor Author

This magic string came from a manual test: I tried to delete a placement group that I had already deleted and got the response body below, so I believe the string is correct.

<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidPlacementGroup.Unknown</Code><Message>The Placement Group 'pg-04e435e5cedc43563' is unknown.</Message></Error></Errors><RequestID>cf6e03f4-35fe-40cc-bd20-15bd76759d40</RequestID></Response>
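
To make the origin of the string concrete, here is a small sketch that pulls the `Code` element out of a body shaped like the one quoted above, using only the standard library. The struct and function names are hypothetical. In practice the AWS SDK decodes this body itself and surfaces the code through its error value, so this is for illustration only.

```go
package main

import "encoding/xml"

// ec2ErrorResponse mirrors only the fields present in the quoted body.
type ec2ErrorResponse struct {
	Code      string `xml:"Errors>Error>Code"`
	Message   string `xml:"Errors>Error>Message"`
	RequestID string `xml:"RequestID"`
}

// parseErrorCode extracts the error code from an EC2-style XML error body.
func parseErrorCode(body []byte) (string, error) {
	var resp ec2ErrorResponse
	if err := xml.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	return resp.Code, nil
}
```

Feeding it the response body from the manual test yields the code that the destroy logic compares against.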

Contributor

@staebler staebler left a comment

Looks good to me, save for the first comment.

        GroupIds: []*string{aws.String(id)},
    })
    if err != nil {
        if err != nil {
Contributor

This is a duplicate check for err != nil.

Contributor Author

Good catch. I have resolved this and pushed over the top.

    })
    if err != nil {
        if err != nil {
            if err.(awserr.Error).Code() == "InvalidPlacementGroup.Unknown" {
Contributor

It is terribly unfortunate that we have to use these magic string values to determine whether a resource has been deleted. It would be so much nicer if the AWS SDK exposed the HTTP status code in its public interface. There is nothing to do here; it is just a gripe.

Contributor Author

My comment may not have loaded while you were reviewing, but I also left a note about where this string came from. I believe I have the correct magic string: #5528 (comment)

Contributor

Yep. I also verified that it was correct by trying to delete a placement group that I had already deleted.
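
One detail worth noting about the check in this thread: a direct assertion such as `err.(awserr.Error)` panics if the error is of some other type. Below is a minimal sketch of a non-panicking variant. It assumes only that the SDK error exposes a `Code() string` method, as `awserr.Error` does; the `coder` interface, `hasCode` helper, `fakeAWSError` type, and `errTimeout` value are hypothetical and exist only to keep the example free of SDK imports. It is not what the PR merged.

```go
package main

import "errors"

// coder is a hypothetical local interface matching the Code() method that
// awserr.Error provides.
type coder interface {
	Code() string
}

// hasCode reports whether err, or any error it wraps, carries the given code.
// It returns false rather than panicking when err has no Code() method.
func hasCode(err error, code string) bool {
	var c coder
	if errors.As(err, &c) {
		return c.Code() == code
	}
	return false
}

// fakeAWSError is a stand-in error type used only to exercise hasCode.
type fakeAWSError struct{ code, msg string }

func (e fakeAWSError) Error() string { return e.code + ": " + e.msg }
func (e fakeAWSError) Code() string  { return e.code }

// errTimeout is an ordinary error without a Code() method.
var errTimeout = errors.New("request timed out")
```

With this helper the call site would read `hasCode(err, "InvalidPlacementGroup.Unknown")`. The magic string is still required, but an unexpected error type falls through to normal error handling.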

Contributor

@staebler staebler left a comment

/lgtm
/approve
/hold for 4.11

@openshift-ci openshift-ci bot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Jan 10, 2022
@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Jan 10, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jan 10, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: staebler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Jan 10, 2022
@jstuever
Contributor

/uncc

@openshift-ci openshift-ci bot removed the request for review from jstuever January 17, 2022 20:07
@JoelSpeed
Contributor Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Jan 31, 2022
@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci
Contributor

openshift-ci bot commented Jan 31, 2022

@JoelSpeed: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                      Commit   Details  Required  Rerun command
ci/prow/e2e-aws-workers-rhel7  2785e7d  link     false     /test e2e-aws-workers-rhel7
ci/prow/okd-e2e-aws            2785e7d  link     false     /test okd-e2e-aws
ci/prow/e2e-aws-workers-rhel8  2785e7d  link     false     /test e2e-aws-workers-rhel8
ci/prow/e2e-alibaba            2785e7d  link     true      /test e2e-alibaba

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
5 participants