[SREP-4255] Add --security-group flag to cleanup command#3204
Conversation
|
There are test jobs defined for this repository which are not configured to run automatically. Comment |
|
Skipping CI for Draft Pull Request. |
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository YAML (base), Central YAML (inherited) Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughAdded AWS security group cleanup functionality to the osde2e cleanup command via a new Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
f7db25b to
bc5bf90
Compare
Add --security-group flag immediately before --vpc in all 7 AWS cleanup jobs (cleanup-tekton-aws, cleanup-selfservice-aws, cleanup-dev-aws, cleanup-hypershift-aws, cleanup-trt-aws, cleanup-informing-aws, cleanup-rosa-nightly). This ensures leftover security groups are removed before VPC CloudFormation stack deletion is attempted. Workaround for OCPBUGS-74960, implemented in openshift/osde2e#3204. Signed-off-by: Daniel Hall <danhall@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
cmd/osde2e/cleanup/cmd.go (1)
51-52: Minor: Naming inconsistency.Variable
securityGroups(plural) vs flag--security-group(singular). Consider aligning for clarity, though not blocking.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/osde2e/cleanup/cmd.go` around lines 51 - 52, The variable name securityGroups is inconsistent with the CLI flag --security-group; rename one for clarity—either change the variable securityGroups to securityGroup to match the singular flag, or change the flag to --security-groups to match the plural variable—update all references to the chosen identifier (e.g., securityGroups, securityGroup, and the flag definition/usage in cmd package) so tests and flag parsing code continue to work.pkg/common/aws/securitygroups.go (1)
78-83: Consider distinguishing stack-not-found from other errors.All
DescribeStackserrors are silently skipped. Transient failures (throttling, auth) would cause VPCs to be skipped without indication. Adding a log for non-404 errors would improve debuggability.♻️ Proposed improvement
stackResp, err := cfnClient.DescribeStacks(&cloudformation.DescribeStacksInput{ StackName: aws.String(vpcStackName), }) if err != nil { + // Stack not found is expected; log other errors for debugging + if !strings.Contains(err.Error(), "does not exist") { + log.Printf("Warning: DescribeStacks failed for %s: %v\n", vpcStackName, err) + } continue }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/common/aws/securitygroups.go` around lines 78 - 83, The DescribeStacks call in cfnClient.DescribeStacks when iterating vpcStackName currently swallows all errors (err -> continue); update the error handling in that block to distinguish the CloudFormation "stack does not exist" case from other errors: check the error type/message returned by cfnClient.DescribeStacks (use the AWS SDK error code or awsErr.Code() / errors.As to detect "ValidationError"/"StackNotFound" semantics), continue silently only for the not-found case, and for any other error log a warning or error including vpcStackName and the full err (e.g., via processLogger/your logger) before continuing; reference cfnClient.DescribeStacks, stackResp, and vpcStackName to locate and change the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@cmd/osde2e/cleanup/cmd.go`:
- Around line 51-52: The variable name securityGroups is inconsistent with the
CLI flag --security-group; rename one for clarity—either change the variable
securityGroups to securityGroup to match the singular flag, or change the flag
to --security-groups to match the plural variable—update all references to the
chosen identifier (e.g., securityGroups, securityGroup, and the flag
definition/usage in cmd package) so tests and flag parsing code continue to
work.
In `@pkg/common/aws/securitygroups.go`:
- Around line 78-83: The DescribeStacks call in cfnClient.DescribeStacks when
iterating vpcStackName currently swallows all errors (err -> continue); update
the error handling in that block to distinguish the CloudFormation "stack does
not exist" case from other errors: check the error type/message returned by
cfnClient.DescribeStacks (use the AWS SDK error code or awsErr.Code() /
errors.As to detect "ValidationError"/"StackNotFound" semantics), continue
silently only for the not-found case, and for any other error log a warning or
error including vpcStackName and the full err (e.g., via processLogger/your
logger) before continuing; reference cfnClient.DescribeStacks, stackResp, and
vpcStackName to locate and change the code.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 25f594b0-9e7f-487c-b31e-f1b4cd4c7d98
📒 Files selected for processing (3)
cmd/osde2e/cleanup/cmd.gopkg/common/aws/securitygroups.gopkg/common/aws/vpc.go
bc5bf90 to
ec9a3a9
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/common/aws/securitygroups.go`:
- Around line 82-89: The describe-stack error branch currently only logs and
continues, which hides discovery failures; update both occurrences (the branch
handling err in the DescribeStacks call around the vpcStackName check and the
similar branch later) to also increment failedCounter and append the error to
errorBuilder (e.g., failedCounter++ and errorBuilder.WriteString or
fmt.Fprintf(&errorBuilder, "describe stack %s: %v\n", vpcStackName, err)) before
continuing, while preserving the existing special-case for awserr.Error with
Code() == "ValidationError" so that missing stacks still skip without counting
as a failure.
- Around line 28-35: The DescribeVpcs and DescribeSecurityGroups calls on
CcsAwsSession.ec2 are non-paginated and can miss results; replace DescribeVpcs
with DescribeVpcsPages and DescribeSecurityGroups with
DescribeSecurityGroupsPages, iterating the pages and appending items into a
results slice while propagating any error returned by the paginator callback.
Update the logic around the current results variable(s) used in this file
(references to DescribeVpcs result and DescribeSecurityGroups result) to collect
from each page, check/return any pagination error, and preserve the existing
Filters and input structs when calling the Pages functions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 5ea4385e-a9e6-4010-a7d6-523577012206
📒 Files selected for processing (3)
cmd/osde2e/cleanup/cmd.gopkg/common/aws/securitygroups.gopkg/common/aws/vpc.go
|
/override ci/prow/hypershift-pr-check |
|
@ritmun: Overrode contexts on behalf of ritmun: ci/prow/hypershift-pr-check DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
ec9a3a9 to
922feaf
Compare
Add a --security-group flag to the osde2e cleanup command that removes leftover security groups from orphaned VPCs whose CloudFormation stacks are in DELETE_FAILED state. Leftover security groups (e.g. from OCPBUGS-74960) block CloudFormation stack deletion, causing all VPC cleanup to fail silently. This flag is designed to run before --vpc so that the subsequent stack deletion can succeed. The implementation: - Finds osde2e VPCs with DELETE_FAILED CloudFormation stacks - Skips VPCs belonging to active clusters - Revokes all ingress/egress rules to clear cross-SG dependencies - Deletes all non-default security groups in those VPCs - Reports results to Slack summary Signed-off-by: Daniel Hall <danhall@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
922feaf to
8a5043b
Compare
Signed-off-by: Daniel Hall <danhall@redhat.com>
Extract core logic behind interfaces to enable testing with mocks, covering all code paths: skip conditions, dry-run, rule revocation, deletion success/failure, counter tracking, and error reporting. Signed-off-by: Daniel Hall <danhall@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
c0ac45c to
fa0976c
Compare
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ritmun, smarthall The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@ritmun: Overrode contexts on behalf of ritmun: ci/prow/hypershift-pr-check DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@smarthall: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Adds the new --security-group flag before --vpc in all 7 AWS cleanup periodic jobs. This cleans up leftover security groups in orphaned VPCs with DELETE_FAILED CloudFormation stacks, unblocking subsequent VPC stack deletion. Depends on: openshift/osde2e#3204 Signed-off-by: Daniel Hall <danhall@redhat.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds the new --security-group flag before --vpc in all 7 AWS cleanup periodic jobs. This cleans up leftover security groups in orphaned VPCs with DELETE_FAILED CloudFormation stacks, unblocking subsequent VPC stack deletion. Depends on: openshift/osde2e#3204 Signed-off-by: Daniel Hall <danhall@redhat.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…t#78297) Adds the new --security-group flag before --vpc in all 7 AWS cleanup periodic jobs. This cleans up leftover security groups in orphaned VPCs with DELETE_FAILED CloudFormation stacks, unblocking subsequent VPC stack deletion. Depends on: openshift/osde2e#3204 Signed-off-by: Daniel Hall <danhall@redhat.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…t#78297) Adds the new --security-group flag before --vpc in all 7 AWS cleanup periodic jobs. This cleans up leftover security groups in orphaned VPCs with DELETE_FAILED CloudFormation stacks, unblocking subsequent VPC stack deletion. Depends on: openshift/osde2e#3204 Signed-off-by: Daniel Hall <danhall@redhat.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add a --security-group flag to the osde2e cleanup command that removes leftover security groups from orphaned VPCs whose CloudFormation stacks are in DELETE_FAILED state.
Leftover security groups (e.g. from OCPBUGS-74960) block CloudFormation stack deletion, causing all VPC cleanup to fail silently. This flag is designed to run before --vpc so that the subsequent stack deletion can succeed.
The implementation: