refact/lb/sg: isolate security group deletion fragments from EnsureLoadBalancerDeleted #1159

Open
wants to merge 1 commit into
base: master
from

Conversation

mtulio
Contributor

@mtulio mtulio commented Jun 13, 2025

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Isolating security group deletion fragments from EnsureLoadBalancerDeleted to buildSecurityGroupsToDelete and deleteSecurityGroupsWithBackoff, so the evaluation criteria and backoff deletion can be reused in future implementations, i.e. NLB with Security Groups.

This change helps reduce the scope of #1158.

Only the backoff logic has been changed, to add an exponential check, preventing

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 13, 2025
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jun 13, 2025
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Contributor

Hi @mtulio. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 13, 2025
@mtulio
Contributor Author

mtulio commented Jun 13, 2025

PTAL? This change makes the code more generic so it can be reused in future implementations, such as #1158.
/assign @kmala @elmiko

@elmiko

elmiko commented Jun 13, 2025

probably won't get to this today, but i will review early next week.

@mtulio
Contributor Author

mtulio commented Jun 13, 2025

I am able to run the tests locally:

$ make test
....
    --- PASS: TestGetNodeTopology/Should_return_unhandled_errors (0.00s)
PASS
ok  	k8s.io/cloud-provider-aws/pkg/resourcemanagers	1.025s
?   	k8s.io/cloud-provider-aws/pkg/services	[no test files]


$ ./e2e.test --ginkgo.v
....
-----------------------------

Summarizing 1 Failure:
  [FAIL] [cloud-provider-aws-e2e] ecr [It] should start pod using public ecr image
  /home/mtulio/go/pkg/mod/k8s.io/kubernetes@v1.26.0/test/e2e/framework/pod/pod_client.go:107

Ran 6 of 6 Specs in 710.314 seconds
FAIL! -- 5 Passed | 1 Failed | 0 Pending | 0 Skipped
--- FAIL: TestE2E (710.31s)
    suite_test.go:46: ReportDir: 
FAIL

It looks like the failed test is related to my local setup. The PR needs the ok-to-test label so it can be validated in a controlled environment.

@yue9944882
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 13, 2025
@mtulio mtulio force-pushed the refact-sg-deletion branch from 5ac5356 to 2785cbb Compare June 16, 2025 15:56
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from kmala. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 16, 2025
@mtulio mtulio marked this pull request as draft June 16, 2025 15:58
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 16, 2025
@mtulio mtulio marked this pull request as ready for review June 16, 2025 16:04
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 16, 2025
@k8s-ci-robot k8s-ci-robot requested a review from hakman June 16, 2025 16:05
@mtulio mtulio force-pushed the refact-sg-deletion branch from 2785cbb to 55d583b Compare June 16, 2025 16:05
@mtulio
Contributor Author

mtulio commented Jun 16, 2025

I am observing a persistent CI failure when launching the cluster: it tries to use an image that is no longer available:
https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/cloud-provider-aws/1159/pull-cloud-provider-aws-e2e/1934643492319924224#1:build-log.txt%3A752

 specified image "099720109477/ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-20250502.1" is invalid: could not find Image for "099720109477/ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-20250502.1"

This is also happening in other PRs I am watching.

Does this come from the test framework, or is it possible to use a valid image in the CCM repo?

@mtulio
Contributor Author

mtulio commented Jun 17, 2025

/test pull-cloud-provider-aws-e2e

@mtulio
Contributor Author

mtulio commented Jun 17, 2025

An issue has been opened to track the CI problem: #1167

@mtulio
Contributor Author

mtulio commented Jun 18, 2025

Hi @yue9944882 @kmala, feedback addressed. Would you mind taking a look? I am still waiting for the e2e CI job to be fixed (#1167), but this is ready for review.

@mtulio
Contributor Author

mtulio commented Jun 18, 2025

/test pull-cloud-provider-aws-e2e

@mtulio
Contributor Author

mtulio commented Jun 19, 2025

Looks like e2e is now working and the tests are green. Awaiting feedback, thanks!

Isolating security group deletion fragments from EnsureLoadBalancerDeleted
to buildSecurityGroupsToDelete and deleteSecurityGroupsWithBackoff, so
the evaluation criteria and backoff deletion can be reused in future
implementations, i.e. NLB with Security Groups.
@mtulio mtulio force-pushed the refact-sg-deletion branch from 55d583b to 82f61bc Compare June 20, 2025 15:13
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

5 participants