Enable cleaning up of hosted cluster cloud resources on destroy #1672
Conversation
✅ Deploy Preview for hypershift-docs ready!
Operations: []admissionregistrationv1.OperationType{
	admissionregistrationv1.Create,
},
Rule: admissionregistrationv1.Rule{
Can this not be a second rule in the first validating webhook?
yes, fixing
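For context, the fix amounts to appending a second `RuleWithOperations` entry to the existing webhook rather than registering a second webhook. A minimal stand-alone sketch, using simplified stand-in types (the real ones live in `k8s.io/api/admissionregistration/v1`, and the webhook name here is invented):

```go
package main

import "fmt"

// Simplified stand-ins for the admissionregistration/v1 types; the field
// names mirror the real API but everything here is illustrative only.
type Rule struct {
	APIGroups   []string
	APIVersions []string
	Resources   []string
}

type RuleWithOperations struct {
	Operations []string
	Rule       Rule
}

type ValidatingWebhook struct {
	Name  string
	Rules []RuleWithOperations
}

// newWebhook builds one webhook carrying both the UPDATE and the CREATE
// rule, instead of declaring a second webhook entry for the CREATE case.
func newWebhook() ValidatingWebhook {
	hcRule := Rule{
		APIGroups:   []string{"hypershift.openshift.io"},
		APIVersions: []string{"*"},
		Resources:   []string{"hostedclusters"},
	}
	return ValidatingWebhook{
		Name: "hostedclusters.example.invalid", // hypothetical name
		Rules: []RuleWithOperations{
			{Operations: []string{"UPDATE"}, Rule: hcRule},
			{Operations: []string{"CREATE"}, Rule: hcRule},
		},
	}
}

func main() {
	fmt.Println(len(newWebhook().Rules))
}
```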
}
// If storage has already been removed, nothing to do
// When the registry operator has been removed, management state in status is currently cleared.
if registryConfig.Status.Storage.ManagementState == "" || registryConfig.Status.Storage.ManagementState == "Removed" {
Nit: Why not use the const like in the CreateOrUpdate call below?
Because this field is not typed; it's only a string. The other field is an enum that has constants.
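Even when the API field ships no typed constants, a local constant can still document the magic strings. A sketch of that option; the constant names below are made up for illustration, only the `"Removed"` literal comes from the discussion above:

```go
package main

import "fmt"

// Local constants for the untyped status field. "Removed" is the literal
// the registry operator writes; the names are invented for this sketch.
const (
	storageStateUnset   = ""
	storageStateRemoved = "Removed"
)

// storageAlreadyRemoved reports whether registry storage cleanup can be
// skipped because the management state is cleared or already "Removed".
func storageAlreadyRemoved(state string) bool {
	return state == storageStateUnset || state == storageStateRemoved
}

func main() {
	fmt.Println(storageAlreadyRemoved("Removed"), storageAlreadyRemoved("Managed"))
}
```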
for i := range pvcs.Items {
	pvc := &pvcs.Items[i]
	log.Info("Deleting persistent volume claim", "name", client.ObjectKeyFromObject(pvc).String())
	if err := r.client.Delete(ctx, pvc); err != nil {
Why not gate this with a DeletionTimestamp check like with the services?
And shouldn't it be possible to avoid the duplicate code for services, PVCs and PVs, maybe with a configurable filter function?
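The filter-function idea could look roughly like the following. The types and names here are stand-ins for illustration, not the PR's actual code; `deleting` mimics a non-nil `DeletionTimestamp`:

```go
package main

import "fmt"

// object is a stand-in for a client.Object; deleting mimics a
// non-nil DeletionTimestamp on the live resource.
type object struct {
	name     string
	deleting bool
}

// deleteMatching issues a delete for every object that passes filter and is
// not already terminating, and reports how many deletions it issued. One
// such helper could serve services, PVCs and PVs alike.
func deleteMatching(objs []object, filter func(object) bool, del func(object) error) (int, error) {
	issued := 0
	for _, o := range objs {
		if o.deleting || !filter(o) {
			continue
		}
		if err := del(o); err != nil {
			return issued, err
		}
		issued++
	}
	return issued, nil
}

func main() {
	pvcs := []object{{name: "data-0"}, {name: "data-1", deleting: true}}
	n, err := deleteMatching(pvcs,
		func(object) bool { return true },   // e.g. "all PVCs"
		func(o object) error { return nil }, // stand-in for r.client.Delete
	)
	fmt.Println(n, err)
}
```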
if len(errs) > 0 {
	return false, errors.NewAggregate(errs)
}
return false, nil
Wouldn't it make sense to track if we actually deleted anything and return that negated, just like we do with Services?
No, in this case, we're looking for the existence of PVs ... those will get removed once the referring pods/pvcs have been removed. However, if we got here, the initial check for PVs indicated that we still have some.
// Needed for resource cleanup
&corev1.Service{}: allSelector,
&corev1.Pod{}:     allSelector,
Uhm that really sucks, especially the pods? Can we maybe use an API reading client just for the new deletion path?
Maybe just for the pods? I do want to watch Services because otherwise I wouldn't necessarily get requeued when they go away.
@csrwng also, can we enable this for the e2e tests?
Force-pushed d9df84a to f9bc658
@alvaroaleman addressed your comments. Also made it the default for e2e clusters.
Force-pushed f9bc658 to 8d4d637
}
log.Info("Ensuring image registry storage is removed")
removed, err := r.ensureImageRegistryStorageRemoved(ctx)
if err != nil || !removed {
Wouldn't it be better to not block cleanup of other resource types on the previous resource type, and only return prior to the status condition setup if there are any remaining resources?
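One way to read this suggestion: run every cleanup step on each reconcile, aggregate the errors, and only gate the final status update on whether anything remains. A hypothetical sketch, not the PR's code:

```go
package main

import "fmt"

// cleanupAll runs every step regardless of earlier failures, so one stuck
// resource type does not block progress on the others. It reports whether
// everything is gone, plus any errors encountered along the way.
func cleanupAll(steps []func() (done bool, err error)) (bool, []error) {
	allDone := true
	var errs []error
	for _, step := range steps {
		done, err := step()
		if err != nil {
			errs = append(errs, err)
		}
		if !done {
			allDone = false
		}
	}
	return allDone, errs
}

func main() {
	done, errs := cleanupAll([]func() (bool, error){
		func() (bool, error) { return true, nil },                      // e.g. services gone
		func() (bool, error) { return false, fmt.Errorf("pvc stuck") }, // PVCs still draining
		func() (bool, error) { return true, nil },                      // registry storage removed
	})
	fmt.Println(done, len(errs))
}
```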
pv("bar"), pvc("bar"),
},
verify: verifyPVCsRemoved,
},
Could we add a pod test here and only put the pods into the uncachedClient? That aspect is really important.
Enables the removal of cloud resources created during the lifetime of a guest cluster. These include:
- registry storage
- ingress load balancer and dns records
- load balancer services
- persistent volumes

Removal of these resources will occur when the following annotation is present:

hypershift.openshift.io/cleanup-cloud-resources: "true"

The CLI adds this annotation to a HostedCluster when invoking the destroy command with the 'destroy-cloud-resources' flag:

hypershift destroy cluster aws --destroy-cloud-resources
Force-pushed 8d4d637 to 7eebb6f
@alvaroaleman addressed your latest comments
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, csrwng
/retest-required
@csrwng: all tests passed!
What this PR does / why we need it:
Enables the removal of cloud resources created during the lifetime of a
guest cluster. These include:
- registry storage
- ingress load balancer and dns records
- load balancer services
- persistent volumes
Removal of these resources will occur when the following annotation is
present:
hypershift.openshift.io/cleanup-cloud-resources: "true"
The CLI adds this annotation to a HostedCluster when invoking the
destroy command with the 'destroy-cloud-resources' flag:
hypershift destroy cluster aws --destroy-cloud-resources
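Put together, the annotation on the HostedCluster would look roughly like this (the apiVersion and metadata shown are assumptions for illustration; use whatever your HyperShift install serves):

```yaml
apiVersion: hypershift.openshift.io/v1alpha1  # assumed; check your cluster
kind: HostedCluster
metadata:
  name: example
  namespace: clusters
  annotations:
    hypershift.openshift.io/cleanup-cloud-resources: "true"
```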
Which issue(s) this PR fixes:
Fixes #HOSTEDCP-486
Checklist