maintenance: remove pools and volumes #3620
/retest
It doesn't seem like any of those failures are related. The tf-lint one says it passed. 🤷♀️
scripts/maintenance/virsh-cleanup.sh
if test "${POOL}" = default
then
    continue
fi
This is the first time I've looked at this script, but destroying all volumes, including the ones in the default pool, would seem more consistent with what it's doing (removing all libvirt VMs, networks, ...). Apart from this, looks good to me.
This will clean up all volumes under all non-default pools. The openshift CI creates a pool for each cluster. Signed-off-by: Christy Norman <christy@linux.vnet.ibm.com>
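The behaviour described in this commit message could be sketched roughly as follows. This is an illustration, not the code from the PR: the function names and the `DRY_RUN` switch are hypothetical, and it assumes a `virsh` whose `pool-list` supports `--name` and whose `vol-list` prints a two-line table header before the volume rows.

```shell
# Sketch only: remove every volume from every non-default libvirt storage pool.
# Set DRY_RUN=1 to print the virsh commands instead of running them.

# Print the names of all listed pools except "default", one per line.
# (Lists active pools only; add --all to include inactive ones.)
non_default_pools() {
    virsh pool-list --name | while read -r POOL
    do
        # Skip blank lines and the default pool.
        if test -z "${POOL}" || test "${POOL}" = default
        then
            continue
        fi
        printf '%s\n' "${POOL}"
    done
}

# Print the volume names in pool $1: the first column of the vol-list
# table, after its two header lines.
pool_volumes() {
    virsh vol-list "$1" | awk 'NR > 2 && NF > 0 { print $1 }'
}

# Delete all volumes in every non-default pool.
cleanup_volumes() {
    for POOL in $(non_default_pools)
    do
        for VOLUME in $(pool_volumes "${POOL}")
        do
            if test "${DRY_RUN:-0}" = 1
            then
                echo "virsh vol-delete --pool ${POOL} ${VOLUME}"
            else
                virsh vol-delete --pool "${POOL}" "${VOLUME}"
            fi
        done
    done
}
```

Running `DRY_RUN=1; cleanup_volumes` first gives a chance to check which volumes would be removed before anything is deleted.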
@cfergeau -- missed your review earlier, but just made that change and pushed it.
/lgtm /approve
@cfergeau is mergebot not going to take this one?
no idea how this bot works. Let's try again
/assign @abhinavdahiya
ping @abhinavdahiya -- could this one get merged?
/retest
@clnperez: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
that is pretty dangerous, removing all the volume pools. i think the entire script is like that... why is the openshift-install destroy cluster not enough?
@abhinavdahiya are you suggesting it not remove the default pool then? or that this script not remove pools at all? it does give a warning at the top of the script that all resources are being destroyed. without this change, we have leftover volume pools after some ci runs. the cluster destroy seems to not always be called, and this is the best thing we have to get a clean system back.
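One way to address the concern about how destructive the script is would be to turn the warning into an explicit confirmation. The sketch below is an assumption about how that could look, not what the script does today; `confirm_destroy` and the warning text are hypothetical.

```shell
# Sketch only: ask for explicit confirmation before destroying anything.
# Returns 0 only if the user types exactly "yes".
confirm_destroy() {
    echo "WARNING: this will destroy all libvirt resources on this host." >&2
    printf 'Type "yes" to continue: ' >&2
    # Treat end-of-input (for example, a non-interactive run) as a refusal.
    read -r ANSWER || return 1
    test "${ANSWER}" = yes
}
```

A cleanup script could then start with `confirm_destroy || exit 1`, so an unattended or accidental run stops before any `virsh` command is issued.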
@abhinavdahiya @cfergeau ping. (btw I added a slack reminder for myself so i'll not miss these e-mails sent when you make comments and hopefully can speed this up. apologies again for the long delays between my updates in the past)
@jcpowermac @abhinavdahiya @cfergeau ping again
@clnperez Can the CI job not be amended such that it always runs |
@sdodson it does run that. But we've seen times when not everything is cleaned up. I don't know that this is a ci-specific script per se.
A little more context ... here's the place this is mentioned in the libvirt readme: https://github.com/openshift/installer/blob/master/docs/dev/libvirt/README.md#cleanup
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
Given that this script is already fairly destructive, and has to be run manually, I'm fine with extending it.
/remove-lifecycle stale
In order to get this landed, we need an approval from one of the members in the OWNERS_ALIASES file.
@jcpowermac @jhixson74 @crawford @abhinavdahiya Can any of you take a look?
@clnperez You will also need a bugzilla targeted to the current release. :)
Also, is this script used anywhere? I disagree with the point about destroying the volumes in the pool instead of the pool itself since the pool is created during deployment by the installer. Destroying the volumes before you destroy+undefine the pool is just destroying+undefining the pool with more steps. We use a variation of this script in our dev environments with a similar "target", only it uses our user-ids to filter the clusters. This script destroys all domains and all networks (besides), so it's already super dangerous. Which is why I'd like to see it modified for security. To answer @abhinavdahiya's question: |
Why does an interrupted libvirt destroy delete the cluster info? Is the libvirt destroy that you are referring to the |
I am unsure. I suspect that the terraform teardown might not receive an error, or if it does receive one, it might ignore it.
Yes!
That would be a great fix. |
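The narrower variant mentioned above, where pools are filtered by user ID and then destroyed and undefined directly, might look something like this. It is a sketch under assumptions: the thread does not say how pools are named, so the `POOL_PREFIX` convention and both function names are hypothetical.

```shell
# Sketch only: destroy and undefine just the pools that belong to one user.
# Hypothetical convention: a user's pools are named "<user>-<cluster>".
POOL_PREFIX="${POOL_PREFIX:-${USER:-nobody}-}"

# Print the names of pools that start with POOL_PREFIX, one per line.
owned_pools() {
    virsh pool-list --all --name | while read -r POOL
    do
        case "${POOL}" in
            "${POOL_PREFIX}"*) printf '%s\n' "${POOL}" ;;
        esac
    done
}

# Destroy (stop) and undefine every owned pool.
# Set DRY_RUN=1 to print the virsh commands instead of running them.
remove_owned_pools() {
    for POOL in $(owned_pools)
    do
        if test "${DRY_RUN:-0}" = 1
        then
            echo "virsh pool-destroy ${POOL}"
            echo "virsh pool-undefine ${POOL}"
        else
            virsh pool-destroy "${POOL}"
            virsh pool-undefine "${POOL}"
        fi
    done
}
```

Because the default pool and other users' pools never match the prefix, this version cannot remove them, which is the safety property asked for in the comment above.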
This seems fine to me. @jaypoulz can you file a BZ for the /approve This will also need a valid BZ or will need to wait until the branch opens again in a few weeks. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cfergeau, crawford The full list of commands accepted by this bot can be found here. The pull request process is described here
/retest Please review the full test history for this PR and help us cut down flakes. |
/skip
I've overridden bugzilla/valid-bug since this only affects libvirt use cases and I don't expect us to attempt to backport this. If we do need to backport then we'll need to get a bug filed retroactively.
@clnperez: The following test failed, say