Additional AWS volumes not deleted by the --destroy option when the cluster name is longer than 21 characters #75
Comments
Note: @dahorak has seen this in CI today as well.
@mbukatov do you have logs for the run? We should be logging the volumes we found with the pattern we are using. It would be helpful to see if there's a disconnect between your cluster name and what we're using for pattern matching the volumes.
@clacroix12 Do you mean logs from the installation or from the destroy run? I keep the logs from installation only. Edit: Ah, that should be good enough. I will locate and check the logs tomorrow.
@mbukatov also, are you sure that those volumes are from this exact run? I just spun up a cluster, and the names of my volumes include the random id that we generate.
ocs-ci cmd:
worker instance names:
extra volume names:
Notice how both include the random id. I ran the same commands as you, only changing the provided cluster name, and all volumes were destroyed. I verified from my logs the pattern that was used to find the volumes.
destroy cmd:
I was not able to reproduce this issue.
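For illustration, a minimal sketch of how extra volumes could be located by a name pattern with boto3 (this is not ocs-ci's actual code; the tag key, prefix layout, and region are assumptions):

```python
import boto3

def find_extra_volumes(name_prefix: str, region: str = "us-east-2"):
    """Return EBS volumes whose Name tag starts with the given prefix.

    Hypothetical helper: assumes the extra volumes carry a Name tag of
    the form "<prefix>-worker-...-extra_volume".
    """
    ec2 = boto3.resource("ec2", region_name=region)
    return list(
        ec2.volumes.filter(
            Filters=[{"Name": "tag:Name", "Values": [f"{name_prefix}*"]}]
        )
    )

# If the prefix is built from the full cluster name while AWS tagged the
# resources with a truncated name, this returns an empty list -- which is
# the failure mode this issue is about.
```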
@clacroix12 +1. The cluster destroy worked for me too and deleted the 3 extra volumes as part of cleanup, so I was also unable to reproduce this issue.
This was the command I used to create the cluster:
Command to destroy the cluster:
The extra volumes were deleted: nberry-may21-dsltb-worker-us-east-2a-9ns2k_extra_volume
I have a theory about why it didn't work for me that one time... but I haven't been able to install a cluster since then, so I haven't been able to reproduce it yet.
I was able to reproduce it, and as I wrote above, the reason in my case might be the long cluster name:
And it should be destroyed via this command:
But in AWS I see that it shrinks the name of the cluster, so for example
And after the destroy command, the following 3 volumes remain in AWS:
The full output of the destroy command is here:
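A minimal sketch of the truncation behaviour described above, assuming (per this issue's title) that the installer caps the cluster-name portion of generated resource names at 21 characters:

```python
# Hypothetical illustration; the 21-char limit is taken from the issue
# title, and the cluster name from the reproduction steps below.
MAX_NAME_LEN = 21

def truncated_prefix(cluster_name: str) -> str:
    return cluster_name[:MAX_NAME_LEN]

cluster_name = "loginname-ocs4-testing-cluster"  # 30 characters
print(truncated_prefix(cluster_name))            # -> "loginname-ocs4-testin"
# AWS resources are tagged with this truncated prefix plus a random id,
# so a volume search keyed on the full cluster name matches nothing.
```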
I located logs from 2019-05-20, as the volumes listed in the description of this issue belong to the cluster I created on that date. This can be demonstrated via:
And the 1st line where creation of the volume is logged:
Daniel's observation about the long cluster name is a good catch. As you can see, this applies to my case as well. The full name of my cluster was
while the volume name prefix is just
It seems to me that we should delete the extra volumes using the cluster id instead.
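A minimal sketch of that idea, assuming the cluster id can be read from the metadata.json that openshift-install writes into the cluster directory:

```python
import json
import os

def get_cluster_id(cluster_path: str) -> str:
    """Read the generated infra id from the install directory.

    Assumption: metadata.json in the cluster dir has an "infraID" key,
    as written by openshift-install.
    """
    with open(os.path.join(cluster_path, "metadata.json")) as f:
        return json.load(f)["infraID"]

# e.g. get_cluster_id("/tmp/loginname-ocs4-testing-cluster-88528/")
# might return a hypothetical id like "loginname-ocs4-testin-abc12"
```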
@mbukatov @dahorak It looks like either terraform or openshift-install truncates the name if it's too long, so this does create an issue when we pattern match the volumes against the name we think they will have. I like the idea of using the cluster id from the cluster directory instead.
@clacroix12 yep, I think we should propagate this cluster_id from the cluster directory to ENV_DATA so we can use it for deleting the extra volumes, which will solve the issue. Regarding cluster name length, I think we should also limit it, as we also append a random CID to the name. Currently the CID is a 5-digit int; we could shorten it to 3 characters of letters and digits (with mixed case that is 62³ = 238,328 combinations, versus 10⁵ = 100,000 for five digits).
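A minimal sketch of the proposed fix, with hypothetical names (the tag layout and deletion filter are assumptions, not ocs-ci's actual implementation):

```python
import boto3

def delete_extra_volumes(cluster_id: str, region: str = "us-east-2"):
    """Delete leftover EBS volumes tagged with the generated cluster id.

    cluster_id is the infra id propagated from the cluster directory
    (see the metadata.json sketch above), so truncation of the
    user-supplied cluster name no longer matters.
    """
    ec2 = boto3.resource("ec2", region_name=region)
    volumes = ec2.volumes.filter(
        Filters=[
            {"Name": "tag:Name", "Values": [f"{cluster_id}*"]},
            {"Name": "status", "Values": ["available"]},
        ]
    )
    for volume in volumes:
        print(f"deleting {volume.id}")
        volume.delete()
```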
@petr-balogh the purpose of the 5 digits we add to the cluster name was to ensure that the created directory would be unique, because by default all cluster directories go to the same parent location. Now the cluster dir exists here:
Proposed solution:
Long story short, we can cut the 5 digits out of the equation if we use the run directory as the parent for cluster directories. This obviously doesn't cover the case where the user provides a --cluster-path of their own.
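A minimal sketch of the directory side of this proposal, with hypothetical names (the run_dir layout and the 3-char CID are assumptions based on the discussion above):

```python
import os
import random
import string

def make_cluster_dir(run_dir: str, cluster_name: str) -> str:
    """Create a unique cluster directory under a per-run directory.

    With the run directory as parent, a short suffix is enough: a 3-char
    mixed-case alphanumeric CID gives 62**3 = 238,328 combinations,
    versus 10**5 = 100,000 for the current 5-digit int.
    """
    cid = "".join(random.choices(string.ascii_letters + string.digits, k=3))
    path = os.path.join(run_dir, f"{cluster_name}-{cid}")
    os.makedirs(path)
    return path

# make_cluster_dir("/tmp/run-2019-05-20", "loginname-ocs4-testing-cluster")
# -> e.g. "/tmp/run-2019-05-20/loginname-ocs4-testing-cluster-aB3"
```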
Should be fixed; this is an old one.
Description of the problem
Extra volumes are not deleted during cluster destroy.
Steps to reproduce
1. Install the cluster (note --no-destroy):
python run.py --suite suites/ocs_basic_install.yml --log-level info --cluster-name=loginname-ocs4-testing-cluster --no-email --no-destroy
2. Destroy the cluster:
python run.py --log-level DEBUG --cluster-path /tmp/loginname-ocs4-testing-cluster-88528/ --destroy
Actual results
The 3 extra volumes created in step 1 still exist:
Expected results
There are no extra volumes left.