Unable to unmount from EC2 #66
Hello,

I'm running aegea version 2.6.9 for compatibility reasons and am hitting an error after my batch jobs finish: the EBS volumes fail to unmount, so the drives are left up after the jobs complete. Do you have any idea what could be causing this issue? I'm happy to play around in the aegea code of this version to fix things myself, I just don't know where to start.

Thanks in advance,
Matt

Comments
Hi, the per-batch job EBS provisioning code was retired in the latest version of aegea, and is not recommended because of hazards like the one you encountered. Note that after this failure to unmount, you will end up with an orphaned EBS volume that your organization will continue to pay for until you clean it up. This is quite hazardous if nobody is watching out for this type of error, because EBS volumes are not cheap.

I recommend using the EFS auto-mount functionality in the latest version of aegea instead. That way your jobs can use a shared EFS filesystem as their scratch space, and Batch automatically manages mounting and unmounting it for you.

If you must continue to use the EBS batch job volume code, it was updated to incorporate tracking down and terminating processes that prevent a clean unmount: https://github.com/kislyuk/aegea/blob/develop/aegea/ebs.py#L180 You could try forking your version of aegea and updating this line to see if it makes a difference.
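For anyone reading along, the general idea of that fix (find and terminate whatever still has files open on the mountpoint, then retry the unmount) can be sketched roughly like this. This is not the actual ebs.py code; the mountpoint path and function names are made up for illustration, and it assumes lsof is available on the instance:

```python
import os
import signal
import subprocess
import time

def terminate_processes_using(mountpoint):
    """Find PIDs with open files under the mountpoint (via lsof) and terminate them."""
    # "lsof -t +D <dir>" prints only the PIDs of processes holding files open
    # anywhere under the directory; a nonzero exit just means nothing was found.
    result = subprocess.run(["lsof", "-t", "+D", mountpoint], capture_output=True, text=True)
    pids = {int(p) for p in result.stdout.split()}
    for sig in (signal.SIGTERM, signal.SIGKILL):
        for pid in pids:
            try:
                os.kill(pid, sig)
            except ProcessLookupError:
                pass  # process already exited
        time.sleep(2)  # give processes a moment to exit before escalating

def unmount_scratch(mountpoint="/mnt/scratch"):
    """Try a clean unmount; on failure, kill whatever holds the mount open and retry."""
    try:
        subprocess.run(["umount", mountpoint], check=True)
    except subprocess.CalledProcessError:
        terminate_processes_using(mountpoint)
        subprocess.run(["umount", mountpoint], check=True)
```

The linked ebs.py line is the authoritative version; this is only meant to show the shape of the kill-then-retry approach.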
In addition to the ebs.py line linked above, if you take a look at the commit that originally introduced it in v2.8.0, there is another change that makes the Batch cleanup handler perform a …
Hello, thanks again for all of this advice. I was going to keep an eye on the EBS volumes to make sure there weren't any orphaned ones, but after this message I decided to just update to aegea 4.0.1 and make the dependencies work out. So far so good. Thanks again for your quick and helpful reply, much appreciated. Best,
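For anyone else who goes the watch-the-volumes route instead of upgrading, a quick way to spot orphaned volumes is to list everything in the "available" (unattached) state. A generic boto3 sketch, not part of aegea; region and credentials are assumed to come from the environment:

```python
import boto3

# List EBS volumes that are not attached to any instance ("available" status).
# Orphaned scratch volumes left behind by a failed unmount will show up here.
ec2 = boto3.client("ec2")
paginator = ec2.get_paginator("describe_volumes")
for page in paginator.paginate(Filters=[{"Name": "status", "Values": ["available"]}]):
    for vol in page["Volumes"]:
        print(vol["VolumeId"], vol["Size"], "GiB, created", vol["CreateTime"])
```

Volumes in the "available" state are detached but still billed, so anything unexpected in that list is a candidate for cleanup.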