Can we just stop instead of terminating a finished runner? #59

Open
neverchanje opened this issue Aug 4, 2021 · 3 comments

Comments

@neverchanje

neverchanje commented Aug 4, 2021

In my case, I cache a lot of intermediate compilation objects on EBS and I don't want to lose them when a workflow is finished.
So I'm wondering if we can provide an option that only stops a machine without terminating it, and also an option to support restarting from the previously stopped instance.

mode: start_existing
instances:
  - i-123
  - i-124
  - i-125

mode: stop_no_terminate
instances:
  - i-123
  - i-124
  - i-125
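
For illustration, the two proposed modes could map onto the EC2 `StopInstances` and `StartInstances` calls roughly as sketched below. This is a minimal sketch, not the action's implementation: the function names `finish_runner` and `start_existing` are hypothetical, the mode strings are taken from the proposal above, and `ec2` is assumed to be a boto3 EC2 client (or any object exposing the same methods).

```python
from typing import Any, Iterable, List


def finish_runner(ec2: Any, instance_ids: Iterable[str],
                  mode: str = "terminate") -> List[str]:
    """Stop or terminate runner instances (hypothetical helper).

    `ec2` is assumed to behave like a boto3 EC2 client.
    """
    ids = list(instance_ids)
    if mode == "stop_no_terminate":
        # Stopping keeps the instance and its attached EBS volumes,
        # so cached build objects survive until the next start.
        ec2.stop_instances(InstanceIds=ids)
        ec2.get_waiter("instance_stopped").wait(InstanceIds=ids)
    elif mode == "terminate":
        ec2.terminate_instances(InstanceIds=ids)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return ids


def start_existing(ec2: Any, instance_ids: Iterable[str]) -> List[str]:
    """Restart previously stopped instances and wait until they run."""
    ids = list(instance_ids)
    ec2.start_instances(InstanceIds=ids)
    ec2.get_waiter("instance_running").wait(InstanceIds=ids)
    return ids
```

With boto3 installed and credentials configured, this might be called as `start_existing(boto3.client("ec2"), ["i-123", "i-124", "i-125"])`. Note that stopping only works for EBS-backed instances, and that EBS storage is still billed while an instance is stopped.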

Related AWS APIs:

Another approach could be to reuse the EBS volumes that were left undeleted after the previous EC2 instance was terminated (see https://aws.amazon.com/premiumsupport/knowledge-center/deleteontermination-ebs/). I do not know which option is technically easier to implement.
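
As a rough sketch of the volume-reuse idea, the `DeleteOnTermination` flag can be cleared on a running instance so that its EBS volumes outlive termination. The helper name `preserve_volumes` and the device name used in the usage note are assumptions for illustration; `ec2` is again assumed to be a boto3 EC2 client.

```python
from typing import Any, Dict, Iterable, List


def preserve_volumes(ec2: Any, instance_id: str,
                     device_names: Iterable[str]) -> List[Dict[str, Any]]:
    """Mark the given block devices so they survive instance termination.

    Hypothetical helper; `ec2` is assumed to behave like a boto3 EC2 client.
    """
    mappings = [
        {"DeviceName": name, "Ebs": {"DeleteOnTermination": False}}
        for name in device_names
    ]
    # ModifyInstanceAttribute updates the flag on already attached volumes.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        BlockDeviceMappings=mappings,
    )
    return mappings
```

A call might look like `preserve_volumes(ec2, "i-123", ["/dev/xvda"])`, where the device name depends on the AMI. A preserved volume can only be attached to a later instance in the same Availability Zone, and it continues to incur storage charges until it is deleted.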

Thanks.

@machulav
Owner

Currently, we don't support this, but it may be possible with the EC2 launch templates approach in #65, right @jpalomaki 😉

@jpalomaki
Contributor

jpalomaki commented Aug 28, 2021

> Currently, we don't support this, but it may be possible with the EC2 launch templates approach in #65, right @jpalomaki 😉

Block device mapping options might help here, yes. But I don't have first-hand experience with that, so I cannot say for sure. EBS volume attachment could become racy if multiple runners (workflows) ever ran at the same time.

@neverchanje Have you tried using https://docs.github.com/en/actions/guides/caching-dependencies-to-speed-up-workflows#using-the-cache-action?

@DarrinHidef

Another motivation for a suspend/resume option is that it would prevent the risk of large numbers of instances building up and staying running (with extra fees) if there are bugs in the terminate-instance logic. That possibility worries me, because I don't want to have to monitor the instance count, and I don't want a surprise bill at the end of the month.
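
Independently of which mode is used, a periodic check for leftover runners is one way to guard against such a build-up. The sketch below lists running instances older than a given age. The helper name `find_stale_runners` and the tag `Purpose=github-runner` are hypothetical; it assumes the runner instances carry some identifying tag and that `ec2` is a boto3 EC2 client.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, List, Optional


def find_stale_runners(ec2: Any, max_age: timedelta,
                       tag_key: str = "Purpose",
                       tag_value: str = "github-runner",
                       now: Optional[datetime] = None) -> List[str]:
    """Return IDs of running, tagged instances launched more than max_age ago.

    Hypothetical helper; the tag key and value are assumptions.
    """
    now = now or datetime.now(timezone.utc)
    filters = [
        {"Name": "instance-state-name", "Values": ["running"]},
        {"Name": f"tag:{tag_key}", "Values": [tag_value]},
    ]
    stale = []
    paginator = ec2.get_paginator("describe_instances")
    for page in paginator.paginate(Filters=filters):
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                # boto3 returns LaunchTime as a timezone-aware datetime.
                if now - instance["LaunchTime"] > max_age:
                    stale.append(instance["InstanceId"])
    return stale
```

The returned IDs could then be stopped or terminated, or simply reported, for example from a scheduled workflow.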
