Run builds on AWS Batch [issue] #28
Comments
I've just pushed the CLI changes to adding support for I would appreciate any review of the code and especially the user interaction/experience. Until I complete some of the ancillary items on the todo list above, it will only work if you have admin access to the lab's AWS account (this means @trvrb and maybe @jameshadfield for now). I will leave another comment when I've arranged wider access so more folks can test if they want, hopefully tomorrow or sometime soon this week. Note that the remote jobs are currently using the
Folks in our lab should now be able to try this out more widely.
Re: compute limits: In our current AWS Batch configuration, each job defaults to 2 vCPUs and 4GB of memory and is terminated if it does not complete within 4 hours. These limits are adjustable on a per-job basis, but the CLI itself does not change the defaults (though an authorized user could). The Batch compute environment (i.e. the managed pool of EC2 instances) is limited to 256 combined vCPUs. Instances are provisioned automatically and scale down to zero (no cost) when there are no jobs in the queue. We should keep an eye on Batch usage and Batch-driven costs to make sure this is functioning as we expect; only @trvrb (or someone else with access to Billing details) can do this. If we start submitting large jobs, we should consider increasing the default job resources.
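For anyone curious what a per-job adjustment could look like, here is a minimal sketch using boto3's Batch `submit_job` call. It is not what the CLI does. The helper names, the queue and job definition names in the usage note, and the reading of "4GB" as 4096 MiB are all assumptions for illustration.

```python
def batch_job_request(name, queue, definition,
                      vcpus=2, memory_mib=4096, timeout_hours=4):
    """Build keyword arguments for boto3's Batch submit_job().

    The defaults mirror the limits described above (2 vCPUs, 4GB of memory,
    4 hour timeout); reading 4GB as 4096 MiB is an assumption.
    """
    return {
        "jobName": name,
        "jobQueue": queue,            # hypothetical queue name supplied by the caller
        "jobDefinition": definition,  # hypothetical job definition name
        "containerOverrides": {
            # Batch expects resource values as strings; MEMORY is in MiB.
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
        },
        # Batch terminates a job attempt that runs longer than this.
        "timeout": {"attemptDurationSeconds": int(timeout_hours * 3600)},
    }


def submit(request):
    """Submit the request; needs boto3 and AWS credentials with Batch access."""
    import boto3  # imported lazily so building a request works without it

    return boto3.client("batch").submit_job(**request)["jobId"]
```

An authorized user could then run something like `submit(batch_job_request("zika-build", "some-queue", "some-job-definition", vcpus=4, memory_mib=8192))`, where all three names are placeholders.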
Re: cost tracking: @trvrb these may be of interest: https://aws.amazon.com/aws-cost-management/aws-cost-explorer/
Documentation is now at https://github.com/nextstrain/cli/blob/aws-batch/doc/aws-batch.md. That URL (well, the URL for
I've bumped the default job resources to 8 vCPUs and (just under) 16GiB of memory, which should cost about 34¢/hour on a c5.2xlarge. Combined with my augur PR to auto-scale alignment and tree-building parallelism, this should make larger builds run much quicker.
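As a rough back-of-the-envelope check on what those defaults could mean for costs, the sketch below combines the 34¢/hour figure with the limits mentioned earlier in this thread. It assumes one job per c5.2xlarge instance, that the 4 hour timeout and the 256 vCPU cap are still in effect, and it ignores storage, data transfer, and instance start-up time.

```python
# Figures quoted in this thread; everything derived from them is an estimate.
PRICE_CENTS_PER_HOUR = 34   # approximate hourly cost of one c5.2xlarge
VCPUS_PER_JOB = 8           # new default job size
MAX_ENV_VCPUS = 256         # compute environment cap (assumed unchanged)
TIMEOUT_HOURS = 4           # default job timeout (assumed unchanged)


def max_job_cost_cents(hours=TIMEOUT_HOURS):
    """Upper bound on one job's instance cost if it runs until the timeout."""
    return PRICE_CENTS_PER_HOUR * hours


def max_concurrent_jobs():
    """How many default-sized jobs fit under the vCPU cap at once."""
    return MAX_ENV_VCPUS // VCPUS_PER_JOB


def max_hourly_cost_cents():
    """Instance cost per hour if the environment is completely full."""
    return max_concurrent_jobs() * PRICE_CENTS_PER_HOUR


print(max_job_cost_cents())     # 136 cents, i.e. about $1.36 per job at most
print(max_concurrent_jobs())    # 32 jobs
print(max_hourly_cost_cents())  # 1088 cents, i.e. about $10.88 per hour
```

So under these assumptions a single default job should cost at most about $1.36, and a fully loaded compute environment about $10.88 per hour.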
Merged and released as 1.7.0.
This issue is for a work-in-progress feature which I've been working on recently and am currently polishing up.
Even with the working mechanics, there are several external things that need consideration before this can be considered a shippable feature:
boto3.client("batch").)
S3: Bucket ACLs - https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-access-control.html
aws-batch branch
nextstrain/base:latest instead of nextstrain/base:branch-aws-batch
^C (or make issue for this)
(The list above is as much for me as anyone else.)