AWS Batch Configuration

dxenes1 edited this page Feb 6, 2020 · 9 revisions

Below is a set of instructions to quickly set up AWS Batch to work with saber's conduit tool.

You must have already cloned the saber repository to follow the steps described here; [see Local Setup](https://github.com/aplbrain/saber/wiki/Local-Setup).

AWS Batch general configuration.

Before stepping through these instructions you must have an AWS account with administrative privileges. If you are not the admin of your AWS account, see the IAM User (Not Admin) section below for the resources you need permission to access in order to perform these tasks.

  1. Download the file saberAirflowGeneral.json.
  2. In your AWS Console, open the CloudFormation service.
  3. Click Create Stack.
  4. On the Choose Template screen, select the Upload a template to Amazon S3 option and upload the saberAirflowGeneral.json file, then click Next.
  5. On Specify Details, fill in the form with your desired Stack name, subnet ID and VPC ID, then click Next.
    • To obtain your VPC ID, go to the VPC Dashboard and copy the VPC ID of your desired VPC (this may be the default VPC).
    • To choose a subnet ID, go to the Subnets section and copy the ID of a subnet that belongs to the VPC you specified.
  6. On the Options window, simply click Next.
  7. Check the acknowledgment box on the Review screen.
  8. Click Create.

This builds all the resources needed to run conduit's Airflow with the AWS Batch backend, under generic queue and service names.
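If you prefer the command line, the console steps above can be approximated with the AWS CLI. The sketch below is only an illustration and is not part of saber: it reads the template's top-level Parameters section to discover which parameter names the template expects, then assembles an `aws cloudformation create-stack` command. The helper names, the stack name `saber-batch`, and the parameter keys `VpcId` and `SubnetId` are hypothetical; use the keys reported for your copy of saberAirflowGeneral.json. The `--capabilities` flag is assumed to be the CLI counterpart of the acknowledgment box in step 7.

```python
import json
import shlex


def template_parameter_names(template_text):
    """Return the parameter names declared in a CloudFormation JSON template."""
    template = json.loads(template_text)
    # "Parameters" is an optional top-level section of a CloudFormation template.
    return sorted(template.get("Parameters", {}))


def create_stack_command(stack_name, template_path, parameters):
    """Build an `aws cloudformation create-stack` command as an argument list."""
    command = [
        "aws", "cloudformation", "create-stack",
        "--stack-name", stack_name,
        "--template-body", f"file://{template_path}",
        # Assumption: the template creates IAM resources, which is what the
        # acknowledgment box in the console refers to.
        "--capabilities", "CAPABILITY_NAMED_IAM",
    ]
    if parameters:
        command.append("--parameters")
        for key, value in sorted(parameters.items()):
            command.append(f"ParameterKey={key},ParameterValue={value}")
    return command


if __name__ == "__main__":
    # Hypothetical template fragment; the real keys are in saberAirflowGeneral.json.
    sample = '{"Parameters": {"VpcId": {"Type": "String"}, "SubnetId": {"Type": "String"}}}'
    print(template_parameter_names(sample))

    cmd = create_stack_command(
        "saber-batch",  # hypothetical stack name
        "saberAirflowGeneral.json",
        {"VpcId": "vpc-EXAMPLE", "SubnetId": "subnet-EXAMPLE"},  # placeholder IDs
    )
    # Print the command for review instead of running it.
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run without credentials; paste the output into a shell once the parameter keys and IDs match your account.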

AWS Batch specific queues.

We recommend using the saberAirflowGeneral.json template file, as instructed above. This template creates two Batch job queues along with two separate compute environments.

  • The saber-gen-queue should be used for less intensive processes, as it uses a general saber compute environment that selects the optimal instance for a specific job.
  • The saber-gpu-job-queue uses GPU-enabled EC2 instances on AWS and should therefore be used cautiously to avoid high costs. You can read more about instance types here.

You can choose which queue to send jobs to through your workflow.cwl and job.yml files, as described here: Writing your own workflows. Note that AWS only charges you for what you use, so leaving a job queue or compute environment idle does not incur any extra cost.
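One way to keep GPU usage deliberate is to decide the queue for each workflow step up front. The sketch below is a hypothetical helper, not part of saber: it maps a per-step "needs a GPU" flag to one of the two queue names created by the template. The step names are made up, and how the chosen name is written into workflow.cwl or job.yml is covered in Writing your own workflows, not here.

```python
GENERAL_QUEUE = "saber-gen-queue"
GPU_QUEUE = "saber-gpu-job-queue"


def choose_queue(needs_gpu):
    """Return the job queue name for a step; default to the general queue."""
    return GPU_QUEUE if needs_gpu else GENERAL_QUEUE


def queues_for_steps(steps):
    """Map each step name to a queue, given a {step_name: needs_gpu} dict."""
    return {name: choose_queue(needs_gpu) for name, needs_gpu in steps.items()}


if __name__ == "__main__":
    # Hypothetical step names, for illustration only.
    plan = queues_for_steps({"download": False, "segment": True})
    for step, queue in plan.items():
        print(f"{step} -> {queue}")
```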

To add more queues and monitor jobs, go to the Batch Dashboard.
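Queue status can also be checked from a terminal with `aws batch describe-job-queues`, which returns JSON containing a `jobQueues` list. The sketch below is a minimal, hypothetical helper that summarizes that JSON; the sample input is hand-written in the shape of the response, not output from a real account.

```python
import json


def summarize_job_queues(describe_output):
    """Return (name, state, status) tuples from describe-job-queues JSON text."""
    response = json.loads(describe_output)
    return [
        (queue["jobQueueName"], queue.get("state"), queue.get("status"))
        for queue in response.get("jobQueues", [])
    ]


if __name__ == "__main__":
    # Hand-written sample shaped like the CLI response, for illustration only.
    sample = json.dumps({
        "jobQueues": [
            {"jobQueueName": "saber-gen-queue", "state": "ENABLED", "status": "VALID"},
            {"jobQueueName": "saber-gpu-job-queue", "state": "ENABLED", "status": "VALID"},
        ]
    })
    for name, state, status in summarize_job_queues(sample):
        print(f"{name}: state={state} status={status}")
```

In practice the input would come from something like `aws batch describe-job-queues > queues.json`, read back as text.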

IAM User (Not Admin)

If you're an IAM user without admin privileges, please ask your admin to grant you access to the following resources:

  • Virtual Private Cloud (VPC)
  • Elastic Compute Cloud (EC2)
  • Elastic File System (EFS)
  • Elastic Container Service (ECS)
  • Elastic Container Registry (ECR)
  • Key Management Service (KMS)
  • AWS Batch
  • CloudFormation
  • Simple Storage Service (S3)

Associated policies to attach to the IAM user:

  • AmazonElasticFileSystemFullAccess
  • AmazonVPCFullAccess
  • AWSCloudFormationFullAccess
  • AmazonEC2ContainerRegistryFullAccess
  • AWSBatchFullAccess
  • AmazonEC2ContainerServiceFullAccess
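
For admins who would rather attach these policies from the command line, the sketch below builds one `aws iam attach-user-policy` command per policy name. It is a hypothetical helper, not part of saber. It assumes each policy is an AWS managed policy at the default path, so its ARN has the form `arn:aws:iam::aws:policy/<name>`; `aws iam list-policies --scope AWS` can confirm the exact ARNs available in your account. The user name `saber-user` is a placeholder.

```python
import shlex


def managed_policy_arn(policy_name):
    """Return the ARN of an AWS managed policy, assuming the default path."""
    return f"arn:aws:iam::aws:policy/{policy_name}"


def attach_policy_commands(user_name, policy_names):
    """Build one `aws iam attach-user-policy` argument list per policy."""
    return [
        [
            "aws", "iam", "attach-user-policy",
            "--user-name", user_name,
            "--policy-arn", managed_policy_arn(name),
        ]
        for name in policy_names
    ]


if __name__ == "__main__":
    # Two names from the list above; extend with the rest as needed.
    policies = ["AWSBatchFullAccess", "AmazonVPCFullAccess"]
    for command in attach_policy_commands("saber-user", policies):
        # Print for review instead of executing.
        print(shlex.join(command))
```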