FAQs
**How do parallel jobs work in SABER?**
There are two ways that parallel jobs can function in SABER.
- Parameter sweeps: When a job parameter is swept using a sweep file, SABER creates multiple workflows that all run in parallel, each with a different combination of the parameters. Their processes are completely independent, and steps depend only on previous steps within their own sub-workflow.
- Parallel steps in a workflow: If two or more steps depend on the same step, but do not depend on each other, they will be run in parallel.
These two methods of parallelization can be combined. For example, suppose a workflow is defined as

```
      / --> s2 -- \
s1(t)              --> s4
      \ --> s3 -- /
```

where `s1(t)` depends on a parameter `t`. If you sweep over `t` such that it produces two iterations (say `t=True` and `t=False`), it will produce a DAG similar to

```
         / --> s2 -- \
s1(True)              --> s4
         \ --> s3 -- /

          / --> s2' -- \
s1'(False)              --> s4'
          \ --> s3' -- /
```
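The sweep expansion above can be sketched in Python. This is a hypothetical illustration, not SABER's internal representation: the step names match the diagram, but the `workflow` and `sweep` structures are invented for this example.

```python
from itertools import product

# One workflow template: each step lists the steps it depends on.
# These names (s1..s4) match the diagram above.
workflow = {
    "s1": [],
    "s2": ["s1"],
    "s3": ["s1"],
    "s4": ["s2", "s3"],
}

# A sweep over a single parameter t with two values.
sweep = {"t": [True, False]}

def expand_sweep(workflow, sweep):
    """Create one independent copy of the workflow per parameter combination."""
    dags = []
    keys = list(sweep)
    for values in product(*(sweep[k] for k in keys)):
        params = dict(zip(keys, values))
        # Each copy is a separate DAG; steps depend only on steps in their own copy.
        dags.append({"params": params, "dag": dict(workflow)})
    return dags

dags = expand_sweep(workflow, sweep)
print(len(dags))          # 2 independent sub-workflows
print(dags[0]["params"])  # {'t': True}
```

Each element of `dags` corresponds to one of the two sub-workflows in the diagram: the copies share no dependencies, which is why they can run fully in parallel.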
(`s2`, `s3`) and (`s2'`, `s3'`) will run in parallel, but (`s2'`, `s3'`) will only run when `s1'` completes, and so on.

Parallel jobs in AWS are managed by the compute environment chosen for AWS Batch. In SABER's default configuration, where the compute environment is managed, as many jobs as fit within the limits placed on the compute environment will run in parallel. Each job spawns a separate EC2 instance running the job, where the instance type (and cost) is based on the `ResourceRequirement` specified in the job CWL.

If you wish to put limits on the instances launched, you can do so when creating your compute environment. See here.
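The relationship between a step's CWL `ResourceRequirement` and the resources AWS Batch provisions can be sketched as below. This is only an illustration of the idea; the actual translation SABER performs lives in its cwl-to-dag tooling, and the helper function here is hypothetical.

```python
# Hypothetical mapping from CWL ResourceRequirement fields to AWS Batch
# container properties. Both CWL's ramMin and Batch's memory field are
# in mebibytes; coresMin maps to vCPUs.

def batch_resources(resource_requirement):
    """Translate a CWL ResourceRequirement dict into Batch-style resources."""
    return {
        # coresMin -> vCPUs (illustrative default of 1 if unspecified)
        "vcpus": int(resource_requirement.get("coresMin", 1)),
        # ramMin -> memory in MiB (illustrative default of 1024 if unspecified)
        "memory": int(resource_requirement.get("ramMin", 1024)),
    }

# Example: a step whose CWL requests 4 cores and 16 GiB of RAM.
req = {"class": "ResourceRequirement", "coresMin": 4, "ramMin": 16384}
print(batch_resources(req))  # {'vcpus': 4, 'memory': 16384}
```

A managed compute environment then picks an EC2 instance type large enough to satisfy these numbers, which is why a heavier `ResourceRequirement` directly raises the per-job cost.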
**Can I launch or relaunch a job directly from AWS Batch?**
Yes, as long as the job definition has already been generated and the containers pushed (i.e., you ran the cwl-to-dag script on the job and workflow CWL files). If you navigate to the AWS Batch console and go to the job definitions page, you can select a job definition and launch the job from there. You will have to specify the individual parameters.

However, if you have already launched a similar job and can find it under any of the status tabs in the Jobs section on the left of the console, you can click on the job and then click `Clone job`, which will give you the option to change individual parameters.

Note: If you do not change any of the `_saber_` parameters, the files in S3 will be overwritten. Specifically, you should change `_saber_home` or `_saber_stepname`, which specify where in S3 the files are stored. See the S3 page for details.
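The same clone-and-tweak pattern can be done programmatically. In this sketch, the job name, queue, and job definition are placeholder values; only the `_saber_` parameter names come from SABER. The point is the same as the note above: change `_saber_home` so the new run does not overwrite the previous run's files in S3.

```python
# Hypothetical sketch of cloning a Batch job with a new _saber_home.
# Job name, queue, definition, and parameter values are placeholders.

def clone_job_request(previous_params, new_saber_home):
    """Build a submit_job-style request that reuses a job's parameters
    but writes output to a different S3 location."""
    params = dict(previous_params)
    params["_saber_home"] = new_saber_home  # avoid overwriting old outputs
    return {
        "jobName": "my-step-rerun",    # placeholder name
        "jobQueue": "saber-queue",     # placeholder queue
        "jobDefinition": "my-step-def",  # placeholder definition
        "parameters": params,
    }

request = clone_job_request(
    {"input": "s3://bucket/in.dat", "_saber_home": "run1/"},
    new_saber_home="run2/",
)
# The request could then be submitted to AWS Batch, e.g.:
#   boto3.client("batch").submit_job(**request)
print(request["parameters"]["_saber_home"])  # run2/
```

All other parameters are carried over unchanged, which mirrors what the console's `Clone job` button does.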