Concurrency

dsari recognizes and supports concurrency management. Consider the following example configuration:

jobs:
  job1:
    command:
    - "/bin/sleep"
    - "90"
    schedule: "* * * * *"
  job2:
    command:
    - "/bin/sleep"
    - "90"
    schedule: "* * * * *"
  job3:
    command:
    - "/bin/sleep"
    - "90"
    schedule: "* * * * *"

Since each of these jobs takes 90 seconds to run and each is scheduled every minute, runs will overlap, with multiple jobs running at the same time. In this synthetic example that is harmless, but real jobs that conflict with each other or consume significant resources could cause problems.

This can be managed with concurrency groups. For example:

jobs:
  job1:
    command:
    - "ssh"
    - "host-a"
    - "build_job1"
    concurrency_groups:
    - host-a_group
    schedule: "* * * * *"
  job2:
    command:
    - "ssh"
    - "host-a"
    - "build_job2"
    concurrency_groups:
    - host-a_group
    schedule: "* * * * *"
  job3:
    command:
    - "ssh"
    - "host-a"
    - "build_job3"
    concurrency_groups:
    - host-a_group
    schedule: "* * * * *"
  job4:
    command:
    - "ssh"
    - "host-b"
    - "build_job4"
    concurrency_groups:
    - host-b_group
    schedule: "* * * * *"
  job5:
    command:
    - "ssh"
    - "host-b"
    - "build_job5"
    concurrency_groups:
    - host-b_group
    schedule: "* * * * *"
concurrency_groups:
  host-a_group:
    max: 2

Here, of the first three jobs ("job1", "job2" and "job3"), only two of them may be running at the same time, as they are all part of the "host-a_group" concurrency group, which is configured for a maximum of 2. "job4" and "job5" are part of the "host-b_group" concurrency group, but this group is not defined. In this case, the implicit limit is 1, so only "job4" or "job5" can be running at any given time, not both.
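The limit rule described above can be sketched in a few lines of Python. This is an illustration only, not dsari's actual implementation: the names `group_limit` and `can_start` are hypothetical, and the only behavior taken from the text is that an undefined group has an implicit maximum of 1.

```python
# Hypothetical sketch of the limit rule; not dsari's actual code.

# Mirrors the "concurrency_groups" section of the example configuration.
CONCURRENCY_GROUPS = {"host-a_group": {"max": 2}}


def group_limit(group_name, groups=CONCURRENCY_GROUPS):
    """Return the configured max for a group, or 1 if the group is undefined."""
    return groups.get(group_name, {}).get("max", 1)


def can_start(group_name, running_count, groups=CONCURRENCY_GROUPS):
    """True if another run may start in the group given its current run count."""
    return running_count < group_limit(group_name, groups)
```

With this sketch, `group_limit("host-b_group")` falls back to 1 because "host-b_group" is not defined, so a second run in that group would have to wait.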

Jobs may belong to multiple concurrency groups, which can allow for run distribution. For example:

jobs:
  job1:
    command:
    - "build_job_wrapper"
    concurrency_groups:
    - host-a_group
    - host-b_group
    schedule: "* * * * *"
  job2:
    command:
    - "build_job_wrapper"
    concurrency_groups:
    - host-a_group
    - host-b_group
    schedule: "* * * * *"
  job3:
    command:
    - "build_job_wrapper"
    concurrency_groups:
    - host-a_group
    - host-b_group
    schedule: "* * * * *"
  job4:
    command:
    - "build_job_wrapper"
    concurrency_groups:
    - host-a_group
    - host-b_group
    schedule: "* * * * *"
  job5:
    command:
    - "build_job_wrapper"
    concurrency_groups:
    - host-a_group
    - host-b_group
    schedule: "* * * * *"
concurrency_groups:
  host-a_group:
    max: 2
  host-b_group:
    max: 1

Here, the scheduler picks a viable concurrency group for each run from the job's configured groups, ensuring that no more than 2 jobs run as part of "host-a_group" and no more than 1 job runs as part of "host-b_group". Each run has the CONCURRENCY_GROUP environment variable set to the picked group, which our example build_job_wrapper can use (along with JOB_NAME and RUN_ID) to determine what to do: in this example, whether to SSH to host-a or host-b.
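As a rough idea of what such a wrapper might look like, here is a minimal Python sketch. The text does not show build_job_wrapper itself, so everything here is an assumption made for illustration: the `GROUP_HOSTS` mapping, the `build_command` helper, and the remote `build_<job name>` command naming (borrowed from the earlier ssh example). Only the CONCURRENCY_GROUP, JOB_NAME and RUN_ID variable names come from the text.

```python
# Hypothetical build_job_wrapper logic; the host mapping and the remote
# "build_<job name>" command naming are assumptions for illustration.

# Assumed mapping from the picked concurrency group to an SSH target.
GROUP_HOSTS = {
    "host-a_group": "host-a",
    "host-b_group": "host-b",
}


def build_command(env):
    """Build the ssh command for the run described by the environment mapping."""
    group = env["CONCURRENCY_GROUP"]  # set to the group picked for this run
    job_name = env["JOB_NAME"]
    host = GROUP_HOSTS[group]  # raises KeyError for an unmapped group
    return ["ssh", host, "build_" + job_name]
```

A real wrapper script would likely pass `os.environ` to `build_command` and then execute the result, for example with `os.execvp`, and could include RUN_ID in any log output it produces.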

The above example can be simplified using job groups, since all five jobs share the same configuration:

job_groups:
  build-jobs:
    job_names:
    - job1
    - job2
    - job3
    - job4
    - job5
    command:
    - "build_job_wrapper"
    concurrency_groups:
    - host-a_group
    - host-b_group
    schedule: "* * * * *"
concurrency_groups:
  host-a_group:
    max: 2
  host-b_group:
    max: 1

If a scheduled run cannot be started due to concurrency limits, it is retried with exponential backoff (at most 5 minutes between attempts) until a concurrency group is free. A job configured for multiple concurrency groups is backed off only when all of its groups have reached their max.
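The retry behavior can be pictured with the following Python sketch. It is not dsari's actual code: only the 5-minute cap and the "all groups full" condition come from the text, while the 5-second initial delay, the doubling factor, the first-match selection order, and the function names `backoff_delay` and `pick_group` are assumptions.

```python
# Hypothetical sketch of capped exponential backoff; not dsari's actual code.
# Only the 5-minute cap comes from the text. The initial delay and the
# doubling factor are assumptions.

MAX_DELAY_SECONDS = 5 * 60


def backoff_delay(attempt, initial=5.0, factor=2.0, cap=MAX_DELAY_SECONDS):
    """Seconds to wait before retry number `attempt` (0-based), capped at `cap`."""
    return min(cap, initial * factor ** attempt)


def pick_group(job_groups, running, limits):
    """Return the first group with spare capacity, or None if all are full.

    `running` maps group name to current run count; `limits` maps group name
    to its max. Undefined groups use the implicit limit of 1.
    """
    for group in job_groups:
        if running.get(group, 0) < limits.get(group, 1):
            return group
    return None  # every group is at its max, so the run is backed off
```

Under these assumed parameters the delays would grow as 5, 10, 20, 40 seconds and so on, levelling off at 300 seconds, and a run would only be delayed when `pick_group` returns `None`.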