feat(batch): array jobs — real child containers + parent aggregation (batch 3)#1987

Merged
vieiralucas merged 1 commit into main from wt-batch-3
Jun 27, 2026

vieiralucas commented Jun 27, 2026

Summary

AWS Batch batch 3 — array jobs (the highest-demand Batch feature after real execution).

  • SubmitJob with arrayProperties.size = N spawns N child jobs <parent>:<index>, each a real container with AWS_BATCH_JOB_ARRAY_INDEX injected so it selects its slice of work, launched on the ECS engine via the batch-2 path.
  • The array parent's status + arrayProperties.statusSummary aggregate the children live at DescribeJobs time — SUCCEEDED only when every child exits 0, FAILED if any child failed once all are terminal, else RUNNING/PENDING.
  • Refactored the ECS-launch-or-park logic into a shared launch_job (single + array submits).
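The aggregation rule in the second bullet can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the `JobStatus` enum, `parent_status`, and `status_summary` are hypothetical names, the status set is reduced to four values, and the choice between RUNNING and PENDING (RUNNING once any child has left PENDING) is an assumption, since the description does not spell it out.

```rust
use std::collections::BTreeMap;

// Simplified status set for illustration; the real service has more states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobStatus {
    Pending,
    Running,
    Succeeded,
    Failed,
}

impl JobStatus {
    fn is_terminal(self) -> bool {
        matches!(self, JobStatus::Succeeded | JobStatus::Failed)
    }

    fn as_str(self) -> &'static str {
        match self {
            JobStatus::Pending => "PENDING",
            JobStatus::Running => "RUNNING",
            JobStatus::Succeeded => "SUCCEEDED",
            JobStatus::Failed => "FAILED",
        }
    }
}

// Count children per status, in the spirit of arrayProperties.statusSummary.
fn status_summary(children: &[JobStatus]) -> BTreeMap<&'static str, usize> {
    let mut summary: BTreeMap<&'static str, usize> = BTreeMap::new();
    for child in children {
        *summary.entry(child.as_str()).or_insert(0) += 1;
    }
    summary
}

// Derive the parent's status from its children at describe time.
fn parent_status(children: &[JobStatus]) -> JobStatus {
    // Assumption: a parent with no children yet is reported as PENDING.
    if children.is_empty() {
        return JobStatus::Pending;
    }
    if children.iter().all(|c| c.is_terminal()) {
        // All children finished: any failure fails the parent.
        if children.iter().any(|c| *c == JobStatus::Failed) {
            JobStatus::Failed
        } else {
            JobStatus::Succeeded
        }
    } else if children.iter().any(|c| *c != JobStatus::Pending) {
        // Assumption: the parent is RUNNING once any child has started.
        JobStatus::Running
    } else {
        JobStatus::Pending
    }
}
```

Under these assumptions, a parent with one failed child and one still-running child stays RUNNING until the last child reaches a terminal state, and only then becomes FAILED.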

Test plan

  • Unit (13): array spawn + parent aggregation, status-summary terminal logic, AWS_BATCH_JOB_ARRAY_INDEX injection, plus existing control-plane/resource-mapping.
  • Docker-gated e2e: array_job_runs_every_child_and_parent_succeeds — array size 3 runs three real alpine containers → parent SUCCEEDED, statusSummary.SUCCEEDED == 3. Passes locally.
  • cargo clippy -p fakecloud-batch -p fakecloud-e2e --all-targets -- -D warnings clean. Docs updated.
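For readers unfamiliar with array jobs, the spawn and index-injection behavior covered by the unit tests might look something like the sketch below. The `ChildJob` struct and `spawn_children` function are hypothetical names for illustration and are not taken from the fakecloud-batch crate; only the `<parent>:<index>` ID shape and the `AWS_BATCH_JOB_ARRAY_INDEX` variable come from the description above.

```rust
// Environment variable each child reads to select its slice of work.
const ARRAY_INDEX_VAR: &str = "AWS_BATCH_JOB_ARRAY_INDEX";

// Hypothetical record for one child of an array job.
#[derive(Debug, Clone, PartialEq)]
struct ChildJob {
    job_id: String,
    array_index: u32,
    env: Vec<(String, String)>,
}

// Build `size` children named `<parent>:<index>`, each inheriting the
// parent's environment plus its own array index.
fn spawn_children(parent_id: &str, size: u32, base_env: &[(String, String)]) -> Vec<ChildJob> {
    (0..size)
        .map(|index| {
            // Drop any caller-supplied index so the injected value wins.
            let mut env: Vec<(String, String)> = base_env
                .iter()
                .filter(|(key, _)| key.as_str() != ARRAY_INDEX_VAR)
                .cloned()
                .collect();
            env.push((ARRAY_INDEX_VAR.to_string(), index.to_string()));
            ChildJob {
                job_id: format!("{}:{}", parent_id, index),
                array_index: index,
                env,
            }
        })
        .collect()
}
```

Calling `spawn_children("job-1", 3, &env)` in this sketch yields children `job-1:0`, `job-1:1`, and `job-1:2`, each carrying its own index as a string in the environment.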

Next

Job dependencies (dependsOn), retry strategies + timeouts, then a CloudFormation AWS::Batch::* provisioner + tfacc.


Summary by cubic

Add AWS Batch array jobs: SubmitJob with arrayProperties.size spawns real child containers and the parent aggregates live status. Also refactors the ECS launch flow into a shared launch_job used by single and array submits.

  • New Features

    • SubmitJob with arrayProperties.size=N spawns N real child jobs <parent>:<index>, each with AWS_BATCH_JOB_ARRAY_INDEX.
    • Parent aggregates at DescribeJobs: SUCCEEDED only if all children exit 0; FAILED if any failed once all are terminal; otherwise RUNNING/PENDING.
  • Refactors

    • Extracted ECS launch/park logic into launch_job, shared by single and array submits.

Written for commit 8ffa257. Summary will update on new commits.

…(batch 3)

- SubmitJob with arrayProperties.size=N spawns N child jobs `<parent>:<index>`,
  each a real container with AWS_BATCH_JOB_ARRAY_INDEX injected so it selects
  its slice of work, launched on the ECS engine via the batch-2 path.
- The array parent is a tracking record whose status + arrayProperties.
  statusSummary are computed live from its children at DescribeJobs time
  (SUCCEEDED only when every child exits 0; FAILED if any child failed once all
  are terminal; RUNNING/PENDING otherwise).
- Extracted the ECS-launch-or-park logic into `launch_job`, shared by single
  and array submits.

Tests: unit (array spawn + parent aggregation, status-summary terminal logic,
AWS_BATCH_JOB_ARRAY_INDEX injection); Docker-gated e2e (array size 3 runs three
real alpine containers -> parent SUCCEEDED, statusSummary SUCCEEDED=3). Docs
updated.

vieiralucas merged commit b315195 into main Jun 27, 2026
103 checks passed
vieiralucas deleted the wt-batch-3 branch June 27, 2026 05:29