
fix(batch): ListJobs semantics, tag read/write parity, queue validation#1993

Merged: vieiralucas merged 1 commit into main from wt-batch-correct on Jun 27, 2026

@vieiralucas vieiralucas commented Jun 27, 2026

AWS Batch correctness cluster from the 2026-06-27 bug-hunt (Tier 1: findings 1.1-1.6).

ListJobs (crates/fakecloud-batch/src/service.rs)

  • 1.1 Default to RUNNING when no jobStatus is given. AWS: "If you don't specify a status, only RUNNING jobs are returned." Previously ListJobs returned jobs in every status, so pollers over-counted running jobs.
  • 1.2 Honor the arrayJobId / multiNodeJobId selectors and require exactly one of jobQueue/arrayJobId/multiNodeJobId (AWS rejects otherwise). Enumerating an array job's children works now.
  • 1.4 Paginate via maxResults (<=100) + opaque nextToken.
  • 1.5 Emit the full JobSummary (jobArn/statusReason/startedAt/stoppedAt/container/arrayProperties/jobDefinition), not just jobId/name/status/createdAt.
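
The ListJobs rules above can be sketched roughly as follows. This is a minimal illustration, not the code in `service.rs`: the `Job` and `ListJobsRequest` types and the `list_jobs` function are hypothetical, the token is a stringified offset (the real token is opaque), out-of-range `maxResults` is clamped rather than rejected, the status filter is assumed to apply to every selector, and only job IDs are returned instead of the full JobSummary.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum JobStatus { Submitted, Pending, Runnable, Starting, Running, Succeeded, Failed }

// Hypothetical in-memory job record; the real crate's types may differ.
#[derive(Clone, Debug)]
struct Job {
    id: String,
    queue: String,
    array_parent: Option<String>,
    multi_node_parent: Option<String>,
    status: JobStatus,
}

#[derive(Default)]
struct ListJobsRequest {
    job_queue: Option<String>,
    array_job_id: Option<String>,
    multi_node_job_id: Option<String>,
    job_status: Option<JobStatus>,
    max_results: Option<usize>,
    next_token: Option<String>,
}

/// Returns one page of matching job IDs plus an optional next token.
fn list_jobs(
    jobs: &[Job],
    req: &ListJobsRequest,
) -> Result<(Vec<String>, Option<String>), String> {
    // Exactly one of the three selectors must be present.
    let selectors = [&req.job_queue, &req.array_job_id, &req.multi_node_job_id];
    if selectors.iter().filter(|s| s.is_some()).count() != 1 {
        return Err(
            "ClientException: specify exactly one of jobQueue, arrayJobId, multiNodeJobId"
                .to_string(),
        );
    }

    // No jobStatus means RUNNING only.
    let status = req.job_status.unwrap_or(JobStatus::Running);

    let matching: Vec<&Job> = jobs
        .iter()
        .filter(|j| {
            let selected = if let Some(q) = &req.job_queue {
                &j.queue == q
            } else if let Some(id) = &req.array_job_id {
                j.array_parent.as_ref() == Some(id)
            } else {
                j.multi_node_parent.as_ref() == req.multi_node_job_id.as_ref()
            };
            selected && j.status == status
        })
        .collect();

    // Assumption: the token is just the offset of the next item.
    let start = match &req.next_token {
        Some(t) => t
            .parse::<usize>()
            .map_err(|_| "ClientException: invalid nextToken".to_string())?,
        None => 0,
    };
    // Assumption: clamp to 1..=100 instead of rejecting out-of-range values.
    let limit = req.max_results.unwrap_or(100).clamp(1, 100);

    let page: Vec<String> = matching
        .iter()
        .skip(start)
        .take(limit)
        .map(|j| j.id.clone())
        .collect();
    let end = start + page.len();
    let next = if end < matching.len() { Some(end.to_string()) } else { None };
    Ok((page, next))
}
```

A caller would loop, feeding the returned token back in as `next_token` until it comes back as `None`.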

Tags single-store (1.3)

Create wrote tags inline on the resource, while TagResource/UntagResource wrote to a separate per-ARN store. Describe* read only the inline tags and ListTagsForResource read only the store, so the two diverged and Terraform tag updates never converged ("inconsistent result after apply"). Now the store is authoritative: create seeds it from the inline tags, and every Describe* overlays it.
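
As a rough sketch of the single-store idea (not the actual implementation): every tag write goes to one per-ARN map, create seeds that map, and Describe reads through it. The `BatchState` and `JobQueue` types, the method names, and the ARN format here are illustrative assumptions.

```rust
use std::collections::HashMap;

type Tags = HashMap<String, String>;

// Hypothetical resource record; `tags` holds the create-time (inline) tags.
#[derive(Clone, Debug)]
struct JobQueue {
    arn: String,
    name: String,
    tags: Tags,
}

#[derive(Default)]
struct BatchState {
    queues: HashMap<String, JobQueue>, // keyed by queue name
    tag_store: HashMap<String, Tags>,  // keyed by ARN; the authoritative copy
}

impl BatchState {
    /// Create seeds the per-ARN store from the inline tags.
    fn create_job_queue(&mut self, name: &str, tags: Tags) -> String {
        // Assumption: placeholder region and account id.
        let arn = format!("arn:aws:batch:us-east-1:000000000000:job-queue/{name}");
        self.tag_store.insert(arn.clone(), tags.clone());
        self.queues.insert(
            name.to_string(),
            JobQueue { arn: arn.clone(), name: name.to_string(), tags },
        );
        arn
    }

    fn tag_resource(&mut self, arn: &str, tags: Tags) {
        self.tag_store.entry(arn.to_string()).or_default().extend(tags);
    }

    fn untag_resource(&mut self, arn: &str, keys: &[&str]) {
        if let Some(tags) = self.tag_store.get_mut(arn) {
            for key in keys {
                tags.remove(*key);
            }
        }
    }

    fn list_tags_for_resource(&self, arn: &str) -> Tags {
        self.tag_store.get(arn).cloned().unwrap_or_default()
    }

    /// Describe overlays the store, so it always agrees with ListTagsForResource.
    fn describe_job_queue(&self, name: &str) -> Option<JobQueue> {
        let mut queue = self.queues.get(name)?.clone();
        queue.tags = self.list_tags_for_resource(&queue.arn);
        Some(queue)
    }
}
```

With this shape, a TagResource call followed by a Describe returns the updated tags, which is the read-after-write behavior Terraform checks after an apply.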

SubmitJob and DescribeComputeEnvironments (1.6)

SubmitJob rejects a non-existent jobQueue (AWS ClientException); DescribeComputeEnvironments reports containerOrchestrationType ("ECS").
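
The queue check might look something like the sketch below. The `BatchError` enum, the `submit_job` signature, the error message, and the returned job ID are all hypothetical, and the lookup is by queue name only (AWS also accepts a queue ARN).

```rust
use std::collections::HashSet;

// Hypothetical error type standing in for the AWS ClientException.
#[derive(Debug, PartialEq)]
enum BatchError {
    ClientException(String),
}

/// Rejects the submission when the named queue has not been created.
fn submit_job(
    queues: &HashSet<String>,
    job_name: &str,
    job_queue: &str,
) -> Result<String, BatchError> {
    if !queues.contains(job_queue) {
        // Assumption: the message text is illustrative, not the AWS wording.
        return Err(BatchError::ClientException(format!(
            "job queue {job_queue} does not exist"
        )));
    }
    // Placeholder job id; a real implementation would generate a unique one.
    Ok(format!("{job_name}-0001"))
}
```

This is also why tests that submit jobs have to create their queue first.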

Tests

New tags-parity test (create-inline -> ListTags; TagResource -> Describe; containerOrchestrationType). New ListJobs assertions for default-RUNNING, the arrayJobId selector, and the no-selector error. Existing tests updated to create their queue first. `cargo test -p fakecloud-batch`: 18 pass; batch control-plane + CFN e2e green.


Summary by cubic

Aligns fake AWS Batch API with real AWS behavior to improve polling accuracy and tag consistency. Fixes ListJobs semantics, unifies tag read/write, validates queues, and adds a missing compute environment field.

  • Bug Fixes
    • ListJobs now matches AWS: default to RUNNING when no jobStatus; require exactly one selector (jobQueue/arrayJobId/multiNodeJobId) and support array/multi-node selectors; paginate with maxResults (<=100) + nextToken; return full JobSummary fields.
    • Tags: make the per-ARN tag store authoritative; seed it from create-time tags and overlay on Describe* reads to keep Describe* and ListTagsForResource in sync (fixes Terraform tag drift).
    • SubmitJob rejects non-existent jobQueue with a ClientException.
    • DescribeComputeEnvironments includes containerOrchestrationType: "ECS".

Written for commit 6303981. Summary will update on new commits.

…validation

ListJobs (crates/fakecloud-batch/src/service.rs):
- Default to RUNNING when no jobStatus is given (AWS: "if you don't
  specify a status, only RUNNING jobs are returned") instead of returning
  every status.
- Honor the arrayJobId / multiNodeJobId selectors and require exactly one
  of jobQueue / arrayJobId / multiNodeJobId, matching AWS.
- Paginate via maxResults (<=100) + an opaque nextToken.
- Emit full JobSummary fields (jobArn/statusReason/startedAt/stoppedAt/
  container/arrayProperties/jobDefinition), not just jobId/name/status/createdAt.

Tags: create wrote tags inline on the resource while TagResource/UntagResource
wrote a separate per-ARN store, so Describe* and ListTagsForResource disagreed
(terraform tag updates never converged). Seed the store from inline create
tags and overlay it on every Describe* read, making the store authoritative.

Also: SubmitJob now rejects a non-existent jobQueue (AWS ClientException), and
DescribeComputeEnvironments reports containerOrchestrationType ("ECS").

Tests: new tags-parity test; ListJobs default-RUNNING + arrayJobId selector +
no-selector-error assertions; existing tests updated to create their queue.
@vieiralucas vieiralucas merged commit 0839474 into main Jun 27, 2026
104 checks passed
@vieiralucas vieiralucas deleted the wt-batch-correct branch June 27, 2026 10:57