Pipeline reports failure when Copilot CLI hits rate limit after successful completion #21644

@Mossaka

Bug

When Copilot CLI hits a rate limit AFTER successfully completing all work (build, test, create-issue), it exits with code 1. The compiled workflow propagates this as a job failure, even though the agent's work was fully successful.

Root Cause

Two issues in the compiled workflow YAML:

  1. Main agent step (copilot_engine_execution.go): No continue-on-error. Copilot CLI exits 1 on rate limit even after successful completion. Safe outputs are already persisted via MCP before the rate limit hits.

  2. Threat detection step (threat_detection.go): Also lacks continue-on-error. When this step fails (rate limit), it fails the agent job even though "Set detection conclusion" handles failures gracefully.

Suggested Fix

  • Check whether safeoutputs.jsonl has content despite the non-zero exit; if the agent produced outputs, treat the run as a success
  • Add continue-on-error: true to threat detection step (conclusion step already handles failures)
  • Or: have Copilot CLI distinguish rate-limit exits (exit 0 + warning) from real failures (exit 1)

Impact

Inflates the failure count in batch experiment runs. At least 3 repos were miscounted as failures in v6.
