Problem
Per the OWASP Agentic Top 10 — ASI-08 (Cascading Failures & Denial-of-Wallet), agentic workflows should have circuit breakers to prevent runaway execution and cost accumulation when workflows fail repeatedly.
Current behavior: A workflow that fails 100 consecutive times will continue to trigger and execute on every event. There is no failure budget or automatic disabling mechanism.
Existing Failure Safeguards
| Mechanism | File | What it does | Limitation |
|---|---|---|---|
| stop-after | stop_after.go | Time-based cutoff (e.g., "+6h") | Time-based, not failure-based |
| Concurrency | concurrency.go | Limits concurrent runs | Manages parallelism, not failure budget |
| Secret validation | compiler_activation_job_builder.go | Validates tokens exist | Checks format, not validity |
None of these prevent repeated execution of a failing workflow.
Parent Issue
Part of #28770 (OWASP Agentic Top 10 Compliance Evaluation)
Proposed Solution
Frontmatter Configuration
Add a circuit-breaker field to frontmatter:
---
circuit-breaker:
  max-consecutive-failures: 5  # Open circuit after N consecutive failures
  time-window: 24h             # Only count failures within this window
  cooldown: 1h                 # Time to wait before allowing retry after circuit opens
  notify: true                 # Post workflow annotation when circuit opens
---
Defaults (when circuit-breaker is not specified):
- Disabled by default (backward compatible)
- Can be enabled globally via a feature flag: features.circuit-breaker: true (uses sensible defaults: 5 failures, 24h window, 1h cooldown)
Implementation Architecture
1. Failure Counter (GitHub Actions Artifacts)
Use workflow run artifacts to persist the failure counter across runs:
Run N (success) → upload artifact: {consecutive_failures: 0, last_success: <timestamp>}
Run N+1 (fail) → download prev artifact → upload: {consecutive_failures: 1, last_failure: <timestamp>}
Run N+2 (fail) → download prev artifact → upload: {consecutive_failures: 2, last_failure: <timestamp>}
...
Run N+5 (fail) → consecutive_failures >= max → CIRCUIT OPEN → skip activation
Run N+6 (trigger) → check cooldown → if elapsed: allow one retry (half-open)
2. Activation Job Integration
Add a circuit breaker check step before the activation condition in compiler_activation_job_builder.go:
// In activationJobBuildContext (L17-41)
type activationJobBuildContext struct {
	// ... existing fields ...
	circuitBreakerConfig *CircuitBreakerConfig
}
New step in activation job (before validate-secret):
- name: Check circuit breaker
  id: check-circuit-breaker
  uses: actions/download-artifact@v4
  # Download previous run's failure counter
  # (downloading from another run requires the run-id and github-token inputs)
  # Compare against threshold
  # Output: is_open=true/false, consecutive_failures=N
Activation condition becomes:
if: >-
  ${{ steps.check-circuit-breaker.outputs.is_open != 'true' &&
  <existing-activation-conditions> }}
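On the compiler side, composing the condition above might look like the following sketch. `withCircuitBreakerCondition` is a hypothetical helper, not an existing function in the codebase. It wraps the existing conditions in parentheses so that any `||` inside them cannot bypass the circuit breaker guard.

```go
package main

import "fmt"

// circuitBreakerGuard is the expression taken from the proposed activation condition.
const circuitBreakerGuard = "steps.check-circuit-breaker.outputs.is_open != 'true'"

// withCircuitBreakerCondition (hypothetical) prefixes the existing activation
// expression with the circuit breaker guard.
func withCircuitBreakerCondition(existing string) string {
	if existing == "" {
		return circuitBreakerGuard
	}
	// Parenthesize so operator precedence in the existing expression is preserved.
	return fmt.Sprintf("%s && (%s)", circuitBreakerGuard, existing)
}
```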
3. Post-Execution Counter Update
Add a step at the end of the agent job to update the failure counter:
- name: Update circuit breaker counter
  if: always()
  run: |
    NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    if [ "${{ job.status }}" = "success" ]; then
      echo "{\"consecutive_failures\": 0, \"last_success\": \"$NOW\"}" > circuit-breaker-state.json
    else
      # Default to 0 when no previous counter is available
      PREV="${{ steps.check-circuit-breaker.outputs.consecutive_failures }}"
      PREV=${PREV:-0}
      echo "{\"consecutive_failures\": $((PREV + 1)), \"last_failure\": \"$NOW\"}" > circuit-breaker-state.json
    fi
- uses: actions/upload-artifact@v4
  if: always()
  with:
    name: circuit-breaker-state
    path: circuit-breaker-state.json
4. Circuit Breaker States
Follow the standard circuit breaker pattern:
CLOSED (normal) ──[N consecutive failures]──→ OPEN (blocking)
                                                  │
                                          [cooldown elapsed]
                                                  │
                                                  ▼
                                          HALF-OPEN (probe)
                                            │           │
                                        [success]   [failure]
                                            │           │
                                            ▼           ▼
                                          CLOSED       OPEN
Key Files to Modify
| File | Change |
|---|---|
| pkg/workflow/frontmatter_types.go | Add CircuitBreaker *CircuitBreakerConfig field |
| pkg/workflow/circuit_breaker.go | New file: config parsing, step generation |
| pkg/workflow/compiler_activation_job_builder.go | Add circuit breaker check step before activation |
| pkg/workflow/compiler_yaml_main_job.go | Add counter update step after agent execution |
| pkg/parser/schemas/ | Add circuit-breaker to frontmatter JSON schema |
| actions/setup/sh/check_circuit_breaker.sh | Runtime script for state checking |
Pattern to Follow
Follow the same pattern as stop_after.go:
- extractCircuitBreakerConfig(): parse frontmatter
- generateCircuitBreakerSteps(): generate activation/post-execution steps
- Integration in buildActivationJob() and main job builder
Acceptance Criteria
- circuit-breaker frontmatter field parsed and validated
- features.circuit-breaker: true enables the circuit breaker with defaults