feat: fallback targets for provider errors #899
Open
Labels
in-progress: Claimed by an agent, do not duplicate work
Description
Problem
When a target provider returns errors (rate limits, outages, transient failures), the eval fails immediately. The existing retry config (retry_initial_delay_ms, retry_backoff_factor, retry_status_codes) retries the same provider, but if that provider is down or heavily rate-limited, retries just burn time.
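To make the cost of same-provider retries concrete, here is a minimal sketch of the wait schedule implied by retry_initial_delay_ms and retry_backoff_factor. The max_retries parameter and the example values (1000 ms, factor 2.0, 4 retries) are assumptions for illustration and are not taken from the existing config.

```python
from typing import List


def backoff_delays_ms(
    retry_initial_delay_ms: int,
    retry_backoff_factor: float,
    max_retries: int,  # hypothetical knob, not named in this issue
) -> List[int]:
    """Return the wait before each retry, growing geometrically."""
    return [
        int(retry_initial_delay_ms * retry_backoff_factor ** attempt)
        for attempt in range(max_retries)
    ]


delays = backoff_delays_ms(1000, 2.0, 4)
print(delays, sum(delays))  # [1000, 2000, 4000, 8000] 15000
```

With these assumed values, a single request against a provider that stays down spends 15 seconds waiting before it fails anyway.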
Proposal
Add a fallback_targets field to target definitions in targets.yaml:
    - name: default
      provider: openai
      base_url: https://models.github.ai/inference/v1
      api_key: ${{ GH_MODELS_TOKEN }}
      model: ${{ GH_MODELS_MODEL }}
      fallback_targets:
        - gemini-flash
        - azure-llm

When the primary target returns a retryable error (429, 503, connection timeout), the runner should:
- Retry with exponential backoff on the primary (existing behavior)
- After exhausting retries, try fallback_targets in order
- Record which target actually served the response in the result JSONL
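The three steps above could be sketched roughly as follows. This is not the runner's actual code: ProviderError, Target, send, and max_retries are hypothetical names introduced for illustration, a status_code of None stands in for a connection timeout, and the sketch assumes each fallback target also gets its own retries.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Optional

# Default trigger codes, taken from the "Status codes" design consideration.
DEFAULT_FALLBACK_CODES: FrozenSet[int] = frozenset({429, 502, 503})


class ProviderError(Exception):
    """Hypothetical error type; status_code=None models a connection timeout."""

    def __init__(self, status_code: Optional[int]):
        super().__init__(f"provider error: {status_code}")
        self.status_code = status_code


@dataclass
class Target:
    name: str
    send: Callable[[str], str]  # hypothetical: one request, raises ProviderError
    fallback_targets: List[str] = field(default_factory=list)


def is_retryable(err: ProviderError, codes: FrozenSet[int]) -> bool:
    return err.status_code is None or err.status_code in codes


def call_with_retries(target, prompt, max_retries, initial_delay_ms,
                      backoff_factor, codes, sleep):
    """Step 1: retry the same target with exponential backoff."""
    delay_ms = initial_delay_ms
    for attempt in range(max_retries + 1):
        try:
            return target.send(prompt)
        except ProviderError as err:
            # Give up on non-retryable errors or once retries are exhausted.
            if not is_retryable(err, codes) or attempt == max_retries:
                raise
            sleep(delay_ms / 1000)
            delay_ms *= backoff_factor
    raise AssertionError("unreachable")


def call_with_fallback(targets: Dict[str, Target], primary: str, prompt: str,
                       max_retries: int = 2, initial_delay_ms: float = 1000,
                       backoff_factor: float = 2.0,
                       codes: FrozenSet[int] = DEFAULT_FALLBACK_CODES,
                       sleep: Callable[[float], None] = time.sleep) -> dict:
    """Steps 2 and 3: walk fallback_targets in order and record target_used."""
    chain = [primary, *targets[primary].fallback_targets]
    last_error: Optional[ProviderError] = None
    for name in chain:
        try:
            response = call_with_retries(targets[name], prompt, max_retries,
                                         initial_delay_ms, backoff_factor,
                                         codes, sleep)
            return {"response": response, "target_used": name}
        except ProviderError as err:
            if not is_retryable(err, codes):
                raise  # e.g. a 400 should fail the request, not trigger fallback
            last_error = err
    raise last_error  # every target in the chain failed with a retryable error
```

Passing sleep as a parameter is only a convenience so the loop can be exercised in tests without real delays.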
Similarly for the agent target:
    - name: agent
      provider: ${{ AGENT_PROVIDER }}
      model: ${{ AGENT_MODEL }}
      grader_target: grader
      fallback_targets:
        - claude
        - copilot-cli

Design considerations
- Grader fallback: The grader target used by grader_target should also support fallbacks, since LLM-as-judge calls hit the same rate limits
- Status codes: Which errors trigger fallback should be configurable (default: 429, 503, 502)
- Result attribution: The result JSONL should record target_used so users know which provider actually ran
- Scope: Fallback applies per-request, not per-eval; different test cases in the same eval could hit different targets
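As a rough illustration of the grader-fallback and result-attribution points, the sketch below resolves the ordered target chain from already-parsed targets.yaml entries and builds one JSONL line per request. Only fallback_targets, grader_target, and target_used come from this issue; the helper names, the test_id and target fields, and the grader's own fallback list are hypothetical. It also assumes fallbacks are resolved one level deep (a fallback's own fallback_targets are not followed).

```python
import json
from typing import Dict, List


def resolve_chain(targets: Dict[str, dict], name: str) -> List[str]:
    """Return [name, *fallbacks] in order, rejecting unknown target names."""
    if name not in targets:
        raise KeyError(f"unknown target {name!r}")
    chain = [name]
    for fallback in targets[name].get("fallback_targets", []):
        if fallback not in targets:
            raise ValueError(f"target {name!r} lists unknown fallback {fallback!r}")
        if fallback not in chain:  # skip duplicates and self-references
            chain.append(fallback)
    return chain


def result_line(test_id: str, target: str, target_used: str) -> str:
    """Serialize one per-request result; field names other than target_used are assumed."""
    return json.dumps(
        {"test_id": test_id, "target": target, "target_used": target_used}
    )


# Parsed targets.yaml entries, reduced to the fields this sketch needs.
targets = {
    "agent": {"grader_target": "grader", "fallback_targets": ["claude", "copilot-cli"]},
    "grader": {"fallback_targets": ["gemini-flash"]},  # hypothetical grader fallbacks
    "claude": {},
    "copilot-cli": {},
    "gemini-flash": {},
}

agent_chain = resolve_chain(targets, "agent")
grader_chain = resolve_chain(targets, targets["agent"]["grader_target"])

# Fallback is per-request, so two test cases in one eval may record different targets.
print(result_line("case-1", "agent", agent_chain[0]))
print(result_line("case-2", "agent", agent_chain[1]))
```

Validating names at load time means a typo in fallback_targets surfaces before any request is sent rather than in the middle of an eval.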
Prior art
- OpenRouter does provider fallback automatically across their model pool
- Vercel AI SDK supports fallback() provider wrapper
- LiteLLM has fallbacks config for provider failover