feat: per-agent retry policies #86

Draft
Copilot wants to merge 4 commits into main from
copilot/feature-per-agent-retry-policies

Copilot AI commented Apr 10, 2026

Adds declarative per-agent retry configuration so transient provider failures don't kill entire workflows. Previously the only recovery was checkpoint/resume.

Schema

New RetryPolicy model on AgentDef.retry:

agents:
  - name: analyzer
    model: gpt-5.2
    retry:
      max_attempts: 3        # 1-10, default 1 (no retry)
      backoff: exponential    # or "fixed"
      delay_seconds: 2        # base delay, 0-300
      retry_on:
        - provider_error      # API 500s, rate limits
        - timeout             # agent-level timeouts

Validation errors (schema mismatches) are never retried. The retry field is rejected on script and human_gate agents.
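
For reviewers who want the constraints in one place, here is a minimal sketch of the bounds listed in the YAML comments above. The real RetryPolicy is a Pydantic model in config/schema.py; this version uses a plain dataclass so it runs without dependencies. The defaults for backoff, delay_seconds, and retry_on, the `agent_type` parameter, and the helper `check_retry_allowed` are assumptions made for illustration and are not taken from the diff.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ALLOWED_BACKOFF = {"fixed", "exponential"}
ALLOWED_RETRY_ON = {"provider_error", "timeout"}


@dataclass
class RetryPolicy:
    """Illustrative stand-in for the Pydantic model described in this PR."""

    max_attempts: int = 1            # 1-10, default 1 means no retry
    backoff: str = "exponential"     # assumed default
    delay_seconds: float = 2.0       # assumed default, valid range 0-300
    retry_on: List[str] = field(     # assumed default: both categories
        default_factory=lambda: ["provider_error", "timeout"]
    )

    def __post_init__(self) -> None:
        if not 1 <= self.max_attempts <= 10:
            raise ValueError("max_attempts must be between 1 and 10")
        if self.backoff not in ALLOWED_BACKOFF:
            raise ValueError("backoff must be 'fixed' or 'exponential'")
        if not 0 <= self.delay_seconds <= 300:
            raise ValueError("delay_seconds must be between 0 and 300")
        unknown = set(self.retry_on) - ALLOWED_RETRY_ON
        if unknown:
            raise ValueError(f"unsupported retry_on values: {sorted(unknown)}")


def check_retry_allowed(agent_type: str, retry: Optional[RetryPolicy]) -> None:
    """Hypothetical helper: reject retry on agent types that cannot use it."""
    if retry is not None and agent_type in {"script", "human_gate"}:
        raise ValueError(f"{agent_type} agents cannot define retry")
```

Under this sketch, `RetryPolicy(max_attempts=3, backoff="fixed", delay_seconds=2)` is accepted, while `RetryPolicy(max_attempts=11)` raises ValueError.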

Provider changes (copilot.py, claude.py — parity maintained)

  • _resolve_retry_config(agent) — merges agent-level RetryPolicy into provider-level RetryConfig, falls back to provider default when unset
  • _classify_error(error) — categorizes exceptions as "provider_error" or "timeout" (checks exception type, status code 408, message content)
  • _calculate_delay — now supports "fixed" backoff in addition to exponential
  • _execute_with_retry — checks retry_on filter, emits agent_retry events via event_callback
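
The interaction between these helpers can be sketched roughly as follows. This is not the code in copilot.py or claude.py: the function names drop the leading underscore, the exponential formula (base delay doubled per attempt, no jitter or cap), the message-content check, the event payload shape, the injectable `sleep` parameter, and the `SchemaValidationError` class are all assumptions made for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional, Sequence


class SchemaValidationError(Exception):
    """Hypothetical stand-in for a schema-mismatch error (never retried)."""


def calculate_delay(attempt: int, base: float, backoff: str) -> float:
    """Delay to wait after failed attempt number `attempt` (1-based)."""
    if backoff == "fixed":
        return base
    # Assumed exponential form: base, 2*base, 4*base, ...
    return base * (2 ** (attempt - 1))


def classify_error(error: BaseException) -> str:
    """Return "timeout" or "provider_error" for an exception."""
    if isinstance(error, (TimeoutError, asyncio.TimeoutError)):
        return "timeout"
    if getattr(error, "status_code", None) == 408:
        return "timeout"
    if "timed out" in str(error).lower():  # assumed message check
        return "timeout"
    return "provider_error"


async def execute_with_retry(
    call: Callable[[], Awaitable[Any]],
    *,
    max_attempts: int = 3,
    backoff: str = "exponential",
    delay_seconds: float = 2.0,
    retry_on: Sequence[str] = ("provider_error", "timeout"),
    event_callback: Optional[Callable[[str, dict], None]] = None,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Any:
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except SchemaValidationError:
            raise  # validation errors are never retried
        except Exception as error:
            category = classify_error(error)
            # Re-raise the original exception when the category is filtered
            # out by retry_on or the attempts are exhausted.
            if category not in retry_on or attempt == max_attempts:
                raise
            delay = calculate_delay(attempt, delay_seconds, backoff)
            if event_callback is not None:
                event_callback(
                    "agent_retry",
                    {"attempt": attempt, "category": category, "delay": delay},
                )
            await sleep(delay)
```

With max_attempts set to 1 the loop makes a single call and re-raises any failure, which matches the "no retry" default in the schema.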

Tests

35 new tests covering schema validation, _resolve_retry_config, fixed/exponential backoff, retry_on filtering, agent_retry event emission, and error classification.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/copilot_internal/user
    • Triggering command: /home/REDACTED/work/conductor/conductor/.venv/lib/python3.12/site-packages/copilot/bin/copilot /home/REDACTED/work/conductor/conductor/.venv/lib/python3.12/site-packages/copilot/bin/copilot --headless --no-auto-update --log-level info --stdio (http block)

Copilot AI linked an issue Apr 10, 2026 that may be closed by this pull request
Copilot AI and others added 3 commits April 10, 2026 19:12
…er support

- Add RetryPolicy Pydantic model to config/schema.py with max_attempts,
  backoff (fixed/exponential), delay_seconds, and retry_on fields
- Add retry field to AgentDef with validation (script/human_gate excluded)
- Add _resolve_retry_config to both CopilotProvider and ClaudeProvider
  to use per-agent retry config when set, falling back to provider default
- Add _classify_error to categorize errors as provider_error or timeout
- Support fixed and exponential backoff strategies in _calculate_delay
- Emit agent_retry events via event_callback on each retry attempt
- Check retry_on filter to only retry specified error categories

Agent-Logs-Url: https://github.com/microsoft/conductor/sessions/fc727ab8-0f5b-4cbb-8c2f-e9252217d181

Co-authored-by: jrob5756 <7672803+jrob5756@users.noreply.github.com>
- 35 tests covering RetryPolicy schema validation, AgentDef retry field,
  provider _resolve_retry_config, fixed/exponential backoff, retry_on
  filtering, agent_retry event emission, and error classification
- Fix import for TimeoutError (was ConductorTimeoutError)
- Update YAML schema reference documentation
- Fix lint and formatting issues

Agent-Logs-Url: https://github.com/microsoft/conductor/sessions/fc727ab8-0f5b-4cbb-8c2f-e9252217d181

Co-authored-by: jrob5756 <7672803+jrob5756@users.noreply.github.com>
- Improve _classify_error to also check status_code 408 and asyncio.TimeoutError
- Re-raise original exception in Claude provider when retry_on filter rejects
- Move test import to top of file

Agent-Logs-Url: https://github.com/microsoft/conductor/sessions/fc727ab8-0f5b-4cbb-8c2f-e9252217d181

Co-authored-by: jrob5756 <7672803+jrob5756@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add configurable retry policies at agent level" to "feat: per-agent retry policies" Apr 10, 2026
Copilot AI requested a review from jrob5756 April 10, 2026 19:21

Development

Successfully merging this pull request may close these issues.

Feature: Per-Agent Retry Policies
