copilot-driver --resume fails with 'No authentication information found' after transient AI model error #26001

@tadelesh

Summary

When the Copilot agent exits after a transient AI model server error (its 5 internal retries exhausted), every subsequent --resume attempt by copilot-driver fails immediately with "No authentication information found". The authentication token does not appear to reach the resumed copilot process, which makes the retry mechanism ineffective.

Environment

Steps to Reproduce

  1. Run an agentic workflow using engine.id: copilot with engine.model: claude-opus-4.6
  2. The agent runs successfully for ~39 minutes, producing code changes
  3. A sub-agent call fails with a transient AI model server error:
    Failed to get response from the AI model; retried 5 times (total retry wait time: 6.59s)
    Last error: Unknown error
    
  4. The copilot process exits with code 1
  5. copilot-driver detects partial execution and attempts --resume

Expected Behavior

The --resume attempts should inherit the authentication context from the original run and continue execution from where it left off.
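
One way to picture this expectation is a driver that snapshots the auth variables before the first attempt and restores any that have gone missing before each resume spawn. The following is only a minimal sketch under that assumption: `snapshot_auth_env` and `build_resume_env` are hypothetical names, and the actual copilot-driver code is not shown in this issue and may be structured quite differently.

```python
# The three variable names are the ones listed in the CLI's error message.
AUTH_VARS = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")


def snapshot_auth_env(environ):
    """Capture the non-empty auth variables once, before the first attempt."""
    return {name: environ[name] for name in AUTH_VARS if environ.get(name)}


def build_resume_env(current_env, auth_snapshot):
    """Build the environment for a --resume spawn (hypothetical helper).

    Starts from the current environment and restores any auth variable
    that is missing or blank, without overriding a value that is present.
    """
    env = dict(current_env)
    for name, value in auth_snapshot.items():
        if not env.get(name):
            env[name] = value
    return env
```

The result could be passed as the `env` argument of `subprocess.run` when spawning the resumed process. Note that this only helps if the variables were dropped between attempts; it would not help if the token itself had expired.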

Actual Behavior

All 3 --resume retry attempts fail immediately (within ~2 seconds each) with:

Error: No authentication information found.
Copilot can be authenticated with GitHub using an OAuth Token or a Fine-Grained Personal Access Token.
To authenticate, you can use any of the following methods:
  - Start 'copilot' and run the '/login' command
  - Set the COPILOT_GITHUB_TOKEN, GH_TOKEN, or GITHUB_TOKEN environment variable
  - Run 'gh auth login' to authenticate with the GitHub CLI
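
Because every failed resume reports the same message, a driver could in principle recognize it and stop early rather than spend its remaining retries on a failure that retrying cannot fix. A rough sketch, assuming the driver has the child's exit code and captured stderr available; `is_auth_failure` is a hypothetical helper, not part of copilot-driver:

```python
# First line of the error text quoted in this issue.
AUTH_ERROR_MARKER = "No authentication information found"


def is_auth_failure(exit_code, stderr_text):
    """Return True when a failed attempt looks like a missing-credentials error."""
    return exit_code != 0 and AUTH_ERROR_MARKER in stderr_text
```

Matching on message text is brittle if the CLI's wording changes, so this is better treated as a diagnostic aid than as a contract.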

Relevant Logs

[copilot-driver] attempt 1: process closed exitCode=1 duration=39m 14s stdout=18542B stderr=499B hasOutput=true
[copilot-driver] attempt 1 failed: exitCode=1 isCAPIError400=false hasOutput=true retriesRemaining=3
[copilot-driver] attempt 1: partial execution — will retry with --resume (attempt 2/4)

[copilot-driver] attempt 2: spawning: /usr/local/bin/copilot --add-dir /tmp/gh-aw/ ... --resume
[copilot-driver] attempt 2: process closed exitCode=1 duration=1s stdout=0B stderr=404B
[copilot-driver] attempt 2 failed: exitCode=1

[copilot-driver] attempt 3: spawning: /usr/local/bin/copilot ... --resume
[copilot-driver] attempt 3: process closed exitCode=1 duration=1s stdout=0B stderr=404B

[copilot-driver] attempt 4: spawning: /usr/local/bin/copilot ... --resume
[copilot-driver] attempt 4: process closed exitCode=1 duration=1s stdout=0B stderr=404B

[copilot-driver] all 3 retries exhausted — giving up (exitCode=1)

Impact

  • The --resume retry mechanism cannot recover the run: every resume attempt fails on authentication before doing any work
  • Long-running workflows (~39 min) lose all progress and must restart from scratch
  • The original run had already produced 365 added lines of changes, all of which were lost

Possible Root Cause

The --resume spawn does not appear to inherit the GITHUB_TOKEN / COPILOT_GITHUB_TOKEN / GH_TOKEN environment variables that were available during the initial attempt. The auth environment variable may be cleaned up, or may expire, between the exit of attempt 1 and the resume spawns.
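
To help confirm or rule out this hypothesis, the driver could log which auth variables are present immediately before each spawn, without printing their values. A minimal sketch, where `describe_auth_env` is a hypothetical helper and the log format is an assumption:

```python
# The three variable names are the ones listed in the CLI's error message.
AUTH_VARS = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")


def describe_auth_env(environ):
    """Summarize which auth variables are unset, empty, or set, never the values."""
    parts = []
    for name in AUTH_VARS:
        value = environ.get(name)
        if value is None:
            state = "unset"
        elif value == "":
            state = "empty"
        else:
            state = "set"
        parts.append(f"{name}={state}")
    return " ".join(parts)
```

Calling `describe_auth_env(os.environ)` before attempt 1 and again before attempt 2 would show whether the variables disappeared between spawns. If all three remain set and the resume still fails, the cause is more likely elsewhere than in environment inheritance.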
