Agent fails with "Failed to get response from the AI model; retried 5 times" after noop safe-output

## Summary

During a gh-aw workflow run, the agent successfully completes its work (reads files, calls MCP tools, produces a `noop` safe-output), but then hits repeated server errors, exhausts its five retries, and the run fails.

## Reproduction

- **Workflow:** `errors-triage.lock.yml` mirrored onto `shim.yml`
- **Run URL:** https://github.com/microsoft/vscode-engineering/actions/runs/23660178469
- **Branch:** `brchen/error-triage-duplicate-detection`
- **Compiler version:** v0.62.5
- **Engine:** `copilot` / `claude-opus-4.6`

## Agent Log (Tail)

The agent completes its work and emits a `noop` safe-output, then immediately hits repeated server errors:
```
● noop
└ {"result":"success"}

● Response was interrupted due to a server error. Retrying...

● Response was interrupted due to a server error. Retrying...

● Response was interrupted due to a server error. Retrying...

● Response was interrupted due to a server error. Retrying...

● Response was interrupted due to a server error. Retrying...

Execution failed: Error: Failed to get response from the AI model; retried 5 times (total retry wait time: 5.9251769461139245 seconds) Last error: Unknown error
```

Run: [errors-triage · microsoft/vscode-engineering@0ea95c8](https://github.com/microsoft/vscode-engineering/actions/runs/23660178469/job/68928147106#step:21:211)

## Context

- The agent had already completed all of its work: reading reference files, calling MCP tools (`get_vscode_releases`, `get_errors`, `issue_read`, etc.), and emitting the `noop` output.
- The failure happens *after* the successful `noop` call, which suggests the model request for the next turn is what fails (possibly while finalizing or summarizing).
- A previous run on the same branch (~30 min earlier, run [23659258158](https://github.com/microsoft/vscode-engineering/actions/runs/23659258158)) completed successfully with a similar workload.
- The total retry wait time is very short (~5.9s across 5 retries, roughly 1.2s per retry on average), suggesting the retries are not backing off much.
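
To put the retry timing in perspective, here is a rough comparison between the observed total wait and a plain exponential backoff schedule. I don't know what retry policy the Copilot CLI actually uses; `backoff_schedule` and its parameters (`base`, `factor`, `cap`) are hypothetical and only illustrate what "backing off" would typically look like.

```python
from typing import List


def backoff_schedule(retries: int, base: float = 1.0,
                     factor: float = 2.0, cap: float = 60.0) -> List[float]:
    """Hypothetical schedule: wait base * factor**i before retry i, capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(retries)]


# Values taken from the log tail above.
observed_total_wait = 5.9251769461139245  # seconds
retries = 5

mean_wait = observed_total_wait / retries
schedule = backoff_schedule(retries)

print(f"observed mean wait per retry: {mean_wait:.2f}s")
print(f"illustrative exponential schedule: {schedule} (total {sum(schedule):.0f}s)")
```

With these assumed parameters the five retries would span about 31 seconds instead of about 6, which gives a transient backend error more time to clear.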

## Expected Behavior

The agent should complete successfully after emitting its safe-output. If the model is unavailable on the finalization turn, ideally the run should still succeed since the meaningful work (safe-outputs) was already captured.
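
A minimal sketch of the outcome policy I have in mind, assuming gh-aw can tell which safe-outputs were captured before the model error. `RunState`, `classify_run`, and the `success_with_warning` status are hypothetical names for illustration, not existing gh-aw APIs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RunState:
    """Hypothetical snapshot of an agent run at the point it stops."""
    safe_outputs: List[str] = field(default_factory=list)  # e.g. ["noop"]
    model_error: Optional[str] = None  # last model error, if any


def classify_run(state: RunState) -> str:
    """Decide the run outcome from captured safe-outputs and the final error."""
    if state.model_error is None:
        return "success"
    if state.safe_outputs:
        # The meaningful work was already recorded; surface the error as a warning.
        return "success_with_warning"
    return "failure"


# The run from this issue: `noop` was captured, then the model call failed.
this_run = RunState(safe_outputs=["noop"], model_error="Unknown error")
print(classify_run(this_run))
```

Under this policy a model error would still fail the run when no safe-output was captured, so genuine failures would not be masked.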

## Questions

1. Is this a known transient issue with the Copilot inference backend?
2. Could gh-aw treat a run as successful if all safe-outputs were captured before the model error?

Agent fails with "Failed to get response from the AI model; retried 5 times" after noop safe-output #23265

Summary

Reproduction

Agent Log (Tail)

Context

Expected Behavior

Questions

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Agent fails with "Failed to get response from the AI model; retried 5 times" after noop safe-output #23265

Description

Summary

Reproduction

Agent Log (Tail)

Context

Expected Behavior

Questions

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions