copilot-driver: restart fresh on null-type tool_call 400; permanently disable --continue after any fallback by Copilot · Pull Request #29329 · github/gh-aw

Copilot · 2026-04-30T14:27:16Z

When the model emits a tool call with type: null, the API returns a 400 that poisons conversation history. Retrying with --continue re-injects the same broken state — all 4 attempts fail identically. A fresh restart discards the corrupt history and gives the run a real chance to succeed.

Changes

New error detection — NULL_TYPE_TOOL_CALL_PATTERN matches the specific API 400: "Invalid type for '...tool_calls[N].type': ... but got null instead."
Fresh restart on null-type — when detected, clears --continue and continues the retry loop as a fresh run (works whether the failing attempt used --continue or not)
continueDisabledPermanently guard — once a fallback to fresh run is triggered (null-type or auth error), --continue is never re-enabled for subsequent attempts; previously a fresh-run success with output would re-arm --continue
Auth fallback hardened — existing auth-error fallback now also sets continueDisabledPermanently, closing the same re-arm bug
Diagnostic log — per-attempt failure line gains isNullTypeToolCallError= field
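
As a minimal sketch of the detection step, assuming the harness uses the same pattern that appears in the test-file excerpt of this PR: the helper name `isNullTypeToolCallError` and the sample string are illustrative only, and the sample keeps the elisions from the description above instead of reproducing a verbatim API message.

```javascript
// Pattern as shown in the PR's test-file excerpt.
const NULL_TYPE_TOOL_CALL_PATTERN = /tool_calls\[.*?\]\.type.*null/;

// Hypothetical helper: true when the captured output looks like the
// null-type tool_call 400. Non-string input is treated as "no match".
function isNullTypeToolCallError(output) {
  return typeof output === "string" && NULL_TYPE_TOOL_CALL_PATTERN.test(output);
}

// Illustrative sample that follows the error shape quoted in the description.
const sample = "Invalid type for '...tool_calls[0].type': ... but got null instead.";

console.log(isNullTypeToolCallError(sample));
console.log(isNullTypeToolCallError("Invalid value for 'tool_calls[0].function.name'"));
console.log(isNullTypeToolCallError("unexpected null in response"));
```

The pattern requires both a `tool_calls[...]` index followed by `.type` and a later `null`, so unrelated tool_calls errors and unrelated null errors should not trigger the fresh-restart path.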

Retry flow example

attempt 1 (fresh):      partial output, normal error  → retry with --continue
attempt 2 (--continue): tool_calls[0].type: null 400  → restart fresh, --continue permanently disabled
attempt 3 (fresh):      partial output, other error   → retry fresh  (NOT --continue)
attempt 4 (fresh):      ...
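
The flow above can be sketched as a small pure decision function plus a simulation loop. This is an illustration under stated assumptions, not the harness code: the function signature, the state shape, the value `MAX_RETRIES = 3` (one initial attempt plus three retries), and the choice to apply the no-output guard to every failure type are all assumed here.

```javascript
// Assumed budget: 1 initial attempt + 3 retries = 4 attempts in total.
const MAX_RETRIES = 3;

// Hypothetical pure retry policy. `state.attempt` counts retries already used.
function applyRetryPolicy(state, failure) {
  const { attempt, continueDisabledPermanently } = state;
  const { hasOutput, isNullTypeToolCall, isAuthError } = failure;

  // No retry when the budget is exhausted or the attempt produced no output.
  if (attempt >= MAX_RETRIES || !hasOutput) {
    return { retry: false, useContinue: false, continueDisabledPermanently };
  }
  // Fallback errors: restart fresh and never re-arm --continue.
  if (isNullTypeToolCall || isAuthError) {
    return { retry: true, useContinue: false, continueDisabledPermanently: true };
  }
  // Ordinary failure: resume with --continue unless a fallback disabled it.
  return { retry: true, useContinue: !continueDisabledPermanently, continueDisabledPermanently };
}

// Returns the mode of every attempt that would be started for a failure sequence.
function simulate(failures) {
  let state = { attempt: 0, useContinue: false, continueDisabledPermanently: false };
  const modes = ["fresh"];
  for (const failure of failures) {
    const decision = applyRetryPolicy(state, failure);
    if (!decision.retry) break;
    state = {
      attempt: state.attempt + 1,
      useContinue: decision.useContinue,
      continueDisabledPermanently: decision.continueDisabledPermanently,
    };
    modes.push(state.useContinue ? "--continue" : "fresh");
  }
  return modes;
}

console.log(simulate([
  { hasOutput: true },                           // attempt 1: normal error
  { hasOutput: true, isNullTypeToolCall: true }, // attempt 2: null-type 400
  { hasOutput: true },                           // attempt 3: other error
]));
// → [ 'fresh', '--continue', 'fresh', 'fresh' ]
```

Without the `continueDisabledPermanently` flag, the third failure would re-arm `--continue` and reload the poisoned history, which is the re-arm bug described under Changes.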

Copilot

Pull request overview

Updates the Copilot CLI harness retry policy to detect a specific “null-type tool call” API 400 that corrupts conversation history, and to recover by restarting fresh while preventing --continue from being reused after any such fallback.

Changes:

Add detection for null-type tool_calls[N].type 400 and force a fresh restart on retry.
Introduce a continueDisabledPermanently guard so --continue can’t be re-enabled after a fallback (null-type or auth).
Add unit tests covering the new pattern and retry-mode transitions; add a patch changeset entry.

File	Description
actions/setup/js/copilot_harness.cjs	Implements null-type tool-call detection, fresh-restart fallback, and permanent `--continue` disable guard; expands per-attempt diagnostics.
actions/setup/js/copilot_harness.test.cjs	Adds tests for the null-type error regex and for retry policy behavior around fresh restarts and permanent `--continue` disabling.
.changeset/patch-copilot-driver-null-type-tool-call-fresh-restart.md	Documents the behavior change as a patch release note.

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 2

+    // Null-type tool_call error: the model emitted a malformed tool call that poisons the
+    // conversation history.  Retrying with --continue re-injects the same broken history and
+    // produces the same 400 on every subsequent attempt.  Restart fresh to discard the poisoned
+    // history, and permanently disable --continue so the corrupt state is never re-loaded.
+    if (isNullTypeToolCall) {


+  describe("null-type tool_call detection pattern", () => {
+    const NULL_TYPE_TOOL_CALL_PATTERN = /tool_calls\[.*?\]\.type.*null/;
+
+    it("matches the exact error from the failed workflow run", () => {


github-actions · 2026-04-30T14:56:41Z

🧪 Test Quality Sentinel Report

Test Quality Score: 80/100

✅ Excellent

Metric	Value
New/modified tests analyzed	10
✅ Design tests (behavioral contracts)	10 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	10 (100%)
Duplicate test clusters	0
Test inflation detected	⚠️ Yes (200 test lines / 43 production lines ≈ 4.6:1)
🚨 Coding-guideline violations	None

Test Classification Details

Test	File	Classification	Issues Detected
`matches the exact error from the failed workflow run`	`copilot_harness.test.cjs`	✅ Design	None — validates real-world error string
`matches with different array indices`	`copilot_harness.test.cjs`	✅ Design	None — validates index-variant matching
`does not match unrelated tool_calls errors`	`copilot_harness.test.cjs`	✅ Design	Negative/edge cases covered
`does not match unrelated null errors`	`copilot_harness.test.cjs`	✅ Design	Negative + empty-string edge case
`restarts fresh when null-type error occurs on a --continue attempt`	`copilot_harness.test.cjs`	✅ Design	Behavioral contract: retry=true, continue=false, disabled=true
`restarts fresh when null-type error occurs on a fresh run`	`copilot_harness.test.cjs`	✅ Design	Behavioral contract: fresh-start path
`does not retry when budget is exhausted`	`copilot_harness.test.cjs`	✅ Design	Edge case: MAX_RETRIES boundary
`does not retry when no output was produced`	`copilot_harness.test.cjs`	✅ Design	Edge case: hasOutput=false guard
`does not re-enable --continue after a null-type fresh restart`	`copilot_harness.test.cjs`	✅ Design	Multi-step invariant: permanent disable guard
`does not re-enable --continue after an auth-error fresh restart`	`copilot_harness.test.cjs`	✅ Design	Multi-step invariant: auth-error permanent disable

Observations (Not Blocking)

⚠️ Inlined retry logic in test file

The describe blocks "null-type tool_call restarts fresh instead of --continue" and "permanent --continue disable guard" each define a local applyRetryPolicy function that mirrors the production logic rather than importing it. This means:

The tests verify the policy's behavioral contract correctly — the invariants are real and well-chosen.
However, if the production applyRetryPolicy is refactored (e.g., variable names or branching order change), these tests would still pass while the production code silently regresses.

Suggested improvement: If the production retry policy is extractable as a pure function, export it and import it directly in tests. If not (e.g., it's deeply embedded in a closure), the current inline approach is acceptable — just document why.
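
One way the suggested extraction could look, as a sketch only: the single-object signature, the field names, and the `isFallbackError` flag (standing in for the null-type and auth cases) are assumptions made for illustration, not the harness's actual code.

```javascript
// copilot_harness.cjs (sketch): keep the decision logic free of I/O so it can be exported.
function applyRetryPolicy({ attempt, maxRetries, hasOutput, isFallbackError, continueDisabledPermanently }) {
  if (attempt >= maxRetries || !hasOutput) {
    return { retry: false, useContinue: false, continueDisabledPermanently };
  }
  if (isFallbackError) {
    return { retry: true, useContinue: false, continueDisabledPermanently: true };
  }
  return { retry: true, useContinue: !continueDisabledPermanently, continueDisabledPermanently };
}

// Export only when running as a CommonJS module.
if (typeof module !== "undefined" && module.exports) {
  module.exports = { applyRetryPolicy };
}

// copilot_harness.test.cjs (sketch): import the production function instead of
// re-declaring a local copy, so a regression in the real policy fails the test.
//   const { applyRetryPolicy } = require("./copilot_harness.cjs");
//   const next = applyRetryPolicy({ attempt: 2, maxRetries: 3, hasOutput: true,
//     isFallbackError: false, continueDisabledPermanently: true });
//   expect(next.useContinue).toBe(false);
```

If the harness file runs its main loop at load time, importing it from a test would also start that loop, so the entry point would likely need a guard such as `require.main === module` before this approach is practical.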

ℹ️ Test inflation (4.6:1 ratio)

200 test lines were added against 43 production lines. This triggered the inflation penalty (−10 pts), but the tests are genuinely thorough — the ratio is explained by multi-step scenario tests that require test-local helper functions and detailed assertion chains. No artificial inflation detected.

Language Support

Tests analyzed:

🟨 JavaScript (*.test.cjs): 10 tests (vitest) — actions/setup/js/copilot_harness.test.cjs
🐹 Go (*_test.go): 0 tests changed

Score Breakdown

Component	Weight	Score
Behavioral Coverage (10/10 design tests)	40 pts	40
Error/Edge Case Coverage (10/10 tests)	30 pts	30
Low Duplication (0 clusters)	20 pts	20
Proportional Growth (4.6:1 ratio > 2:1)	10 pts	0 ⚠️
Total	100 pts	80

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). All 10 tests enforce genuine behavioral contracts: the regex detection patterns are validated against real-world error strings (including the exact error from the failed workflow run), and the retry policy invariants are tested end-to-end including budget exhaustion, no-output guards, and multi-step permanent-disable scenarios.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

Assert on observable outputs, return values, or state changes
Cover error paths and boundary conditions
Would catch a behavioral regression if deleted
Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

Assert on internal function calls (mocking internals)
Only test the happy path with typical inputs
Break during legitimate refactoring even when behavior is correct
Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References:

§25172220517

🧪 Test quality analysis by Test Quality Sentinel · ● 399.8K

github-actions

✅ Test Quality Sentinel: 80/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 10 new tests enforce genuine behavioral contracts covering the null-type tool_call detection regex, retry policy decisions, budget exhaustion, no-output guards, and multi-step permanent --continue disable invariants.

pelikhan · 2026-04-30T14:57:51Z

@copilot review all comments

Copilot · 2026-04-30T15:05:36Z

@copilot review all comments

Done in 2b52491:

Header comment (copilot_harness.cjs top-level retry-policy docs): added a bullet documenting that null-type tool_calls[N].type 400 errors poison conversation history, trigger a fresh restart, and permanently disable --continue for the rest of the run.
In-loop comment (before the retry-decision block): rewrote the "Exceptions" line to enumerate MCP policy, model-not-supported, auth, and null-type tool_call cases explicitly.
Test name: renamed "matches the exact error from the failed workflow run" → "matches the error format observed in failed workflow runs" to avoid implying the fixture string is a verbatim copy of the API message.

Initial plan

4b001b4

Copilot AI assigned Copilot and pelikhan Apr 30, 2026

Copilot started work on behalf of pelikhan April 30, 2026 14:27 View session

Copilot AI linked an issue Apr 30, 2026 that may be closed by this pull request

copilot-driver: restart fresh instead of --continue when 400 null-type tool_call poisons history #29312

Closed

Copilot AI and others added 2 commits April 30, 2026 14:41

fix: detect null-type tool_call and restart fresh; add permanent --co…

9ce0c15

…ntinue disable guard Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1bc21234-f151-423f-879b-2cc881234acf Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

refactor: improve string readability in null-type tool_call tests

88c199e

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1bc21234-f151-423f-879b-2cc881234acf Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix copilot-driver to restart fresh on null-type tool_call error~~ copilot-driver: restart fresh on null-type tool_call 400; permanently disable --continue after any fallback Apr 30, 2026

Copilot finished work on behalf of pelikhan April 30, 2026 14:46

Copilot AI requested a review from pelikhan April 30, 2026 14:46

pelikhan marked this pull request as ready for review April 30, 2026 14:50

Copilot AI review requested due to automatic review settings April 30, 2026 14:50

Copilot started reviewing on behalf of pelikhan April 30, 2026 14:51 View session

Copilot AI reviewed Apr 30, 2026

github-actions Bot mentioned this pull request Apr 30, 2026

[aw] No-Op Runs #29134

Open

github-actions Bot approved these changes Apr 30, 2026

Copilot started work on behalf of pelikhan April 30, 2026 14:58 View session

docs: update retry-policy comments for null-type tool_call; rename te…

2b52491

…st for accuracy Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1eb81da3-5f18-4bab-803c-d9ba0be9fca8 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan April 30, 2026 15:06

pelikhan merged commit 7ce8d1c into main Apr 30, 2026

pelikhan deleted the copilot/fix-copilot-driver-restart-strategy branch April 30, 2026 15:18
