Conversation
…ntinue disable guard Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1bc21234-f151-423f-879b-2cc881234acf Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1bc21234-f151-423f-879b-2cc881234acf Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Pull request overview
Updates the Copilot CLI harness retry policy to detect a specific “null-type tool call” API 400 that corrupts conversation history, and to recover by restarting fresh while preventing --continue from being reused after any such fallback.
Changes:
- Add detection for the null-type `tool_calls[N].type` 400 and force a fresh restart on retry.
- Introduce a `continueDisabledPermanently` guard so `--continue` can't be re-enabled after a fallback (null-type or auth).
- Add unit tests covering the new pattern and retry-mode transitions; add a patch changeset entry.
Show a summary per file
| File | Description |
|---|---|
| actions/setup/js/copilot_harness.cjs | Implements null-type tool-call detection, fresh-restart fallback, and permanent --continue disable guard; expands per-attempt diagnostics. |
| actions/setup/js/copilot_harness.test.cjs | Adds tests for the null-type error regex and for retry policy behavior around fresh restarts and permanent --continue disabling. |
| .changeset/patch-copilot-driver-null-type-tool-call-fresh-restart.md | Documents the behavior change as a patch release note. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 2
// Null-type tool_call error: the model emitted a malformed tool call that poisons the
// conversation history. Retrying with --continue re-injects the same broken history and
// produces the same 400 on every subsequent attempt. Restart fresh to discard the poisoned
// history, and permanently disable --continue so the corrupt state is never re-loaded.
if (isNullTypeToolCall) {
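The policy described in that comment could be sketched roughly as follows. This is a minimal illustration, not the code in `copilot_harness.cjs`: `planNextAttempt`, the `hadOutput` and `isAuthError` fields, and the sample error strings are hypothetical names introduced here, and the assumption that `--continue` is only eligible after an attempt that produced output is inferred from the PR description.

```javascript
// Hypothetical sketch of the retry-mode decision. Only the regex,
// isNullTypeToolCall, and continueDisabledPermanently come from the PR;
// every other name is illustrative.
const NULL_TYPE_TOOL_CALL_PATTERN = /tool_calls\[.*?\]\.type.*null/;

function planNextAttempt(state, lastAttempt) {
  const isNullTypeToolCall = NULL_TYPE_TOOL_CALL_PATTERN.test(lastAttempt.output);
  // isAuthError is a stand-in flag; the real harness has its own auth check.
  const needsFreshRestart = isNullTypeToolCall || lastAttempt.isAuthError;

  // Once the guard trips it stays tripped for the rest of the run.
  const continueDisabledPermanently =
    state.continueDisabledPermanently || needsFreshRestart;

  // Assumption: --continue is only eligible when the guard has never tripped
  // and the previous attempt produced output worth resuming from.
  const useContinue = !continueDisabledPermanently && lastAttempt.hadOutput;

  return { continueDisabledPermanently, useContinue };
}

// Simulated run: a null-type 400 followed by an unrelated failure with output.
let state = { continueDisabledPermanently: false };
const attempts = [
  {
    output: "Invalid type for 'messages[3].tool_calls[0].type': ... but got null instead.",
    hadOutput: true,
    isAuthError: false,
  },
  { output: "partial output, then a transient failure", hadOutput: true, isAuthError: false },
];

const modes = [];
for (const attempt of attempts) {
  const plan = planNextAttempt(state, attempt);
  state = { continueDisabledPermanently: plan.continueDisabledPermanently };
  modes.push(plan.useContinue ? "--continue" : "fresh");
}
console.log(modes);
```

In this sketch the second attempt would have qualified for `--continue` on its own merits, but the guard keeps the run in fresh mode, which is the re-arm behavior the PR aims to prevent.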
describe("null-type tool_call detection pattern", () => {
  const NULL_TYPE_TOOL_CALL_PATTERN = /tool_calls\[.*?\]\.type.*null/;

  it("matches the exact error from the failed workflow run", () => {
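To make the pattern's behavior concrete, here is a small standalone check. The regex is copied from the test quoted above; the sample messages are hypothetical strings shaped like the API 400 described in this PR (the `messages[3]` path prefix and the "expected a string" wording are assumptions, not the exact error from the failed run).

```javascript
// Pattern copied from the test file quoted above.
const NULL_TYPE_TOOL_CALL_PATTERN = /tool_calls\[.*?\]\.type.*null/;

// Hypothetical error strings; only the first refers to tool_calls[N].type.
const samples = [
  "Invalid type for 'messages[3].tool_calls[0].type': expected a string, but got null instead.",
  "Invalid type for 'messages[3].content': expected a string, but got null instead.",
  "401 Unauthorized",
];

// The regex has no global flag, so repeated .test() calls are stateless.
const results = samples.map((msg) => NULL_TYPE_TOOL_CALL_PATTERN.test(msg));
console.log(results); // [ true, false, false ]
```

The pattern only requires `tool_calls[...]`, `.type`, and a later `null` in that order, so it tolerates different index values and surrounding wording.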
🧪 Test Quality Sentinel Report
Test Quality Score: 80/100 ✅ Excellent
| Component | Weight | Score |
|---|---|---|
| Behavioral Coverage (10/10 design tests) | 40 pts | 40 |
| Error/Edge Case Coverage (10/10 tests) | 30 pts | 30 |
| Low Duplication (0 clusters) | 20 pts | 20 |
| Proportional Growth (4.6:1 ratio > 2:1) | 10 pts | 0 |
| Total | 100 pts | 80 |
Verdict
✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). All 10 tests enforce genuine behavioral contracts: the regex detection patterns are validated against real-world error strings (including the exact error from the failed workflow run), and the retry policy invariants are tested end-to-end including budget exhaustion, no-output guards, and multi-step permanent-disable scenarios.
📖 Understanding Test Classifications
Design Tests (High Value) verify what the system does:
- Assert on observable outputs, return values, or state changes
- Cover error paths and boundary conditions
- Would catch a behavioral regression if deleted
- Remain valid even after internal refactoring
Implementation Tests (Low Value) verify how the system does it:
- Assert on internal function calls (mocking internals)
- Only test the happy path with typical inputs
- Break during legitimate refactoring even when behavior is correct
- Give false assurance: they pass even when the system is wrong
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.
🧪 Test quality analysis by Test Quality Sentinel · ● 399.8K · ◷
✅ Test Quality Sentinel: 80/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 10 new tests enforce genuine behavioral contracts covering the null-type tool_call detection regex, retry policy decisions, budget exhaustion, no-output guards, and multi-step permanent --continue disable invariants.
@copilot review all comments
…st for accuracy Agent-Logs-Url: https://github.com/github/gh-aw/sessions/1eb81da3-5f18-4bab-803c-d9ba0be9fca8 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Done in 2b52491:
Warning
Firewall rules blocked me from connecting to one or more addresses (expand for details)
I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
When the model emits a tool call with `type: null`, the API returns a 400 that poisons conversation history. Retrying with `--continue` re-injects the same broken state, so all 4 attempts fail identically. A fresh restart discards the corrupt history and gives the run a real chance to succeed.

Changes

- `NULL_TYPE_TOOL_CALL_PATTERN` matches the specific API 400: `"Invalid type for '...tool_calls[N].type': ... but got null instead."`
- On a match, the harness disables `--continue` and continues the retry loop as a fresh run (works whether the failing attempt used `--continue` or not).
- `continueDisabledPermanently` guard: once a fallback to a fresh run is triggered (null-type or auth error), `--continue` is never re-enabled for subsequent attempts; previously a fresh-run success with output would re-arm `--continue`.
- The auth-error fallback also sets `continueDisabledPermanently`, closing the same re-arm bug.
- Per-attempt diagnostics now include an `isNullTypeToolCallError=` field.

Retry flow example
Warning
Firewall rules blocked me from connecting to one or more addresses (expand for details)
I tried to connect to the following addresses, but was blocked by firewall rules:
- `invalid.example.invalid`
  - `/usr/lib/git-core/git-remote-https /usr/lib/git-core/git-remote-https origin https://invalid.example.invalid/nonexistent-repo.git e/git init ndor/bin/git git ode_modules/.bin/git =receive test@example.com --git-dir=/tmp/bare-incremental-UQfpU1 /git` (dns block)
  - `/usr/lib/git-core/git-remote-https /usr/lib/git-core/git-remote-https origin https://invalid.example.invalid/nonexistent-repo.git e/git init ndor/bin/git git ode_modules/.bin/git =receive test@example.com --git-dir=/tmp/bare-incremental-QCGtrx /git` (dns block)

If you need me to access, download, or install something from one of these locations, you can either: