Type: Bug
Environment
- VS Code Insiders
- Built-in Copilot extension bundled with Insiders
- BYOK OpenAI-compatible endpoint
- Responses API (Codex CLI Backend)
- Reproduced against a local OpenAI-compatible test server
Summary
When a BYOK Responses request fails with HTTP 400:
{
  "code": "previous_response_not_found",
  "message": "No response found for previous_response_id resp_xxx.",
  "error": {
    "code": "previous_response_not_found",
    "message": "No response found for previous_response_id resp_xxx."
  }
}
VS Code appears to retry, but the retry still includes the same previous_response_id.
Expected behavior is to retry once without the stateful marker / previous_response_id.
Expected behavior
After receiving HTTP 400 with root-level:
- code: "previous_response_not_found"
- message: "No response found for previous_response_id ..."
the client should classify the failure as an invalid stateful marker and retry once without previous_response_id.
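The expected recovery can be sketched as follows. This is a minimal illustration, not the extension's actual code: `ResponsesRequest`, `HttpResult`, `Send`, `isInvalidStatefulMarker`, and `sendWithRecovery` are hypothetical names, and the transport is passed in as a function so the sketch stays self-contained.

```typescript
// Hypothetical shapes for illustration; these are not the extension's real types.
interface ResponsesRequest {
    input: string;
    previous_response_id?: string;
}

interface HttpResult {
    status: number;
    body: unknown;
}

// Assumed transport: sends one request and resolves with the HTTP result.
type Send = (request: ResponsesRequest) => Promise<HttpResult>;

// True when a 400 response carries the root-level code described above.
function isInvalidStatefulMarker(result: HttpResult): boolean {
    if (result.status !== 400 || typeof result.body !== 'object' || result.body === null) {
        return false;
    }
    return (result.body as { code?: unknown }).code === 'previous_response_not_found';
}

// Sends the request and, if the stateful marker is rejected, retries exactly
// once with previous_response_id removed.
async function sendWithRecovery(send: Send, request: ResponsesRequest): Promise<HttpResult> {
    const first = await send(request);
    if (request.previous_response_id === undefined || !isInvalidStatefulMarker(first)) {
        return first;
    }
    const stateless: ResponsesRequest = { ...request };
    delete stateless.previous_response_id;
    return send(stateless);
}
```

The retry is not repeated: whatever the second request returns is handed back to the caller, so a server that keeps failing cannot cause a loop.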
Actual behavior
A second POST /v1/responses is sent, but it still contains the same previous_response_id, so the request fails again with the same 400 and the user sees an error instead of recovery.
Repro
- Configure Copilot BYOK against an OpenAI-compatible Responses endpoint.
- Start a stateful conversation so the client sends previous_response_id.
- Make the server return HTTP 400 with:
{
  "code": "previous_response_not_found",
  "message": "No response found for previous_response_id resp_xxx.",
  "error": {
    "code": "previous_response_not_found",
    "message": "No response found for previous_response_id resp_xxx."
  }
}
- Observe that VS Code sends a second request.
- Observe that the second request still contains the same previous_response_id.
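The server side of this repro can be approximated with a stub handler like the sketch below. It is an illustration only: `handleResponsesRequest`, `StubReply`, and the placeholder 200 body are hypothetical and do not describe the test server actually used. Wiring the handler into an HTTP listener on POST /v1/responses is left out.

```typescript
// Hypothetical request/reply shapes for a local stub; not a real server API.
interface ResponsesRequestBody {
    model: string;
    input: unknown;
    previous_response_id?: string;
}

interface StubReply {
    status: number;
    json: Record<string, unknown>;
}

let nextId = 1;

// Rejects every request that carries previous_response_id, using the same
// payload shape as in this report (root-level fields plus a nested copy).
function handleResponsesRequest(body: ResponsesRequestBody): StubReply {
    const previous = body.previous_response_id;
    if (previous !== undefined) {
        const code = 'previous_response_not_found';
        const message = `No response found for previous_response_id ${previous}.`;
        return { status: 400, json: { code, message, error: { code, message } } };
    }
    // Placeholder success body; a real Responses payload has more fields.
    return { status: 200, json: { id: `resp_${nextId++}` } };
}
```

Logging each incoming body before calling the handler is enough to see whether the retry still carries the marker.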
Notes
I first suspected the server payload shape, but I verified the live server already returns root-level code and message as required.
The remaining issue is that the retry request still includes the original previous_response_id.
Evidence
The request log shows:
- first POST /v1/responses with previous_response_id = X
- HTTP 400
- second POST /v1/responses also with previous_response_id = X
- HTTP 400 again
So the retry is happening, but the stateful marker is not being cleared.
Code analysis
The retry path appears to be intended to work:
extensions/copilot/src/extension/prompt/node/chatMLFetcher.ts
- recognizes HTTP 400 with jsonData.code === 'previous_response_not_found'
extensions/copilot/src/platform/endpoint/node/chatEndpoint.ts
- converts that into InvalidStatefulMarker
- retries with ignoreStatefulMarker: true
However, the BYOK OpenAIEndpoint appears to override that retry flag.
1) createRequestBody() mutates ignoreStatefulMarker
extensions/copilot/src/extension/byok/node/openAIEndpoint.ts
const zdr = !!this.modelMetadata.zeroDataRetentionEnabled;
options.ignoreStatefulMarker = zdr;
const body = super.createRequestBody(options);
On a retry, this resets ignoreStatefulMarker from true back to false in the normal non-ZDR case, which would explain why the retry still carries previous_response_id.
2) makeChatRequest2() also forces ignoreStatefulMarker: false
extensions/copilot/src/extension/byok/node/openAIEndpoint.ts
const modifiedOptions: IMakeChatRequestOptions = { ...options, ignoreStatefulMarker: false };
const response = await super.makeChatRequest2(modifiedOptions, token);
Even if this is not the primary reason for the observed retry failure, it is also suspicious because it prevents upstream intent from being preserved.
Proposed patch
I believe the BYOK endpoint should preserve an existing retry flag instead of overwriting it.
diff --git a/extensions/copilot/src/extension/byok/node/openAIEndpoint.ts b/extensions/copilot/src/extension/byok/node/openAIEndpoint.ts
index XXXXXXX..YYYYYYY 100644
--- a/extensions/copilot/src/extension/byok/node/openAIEndpoint.ts
+++ b/extensions/copilot/src/extension/byok/node/openAIEndpoint.ts
@@ -240,7 +240,7 @@ export class OpenAIEndpoint extends ChatEndpoint {
if (this.useResponsesApi) {
// Handle Responses API: customize the body directly
const zdr = !!this.modelMetadata.zeroDataRetentionEnabled;
- options.ignoreStatefulMarker = zdr;
+ options.ignoreStatefulMarker = !!options.ignoreStatefulMarker || zdr;
const body = super.createRequestBody(options);
body.store = !zdr;
body.n = undefined;
@@ -367,10 +367,8 @@ export class OpenAIEndpoint extends ChatEndpoint {
}
public override async makeChatRequest2(options: IMakeChatRequestOptions, token: CancellationToken): Promise<ChatResponse> {
- // Apply ignoreStatefulMarker: false for initial request
- const modifiedOptions: IMakeChatRequestOptions = { ...options, ignoreStatefulMarker: false };
- const response = await super.makeChatRequest2(modifiedOptions, token);
- return hydrateBYOKErrorMessages(response);
+ const response = await super.makeChatRequest2(options, token);
+ return hydrateBYOKErrorMessages(response);
}
}
Why this patch
ChatEndpoint.makeChatRequest2() already computes the default behavior for ignoreStatefulMarker.
- A retry from InvalidStatefulMarker should be allowed to preserve ignoreStatefulMarker: true.
- ZDR should still disable stateful marker reuse, but it should not erase an already-true retry flag.
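The effect of the two flag expressions can be illustrated with a simplified model. Everything here is a hypothetical stand-in (`RequestOptions`, `RequestBody`, `baseCreateRequestBody`, and both `createBody...` functions); only the two expressions assigned to ignoreStatefulMarker are taken from the diff above, and the base-class behavior is an assumption.

```typescript
// Hypothetical, simplified stand-ins for the real options and request body.
interface RequestOptions {
    ignoreStatefulMarker?: boolean;
    statefulMarker?: string; // id of the previous response, when one is known
}

interface RequestBody {
    store: boolean;
    previous_response_id?: string;
}

// Assumed base-class behavior: attach the marker unless told to ignore it.
function baseCreateRequestBody(options: RequestOptions): RequestBody {
    const body: RequestBody = { store: true };
    if (!options.ignoreStatefulMarker && options.statefulMarker !== undefined) {
        body.previous_response_id = options.statefulMarker;
    }
    return body;
}

// Current behavior: the caller's flag is replaced by the ZDR setting.
function createBodyOverwriting(options: RequestOptions, zdr: boolean): RequestBody {
    const body = baseCreateRequestBody({ ...options, ignoreStatefulMarker: zdr });
    body.store = !zdr;
    return body;
}

// Proposed behavior: an already-true flag survives, and ZDR can still set it.
function createBodyPreserving(options: RequestOptions, zdr: boolean): RequestBody {
    const ignore = !!options.ignoreStatefulMarker || zdr;
    const body = baseCreateRequestBody({ ...options, ignoreStatefulMarker: ignore });
    body.store = !zdr;
    return body;
}

// Options as they would look on the retry after InvalidStatefulMarker.
const retryOptions: RequestOptions = { ignoreStatefulMarker: true, statefulMarker: 'resp_xxx' };

// Non-ZDR retry: the overwriting variant still sends previous_response_id.
console.log(createBodyOverwriting(retryOptions, false));
// Non-ZDR retry: the preserving variant omits previous_response_id.
console.log(createBodyPreserving(retryOptions, false));
```

Under this model the ZDR case is unchanged by the patch: with ZDR enabled both variants omit the marker and set store to false.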
Suggested regression tests
Please consider adding a BYOK regression test covering this flow:
- initial Responses request uses a stateful marker
- server returns HTTP 400 previous_response_not_found
- retry is triggered
- retry request body omits previous_response_id
A targeted test around OpenAIEndpoint.createRequestBody() preserving ignoreStatefulMarker: true would also help.
VS Code version: Code - Insiders 1.121.0-insider (f805d63, 2026-05-15T12:52:51-04:00)
OS version: Windows_NT x64 10.0.26200
Modes:
System Info
| Item | Value |
|---|---|
| CPUs | Intel(R) Core(TM) i7-14700F (28 x 2112) |
| GPU Status | 2d_canvas: enabled GPU0: VENDOR= 0x10de, DEVICE=0x2705 [NVIDIA GeForce RTX 4070 Ti SUPER], DRIVER_VENDOR=NVIDIA, DRIVER_VERSION=32.0.15.9649 ACTIVE GPU1: VENDOR= 0x1414, DEVICE=0x008c [Microsoft Basic Render Driver], DRIVER_VERSION=10.0.26100.8328 Machine model name: Machine model version: direct_rendering_display_compositor: disabled_off_ok gpu_compositing: enabled multiple_raster_threads: enabled_on opengl: enabled_on rasterization: enabled raw_draw: disabled_off_ok skia_graphite: disabled_off trees_in_viz: disabled_off video_decode: enabled video_encode: enabled webgl: enabled webgl2: enabled webgpu: enabled webnn: disabled_off |
| Load (avg) | undefined |
| Memory (System) | 63.84GB (40.50GB free) |
| Process Argv | --crash-reporter-id 2ec5f96c-c0f3-4a7f-a64c-d65afeffe46b |
| Screen Reader | no |
| VM | 0% |
Extensions (50)
| Extension | Author (truncated) | Version |
|---|---|---|
| UnityShader | ash | 1.0.13 |
| markdown-preview-github-styles | bie | 2.2.0 |
| path-intellisense | chr | 2.10.0 |
| csharpier-vscode | csh | 10.0.2 |
| vscode-eslint | dba | 3.0.24 |
| EditorConfig | Edi | 0.18.2 |
| prettier-vscode | esb | 12.4.0 |
| remotehub | Git | 0.64.0 |
| vscode-github-actions | git | 0.31.5 |
| vscode-pull-request-github | Git | 0.145.2026051504 |
| discord-vscode | icr | 5.9.2 |
| markdown-live-editor | jis | 0.6.2 |
| zenkaku-hankaku | mas | 1.0.0 |
| vscode-copilot-auto-retry | Max | 0.3.1 |
| vscode-github-actions | me- | 3.0.1 |
| zenkaku | mos | 0.0.3 |
| vscode-language-pack-ja | MS- | 1.118.2026051522 |
| csdevkit | ms- | 3.20.197 |
| csharp | ms- | 2.140.8 |
| vscode-dotnet-runtime | ms- | 3.0.2 |
| debugpy | ms- | 2026.7.11331010 |
| python | ms- | 2026.5.2026051501 |
| vscode-pylance | ms- | 2026.2.103 |
| vscode-python-envs | ms- | 1.33.2026051501 |
| remote-containers | ms- | 0.460.0 |
| remote-ssh | ms- | 0.123.2026051315 |
| remote-ssh-edit | ms- | 0.87.0 |
| remote-wsl | ms- | 0.104.3 |
| azure-repos | ms- | 0.40.0 |
| cmake-tools | ms- | 1.23.52 |
| cpp-devtools | ms- | 0.5.13 |
| cpptools | ms- | 1.32.2 |
| cpptools-extension-pack | ms- | 1.5.1 |
| extension-test-runner | ms- | 0.0.14 |
| powershell | ms- | 2025.4.0 |
| remote-explorer | ms- | 0.6.2026031809 |
| remote-repositories | ms- | 0.42.0 |
| vscode-chat-customizations-evaluations | ms- | 1.0.2 |
| vscode-github-issue-notebooks | ms- | 0.0.134 |
| indent-rainbow | ode | 8.3.1 |
| fix-json | oli | 0.2.0 |
| chatgpt | ope | 26.5513.21555 |
| material-icon-theme | PKi | 5.34.0 |
| vscode-yaml | red | 1.24.2026050908 |
| trailing-spaces | sha | 0.4.1 |
| shader | sle | 1.1.5 |
| pdf | tom | 1.2.2 |
| native-preview | Typ | 0.20260515.1 |
| vstuc | vis | 1.2.2 |
| gpg-indicator | wdh | 0.7.5 |
(1 theme extensions excluded)
A/B Experiments
vsliv368cf:30146710
pythonvspyt551:31249597
nativeloc1:31118317
dwcopilot:31158714
dwoutputs:31242946
copilot_t_ci:31333650
g012b348:31231168
pythonrdcb7:31268811
pythonpcpt1cf:31399617
6518g693:31302842
82j33506:31327384
6abeh943:31336334
envsactivate1:31349248
editstats-enabled:31346256
cloudbuttont:31366566
3efgi100_wstrepl:31403338
cp_cls_c_966_ss:31454199
inlinechat_v2_hd992725:31445440
4je02754:31455664
8hhj4413:31478653
ge8j1254_inline_auto_hint_haiku:31490507
cp_cls_c_1081:31454833
conptydll_true:31485575
ia-use-proxy-models-svc:31446143
e9c30283:31453065
test_treatment2:31471001
c9b86496:31447327
idci7584:31454084
nes_chat_context_disabled:31451402
e3e4d672:31454087
ei9d7968:31462942
nes-extended-on:31455475
quick_suggest_off_75197330:31462668
89g7j272:31506658
7e884298:31462391
7e187181:31482583
i2gc6536:31472020
52612955:31508042
h08i8180:31475367
ddid_c:31478205
hmra_i5g22:31509478
getcmakediagnosticsoff:31489825
61f49681:31505879
7df3h592:31491241
pro_large_t:31499377
cp_cls_c_1082:31504161
logging_enabled_new:31490725
jb_cp_cls_c_632:31510883
cg448276_tst_on:31503513
32d76977:31503652
ha629193:31508444
31fi7170_t:31510641
jh5f2457_c:31514653
tco_off:31513904
api_cot_ctrl:31509853
hgf2d445:31510900
rd_file_off:31514889
prpt_ctrl:31513638