fix: add timeout to chat control data fetch request (fixes #317337) #317342
Open
vs-code-engineering[bot] wants to merge 1 commit into
The _fetchChatControlData function makes a periodic network request without a timeout, causing it to hang indefinitely on slow networks. This triggers PerfSampleError telemetry when the profiler detects the long-running fetch call.

Add a 30-second timeout so the request aborts cleanly on slow networks. The existing error handling catches the timeout and retries in 5 minutes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔧 Error Fix
Summary
The _fetchChatControlData function in the chat language models service makes a periodic network request (every 5 minutes) to fetch chat control data without a request timeout. When the network is slow or unresponsive, the native fetch call hangs indefinitely, causing the CPU profiler to sample it and generate a PerfSampleError: by <> in fetch that reaches error telemetry.

This is a new anomaly in stable 1.120.0, affecting 3,958 users with 6,332 hits across all platforms (Mac, Windows, Linux).
Fixes #317337
Recommended reviewer: @lramos15

Culprit Commit
Commit range narrowing was inconclusive per the regression scan. The _fetchChatControlData function existed prior to 1.120.0, but the absence of a timeout has always been latent. The anomaly likely appeared due to changes in the performance sampling infrastructure or increased server latency in 1.120.0.

Code Flow
sequenceDiagram
    participant LM as languageModels.ts
    participant RS as requestService
    participant RI as requestImpl.ts
    participant Net as Network/fetch
    participant Prof as Profiler
    LM->>LM: _refreshChatControlData() [every 5 min]
    LM->>RS: request({url, callSite, NO timeout})
    RS->>RI: logAndRequest -> request()
    RI->>Net: fetch(url, {signal})
    Note over Net: Network slow/unresponsive
    Prof->>Prof: CPU sample detects fetch on stack
    Prof->>Prof: Creates PerfSampleError
    Prof->>Prof: errorHandler.onUnexpectedError()
    Note over Prof: Error reaches telemetry

Affected Files
src/vs/workbench/contrib/chat/common/languageModels.ts
src/vs/base/parts/request/common/requestImpl.ts
src/vs/platform/profiling/common/profilingTelemetrySpec.ts

Repro Steps
_refreshChatControlData cycle to fire
PerfSampleError
unhandlederror-PerfSampleError: by <> in fetch

How the Fix Works
Chosen approach (src/vs/workbench/contrib/chat/common/languageModels.ts):

Added timeout: 30_000 (30 seconds) to the request options passed to this._requestService.request() in _fetchChatControlData. This causes requestImpl.ts to use AbortSignal.timeout(30000), which aborts the fetch after 30 seconds. The abort throws a TimeoutError, which is caught by the existing try/catch in _fetchChatControlData (line 1929) and logged as a warning, after which the function returns gracefully. The next retry happens in 5 minutes via _refreshChatControlData.

This fixes the issue at the data producer level: the unbounded request that causes the profiler to flag the slow fetch. The fix prevents the fetch from hanging long enough to be profiled as a performance problem, without silencing any errors or removing telemetry logging.
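The timeout-and-catch behavior described above can be illustrated with a minimal standalone sketch. This is not the VS Code implementation: fetchControlDataSketch, Fetcher, and hangingFetcher are hypothetical names introduced here, and the request service is replaced by an injected function so the sketch runs without a network. Only AbortSignal.timeout() and the TimeoutError abort reason are standard platform behavior.

```typescript
// Hypothetical stand-in for the request layer: takes a URL and an abort signal.
type Fetcher = (url: string, init: { signal: AbortSignal }) => Promise<string>;

// Sketch of a bounded request: abort after timeoutMs, log a warning on failure,
// and return undefined so the caller can simply wait for its next refresh cycle.
async function fetchControlDataSketch(
	url: string,
	timeoutMs: number,
	fetcher: Fetcher
): Promise<string | undefined> {
	try {
		// AbortSignal.timeout() yields a signal that aborts with a
		// DOMException named "TimeoutError" once the delay elapses.
		return await fetcher(url, { signal: AbortSignal.timeout(timeoutMs) });
	} catch (err) {
		const name = err instanceof Error ? err.name : String(err);
		console.warn(`control data request failed (${name}); will retry on next cycle`);
		return undefined;
	}
}

// Simulates an unresponsive network: the promise settles only when the signal aborts.
const hangingFetcher: Fetcher = (_url, { signal }) =>
	new Promise<string>((_resolve, reject) => {
		signal.addEventListener('abort', () => reject(signal.reason), { once: true });
	});
```

Calling fetchControlDataSketch(url, 30_000, hangingFetcher) would resolve to undefined after roughly 30 seconds instead of hanging. In Node.js the timer behind AbortSignal.timeout() does not keep the process alive on its own, so a standalone script needs some other pending work for the timeout to fire.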
Recommended Owner
@lramos15: most active recent contributor to languageModels.ts, with model management and picker changes.