Await pending session DB writes on agent host shutdown #311432
Merged
roblourens merged 3 commits into main on Apr 20, 2026
The agent host server's SIGTERM/SIGINT handler called process.exit(0) synchronously, abandoning any fire-and-forget SQLite writes that were in flight (configValues, customTitle, isRead/isDone, diffs). Under CI load this caused the 'Session Config persistence across restarts' integration test to flake: the most recent SessionConfigChanged write could lose the race against shutdown, leaving the previous value persisted instead.
Track in-flight writes inside SessionDatabase via a _pendingWrites set populated by every public mutating method (the outermost wrap is required so the await this._ensureDb() window is also covered). SessionDataService aggregates whenIdle() across all live per-session DBs. The server's shutdown handler now awaits this with a 3s raceTimeout before disposing.
Removes the await timeout(500) hack the test previously needed.
(Written by Copilot)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR fixes a shutdown race in the Agent Host where fire-and-forget SQLite writes could be dropped on SIGTERM/SIGINT, causing flaky restart/persistence integration tests.
Changes:
- Track in-flight write promises in SessionDatabase and expose whenIdle() to await pending writes.
- Add ISessionDataService.whenIdle() to aggregate whenIdle() across all live per-session databases.
- Await whenIdle() (with a 3s raceTimeout) during agent host shutdown; remove the test's timeout(500) workaround.
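The first change in the list can be pictured with a minimal sketch. It is not the PR's code: `TrackedStore`, `setConfigValue` and `peek` are hypothetical stand-ins, the SQLite layer is replaced by an in-memory map, and only the names `_pendingWrites`, `_track`, `_ensureDb` and `whenIdle` are taken from the discussion. The point it illustrates is that the write is registered before the asynchronous open step, so the whole window is covered.

```typescript
// Hypothetical stand-in for SessionDatabase; an in-memory map replaces SQLite.
class TrackedStore {
	private readonly _pendingWrites = new Set<Promise<void>>();
	private readonly _data = new Map<string, string>();

	// Simulates the asynchronous "open the database" step.
	private async _ensureDb(): Promise<void> {
		await new Promise<void>(resolve => setTimeout(resolve, 5));
	}

	// Registers the write for its whole lifetime, including the _ensureDb() wait.
	private _track<T>(work: () => Promise<T>): Promise<T> {
		const result = work();
		// Settles (never rejects) once the write finishes, successfully or not.
		const tracked: Promise<void> = result.then(() => undefined, () => undefined);
		this._pendingWrites.add(tracked);
		void tracked.then(() => this._pendingWrites.delete(tracked));
		return result;
	}

	// A public mutating method: the outermost call is what gets tracked.
	setConfigValue(key: string, value: string): Promise<void> {
		return this._track(async () => {
			await this._ensureDb();
			this._data.set(key, value);
		});
	}

	peek(key: string): string | undefined {
		return this._data.get(key);
	}

	// Resolves once no tracked write is outstanding.
	async whenIdle(): Promise<void> {
		while (this._pendingWrites.size > 0) {
			await Promise.all(Array.from(this._pendingWrites));
		}
	}
}
```

A caller can fire `setConfigValue` without awaiting it and later await `whenIdle()` to be sure the value has landed.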
Show a summary per file
| File | Description |
|---|---|
| src/vs/platform/agentHost/node/sessionDatabase.ts | Adds pending-write tracking and whenIdle(); wraps mutating DB methods with _track(). |
| src/vs/platform/agentHost/node/sessionDataService.ts | Tracks live DB instances and implements whenIdle() across open databases. |
| src/vs/platform/agentHost/node/agentHostServerMain.ts | Makes shutdown async and awaits DB idleness with a bounded timeout before exiting. |
| src/vs/platform/agentHost/common/sessionDataService.ts | Extends ISessionDatabase/ISessionDataService interfaces with whenIdle(). |
| src/vs/platform/agentHost/test/node/protocol/sessionConfig.integrationTest.ts | Removes sleep-based flake workaround; relies on graceful shutdown flush. |
| src/vs/platform/agentHost/test/node/copilotAgent.test.ts | Updates test ISessionDataService stub to include whenIdle(). |
| src/vs/platform/agentHost/test/node/agentService.test.ts | Updates session data service mocks to include whenIdle(). |
| src/vs/platform/agentHost/test/common/sessionTestHelpers.ts | Updates ISessionDatabase/ISessionDataService test helpers with whenIdle(). |
Copilot's findings
- Files reviewed: 8/8 changed files
- Comments generated: 2
- Close the WebSocket server before awaiting whenIdle() so no further actions can be dispatched during the flush window.
- Simplify SessionDataService.whenIdle(): per-DB whenIdle() already drains writes against existing DBs, so the outer loop only needs to re-pass when a NEW DB was opened during the await. Comment now matches the code.
(Written by Copilot)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
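One way to read the simplified aggregation is sketched below. The `IdleAware` interface and `LiveDatabases` class are hypothetical names invented for illustration, and the real SessionDataService tracks its databases through a ref-counted collection that is not modelled here. The sketch assumes each per-database `whenIdle()` fully drains that database, so the outer loop only runs again for databases that appeared while it was waiting.

```typescript
// Hypothetical minimal shape of a per-session database for this sketch.
interface IdleAware {
	whenIdle(): Promise<void>;
}

// Hypothetical registry of currently open databases.
class LiveDatabases {
	private readonly _open = new Set<IdleAware>();

	register(db: IdleAware): void {
		this._open.add(db);
	}

	unregister(db: IdleAware): void {
		this._open.delete(db);
	}

	// Waits for every open database, re-passing only for ones opened meanwhile.
	async whenIdle(): Promise<void> {
		const seen = new Set<IdleAware>();
		while (true) {
			const fresh = Array.from(this._open).filter(db => !seen.has(db));
			if (fresh.length === 0) {
				return;
			}
			fresh.forEach(db => seen.add(db));
			await Promise.all(fresh.map(db => db.whenIdle()));
		}
	}
}
```

Closing the listening server before calling the aggregate `whenIdle()`, as the first bullet describes, keeps new actions (and therefore new databases) from arriving during the wait.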
Yoyokrazy
previously approved these changes
Apr 20, 2026
auto-merge was automatically disabled
April 20, 2026 18:39
Pull request was converted to draft
On Windows, child.kill() (SIGTERM) terminates the process unconditionally
without invoking the SIGTERM handler, so the in-flight setMetadata write
never reaches SQLite and the second phase sees no persisted config at all.
Closing the child's stdin fires process.stdin.on('end', shutdown) on every
platform, exercising the same graceful flush path.
(Written by Copilot)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
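The shutdown wiring this commit relies on can be sketched as follows. `wireGracefulShutdown` and `EventSource` are hypothetical names, and the process, stdin, flush and exit functions are injected so the sketch stays free of Node typings. It assumes one shared, idempotent shutdown routine registered for both signals and for the stdin 'end' event.

```typescript
// Minimal structural type so the sketch does not depend on Node typings.
interface EventSource {
	on(event: string, listener: () => void): unknown;
}

// Hypothetical helper: routes SIGTERM, SIGINT and stdin 'end' to one shutdown path.
function wireGracefulShutdown(
	proc: EventSource,
	stdin: EventSource,
	flush: () => Promise<void>,
	exit: (code: number) => void
): () => Promise<void> {
	let shuttingDown: Promise<void> | undefined;

	// Idempotent: a second trigger reuses the flush that is already running.
	const shutdown = (): Promise<void> => {
		if (!shuttingDown) {
			// Exiting with code 0 even if the flush fails is an assumption of this sketch.
			shuttingDown = flush().catch(() => undefined).then(() => exit(0));
		}
		return shuttingDown;
	};

	proc.on('SIGTERM', shutdown);
	proc.on('SIGINT', shutdown);
	stdin.on('end', shutdown); // fires when the parent closes the child's stdin
	return shutdown;
}

// In a real server this might be wired as (not executed here):
// wireGracefulShutdown(process, process.stdin, () => dataService.whenIdle(), code => process.exit(code));
```

On the test side, ending the child's stdin stream instead of calling child.kill() reaches the same handler on every platform, which is the behaviour the commit message describes.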
bryanchen-d
approved these changes
Apr 20, 2026
Fixes a flake in the Protocol WebSocket - Session Config persistence across restarts > persisted config values are restored on subscribe after server restart integration test, where the assertion saw the previous branch: 'main' value instead of the updated branch: 'release'.
Root cause
The agent host server's SIGTERM/SIGINT handler called process.exit(0) synchronously, abandoning any fire-and-forget SQLite writes that were still in flight: configValues, customTitle, isRead/isDone, and per-session diffs. Under CI load the most recent SessionConfigChanged write could lose the race against shutdown, leaving the prior value persisted.
Fix
Track in-flight writes inside SessionDatabase via a _pendingWrites set populated by every public mutating method. The outermost wrap is required so the await this._ensureDb() window is also covered; tracking only the leaf dbRun/dbExec would miss the gap between the method being called and the query actually being queued.
SessionDataService aggregates whenIdle() across all live per-session DBs (tracked via the existing ref-counted collection). The server's shutdown handler now awaits this with a 3s raceTimeout before disposing, so a stuck write cannot hang shutdown indefinitely.
Removes the await timeout(500) hack the test previously needed to mask the race.
Verification
- 632/632 platform/agentHost tests pass locally.
- 10/10 stability runs of the previously-flaky test pass with the fix.
- 5/5 runs fail when the await raceTimeout(...) line is commented out, confirming the test now deterministically catches the race rather than masking it with a sleep.
- npm run compile-check-ts-native clean.
(Written by Copilot)
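The bounded wait mentioned under Fix can be illustrated with a small stand-in. `raceWithTimeout` below is a hypothetical helper written for this sketch, not the raceTimeout utility the PR calls, and its exact semantics (resolving to undefined on timeout) are an assumption. It shows why a stuck write cannot block shutdown: the caller resumes after the deadline whether or not the flush finished.

```typescript
// Hypothetical stand-in for a timeout race; resolves to undefined if `ms` elapses first.
function raceWithTimeout<T>(promise: Promise<T>, ms: number): Promise<T | undefined> {
	return new Promise<T | undefined>((resolve, reject) => {
		const timer = setTimeout(() => resolve(undefined), ms);
		promise.then(
			value => {
				clearTimeout(timer); // do not keep the process alive for the unused timer
				resolve(value);
			},
			err => {
				clearTimeout(timer);
				reject(err);
			}
		);
	});
}

// Shutdown-style usage (not executed here), with a 3 second bound as in the PR text:
// await raceWithTimeout(dataService.whenIdle(), 3000);
```

The trade-off is that a write still pending after the deadline is abandoned, which is the same outcome as before the fix but now limited to writes that are genuinely stuck.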
skipped in https://github.com/microsoft/vscode/pull/311374/changes