-
Notifications
You must be signed in to change notification settings - Fork 8
fix(snapshot): close TOCTOU race on save/restore/delete #1003
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
c5870ac
a5cb048
e3bfbaa
e6aae24
f38c1d1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| // Worker entry used by snapshot.test.ts to create genuine concurrency | ||
| // between two snapshotSave calls. better-sqlite3 is synchronous, so | ||
| // Promise-based concurrency cannot exercise the TOCTOU race — only | ||
| // separate threads can. | ||
| // | ||
| // This file is .mjs so Node loads it directly without a TypeScript loader; | ||
| // it imports the compiled dist build when available, falling back to the | ||
| // TS source via the project's ts-resolve-hooks when running under vitest | ||
| // (which propagates the strip-types flag to workers via NODE_OPTIONS). | ||
|
|
||
| import { parentPort, workerData } from 'node:worker_threads'; | ||
|
|
||
| const { dbPath, name, force } = workerData; | ||
|
|
||
| try { | ||
| const mod = await import('../../src/features/snapshot.ts'); | ||
| mod.snapshotSave(name, { dbPath, force }); | ||
| parentPort.postMessage({ ok: true }); | ||
| } catch (err) { | ||
| parentPort.postMessage({ ok: false, error: err?.message ?? String(err) }); | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,6 +1,8 @@ | ||
| import fs from 'node:fs'; | ||
| import os from 'node:os'; | ||
| import path from 'node:path'; | ||
| import { fileURLToPath } from 'node:url'; | ||
| import { Worker } from 'node:worker_threads'; | ||
| import Database from 'better-sqlite3'; | ||
| import { afterEach, beforeEach, describe, expect, it } from 'vitest'; | ||
| import { | ||
|
|
@@ -12,6 +14,8 @@ import { | |
| validateSnapshotName, | ||
| } from '../../src/features/snapshot.js'; | ||
|
|
||
| const __dirname = path.dirname(fileURLToPath(import.meta.url)); | ||
|
|
||
| let tmpDir: string; | ||
| let dbPath: string; | ||
|
|
||
|
|
@@ -132,6 +136,109 @@ describe('snapshotSave', () => { | |
| it('rejects invalid name', () => { | ||
| expect(() => snapshotSave('bad name', { dbPath })).toThrow(/Invalid snapshot name/); | ||
| }); | ||
|
|
||
| it('leaves no temp files in snapshots dir after success', () => { | ||
| snapshotSave('clean', { dbPath }); | ||
| const entries = fs.readdirSync(snapshotsDir(dbPath)); | ||
| expect(entries).toContain('clean.db'); | ||
| // Temp files are named `.<name>.db.tmp-<pid>-<ts>` — none should remain. | ||
| expect(entries.filter((f) => f.includes('.tmp-'))).toEqual([]); | ||
| }); | ||
|
|
||
| // Worker infrastructure for genuine cross-thread concurrency on | ||
| // snapshotSave. better-sqlite3 is synchronous, so Promise-based | ||
| // concurrency would queue two sequential microtasks — only separate | ||
| // threads exercise the TOCTOU race. | ||
| const nodeMajor = Number(process.versions.node.split('.')[0]); | ||
| const raceWorkerPath = path.join(__dirname, 'snapshot-race-worker.mjs'); | ||
| // --import requires a URL (file://…) or a bare/relative specifier, not a | ||
| // drive-letter path on Windows. Use the file:// URL directly. | ||
| const loaderUrl = new URL('../../scripts/ts-resolve-loader.js', import.meta.url).href; | ||
| const raceExecArgv = [ | ||
| nodeMajor >= 23 ? '--strip-types' : '--experimental-strip-types', | ||
| '--import', | ||
| loaderUrl, | ||
| ]; | ||
| const spawnSaveWorker = (workerData: { | ||
| dbPath: string; | ||
| name: string; | ||
| force: boolean; | ||
| }): Promise<{ ok: boolean; error?: string }> => | ||
| new Promise((resolve, reject) => { | ||
| const w = new Worker(raceWorkerPath, { workerData, execArgv: raceExecArgv }); | ||
| let messageReceived = false; | ||
| w.once('message', (msg) => { | ||
| messageReceived = true; | ||
| resolve(msg); | ||
| }); | ||
| w.once('error', reject); | ||
| w.once('exit', (code) => { | ||
| if (!messageReceived) { | ||
| reject(new Error(`worker exited with code ${code} before posting a message`)); | ||
| } | ||
| }); | ||
| }); | ||
|
|
||
| it('does not corrupt output when two --force saves race on the same name', async () => { | ||
| // Prime the target so both workers take the --force overwrite path. | ||
| snapshotSave('race', { dbPath }); | ||
|
|
||
| // Spawn two worker threads racing on the same name. Post-fix, the atomic | ||
| // rename ensures the winner's file is intact and the loser either | ||
| // overwrites cleanly or leaves no corrupt artifact. | ||
| const results = await Promise.allSettled([ | ||
| spawnSaveWorker({ dbPath, name: 'race', force: true }), | ||
| spawnSaveWorker({ dbPath, name: 'race', force: true }), | ||
| ]); | ||
|
|
||
| // At least one save must have succeeded. | ||
| const succeeded = results.filter((r) => r.status === 'fulfilled' && r.value.ok === true); | ||
| expect(succeeded.length).toBeGreaterThanOrEqual(1); | ||
|
|
||
| // The final file must be a valid SQLite DB with the expected contents. | ||
| const finalPath = path.join(snapshotsDir(dbPath), 'race.db'); | ||
| const db = new Database(finalPath, { readonly: true }); | ||
| const rows = db.prepare('SELECT name FROM nodes').all(); | ||
| db.close(); | ||
| expect(rows).toEqual([{ name: 'hello' }]); | ||
|
|
||
| // No temp files should leak from either worker. | ||
| const entries = fs.readdirSync(snapshotsDir(dbPath)); | ||
| expect(entries.filter((f) => f.includes('.tmp-'))).toEqual([]); | ||
| }); | ||
|
|
||
| it('atomically rejects a concurrent non-force save when one already won', async () => { | ||
| // With no existing snapshot, two concurrent non-force saves race on the | ||
| // same name. Post-fix, the atomic linkSync(tmp, dest) makes the guard | ||
| // authoritative: exactly one must succeed, the other must fail with | ||
| // "already exists". Pre-fix, both could pass existsSync and silently | ||
| // overwrite each other. | ||
| const results = await Promise.allSettled([ | ||
| spawnSaveWorker({ dbPath, name: 'nonforce-race', force: false }), | ||
| spawnSaveWorker({ dbPath, name: 'nonforce-race', force: false }), | ||
| ]); | ||
| const fulfilled = results.filter((r) => r.status === 'fulfilled'); | ||
| expect(fulfilled).toHaveLength(2); | ||
|
|
||
| const outcomes = fulfilled.map( | ||
| (r) => (r as PromiseFulfilledResult<{ ok: boolean; error?: string }>).value, | ||
| ); | ||
| const wins = outcomes.filter((o) => o.ok); | ||
| const losses = outcomes.filter((o) => !o.ok); | ||
| expect(wins).toHaveLength(1); | ||
| expect(losses).toHaveLength(1); | ||
| expect(losses[0].error).toMatch(/already exists/); | ||
|
|
||
| // Final snapshot must be valid. | ||
| const finalPath = path.join(snapshotsDir(dbPath), 'nonforce-race.db'); | ||
| const db = new Database(finalPath, { readonly: true }); | ||
| const rows = db.prepare('SELECT name FROM nodes').all(); | ||
| db.close(); | ||
| expect(rows).toEqual([{ name: 'hello' }]); | ||
|
|
||
| const entries = fs.readdirSync(snapshotsDir(dbPath)); | ||
| expect(entries.filter((f) => f.includes('.tmp-'))).toEqual([]); | ||
| }); | ||
|
Comment on lines
+182
to
+241
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This test would pass identically against the pre-fix code, because there is no opportunity for interleaving on the Node.js microtask queue. The test validates the success path (which is already covered by the "overwrites with force" test) rather than the TOCTOU concern it describes. Genuine concurrency in this module requires
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Addressed — replaced the Promise-based pseudo-concurrency with a worker_threads harness ( |
||
| }); | ||
|
|
||
| // ─── snapshotRestore ──────────────────────────────────────────────────── | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With
force: false, two concurrent callers can both passexistsSync(dest)→false, both proceed throughVACUUM INTO tmp, and both successfullyrenameSync(tmp, dest)— silently overwriting each other with no error. The user expecting a "snapshot already exists" guard gets no protection in this race.This is a narrower window than the
--forcecase (the PR's stated target), so treating it as a known limitation is reasonable, but worth noting here since the fix otherwise closes most of the race surface.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed — closed the non-force TOCTOU by replacing
renameSyncwithlinkSync(tmp, dest)on the non-force path.linkSyncfails atomically with EEXIST if dest already exists, making the existence check authoritative even under concurrent writers. The cheapexistsSyncabove is kept as a fail-fast for the common case. Added a regression test (atomically rejects a concurrent non-force save when one already won) that spawns two worker threads and asserts exactly one wins with the other receiving the "already exists" error.