v7.31.1 — fix saveBinaryBlob rename race
Fix
A same-key rename race in FileSystemStorage.saveBinaryBlob caused brain.flush() to throw ENOENT and broke downstream jobs (GCS backups, snapshot exports) for any consumer that triggers concurrent flush + compaction.
[job-queue] gcs-backup: failed — ENOENT: no such file or directory,
rename '/data/brainy-data/.../_column_index/owner/DELETED.bin.tmp'
-> '/data/brainy-data/.../_column_index/owner/DELETED.bin'
Root cause: saveBinaryBlob used a bare ${filePath}.tmp suffix. Two concurrent same-key calls computed the same temp path and both wrote to it with writeFile; the first rename succeeded, and the second rename then ran against a temp file that no longer existed and threw ENOENT.
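The interleaving described above can be reproduced deterministically by running the two writers' steps in sequence. This is a minimal sketch, not the actual saveBinaryBlob code: the function name demoSharedTempRace and the scratch directory are hypothetical, and only the shared `${filePath}.tmp` naming is taken from the root-cause description.

```typescript
import { promises as fs } from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical reproduction: two writers of the same key share one temp path.
// Returns the error code of the second rename, or undefined if it succeeded.
async function demoSharedTempRace(): Promise<string | undefined> {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "blob-race-"));
  const filePath = path.join(dir, "DELETED.bin");
  const tmpPath = `${filePath}.tmp`; // identical for every writer of this key

  try {
    // Both writers finish their write before either one renames.
    await fs.writeFile(tmpPath, Buffer.from("writer A"));
    await fs.writeFile(tmpPath, Buffer.from("writer B"));

    // Writer A renames first and consumes the shared temp file.
    await fs.rename(tmpPath, filePath);

    // Writer B renames second; its source path is already gone.
    try {
      await fs.rename(tmpPath, filePath);
    } catch (err) {
      return (err as NodeJS.ErrnoException).code;
    }
    return undefined;
  } finally {
    // Remove the scratch directory regardless of outcome.
    await fs.rm(dir, { recursive: true, force: true });
  }
}
```

In a real concurrent run the ordering is timing-dependent, which is why the failure only surfaced under concurrent flush + compaction.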
Fix: unique per-writer temp suffix (matches the pattern at every other atomic-write site in the same file) + defensive ENOENT swallow on rename + temp cleanup on any other failure.
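The three parts of the fix can be sketched as follows. This is an illustration under assumptions, not the shipped implementation: saveBlobAtomically and demoConcurrentSaves are hypothetical names, and the suffix format (pid plus random hex) is one possible way to make the temp path unique per writer; the release does not state the exact format used.

```typescript
import { randomBytes } from "node:crypto";
import { promises as fs } from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical atomic write: unique temp path, tolerant rename, temp cleanup.
async function saveBlobAtomically(filePath: string, data: Uint8Array): Promise<void> {
  await fs.mkdir(path.dirname(filePath), { recursive: true });
  // Assumed suffix format: each call gets its own temp file.
  const tmpPath = `${filePath}.${process.pid}.${randomBytes(6).toString("hex")}.tmp`;
  try {
    await fs.writeFile(tmpPath, data);
    try {
      await fs.rename(tmpPath, filePath);
    } catch (err) {
      // Defensive: ignore a missing temp file on rename, rethrow anything else.
      if ((err as NodeJS.ErrnoException).code !== "ENOENT") throw err;
    }
  } catch (err) {
    // Any other failure: remove the temp file (if it exists) before rethrowing.
    await fs.rm(tmpPath, { force: true });
    throw err;
  }
}

// Usage sketch: many concurrent same-key writes, then list what is left on disk.
async function demoConcurrentSaves(): Promise<string[]> {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "blob-fix-"));
  const filePath = path.join(dir, "blob.bin");
  try {
    await Promise.all(
      Array.from({ length: 16 }, (_, i) => saveBlobAtomically(filePath, Buffer.from(`writer ${i}`)))
    );
    return await fs.readdir(dir);
  } finally {
    await fs.rm(dir, { recursive: true, force: true });
  }
}
```

Because every writer renames its own temp file, concurrent same-key calls no longer contend for one source path; the last rename to complete determines the final contents.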
Audit
One bug site. A scoped audit found no other instances of the pattern:
- FileSystemStorage: six sibling atomic-write sites already used unique suffixes. Only saveBinaryBlob was the outlier. All clean now.
- OPFSStorage: WritableStream (no tmp+rename).
- Object-store adapters (GCS / R2 / Azure / S3): PUT is atomic.
- MemoryStorage / HistoricalStorageAdapter: not affected.
- COW / versioning / HNSW / aggregation / snapshot: all delegate to storage adapters; they get the fix automatically.
Beneficiaries beyond the reported bug
HNSW connection persistence (hnswIndex.ts:252) also writes via saveBinaryBlob and was structurally susceptible to the same race. No production reports of HNSW failures (probably because the HNSW write path is more naturally serialized by the index lock), but the fix removes the latent issue.
Tests
- New tests/integration/savebinaryblob-concurrent-rename.test.ts (4 tests)
- 1468 / 1468 unit suite passing
See RELEASES.md for the full entry.