fix(blob): clean up orphaned blob files on aborted/superseded writes #462
Merged
Blobs flagged with `saveBeforeCommit` (or `saveInRecord`) are written to
disk in the `beforeIntermediate` phase of a transaction write, before the
underlying record write commits. Several paths could leave the file on
disk with no record referencing it:
- Multi-write transaction where one `beforeIntermediate` rejected after
others wrote files (only the failing blob was cleaned up by
`deleteOnFailure`; succeeded peers were leaked).
- Commit-handler early-returns: out-of-order replicated update treated
as duplicate, superseded by a newer put/delete, no-audit fullUpdate
loses to existing version, cache-resolve version-changed.
- `DatabaseTransaction.commit` rejecting on `Promise.all(completions)`
failure (e.g. blob save error) without aborting — leaked the
underlying RocksDB transaction *and* the blob files.
- Mid-record decode failure in the replication ingest path.
Approach:
- `startPreCommitBlobsForRecord` now returns `{ blobs, complete }` so
each `TransactionWrite` can attach `savedBlobs` for later cleanup.
- New `cleanupUnusedBlobs(blobs)` waits for any in-flight save to
settle, then unlinks. Idempotent — clears the list after scheduling.
- Commit handlers in `Table.ts` set `write.skipped = true` (resetting
to `false` on each invocation) on early-return paths that don't end
up writing a record or audit reference. The transaction commit
success paths walk writes and clean up `skipped` ones. Cleanup is
deferred to post-commit because the same handler runs again on
optimistic-lock retries and a retry can flip a previously-skipped
write into a successful one (e.g. a concurrent delete makes our
older replicated update suddenly win).
- `LMDBTransaction.abort` and `DatabaseTransaction.abort` walk all
writes' `savedBlobs` and clean unconditionally.
- `DatabaseTransaction.commit` adds an explicit reject handler so a
`Promise.all` failure aborts instead of leaking.
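The cleanup flow described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the `SavedBlob` shape, its `saving` field, and the `cleanupAfterCommit` / `cleanupAfterAbort` helper names are assumptions made for the example; only `cleanupUnusedBlobs`, `savedBlobs`, and `skipped` are names taken from the description.

```typescript
import { promises as fs } from "node:fs";

// Hypothetical shapes for illustration; the real types live in the repository.
interface SavedBlob {
  path: string;
  saving?: Promise<unknown>; // in-flight save, if any (assumed field)
}
interface TransactionWrite {
  savedBlobs?: SavedBlob[];
  skipped?: boolean;
}

// Wait for any in-flight save to settle, then unlink the file.
// The list is emptied synchronously, so a second call is a no-op.
async function cleanupUnusedBlobs(blobs?: SavedBlob[]): Promise<void> {
  if (!blobs || blobs.length === 0) return;
  const pending = blobs.splice(0); // take the entries and clear the caller's list
  await Promise.all(
    pending.map(async (blob) => {
      // A failed save may still have left a partial file, so ignore its error.
      if (blob.saving) await blob.saving.catch(() => undefined);
      await fs.unlink(blob.path).catch((err: NodeJS.ErrnoException) => {
        if (err.code !== "ENOENT") throw err; // a file that never existed is fine
      });
    }),
  );
}

// Post-commit: only writes whose handler took an early return are cleaned.
async function cleanupAfterCommit(writes: TransactionWrite[]): Promise<void> {
  await Promise.all(
    writes.filter((w) => w.skipped).map((w) => cleanupUnusedBlobs(w.savedBlobs)),
  );
}

// Abort: nothing was committed, so every saved blob is unreferenced.
async function cleanupAfterAbort(writes: TransactionWrite[]): Promise<void> {
  await Promise.all(writes.map((w) => cleanupUnusedBlobs(w.savedBlobs)));
}
```

Keeping the `skipped` check in the post-commit walk rather than in the handler means a retry that turns a skipped write into a real one never sees its blob deleted underneath it.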
Tests: new `Blob test > multi-write transaction with one failing blob
cleans up succeeded blobs` and `cleanupUnusedBlobs is a no-op for
unsaved blobs`. The post-suite `cleanupOrphans` test acts as a safety
net across all blob tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Reviewed; no blockers found.
…KNOWN_SIZE
When all blob content is read and the reader gets bytesRead=0, it re-reads the header (still UNKNOWN_SIZE) and sets up an fs.watch watcher. If the writer finishes and updates the header between that header read and the watcher being created, no further writes occur and the watcher never fires, causing a 60s timeout with 'size is supposed to be 281474976710655 bytes'.
Fix: after readSync at `position` returns 0, immediately re-read the header from position 0. If the size is no longer UNKNOWN_SIZE, bypass the watcher and retry directly. Apply the same check in the timeout callback as a second line of defense.
Also eliminates the shared module-level HEADER/headerView buffers in favor of local DataView instances, avoiding any shared-state risk.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
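The header re-check in this commit can be sketched as below. The header layout is an assumption made for the example (8 bytes, big-endian, size in the low 48 bits, all ones meaning unknown, which matches the 281474976710655 in the error message); `afterEmptyRead` and `writeHeaderSize` are hypothetical names, not functions from the repository.

```typescript
import { closeSync, openSync, readSync, unlinkSync, writeSync } from "node:fs";

// Assumed layout for this sketch: 8-byte big-endian header, size in the low 48 bits.
const HEADER_SIZE = 8;
const UNKNOWN_SIZE = 0xffffffffffff; // 281474976710655

function readHeaderSize(fd: number): number {
  // Local DataView per call, so no buffer is shared between readers.
  const header = new DataView(new ArrayBuffer(HEADER_SIZE));
  readSync(fd, header, 0, HEADER_SIZE, 0);
  return header.getUint16(2) * 2 ** 32 + header.getUint32(4);
}

function writeHeaderSize(fd: number, size: number): void {
  const header = new DataView(new ArrayBuffer(HEADER_SIZE));
  header.setUint16(2, Math.floor(size / 2 ** 32));
  header.setUint32(4, size % 2 ** 32);
  writeSync(fd, header, 0, HEADER_SIZE, 0);
}

// Called after a read at `position` returned 0 bytes. If the writer has
// already finalized the header, retry directly instead of waiting on a
// watcher that would never fire.
function afterEmptyRead(fd: number): "retry" | "watch" {
  return readHeaderSize(fd) !== UNKNOWN_SIZE ? "retry" : "watch";
}

// Usage: simulate the writer finishing before the watcher would be created.
const demoPath = `blob-header-sketch-${process.pid}.tmp`;
const demoFd = openSync(demoPath, "w+");
writeHeaderSize(demoFd, UNKNOWN_SIZE);
const whileWriting = afterEmptyRead(demoFd); // "watch"
writeHeaderSize(demoFd, 5);
const afterFinish = afterEmptyRead(demoFd); // "retry"
closeSync(demoFd);
unlinkSync(demoPath);
```

The same `afterEmptyRead` check could run in the timeout callback, which is the second line of defense the commit message mentions.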
…riterFinished
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
cb1kenobi
approved these changes
May 5, 2026
…lect phase
When a join condition (e.g. relatedByName) becomes the primary iterator due to a lower estimated count, the select phase uses filterMap.hasMappings=true and calls targetTable.transformEntryForSelect with the outer table's readTxn. That readTxn is a RocksTransaction scoped to the outer table's RocksDB column family and cannot read from a different table's column family, so getEntry returns null, producing [undefined] in the joined attribute.
Fix: compute targetReadTxn = targetTable._readTxnForContext(context) when the target table differs from the current table, ensuring each joined table's records are loaded with a transaction scoped to that table's own store.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
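The scoping problem in this commit can be modelled in a few lines. This is a toy model under stated assumptions: `ReadTxn`, `Table`, `readTxnForContext`, and `loadJoined` here are hypothetical stand-ins (the commit names `_readTxnForContext(context)` and `getEntry`, but their real signatures are not shown), and a `Map` stands in for a column family.

```typescript
// Toy model: a read transaction can only see the store it was opened against.
class ReadTxn {
  store: Map<string, unknown>;
  constructor(store: Map<string, unknown>) {
    this.store = store;
  }
}

class Table {
  name: string;
  private store = new Map<string, unknown>();
  constructor(name: string) {
    this.name = name;
  }
  put(id: string, record: unknown): void {
    this.store.set(id, record);
  }
  readTxnForContext(): ReadTxn {
    return new ReadTxn(this.store);
  }
  getEntry(id: string, txn: ReadTxn): unknown {
    // A transaction scoped to another table's store sees nothing here.
    return txn.store === this.store ? this.store.get(id) : undefined;
  }
}

// Use the outer transaction only when the joined table is the current table;
// otherwise open one scoped to the target table's own store.
function loadJoined(current: Table, target: Table, outerTxn: ReadTxn, id: string): unknown {
  const targetReadTxn = target === current ? outerTxn : target.readTxnForContext();
  return target.getEntry(id, targetReadTxn);
}

// Usage
const orders = new Table("orders");
const customers = new Table("customers");
customers.put("c1", { name: "Ada" });
const outerTxn = orders.readTxnForContext();
const wrongScope = customers.getEntry("c1", outerTxn); // undefined
const joined = loadJoined(orders, customers, outerTxn, "c1"); // { name: "Ada" }
```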
Summary
Several pre-commit blob save paths could leave a backing file on disk with no record referencing it. This PR plugs them: the file gets cleaned up after the abort, or after a successful commit if the commit handler took an early-return.
Concrete orphan paths fixed:
- Multi-write transaction where one `beforeIntermediate` rejects after others wrote files (only the failing blob was cleaned up by `deleteOnFailure`; succeeded peers leaked).
- `DatabaseTransaction.commit` rejecting on `Promise.all(completions)` failure (e.g. blob save error) without aborting, which leaked the underlying RocksDB transaction and the blob files.
Approach
- `startPreCommitBlobsForRecord` now returns `{ blobs, complete }`; `Table.ts` attaches `savedBlobs` to each `TransactionWrite`.
- Commit handlers set `write.skipped = true` (resetting on each invocation) on early-return paths that don't end up writing a record/audit.
- Commit success paths (`DatabaseTransaction.commit`, `LMDBTransaction.commit`) walk writes and call `cleanupUnusedBlobs(write.savedBlobs)` for `skipped` ones. Both `abort()` paths run the cleanup unconditionally.
- `DatabaseTransaction.commit` adds an explicit reject handler so a `completions` failure aborts (instead of leaking).
Why deferred and not inline: the commit handler runs again on optimistic-lock retries. A retry can flip a previously-skipped write into a successful one (e.g. concurrent delete makes our older replicated update suddenly win). Inline cleanup would race the deletion's `setTimeout` against the retry that referenced the blob.
A companion PR in harper-pro fixes one orphan path in the replication decode loop.
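The reject-handler change can be illustrated with a small sketch. The `Txn` interface and the `commitTransaction` and `commitUnderlying` names are hypothetical, chosen for the example; the only behavior taken from the description is that a `completions` failure now aborts instead of leaving the underlying transaction open.

```typescript
// Hypothetical minimal transaction surface for illustration.
interface Txn {
  commitUnderlying(): Promise<void>;
  abort(): Promise<void>; // in the PR, abort also cleans every write's savedBlobs
}

async function commitTransaction(txn: Txn, completions: Promise<unknown>[]): Promise<void> {
  try {
    // Rejects as soon as any pre-commit step (e.g. a blob save) fails.
    await Promise.all(completions);
  } catch (error) {
    // Without this branch the rejection would propagate while the
    // underlying transaction and any saved blob files stayed behind.
    await txn.abort();
    throw error;
  }
  await txn.commitUnderlying();
}
```

Because `Promise.all` rejects on the first failure, sibling saves may still be in flight when `abort()` runs, which is presumably why the cleanup waits for each save to settle before unlinking.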
Where to look
- `resources/blob.ts`: new `cleanupUnusedBlobs` and refactored `startPreCommitBlobsForRecord`.
- `resources/Table.ts`: 4 `addWrite` call sites refactored to the `const write = {…}; write.beforeIntermediate = …; addWrite(write)` shape so the commit closure can reference `write.savedBlobs` / `write.skipped`. Five early-return paths set `write.skipped = true`.
- `resources/DatabaseTransaction.ts` & `resources/LMDBTransaction.ts`: post-commit walk + abort cleanup. The `DatabaseTransaction.commit` reject handler is a behavioral change worth careful eyes.
- `DESIGN.md`: design notes for future agents extending blob storage.
Test plan
- `Blob test > multi-write transaction with one failing blob cleans up succeeded blobs`: passes locally.
- `cleanupUnusedBlobs is a no-op for unsaved blobs`: passes locally.
- `cleanupOrphans` test: passes (catches regressions across all blob tests).
- `RecordEncoder.this.encoder is undefined` error in this dev environment; CI should be unaffected.
🤖 Generated with Claude Code