Skip to content

feat(changelog): Add and use Assembling phase with deferred cleanup#480

Merged
lcian merged 22 commits into
lcian/feat/multipart-upload-tieredfrom
lcian/feat/assembling-changeguard
May 29, 2026
Merged

feat(changelog): Add and use Assembling phase with deferred cleanup#480
lcian merged 22 commits into
lcian/feat/multipart-upload-tieredfrom
lcian/feat/assembling-changeguard

Conversation

@lcian
Copy link
Copy Markdown
Member

@lcian lcian commented May 26, 2026

⚠️ Stacked on #458

This addresses the issue being discussed in #458 (comment) by introducing a new ChangePhase::Assembling variant.
The corresponding Change carries a cleanup_after timestamp, allowing the caller to defer cleanup.
In practice, we use this in TieredStorage::complete_multipart with an arbitrary 24 hours deadline (should we change this?), during which the user can freely retry completing the upload.

Note that this ChangePhase is special, as it behaves like ChangePhase::complete in-process:

  • It does nothing on Drop, as we assume that the deadline hasn't passed yet.
  • The corresponding ChangeGuard for it also doesn't spawn any task on Drop, due to the same assumption.

As a consequence, this depends on a persistent implementation of the ChangeLog to actually clean anything up, or at least on a task that periodically scans the in-memory ChangeLog for Changes to clean up, both of which we don't have yet.
In production, we will eventually always use a persistent ChangeLog, and other code paths exist that similarly require a persistent ChangeLog to actually ensure that no LT orphans are created, so I think it's fine for this change to also make that assumption.

Part of FS-339.
Will be squash-merged into #458 once approved.

The existing ChangeGuard assumes every LT write uses a unique key and
can be safely cleaned up if the guard drops before CAS commits. For
multipart uploads the physical key is fixed at initiation time, so a
retry of complete_multipart reuses the same key. Immediate cleanup on
guard drop would race with retries and delete successfully assembled
blobs.

Introduce ChangePhase::Assembling: when the guard drops in this phase,
in-process cleanup is skipped and the changelog entry persists. A
cleanup_after timestamp on Change tells scan() to filter out entries
whose grace period hasn't elapsed. Once the deadline passes, a recovery
scan picks up the entry and uses read_tombstone() to determine the
correct cleanup action — preserving blobs that a retry committed and
deleting genuine orphans.

Refs: FS-339
@linear-code

This comment was marked as spam.

@codecov

This comment was marked as spam.

@lcian

This comment was marked as spam.

@lcian

This comment was marked as spam.

Comment thread objectstore-service/src/backend/tiered.rs
cursor[bot]

This comment was marked as outdated.

@lcian lcian changed the title feat: Assembling phase for multipart ChangeGuard feat(service): Add and use Assembling phase with deferred cleanup May 26, 2026
@lcian lcian changed the title feat(service): Add and use Assembling phase with deferred cleanup feat(changelog): Add and use Assembling phase with deferred cleanup May 26, 2026
@lcian lcian marked this pull request as ready for review May 26, 2026 16:40
@lcian lcian requested a review from a team as a code owner May 26, 2026 16:40
Comment thread objectstore-service/src/backend/tiered.rs
Comment thread objectstore-service/src/backend/changelog.rs Outdated
Comment thread objectstore-service/src/backend/tiered.rs Outdated
@lcian
Copy link
Copy Markdown
Member Author

lcian commented May 28, 2026

@lcian lcian requested a review from matt-codecov May 28, 2026 16:45
Comment thread objectstore-service/src/backend/changelog.rs Outdated
Comment thread objectstore-service/src/backend/tiered.rs
@lcian lcian merged commit 59919c0 into lcian/feat/multipart-upload-tiered May 29, 2026
16 checks passed
@lcian lcian deleted the lcian/feat/assembling-changeguard branch May 29, 2026 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants