s3: AdminDeleteBucket DEL_PREFIX safety net (closes TOCTOU)#695
Conversation
Pre-existing TOCTOU between the empty-probe (ScanAt at readTS) and the BucketMetaKey delete commit, flagged by coderabbitai on PR #669 and tracked in the partial admin dashboard design doc's Outstanding section. This doc proposes adopting Option C (DEL_PREFIX safety net) over Options A (BucketGenerationKey bump on every PutObject) and B (OperationGroup.ReadRanges schema extension):

- Option A imposes a permanent throughput tax on every PutObject (the hot path) to fix a race that fires only at admin-delete time.
- Option B is the semantically purest fix, but the schema + coordinator + FSM surface is large and blocks unrelated work.
- Option C closes the orphan window with the smallest patch: existing op type (pb.Op_DEL_PREFIX), no schema change, PutObject hot path unchanged.

Trade-off: a contract change — a PutObject returning 200 OK during the race window can have its data swept by the racing AdminDeleteBucket. This is bounded and matches what docs/admin_deployment.md already advises operators to avoid.

The doc also enumerates every per-bucket key family in internal/s3keys (manifest / upload meta / upload part / blob / gc / route) and the new *PrefixForBucket helpers each one needs. BucketGenerationKey is intentionally not deleted, so re-creating the bucket bumps the generation and orphans from the old generation remain isolated.

Implementation lands in subsequent commits per the CLAUDE.md design-doc-first workflow.
ObjectManifestPrefixForBucket has existed since the s3 adapter was introduced; the other five per-bucket key families never had a helper because the existing call sites always knew the per-upload or per-object prefix they needed. AdminDeleteBucket's DEL_PREFIX safety net (design doc 2026_04_28_proposed_admin_delete_bucket_safety_net.md) needs to sweep every per-bucket prefix in one OperationGroup, so every key family gets a parallel helper:

- UploadMetaPrefixForBucket
- UploadPartPrefixForBucket
- BlobPrefixForBucket
- GCUploadPrefixForBucket
- RoutePrefixForBucket

All six functions now share a single bucketScopedPrefix helper — the encoding is identical (`<family-prefix><EncodeSegment(bucket)><appendU64(gen)>`), so factoring it out avoids drift if the per-key constructors evolve.

Pinned the contract with TestPerBucketPrefixes_IsolateByBucketAndGeneration: for each of the six families, every key under (bucket=A, gen=N) matches the (A, N) prefix, while keys under (B, N) and (A, M) do NOT. The (A, M) check is the load-bearing assertion — if a future encoding change put generation before bucket, the orphan-isolation property bucket re-creation relies on would silently break and DEL_PREFIX would start nuking the new bucket's data. The test fails at construction time, before any prod call site notices.

No behavior change in this commit — pure refactor + new helpers.
Fails on current main: AdminDeleteBucket only deletes BucketMetaKey,
so any keys planted in the 5 non-manifest per-bucket prefixes
(upload_meta, upload_part, blob, gc_upload, route) survive the
delete and become orphaned under a now-deleted bucket meta.
Per CLAUDE.md "test the bug first" — this commit lands the
failing test; the next commit lands the DEL_PREFIX safety net
fix. Together they pin the regression so a future refactor that
strips the safety net trips the assertion.
The test plants directly into the store (via the local adapter
coordinator's Put) rather than racing two goroutines because:
- the empty-probe scan only looks at the manifest prefix, so
orphans in the other 5 prefixes are exactly what can leak
through the race window without disturbing the probe
- direct planting is deterministic; goroutine racing is not
- the safety net's behavior is independent of the race —
it must wipe every per-bucket prefix at delete time, full
stop, and that's what the test asserts
A second test, TestS3Server_AdminDeleteBucket_BucketGenerationKeySurvives,
pins the orphan-isolation property the design relies
on: BucketGenerationKey must NOT be deleted, so re-creating the
bucket bumps generation and any pre-existing orphans under the
old generation prefix stay invisible to the new bucket.
Closes the orphan-object race coderabbitai flagged on PR #669 and analysed in docs/design/2026_04_28_proposed_admin_delete_bucket_safety_net.md. The empty-probe (ScanAt at readTS) and the BucketMetaKey delete (commit at a later commitTS) are not atomic; a concurrent PutObject between the two would leave manifest, blob, upload metadata, upload parts, GC entries, and route key all orphaned under a now-deleted bucket meta. The OCC validator only inspects keys appearing in OperationGroup.ReadKeys, so the existing single-key read set could not catch range-level mutations.

The fix appends DEL_PREFIX ops to the same OperationGroup as the BucketMetaKey delete:

- ObjectManifestPrefixForBucket
- UploadMetaPrefixForBucket
- UploadPartPrefixForBucket
- BlobPrefixForBucket
- GCUploadPrefixForBucket
- RoutePrefixForBucket

All seven ops share a commitTS, so anything that landed in any per-bucket prefix between readTS and commit is tombstoned with the bucket meta. BucketGenerationKey is intentionally NOT in the list — re-creating the bucket bumps the generation, and orphans under the old generation prefix stay isolated from the new bucket. Pinned by TestS3Server_AdminDeleteBucket_BucketGenerationKeySurvives.

The same shape is mirrored on the SigV4 path (adapter/s3.go:deleteBucket) so both delete entrypoints share the same race-window guarantees. Factored into one bucketDeleteOperationGroupElems helper so a future per-bucket key family added to either path covers both delete paths in lockstep.

Contract change for clients: a PutObject that returned 200 OK during the race window can have its data swept by the concurrent delete. Operationally bounded — operators are already advised to pause writes before AdminDeleteBucket. The alternative was orphan objects that no API can enumerate or remove; the new contract is strictly cleaner.
DEL_PREFIX is broadcast to every shard (kv/sharded_coordinator.go), but per-shard scans return 0 keys in the common case (the empty-probe already passed), and bucket delete is operator-frequency. The cost is acceptable.

Tests: the previously failing TestS3Server_AdminDeleteBucket_SweepsOrphansAcrossAllPerBucketPrefixes now passes (the 5 non-manifest prefixes are wiped). Existing AdminDeleteBucket tests (HappyPath / MissingBucket / RejectsReadOnly / RejectsNonEmpty) and the full TestS3 race suite continue to pass.

go test -race ./adapter/ — passes.
golangci-lint run ./adapter/... ./internal/s3keys/... — 0 issues.
Two updates now that the DEL_PREFIX safety net has landed:
1. **docs/admin_deployment.md** — new §4.6 "Deleting an S3
bucket" covering the operator-facing contract:
- Standard "bucket must be empty" rule still applies (409
BucketNotEmpty when not).
- The race-window contract change is documented explicitly:
a PutObject that returned 200 OK during the empty-probe →
commit window of a concurrent AdminDeleteBucket can have
its data swept. Calls out the two scenarios where this
matters (planned vs emergency delete) and what operators
should do in each case.
- Confirms re-create-after-delete is safe because
BucketGenerationKey survives — orphans from any previous
escape stay isolated under the old generation prefix.
- Cross-link to the new design doc.
2. **docs/design/2026_04_24_partial_admin_dashboard.md** —
Outstanding open items section: TOCTOU entry struck through
and replaced with the "fixed" summary plus a pointer to the
safety-net design doc. Updated the closing rename trigger so
only the rolling-upgrade flag (criterion 5) blocks promotion
to `_implemented_` now.
No design changes; no behaviour changes. State refresh.
@claude review
Claude finished @bootjp's task in 3m 17s — PR Review: s3: AdminDeleteBucket DEL_PREFIX safety net (closes TOCTOU)

Summary: The design intent is sound — appending

🔴 Critical: Mixed
Code Review
This pull request addresses a TOCTOU race condition in bucket deletion by introducing a safety net that uses DEL_PREFIX operations to tombstone all per-bucket key families during the deletion transaction. This prevents orphaned keys from being left behind if a concurrent write occurs. The changes include new key prefix helpers, updates to both the SigV4 and Admin delete paths, and extensive regression tests and documentation. I have no feedback to provide.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5f48cd7230
    _, err = s.coordinator.Dispatch(ctx, &kv.OperationGroup[kv.OP]{
    	IsTxn:   true,
    	StartTS: startTS,
    -	Elems: []*kv.Elem[kv.OP]{
    -		{Op: kv.Del, Key: s3keys.BucketMetaKey(name)},
    -	},
    +	Elems: bucketDeleteOperationGroupElems(name, meta.Generation),
Split bucket delete into supported coordinator operation shapes
AdminDeleteBucket now dispatches IsTxn: true with bucketDeleteOperationGroupElems(...), which includes both kv.Del and kv.DelPrefix. That combination is rejected by real coordinators: kv/sharded_coordinator.go routes any request containing DelPrefix through dispatchDelPrefixBroadcast, which rejects transactional and mixed-operation requests, and the single-group txn path rejects Op_DEL_PREFIX intents in kv/fsm.go. In production this makes bucket deletion fail (admin and SigV4, since both paths share this helper) instead of applying the intended safety net; the new local tests miss it because newLocalAdapterCoordinator bypasses coordinator validation.
Codex P1 on PR #695 caught that the original single-OperationGroup shape would be rejected by the production coordinator on every bucket delete. The fix lands here together with a test-coordinator tightening that catches this class of bug in unit tests.

Background: kv/sharded_coordinator.go:dispatchDelPrefixBroadcast rejects any OperationGroup that mixes DEL_PREFIX with other ops AND any OperationGroup that uses DEL_PREFIX inside a transaction. The earlier shape ([Del BucketMetaKey, DelPrefix×6] with IsTxn=true) hit both rejection rules, so production would have failed every bucket delete with ErrInvalidRequest. The local test coordinator bypassed those checks, so the regression test passed despite the production-breaking shape.

The fix is two-part:

1. **Split AdminDeleteBucket and s3.go:deleteBucket into two Dispatch calls.**
   Phase 1 — `Del BucketMetaKey` in a txn (OCC-protected against a concurrent AdminCreateBucket racing the delete). Lives inside retryS3Mutation so an OCC conflict retries the whole closure.
   Phase 2 — DEL_PREFIX broadcast over every per-bucket key family in a non-txn OperationGroup. Lives outside retryS3Mutation because Phase 1 is the point of no return: a Phase-2 retry would 404 at loadBucketMetaAt. Phase-2 failure is logged via slog.WarnContext and not propagated: the bucket meta is already gone from the operator's POV; orphan keys (if any) are no worse than the pre-fix state on main and can be recovered by a future sweep tool. Surfacing a 500 to the operator after a successful delete would be a worse UX.
   Phase-1-first ordering is deliberate: a Phase-2-first ordering could leave the bucket meta extant while per-bucket data was wiped if Phase 1 then failed (concurrent recreate). Phase-1-first localises any partial failure to "bucket gone, orphan data may persist", which has a clean audit trail.

2. **Tighten localAdapterCoordinator validation** to mirror the production coordinator's dispatch-time rejection rules: reject IsTxn=true with any DelPrefix; reject mixed Del+DelPrefix groups. Without this, a future regression that ships the rejected shape would silently pass tests while breaking production. The existing TestS3Server_AdminDeleteBucket_SweepsOrphansAcrossAllPerBucketPrefixes test now exercises the production-realistic dispatch path and would have caught Codex P1 directly.

Refactor: split the single `bucketDeleteOperationGroupElems` helper into `bucketDeleteSafetyNetElems` (DEL_PREFIX-only, used by Phase 2) and a new `runBucketDeleteSafetyNet` method that shares the dispatch-and-log logic between Admin and SigV4 paths. The Phase-1 Del op shape is small enough to inline at each call site.

Design doc §6.2 rewritten with the two-phase rationale, the Codex P1 finding, the Phase-2 failure semantics, and why Phase-1-first ordering is correct.

Tests:
go test -count=1 -run TestS3Server_AdminDeleteBucket ./adapter/ — passes
go test -race -count=1 -run TestS3 ./adapter/ — passes (1.455s)
golangci-lint run ./adapter/... — 0 issues
@claude review

Addressed the Codex P1 finding (single-OperationGroup shape rejected by production coordinator): the original [Del BucketMetaKey, DelPrefix×6] with IsTxn=true would have hit two rejection rules in kv/sharded_coordinator.go. Two-part fix in the commit above:
Design doc §6.2 rewritten with the two-phase rationale, the Phase-2 failure semantics, why Phase-1-first ordering is correct, and the test-coordinator parity note.
The single comment at
Claude finished @bootjp's task in 1m 55s — PR Review: s3: AdminDeleteBucket DEL_PREFIX safety net (closes TOCTOU)
Review of commit
Per docs/design/README.md's lifecycle convention. The original P1–P4 plan has fully shipped:

- P1 (admin skeleton + Dynamo + AdminForward) — #634 / #635 / #644 / #648
- P2 (S3 endpoints incl. write paths and AdminForward integration) — #658 / #669 / #673 / #695 (TOCTOU safety net)
- P3 (React SPA + embed) — #649 / #650
- P4 (TLS / role / CSRF / operator doc / deployment runbook / scripts/rolling-update.sh admin support) — #674 / #669 / #678

The AdminDeleteBucket TOCTOU caught during PR #669 review (the last "in-flight" item that kept the doc at _partial_) is fully resolved by the safety-net design landed in #695.

What changed:

- git mv 2026_04_24_partial_admin_dashboard.md → 2026_04_24_implemented_admin_dashboard.md (history follows the rename)
- Header Status line: "Partial" → "Implemented"; explanation updated to reflect the post-fix state and the rationale for promotion.
- "Last updated" bumped to 2026-04-28 with the rename trigger.
- Section heading "Outstanding open items" → "Out-of-scope follow-ups" — the remaining three entries (criterion 5, object browser, TLS hot-reload) are not in-flight work; they are deferred-at-design or Non-goal items. The TOCTOU bullet is removed (resolved) and replaced with a one-line cross-link to the safety-net design + admin_deployment.md §4.6 contract.
- Removed the closing "rename trigger" sentence — we just did the rename.
- Status table: the P2 row now lists #695 alongside #658/#669/#673 so a future reader can find the TOCTOU fix from the index.
- Cross-references updated everywhere the old filename appeared:
  docs/admin.md (header link + Cross-references),
  docs/admin_deployment.md (header link + final cross-ref),
  docs/design/2026_04_28_proposed_admin_delete_bucket_safety_net.md (Background section pointer),
  internal/admin/config.go (Section 7.1 reference comment)

No code changes other than the comment-only filename refresh in config.go.
## Summary

Promote the admin dashboard design doc from `_partial_` → `_implemented_` per `docs/design/README.md`'s lifecycle convention. PR #695 landed the TOCTOU safety-net fix (the last in-flight item that kept the doc at `_partial_`), so the original P1–P4 plan is now fully shipped:

| Phase | Landed via |
|---|---|
| P1 (admin skeleton + Dynamo + AdminForward) | #634 / #635 / #644 / #648 |
| P2 (S3 endpoints + writes + AdminForward S3 + TOCTOU fix) | #658 / #669 / #673 / **#695** |
| P3 (React SPA + embed) | #649 / #650 |
| P4 (TLS / role / CSRF / operator doc / runbook / script wiring) | #674 / #669 / #678 |

The remaining three items in the doc move from "Outstanding open items" (in-flight) to **"Out-of-scope follow-ups"** (deferred-at-design or Non-goal):

- AdminForward criterion 5 — rolling-upgrade flag, deferred behind a cluster-version bump that doesn't exist yet
- S3 object browser — Non-goal per §2.2
- TLS cert hot-reload — out of scope per `docs/admin.md`

## Changes

- `git mv` partial → implemented (history follows the rename)
- Status line / Last-updated / status-table / Out-of-scope section content reflects the promotion
- Cross-references updated in all 4 referencing files: `docs/admin.md`, `docs/admin_deployment.md`, `docs/design/2026_04_28_proposed_admin_delete_bucket_safety_net.md`, `internal/admin/config.go` (comment-only)

## Test plan

- [x] No code changes other than a comment-only filename refresh in `config.go`
- [x] `go build ./...` passes
- [x] `golangci-lint run ./internal/admin/...` — 0 issues
- [x] `grep -rn "2026_04_24_partial\|2026_04_24_proposed_admin"` returns nothing — no stale references
Summary

Closes the AdminDeleteBucket TOCTOU race that coderabbitai flagged on PR #669. The empty-probe (`ScanAt` at `readTS`) and the `BucketMetaKey` delete (commit at a later `commitTS`) were not atomic — a concurrent `PutObject` between the two would leave manifest, blob chunks, upload metadata, upload parts, GC entries, and route key all orphaned under a now-deleted bucket meta.

Fix

Append `DEL_PREFIX` ops to the same `OperationGroup` as the `BucketMetaKey` delete, covering every per-bucket key family in `internal/s3keys`:

- `ObjectManifestPrefixForBucket`
- `UploadMetaPrefixForBucket` (new)
- `UploadPartPrefixForBucket` (new)
- `BlobPrefixForBucket` (new)
- `GCUploadPrefixForBucket` (new)
- `RoutePrefixForBucket` (new)

All seven ops share a `commitTS`, so anything that snuck into any per-bucket prefix between `readTS` and commit is tombstoned at the same logical moment as the bucket meta. `BucketGenerationKey` is intentionally not deleted — re-creating the bucket bumps the generation and orphans under the old generation prefix stay isolated.

The same shape lands on the SigV4 path (`adapter/s3.go:deleteBucket`) so both delete entrypoints share the race-window guarantees. Factored into one `bucketDeleteOperationGroupElems` helper.

Contract change for clients

A `PutObject` that returned 200 OK during the race window can have its data swept by the concurrent delete. This is operationally bounded — operators are advised to pause writes before bucket delete (now in `docs/admin_deployment.md` §4.6). The alternative was orphan objects that no API can enumerate or remove; the new contract is strictly cleaner.

Commit structure (per CLAUDE.md design-doc-first + test-first)

- `3b8ef475` — design doc proposing Option C (DEL_PREFIX safety net) over Options A (BucketGenerationKey bump on every PutObject) and B (`OperationGroup.ReadRanges` schema extension)
- `360a2ee0` — `s3keys` per-bucket prefix helpers + isolation test (no behaviour change)
- `ad4de053` — failing regression test (would fail on current main)
- `d7c3589e` — DEL_PREFIX safety net implementation (test passes)
- `5f48cd72` — operator doc + partial design doc Outstanding section update

Self-review (5 lenses per CLAUDE.md)

- `DEL_PREFIX` is broadcast to every shard (`kv/sharded_coordinator.go:230`); per-shard scans return 0 keys in the common case (empty-probe already passed).
- `bucketDeleteOperationGroupElems` is shared between Admin and SigV4 paths so leader-direct and forwarded calls land identical ops.
- `AdminDeleteBucket` is operator-frequency, not data-plane. PutObject hot path unchanged (rejected Option A specifically because it would have taxed it).
- `TestS3Server_AdminDeleteBucket_BucketGenerationKeySurvives`: generation bumps on recreate, orphans (if any escape from a future regression) stay invisible.
- `TestS3Server_AdminDeleteBucket_SweepsOrphansAcrossAllPerBucketPrefixes` (table-driven across all 6 prefixes), the generation-survival test, and the `s3keys` isolation test (every per-bucket prefix matches its own bucket+gen and rejects sibling buckets / other generations).

Test plan

- `go test -count=1 ./internal/s3keys/...` — passes
- `go test -count=1 -run TestS3Server_AdminDeleteBucket ./adapter/` — passes
- `go test -race -count=1 -run TestS3 ./adapter/` — passes (1.456s)
- `golangci-lint run ./adapter/... ./internal/s3keys/...` — 0 issues