fix(remove): atomic CAS branch deletion unifies safe-delete semantics#2903
fix(remove): atomic CAS branch deletion unifies safe-delete semantics#2903max-sixty wants to merge 8 commits into
Conversation
The `check_lock` in `step_prune` was carried over from #2808's Windows `.git/config` race fix. After #2870 restructured the flow so hook approval is exact, the phase ordering already serializes integration-check readers against the rename-failure fallback's `.git/config` rewrite: the background par_iter is the only reader, and the channel-drain loop (`for (idx, result) in rx`) only exits once that thread has finished. Removals run strictly after. That made the lock belt-and-suspenders — one mechanism per guarantee, per CLAUDE.md. Drop the `RwLock<()>`, the `Arc` wrapping it, the read-guard in the par_iter worker, the write-guard at the top of `try_remove`, and the `check_lock` field on `RemovalContext`. Comments on the par_iter spawn and the `SynchronousForNonCurrent` variant now name the real invariants (phase ordering, summary-accurate iteration) rather than the dropped lock. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e branch Both branch-deletion paths in `execute_instant_removal_or_fallback` swallowed errors at `log::debug!` — invisible in default verbosity. When the worktree is already gone but the branch survives (a hook advanced it, a `git branch -D` exec failed, refs scan errored), the user got no signal that the second half of the operation didn't happen. Promote both to a `warning_message` with a `wt remove --force-delete <branch>` recovery hint. To avoid duplicating the planner's existing "Branch unmerged" hint, the warning suppresses the `Ok(NotDeleted)` case when the planner already predicted retention (unmerged, or `--no-delete-branch`); race-driven `NotDeleted` outcomes still warn. `Err` always warns, since the prior `log::debug!` was invisible. `BackgroundRemoval` carries a new `planner_expected_retention` bit so the helper can make this distinction without re-deriving planner state. This adds noise to a previously silent path — the tradeoff is that silent half-completion of a destructive operation is worse than an extra warning line on the rare race. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
worktrunk-bot
left a comment
There was a problem hiding this comment.
The CAS approach unifies the four SafeDelete paths cleanly: one library helper (delete_branch_if_safe) for fast/synchronous/branch-only, plus a CAS-tail builder for the Detached fallback. The update-ref -d <ref> <oid> primitive picks up IntegrationReason semantics for free across all paths — the description of the prior && git branch -d <branch> shell tail silently losing squash-merged/patch-id matches is accurate. The race-rejection discrimination (try update-ref -d, then rev-parse --verify --quiet to distinguish "ref still present" from "real error") is the right shape per the structured-output rule.
The only gap is the one you raised in test plan item 3, and it shows up the most on handle_named_removed_worktree_foreground. In that path output.branch_result carries Ok(RetainedRaced) straight through RemovalDisplayInfo::from_branch_result → handle_branch_deletion_result, which sets show_unmerged_hint = true. RemovalDisplayInfo::print_message then prints "Removed feature worktree" with no mention that the branch was kept (since branch_deleted() is false), and print_hints falls into the self.show_unmerged_hint || … arm and emits "Branch unmerged; to delete, run wt remove -D". The branch wasn't unmerged from the planner's view — it was integrated and then moved — so that hint mis-attributes the cause.
The other three paths handle this correctly: the rename-into-trash fast path and SynchronousForNonCurrent go through warn_if_branch_retained (handlers.rs:311), and handle_branch_only_output has its own dedicated branch (handlers.rs:1123). The detached fallback can't surface it (CAS happens in the detached child), so foreground is the one missing case.
Cleanest fix is probably either: (1) split show_unmerged_hint in BranchDeletionDisplay from a show_race_warning flag for RetainedRaced, then have print_hints emit the dedicated "moved during deletion" message; or (2) keep the field but add a RetainedRaced early branch in print_hints that prints the warning before falling through.
Nit: assert_ne!(original_sha, advanced_sha) works because TestRepo::git_output trims (src/testing/mod.rs:1120), but it's worth knowing — if that trim ever goes away, the assertion becomes vacuous since the trailing \n would always make the strings differ.
The `Err` arm of `warn_if_branch_retained` covered failure modes the user cannot act on differently than re-running the command (`git branch -D` exec errors, refs DB I/O failures). Promote it from `warning_message` to `log::warn!` to match the writing-user-outputs guideline for non-user-fixable issues, and let the `Ok(NotDeleted)` race path stay user-visible since the recovery is `wt remove --force-delete <branch>`. Net effect on the prior commit: removes the duplicate "to retry, run ..." recovery hint for I/O errors (the prior wording suggested the same recovery either way, conflating two different failure shapes) and cuts an integration-test-unreachable code path that was the source of the codecov/patch miss on #2900. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
c127ed5 to
96f7f14
Compare
Integration tests cover only one of the four arms — the NotDeleted+planner-expected-deletion warning surfaced by test_merge_pre_remove_new_commit_keeps_branch. Add a binary-crate unit test that calls the helper with each outcome (Ok(NotDeleted) with both planner predictions, Ok(ForceDeleted), Ok(Integrated), Err) so the suppression logic and the log::warn Err arm are exercised in coverage. Output goes to stderr and isn't captured — the assertion is the call doesn't panic on any arm. User-visible message shapes stay pinned by the integration tests. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three follow-ups to #2870 collapse into one rewrite of `delete_branch_if_safe`: it now performs an atomic compare-and-swap delete via `git update-ref -d refs/heads/<branch> <expected-sha>`, where the expected SHA comes from the snapshot the integration check already consulted. If the ref moves between the integration check and the delete (a hook, a concurrent push), git rejects the CAS — the branch is retained (fail-closed) and we surface the new `BranchDeletionOutcome::RetainedRaced` outcome rather than dropping the unmerged commits silently. Force-delete still uses `git branch -D` (the user's explicit override). This unifies the divergent safe-delete semantics in `execute_instant_removal_or_fallback`: - Fast path and `SynchronousForNonCurrent` fallback already routed through `delete_branch_if_safe`, so they pick up CAS for free. - Detached fallback used to bake `&& git branch -d <branch>` into the shell, which only honored git's reachability-from-HEAD check (so squash-merged / patch-id / ancestor branches the planner accepted as integrated were rejected at delete time). Now `build_cas_branch_delete_tail` runs worktrunk's full integration check in the foreground and, if integrated, appends an atomic `git update-ref -d <ref> <expected-sha>` to the detached shell — same semantics as the other paths, plus CAS safety against tip movement between the foreground check and the detached delete. - `handle_branch_only_output` also switches its safe-delete path from `git branch -D` (an unsafe force-delete after a stale integration check) to `delete_branch_if_safe`, so the CAS protects branch-only deletions the same way. The display layer absorbs the new variant: `flag_note` treats `RetainedRaced` like `NotDeleted` for the success badge, `handle_branch_deletion_result` shows the unmerged hint, and the branch-only output path emits a dedicated "moved during deletion" warning with a `wt remove --force-delete <branch>` recovery hint. CAS protects the TOCTOU window inside `delete_branch_if_safe` (between its own snapshot capture and `update-ref -d`). Hook-driven races are still caught by the existing fresh integration check that runs after pre-remove hooks; CAS narrows the residual unprotected window from \"between check and delete\" to \"never\". Tests: `git update-ref -d <ref> <sha>` is structured — exit 0 on deletion, non-zero on stale-SHA rejection (\"cannot lock ref ... is at X but expected Y\"). The CAS helper re-checks ref presence via `rev-parse --verify --quiet` rather than parsing the error message, keeping with the structured-output policy. Two new unit tests in `git::remove::tests`: - `cas_rejects_delete_when_branch_advances` — captures a snapshot, moves the branch externally, then calls `delete_branch_if_safe` and asserts `RetainedRaced` plus surviving tip at the post-race SHA. Pins the CAS rejection mechanism. - `cas_deletes_when_branch_unchanged` — sanity check the common integrated case still deletes. `test_prune_fallback_config_race_canary`'s wrapper-git script now matches `update-ref -d refs/heads/<branch>` alongside the `branch -d|-D <branch>` shapes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The doc rewrite that came with the CAS commit only mentioned the NotDeleted-race case. Spell out RetainedRaced, NotDeleted (with the planner-predicted-retention suppression), the silent success arms, and the log::warn! Err arm so the file reads consistently after the rebase onto the PR-2900 simplification. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
96f7f14 to
992790f
Compare
worktrunk-bot
left a comment
There was a problem hiding this comment.
The CAS rewrite addresses the previous review's structural concerns. One small follow-up on the new test (inline).
The arm this PR introduces — the only one that warns regardless of planner_expected_retention — was the one missing from the per-arm test. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…8e-cas # Conflicts: # src/commands/step/prune.rs
worktrunk-bot
left a comment
There was a problem hiding this comment.
codecov/patch is failing on the CAS changes. Triaging the report:
False positives (test exists, llvm-cov mis-attributes the middle lines of eprintln!(warning_message(cformat!(...)))): src/output/handlers.rs L324 and L335 in warn_if_branch_retained. The eprintln! openers at L322 and L333 are hit (L322 hits=2, L333 hits=1 in cobertura.xml), so the new warn_if_branch_retained_arms test added in 80e4b5e is doing its job — the gap is reporting noise inside the macro expansion.
Genuine gaps (hits=0 in cobertura):
src/output/handlers.rsL1127/L1129 — theRetainedRacedarm ofhandle_branch_only_output. Distinct fromwarn_if_branch_retained; the new unit test doesn't reach this function. Easiest fix: a sibling unit test that callshandle_branch_only_outputwith aBranchDeletionOutcome::RetainedRacedand asserts it doesn't panic, mirroring the pattern at handlers.rs:2016.src/output/handlers.rsL393 — theNone => remove_commandbranch ofbuild_remove_command_with_tail. TheSome(tail)arm is exercised (L391 hits=1); the no-tail arm is trivial to cover with a unit test on the helper.src/git/remove.rsL437-438 — thesnapshot_shareturnsNonefallback. By the inline comment this is intentionally a corner-case preservation of pre-CAS behavior, reachable only via a synthetic snapshot/branch mismatch. Either construct aRefSnapshotthat hits it, or — if the path is genuinely unreachable — remove it and let the caller fall through to a single deletion code path per the CLAUDE.md coverage guidance ("remove unused code… where falling through to a general handler suffices").src/git/remove.rsL495 — the real-error propagation incas_delete_branch_outcomewhen bothupdate-ref -dand the post-failurerev-parse --verifyfail. Hardest to exercise without a fault-injection hook.
I'm dismissing my prior approval per CLAUDE.md's "never merge a PR with failing codecov/patch without explicit user approval" — the two helper-function gaps look mechanical to close.
codecov/patch failing — coverage gaps detailed in follow-up comment
Three follow-ups to #2870, combined. The diff includes the three commits
also in #2900 — that PR is the small/conservative path; this one (#2903)
is the same fixes plus the larger CAS-based unification of safe-delete
semantics. Merge one or the other, not both. If you want the
incremental path, merge #2900 first and rebase this one — the CAS
commit will then stand alone.
Item 1 — drop the defensive
check_lock(refactor)The
check_lockRwLock<()>instep_prunewas carried over from#2808's Windows
.git/configrace fix. After #2870 restructured theflow, the phase ordering already serializes integration-check readers
against removals: the background
par_iteris the only reader, thefor ... in rxloop exits only once that thread has finished, andremovals run strictly after. Belt-and-suspenders per CLAUDE.md — removed.
Item 2 — warn when background branch deletion silently retains (fix)
Both branch-deletion paths in
execute_instant_removal_or_fallbackswallowed errors at
log::debug!. Promoted the surpriseOk(NotDeleted)(planner predicted deletion, branch survived — ahook moved the tip) to a
warning_messagewith awt remove --force-delete <branch>recovery hint. Suppressed when theplanner already predicted retention so the existing
print_hintsunmerged-branch message doesn't duplicate.
Errfailures now log atwarn level (developer-facing) rather than user-actionable warnings —
the failure modes (
git update-refexec error, refs DB I/O) aren'tactionable beyond re-running.
Item 3 / CAS rewrite — atomic safe-delete (the big one)
delete_branch_if_safenow usesgit update-ref -d refs/heads/<branch> <expected-sha>instead ofgit branch -D. The expected SHA comes fromthe snapshot the integration check already consulted, so the
check→delete sequence becomes one atomic step against a known SHA. If
anything moved the ref in between, git rejects the CAS — the new
BranchDeletionOutcome::RetainedRacedoutcome is returned and surfacedto the user. Force-delete keeps
git branch -D(explicit override).This unifies the divergent safe-delete semantics in
execute_instant_removal_or_fallback:delete_branch_if_safe, so they pick up CAS for free.&& git branch -d <branch>is replaced by a foreground integrationcheck + atomic
&& git update-ref -d <ref> <expected-sha>tail(
build_cas_branch_delete_tail). Squash-merged / patch-id / ancestorbranches the planner accepts are now accepted at delete time too,
matching the fast path.
handle_branch_only_output) switches frombare
git branch -D(effectively a force-delete after a staleintegration check) to
delete_branch_if_safe. CAS protects it too.Race coverage
CAS closes the TOCTOU window inside
delete_branch_if_safe(betweenits own
capture_refsandupdate-ref -d). Hook-driven racescontinue to be caught by the existing fresh integration check that
runs after pre-remove hooks; CAS narrows the residual window from
"between check and delete" to "never".
Tests
Two new unit tests in
git::remove::tests:cas_rejects_delete_when_branch_advances— captures snapshot, movesbranch externally, calls
delete_branch_if_safe, assertsRetainedRacedand that the branch survives at the post-race SHA.Concretely pins the CAS rejection mechanism per the user's
suggestion (rather than just the observable "branch still there"
outcome).
cas_deletes_when_branch_unchanged— sanity check the common case.test_prune_fallback_config_race_canary's wrapper-git script picks upthe new
update-ref -d refs/heads/<branch>shape.cargo run -- hook pre-merge --yespasses locally: 3836 nextest testsTest plan
cas_delete_branch_outcome—re-checking ref presence via
rev-parse --verify --quietexit coderather than parsing the error message (per CLAUDE.md "structured
output over error-message parsing").
build_cas_branch_delete_tailshell construction — usesshell_escape::unix::escapefor both the ref name and the SHA.RetainedRacedwarrants its own message inprint_hints— currently the BranchOnly path emits a dedicatedwarning; the worktree-removal paths fall through
handle_branch_deletion_result's unmerged-hint path.