Add wt step {commit,squash} --dry-run, hide --show-prompt#2557
Add wt step {commit,squash} --dry-run, hide --show-prompt#2557
Conversation
Changes `wt step commit` and `wt step squash` to use `--dry-run` instead of `--show-prompt`. The new flag renders the prompt, invokes the LLM, prints the generated command and message in three labeled sections (PROMPT, COMMAND, MESSAGE), then exits without executing the commit or squash. For `--dry-run` on commit, staged changes are mirrored to a temporary index (respecting the `--stage` flag) so the previewed prompt matches what a real run would send the LLM, without modifying the user's actual index. The old `--show-prompt` flag is hidden but retained for backward compatibility, outputting only the rendered prompt against the current index without calling the LLM.
Routes the three PROMPT/COMMAND/MESSAGE sections through `show_help_in_pager` so the prompt (50+ lines on its own) doesn't scroll off the terminal. The helper TTY-detects internally, so `wt step commit --dry-run | grep ...` keeps working — snapshot tests run without a TTY and stay unchanged. Adds a "When to page output" section to `writing-user-outputs` documenting which commands page (long human-oriented stdout: `--help`, `wt config show`, `wt hook show`, `--dry-run`) versus which don't (pipe-first data, short output, or output already paged by a delegated tool). Co-Authored-By: Claude <noreply@anthropic.com>
worktrunk-bot
left a comment
There was a problem hiding this comment.
One stale-docstring nit on build_commit_prompt — the "Used by both …" line predates --dry-run. Inline suggestion below.
Minor observation, not blocking: the new direct Cmd::new("git").run()? calls in llm.rs (generate_commit_message fallback path and build_commit_prompt) drop the exit-status check that repo.run_command() previously performed. If git diff --staged ever fails (corrupt index, etc.), we now silently feed an empty <diff> to the LLM instead of bailing. Matches the existing pattern in commands/list/collect/tasks.rs, so consistent with the codebase — flagging in case it matters.
worktrunk-bot
left a comment
There was a problem hiding this comment.
CI flagged two PR-related issues — dismissing my approval until they're resolved.
test (windows) — snapshot non-portability
render_llm_invocation (new in src/llm.rs) interpolates ShellConfig::get()?.executable.to_string_lossy() directly into the COMMAND section. On Linux the snapshot got sh -c '...'; on Windows it gets C:\Program Files\Git\bin\bash.exe -c '...', so both new snapshots (step_commit_dry_run, step_squash_dry_run) fail. Options: filter the shell path in the snapshot, stub ShellConfig in the test, or gate the snapshot with #[cfg(unix)].
codecov/patch — 93.25% vs 96.76% target
11 lines uncovered. The bulk are defensive arms with no exercising test:
src/commands/step_commands.rs:152—StageMode::Trackedarm instage_to_temp_index(no--dry-run --stage=trackedtest)src/commands/step_commands.rs:154—StageMode::None => return Ok(temp)defensive arm; the caller inpreview_commitalready filtersNoneout, so this branch is unreachable. Per CLAUDE.md ("Don't add error handling … for scenarios that can't happen"), this could be replaced withunreachable!()or the match could be moved up to the caller (handleAll/Trackedonly).src/commands/step_commands.rs:194-199— theNone => writeln!(…)"(LLM not configured — using built-in fallback)" arm inprint_dry_run. No test runs--dry-runwithout a configured commit-generation command.src/commands/step_commands.rs:207-208—show_help_in_pagerErrfallback. Rarely fires; likely a non-issue, but counted in patch coverage.src/commands/step_commands.rs:542— squashbuild_squash_promptclosing line, likely a brace artifact.src/llm.rs:525—name_only.env("GIT_INDEX_FILE", path)in the no-LLM fallback path ofgenerate_commit_message; only fires when--dry-runis used with no commit-generation command configured.
A test for --dry-run with no LLM configured would hit lines 194-199 (and 525). A --dry-run --stage=tracked test would hit 152. Removing the StageMode::None arm from stage_to_temp_index (since the caller filters it) would drop 154 from the patch surface.
Happy to push these fixes if you'd like — let me know.
CI flagged related issues — Windows snapshot non-portability and codecov below patch target. Details in follow-up review.
`render_llm_invocation` now uses the shell's basename for the COMMAND section instead of the full path, so the displayed invocation stays short and doesn't leak install paths (`C:\Program Files\Git\bin\bash.exe` → `bash.exe`). Adds an insta filter normalizing `bash.exe` → `sh` so dry-run snapshots match across Linux and Windows.
Adds `run_git_capture` in `llm.rs` to bail on non-zero exit from the new direct `Cmd::new("git")` call sites — previously a failing `git diff --staged` would silently feed an empty diff to the LLM. `stage_to_temp_index` gets the same exit-status check inline. Reviewer flagged this on #2557.
Adds two unit tests for `render_llm_invocation` covering the basename-only display and shell-quote escaping (codecov patch coverage gap on the new function). Updates the docstring on `build_commit_prompt` to mention `--dry-run` alongside the existing modes.
Co-Authored-By: Claude <noreply@anthropic.com>
worktrunk-bot
left a comment
There was a problem hiding this comment.
CI on the new commit shows the Windows snapshot fix didn't take — test (windows) still fails on test_step_commit_dry_run and test_step_squash_dry_run with the same bash.exe vs sh mismatch as before. Dismissing my approval until this is resolved.
Why the filter doesn't fire
The snapshot output wraps the shell name in ANSI escape codes — the captured line looks like …\x1b[34mbash.exe\x1b[0m…. The leading \b in \bbash\.exe\b requires a word/non-word transition at the start of bash, but the character immediately before b is m (from the ANSI prefix) — both are \w, so no boundary exists and the regex never matches. The trailing \b would match (e → \x1b), but it's the leading \b that vetoes the substitution.
Easiest fix: drop the \bs entirely. bash.exe is unlikely to appear anywhere else in test snapshots, and the same pattern works on both platforms (no-op on Linux).
Inline suggestion below.
Codecov
codecov/patch is also failing — src/commands/step_commands.rs is at 87.75% (12 misses on 98 patch lines), pulling the patch below the 96.76% target. The defensive arms I flagged in the previous review (StageMode::Tracked, the (LLM not configured) arm in print_dry_run, show_help_in_pager Err fallback) are still uncovered. render_llm_invocation is also reported uncovered by codecov despite the new unit tests — worth a sanity check that those tests actually run under cargo llvm-cov.
CI failed — Windows snapshot still mismatches; the \b regex anchors don't fire because of surrounding ANSI escape codes.
The `bash.exe` insta filter from the previous fix-up didn't fire on Windows because the syntax-highlighted output puts an ANSI sequence (`...m`) immediately before `bash`, and `m` is a word character — so `\bbash\.exe\b` saw no word boundary. Drop the boundaries; `bash.exe` is unique enough to match safely without them. Refactors `stage_to_temp_index` to take the `git add` argv directly and pushes the `StageMode::None` filter into the caller. Removes a dead `StageMode::None => return Ok(temp)` arm that codecov was flagging as uncovered. Adds two coverage tests: `--dry-run --stage=tracked` (mirrors `git add -u` against the temp index — modified tracked files appear, untracked files don't) and `--dry-run` without `[commit.generation]` configured (fallback path prints "LLM not configured" for COMMAND and a deterministic file-list message for MESSAGE). Co-Authored-By: Claude <noreply@anthropic.com>
worktrunk-bot
left a comment
There was a problem hiding this comment.
CI passed on Linux/macOS/Windows but codecov/patch is at 95.28% vs the 96.76% target — 9 lines uncovered. Dismissing my approval per CLAUDE.md (codecov gates merge despite being marked non-required).
The remaining misses split into three categories. None of the prior-review-flagged regressions reappear; this is the residual surface after the new --stage=tracked and no-LLM-configured tests landed.
Defensive git failure paths (4 lines)
src/commands/step_commands.rs 159–160 (the if !output.status.success() branch in stage_to_temp_index) and src/llm.rs 581–582 (the equivalent in run_git_capture) bail when git add/git diff --staged exits non-zero against the temp index. These are intentional — without them a failing git diff would silently feed an empty diff to the LLM, which is the regression you (correctly) flagged when adding run_git_capture. They're hard to exercise without injecting a fake git. Two options that don't churn the production code: leave as-is and accept the 4-line gap, or factor the success-check into a tiny shared helper with a unit test that calls it with a synthetic non-zero Output.
Pager Err fallback (2 lines)
src/commands/step_commands.rs 207–208 — if let Err(e) = show_help_in_pager(...) { log::debug!(...); println!(...) }. I flagged this in the previous review as "rarely fires; likely a non-issue, but counted in patch coverage." Same conclusion now. If you want to drop it, ?-propagating the pager error is fine — pager failures already log inside show_help_in_pager, and an outright failure to display the dry-run output is loud-failure-better-than-silent.
Macro continuation lines (3 lines)
src/commands/step_commands.rs 87 ()?; closing the multi-line approve_or_skip call), 194 and 199 ()?, closing the two writeln! arms in print_dry_run's match). llvm-cov sometimes reports a multi-line call's closing ? as a separate region with 0 hits even though every other line in the expression is covered. Reformatting the calls to fit on one line would mask the report but not change coverage in any meaningful sense. I'd skip these.
My take
The 4 git-failure-path lines are the only ones I'd consider real coverage gaps, and even those are testing whether bail! fires on Output { status: non-zero } — fine to leave defensive without a test. Happy to push any of: a shared git_check_status helper plus a unit test (covers 4 lines), or the pager Err arm removal (covers 2 lines), or both.
Otherwise this is a coverage threshold judgment call rather than a missing-test issue.
codecov/patch failed (95.28% vs 96.76% target). See follow-up review for analysis.
Merge main into the branch, resolving conflicts in `step_commands.rs` and `main.rs` from the new `step commit --dry-run` and `step squash --dry-run` features (#2557 area). The JSON `--show-prompt` guard now also rejects `--dry-run + --format=json` for the same reason — both emit raw text. Main also changed `LoadedConfigs` to hold borrowed references (PR #2556), which is what caused the prior CI compile failure that was invisible locally. Drop the now-unnecessary `.as_ref()` calls in `handle_hook_show` (`configs.project` is already `Option<&ProjectConfig>`). Address tend-bot review feedback on the JSON additions: - **Display abbreviation**: `commit_staged_changes` and `handle_squash` now call `git rev-parse --short HEAD` for the success-line hash (honors `core.abbrev` and auto-extends ambiguous prefixes), keeping the full 40-char SHA only for the JSON payload. Previously a fixed 7-char slice silently lost both behaviors. - **`--format` flag descriptions**: each new `--format` flag now includes a one-liner naming the JSON shape, matching the pattern set by `wt switch`/`wt remove`/`wt merge`. - **Missing test**: add `test_step_push_no_ff_json` exercising the `merge_commit` outcome and the new `merge_sha` field. Co-Authored-By: Claude <noreply@anthropic.com>
## Summary Release v0.48.0. See [CHANGELOG.md](https://github.com/max-sixty/worktrunk/blob/release/CHANGELOG.md) for the full notes. Highlights: - `--format=json` extends to seven step + hook commands (#2560) - `wt step commit` / `wt step squash` gain `--dry-run` (#2557) - New `dirname` / `basename` template filters (#2592, #2605) - New `[remove] delete-branch` config option (#2589) - `wt-perf timeline` subcommand (#2558) - Faster `wt list` on dirty worktrees (#2602) and faster alias dispatch (#2556, #2573) - Short-SHA display honors `core.abbrev` (#2576) - Cleaner `wt config show` shell-integration section for new users ## Test plan - [x] `cargo run -- hook pre-merge --yes` (3497 tests, lints clean) - [x] `cargo semver-checks check-release -p worktrunk` consulted; minor bump confirmed - [ ] CI green
Adds
--dry-runtowt step commitandwt step squash. It renders the prompt, prints the shell invocation that would call the LLM, calls the LLM and prints the generated message in three labeled sections (PROMPT, COMMAND, MESSAGE), then exits without staging, running hooks, or committing. For commit,--stageis honored against a temp index — the previewed prompt matches what a real run would send the LLM, but the user's real index is never touched. Output is routed throughshow_help_in_pagerso the prompt section (50+ lines) doesn't scroll off; piping (| grep,| jq) still works because the helper TTY-detects.--show-promptis hidden via clap (hide = true) but kept working — it's still the right shape for piping the cheap rendered prompt to another LLM (wt step commit --show-prompt | llm -m gpt-5-nano), andGitError::LlmCommandFailedstill suggests it as a reproduction command.A new "When to page output" section in the
writing-user-outputsskill documents the policy: page long human-oriented stdout (--help,wt config show,wt hook show,--dry-run); don't page pipe-first data or output already paged by a delegated tool (git diff).Follow-ups during review:
render_llm_invocationnow uses the shell's basename (no install-path leakage), with an insta filter normalizingbash.exe→shfor cross-platform snapshot stability. Added arun_git_capturehelper around the new directCmd::new("git")calls that bails on non-zero exit (was a non-blocking reviewer observation — without it, a failinggit diff --stagedwould silently feed an empty diff to the LLM).stage_to_temp_indexwas refactored to take argv directly, eliminating a deadStageMode::Nonearm. Added unit tests forrender_llm_invocationand integration tests for--dry-run --stage=trackedand--dry-runwithout LLM configured.codecov/patchreports 95.29% with 9 misses. The remaining gap is split between error-path fault-injection (stage_to_temp_indexgit-add failure,show_help_in_pagerspawn failure) and a codecov attribution mismatch — localcargo llvm-covshowsrender_llm_invocationas covered by the new unit tests, but codecov doesn't credit them. Merged with explicit override.Test plan:
cargo test --test integration -- merge::test_step_commit_dry_run merge::test_step_squash_dry_run merge::test_step_commit_show_prompt merge::test_step_squash_show_promptcargo run -- hook pre-merge --yes(3438 tests pass, clippy clean)/tmp/wt-dry-test: untracked file appears in dry-run prompt, real index stays untouched--dry-runand--show-promptare mutually exclusive (clapconflicts_with)