Skip to content

Add wt step {commit,squash} --dry-run, hide --show-prompt#2557

Merged
max-sixty merged 4 commits intomainfrom
commit-dry
May 3, 2026
Merged

Add wt step {commit,squash} --dry-run, hide --show-prompt#2557
max-sixty merged 4 commits intomainfrom
commit-dry

Conversation

@max-sixty
Copy link
Copy Markdown
Owner

@max-sixty max-sixty commented May 3, 2026

Adds --dry-run to wt step commit and wt step squash. It renders the prompt, prints the shell invocation that would call the LLM, calls the LLM and prints the generated message in three labeled sections (PROMPT, COMMAND, MESSAGE), then exits without staging, running hooks, or committing. For commit, --stage is honored against a temp index — the previewed prompt matches what a real run would send the LLM, but the user's real index is never touched. Output is routed through show_help_in_pager so the prompt section (50+ lines) doesn't scroll off; piping (| grep, | jq) still works because the helper TTY-detects.

--show-prompt is hidden via clap (hide = true) but kept working — it's still the right shape for piping the cheap rendered prompt to another LLM (wt step commit --show-prompt | llm -m gpt-5-nano), and GitError::LlmCommandFailed still suggests it as a reproduction command.

A new "When to page output" section in the writing-user-outputs skill documents the policy: page long human-oriented stdout (--help, wt config show, wt hook show, --dry-run); don't page pipe-first data or output already paged by a delegated tool (git diff).

Follow-ups during review: render_llm_invocation now uses the shell's basename (no install-path leakage), with an insta filter normalizing bash.exesh for cross-platform snapshot stability. Added a run_git_capture helper around the new direct Cmd::new("git") calls that bails on non-zero exit (was a non-blocking reviewer observation — without it, a failing git diff --staged would silently feed an empty diff to the LLM). stage_to_temp_index was refactored to take argv directly, eliminating a dead StageMode::None arm. Added unit tests for render_llm_invocation and integration tests for --dry-run --stage=tracked and --dry-run without LLM configured.

codecov/patch reports 95.29% with 9 misses. The remaining gap is split between error-path fault-injection (stage_to_temp_index git-add failure, show_help_in_pager spawn failure) and a codecov attribution mismatch — local cargo llvm-cov shows render_llm_invocation as covered by the new unit tests, but codecov doesn't credit them. Merged with explicit override.

Test plan:

  • cargo test --test integration -- merge::test_step_commit_dry_run merge::test_step_squash_dry_run merge::test_step_commit_show_prompt merge::test_step_squash_show_prompt
  • cargo run -- hook pre-merge --yes (3438 tests pass, clippy clean)
  • End-to-end smoke test in /tmp/wt-dry-test: untracked file appears in dry-run prompt, real index stays untouched
  • --dry-run and --show-prompt are mutually exclusive (clap conflicts_with)
  • CI green on Linux/macOS/Windows after the cross-platform shell filter fix

This was written by Claude Code on behalf of @max-sixty

max-sixty and others added 2 commits May 3, 2026 12:15
Changes `wt step commit` and `wt step squash` to use `--dry-run` instead of `--show-prompt`. The new flag renders the prompt, invokes the LLM, prints the generated command and message in three labeled sections (PROMPT, COMMAND, MESSAGE), then exits without executing the commit or squash.

For `--dry-run` on commit, staged changes are mirrored to a temporary index (respecting the `--stage` flag) so the previewed prompt matches what a real run would send the LLM, without modifying the user's actual index.

The old `--show-prompt` flag is hidden but retained for backward compatibility, outputting only the rendered prompt against the current index without calling the LLM.
Routes the three PROMPT/COMMAND/MESSAGE sections through `show_help_in_pager` so the prompt (50+ lines on its own) doesn't scroll off the terminal. The helper TTY-detects internally, so `wt step commit --dry-run | grep ...` keeps working — snapshot tests run without a TTY and stay unchanged.

Adds a "When to page output" section to `writing-user-outputs` documenting which commands page (long human-oriented stdout: `--help`, `wt config show`, `wt hook show`, `--dry-run`) versus which don't (pipe-first data, short output, or output already paged by a delegated tool).

Co-Authored-By: Claude <noreply@anthropic.com>
worktrunk-bot
worktrunk-bot previously approved these changes May 3, 2026
Copy link
Copy Markdown
Collaborator

@worktrunk-bot worktrunk-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One stale-docstring nit on build_commit_prompt — the "Used by both …" line predates --dry-run. Inline suggestion below.

Minor observation, not blocking: the new direct Cmd::new("git").run()? calls in llm.rs (generate_commit_message fallback path and build_commit_prompt) drop the exit-status check that repo.run_command() previously performed. If git diff --staged ever fails (corrupt index, etc.), we now silently feed an empty <diff> to the LLM instead of bailing. Matches the existing pattern in commands/list/collect/tasks.rs, so consistent with the codebase — flagging in case it matters.

Comment thread src/llm.rs Outdated
Copy link
Copy Markdown
Collaborator

@worktrunk-bot worktrunk-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI flagged two PR-related issues — dismissing my approval until they're resolved.

test (windows) — snapshot non-portability

render_llm_invocation (new in src/llm.rs) interpolates ShellConfig::get()?.executable.to_string_lossy() directly into the COMMAND section. On Linux the snapshot got sh -c '...'; on Windows it gets C:\Program Files\Git\bin\bash.exe -c '...', so both new snapshots (step_commit_dry_run, step_squash_dry_run) fail. Options: filter the shell path in the snapshot, stub ShellConfig in the test, or gate the snapshot with #[cfg(unix)].

codecov/patch — 93.25% vs 96.76% target

11 lines uncovered. The bulk are defensive arms with no exercising test:

  • src/commands/step_commands.rs:152StageMode::Tracked arm in stage_to_temp_index (no --dry-run --stage=tracked test)
  • src/commands/step_commands.rs:154StageMode::None => return Ok(temp) defensive arm; the caller in preview_commit already filters None out, so this branch is unreachable. Per CLAUDE.md ("Don't add error handling … for scenarios that can't happen"), this could be replaced with unreachable!() or the match could be moved up to the caller (handle All/Tracked only).
  • src/commands/step_commands.rs:194-199 — the None => writeln!(…) "(LLM not configured — using built-in fallback)" arm in print_dry_run. No test runs --dry-run without a configured commit-generation command.
  • src/commands/step_commands.rs:207-208show_help_in_pager Err fallback. Rarely fires; likely a non-issue, but counted in patch coverage.
  • src/commands/step_commands.rs:542 — squash build_squash_prompt closing line, likely a brace artifact.
  • src/llm.rs:525name_only.env("GIT_INDEX_FILE", path) in the no-LLM fallback path of generate_commit_message; only fires when --dry-run is used with no commit-generation command configured.

A test for --dry-run with no LLM configured would hit lines 194-199 (and 525). A --dry-run --stage=tracked test would hit 152. Removing the StageMode::None arm from stage_to_temp_index (since the caller filters it) would drop 154 from the patch surface.

Happy to push these fixes if you'd like — let me know.

@worktrunk-bot worktrunk-bot dismissed their stale review May 3, 2026 19:39

CI flagged related issues — Windows snapshot non-portability and codecov below patch target. Details in follow-up review.

`render_llm_invocation` now uses the shell's basename for the COMMAND section instead of the full path, so the displayed invocation stays short and doesn't leak install paths (`C:\Program Files\Git\bin\bash.exe` → `bash.exe`). Adds an insta filter normalizing `bash.exe` → `sh` so dry-run snapshots match across Linux and Windows.

Adds `run_git_capture` in `llm.rs` to bail on non-zero exit from the new direct `Cmd::new("git")` call sites — previously a failing `git diff --staged` would silently feed an empty diff to the LLM. `stage_to_temp_index` gets the same exit-status check inline. Reviewer flagged this on #2557.

Adds two unit tests for `render_llm_invocation` covering the basename-only display and shell-quote escaping (codecov patch coverage gap on the new function). Updates the docstring on `build_commit_prompt` to mention `--dry-run` alongside the existing modes.

Co-Authored-By: Claude <noreply@anthropic.com>
worktrunk-bot
worktrunk-bot previously approved these changes May 3, 2026
Copy link
Copy Markdown
Collaborator

@worktrunk-bot worktrunk-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI on the new commit shows the Windows snapshot fix didn't take — test (windows) still fails on test_step_commit_dry_run and test_step_squash_dry_run with the same bash.exe vs sh mismatch as before. Dismissing my approval until this is resolved.

Why the filter doesn't fire

The snapshot output wraps the shell name in ANSI escape codes — the captured line looks like …\x1b[34mbash.exe\x1b[0m…. The leading \b in \bbash\.exe\b requires a word/non-word transition at the start of bash, but the character immediately before b is m (from the ANSI prefix) — both are \w, so no boundary exists and the regex never matches. The trailing \b would match (e\x1b), but it's the leading \b that vetoes the substitution.

Easiest fix: drop the \bs entirely. bash.exe is unlikely to appear anywhere else in test snapshots, and the same pattern works on both platforms (no-op on Linux).

Inline suggestion below.

Codecov

codecov/patch is also failing — src/commands/step_commands.rs is at 87.75% (12 misses on 98 patch lines), pulling the patch below the 96.76% target. The defensive arms I flagged in the previous review (StageMode::Tracked, the (LLM not configured) arm in print_dry_run, show_help_in_pager Err fallback) are still uncovered. render_llm_invocation is also reported uncovered by codecov despite the new unit tests — worth a sanity check that those tests actually run under cargo llvm-cov.

Comment thread tests/common/mod.rs Outdated
@worktrunk-bot worktrunk-bot dismissed their stale review May 3, 2026 20:13

CI failed — Windows snapshot still mismatches; the \b regex anchors don't fire because of surrounding ANSI escape codes.

The `bash.exe` insta filter from the previous fix-up didn't fire on Windows because the syntax-highlighted output puts an ANSI sequence (`...m`) immediately before `bash`, and `m` is a word character — so `\bbash\.exe\b` saw no word boundary. Drop the boundaries; `bash.exe` is unique enough to match safely without them.

Refactors `stage_to_temp_index` to take the `git add` argv directly and pushes the `StageMode::None` filter into the caller. Removes a dead `StageMode::None => return Ok(temp)` arm that codecov was flagging as uncovered.

Adds two coverage tests: `--dry-run --stage=tracked` (mirrors `git add -u` against the temp index — modified tracked files appear, untracked files don't) and `--dry-run` without `[commit.generation]` configured (fallback path prints "LLM not configured" for COMMAND and a deterministic file-list message for MESSAGE).

Co-Authored-By: Claude <noreply@anthropic.com>
worktrunk-bot
worktrunk-bot previously approved these changes May 3, 2026
Copy link
Copy Markdown
Collaborator

@worktrunk-bot worktrunk-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI passed on Linux/macOS/Windows but codecov/patch is at 95.28% vs the 96.76% target — 9 lines uncovered. Dismissing my approval per CLAUDE.md (codecov gates merge despite being marked non-required).

The remaining misses split into three categories. None of the prior-review-flagged regressions reappear; this is the residual surface after the new --stage=tracked and no-LLM-configured tests landed.

Defensive git failure paths (4 lines)

src/commands/step_commands.rs 159–160 (the if !output.status.success() branch in stage_to_temp_index) and src/llm.rs 581–582 (the equivalent in run_git_capture) bail when git add/git diff --staged exits non-zero against the temp index. These are intentional — without them a failing git diff would silently feed an empty diff to the LLM, which is the regression you (correctly) flagged when adding run_git_capture. They're hard to exercise without injecting a fake git. Two options that don't churn the production code: leave as-is and accept the 4-line gap, or factor the success-check into a tiny shared helper with a unit test that calls it with a synthetic non-zero Output.

Pager Err fallback (2 lines)

src/commands/step_commands.rs 207–208 — if let Err(e) = show_help_in_pager(...) { log::debug!(...); println!(...) }. I flagged this in the previous review as "rarely fires; likely a non-issue, but counted in patch coverage." Same conclusion now. If you want to drop it, ?-propagating the pager error is fine — pager failures already log inside show_help_in_pager, and an outright failure to display the dry-run output is loud-failure-better-than-silent.

Macro continuation lines (3 lines)

src/commands/step_commands.rs 87 ()?; closing the multi-line approve_or_skip call), 194 and 199 ()?, closing the two writeln! arms in print_dry_run's match). llvm-cov sometimes reports a multi-line call's closing ? as a separate region with 0 hits even though every other line in the expression is covered. Reformatting the calls to fit on one line would mask the report but not change coverage in any meaningful sense. I'd skip these.

My take

The 4 git-failure-path lines are the only ones I'd consider real coverage gaps, and even those are testing whether bail! fires on Output { status: non-zero } — fine to leave defensive without a test. Happy to push any of: a shared git_check_status helper plus a unit test (covers 4 lines), or the pager Err arm removal (covers 2 lines), or both.

Otherwise this is a coverage threshold judgment call rather than a missing-test issue.

@worktrunk-bot worktrunk-bot dismissed their stale review May 3, 2026 20:30

codecov/patch failed (95.28% vs 96.76% target). See follow-up review for analysis.

@max-sixty max-sixty merged commit e5d0c13 into main May 3, 2026
29 of 30 checks passed
@max-sixty max-sixty deleted the commit-dry branch May 3, 2026 20:51
max-sixty added a commit that referenced this pull request May 3, 2026
Merge main into the branch, resolving conflicts in `step_commands.rs` and
`main.rs` from the new `step commit --dry-run` and `step squash --dry-run`
features (#2557 area). The JSON `--show-prompt` guard now also rejects
`--dry-run + --format=json` for the same reason — both emit raw text.

Main also changed `LoadedConfigs` to hold borrowed references (PR #2556),
which is what caused the prior CI compile failure that was invisible
locally. Drop the now-unnecessary `.as_ref()` calls in
`handle_hook_show` (`configs.project` is already `Option<&ProjectConfig>`).

Address tend-bot review feedback on the JSON additions:

- **Display abbreviation**: `commit_staged_changes` and `handle_squash` now
  call `git rev-parse --short HEAD` for the success-line hash (honors
  `core.abbrev` and auto-extends ambiguous prefixes), keeping the full
  40-char SHA only for the JSON payload. Previously a fixed 7-char slice
  silently lost both behaviors.
- **`--format` flag descriptions**: each new `--format` flag now includes
  a one-liner naming the JSON shape, matching the pattern set by
  `wt switch`/`wt remove`/`wt merge`.
- **Missing test**: add `test_step_push_no_ff_json` exercising the
  `merge_commit` outcome and the new `merge_sha` field.

Co-Authored-By: Claude <noreply@anthropic.com>
@max-sixty max-sixty mentioned this pull request May 5, 2026
3 tasks
max-sixty added a commit that referenced this pull request May 5, 2026
## Summary
Release v0.48.0. See
[CHANGELOG.md](https://github.com/max-sixty/worktrunk/blob/release/CHANGELOG.md)
for the full notes.

Highlights:
- `--format=json` extends to seven step + hook commands (#2560)
- `wt step commit` / `wt step squash` gain `--dry-run` (#2557)
- New `dirname` / `basename` template filters (#2592, #2605)
- New `[remove] delete-branch` config option (#2589)
- `wt-perf timeline` subcommand (#2558)
- Faster `wt list` on dirty worktrees (#2602) and faster alias dispatch
(#2556, #2573)
- Short-SHA display honors `core.abbrev` (#2576)
- Cleaner `wt config show` shell-integration section for new users

## Test plan
- [x] `cargo run -- hook pre-merge --yes` (3497 tests, lints clean)
- [x] `cargo semver-checks check-release -p worktrunk` consulted; minor
bump confirmed
- [ ] CI green
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants