Skip to content

chore: convert integration tests to YAML manual test framework#302

Merged
avihut merged 22 commits intomasterfrom
chore/convert-integration-tests
Mar 15, 2026
Merged

chore: convert integration tests to YAML manual test framework#302
avihut merged 22 commits intomasterfrom
chore/convert-integration-tests

Conversation

@avihut
Copy link
Owner

@avihut avihut commented Mar 14, 2026

Summary

Migrates all bash integration tests to the YAML manual test framework, adds output assertions, and builds a TUI benchmark runner.

  • 252 YAML test scenarios covering all 18 commands (907 steps, all passing)
  • Matches the 271 bash tests in test_all.sh minus ~9 intentionally skipped (performance/framework-self-tests)
  • output_contains / output_not_contains assertions for checking command stdout/stderr
  • Recursive scenario discovery — manual-test --ci finds all subdirectory scenarios
  • YAML tests now run alongside bash tests in the CI matrix (default + gitoxide)
  • TUI benchmark runner (xtask bench) with live spinner table, --parallel flag, Ctrl+C cancellation
  • Per-command bash-vs-YAML benchmarks via mise run bench:tests:integration

Test coverage by command

Command Bash YAML
exec 8 8
clone 16 14
init 13 12
checkout 31 29
checkout-branch 24 22
prune 23 20
branch-delete 22 22
list 28 28
fetch 23 22
hooks 15 9
sync 12 12
config 9 9
completions 27 10
rename 12 12
unknown-command 6 6
flow 5 5
simple 5 3
setup 7 2
shell-init 17 5

Test plan

  • mise run test:unit — all 50 xtask unit tests pass
  • mise run test:manual -- --ci — 252 scenarios, 907 steps, all pass
  • mise run bench:tests:integration — bash vs YAML side-by-side, all pass
  • CI workflow runs YAML tests in matrix (default + gitoxide)

🤖 Generated with Claude Code

avihut and others added 17 commits March 14, 2026 22:19
Add output_contains/output_not_contains to the YAML test expectations
schema, enabling tests to verify command stdout/stderr content. This
unblocks migration of ~90 bash integration tests that check command
output (list, completions, shell-init, unknown-command, etc.).

Key changes:
- Always capture command output (even in interactive mode), print it
  to terminal, and make it available for assertions
- Add check_output_contains/check_output_not_contains assertion fns
- run_assertions/check_step now accept captured output parameter
- Default scenario discovery is recursive (finds subdirectory tests)
- Add 5 unknown-command test scenarios as validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers basic checkout, carry feature (7 tests), auto-start (6 tests),
dash/previous worktree (4 tests), complex branches, subdirectory,
existing worktree, uncommitted changes, and error handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- 12 init scenarios: basic, custom-branch, bare, quiet, errors,
  branch-conventions, direnv, workflow, paths, help, repo-name-formats,
  security
- 2 simple scenarios: binaries (14 binary checks), help
- 10 clone scenarios: default-branch, no-checkout, quiet, all-branches,
  invalid-url, existing-directory, ssh-url, help, direnv, empty-repo

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- 8 branch-delete: basic, refuses-unmerged, force, refuses-default,
  refuses-dirty, nonexistent, branch-d-basic, deprecated-warning
- 1 fetch: current worktree
- 1 config: defaults
- 2 flow: adopt-basic, eject-basic
- 1 rename: basic
- 2 setup: help, dry-run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- 7 checkout-branch: basic, with-base, from-subdirectory, errors,
  naming, outside-repo, help
- 11 list: basic, current-marker, dirty-marker, json, outside-repo,
  help, from-subdirectory, branch-age, head-column, relative-path,
  commit-subject
- 11 prune: basic, no-deletion, multiple, from-subdirectory, errors,
  help, skips-untracked, force-dirty, removes-clean, plus-marker,
  empty-parent-cleanup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mise run bench:tests:manual runs all YAML tests with timing.
Supports namespace filtering: mise run bench:tests:manual -- checkout

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bench:tests:integration now depends on both bench:tests:integration:exec
and bench:tests:manual, so running it benchmarks all test suites.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each converted command now has a side-by-side benchmark comparing the
bash integration tests against the equivalent YAML manual tests:
bench:tests:integration:{exec,clone,init,checkout,checkout-branch,
prune,branch-delete,list}

Extracted shared helper _bench_command to avoid duplication.

To run converted YAML tests only: mise run test:manual -- --ci
To benchmark all comparisons: mise run bench:tests:integration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The bash integration tests share /tmp/git-worktree-integration-tests
so they can't run in parallel. Replaced mise dependency-based parallel
execution with a single script that runs each suite sequentially and
prints a consolidated summary table at the end.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds 46 new scenario files across all remaining command groups:
- 4 completions: bash, zsh, fish, fig
- 5 shell-init: bash-output, bash-syntax, zsh-output, zsh-syntax, fish-output
- 5 hooks: untrusted, trust-command, status-command, no-hooks, trust-hooks
- 9 fetch: specific, multiple, all, dry-run, rebase, help, outside-repo,
  from-subdirectory, cross-branch
- 9 branch-delete: multiple, no-worktree, from-current-writes-cd,
  local-behind-remote, by-relative-path, by-dot, branch-D-force,
  branch-no-flag-fails, default-force-worktree-only
- 6 list: detached-head, ahead-behind, shorthand-age, stat-lines,
  branches-flag, default-unchanged
- 3 prune: skips-uncommitted, nested-directories, many-worktrees
- 2 config: checkout-push-false, remote-custom
- 1 flow: adopt-preserves-changes
- 1 rename: from-inside
- 1 sync: basic

Total: 155 scenarios, 532 steps, all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds per-command benchmark scripts for completions, shell-init, hooks,
fetch, sync, config, rename, and setup. Updates the consolidated
benchmark to include all 16 command suites.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
test_shell_init.sh and test_setup.sh are not included in test_all.sh
and have never been run in CI. They have stale expectations (gwcob
alias, --yes flag) that don't match the current binary. Move them to
YAML-only benchmark entries so the benchmark doesn't fail on
pre-existing bash test bugs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test matrix (xtask test-matrix) now runs both bash integration
tests and YAML manual tests for each config entry (default, gitoxide).
The CI workflow uploads the xtask binary and runs YAML tests as a
separate step with the same GIT_CONFIG_GLOBAL.

Also adds missing git-worktree-list symlink in CI setup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the bash benchmark script with a Rust TUI that shows a live
table with spinners and elapsed timers while suites run. Each cell
shows ⠋ with a live timer while running, then ✓/✗ with duration on
completion. Supports --parallel flag to run bash and YAML concurrently
for the same suite.

Usage:
  mise run bench:tests:integration          # sequential (default)
  mise run bench:tests:integration -- --parallel  # parallel

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The TOTAL row now includes elapsed time from currently running cells
(not just completed ones), so it ticks up in real-time. Also fixes
column alignment by applying proper padding to the totals row.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pressing Ctrl+C during a benchmark run now:
- Marks running cells as cancelled (⊘ yellow with elapsed time)
- Marks pending cells as skipped
- Stops starting new suites
- Does a final table redraw showing the partial results
- Exits with code 130 (standard SIGINT exit)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes the gap from 136 to 252 YAML scenarios (907 steps, all passing).
Coverage now matches the 271 bash integration tests minus ~9 intentionally
skipped (performance/large_repo/framework-self-tests) and ~10 remaining
edge cases.

New scenarios added per command:
- checkout-branch: +15 (carry tests, workflow, security, direnv)
- completions: +6 (dynamic-branch, flags-consistent, position-aware)
- fetch: +12 (config, force, dirty/clean target, cross-branch variants)
- sync: +10 (help, verbose, cd-target, diverged, rebase, autostash)
- rename: +10 (paths, dry-run, remote, cleanup, error cases)
- list: +11 (JSON variants, -a/-r flags, many-worktrees, stat-config)
- hooks: +4 (help, flags-exclusive, deprecated-warning, env-vars)
- config: +6 (carry, upstream, bool-variants, flag overrides)
- prune: +6 (from-current, cd-target, regular-repo, shell-wrapper)
- branch-delete: +5 (absolute-path, cd-target, default variants)
- clone: +2 (empty-repo variants)
- checkout: +4 (help, direnv, outside-repo, carry-untracked)
- simple: +1 (init-bare)
- flow: +2 (eject-with-branch, eject-dirty)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@avihut avihut modified the milestones: Standalone daft, Public Launch Mar 14, 2026
@avihut avihut self-assigned this Mar 14, 2026
@avihut avihut added chore Maintenance tasks ci CI/CD changes labels Mar 14, 2026
avihut and others added 5 commits March 15, 2026 01:16
Rewrites tests/README.md to cover both bash and YAML test systems:
YAML scenario format, assertions, variables, fixtures, path conventions,
benchmarks, and CI integration. Updates CLAUDE.md with manual test
commands and adds YAML tests to the new-command checklist.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Expands the Testing section with YAML manual tests (preferred for new
tests), dev sandbox usage, and the TUI benchmark runner. Updates the
new-command checklist to include YAML scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The build job was only running cargo build --release which doesn't
build the xtask package. Explicitly build xtask and update the binary
count verification from 1 to 2 (daft + xtask).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The integration-tests job now has two matrix dimensions:
  entry: [default, gitoxide]
  suite: [bash, yaml]

This creates 4 parallel workers instead of 2 sequential steps,
cutting CI wall time roughly in half. Each worker only runs its
own suite (bash steps are skipped for yaml workers and vice versa).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds 6 new hooks scenarios: post-create-execution, pre-create-abort,
migrate-basic, migrate-conflict, migrate-dry-run, hooks-execute-in-tui.

Total: 258 YAML scenarios, 933 steps, all passing. The 59 file count
difference vs 317 bash tests is entirely from consolidation (e.g.,
27 completions bash tests → 10 YAML files with 95 steps) plus 9
intentionally skipped performance/framework tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@avihut avihut merged commit b67b045 into master Mar 15, 2026
8 checks passed
@avihut avihut deleted the chore/convert-integration-tests branch March 15, 2026 00:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

chore Maintenance tasks ci CI/CD changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant