chore: convert integration tests to YAML manual test framework by avihut · Pull Request #302 · avihut/daft

avihut · 2026-03-14T23:14:35Z

Summary

Migrates all bash integration tests to the YAML manual test framework, adds output assertions, and builds a TUI benchmark runner.

- 252 YAML test scenarios covering all 18 commands (907 steps, all passing)
- Matches the 271 bash tests in test_all.sh minus ~9 intentionally skipped (performance/framework-self-tests)
- output_contains / output_not_contains assertions for checking command stdout/stderr
- Recursive scenario discovery: manual-test --ci finds all subdirectory scenarios
- YAML tests now run alongside bash tests in the CI matrix (default + gitoxide)
- TUI benchmark runner (xtask bench) with live spinner table, --parallel flag, Ctrl+C cancellation
- Per-command bash-vs-YAML benchmarks via mise run bench:tests:integration
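
The recursive scenario discovery listed above could work roughly as follows. This is a minimal sketch, not the xtask implementation: the function name discover_scenarios and the .yml/.yaml extension filter are assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: collects every `.yml`/`.yaml` file under `root`,
/// descending into subdirectories. The extension filter is an assumption.
pub fn discover_scenarios(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    // Explicit stack of directories still to visit (avoids recursion).
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if matches!(
                path.extension().and_then(|ext| ext.to_str()),
                Some("yml") | Some("yaml")
            ) {
                found.push(path);
            }
        }
    }
    // Sort so the result does not depend on directory iteration order.
    found.sort();
    Ok(found)
}
```

Sorting the result keeps the run order stable across platforms, which matters when step counts and timings are compared between runs.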

Test coverage by command

Command	Bash	YAML
exec	8	8
clone	16	14
init	13	12
checkout	31	29
checkout-branch	24	22
prune	23	20
branch-delete	22	22
list	28	28
fetch	23	22
hooks	15	9
sync	12	12
config	9	9
completions	27	10
rename	12	12
unknown-command	6	6
flow	5	5
simple	5	3
setup	7	2
shell-init	17	5

Test plan

- mise run test:unit (all 50 xtask unit tests pass)
- mise run test:manual -- --ci (252 scenarios, 907 steps, all pass)
- mise run bench:tests:integration (bash vs YAML side-by-side, all pass)
- CI workflow runs YAML tests in matrix (default + gitoxide)

🤖 Generated with Claude Code

Add output_contains/output_not_contains to the YAML test expectations schema, enabling tests to verify command stdout/stderr content. This unblocks migration of ~90 bash integration tests that check command output (list, completions, shell-init, unknown-command, etc.).
Key changes:
- Always capture command output (even in interactive mode), print it to terminal, and make it available for assertions
- Add check_output_contains/check_output_not_contains assertion fns
- run_assertions/check_step now accept captured output parameter
- Default scenario discovery is recursive (finds subdirectory tests)
- Add 5 unknown-command test scenarios as validation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
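
To illustrate the two assertion functions named in this commit, here is a minimal sketch. Only the function names come from the commit message; the signatures, the error type, and the plain substring matching are assumptions, and the real xtask code may differ.

```rust
/// Sketch of an `output_contains` check: every expected substring must
/// appear in the captured output. Signature is hypothetical.
pub fn check_output_contains(output: &str, expected: &[&str]) -> Result<(), String> {
    for needle in expected {
        if !output.contains(needle) {
            return Err(format!("expected output to contain {:?}", needle));
        }
    }
    Ok(())
}

/// Sketch of an `output_not_contains` check: none of the forbidden
/// substrings may appear in the captured output. Signature is hypothetical.
pub fn check_output_not_contains(output: &str, forbidden: &[&str]) -> Result<(), String> {
    for needle in forbidden {
        if output.contains(needle) {
            return Err(format!("expected output NOT to contain {:?}", needle));
        }
    }
    Ok(())
}
```

Returning a Result with a message, rather than a bool, lets the caller report which substring failed for a given step.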

Covers basic checkout, carry feature (7 tests), auto-start (6 tests), dash/previous worktree (4 tests), complex branches, subdirectory, existing worktree, uncommitted changes, and error handling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- 12 init scenarios: basic, custom-branch, bare, quiet, errors, branch-conventions, direnv, workflow, paths, help, repo-name-formats, security
- 2 simple scenarios: binaries (14 binary checks), help
- 10 clone scenarios: default-branch, no-checkout, quiet, all-branches, invalid-url, existing-directory, ssh-url, help, direnv, empty-repo
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- 8 branch-delete: basic, refuses-unmerged, force, refuses-default, refuses-dirty, nonexistent, branch-d-basic, deprecated-warning
- 1 fetch: current worktree
- 1 config: defaults
- 2 flow: adopt-basic, eject-basic
- 1 rename: basic
- 2 setup: help, dry-run
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- 7 checkout-branch: basic, with-base, from-subdirectory, errors, naming, outside-repo, help
- 11 list: basic, current-marker, dirty-marker, json, outside-repo, help, from-subdirectory, branch-age, head-column, relative-path, commit-subject
- 11 prune: basic, no-deletion, multiple, from-subdirectory, errors, help, skips-untracked, force-dirty, removes-clean, plus-marker, empty-parent-cleanup
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mise run bench:tests:manual runs all YAML tests with timing. Supports namespace filtering:
  mise run bench:tests:manual -- checkout
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

bench:tests:integration now depends on both bench:tests:integration:exec and bench:tests:manual, so running it benchmarks all test suites.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Each converted command now has a side-by-side benchmark comparing the bash integration tests against the equivalent YAML manual tests:
  bench:tests:integration:{exec,clone,init,checkout,checkout-branch,prune,branch-delete,list}
Extracted shared helper _bench_command to avoid duplication.
To run converted YAML tests only:
  mise run test:manual -- --ci
To benchmark all comparisons:
  mise run bench:tests:integration
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The bash integration tests share /tmp/git-worktree-integration-tests so they can't run in parallel. Replaced mise dependency-based parallel execution with a single script that runs each suite sequentially and prints a consolidated summary table at the end.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Adds 46 new scenario files across all remaining command groups:
- 4 completions: bash, zsh, fish, fig
- 5 shell-init: bash-output, bash-syntax, zsh-output, zsh-syntax, fish-output
- 5 hooks: untrusted, trust-command, status-command, no-hooks, trust-hooks
- 9 fetch: specific, multiple, all, dry-run, rebase, help, outside-repo, from-subdirectory, cross-branch
- 9 branch-delete: multiple, no-worktree, from-current-writes-cd, local-behind-remote, by-relative-path, by-dot, branch-D-force, branch-no-flag-fails, default-force-worktree-only
- 6 list: detached-head, ahead-behind, shorthand-age, stat-lines, branches-flag, default-unchanged
- 3 prune: skips-uncommitted, nested-directories, many-worktrees
- 2 config: checkout-push-false, remote-custom
- 1 flow: adopt-preserves-changes
- 1 rename: from-inside
- 1 sync: basic
Total: 155 scenarios, 532 steps, all passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Adds per-command benchmark scripts for completions, shell-init, hooks, fetch, sync, config, rename, and setup. Updates the consolidated benchmark to include all 16 command suites.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

test_shell_init.sh and test_setup.sh are not included in test_all.sh and have never been run in CI. They have stale expectations (gwcob alias, --yes flag) that don't match the current binary. Move them to YAML-only benchmark entries so the benchmark doesn't fail on pre-existing bash test bugs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The test matrix (xtask test-matrix) now runs both bash integration tests and YAML manual tests for each config entry (default, gitoxide). The CI workflow uploads the xtask binary and runs YAML tests as a separate step with the same GIT_CONFIG_GLOBAL. Also adds the missing git-worktree-list symlink in CI setup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replaces the bash benchmark script with a Rust TUI that shows a live table with spinners and elapsed timers while suites run. Each cell shows ⠋ with a live timer while running, then ✓/✗ with duration on completion. Supports --parallel flag to run bash and YAML concurrently for the same suite.
Usage:
  mise run bench:tests:integration                 # sequential (default)
  mise run bench:tests:integration -- --parallel   # parallel
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
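
The cell rendering described in this commit might look something like the sketch below. Everything except the ⠋, ✓ and ✗ glyphs is an assumption: the Cell type, the render_cell name, the remaining spinner frames, the 100 ms frame interval, the one-decimal timer format, and the placeholder shown for pending cells.

```rust
use std::time::Duration;

// Braille spinner frames; only the first one appears in the commit message.
const SPINNER_FRAMES: [char; 10] = ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏'];

/// Hypothetical state of one table cell (one suite in one column).
pub enum Cell {
    Pending,
    Running { elapsed: Duration },
    Passed { took: Duration },
    Failed { took: Duration },
}

/// Renders a cell as text: spinner plus live timer while running,
/// then a check or cross with the final duration.
pub fn render_cell(cell: &Cell) -> String {
    match cell {
        // Placeholder glyph for pending cells is an assumption.
        Cell::Pending => String::from("·"),
        Cell::Running { elapsed } => {
            // Advance one spinner frame every 100 ms of elapsed time.
            let frame = (elapsed.as_millis() / 100) as usize % SPINNER_FRAMES.len();
            format!("{} {:.1}s", SPINNER_FRAMES[frame], elapsed.as_secs_f64())
        }
        Cell::Passed { took } => format!("✓ {:.1}s", took.as_secs_f64()),
        Cell::Failed { took } => format!("✗ {:.1}s", took.as_secs_f64()),
    }
}
```

Deriving the spinner frame from elapsed time, rather than from a redraw counter, keeps the animation speed independent of how often the table is redrawn.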

The TOTAL row now includes elapsed time from currently running cells (not just completed ones), so it ticks up in real-time. Also fixes column alignment by applying proper padding to the totals row.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
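
A live total of this kind can be computed by adding the time elapsed so far in running cells to the durations of finished ones. The sketch below shows the idea under assumed names (CellTiming, column_total); it is not taken from the xtask source.

```rust
use std::time::{Duration, Instant};

/// Hypothetical timing state for one cell in a benchmark column.
pub enum CellTiming {
    Pending,
    Running { started: Instant },
    Finished { took: Duration },
}

/// Sums a column: finished cells contribute their final duration and
/// running cells contribute the time elapsed up to `now`.
pub fn column_total(cells: &[CellTiming], now: Instant) -> Duration {
    cells
        .iter()
        .map(|cell| match cell {
            CellTiming::Pending => Duration::ZERO,
            CellTiming::Running { started } => now.saturating_duration_since(*started),
            CellTiming::Finished { took } => *took,
        })
        .sum()
}
```

Passing now in as a parameter means every cell in one redraw is measured against the same instant, and the function can be tested without sleeping.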

Pressing Ctrl+C during a benchmark run now:
- Marks running cells as cancelled (⊘ yellow with elapsed time)
- Marks pending cells as skipped
- Stops starting new suites
- Does a final table redraw showing the partial results
- Exits with code 130 (standard SIGINT exit)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
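
The state transitions listed above can be sketched as a single pass over the table cells. The CellState type and the cancel_all function are hypothetical names; only the running-to-cancelled and pending-to-skipped transitions and the exit code 130 come from the commit message. Signal handling and the final redraw are left out.

```rust
use std::time::Duration;

/// Hypothetical state of one benchmark cell.
#[derive(Debug, Clone, PartialEq)]
pub enum CellState {
    Pending,
    Running { elapsed: Duration },
    Done { passed: bool, took: Duration },
    Cancelled { elapsed: Duration },
    Skipped,
}

/// Conventional exit status for a process ended by SIGINT: 128 + signal 2.
pub const SIGINT_EXIT_CODE: i32 = 130;

/// Applies the cancellation transitions and returns the exit code to use.
/// Finished cells are left untouched so partial results stay visible.
pub fn cancel_all(cells: &mut [CellState]) -> i32 {
    for cell in cells.iter_mut() {
        let next = match &*cell {
            CellState::Running { elapsed } => Some(CellState::Cancelled { elapsed: *elapsed }),
            CellState::Pending => Some(CellState::Skipped),
            _ => None,
        };
        if let Some(next) = next {
            *cell = next;
        }
    }
    SIGINT_EXIT_CODE
}
```

A caller would redraw the table once after cancel_all returns and then exit with the returned code.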

Closes the gap from 136 to 252 YAML scenarios (907 steps, all passing). Coverage now matches the 271 bash integration tests minus ~9 intentionally skipped (performance/large_repo/framework-self-tests) and ~10 remaining edge cases.
New scenarios added per command:
- checkout-branch: +15 (carry tests, workflow, security, direnv)
- completions: +6 (dynamic-branch, flags-consistent, position-aware)
- fetch: +12 (config, force, dirty/clean target, cross-branch variants)
- sync: +10 (help, verbose, cd-target, diverged, rebase, autostash)
- rename: +10 (paths, dry-run, remote, cleanup, error cases)
- list: +11 (JSON variants, -a/-r flags, many-worktrees, stat-config)
- hooks: +4 (help, flags-exclusive, deprecated-warning, env-vars)
- config: +6 (carry, upstream, bool-variants, flag overrides)
- prune: +6 (from-current, cd-target, regular-repo, shell-wrapper)
- branch-delete: +5 (absolute-path, cd-target, default variants)
- clone: +2 (empty-repo variants)
- checkout: +4 (help, direnv, outside-repo, carry-untracked)
- simple: +1 (init-bare)
- flow: +2 (eject-with-branch, eject-dirty)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Rewrites tests/README.md to cover both bash and YAML test systems: YAML scenario format, assertions, variables, fixtures, path conventions, benchmarks, and CI integration. Updates CLAUDE.md with manual test commands and adds YAML tests to the new-command checklist.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Expands the Testing section with YAML manual tests (preferred for new tests), dev sandbox usage, and the TUI benchmark runner. Updates the new-command checklist to include YAML scenarios.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The build job was only running cargo build --release, which doesn't build the xtask package. Explicitly build xtask and update the binary count verification from 1 to 2 (daft + xtask).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The integration-tests job now has two matrix dimensions:
  entry: [default, gitoxide]
  suite: [bash, yaml]
This creates 4 parallel workers instead of 2 sequential steps, cutting CI wall time roughly in half. Each worker only runs its own suite (bash steps are skipped for yaml workers and vice versa).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
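
The worker count follows from the cross product of the two dimensions: 2 entries × 2 suites = 4 jobs. The sketch below only illustrates that expansion. The matrix_jobs function is a hypothetical name, and the CI system performs the equivalent expansion itself.

```rust
/// Expands two matrix dimensions into one job per (entry, suite) pair.
/// Illustrative only; not part of the CI workflow or xtask.
pub fn matrix_jobs<'a>(entries: &[&'a str], suites: &[&'a str]) -> Vec<(&'a str, &'a str)> {
    let mut jobs = Vec::new();
    for &entry in entries {
        for &suite in suites {
            jobs.push((entry, suite));
        }
    }
    jobs
}
```

Called with ["default", "gitoxide"] and ["bash", "yaml"], this yields the four workers described above.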

Adds 6 new hooks scenarios: post-create-execution, pre-create-abort, migrate-basic, migrate-conflict, migrate-dry-run, hooks-execute-in-tui.
Total: 258 YAML scenarios, 933 steps, all passing.
The 59-file gap between the 258 YAML scenarios and the 317 bash tests comes entirely from consolidation (e.g., 27 completions bash tests → 10 YAML files with 95 steps) plus 9 intentionally skipped performance/framework tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

avihut and others added 17 commits March 14, 2026 22:19

feat: add benchmark for YAML manual tests

c9eb730

fix: include YAML manual tests in integration benchmark

e646cb8

avihut modified the milestones: Standalone daft, Public Launch Mar 14, 2026

avihut self-assigned this Mar 14, 2026

avihut added the chore (Maintenance tasks) and ci (CI/CD changes) labels Mar 14, 2026

avihut and others added 5 commits March 15, 2026 01:16

avihut merged commit b67b045 into master Mar 15, 2026
8 checks passed

avihut deleted the chore/convert-integration-tests branch March 15, 2026 00:10

wheatley-the-moronic-ci-bot bot mentioned this pull request Mar 15, 2026

chore: release v1.1.0 #300

Merged

avihut merged 22 commits into master from chore/convert-integration-tests
