feat: manual testing framework by avihut · Pull Request #301 · avihut/daft

avihut · 2026-03-14T19:33:58Z

Summary

- YAML-based manual test framework for declarative scenario definition and automated verification
- Interactive step-through mode with keyboard controls (run, check, re-run, reset, skip)
- CI mode with clean, concise output and captured command output
- Dev sandbox with isolated git identity, config, and helper scripts
- Shared repo fixtures with use_fixture references and {{NAME}} substitution
- Namespace resolution for scenarios (exec:checkout-single)
- Setup/reset test environments for manual exploration (setup-test, reset-test)
- Benchmark comparing bash integration tests vs YAML manual tests
- Ctrl+C/Esc cleanup of test environments
- Fix unnecessary rebuilds in git worktrees (build.rs)

Test plan

- `mise run test:unit`: all 45 xtask + 636 daft tests pass
- `mise run test:manual -- --ci tests/manual/scenarios/exec/`: all 8 exec scenarios pass (12 steps)
- `mise run test:manual -- --ci sync:rebase-conflict-skips-push`: sync conflict scenario passes
- `mise run bench:tests:integration:exec`: bash and YAML suites both pass with comparable timing
- Interactive mode tested manually with the clone-basic scenario
- `setup-test clone-basic --step 3` creates the environment and cd's into the work dir
- `reset-test` resets the test environment
- Ctrl+C cleans up the test environment in both modes

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Implements runner.rs for the manual test framework with:

- Individual assertion functions (exit code, dir/file existence, file content, git worktree/branch checks)
- Aggregate run_assertions() with relative path resolution
- execute_step() that runs shell commands with variable expansion
- run_non_interactive() for sequential step execution with reporting
- 12 unit tests covering all assertion functions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Interactive runner for manual test scenarios that shows each step, waits for keypress, executes, displays assertion results, and offers re-run/reset/quit controls. Supports --step to jump to a specific step and --loop-count to repeat a step with environment reset. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace stub mod.rs with full implementation that discovers scenario files, creates test environments, generates repos, and dispatches to either interactive or non-interactive runner based on TTY detection. Adds --list support and --keep flag for debugging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Use RAII guard for raw mode in interactive.rs to handle panics
- Write hook scripts to .daft/hooks/ instead of .daft/
- Resolve relative cwd paths against env.work_dir in runner.rs
- Insert scenario vars before safety vars so safety vars always win
- Downgrade cleanup errors to warnings instead of propagating
- Replace unwrap() on to_str() with context() in repo_gen.rs
- Fix trust command in hooks-lifecycle.yml to use daft hooks trust
- Validate --loop-count requires --step

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
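The variable-ordering fix in this commit relies on later map inserts overwriting earlier ones. A minimal sketch of that idea, assuming the variables live in a `HashMap`; `build_vars` and its tuple-slice parameters are hypothetical names for illustration, not the framework's actual API:

```rust
use std::collections::HashMap;

/// Build the variable map used for command expansion.
/// Scenario-defined variables are inserted first; the framework's own
/// "safety" variables are inserted last, so they overwrite any clash.
/// (Function name and parameter shapes are assumptions for this sketch.)
fn build_vars(
    scenario_vars: &[(&str, &str)],
    safety_vars: &[(&str, &str)],
) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for (key, value) in scenario_vars {
        vars.insert(key.to_string(), value.to_string());
    }
    // A second insert with the same key replaces the scenario's value.
    for (key, value) in safety_vars {
        vars.insert(key.to_string(), value.to_string());
    }
    vars
}
```

With this ordering, a scenario that tries to redefine a protected variable is silently overridden rather than rejected; a stricter design could return an error on a clash instead.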

Rename sandbox helpers to avoid daft- prefix conflict (daft-rebuild → rebuild-daft, daft-manual-test → manual-test). Add bare name resolution so scenarios can be run as `manual-test clone-basic` instead of requiring the full path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add go-to-worktree script that opens a shell in the worktree directory. Add zsh and bash completion functions for manual-test (flags and scenario names). Completions are loaded via 'source enable-completions' since direnv only propagates env vars, not shell completions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add clean-sandbox helper that removes the sandbox via mise sandbox:clean and drops the user back into the worktree. Generate a README.md in the sandbox with setup instructions (completions, shell-init) and a command reference. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add bat to mise tools and generate a `help` command that renders the sandbox README with syntax highlighting. Move all sandbox scripts and completions into bin/ subdirectory for cleaner layout. Add empty run-help hint file in sandbox root. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Convert all 8 tests from test_exec.sh to YAML manual test format in tests/manual/scenarios/exec/. Add shared repo fixtures system with use_fixture references and {{NAME}} placeholder substitution to eliminate duplicated repo definitions across scenarios.

Framework additions:

- file_not_contains assertion (schema + runner)
- Directory argument support in resolve_scenario_paths
- RepoEntry untagged enum for fixture vs inline repo specs
- RepoFixture/RawScenario types for fixture resolution pipeline
- Benchmark task at bench:tests:integration:exec comparing bash vs YAML
- Fix check_git_worktree to use git branch --show-current
- Fix repo_gen to pass -b flag to git init (eliminates default branch hint)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
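The {{NAME}} placeholder substitution described in this commit can be sketched as plain string replacement over every text field of a fixture. `FixtureTemplate`, its fields, and `instantiate` are hypothetical; the real RepoFixture layout is not shown in this PR text:

```rust
/// Hypothetical fixture shape for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct FixtureTemplate {
    name: String,
    /// (relative path, file content) pairs
    files: Vec<(String, String)>,
}

/// Replace every `{{NAME}}` occurrence in one string.
fn substitute(text: &str, name: &str) -> String {
    text.replace("{{NAME}}", name)
}

/// Produce a concrete repo spec from a shared template by substituting
/// the placeholder in the repo name, file paths, and file contents.
fn instantiate(template: &FixtureTemplate, name: &str) -> FixtureTemplate {
    FixtureTemplate {
        name: substitute(&template.name, name),
        files: template
            .files
            .iter()
            .map(|(path, content)| (substitute(path, name), substitute(content, name)))
            .collect(),
    }
}
```

Because the template is borrowed and a new value is returned, several scenarios could instantiate the same fixture under different names without affecting each other.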

The lockfile had cached the x86_64-apple-darwin asset for macos-arm64, causing mise to install the wrong architecture and fail to resolve the cog binary path. Reinstalling picked up the correct aarch64 build. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Capture command stdout/stderr in CI mode instead of inheriting stdio
- Suppress repo generation git output (null stdout/stderr)
- Show captured output only on step failure for debugging
- Replace per-scenario summaries with single overall summary
- Improve file_contains/file_not_contains failure messages to show actual content
- Add ScenarioResult type for aggregate tracking across scenarios

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
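The capture-then-report-on-failure behaviour can be sketched with `std::process::Command::output`, which pipes stdout and stderr instead of inheriting them. The `Captured` type and both function names are assumptions for this sketch, as is running the step through `sh -c`:

```rust
use std::process::Command;

/// Captured result of one step command (hypothetical type).
struct Captured {
    exit_code: Option<i32>,
    stdout: String,
    stderr: String,
}

/// Run a shell command with stdout/stderr captured rather than inherited.
fn run_captured(command: &str) -> std::io::Result<Captured> {
    // `output()` collects both streams and waits for the child to exit.
    let out = Command::new("sh").arg("-c").arg(command).output()?;
    Ok(Captured {
        exit_code: out.status.code(),
        stdout: String::from_utf8_lossy(&out.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&out.stderr).into_owned(),
    })
}

/// Return nothing when the step exited as expected, and the captured
/// output (for debugging) when it did not.
fn failure_report(captured: &Captured, expected_exit: i32) -> Option<String> {
    if captured.exit_code == Some(expected_exit) {
        return None;
    }
    Some(format!(
        "exit code {:?}, expected {}\n--- stdout ---\n{}\n--- stderr ---\n{}",
        captured.exit_code, expected_exit, captured.stdout, captured.stderr
    ))
}
```

Keeping the report separate from execution means a passing run prints nothing extra, while a failing step still has its full output available.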

The build script used `.git/HEAD` and `.git/refs/tags` as rerun-if-changed paths, but in worktrees `.git` is a file (not a directory) so these paths don't exist. Cargo treats non-existent rerun-if-changed paths as "always re-run", causing a recompile on every build. Fix by resolving the actual git dir via `git rev-parse --git-dir` and watching the real HEAD and branch ref files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
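A sketch of the kind of helper a build script could use for this fix. It is not the PR's actual build.rs: the function names are hypothetical, and the fallback to the common git directory (as reported by `git rev-parse --git-common-dir`) is an assumption added here, because in a linked worktree loose branch refs usually live in the main repository's git dir rather than the per-worktree one. Only paths that exist are returned, since a missing path is what triggers the constant re-runs.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Ask git for a directory, e.g. with `--git-dir` or `--git-common-dir`.
fn git_rev_parse_dir(flag: &str) -> Option<PathBuf> {
    let out = Command::new("git").args(["rev-parse", flag]).output().ok()?;
    if !out.status.success() {
        return None;
    }
    let text = String::from_utf8(out.stdout).ok()?;
    let text = text.trim();
    if text.is_empty() { None } else { Some(PathBuf::from(text)) }
}

/// Collect files worth watching: HEAD, plus the loose ref file that HEAD
/// points at if one exists in either directory.
fn watch_paths(git_dir: &Path, common_dir: &Path) -> Vec<PathBuf> {
    let mut paths = Vec::new();
    let head = git_dir.join("HEAD");
    if !head.is_file() {
        return paths;
    }
    let contents = fs::read_to_string(&head).unwrap_or_default();
    paths.push(head);
    // A symbolic HEAD looks like "ref: refs/heads/<branch>".
    if let Some(reference) = contents.trim().strip_prefix("ref: ") {
        for base in [git_dir, common_dir] {
            let candidate = base.join(reference);
            if candidate.is_file() {
                paths.push(candidate);
                break;
            }
        }
    }
    paths
}

/// Format the directives a build script would print to stdout.
fn rerun_directives(paths: &[PathBuf]) -> Vec<String> {
    paths
        .iter()
        .map(|p| format!("cargo:rerun-if-changed={}", p.display()))
        .collect()
}
```

In a build script's `main`, one would call `git_rev_parse_dir("--git-dir")` and `git_rev_parse_dir("--git-common-dir")`, pass both to `watch_paths`, and print each string from `rerun_directives`. If the branch ref is packed and has no loose file, only HEAD is watched.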

- Add -v/--verbose flag to show full check details
- Use cyan for scenario names (higher visual hierarchy)
- Show compact check summary by default (e.g. "ok (3 checks)")
- Show individual check details only on failure or with --verbose
- Fix interactive UI color inversion (dim for meta, bold for names)
- Reduce indentation and remove separator lines for cleaner output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The completion scripts only globbed top-level *.yml files, missing scenarios in subdirectories like exec/. Use recursive globs (zsh) and find (bash) to discover all scenarios. Also add -v/--verbose to flag completions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Scenarios in subdirectories can now be referenced with colon-separated namespaces (e.g., exec:checkout-single maps to exec/checkout-single.yml). Bare names without a namespace fall back to recursive subdirectory search.

- Rename exec scenario files to drop redundant exec- prefix
- Add find_scenario_file and find_scenario_recursive helpers
- Update --list to show namespaced names (exec:checkout-single)
- Update sandbox completions to generate namespaced names

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
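The namespace mapping and bare-name fallback can be sketched as follows. This is an independent illustration, not the PR's find_scenario_file or find_scenario_recursive: `resolve_scenario`, `namespaced_rel_path`, and `search_recursive` are hypothetical names, and checking the top-level directory before recursing is an assumption.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Map "exec:checkout-single" to the relative path "exec/checkout-single.yml".
fn namespaced_rel_path(name: &str) -> PathBuf {
    let mut path = PathBuf::new();
    for part in name.split(':') {
        path.push(part);
    }
    // Append the extension to the final component without touching dots in it.
    let file = format!("{}.yml", path.file_name().unwrap_or_default().to_string_lossy());
    path.set_file_name(file);
    path
}

/// Depth-first search for a file name, visiting entries in sorted order
/// so the result is deterministic.
fn search_recursive(dir: &Path, file_name: &str) -> Option<PathBuf> {
    let mut entries: Vec<PathBuf> = fs::read_dir(dir)
        .ok()?
        .filter_map(|entry| entry.ok())
        .map(|entry| entry.path())
        .collect();
    entries.sort();
    for path in &entries {
        if path.is_file() && path.file_name().and_then(|n| n.to_str()) == Some(file_name) {
            return Some(path.clone());
        }
    }
    entries
        .iter()
        .filter(|path| path.is_dir())
        .find_map(|path| search_recursive(path, file_name))
}

/// Resolve a scenario reference: namespaced names map directly to a path,
/// bare names are looked up at the top level and then recursively.
fn resolve_scenario(root: &Path, name: &str) -> Option<PathBuf> {
    if name.contains(':') {
        let path = root.join(namespaced_rel_path(name));
        return if path.is_file() { Some(path) } else { None };
    }
    let file_name = format!("{}.yml", name);
    let direct = root.join(&file_name);
    if direct.is_file() {
        return Some(direct);
    }
    search_recursive(root, &file_name)
}
```

A bare name that exists in more than one subdirectory is ambiguous; this sketch returns the first match in sorted order, whereas a real implementation might prefer to report the conflict.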

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Commands shown in cyan (visible, distinct from output)
- Prompts shown in yellow (action needed, stands out)
- Check summary in normal weight (not dim)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…mode

Split execute_step into run_step_command (runs command only) and check_step (runs assertions only). In interactive mode, checks are now triggered explicitly via [c] in the post-run prompt rather than running automatically after every step. The [c] option only appears when the step has expectations defined. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When checks all pass, automatically move to the next step without waiting for a keypress. The post-run prompt only appears when checks fail, were skipped (via [x]), or the step has no expectations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
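The auto-advance rule reduces to a single decision on the outcome of a step's checks. A small sketch, where the `CheckOutcome` enum and `needs_prompt` are hypothetical names rather than types from the PR:

```rust
/// Possible states after a step's command has run (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq)]
enum CheckOutcome {
    AllPassed,
    SomeFailed,
    /// The user skipped the checks.
    Skipped,
    /// The step defines no expectations, so there is nothing to check.
    NoExpectations,
}

/// Decide whether the post-run prompt should be shown.
/// Only a fully passing step advances without waiting for a keypress.
fn needs_prompt(outcome: CheckOutcome) -> bool {
    match outcome {
        CheckOutcome::AllPassed => false,
        CheckOutcome::SomeFailed | CheckOutcome::Skipped | CheckOutcome::NoExpectations => true,
    }
}
```

Writing the match exhaustively, without a wildcard arm, means adding a new outcome later forces an explicit decision about whether it should prompt.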

Register a ctrlc handler that removes the active sandbox directory on SIGINT, preventing leftover /tmp/daft-manual-test-* directories. The handler also restores terminal raw mode in case interactive mode was active. Respects --keep flag. In interactive mode, Esc now works as a quit trigger alongside 'q'. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Print "Cleaned up test environment." when quitting via q/Esc/Ctrl+C. Change step counter from regular blue to bright blue for better visibility on dark terminals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Display the work directory path when starting a scenario in interactive mode. Shows a relative path if running from a parent directory, or the full path otherwise. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
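The relative-or-full path choice maps naturally onto `Path::strip_prefix`. A minimal sketch, with `display_work_dir` as a hypothetical helper name and the current directory passed in explicitly so the logic stays testable:

```rust
use std::path::Path;

/// Show `work_dir` relative to `cwd` when `cwd` is one of its ancestors,
/// and as the full path otherwise.
fn display_work_dir(work_dir: &Path, cwd: &Path) -> String {
    match work_dir.strip_prefix(cwd) {
        // An empty remainder means work_dir == cwd; show the full path then.
        Ok(relative) if !relative.as_os_str().is_empty() => relative.display().to_string(),
        _ => work_dir.display().to_string(),
    }
}
```

A caller would typically pass `std::env::current_dir()` as `cwd`. `strip_prefix` compares whole path components, so `/tmp/sand` is not treated as a parent of `/tmp/sandbox`.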

Change pre-commit hook from fmt:check to fmt with stage_fixed, matching the existing prettier and biome behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add --setup-only flag to xtask manual-test that creates the test environment (repos, sandbox) without running any steps. Support DAFT_MANUAL_TEST_BASE env var for deterministic paths under the sandbox directory. New sandbox scripts: - setup-test <scenario>: creates test env under sandbox/test/ - reset-test [name]: resets work dir and remotes from template Includes tab completions for both (scenarios for setup-test, active test envs for reset-test). Updated sandbox README. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

setup-test now runs all scenario steps silently (with captured output) to leave the environment in a usable state. Use --step N to stop after step N. Example: setup-test clone-basic --step 3 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Converts test_sync_rebase_conflict_skips_push from test_sync.sh to the YAML manual test format. Verifies that git-sync --rebase --push skips the push phase when rebase encounters a conflict. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

setup-test now spawns a new shell in the test environment's work directory after setup completes. Exit the shell to return to the sandbox. The xtask prints the work dir to stdout for the script to capture. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replace enable-completions with a single `source init` that loads tab completions for all sandbox commands AND runs daft shell-init for shell integration (cd wrappers). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

avihut and others added 30 commits March 14, 2026 10:24

chore(xtask): add serde_yaml and crossterm deps for manual test runner

ab4bf96

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(xtask): add YAML scenario schema for manual test framework

d207d61

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(xtask): wire manual-test subcommand with stub

410e00a

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(xtask): add TestEnv for manual test environment management

552f406

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(xtask): add repo generator for manual test scenarios

35fa174

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: add mise task and sandbox helper for manual tests

c600328

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: add manual test scenarios for clone and hooks workflows

82191fa

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

style(manual-test): remove indentation from CI runner output

8cd5839

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

style(manual-test): remove indentation from interactive runner output

8cd0fa4

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

style(manual-test): improve interactive runner color contrast

b5dc177

- Commands shown in cyan (visible, distinct from output)
- Prompts shown in yellow (action needed, stands out)
- Check summary in normal weight (not dim)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

style(manual-test): use cyan for step counter indicator

51f13a9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

avihut and others added 8 commits March 14, 2026 20:57

chore: auto-format Rust code on commit instead of just checking

ad3fdba

Change pre-commit hook from fmt:check to fmt with stage_fixed, matching the existing prettier and biome behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore: remove unused exported_vars method

b6aa6a1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

avihut added this to the Public Launch milestone Mar 14, 2026

avihut added the feat (New feature) and ci (CI/CD changes) labels Mar 14, 2026

avihut self-assigned this Mar 14, 2026

avihut merged commit c694f6a into master Mar 14, 2026
6 checks passed

avihut deleted the feat/manual-testing-framework branch March 14, 2026 19:43

wheatley-the-moronic-ci-bot bot mentioned this pull request Mar 14, 2026

chore: release v1.1.0 #300

Merged

avihut merged 38 commits into master from feat/manual-testing-framework
