
feat: manual testing framework #301

Merged
avihut merged 38 commits into master from feat/manual-testing-framework
Mar 14, 2026

Conversation

@avihut
Owner

@avihut avihut commented Mar 14, 2026

Summary

  • YAML-based manual test framework for declarative scenario definition and automated verification
  • Interactive step-through mode with keyboard controls (run, check, re-run, reset, skip)
  • CI mode with clean, concise output and captured command output
  • Dev sandbox with isolated git identity, config, and helper scripts
  • Shared repo fixtures with use_fixture references and {{NAME}} substitution
  • Namespace resolution for scenarios (exec:checkout-single)
  • Setup/reset test environments for manual exploration (setup-test, reset-test)
  • Benchmark comparing bash integration tests vs YAML manual tests
  • Ctrl+C/Esc cleanup of test environments
  • Fix unnecessary rebuilds in git worktrees (build.rs)

Test plan

  • mise run test:unit — all 45 xtask + 636 daft tests pass
  • mise run test:manual -- --ci tests/manual/scenarios/exec/ — all 8 exec scenarios pass (12 steps)
  • mise run test:manual -- --ci sync:rebase-conflict-skips-push — sync conflict scenario passes
  • mise run bench:tests:integration:exec — bash and YAML suites both pass with comparable timing
  • Interactive mode tested manually with clone-basic scenario
  • setup-test clone-basic --step 3 creates environment and cd's into work dir
  • reset-test resets test environment
  • Ctrl+C cleans up test environment in both modes

🤖 Generated with Claude Code

avihut and others added 30 commits March 14, 2026 10:24
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements runner.rs for the manual test framework with:
- Individual assertion functions (exit code, dir/file existence, file
  content, git worktree/branch checks)
- Aggregate run_assertions() with relative path resolution
- execute_step() that runs shell commands with variable expansion
- run_non_interactive() for sequential step execution with reporting
- 12 unit tests covering all assertion functions
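
A minimal sketch of how such assertion functions could look, assuming a simple result type and work-dir-relative path resolution. The names `CheckResult`, `resolve`, `check_exit_code`, and `check_file_contains` are hypothetical and are not taken from runner.rs.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Outcome of one check; `detail` is only set when the check fails.
/// (Hypothetical type, not the one defined in runner.rs.)
#[derive(Debug, PartialEq)]
pub struct CheckResult {
    pub name: String,
    pub passed: bool,
    pub detail: Option<String>,
}

/// Resolve a possibly relative path against the scenario work dir.
pub fn resolve(work_dir: &Path, p: &str) -> PathBuf {
    let path = Path::new(p);
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        work_dir.join(path)
    }
}

/// Compare the expected exit code with the one the step produced.
pub fn check_exit_code(expected: i32, actual: i32) -> CheckResult {
    let passed = expected == actual;
    CheckResult {
        name: format!("exit code is {expected}"),
        passed,
        detail: (!passed).then(|| format!("got {actual}")),
    }
}

/// Check that a file (relative to the work dir) contains a substring.
pub fn check_file_contains(work_dir: &Path, file: &str, needle: &str) -> CheckResult {
    let path = resolve(work_dir, file);
    let name = format!("{file} contains {needle:?}");
    match fs::read_to_string(&path) {
        Ok(content) if content.contains(needle) => CheckResult {
            name,
            passed: true,
            detail: None,
        },
        // Report the actual content so a failure is easy to debug.
        Ok(content) => CheckResult {
            name,
            passed: false,
            detail: Some(format!("actual content: {content:?}")),
        },
        Err(e) => CheckResult {
            name,
            passed: false,
            detail: Some(format!("cannot read {}: {e}", path.display())),
        },
    }
}
```

An aggregate function in the spirit of run_assertions() would call each check in turn and collect the results into a list.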

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Interactive runner for manual test scenarios that shows each step,
waits for keypress, executes, displays assertion results, and offers
re-run/reset/quit controls. Supports --step to jump to a specific
step and --loop-count to repeat a step with environment reset.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace stub mod.rs with full implementation that discovers scenario
files, creates test environments, generates repos, and dispatches to
either interactive or non-interactive runner based on TTY detection.
Adds --list support and --keep flag for debugging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use RAII guard for raw mode in interactive.rs to handle panics
- Write hook scripts to .daft/hooks/ instead of .daft/
- Resolve relative cwd paths against env.work_dir in runner.rs
- Insert scenario vars before safety vars so safety vars always win
- Downgrade cleanup errors to warnings instead of propagating
- Replace unwrap() on to_str() with context() in repo_gen.rs
- Fix trust command in hooks-lifecycle.yml to use daft hooks trust
- Validate --loop-count requires --step
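
The ordering fix for scenario and safety vars can be illustrated with a small sketch. It assumes the variables end up in a single map, so whichever set is inserted last overrides the other. The function name `build_vars` and the variable names in the usage note are hypothetical.

```rust
use std::collections::HashMap;

/// Merge variables so that safety vars always override scenario vars.
/// Scenario vars go in first; a later insert with the same key replaces
/// the earlier value. (Hypothetical helper, not the framework's code.)
pub fn build_vars(
    scenario: &[(&str, &str)],
    safety: &[(&str, &str)],
) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for (key, value) in scenario {
        vars.insert(key.to_string(), value.to_string());
    }
    for (key, value) in safety {
        vars.insert(key.to_string(), value.to_string());
    }
    vars
}
```

With this ordering, a scenario that sets `HOME` cannot redirect the test outside the sandbox if `HOME` is also one of the safety vars.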

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename sandbox helpers to avoid daft- prefix conflict (daft-rebuild →
rebuild-daft, daft-manual-test → manual-test). Add bare name resolution
so scenarios can be run as `manual-test clone-basic` instead of requiring
the full path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add go-to-worktree script that opens a shell in the worktree directory.
Add zsh and bash completion functions for manual-test (flags and scenario
names). Completions are loaded via 'source enable-completions' since
direnv only propagates env vars, not shell completions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add clean-sandbox helper that removes the sandbox via mise sandbox:clean
and drops the user back into the worktree. Generate a README.md in the
sandbox with setup instructions (completions, shell-init) and a command
reference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add bat to mise tools and generate a `help` command that renders the
sandbox README with syntax highlighting. Move all sandbox scripts and
completions into bin/ subdirectory for cleaner layout. Add empty
run-help hint file in sandbox root.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Convert all 8 tests from test_exec.sh to YAML manual test format in
tests/manual/scenarios/exec/. Add shared repo fixtures system with
use_fixture references and {{NAME}} placeholder substitution to
eliminate duplicated repo definitions across scenarios.

Framework additions:
- file_not_contains assertion (schema + runner)
- Directory argument support in resolve_scenario_paths
- RepoEntry untagged enum for fixture vs inline repo specs
- RepoFixture/RawScenario types for fixture resolution pipeline
- Benchmark task at bench:tests:integration:exec comparing bash vs YAML
- Fix check_git_worktree to use git branch --show-current
- Fix repo_gen to pass -b flag to git init (eliminates default branch hint)
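
The {{NAME}} placeholder substitution could be sketched as plain string replacement. This assumes placeholders are literal `{{KEY}}` tokens with no escaping or nesting; the function `substitute` is hypothetical and not the framework's actual fixture resolver.

```rust
use std::collections::HashMap;

/// Replace every `{{KEY}}` token in `template` with its value.
/// Unknown placeholders are left untouched.
pub fn substitute(template: &str, vars: &HashMap<&str, &str>) -> String {
    let mut out = template.to_string();
    for (name, value) in vars {
        // `{{{{` and `}}}}` are escaped braces, so this builds "{{NAME}}".
        let placeholder = format!("{{{{{name}}}}}");
        out = out.replace(placeholder.as_str(), *value);
    }
    out
}
```

A fixture defined once with `{{NAME}}` in its paths can then be reused by several scenarios, each supplying its own value.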

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The lockfile had cached the x86_64-apple-darwin asset for macos-arm64,
causing mise to install the wrong architecture and fail to resolve the
cog binary path. Reinstalling picked up the correct aarch64 build.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Capture command stdout/stderr in CI mode instead of inheriting stdio
- Suppress repo generation git output (null stdout/stderr)
- Show captured output only on step failure for debugging
- Replace per-scenario summaries with single overall summary
- Improve file_contains/file_not_contains failure messages to show actual content
- Add ScenarioResult type for aggregate tracking across scenarios
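
A rough sketch of capture-then-report-on-failure, assuming steps are run through `sh -c` (the commit does not say which shell the framework uses). `StepOutput`, `run_captured`, and `report` are hypothetical names.

```rust
use std::process::Command;

/// Captured result of one step. (Hypothetical type.)
pub struct StepOutput {
    pub exit_code: i32,
    pub stdout: String,
    pub stderr: String,
}

/// Run a shell command and capture its output instead of inheriting stdio.
/// Assumption: commands are executed via `sh -c`.
pub fn run_captured(command: &str) -> std::io::Result<StepOutput> {
    let out = Command::new("sh").arg("-c").arg(command).output()?;
    Ok(StepOutput {
        // `code()` is None when the process was killed by a signal.
        exit_code: out.status.code().unwrap_or(-1),
        stdout: String::from_utf8_lossy(&out.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&out.stderr).into_owned(),
    })
}

/// One concise line on success; the captured output only on failure.
pub fn report(step: &str, out: &StepOutput, expected_exit: i32) -> String {
    if out.exit_code == expected_exit {
        format!("ok   {step}")
    } else {
        format!(
            "FAIL {step} (exit {})\n{}{}",
            out.exit_code, out.stdout, out.stderr
        )
    }
}
```

Keeping the output in memory until a step fails is what lets passing runs stay quiet without losing debugging information.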

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The build script used `.git/HEAD` and `.git/refs/tags` as
rerun-if-changed paths, but in worktrees `.git` is a file (not a
directory) so these paths don't exist. Cargo treats non-existent
rerun-if-changed paths as "always re-run", causing a recompile on
every build.

Fix by resolving the actual git dir via `git rev-parse --git-dir`
and watching the real HEAD and branch ref files.
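
The approach described above might look roughly like this in a build script. This is a sketch under assumptions, not the actual build.rs: the helper names are hypothetical, and it only emits paths that exist, since a missing path is what triggers the constant rebuilds. In a linked worktree the branch ref may live outside the per-worktree git dir, in which case this sketch simply skips it.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Ask git for the real git dir. In a worktree, `.git` is a file that
/// points elsewhere, so the path must not be hard-coded.
pub fn resolve_git_dir() -> Option<PathBuf> {
    let out = Command::new("git")
        .args(["rev-parse", "--git-dir"])
        .output()
        .ok()?;
    if !out.status.success() {
        return None;
    }
    let text = String::from_utf8(out.stdout).ok()?;
    // The result may be relative to the current directory.
    Some(PathBuf::from(text.trim()))
}

/// Paths worth watching: HEAD, plus the branch ref when HEAD is symbolic.
/// Only existing paths are returned.
pub fn watch_paths(git_dir: &Path, head_contents: &str) -> Vec<PathBuf> {
    let mut candidates = vec![git_dir.join("HEAD")];
    if let Some(reference) = head_contents.trim().strip_prefix("ref: ") {
        // Assumption: the ref is a loose file under the same git dir.
        candidates.push(git_dir.join(reference));
    }
    candidates.into_iter().filter(|p| p.exists()).collect()
}

/// Print the directives Cargo reads from a build script's stdout.
pub fn emit_rerun_directives(paths: &[PathBuf]) {
    for path in paths {
        println!("cargo:rerun-if-changed={}", path.display());
    }
}
```

A build script would call `resolve_git_dir()`, read `HEAD` from the result, and pass both to `watch_paths` before emitting the directives.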

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add -v/--verbose flag to show full check details
- Use cyan for scenario names (higher visual hierarchy)
- Show compact check summary by default (e.g. "ok (3 checks)")
- Show individual check details only on failure or with --verbose
- Fix interactive UI color inversion (dim for meta, bold for names)
- Reduce indentation and remove separator lines for cleaner output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The completion scripts only globbed top-level *.yml files, missing
scenarios in subdirectories like exec/. Use recursive globs (zsh)
and find (bash) to discover all scenarios. Also add -v/--verbose
to flag completions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scenarios in subdirectories can now be referenced with colon-separated
namespaces (e.g., exec:checkout-single maps to exec/checkout-single.yml).
Bare names without a namespace fall back to recursive subdirectory search.

- Rename exec scenario files to drop redundant exec- prefix
- Add find_scenario_file and find_scenario_recursive helpers
- Update --list to show namespaced names (exec:checkout-single)
- Update sandbox completions to generate namespaced names
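
The namespace mapping can be sketched in both directions. The functions below are illustrative only (`scenario_path` and `display_name` are hypothetical names, not the find_scenario_file helper from this commit), and the recursive fallback for bare names is not shown.

```rust
use std::path::{Path, PathBuf};

/// Map a namespaced name such as `exec:checkout-single` to
/// `<root>/exec/checkout-single.yml`.
pub fn scenario_path(root: &Path, name: &str) -> PathBuf {
    let relative = name.replace(':', "/");
    root.join(format!("{relative}.yml"))
}

/// Inverse mapping for listings: turn a scenario file under `root`
/// back into its namespaced name. Returns None for files outside `root`.
pub fn display_name(root: &Path, file: &Path) -> Option<String> {
    let relative = file.strip_prefix(root).ok()?.with_extension("");
    let parts: Vec<String> = relative
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    Some(parts.join(":"))
}
```

Because the two functions are inverses, the names printed by --list can be passed straight back in as scenario arguments.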

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Commands shown in cyan (visible, distinct from output)
- Prompts shown in yellow (action needed, stands out)
- Check summary in normal weight (not dim)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…mode

Split execute_step into run_step_command (runs command only) and
check_step (runs assertions only). In interactive mode, checks are
now triggered explicitly via [c] in the post-run prompt rather than
running automatically after every step. The [c] option only appears
when the step has expectations defined.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When checks all pass, automatically move to the next step without
waiting for a keypress. The post-run prompt only appears when checks
fail, were skipped (via [x]), or the step has no expectations.
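
The auto-advance rule reduces to a single decision, sketched here with a hypothetical `CheckOutcome` enum; the real interactive runner may model this differently.

```rust
/// What happened to a step's checks. (Hypothetical enum.)
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CheckOutcome {
    AllPassed,
    Failed,
    /// The user skipped the checks via [x].
    Skipped,
    /// The step defines no expectations at all.
    NoExpectations,
}

/// The post-run prompt is shown for every outcome except a clean pass,
/// in which case the runner advances to the next step on its own.
pub fn needs_prompt(outcome: CheckOutcome) -> bool {
    !matches!(outcome, CheckOutcome::AllPassed)
}
```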

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Register a ctrlc handler that removes the active sandbox directory on
SIGINT, preventing leftover /tmp/daft-manual-test-* directories. The
handler also restores terminal raw mode in case interactive mode was
active. Respects --keep flag.

In interactive mode, Esc now works as a quit trigger alongside 'q'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Print "Cleaned up test environment." when quitting via q/Esc/Ctrl+C.
Change step counter from regular blue to bright blue for better
visibility on dark terminals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
avihut and others added 8 commits March 14, 2026 20:57
Display the work directory path when starting a scenario in interactive
mode. Shows a relative path if running from a parent directory, or the
full path otherwise.
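
One way to get this relative-or-full behavior is `Path::strip_prefix`, as in the sketch below. The function name `display_work_dir` is hypothetical, and the real code may normalize paths first.

```rust
use std::path::Path;

/// Show the work dir relative to the current directory when it lies
/// underneath it, and as the full path otherwise.
pub fn display_work_dir(cwd: &Path, work_dir: &Path) -> String {
    match work_dir.strip_prefix(cwd) {
        // An empty remainder means work_dir == cwd; fall back to full path.
        Ok(relative) if !relative.as_os_str().is_empty() => {
            relative.display().to_string()
        }
        _ => work_dir.display().to_string(),
    }
}
```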

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change pre-commit hook from fmt:check to fmt with stage_fixed, matching
the existing prettier and biome behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add --setup-only flag to xtask manual-test that creates the test
environment (repos, sandbox) without running any steps. Support
DAFT_MANUAL_TEST_BASE env var for deterministic paths under the
sandbox directory.

New sandbox scripts:
- setup-test <scenario>: creates test env under sandbox/test/
- reset-test [name]: resets work dir and remotes from template

Includes tab completions for both (scenarios for setup-test, active
test envs for reset-test). Updated sandbox README.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
setup-test now runs all scenario steps silently (with captured output)
to leave the environment in a usable state. Use --step N to stop after
step N.

Example: setup-test clone-basic --step 3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Converts test_sync_rebase_conflict_skips_push from test_sync.sh to
the YAML manual test format. Verifies that git-sync --rebase --push
skips the push phase when rebase encounters a conflict.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
setup-test now spawns a new shell in the test environment's work
directory after setup completes. Exit the shell to return to the
sandbox. The xtask prints the work dir to stdout for the script
to capture.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace enable-completions with a single `source init` that loads
tab completions for all sandbox commands AND runs daft shell-init
for shell integration (cd wrappers).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@avihut avihut added this to the Public Launch milestone Mar 14, 2026
@avihut avihut added the feat (New feature) and ci (CI/CD changes) labels Mar 14, 2026
@avihut avihut self-assigned this Mar 14, 2026
@avihut avihut merged commit c694f6a into master Mar 14, 2026
6 checks passed
@avihut avihut deleted the feat/manual-testing-framework branch March 14, 2026 19:43

Labels

ci (CI/CD changes), feat (New feature)

Projects

None yet

1 participant