Closed
Description
Problem
We currently have 3 bash e2e test scripts in e2e/bash/ that test CLI functionality by shelling out to the nemoclaw binary:
| Script | What it tests |
|---|---|
| `test_sandbox_custom_image.sh` | Custom Dockerfile build + sandbox creation via `--from` |
| `test_sandbox_sync.sh` | Bidirectional file sync (directories, single files, large files with checksum verification) |
| `test_port_forward.sh` | TCP port forwarding through a sandbox via `sandbox forward start` |
These bash tests are brittle, hard to maintain, and inconsistent with the rest of the test suite which is written in Rust and Python. They rely on hand-rolled helpers (strip_ansi(), poll loops, trap-based cleanup) that would be better served by Rust's type system, assert! macros, and RAII cleanup.
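As a rough illustration of how RAII could replace the `trap`-based cleanup, here is a minimal sketch of a drop guard. The name `CleanupGuard` is hypothetical and does not exist in the repo; in a real test the closure would shell out to whatever command removes the sandbox.

```rust
/// Runs a cleanup closure when the guard goes out of scope, including when
/// the test body panics and the stack unwinds.
/// `CleanupGuard` is a hypothetical helper name used only for illustration.
pub struct CleanupGuard<F: FnMut()> {
    cleanup: F,
}

impl<F: FnMut()> CleanupGuard<F> {
    pub fn new(cleanup: F) -> Self {
        Self { cleanup }
    }
}

impl<F: FnMut()> Drop for CleanupGuard<F> {
    fn drop(&mut self) {
        // Drop cannot return an error, so the closure has to handle or log
        // its own failures.
        (self.cleanup)();
    }
}
```

Binding the guard to a named variable (`let _guard = ...`, not `let _ = ...`) keeps it alive until the end of the test function, which is the closest equivalent to the bash `trap ... EXIT` pattern.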
Proposed Solution
Replace all 3 bash e2e test scripts with Rust integration tests that invoke the nemoclaw CLI binary as a subprocess (using std::process::Command or assert_cmd). The new tests should live in crates/navigator-cli/tests/ alongside the existing Rust integration tests (provider_commands_integration.rs, mtls_integration.rs).
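A minimal sketch of the subprocess approach using only `std::process::Command` might look like the following. The helper names (`nemoclaw_bin`, `nemoclaw_cmd`, `run_capture`) and the `NEMOCLAW_BIN` environment override are assumptions made for illustration; inside a Cargo integration test the path would more likely come from `env!("CARGO_BIN_EXE_nemoclaw")`.

```rust
use std::path::PathBuf;
use std::process::Command;

/// Resolve the CLI binary. `NEMOCLAW_BIN` is a hypothetical override so this
/// sketch compiles outside Cargo; falls back to `nemoclaw` on PATH.
pub fn nemoclaw_bin() -> PathBuf {
    std::env::var_os("NEMOCLAW_BIN")
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("nemoclaw"))
}

/// Build (but do not run) a command for the given CLI arguments.
pub fn nemoclaw_cmd(args: &[&str]) -> Command {
    let mut cmd = Command::new(nemoclaw_bin());
    cmd.args(args);
    cmd
}

/// Run a command to completion and return (exit success, stdout as text).
pub fn run_capture(cmd: &mut Command) -> std::io::Result<(bool, String)> {
    let out = cmd.output()?;
    let stdout = String::from_utf8_lossy(&out.stdout).into_owned();
    Ok((out.status.success(), stdout))
}
```

Separating command construction from execution keeps the helpers usable both for blocking calls (`output()`) and for long-running ones such as port forwarding, where the test would call `spawn()` instead.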
Key design decisions
- Invoke the actual binary (`cargo build` artifact or `env!("CARGO_BIN_EXE_nemoclaw")`) rather than calling library functions directly: these are true e2e tests that should exercise the full CLI entrypoint.
- Use `assert_cmd` (or `std::process::Command` + helpers) for ergonomic subprocess assertions.
- Use `tempfile` (already a dev-dependency) for temp directories and cleanup via RAII/Drop.
- Port all test scenarios faithfully. Each bash test has specific edge cases that must be preserved:
  - Custom image: Dockerfile build, marker file verification in sandbox output
  - Sync: nested directories, single-file mode, large file (~512 KiB) with SHA-256 checksum + size verification, multi-chunk ordering
  - Port forward: background process management, TCP echo server, retry logic for tunnel readiness
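The port-forward scenario's retry logic and TCP echo check could be sketched with the standard library alone, roughly as below. All function names here are hypothetical, and the sketch runs the echo server in-process on loopback; in the real test the client would presumably connect to the locally forwarded port instead.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Start a background TCP echo server on an ephemeral loopback port.
pub fn spawn_echo_server() -> std::io::Result<SocketAddr> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    thread::spawn(move || {
        for conn in listener.incoming() {
            let mut stream = match conn {
                Ok(s) => s,
                Err(_) => continue,
            };
            let mut buf = [0u8; 1024];
            // Echo bytes back until the peer closes or an I/O error occurs.
            while let Ok(n) = stream.read(&mut buf) {
                if n == 0 || stream.write_all(&buf[..n]).is_err() {
                    break;
                }
            }
        }
    });
    Ok(addr)
}

/// Call `attempt` until it yields `Some`, sleeping `interval` between tries,
/// and give up once `timeout` has elapsed.
pub fn retry_until<T>(
    timeout: Duration,
    interval: Duration,
    mut attempt: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(value) = attempt() {
            return Some(value);
        }
        if Instant::now() >= deadline {
            return None;
        }
        thread::sleep(interval);
    }
}

/// Send `payload` and read back exactly as many bytes; `None` on any failure.
pub fn echo_roundtrip(addr: SocketAddr, payload: &[u8]) -> Option<Vec<u8>> {
    let mut stream = TcpStream::connect_timeout(&addr, Duration::from_secs(1)).ok()?;
    stream.set_read_timeout(Some(Duration::from_secs(1))).ok()?;
    stream.write_all(payload).ok()?;
    let mut reply = vec![0u8; payload.len()];
    stream.read_exact(&mut reply).ok()?;
    Some(reply)
}
```

Returning `Option` from each attempt lets the same `retry_until` helper cover both "tunnel not listening yet" (connect fails) and "tunnel up but not relaying yet" (read times out) without distinguishing the two.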
Tasks
- Add `assert_cmd` as a dev-dependency for `navigator-cli`
- Create `crates/navigator-cli/tests/e2e_custom_image.rs`: port `test_sandbox_custom_image.sh`
- Create `crates/navigator-cli/tests/e2e_sync.rs`: port `test_sandbox_sync.sh` (all 5 steps)
- Create `crates/navigator-cli/tests/e2e_port_forward.rs`: port `test_port_forward.sh`
- Extract shared helpers (binary resolution, ANSI stripping, sandbox cleanup) into a shared test utility module (e.g., `crates/navigator-cli/tests/common/mod.rs`)
- Update `tasks/test.toml`: replace bash task definitions (`test:e2e:custom-image`, `test:e2e:sync`, `test:e2e:port-forward`) to run the new Rust tests (e.g., via `cargo test --test e2e_*`)
- Delete `e2e/bash/` directory and all 3 bash scripts
- Verify all new tests pass against a running cluster (`mise run e2e`)
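For the shared helper module, an ANSI-stripping function equivalent to the bash `strip_ansi()` could be written without extra dependencies. This is only a sketch: it handles CSI sequences (`ESC [ ... final-byte`), which cover colors and cursor movement, and it is an assumption that this is all the CLI output contains.

```rust
/// Remove ANSI CSI escape sequences (ESC '[' ... final byte) from `input`.
/// Other escape types (e.g. OSC) are left untouched in this sketch.
pub fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{1b}' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip parameter/intermediate bytes; a byte in 0x40..=0x7E
            // terminates the sequence.
            for p in chars.by_ref() {
                if ('\u{40}'..='\u{7e}').contains(&p) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    out
}
```

Each `e2e_*.rs` file would pull this in with `mod common;`, which is the usual Cargo layout for sharing code between integration test binaries without creating an extra test target.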
Acceptance Criteria
- All 3 bash e2e test scripts are deleted
- Equivalent Rust integration tests exist that invoke the `nemoclaw` binary as a subprocess
- All test scenarios from the bash scripts are faithfully ported (no coverage regression)
- Shared test helpers are extracted to avoid duplication
- `tasks/test.toml` is updated so `mise run` tasks point to the new Rust tests
- All new tests pass in CI