feat: /auto-test — auto-run tests after code changes#18
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughAdds an auto-test feature: TUI toggle detects a project test command, stores it on the runtime, and ConversationRuntime runs the test command after write-like tool actions, emitting test events and injecting failures into the conversation loop. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant TUI as TUI Handler
participant Runtime as ConversationRuntime
participant Shell as Test Shell
participant LLM
User->>TUI: /auto-test
TUI->>TUI: detect_test_command()
alt command detected
TUI->>Runtime: toggle auto_test_cmd = Some(cmd)
TUI-->>User: output_lines status (ON)
else none detected
TUI-->>User: output_lines status (no test runner)
end
User->>Runtime: trigger run_turn (includes write_file/edit_file)
Runtime->>Runtime: detect write-like tool execution
alt auto_test_cmd set
Runtime->>Shell: sh -c <auto_test_cmd>
Shell-->>Runtime: exit code + stdout + stderr
alt exit code == 0
Runtime->>Runtime: emit ToolOutputDelta (test pass)
else
Runtime->>Runtime: emit ToolOutputDelta (test fail)
Runtime->>Runtime: push formatted failure as user message
Runtime->>LLM: LLM receives failure for next response
end
end
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Toggle with /auto-test. Detects test framework (cargo, npm, pytest, go, make). After write_file/edit_file, runs tests automatically. If tests fail, feeds error output back to LLM to fix.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@mc/crates/mc-cli/src/main.rs`:
- Around line 1323-1337: The function detect_test_command has formatting issues
causing cargo fmt failures and duplicates logic found in cmd_test
(mc-tui/src/commands.rs); fix by running rustfmt/`cargo fmt` to correct
spacing/formatting in detect_test_command, and remove duplication by extracting
the file-detection + command-construction logic into a shared helper (e.g., a
new function like detect_test_command_impl) or by reusing the existing cmd_test
implementation from the TUI crate; update detect_test_command to call that
shared helper and ensure it checks the same files (Cargo.toml, mc/Cargo.toml,
package.json, pytest.ini, setup.py, pyproject.toml, go.mod, Makefile) and
returns the same command strings.
In `@mc/crates/mc-core/src/runtime.rs`:
- Around line 456-484: The auto-test block incorrectly uses a cumulative
tool_calls list to compute had_writes (causing tests to rerun on later read-only
iterations); change the check to compute had_writes from only the
current-iteration tool calls (inspect where tool_calls is appended/produced and
compute had_writes from that per-iteration collection, referencing had_writes
and tool_calls in this function), and wrap the
tokio::process::Command::new(...).output().await call in a tokio::time::timeout
to enforce a configurable timeout, handling the Ok(Ok(output)) success,
Ok(Err(e)) spawn errors (log/warn via tracing::warn) and Err(_) timeout case
(emit ProviderEvent::ToolOutputDelta "⏱️ Tests timed out" and avoid blocking),
and fix the formatting/brace style in the failure message assembly (the
multi-branch if !stderr.is_empty() block) so it passes cargo fmt.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6d0e0bf6-7e01-44d9-bb20-27d8d920d005
📒 Files selected for processing (4)
mc/crates/mc-cli/src/main.rsmc/crates/mc-core/src/runtime.rsmc/crates/mc-tui/src/app.rsmc/crates/mc-tui/src/commands.rs
|
|
||
| // Auto-test: run tests after write tools, feed failures back to LLM | ||
| if let Some(ref test_cmd) = self.auto_test_cmd { | ||
| let had_writes = tool_calls.iter().any(|t| matches!(t.as_str(), "write_file" | "edit_file" | "batch_edit" | "apply_patch")); | ||
| if had_writes { | ||
| on_event(&ProviderEvent::ToolOutputDelta("\n🧪 Running tests...\n".into())); | ||
| if let Ok(output) = tokio::process::Command::new("sh") | ||
| .arg("-c") | ||
| .arg(test_cmd) | ||
| .output() | ||
| .await | ||
| { | ||
| let stdout = String::from_utf8_lossy(&output.stdout); | ||
| let stderr = String::from_utf8_lossy(&output.stderr); | ||
| if !output.status.success() { | ||
| let fail_msg = format!( | ||
| "Tests failed after code changes. Fix the errors:\n```\n{}{}\n```", | ||
| &stdout[..stdout.len().min(2000)], | ||
| if !stderr.is_empty() { format!("\nSTDERR:\n{}", &stderr[..stderr.len().min(500)]) } else { String::new() } | ||
| ); | ||
| on_event(&ProviderEvent::ToolOutputDelta("❌ Tests failed\n".into())); | ||
| self.session.messages.push(ConversationMessage::user(&fail_msg)); | ||
| // Continue the loop — LLM will see the failure and try to fix | ||
| continue; | ||
| } | ||
| on_event(&ProviderEvent::ToolOutputDelta("✅ Tests passed\n".into())); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Fix formatting and the cumulative had_writes check that causes repeated test runs.
Three issues:
-
Bug:
had_writesiterates over the cumulativetool_callslist across all iterations. If a write occurred in iteration 1 and tests failed, then in iteration 2 the LLM does only reads (e.g., to understand the error), tests will re-run because the old write tool name is still intool_calls. This can cause redundant test runs or loops untilMAX_ITERATIONS. -
Missing timeout: The test command could hang indefinitely, blocking the entire turn. Consider adding a timeout.
-
Formatting: CI reports
cargo fmtfailures on these lines.
🐛 Proposed fix to track writes per iteration and add timeout
+ // Track whether writes occurred in THIS iteration
+ let iteration_writes: Vec<&str> = batch_results.iter()
+ .map(|r| r.name.as_str())
+ .chain(sequential.iter().map(|(_, name, _)| name.as_str()))
+ .filter(|n| matches!(*n, "write_file" | "edit_file" | "batch_edit" | "apply_patch"))
+ .collect();
// Auto-test: run tests after write tools, feed failures back to LLM
if let Some(ref test_cmd) = self.auto_test_cmd {
- let had_writes = tool_calls.iter().any(|t| matches!(t.as_str(), "write_file" | "edit_file" | "batch_edit" | "apply_patch"));
+ let had_writes = !iteration_writes.is_empty();
if had_writes {
- on_event(&ProviderEvent::ToolOutputDelta("\n🧪 Running tests...\n".into()));
- if let Ok(output) = tokio::process::Command::new("sh")
+ on_event(&ProviderEvent::ToolOutputDelta(
+ "\n🧪 Running tests...\n".into(),
+ ));
+ let test_result = tokio::time::timeout(
+ std::time::Duration::from_secs(120),
+ tokio::process::Command::new("sh")
.arg("-c")
.arg(test_cmd)
- .output()
- .await
- {
+ .output(),
+ )
+ .await;
+ match test_result {
+ Ok(Ok(output)) => {
let stdout = String::from_utf8_lossy(&output.stdout);
let stderr = String::from_utf8_lossy(&output.stderr);
if !output.status.success() {
let fail_msg = format!(
"Tests failed after code changes. Fix the errors:\n```\n{}{}\n```",
&stdout[..stdout.len().min(2000)],
- if !stderr.is_empty() { format!("\nSTDERR:\n{}", &stderr[..stderr.len().min(500)]) } else { String::new() }
+ if !stderr.is_empty() {
+ format!("\nSTDERR:\n{}", &stderr[..stderr.len().min(500)])
+ } else {
+ String::new()
+ }
);
on_event(&ProviderEvent::ToolOutputDelta("❌ Tests failed\n".into()));
self.session.messages.push(ConversationMessage::user(&fail_msg));
- // Continue the loop — LLM will see the failure and try to fix
continue;
}
on_event(&ProviderEvent::ToolOutputDelta("✅ Tests passed\n".into()));
}
+ Ok(Err(e)) => {
+ tracing::warn!("test command failed to spawn: {e}");
+ }
+ Err(_) => {
+ on_event(&ProviderEvent::ToolOutputDelta("⏱️ Tests timed out\n".into()));
+ }
+ }
}
}🧰 Tools
🪛 GitHub Actions: CI
[error] 456-458: cargo fmt --all -- --check failed due to formatting differences around matches! and had_writes computation.
[error] 456-462: cargo fmt --all -- --check failed due to formatting differences in ProviderEvent::ToolOutputDelta argument formatting.
[error] 471-478: cargo fmt --all -- --check failed due to formatting differences in fail_msg construction and push() call formatting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@mc/crates/mc-core/src/runtime.rs` around lines 456 - 484, The auto-test block
incorrectly uses a cumulative tool_calls list to compute had_writes (causing
tests to rerun on later read-only iterations); change the check to compute
had_writes from only the current-iteration tool calls (inspect where tool_calls
is appended/produced and compute had_writes from that per-iteration collection,
referencing had_writes and tool_calls in this function), and wrap the
tokio::process::Command::new(...).output().await call in a tokio::time::timeout
to enforce a configurable timeout, handling the Ok(Ok(output)) success,
Ok(Err(e)) spawn errors (log/warn via tracing::warn) and Err(_) timeout case
(emit ProviderEvent::ToolOutputDelta "⏱️ Tests timed out" and avoid blocking),
and fix the formatting/brace style in the failure message assembly (the
multi-branch if !stderr.is_empty() block) so it passes cargo fmt.
beff11b to
430dd1b
Compare
What
New
/auto-testtoggle. When enabled, automatically runs tests after any write tool (write_file, edit_file, batch_edit, apply_patch). If tests fail, the error output is fed back to the LLM so it can fix the code.How it works
/auto-test— detects test framework and enables auto-testSupported frameworks
Changes
mc-core/src/runtime.rs: Addauto_test_cmdfield, post-write test execution with retry loopmc-tui/src/app.rs: AddAutoTestTogglecommandmc-tui/src/commands.rs: Add/auto-testcommandmc-cli/src/main.rs: Handle toggle +detect_test_command()helper152 tests pass.
Summary by CodeRabbit
/auto-testcommand to toggle automatic test execution.