feat: /auto-test — auto-run tests after code changes#18

Merged
kienbui1995 merged 1 commit into main from feat/auto-test on Apr 11, 2026

Conversation

@kienbui1995 kienbui1995 commented Apr 11, 2026

What

New /auto-test toggle. When enabled, automatically runs tests after any write tool (write_file, edit_file, batch_edit, apply_patch). If tests fail, the error output is fed back to the LLM so it can fix the code.

How it works

  1. /auto-test — detects test framework and enables auto-test
  2. After write tools execute, test command runs automatically
  3. ✅ Tests pass → continue normally
  4. ❌ Tests fail → error injected as user message → LLM retries fix
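The pass/fail loop above can be sketched as follows. This is a minimal, synchronous sketch only: `run_auto_test` is a hypothetical helper, while the real logic lives inside `ConversationRuntime::run_turn`, is async, and emits provider events rather than returning a string.

```rust
use std::process::Command;

/// Hypothetical helper sketching the auto-test hook: run the detected
/// test command through the shell, returning None on success or the
/// captured failure output to inject back into the conversation.
fn run_auto_test(test_cmd: &str) -> Option<String> {
    let output = Command::new("sh").arg("-c").arg(test_cmd).output().ok()?;
    if output.status.success() {
        return None; // ✅ tests passed → continue normally
    }
    // ❌ tests failed → format the output for the LLM to act on
    let stdout = String::from_utf8_lossy(&output.stdout);
    let stderr = String::from_utf8_lossy(&output.stderr);
    Some(format!(
        "Tests failed after code changes. Fix the errors:\n{stdout}\nSTDERR:\n{stderr}"
    ))
}

fn main() {
    // A failing command produces a message the model can react to.
    if let Some(msg) = run_auto_test("echo assertion failed; exit 1") {
        println!("{msg}");
    }
}
```

The merged version additionally truncates stdout/stderr (roughly 2000/500 bytes) before injecting the message, and the review thread suggests wrapping the command in a timeout so a hanging test suite cannot block the turn.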

Supported frameworks

  • Rust (cargo test)
  • Node.js (npm test)
  • Python (pytest)
  • Go (go test)
  • Make (make test)
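A plausible shape for the detection logic, shown as a sketch only: the signature here is hypothetical (taking a directory for testability), and the real `detect_test_command()` in `mc-cli/src/main.rs` probes additional files such as `mc/Cargo.toml` and `setup.py` per the review thread.

```rust
use std::path::Path;

/// Sketch of test-runner detection: first marker file found wins.
/// (Hypothetical helper; probe order and file list are assumptions.)
fn detect_test_command_in(dir: &Path) -> Option<&'static str> {
    const PROBES: &[(&str, &str)] = &[
        ("Cargo.toml", "cargo test"),
        ("package.json", "npm test"),
        ("pytest.ini", "pytest"),
        ("pyproject.toml", "pytest"),
        ("go.mod", "go test"),
        ("Makefile", "make test"),
    ];
    PROBES
        .iter()
        .find(|(marker, _)| dir.join(marker).exists())
        .map(|&(_, cmd)| cmd)
}

fn main() {
    // In the real CLI this runs against the project root.
    println!("{:?}", detect_test_command_in(Path::new(".")));
}
```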

Changes

  • mc-core/src/runtime.rs: Add auto_test_cmd field, post-write test execution with retry loop
  • mc-tui/src/app.rs: Add AutoTestToggle command
  • mc-tui/src/commands.rs: Add /auto-test command
  • mc-cli/src/main.rs: Handle toggle + detect_test_command() helper

152 tests pass.

Summary by CodeRabbit

  • New Features
    • Added /auto-test command to toggle automatic test execution.
    • Automatic detection of project test runners for Rust, Node.js, Python, Go, and Make.
    • Tests run automatically after code-modifying actions, with pass/fail results surfaced.
    • Test failures are captured and shown for review to help iterative fixes.


coderabbitai bot commented Apr 11, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d5595065-7df6-49d2-baa1-4415fa93187f

📥 Commits

Reviewing files that changed from the base of the PR and between beff11b and 430dd1b.

📒 Files selected for processing (4)
  • mc/crates/mc-cli/src/main.rs
  • mc/crates/mc-core/src/runtime.rs
  • mc/crates/mc-tui/src/app.rs
  • mc/crates/mc-tui/src/commands.rs

📝 Walkthrough

Walkthrough

Adds an auto-test feature: TUI toggle detects a project test command, stores it on the runtime, and ConversationRuntime runs the test command after write-like tool actions, emitting test events and injecting failures into the conversation loop.

Changes

Cohort / File(s) Summary
Core Runtime Logic
mc/crates/mc-core/src/runtime.rs
Added pub auto_test_cmd: Option<String> to ConversationRuntime; run_turn now runs the configured test command after write-like tool executions, emits test start/pass/fail events, and pushes formatted failure messages into the conversation on test failures.
CLI TUI Main
mc/crates/mc-cli/src/main.rs
Added handling for PendingCommand::AutoTestToggle using a non-blocking runtime.try_lock() to toggle rt.auto_test_cmd and append status lines to app.output_lines; added detect_test_command() -> Option<String> which probes common project files and returns a shell test command.
TUI Command Surface
mc/crates/mc-tui/src/app.rs, mc/crates/mc-tui/src/commands.rs
Added PendingCommand::AutoTestToggle enum variant and /auto-test slash command that sets app.pending_command = Some(PendingCommand::AutoTestToggle).
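The command-surface wiring can be illustrated with a stripped-down sketch. Only the `/auto-test` command, the `AutoTestToggle` variant, and the `pending_command` field come from the walkthrough; the surrounding types are simplified stand-ins for the real `App` in mc-tui.

```rust
/// Simplified sketch; the real App and PendingCommand in mc-tui
/// carry many more fields and variants.
#[derive(Debug, PartialEq)]
enum PendingCommand {
    AutoTestToggle,
}

#[derive(Default)]
struct App {
    pending_command: Option<PendingCommand>,
}

/// /auto-test only records a pending command; the CLI main loop
/// later resolves it against the runtime, toggling auto_test_cmd.
fn handle_slash_command(app: &mut App, input: &str) {
    if input.trim() == "/auto-test" {
        app.pending_command = Some(PendingCommand::AutoTestToggle);
    }
}

fn main() {
    let mut app = App::default();
    handle_slash_command(&mut app, "/auto-test");
    assert_eq!(app.pending_command, Some(PendingCommand::AutoTestToggle));
}
```

Deferring the actual toggle to the CLI side keeps the TUI free of runtime locking; the main loop uses a non-blocking `try_lock()` when it applies the pending command.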

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant TUI as TUI Handler
    participant Runtime as ConversationRuntime
    participant Shell as Test Shell
    participant LLM

    User->>TUI: /auto-test
    TUI->>TUI: detect_test_command()
    alt command detected
        TUI->>Runtime: toggle auto_test_cmd = Some(cmd)
        TUI-->>User: output_lines status (ON)
    else none detected
        TUI-->>User: output_lines status (no test runner)
    end

    User->>Runtime: trigger run_turn (includes write_file/edit_file)
    Runtime->>Runtime: detect write-like tool execution
    alt auto_test_cmd set
        Runtime->>Shell: sh -c <auto_test_cmd>
        Shell-->>Runtime: exit code + stdout + stderr
        alt exit code == 0
            Runtime->>Runtime: emit ToolOutputDelta (test pass)
        else
            Runtime->>Runtime: emit ToolOutputDelta (test fail)
            Runtime->>Runtime: push formatted failure as user message
            Runtime->>LLM: LLM receives failure for next response
        end
    end

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I sniffed the tests beneath the tree,

I toggled auto-run with glee.
When files are written, I hop and see,
A blink—tests run, then back to me.
Cheers for green, and fixes for red—huzzah for CI!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check: Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: Passed. The PR title clearly describes the main feature: adding an /auto-test command that automatically runs tests after code changes, aligning with the changes across the codebase.
  • Docstring Coverage: Passed. Docstring coverage is 100.00%, above the required threshold of 80.00%.


Toggle with /auto-test. Detects test framework (cargo, npm, pytest, go, make).
After write_file/edit_file, runs tests automatically.
If tests fail, feeds error output back to LLM to fix.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@mc/crates/mc-cli/src/main.rs`:
- Around line 1323-1337: The function detect_test_command has formatting issues
causing cargo fmt failures and duplicates logic found in cmd_test
(mc-tui/src/commands.rs); fix by running rustfmt/`cargo fmt` to correct
spacing/formatting in detect_test_command, and remove duplication by extracting
the file-detection + command-construction logic into a shared helper (e.g., a
new function like detect_test_command_impl) or by reusing the existing cmd_test
implementation from the TUI crate; update detect_test_command to call that
shared helper and ensure it checks the same files (Cargo.toml, mc/Cargo.toml,
package.json, pytest.ini, setup.py, pyproject.toml, go.mod, Makefile) and
returns the same command strings.

In `@mc/crates/mc-core/src/runtime.rs`:
- Around line 456-484: The auto-test block incorrectly uses a cumulative
tool_calls list to compute had_writes (causing tests to rerun on later read-only
iterations); change the check to compute had_writes from only the
current-iteration tool calls (inspect where tool_calls is appended/produced and
compute had_writes from that per-iteration collection, referencing had_writes
and tool_calls in this function), and wrap the
tokio::process::Command::new(...).output().await call in a tokio::time::timeout
to enforce a configurable timeout, handling the Ok(Ok(output)) success,
Ok(Err(e)) spawn errors (log/warn via tracing::warn) and Err(_) timeout case
(emit ProviderEvent::ToolOutputDelta "⏱️ Tests timed out" and avoid blocking),
and fix the formatting/brace style in the failure message assembly (the
multi-branch if !stderr.is_empty() block) so it passes cargo fmt.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6d0e0bf6-7e01-44d9-bb20-27d8d920d005

📥 Commits

Reviewing files that changed from the base of the PR and between 64a0298 and beff11b.

📒 Files selected for processing (4)
  • mc/crates/mc-cli/src/main.rs
  • mc/crates/mc-core/src/runtime.rs
  • mc/crates/mc-tui/src/app.rs
  • mc/crates/mc-tui/src/commands.rs

Comment on lines +456 to +484

// Auto-test: run tests after write tools, feed failures back to LLM
if let Some(ref test_cmd) = self.auto_test_cmd {
let had_writes = tool_calls.iter().any(|t| matches!(t.as_str(), "write_file" | "edit_file" | "batch_edit" | "apply_patch"));
if had_writes {
on_event(&ProviderEvent::ToolOutputDelta("\n🧪 Running tests...\n".into()));
if let Ok(output) = tokio::process::Command::new("sh")
.arg("-c")
.arg(test_cmd)
.output()
.await
{
let stdout = String::from_utf8_lossy(&output.stdout);
let stderr = String::from_utf8_lossy(&output.stderr);
if !output.status.success() {
let fail_msg = format!(
"Tests failed after code changes. Fix the errors:\n```\n{}{}\n```",
&stdout[..stdout.len().min(2000)],
if !stderr.is_empty() { format!("\nSTDERR:\n{}", &stderr[..stderr.len().min(500)]) } else { String::new() }
);
on_event(&ProviderEvent::ToolOutputDelta("❌ Tests failed\n".into()));
self.session.messages.push(ConversationMessage::user(&fail_msg));
// Continue the loop — LLM will see the failure and try to fix
continue;
}
on_event(&ProviderEvent::ToolOutputDelta("✅ Tests passed\n".into()));
}
}
}

⚠️ Potential issue | 🟠 Major

Fix formatting and the cumulative had_writes check that causes repeated test runs.

Three issues:

  1. Bug: had_writes iterates over the cumulative tool_calls list across all iterations. If a write occurred in iteration 1 and tests failed, then in iteration 2 the LLM does only reads (e.g., to understand the error), tests will re-run because the old write tool name is still in tool_calls. This can cause redundant test runs or loops until MAX_ITERATIONS.

  2. Missing timeout: The test command could hang indefinitely, blocking the entire turn. Consider adding a timeout.

  3. Formatting: CI reports cargo fmt failures on these lines.

🐛 Proposed fix to track writes per iteration and add timeout
+            // Track whether writes occurred in THIS iteration
+            let iteration_writes: Vec<&str> = batch_results.iter()
+                .map(|r| r.name.as_str())
+                .chain(sequential.iter().map(|(_, name, _)| name.as_str()))
+                .filter(|n| matches!(*n, "write_file" | "edit_file" | "batch_edit" | "apply_patch"))
+                .collect();

             // Auto-test: run tests after write tools, feed failures back to LLM
             if let Some(ref test_cmd) = self.auto_test_cmd {
-                let had_writes = tool_calls.iter().any(|t| matches!(t.as_str(), "write_file" | "edit_file" | "batch_edit" | "apply_patch"));
+                let had_writes = !iteration_writes.is_empty();
                 if had_writes {
-                    on_event(&ProviderEvent::ToolOutputDelta("\n🧪 Running tests...\n".into()));
-                    if let Ok(output) = tokio::process::Command::new("sh")
+                    on_event(&ProviderEvent::ToolOutputDelta(
+                        "\n🧪 Running tests...\n".into(),
+                    ));
+                    let test_result = tokio::time::timeout(
+                        std::time::Duration::from_secs(120),
+                        tokio::process::Command::new("sh")
                             .arg("-c")
                             .arg(test_cmd)
-                            .output()
-                            .await
-                    {
+                            .output(),
+                    )
+                    .await;
+                    match test_result {
+                        Ok(Ok(output)) => {
                             let stdout = String::from_utf8_lossy(&output.stdout);
                             let stderr = String::from_utf8_lossy(&output.stderr);
                             if !output.status.success() {
                                 let fail_msg = format!(
                                     "Tests failed after code changes. Fix the errors:\n```\n{}{}\n```",
                                     &stdout[..stdout.len().min(2000)],
-                                    if !stderr.is_empty() { format!("\nSTDERR:\n{}", &stderr[..stderr.len().min(500)]) } else { String::new() }
+                                    if !stderr.is_empty() {
+                                        format!("\nSTDERR:\n{}", &stderr[..stderr.len().min(500)])
+                                    } else {
+                                        String::new()
+                                    }
                                 );
                                 on_event(&ProviderEvent::ToolOutputDelta("❌ Tests failed\n".into()));
                                 self.session.messages.push(ConversationMessage::user(&fail_msg));
-                                // Continue the loop — LLM will see the failure and try to fix
                                 continue;
                             }
                             on_event(&ProviderEvent::ToolOutputDelta("✅ Tests passed\n".into()));
                         }
+                        Ok(Err(e)) => {
+                            tracing::warn!("test command failed to spawn: {e}");
+                        }
+                        Err(_) => {
+                            on_event(&ProviderEvent::ToolOutputDelta("⏱️ Tests timed out\n".into()));
+                        }
+                    }
                 }
             }
🧰 Tools
🪛 GitHub Actions: CI

[error] 456-458: cargo fmt --all -- --check failed due to formatting differences around matches! and had_writes computation.


[error] 456-462: cargo fmt --all -- --check failed due to formatting differences in ProviderEvent::ToolOutputDelta argument formatting.


[error] 471-478: cargo fmt --all -- --check failed due to formatting differences in fail_msg construction and push() call formatting.


@kienbui1995 kienbui1995 merged commit 24db3a4 into main Apr 11, 2026
3 of 5 checks passed
@kienbui1995 kienbui1995 deleted the feat/auto-test branch April 11, 2026 09:29
@kienbui1995 kienbui1995 mentioned this pull request Apr 11, 2026