Skip to content

Migrate subprocess spawns to running-process + regression linter#200

Merged
zackees merged 1 commit intomainfrom
migrate-to-running-process
Apr 24, 2026
Merged

Migrate subprocess spawns to running-process + regression linter#200
zackees merged 1 commit intomainfrom
migrate-to-running-process

Conversation

@zackees
Copy link
Copy Markdown
Member

@zackees zackees commented Apr 24, 2026

Summary

Closes #141. Routes every fbuild subprocess through
fbuild_core::subprocess::{run_command, run_command_passthrough},
backed by running_process_core::NativeProcess, which drains stdout
and stderr concurrently from the moment the child starts.

This eliminates the pipe-buffer deadlock class that wrecked ESP32-C6
CI after the containment rewrite (#108) — and prevents it from coming
back: ci/find_direct_subprocess.py is promoted to a CI-enforced
linter that fails the build on any new direct Command::new spawn
without an // allow-direct-spawn: <reason> marker.

What moved

Category Sites Migrated via
Wrapper internals subprocess.rs rewrote to NativeProcess
Build helpers compiler_version, script_runtime, find_python run_command
CLI external tools find_pio, run_pio_command, open_in_browser, kill_process, find_daemon_pids run_command / run_command_passthrough

Intentional hold-outs (annotated)

  • Daemon spawns from CLI / Python (must outlive parent)
  • zccache bootstrap (independent lifecycle)
  • Containment module's own regression tests
  • Integration-test drivers that spawn binaries under test
  • tokio-async streaming emulator handlers (QEMU, avr8js/node) — the
    synchronous NativeProcess API is a poor fit for tokio::select
    over stdout+stderr+wait+timeout; tracked separately
  • tokio parallel async fan-out in the CLI (IWYU, clang-tidy) inside
    a process with no daemon containment group

21 sites allowlisted, 0 to migrate; uv run python ci/find_direct_subprocess.py --fail exits 0.

Linter

ci/find_direct_subprocess.py now lives as a first-class regression
guard:

  • --fail flag for CI (returns 1 on any unmarked spawn)
  • Wired into .github/workflows/check-ubuntu.yml alongside clippy
  • Dedicated .github/workflows/lint-subprocess.yml for
    defence-in-depth

Test plan

  • uv run soldr cargo check --workspace --all-targets — passes
  • uv run soldr cargo clippy --workspace --all-targets -- -D warnings — 0 errors
  • uv run soldr cargo test -p fbuild-core — 81 passed, incl. new subprocess::tests::{run_echo, run_captures_stderr, run_nonexistent_command, run_empty_args}
  • uv run soldr cargo test -p fbuild-build --lib — 498 passed
  • uv run python ci/find_direct_subprocess.py --fail — exits 0
  • New Ubuntu workflow "Lint subprocess spawns" runs green in CI

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added subprocess spawning linter to CI workflows to enforce controlled direct process execution.
  • Improvements

    • Standardized subprocess execution across codebase using unified subprocess handling for improved consistency and robustness.
    • Added run_command_passthrough function for seamless passthrough process integration with inherited I/O streams.

Closes #141. Every fbuild subprocess now flows through
`fbuild_core::subprocess::{run_command,run_command_passthrough}`,
backed by `running_process_core::NativeProcess`, which drains stdout
and stderr concurrently from the moment the child starts. This
eliminates the class of pipe-buffer deadlocks that broke ESP32-C6 CI
after the hand-rolled drain in the containment rewrite (#108).

Migrated sites:
- fbuild-build: compiler_version, script_runtime harness + find_python
- fbuild-cli: find_pio, run_pio_command (via run_command_passthrough),
  open_in_browser, kill_process, find_daemon_pids

Intentional hold-outs annotated with `// allow-direct-spawn: <reason>`:
- zccache bootstrap and daemon spawns from CLI / Python (must outlive
  parent)
- containment module's own regression tests
- integration-test drivers that spawn binaries under test
- tokio async streaming emulator handlers (QEMU, avr8js/node) where
  the blocking NativeProcess API is unsuitable
- tokio parallel async fan-out in the CLI (IWYU, clang-tidy)

`ci/find_direct_subprocess.py` is promoted from a migration tracker
to a CI-enforced linter (`--fail`), wired into the Ubuntu check
workflow and a dedicated `lint-subprocess.yml`. Future direct
`Command::new` spawns without an allow-marker break the build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 24, 2026

📝 Walkthrough

Walkthrough

This PR migrates subprocess spawning across fbuild from direct std::process::Command and tokio::process::Command usage to a centralized fbuild_core::subprocess module backed by running-process-core::NativeProcess. CI enforcement is added via a Python linter script, core execution paths are refactored, and intentional exceptions are documented inline.

Changes

Cohort / File(s) Summary
CI Workflows
.github/workflows/check-ubuntu.yml, .github/workflows/lint-subprocess.yml
New CI workflows that execute Python linter script to enforce subprocess spawning policy; step added to ubuntu check, standalone lint-subprocess workflow triggers on main pushes and pull requests.
Linter Script
ci/find_direct_subprocess.py
Updated documentation to clarify disallowance of direct std::process::Command::new / tokio::process::Command::new outside enumerated hold-outs; adjusted output messaging from generic migration guidance to explicit "NEW direct spawns" error with routing instructions.
Core Subprocess Refactoring
crates/fbuild-core/src/subprocess.rs
Replaced manual concurrent stdout/stderr draining with running_process_core::NativeProcess + ProcessConfig-driven execution; run_command now constructs process config with timeout kill behavior and returns ToolOutput from captured pipes; added new public function run_command_passthrough for non-captured execution with exit code return; unified environment and containment handling.
Build Script Runtime Migration
crates/fbuild-build/src/script_runtime.rs
Refactored extra_scripts execution from manual Command + spawn_contained to run_command with argv built from Python executable plus harness/input paths; Python discovery (--version probing) switched to run_command; output parsing updated to use serde_json::from_str rather than slice-based approach.
Compiler & Build Tool Migrations
crates/fbuild-build/src/compiler.rs, crates/fbuild-build/src/zccache.rs
compiler_version switched from direct Command to run_command with updated success/output semantics; single comment added to zccache::ensure_running documenting intentional direct spawn hold-out.
CLI Tool Migration
crates/fbuild-cli/src/main.rs, crates/fbuild-cli/src/daemon_client.rs
Standardized process launching through fbuild_core::subprocess for PlatformIO passthrough, browser launching, and daemon discovery/termination; find_pio and tool-finding functions now use run_command; run_pio_command uses run_command_passthrough; comments document intentional daemon spawn hold-outs.
Emulator & Containment Hold-outs
crates/fbuild-daemon/src/bin/containment_harness.rs, crates/fbuild-daemon/src/handlers/emulator.rs, crates/fbuild-daemon/tests/port_recovery.rs, crates/fbuild-daemon/tests/process_containment.rs, crates/fbuild-python/src/lib.rs
Inline allow-direct-spawn comments added to document intentional direct spawn sites (daemon bootstrap, emulator async streaming, test harnesses); no functional logic changes.
Test & Documentation Comments
crates/fbuild-build/tests/zccache_hit_across_workspace_rename.rs, crates/fbuild-cli/tests/test_emu_exit_code.rs, crates/fbuild-core/src/containment.rs
Single-line clarifying comments added to document test-specific direct spawn rationale; no executable logic or control flow modifications.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as fbuild Code
    participant Sub as fbuild_core::subprocess
    participant Config as ProcessConfig
    participant NProc as NativeProcess
    participant OS as OS Process & Pipes

    Caller->>Sub: run_command(args, cwd, env, timeout)
    Sub->>Config: build_config(args, timeout, env)
    Config->>Config: compute_env() / apply containment
    Sub->>NProc: new(ProcessConfig)
    NProc->>OS: spawn process w/ piped stdout/stderr
    OS-->>NProc: process handle + pipes
    NProc->>OS: execute & capture output
    OS-->>NProc: stdout/stderr chunks
    NProc->>NProc: join_lines(captured_stdout/stderr)
    NProc-->>Sub: ProcessResult
    Sub->>Sub: ToolOutput { stdout, stderr, success, exit_code }
    Sub-->>Caller: Result<ToolOutput>
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Poem

🐰 Hop hop, let processes flow!
No more raw Command, watch spawning grow—
Through NativeProcess pipes run clean,
drains dead, no more deadlock scene!
Each daemon knows: stay alive, little daemon—
The migration's complete, fbuild's much saner! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: migrating subprocess spawns to running-process and introducing a regression linter.
Linked Issues check ✅ Passed The PR comprehensively addresses issue #141 by migrating all subprocess spawns to running-process-based implementations, enforcing migration through a CI linter, and annotating intentional hold-outs.
Out of Scope Changes check ✅ Passed All changes are directly scoped to subprocess migration objectives: refactoring spawns to run_command/run_command_passthrough, adding the linter, annotating hold-outs, and integrating CI checks.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch migrate-to-running-process

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
crates/fbuild-core/src/subprocess.rs (2)

316-373: Add a smoke test for run_command_passthrough.

New public API with no direct coverage. Even without asserting captured output (by design it doesn't capture), a test confirming exit-code propagation and empty-args rejection would catch regressions in the shared build_config wiring:

🧪 Suggested tests
     #[test]
     fn run_captures_stderr() {
         ...
     }
+
+    #[test]
+    fn passthrough_returns_exit_code() {
+        let args = if cfg!(windows) {
+            vec!["cmd", "/C", "exit 3"]
+        } else {
+            vec!["sh", "-c", "exit 3"]
+        };
+        let code = run_command_passthrough(&args, None, None, None).unwrap();
+        assert_eq!(code, 3);
+    }
+
+    #[test]
+    fn passthrough_rejects_empty_args() {
+        let result = run_command_passthrough(&[], None, None, None);
+        assert!(result.is_err());
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/fbuild-core/src/subprocess.rs` around lines 316 - 373, Add a smoke
test module for the new public API run_command_passthrough that mirrors existing
run_command tests: call run_command_passthrough with a simple command that exits
successfully (e.g., "echo hello" or platform-appropriate equivalent) and assert
the returned result indicates success/propagated exit code; also call
run_command_passthrough with an empty args slice and assert it returns an Err to
catch empty-args rejection; reference the run_command_passthrough symbol and
reuse the test patterns from run_command (including platform branching and use
of ToolOutput-like checks) to ensure shared build_config wiring is exercised.

68-82: Duplication of wait/timeout/kill logic between run_command_passthrough and run_captured.

The two match process.wait(...) arms format identical error messages and both call process.kill() on timeout. Extracting a shared helper keeps error wording in sync and shrinks the module:

♻️ Suggested extraction
fn wait_or_kill(process: &NativeProcess, args: &[&str], timeout: Option<Duration>) -> Result<i32> {
    match process.wait(timeout) {
        Ok(code) => Ok(code),
        Err(ProcessError::Timeout) => {
            let _ = process.kill();
            Err(FbuildError::Timeout(format!(
                "command timed out after {}s",
                timeout.map(|d| d.as_secs()).unwrap_or(0)
            )))
        }
        Err(e) => Err(FbuildError::Other(format!("command {:?} failed: {}", args, e))),
    }
}

Both callers then become let exit_code = wait_or_kill(&process, args, timeout)?;.

Also applies to: 91-106

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/fbuild-core/src/subprocess.rs` around lines 68 - 82, Duplicate
wait/timeout/kill logic exists in run_command_passthrough and run_captured;
extract a helper (e.g., wait_or_kill) that takes (&NativeProcess, args: &[&str],
timeout: Option<Duration>) and performs the match on process.wait(timeout),
calls process.kill() on ProcessError::Timeout, and returns the same
FbuildError::Timeout or FbuildError::Other messages using args and the timeout
seconds; then replace the duplicated match in run_command_passthrough and
run_captured with a call like let exit_code = wait_or_kill(&process, args,
timeout)?; ensure you reference ProcessError, process.wait, process.kill, and
the existing FbuildError variants to keep behavior identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/check-ubuntu.yml:
- Around line 32-33: The workflow runs the step named "Lint subprocess spawns"
which invokes the `uv` binary (`uv run python ci/find_direct_subprocess.py
--fail`) but never installs it; insert the `astral-sh/setup-uv@v3` action
immediately before the "Lint subprocess spawns" step so `uv` is provisioned (use
the same configuration pattern as `lint-subprocess.yml` to ensure `uv` is
available to run the linter).

In `@ci/find_direct_subprocess.py`:
- Around line 31-37: scan_workspace() currently only globs crates/**/*.rs
(missing build.rs and other Rust files) and _is_doc_or_string() only removes
comments so it still flags occurrences inside string literals; update
scan_workspace() to scan the whole repo for Rust source files (e.g., **/*.rs,
including build.rs, examples/, benches/, tests/) so no Rust file is skipped, and
strengthen _is_doc_or_string() so it ignores matches inside Rust
string/char/raw-string literals (handle ", ', raw strings r#"..."#, and escaped
quotes) by scanning the line with a simple state machine or regex that tracks
whether the current position is inside a string/char/raw-string before treating
a "Command::new(" hit as a violation; keep comment/doc filtering but add the
literal-aware check to avoid false positives.

In `@crates/fbuild-core/src/subprocess.rs`:
- Around line 198-235: In compute_env, avoid adding a trailing semicolon when
there is no existing PATH and prevent the exe-dir you just prepended from being
removed by strip_msys_env: change the PATH construction to check
current_path.is_empty() and use exe_dir_str alone if empty (instead of
format!("{};{}", ...)), and move the exe_dir prepend to after the
is_msys_environment/strip_msys_env block (or alternatively adjust strip_msys_env
to ignore the exact exe_dir_str) so the injected exe_dir value (exe_dir_str) is
not filtered out; update references to env_map, exe_dir_str,
is_msys_environment, and strip_msys_env in compute_env accordingly.
- Around line 59-82: Add a smoke test that invokes run_command_passthrough with
a simple command (e.g., exit with a known code) and asserts the returned i32
matches the child exit code to cover the public API; then refactor the
duplicated wait/timeout/error-handling into a shared helper used by both
run_command_passthrough and run_captured (extract the match on process.wait(...)
that handles Ok(code), Err(ProcessError::Timeout) -> kill +
FbuildError::Timeout, and Err(e) -> FbuildError::Other), referencing
NativeProcess::wait, spawn_err, FbuildError and ProcessError to locate the logic
to move.

---

Nitpick comments:
In `@crates/fbuild-core/src/subprocess.rs`:
- Around line 316-373: Add a smoke test module for the new public API
run_command_passthrough that mirrors existing run_command tests: call
run_command_passthrough with a simple command that exits successfully (e.g.,
"echo hello" or platform-appropriate equivalent) and assert the returned result
indicates success/propagated exit code; also call run_command_passthrough with
an empty args slice and assert it returns an Err to catch empty-args rejection;
reference the run_command_passthrough symbol and reuse the test patterns from
run_command (including platform branching and use of ToolOutput-like checks) to
ensure shared build_config wiring is exercised.
- Around line 68-82: Duplicate wait/timeout/kill logic exists in
run_command_passthrough and run_captured; extract a helper (e.g., wait_or_kill)
that takes (&NativeProcess, args: &[&str], timeout: Option<Duration>) and
performs the match on process.wait(timeout), calls process.kill() on
ProcessError::Timeout, and returns the same FbuildError::Timeout or
FbuildError::Other messages using args and the timeout seconds; then replace the
duplicated match in run_command_passthrough and run_captured with a call like
let exit_code = wait_or_kill(&process, args, timeout)?; ensure you reference
ProcessError, process.wait, process.kill, and the existing FbuildError variants
to keep behavior identical.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e1cc28eb-6c66-4f61-859e-df472bc8a771

📥 Commits

Reviewing files that changed from the base of the PR and between 2aa2e34 and a48d20c.

📒 Files selected for processing (17)
  • .github/workflows/check-ubuntu.yml
  • .github/workflows/lint-subprocess.yml
  • ci/find_direct_subprocess.py
  • crates/fbuild-build/src/compiler.rs
  • crates/fbuild-build/src/script_runtime.rs
  • crates/fbuild-build/src/zccache.rs
  • crates/fbuild-build/tests/zccache_hit_across_workspace_rename.rs
  • crates/fbuild-cli/src/daemon_client.rs
  • crates/fbuild-cli/src/main.rs
  • crates/fbuild-cli/tests/test_emu_exit_code.rs
  • crates/fbuild-core/src/containment.rs
  • crates/fbuild-core/src/subprocess.rs
  • crates/fbuild-daemon/src/bin/containment_harness.rs
  • crates/fbuild-daemon/src/handlers/emulator.rs
  • crates/fbuild-daemon/tests/port_recovery.rs
  • crates/fbuild-daemon/tests/process_containment.rs
  • crates/fbuild-python/src/lib.rs

Comment on lines +32 to +33
- name: Lint subprocess spawns
run: uv run python ci/find_direct_subprocess.py --fail
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Does zackees/setup-soldr@v0installuv, or should GitHub Actions workflows explicitly add astral-sh/setup-uvbefore runninguv run ...?

💡 Result:

No, zackees/setup-soldr@v0 does not install uv. The action installs the soldr binary, provisions Rust via rustup (bootstrapping it if needed), sets up cache roots for Soldr/Cargo/rustup, and puts soldr on PATH. There is no mention of uv (the Python package manager) in its functionality or scripts. The repository zackees/setup-soldr has not been created or released as of 2026-04-24; it is planned for future extraction from zackees/soldr, with @v0 as the planned beta tag. Currently, use the in-repo action from zackees/soldr@ (pin by SHA or tag). To run uv run ... in GitHub Actions, explicitly add astral-sh/setup-uv before using uv, as setup-soldr (even when available) handles Rust/Soldr, not Python tooling.

Citations:


🏁 Script executed:

# Locate and read the workflow file
fd 'check-ubuntu.yml' --type f

# Read the relevant section
cat -n .github/workflows/check-ubuntu.yml | sed -n '1,50p'

Repository: FastLED/fbuild

Length of output: 1131


Add astral-sh/setup-uv before the linter step.

Line 33 runs uv run ... but the job never provisions uv. The zackees/setup-soldr@v0 action handles Rust tooling only, not Python package managers. Add astral-sh/setup-uv@v3 (as lint-subprocess.yml does) before line 32 to fix the workflow failure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/check-ubuntu.yml around lines 32 - 33, The workflow runs
the step named "Lint subprocess spawns" which invokes the `uv` binary (`uv run
python ci/find_direct_subprocess.py --fail`) but never installs it; insert the
`astral-sh/setup-uv@v3` action immediately before the "Lint subprocess spawns"
step so `uv` is provisioned (use the same configuration pattern as
`lint-subprocess.yml` to ensure `uv` is available to run the linter).

Comment on lines +31 to 37
Run in CI with `--fail` so any new direct spawn without a marker
breaks the build.

Usage:
uv run python ci/find_direct_subprocess.py # report
uv run python ci/find_direct_subprocess.py --fail # exit 1 if >0
uv run python ci/find_direct_subprocess.py --fail # exit 1 if any
uv run python ci/find_direct_subprocess.py --json # machine output
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Harden the scanner before making it build-breaking.

With --fail enabled here, the current implementation can both miss real violations and fail on harmless code: scan_workspace() only walks crates/**/*.rs, so direct spawns in files like build.rs or other Rust files outside crates/ bypass the gate, while _is_doc_or_string() only filters comment/doc lines and still matches string literals such as let s = "Command::new(". That makes the new CI check both incomplete and flaky.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ci/find_direct_subprocess.py` around lines 31 - 37, scan_workspace()
currently only globs crates/**/*.rs (missing build.rs and other Rust files) and
_is_doc_or_string() only removes comments so it still flags occurrences inside
string literals; update scan_workspace() to scan the whole repo for Rust source
files (e.g., **/*.rs, including build.rs, examples/, benches/, tests/) so no
Rust file is skipped, and strengthen _is_doc_or_string() so it ignores matches
inside Rust string/char/raw-string literals (handle ", ', raw strings r#"..."#,
and escaped quotes) by scanning the line with a simple state machine or regex
that tracks whether the current position is inside a string/char/raw-string
before treating a "Command::new(" hit as a violation; keep comment/doc filtering
but add the literal-aware check to avoid false positives.

Comment on lines +59 to +82
pub fn run_command_passthrough(
args: &[&str],
cwd: Option<&Path>,
env: Option<&[(&str, &str)]>,
timeout: Option<Duration>,
) -> Result<i32> {
let config = build_config(args, cwd, env, /*capture=*/ false, StdinMode::Inherit)?;
let process = NativeProcess::new(config);
process.start().map_err(|e| spawn_err(args, e))?;
match process.wait(timeout) {
Ok(code) => Ok(code),
Err(ProcessError::Timeout) => {
let _ = process.kill();
Err(FbuildError::Timeout(format!(
"command timed out after {}s",
timeout.map(|d| d.as_secs()).unwrap_or(0)
)))
}
Err(e) => Err(FbuildError::Other(format!(
"command {:?} failed: {}",
args, e
))),
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the StderrMode enum and its usage in running-process-core sources and ProcessConfig consumers.
fd -t f -e rs | xargs rg -nP '\b(StderrMode|capture|captured_stderr)\b' -C3
ast-grep --pattern $'enum StderrMode { $$$ }'

Repository: FastLED/fbuild

Length of output: 14872


🏁 Script executed:

#!/bin/bash
# Find running_process_core in Cargo.toml files and git submodules
echo "=== Searching for running_process_core in Cargo.toml ==="
find . -name "Cargo.toml" -type f | xargs rg -l "running.process.core" 2>/dev/null | head -5

echo "=== Checking git submodules ==="
git config --file .gitmodules --get-regexp path | grep -i process 2>/dev/null || echo "No matching submodules"

echo "=== Checking Cargo.lock for version info ==="
rg "name = \"running.process.core\"" -A 5 Cargo.lock 2>/dev/null || echo "Not found in Cargo.lock"

echo "=== Searching for StderrMode usage patterns in subprocess.rs ==="
rg "StderrMode" crates/fbuild-core/src/subprocess.rs -B 2 -A 2

Repository: FastLED/fbuild

Length of output: 831


🏁 Script executed:

#!/bin/bash
# Look for tests of run_command_passthrough
echo "=== Tests for run_command_passthrough ==="
rg "run_command_passthrough" crates/ --type rs -B 2 -A 5

echo ""
echo "=== Full subprocess.rs to understand design ==="
wc -l crates/fbuild-core/src/subprocess.rs

echo ""
echo "=== Comments about stderr/capture handling ==="
rg "stderr|Pipe|capture" crates/fbuild-core/src/subprocess.rs -B 1 -A 1 | head -50

Repository: FastLED/fbuild

Length of output: 1693


🏁 Script executed:

#!/bin/bash
# Read the subprocess.rs file to understand the full context and design
echo "=== Reading subprocess.rs module docstring and imports ==="
head -40 crates/fbuild-core/src/subprocess.rs

echo ""
echo "=== Looking for actual calls to run_command_passthrough ==="
rg "run_command_passthrough" . --type-list | head -1
rg "run_command_passthrough" crates/ -l 2>/dev/null

echo ""
echo "=== Search for callers across entire codebase ==="
rg "run_command_passthrough" . -B 1 -A 2 2>/dev/null | grep -v ".git" | head -30

Repository: FastLED/fbuild

Length of output: 3011


🏁 Script executed:

#!/bin/bash
# Let me look at the actual usage in main.rs and understand the context better
echo "=== Context around run_command_passthrough call in main.rs ==="
rg "run_command_passthrough" crates/fbuild-cli/src/main.rs -B 10 -A 5

echo ""
echo "=== Check if there are any enum variants or docs for StderrMode in this codebase ==="
rg "enum StderrMode\|StderrMode::" crates/ -B 2 -A 8

echo ""
echo "=== Look at the ProcessConfig struct to understand all fields ==="
rg "struct ProcessConfig\|ProcessConfig {" crates/ -B 2 -A 15 | head -60

Repository: FastLED/fbuild

Length of output: 1048


🏁 Script executed:

#!/bin/bash
# Fix the regex syntax
echo "=== Look at build_config function more carefully ==="
sed -n '117,160p' crates/fbuild-core/src/subprocess.rs

echo ""
echo "=== Look at run_captured to see how it handles captured output ==="
sed -n '85,115p' crates/fbuild-core/src/subprocess.rs

echo ""
echo "=== Look at the rest of subprocess.rs for any other clues ==="
tail -100 crates/fbuild-core/src/subprocess.rs | head -50

Repository: FastLED/fbuild

Length of output: 3802


🏁 Script executed:

#!/bin/bash
# Let me check if there's a way to understand the running_process_core behavior
# by looking at how it's documented or used elsewhere

echo "=== Check for any comments or documentation about StderrMode ==="
rg "StderrMode" crates/ -B 5 -A 2

echo ""
echo "=== Check if there are any other usages of NativeProcess ==="
rg "NativeProcess" crates/ -l

echo ""
echo "=== Check if there's any indication of what happens when capture=false with Pipe ==="
rg "capture.*false\|Pipe" crates/fbuild-core/src/subprocess.rs -B 2 -A 2

Repository: FastLED/fbuild

Length of output: 1414


🏁 Script executed:

#!/bin/bash
# Check the other uses of NativeProcess to understand the pattern
echo "=== Usage in script_runtime.rs ==="
rg "NativeProcess\|StderrMode" crates/fbuild-build/src/script_runtime.rs -B 3 -A 3 | head -40

echo ""
echo "=== Usage in emulator.rs ==="
rg "NativeProcess\|StderrMode" crates/fbuild-daemon/src/handlers/emulator.rs -B 3 -A 3 | head -40

echo ""
echo "=== Search for the GitHub repo to understand running_process_core behavior ==="
echo "The repo is at: https://github.com/zackees/running-process"
echo "Let me check the Cargo.toml for description"
rg "running.process.core" Cargo.toml -A 10

Repository: FastLED/fbuild

Length of output: 468


🏁 Script executed:

#!/bin/bash
# Let me search for web documentation or look at the PR/issue context
echo "=== Check if there's any issue reference or PR context ==="
rg "fbuild#141\|pipe\|stderr" crates/fbuild-core/src/subprocess.rs | head -20

echo ""
echo "=== Let's read the module docstring again to understand the design ==="
head -25 crates/fbuild-core/src/subprocess.rs

echo ""
echo "=== Search in commit messages or comments about the passthrough design ==="
rg "passthrough\|live output\|inherit" crates/fbuild-core/src/subprocess.rs -B 2 -A 2

Repository: FastLED/fbuild

Length of output: 1294


🏁 Script executed:

#!/bin/bash
# Let me try to find documentation or examples about the different StderrMode variants
echo "=== Check Cargo.lock for more details about running-process-core ==="
sed -n '/name = "running-process-core"/,/^$/p' Cargo.lock

echo ""
echo "=== Search the entire codebase for any indication of other StderrMode variants ==="
rg "StderrMode::" . --type rs 2>/dev/null | grep -v ".git"

echo ""
echo "=== Check if there's a README or docs that mention the design ==="
find . -name "README*" -o -name "DESIGN*" | xargs cat 2>/dev/null | grep -i "stderr\|pipe\|capture" | head -10

Repository: FastLED/fbuild

Length of output: 1501


🏁 Script executed:

#!/bin/bash
# Let me check if there's a web_search option to find documentation about running-process-core
# or check for any recent PRs/issues discussing this behavior
echo "=== Check if there are tests that show the expected behavior ==="
rg "test.*passthrough\|test.*pio" crates/fbuild-core/src/subprocess.rs -B 2 -A 10

echo ""
echo "=== Look at all test functions in subprocess.rs ==="
sed -n '200,$p' crates/fbuild-core/src/subprocess.rs | grep -A 8 "fn test_\|fn run_"

Repository: FastLED/fbuild

Length of output: 1277


🌐 Web query:

running-process-core StderrMode Pipe behavior capture false

💡 Result:

The query "running-process-core StderrMode Pipe behavior capture false" refers to configuration in the Rust crate running-process-core, a library for managing contained child processes (evident from source code snippets showing ContainedProcessGroup and process spawning with Command). In this crate, StderrMode::Pipe configures the child process's stderr to use a pipe (Stdio::piped). The capture false parameter or flag disables capturing/reading the stderr output into memory or a collector. Instead: - The stderr pipe is created but not read by the library. - Child stderr inherits the parent's stderr or writes directly to it (behavior depends on platform and exact setup). - No buffering or deadlock risk from unread stderr, as it's not captured. - Output appears directly in the parent's stderr stream in real-time, without library interception. This avoids memory usage for stderr collection while allowing live output visibility. When capture is true (default in many process libs), the library reads and stores stderr, potentially causing deadlocks if buffers fill. For standard Rust std::process::Command, equivalent is .stderr(Stdio::piped) without reading child.stderr, but running-process-core abstracts this with StderrMode for contained processes.

Citations:


Add test coverage for run_command_passthrough and refactor duplicated wait/timeout logic.

The StderrMode::Pipe behavior is correct for passthrough (stderr flows to parent in real-time when capture=false), so no stderr loss occurs. However, two legitimate issues remain:

  1. Missing test for run_command_passthrough: This public API has zero test coverage despite being critical for CLI delegation (e.g., pio command). At minimum, add a smoke test that verifies the exit code is propagated correctly.

  2. Code duplication: The wait/timeout/error-handling pattern in run_command_passthrough (lines 68–81) duplicates run_captured (lines 94–107). Consider extracting this into a helper function.

As a minor improvement: the capture parameter could be paired with explicit stderr_mode selection to clarify intent, but the current design works correctly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/fbuild-core/src/subprocess.rs` around lines 59 - 82, Add a smoke test
that invokes run_command_passthrough with a simple command (e.g., exit with a
known code) and asserts the returned i32 matches the child exit code to cover
the public API; then refactor the duplicated wait/timeout/error-handling into a
shared helper used by both run_command_passthrough and run_captured (extract the
match on process.wait(...) that handles Ok(code), Err(ProcessError::Timeout) ->
kill + FbuildError::Timeout, and Err(e) -> FbuildError::Other), referencing
NativeProcess::wait, spawn_err, FbuildError and ProcessError to locate the logic
to move.

Comment on lines +198 to +235
fn compute_env(program: &str, overlay: Option<&[(&str, &str)]>) -> Option<Vec<(String, String)>> {
#[cfg(windows)]
{
use std::os::windows::process::CommandExt;
const CREATE_NO_WINDOW: u32 = 0x08000000;
cmd.creation_flags(CREATE_NO_WINDOW);
let mut env_map: std::collections::BTreeMap<String, String> = std::env::vars().collect();

// Add the executable's parent directory to PATH so that child processes
// (e.g., cc1plus launched by g++) can find DLLs in the same bin/ dir.
if let Some(exe_dir) = Path::new(args[0]).parent() {
let exe_dir_str = exe_dir.to_string_lossy();
// Prepend the executable's directory to PATH so that child
// processes (e.g., cc1plus launched by g++) can find DLLs in
// the same bin/ dir.
if let Some(exe_dir) = Path::new(program).parent() {
let exe_dir_str = exe_dir.to_string_lossy().to_string();
if !exe_dir_str.is_empty() {
let current_path = std::env::var("PATH").unwrap_or_default();
cmd.env("PATH", format!("{};{}", exe_dir_str, current_path));
let current_path = env_map
.get("PATH")
.or_else(|| env_map.get("Path"))
.cloned()
.unwrap_or_default();
env_map.insert(
"PATH".to_string(),
format!("{};{}", exe_dir_str, current_path),
);
}
}

// Strip MSYS/MSYS2 environment variables that interfere with
// native Windows toolchain binaries finding their internal tools.
if is_msys_environment() {
strip_msys_env(&mut cmd);
// native Windows toolchain binaries finding their internal
// tools.
if is_msys_environment(&env_map) {
strip_msys_env(&mut env_map);
}

if let Some(vars) = overlay {
for (k, v) in vars {
env_map.insert((*k).to_string(), (*v).to_string());
}
}
}

cmd.stdin(Stdio::null());
cmd.stdout(Stdio::piped());
cmd.stderr(Stdio::piped());
cmd
Some(env_map.into_iter().collect())
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Minor: trailing ; in PATH when env has no prior PATH; and strip_msys_env could filter a legitimate exe_dir.

Two small edge cases on the Windows branch:

  1. Line 216: format!("{};{}", exe_dir_str, current_path) yields "exe_dir;" when current_path is empty. Harmless on Windows but noisy — guard with if current_path.is_empty() { exe_dir_str } else { format!("{};{}", …) }.
  2. Because the exe-dir prepend happens before strip_msys_env (which filters PATH entries containing \msys, /msys, or /usr/), an exe_dir whose path happens to contain any of those substrings would be stripped back out. Not realistic for MSVC/GCC distributions, but worth noting since ordering matters.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/fbuild-core/src/subprocess.rs` around lines 198 - 235, In compute_env,
avoid adding a trailing semicolon when there is no existing PATH and prevent the
exe-dir you just prepended from being removed by strip_msys_env: change the PATH
construction to check current_path.is_empty() and use exe_dir_str alone if empty
(instead of format!("{};{}", ...)), and move the exe_dir prepend to after the
is_msys_environment/strip_msys_env block (or alternatively adjust strip_msys_env
to ignore the exact exe_dir_str) so the injected exe_dir value (exe_dir_str) is
not filtered out; update references to env_map, exe_dir_str,
is_msys_environment, and strip_msys_env in compute_env accordingly.

@zackees zackees merged commit a5ed36a into main Apr 24, 2026
75 of 78 checks passed
zackees added a commit that referenced this pull request Apr 24, 2026
The `Lint subprocess spawns` step calls `uv run python
ci/find_direct_subprocess.py --fail` but the job never installed
`uv`: `zackees/setup-soldr@v0` only installs Rust tooling. Add the
same `astral-sh/setup-uv@v3` step the dedicated `lint-subprocess.yml`
workflow already uses so this step can actually execute.

Addresses CodeRabbit review on #200.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
zackees added a commit that referenced this pull request Apr 24, 2026
Two CodeRabbit findings on the linter, both failure-critical when the
step is run with `--fail`:

* `scan_workspace()` only globbed `crates/**/*.rs`, missing
  `build.rs`, `examples/**`, `benches/**`, top-level `tests/**`, and
  any other `**/*.rs` that might appear in the repo. Scan from
  `REPO_ROOT` instead and exclude known junk dirs (`target`, `.git`,
  `node_modules`, `.venv`, `venv`).

* `_is_doc_or_string()` only filtered comment/doc *lines*. A normal
  line of Rust code containing `"Command::new("` inside a string
  literal would therefore fire a false positive on `--fail`. Add a
  whole-file literal-stripping pass that understands:
    - normal strings with `\"` / `\` escapes
    - char literals (distinguished from lifetimes)
    - raw strings `r"..."`, `r#"..."#`, any number of `#`s
    - byte strings `b"..."` and `br#"..."#`
    - line and nestable block comments
  The regex is run on the stripped source while line numbers stay
  aligned with the original, so reporting still points at the real
  line.

Adds a `--self-test` mode exercising all of the above (string,
raw-string, byte-string, real spawn, block comment, top-level
`build.rs` discovery, string-literal false-positive suppression).

Addresses CodeRabbit review on #200.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
zackees added a commit that referenced this pull request Apr 24, 2026
…s, add passthrough test

Three CodeRabbit findings in the new running-process-backed
subprocess module:

* The wait/timeout/kill/error-mapping block was duplicated verbatim
  between `run_command_passthrough` and `run_captured`. Extract a
  private `wait_or_kill()` helper so the contract (timeout kills the
  child, errors flatten into `FbuildError::Timeout` / `::Other`)
  lives in one place.

* `run_command_passthrough` had zero test coverage. Add a smoke test
  that spawns `cmd /C exit 7` / `sh -c 'exit 7'` and asserts the
  returned `i32` equals the child's real exit code (plus a 0 case).

* `compute_env` on Windows had two bugs:
    1. `format!("{};{}", exe_dir_str, current_path)` produced a
       trailing-semicolon `PATH` of `"exe_dir;"` whenever the inherited
       `PATH` was empty. Guard the empty case explicitly.
    2. The exe-dir prepend happened *before* `strip_msys_env`, and
       `strip_msys_env` drops PATH entries containing `\msys`,
       `/msys`, or `/usr/`. An exe_dir whose path contained one of
       those substrings would therefore be silently stripped back
       out. Reorder so `strip_msys_env` runs first and the exe_dir
       prepend is applied to the already-cleaned PATH.

Addresses CodeRabbit review on #200.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@zackees
Copy link
Copy Markdown
Member Author

zackees commented Apr 24, 2026

Addressed all four CodeRabbit threads in follow-up commits on this branch:

  1. .github/workflows/check-ubuntu.yml (Major) — the Lint subprocess spawns step called uv run ... but the job never installed uv. Added astral-sh/setup-uv@v3 immediately before that step, matching lint-subprocess.yml. → c67f809

  2. ci/find_direct_subprocess.py (Major) — two bugs:

    • scan_workspace() only globbed crates/**/*.rs. Now scans from REPO_ROOT (covers build.rs, examples/, benches/, top-level tests/, etc.), excluding target/, .git/, node_modules/, .venv/, venv/.
    • _is_doc_or_string() only filtered comment lines, so let s = "Command::new("; would have tripped --fail. Added a whole-file literal-stripping pass that understands normal strings (with \"/\ escapes), char literals (distinguished from lifetimes), raw strings (r"...", r#"..."#, any number of #s), byte strings (b"...", br#"..."#), and nestable block comments. Line numbers stay aligned with the original.
    • Added --self-test covering: string-literal false-positive suppression, raw string, byte string, real spawn, block comment, and top-level build.rs discovery. All pass; the existing 21 allowlisted hits on this branch are preserved exactly. → aec6df4
  3. crates/fbuild-core/src/subprocess.rs — run_command_passthrough (Minor)

    • Extracted the duplicated wait/timeout/kill/error-mapping block out of run_command_passthrough and run_captured into a shared private wait_or_kill() helper.
    • Added a cross-platform smoke test asserting the returned i32 equals the child's real exit code (tests both 7 and 0). → d94cb5d
  4. crates/fbuild-core/src/subprocess.rs — compute_env Windows branch (Minor)

    • Guarded the format!("{};{}", exe_dir_str, current_path) so an empty inherited PATH no longer produces a trailing-semicolon "exe_dir;".
    • Reordered: strip_msys_env now runs before the exe-dir prepend, so an exe_dir whose path contains \msys / /usr/ isn't filtered back out of PATH by the MSYS scrubber. → d94cb5d

Local verification: cargo check --workspace clean, cargo test -p fbuild-core passes (82 tests, including the new run_passthrough_propagates_exit_code), uv run python ci/find_direct_subprocess.py --fail passes with 21/21 allowlisted sites, and --self-test passes.

zackees added a commit that referenced this pull request Apr 24, 2026
The "Lint subprocess spawns" step added in #200 invokes `uv run`, but
the Check Ubuntu job never installs uv, so the main-branch run fails
with `uv: command not found` (exit 127).

Add `astral-sh/setup-uv@v3` before the step — matches how the sibling
`lint-subprocess.yml` workflow already sets up uv.

Failing run: https://github.com/FastLED/fbuild/actions/runs/24905357266

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Migrate all subprocess spawns to running-process; eliminate direct std::process / tokio::process usage

1 participant