Client::stop() SIGKILLs CLI without grace period — orphans MCP child processes downstream

## Summary

`Client::stop()` in `src/lib.rs` SIGKILLs the agent runtime CLI immediately after `session.destroy` returns, with no SIGTERM grace period. This races with the runtime's own MCP cleanup and causes orphaned MCP stdio child processes to accumulate across normal app restarts in every downstream consumer (notably the GitHub Copilot Tauri app).

## Current behavior

```rust
// crates/copilot-sdk/src/lib.rs:1894 (vendored into github/github-app)
if let Some(mut child) = child
    && let Err(e) = child.kill().await   // tokio Child::kill = SIGKILL on Unix
{
    errors.push(Error::Io(e));
}
```

Tokio's `Child::kill()` unconditionally sends SIGKILL on Unix. Because SIGKILL cannot be caught, the runtime's MCP cleanup (a synchronous `pgrep -P` enumeration inside the Node process, followed by signaling descendants via `process.kill`) is cut off mid-flight whenever it takes longer than the few milliseconds between `session.destroy` returning and SIGKILL landing.

This nullifies the protections from:
- `copilot-agent-runtime` PR [#7517](https://github.com/github/copilot-agent-runtime/pull/7517) (killProcessTree on transport close)
- `copilot-agent-runtime` PR [#8103](https://github.com/github/copilot-agent-runtime/pull/8103) (fire-and-forget MCP shutdown in `dispose()`)

Both PRs have shipped and both are verified present in the running runtime, yet the leak still happens.
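
To make concrete what the interrupted cleanup has to do, here is a minimal sketch of a process-tree walk. It is not the runtime's implementation (which, per the description above, shells out to `pgrep -P`); the function name `descendants_leaf_first` and the sample pids are hypothetical. The point is only that enumeration is a multi-step walk, and any descendant not yet reached when SIGKILL lands is never signaled.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Collects every descendant of `root` from a snapshot of (pid, ppid) pairs,
/// ordered so that each child appears before its parent.
/// Hypothetical helper for illustration only.
fn descendants_leaf_first(snapshot: &[(u32, u32)], root: u32) -> Vec<u32> {
    // Index children by parent pid.
    let mut children: HashMap<u32, Vec<u32>> = HashMap::new();
    for &(pid, ppid) in snapshot {
        children.entry(ppid).or_default().push(pid);
    }

    // Breadth-first walk from the root; `seen` guards against malformed cycles.
    let mut order = Vec::new();
    let mut seen = HashSet::from([root]);
    let mut queue = VecDeque::from([root]);
    while let Some(parent) = queue.pop_front() {
        for &child in children.get(&parent).into_iter().flatten() {
            if seen.insert(child) {
                order.push(child);
                queue.push_back(child);
            }
        }
    }

    // BFS visits parents before children, so reverse for leaf-first order.
    order.reverse();
    order
}

fn main() {
    // Made-up snapshot: CLI (100) -> uv (200) -> python (300); 400 is unrelated.
    let snapshot = [(100, 1), (200, 100), (300, 200), (400, 1)];
    println!("{:?}", descendants_leaf_first(&snapshot, 100));
}
```

If the parent is killed after the walk has found pid 200 but before it has found or signaled pid 300, pid 300 survives and is reparented to init, which matches the `ppid=1` orphans reported under Evidence.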

## Evidence

On my machine running github-app `0aa3a6b41` (May 21 build) + copilot-agent-runtime built from HEAD (May 20), after 3 days of normal Copilot usage:

```
11 orphaned `uv tool uvx microsoft-fabric-rti-mcp` processes
~220 MB resident total
oldest 3 days, newest 1 hour old (well after #8103 merged)
all ppid=1 (reparented to init)
each carries a live python child also orphaned (22 leaked processes total)
```

Detection:
```bash
ps -Ao pid,ppid,etime,rss,command | awk '$2==1 && /uvx|fabric-rti/'
```
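
For a regression check in a Rust test harness, the same filter could be sketched roughly as follows. This assumes the column order of `ps -Ao pid,ppid,etime,rss,command`; the names `PsRow`, `parse_ps_line`, and `orphaned_mcp_servers` are hypothetical, and the sample rows are made up. Unlike the `awk` one-liner, which matches the pattern against the whole line, this version matches only the command column.

```rust
/// One row of `ps -Ao pid,ppid,etime,rss,command` output.
#[derive(Debug, PartialEq)]
struct PsRow {
    pid: u32,
    ppid: u32,
    etime: String,
    rss_kb: u64,
    command: String,
}

/// Parses one `ps` line; returns None for the header or malformed lines.
fn parse_ps_line(line: &str) -> Option<PsRow> {
    let mut fields = line.split_whitespace();
    let pid = fields.next()?.parse().ok()?;
    let ppid = fields.next()?.parse().ok()?;
    let etime = fields.next()?.to_string();
    let rss_kb = fields.next()?.parse().ok()?;
    // Everything after the fourth column is the command and its arguments.
    let command = fields.collect::<Vec<_>>().join(" ");
    if command.is_empty() {
        return None;
    }
    Some(PsRow { pid, ppid, etime, rss_kb, command })
}

/// Keeps rows reparented to pid 1 whose command contains any of `needles`.
fn orphaned_mcp_servers(ps_output: &str, needles: &[&str]) -> Vec<PsRow> {
    ps_output
        .lines()
        .filter_map(parse_ps_line)
        .filter(|row| row.ppid == 1 && needles.iter().any(|n| row.command.contains(*n)))
        .collect()
}

fn main() {
    // Made-up sample output, not captured from a real machine.
    let sample = "\
  PID  PPID     ELAPSED    RSS COMMAND
 4242     1 03-02:11:09  20480 uv tool uvx microsoft-fabric-rti-mcp
 4243  4242 03-02:11:08  51200 python3 server.py
 5150   900       12:34  10240 node server.js
";
    for row in orphaned_mcp_servers(sample, &["uvx", "fabric-rti"]) {
        println!("{} {} {} KB {}", row.pid, row.etime, row.rss_kb, row.command);
    }
}
```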

Quitting GitHub Copilot reproduces the leak reliably. Node-based MCP servers (npm-exec'd) clean up correctly because they exit voluntarily on stdin EOF. uvx-launched servers do not: `uv tool uvx` is a Rust supervisor that holds the child's stdin pipe open from its own side, so the python child never sees EOF. The only reliable cleanup path for uvx-launched servers is the runtime's `killProcessTree`, which is racing the SDK's SIGKILL and losing.

## Proposed fix

In `Client::stop()`, replace `child.kill().await` with a SIGTERM-then-SIGKILL escalation:

```rust
#[cfg(unix)]
{
    use nix::sys::signal::{self, Signal};
    use nix::unistd::Pid;
    if let Some(pid_raw) = child.id() {
        let _ = signal::kill(Pid::from_raw(pid_raw as i32), Signal::SIGTERM);
    }
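    match tokio::time::timeout(std::time::Duration::from_secs(3), child.wait()).await {
        Ok(Ok(_)) => {}  // graceful exit, killProcessTree had time to run
        Ok(Err(e)) => errors.push(Error::Io(e)),  // wait() itself failed; report it
        Err(_) => {
            // grace period elapsed without exit: escalate to SIGKILL
            if let Err(e) = child.kill().await {
                errors.push(Error::Io(e));
            }
        }
    }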
}
#[cfg(not(unix))]
{
    if let Err(e) = child.kill().await {
        errors.push(Error::Io(e));
    }
}
```
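
The escalation policy itself can be exercised without real processes or signals. The following is a std-only sketch under stated assumptions: the `Stoppable` trait, `StopOutcome`, `stop_with_grace`, and `FakeChild` are all hypothetical names introduced for illustration, and the loop polls synchronously instead of using `tokio::time::timeout` as in the proposal above.

```rust
use std::time::{Duration, Instant};

/// What the stop routine needs from a child process (hypothetical trait).
trait Stoppable {
    /// Ask the process to exit (SIGTERM on Unix).
    fn request_exit(&mut self);
    /// Non-blocking check: has the process exited?
    fn has_exited(&mut self) -> bool;
    /// Force the process to exit (SIGKILL on Unix).
    fn force_kill(&mut self);
}

#[derive(Debug, PartialEq)]
enum StopOutcome {
    Graceful,
    Forced,
}

/// Requests exit, polls until `grace` elapses, then escalates to a forced kill.
fn stop_with_grace<P: Stoppable>(child: &mut P, grace: Duration, poll: Duration) -> StopOutcome {
    child.request_exit();
    let deadline = Instant::now() + grace;
    loop {
        if child.has_exited() {
            return StopOutcome::Graceful;
        }
        if Instant::now() >= deadline {
            child.force_kill();
            return StopOutcome::Forced;
        }
        std::thread::sleep(poll);
    }
}

/// Test double: once asked, exits after a fixed number of polls, or never.
struct FakeChild {
    polls_until_exit: Option<u32>,
    asked: bool,
    killed: bool,
}

impl Stoppable for FakeChild {
    fn request_exit(&mut self) {
        self.asked = true;
    }
    fn has_exited(&mut self) -> bool {
        if self.killed {
            return true;
        }
        if !self.asked {
            return false;
        }
        match self.polls_until_exit {
            Some(0) => true,
            Some(n) => {
                self.polls_until_exit = Some(n - 1);
                false
            }
            None => false,
        }
    }
    fn force_kill(&mut self) {
        self.killed = true;
    }
}

fn main() {
    let mut cooperative = FakeChild { polls_until_exit: Some(2), asked: false, killed: false };
    let mut stuck = FakeChild { polls_until_exit: None, asked: false, killed: false };
    let poll = Duration::from_millis(1);
    println!("{:?}", stop_with_grace(&mut cooperative, Duration::from_secs(2), poll));
    println!("{:?}", stop_with_grace(&mut stuck, Duration::from_millis(20), poll));
}
```

A cooperative child that finishes its cleanup within the grace period is never force-killed, while a stuck child is still bounded by the timeout, so `Client::stop()` keeps a fixed worst-case shutdown time.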

Pair this with a runtime-side fix that installs `process.on(...)` handlers for `SIGTERM` and `SIGINT` in the CLI entrypoint and awaits `session.dispose()` from them. The runtime change is needed regardless, because today the runtime has **no signal handlers at all**: `grep -r "process.on.*SIG" src --include="*.ts"` returns zero hits. Tracked at github/copilot-agent-runtime#8598.

## References

- Runtime issue (companion): https://github.com/github/copilot-agent-runtime/issues/8598
- Downstream symptom in Tauri app: https://github.com/github/github-app/issues/5557
- Original Slack thread: https://github.slack.com/archives/C093WSH3L8J/p1778030525897799
- Prior runtime PRs that don't close this gap: github/copilot-agent-runtime#7517, github/copilot-agent-runtime#8103

