Add `mcplex doctor` / healthcheck subcommand for liveness probing

## Problem

When MCPlex is deployed as a long-running service (launchd, systemd, docker, etc.) and the process dies or is accidentally unloaded, downstream clients (Claude Code bridge, Claude Desktop, custom agents) see only a `failed` connection, with no easy way to tell which of the following is the cause:

- Gateway process not running
- Port collision / wrong bind address
- Config parse error
- One or more backing MCP servers failing handshake

In my case, launchd had silently unloaded `com.mcplex.gateway.local`. The downstream Hermes agent just showed `mcplex-gateway [http]: failed` with no actionable info, and the stale error log (see separate issue) pointed at a TOML error that had been fixed days earlier.

## Proposed

A `mcplex doctor` (or `mcplex health`, `mcplex status`) CLI subcommand that:

1. Reads the config and reports listen address + configured servers
2. Probes the configured listen address (e.g. `http://127.0.0.1:3100/mcp`) for an MCP `initialize` response
3. If reachable: prints per-server handshake status (similar to the startup log summary), tool/resource counts, and uptime
4. If unreachable: prints the probe error and a hint (`is the gateway running? check launchd/systemd/docker status`)
5. Exits non-zero on any failure so it can be wired into external monitors (launchd `WatchPaths`, cron, CI smoke tests, `hermes` pre-start checks)
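To make steps 2 and 4 concrete, here is a rough Python sketch of the probe, not a proposed implementation. It assumes the gateway accepts an MCP `initialize` request as a JSON-RPC POST at the `/mcp` path used in the example above; the protocol revision string, the `clientInfo` values, and the function names `build_initialize_request` and `probe` are placeholders of mine. It does not attempt per-server status, since how the gateway would expose that is still open.

```python
import json
import urllib.error
import urllib.request

# Assumed protocol revision; use whichever revision the gateway actually speaks.
PROTOCOL_VERSION = "2025-03-26"


def build_initialize_request(request_id=1):
    """Return a JSON-RPC 2.0 `initialize` request body (hypothetical client info)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": "doctor-probe", "version": "0.0.0"},
        },
    }


def probe(url, timeout=3.0):
    """POST `initialize` to `url` and return (ok, detail)."""
    body = json.dumps(build_initialize_request()).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
    )
    # An empty ProxyHandler bypasses any proxy configured in the environment,
    # so the probe talks to the local listen address directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(req, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # Something is listening, but it rejected the request.
        return False, f"HTTP {exc.code} from listener"
    except OSError as exc:
        # Covers URLError (connection refused, DNS failure) and timeouts.
        reason = getattr(exc, "reason", exc)
        return False, f"unreachable: {reason}"
```

Calling `probe("http://127.0.0.1:3100/mcp")` returns a `(ok, detail)` pair; a real subcommand would map `ok` to the exit code and use `detail` to choose the hint line.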

Example output:

```
$ mcplex doctor --config ~/.config/mcplex/macmini.toml
Gateway: http://127.0.0.1:3100  [OK, uptime 2h 14m]
Dashboard: http://127.0.0.1:9090  [OK]
Servers (7/7 connected):
  memory              [OK]  16 tools  3 resources  5 prompts
  context-provider    [OK]  10 tools
  github              [OK]  26 tools
  applescript         [OK]   1 tool
  desktop-commander   [OK]  26 tools  2 resources
  constrictor         [OK]  13 tools
  telegram            [OK]   9 tools
Router: Semantic (MetaTool, top_k=5)
Cache: enabled (TTL 300s)
```

And on failure:

```
$ mcplex doctor
Gateway: http://127.0.0.1:3100  [UNREACHABLE]
  → connection refused; no process listening on 3100
  → hint: launchctl list | grep mcplex  (or: systemctl status mcplex)
exit 1
```

## Why it matters

- Turns silent failures into actionable output
- Enables external wrappers (agents, orchestrators, CI) to gate behavior on gateway health
- Complements `--check` (which only validates the config without starting the gateway) by providing a runtime-health counterpart
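As an illustration of the gating use case, a wrapper could look roughly like the sketch below. It assumes only the exit-code contract proposed above (zero when healthy, non-zero otherwise); `mcplex doctor` does not exist yet, and `gateway_healthy` is a hypothetical helper name.

```python
import subprocess


def gateway_healthy(cmd, timeout=10.0):
    """Run a health command and return True only if it exits with status 0."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        # Binary missing or not executable, or the check hung: treat as unhealthy.
        return False
    return result.returncode == 0
```

An agent's pre-start hook could then call `gateway_healthy(["mcplex", "doctor"])` and refuse to start, or fall back to a degraded mode, when it returns `False`.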

Happy to contribute a PR if this direction looks right.
