Home
noetl-doctor is the out-of-process runtime-reaper surface
for NoETL. It is the operator-facing CLI + MCP server that
exposes the same diagnostic and recovery surface the in-process
command reaper uses, so a monitoring system or an SRE can
trigger or observe recovery without poking the runtime
directly.
This wiki will grow with pages covering the bundled playbooks
and CLI verbs. For now, the in-repo README.md is the most
current reference.
- noetl-doctor detect: surface stuck executions / stale commands.
- noetl-doctor reachability: verify cluster health and service-to-service routes.
- noetl-doctor repair trigger-reaper / repair run-playbook: nudge the reaper or run a recovery playbook.
- noetl-doctor provision <verb>: set up doctor MCP wiring on a fresh cluster.
- noetl-doctor mcp serve: expose the same surface as an MCP server.
- noetl-doctor playbooks: list the bundled diagnostic playbooks shipped with the CLI.
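As a minimal sketch of how a monitoring job might call these verbs, the Python below builds argument lists for a few of the subcommands listed above and runs one only if the binary is on PATH. The helper names (DOCTOR_VERBS, build_command, run_verb) are hypothetical, no flags are assumed, and the real CLI's output format and exit codes are not described on this page, so check the in-repo README before relying on either.

```python
import shutil
import subprocess

# Subcommands copied from the verb list on this page.
# No flags or arguments are assumed, since none are documented here.
DOCTOR_VERBS = {
    "detect": ["detect"],
    "reachability": ["reachability"],
    "trigger-reaper": ["repair", "trigger-reaper"],
    "playbooks": ["playbooks"],
}


def build_command(verb, binary="noetl-doctor"):
    """Return the argv list for one verb (hypothetical helper)."""
    if verb not in DOCTOR_VERBS:
        raise ValueError(f"unknown doctor verb: {verb}")
    return [binary, *DOCTOR_VERBS[verb]]


def run_verb(verb, binary="noetl-doctor"):
    """Run a verb if the binary is installed; return None otherwise."""
    if shutil.which(binary) is None:
        return None
    # check=False: exit-code semantics of the CLI are not documented here,
    # so the caller decides how to interpret the return code.
    return subprocess.run(
        build_command(verb, binary),
        capture_output=True,
        text=True,
        check=False,
    )
```

A caller would inspect the returned object's returncode and stdout itself; the sketch deliberately does not interpret them.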
Bundled playbooks live in playbooks/ in the doctor repo:
detect_stuck_executions, inspect_stale_commands,
reachability_smoke, trigger_command_reaper, and
provision_doctor_mcp. Each is a NoETL playbook in the standard
repos/ops shape: workload action dispatch, kind:shell using
psql, curl, and jq, and a kube-context guard.
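This page does not show how the kube-context guard is implemented, and in the bundled playbooks it would live in a kind:shell step rather than in Python. As an illustration only, a guard of this kind typically refuses to proceed unless the active kubectl context matches an expected one. The sketch below assumes kubectl is on PATH; the function names and any expected-context value are hypothetical.

```python
import subprocess


def current_kube_context():
    """Ask kubectl for the active context name (requires kubectl on PATH)."""
    result = subprocess.run(
        ["kubectl", "config", "current-context"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def guard_kube_context(expected, actual=None):
    """Raise unless the active context equals the expected one.

    `actual` can be passed explicitly (for example in tests); otherwise it
    is read from kubectl.
    """
    if actual is None:
        actual = current_kube_context()
    if actual != expected:
        raise RuntimeError(
            f"refusing to run: kube context is {actual!r}, expected {expected!r}"
        )
    return actual
```

Failing before any psql or curl call is the point of such a guard: a recovery step pointed at the wrong cluster is worse than one that does not run.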
The in-process command reaper inside noetl-server is the authority for command-table recovery and does the actual work. Doctor is the surface a monitoring system or an SRE calls to observe or trigger the same recovery path from outside the runtime. The two never disagree about claim-policy correctness, because doctor delegates to the reaper.
Detailed pages will land as the surface grows. Start with the in-repo README.
- noetl wiki — the in-process reaper and command table.
- cli wiki — the main NoETL CLI.
- ops wiki — operational manifests and Helm install.
- Ephemeral Blueprints — why doctor stays out-of-process.