Kadyapam edited this page May 24, 2026 · 2 revisions

NoETL Doctor

noetl-doctor is the out-of-process runtime-reaper surface for NoETL. It is an operator-facing CLI and MCP server that exposes the same diagnostic and recovery surface the in-process command reaper uses, so a monitoring system or an SRE can observe or trigger recovery without touching the runtime directly.

This wiki will grow with pages covering the bundled playbooks and CLI verbs. For now the in-repo README.md is the most current reference.

What doctor provides

  • noetl-doctor detect — surface stuck executions / stale commands.
  • noetl-doctor reachability — verify cluster health and service-to-service routes.
  • noetl-doctor repair trigger-reaper / repair run-playbook — nudge the command reaper, or run a recovery playbook.
  • noetl-doctor provision <verb> — set up doctor MCP wiring on a fresh cluster.
  • noetl-doctor mcp serve — expose the same surface as an MCP server.
  • noetl-doctor playbooks — list the bundled diagnostic playbooks shipped with the CLI.
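
As a minimal sketch of how a monitoring job might call the observing verbs listed above, the Python wrapper below builds argument lists only from verbs named on this page (detect, reachability, playbooks). The helper names build_argv and run_verb are hypothetical, no flags are assumed, and the exit-code and output conventions of noetl-doctor are not documented here, so the caller is left to interpret them. Verbs that change state or take arguments not described on this page (repair, provision, mcp serve) are deliberately left out.

```python
import shutil
import subprocess

# Verbs copied from the list on this page. No flags are assumed, because
# none are documented here.
DOCTOR_VERBS = {
    "detect": ["detect"],
    "reachability": ["reachability"],
    "playbooks": ["playbooks"],
}


def build_argv(verb, binary="noetl-doctor"):
    """Return the argv list for a listed verb; raise ValueError otherwise."""
    if verb not in DOCTOR_VERBS:
        raise ValueError(f"unknown doctor verb: {verb!r}")
    return [binary, *DOCTOR_VERBS[verb]]


def run_verb(verb, timeout=60):
    """Run a listed verb and return (returncode, stdout).

    Returns None when the binary is not on PATH, so a monitoring job can
    report "doctor not installed" separately from a failed check.
    """
    argv = build_argv(verb)
    if shutil.which(argv[0]) is None:
        return None
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout
```

Calling run_verb("detect") on a host with the CLI installed returns the process exit code and its standard output. What those values mean is defined by the CLI itself, so check the in-repo README.md before alerting on them.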

Bundled playbooks live in the playbooks/ directory of the doctor repo: detect_stuck_executions, inspect_stale_commands, reachability_smoke, trigger_command_reaper, provision_doctor_mcp. Each is a NoETL playbook in the standard repos/ops shape: workload action dispatch, kind:shell with psql/curl/jq, and a kube-context guard.

Architecture position

The in-process command reaper inside noetl-server is the authority for command-table recovery and does the actual work. Doctor is the surface a monitoring system or an SRE calls to observe or trigger the same recovery path from outside the runtime. Doctor delegates to the reaper, so the two never disagree about claim-policy correctness.
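
The delegation described above can be illustrated with a small Python sketch. Everything in it is hypothetical: the class names, the lease-based staleness rule, and the in-memory command list are assumptions made for illustration, and the real doctor runs out of process rather than holding a direct object reference. The point is only the shape: the claim policy lives in one place, and the doctor-side object forwards to it instead of carrying its own copy.

```python
from dataclasses import dataclass


@dataclass
class Command:
    command_id: str
    claimed_at: float     # seconds; when a worker claimed the command
    lease_seconds: float  # hypothetical lease length, not from the source


class CommandReaper:
    """Stand-in for the in-process reaper: the only holder of claim policy."""

    def is_stale(self, cmd, now):
        # Assumed policy: a claim is stale once its lease has elapsed.
        return now - cmd.claimed_at > cmd.lease_seconds

    def reap(self, commands, now):
        # Return the ids the reaper would recover under its own policy.
        return [c.command_id for c in commands if self.is_stale(c, now)]


class Doctor:
    """Stand-in for doctor: has no policy of its own, it only forwards."""

    def __init__(self, reaper):
        self._reaper = reaper

    def detect(self, commands, now):
        # Observation asks the reaper's policy, so results cannot diverge.
        return [c.command_id for c in commands if self._reaper.is_stale(c, now)]

    def trigger_reaper(self, commands, now):
        # Recovery is delegated entirely to the reaper.
        return self._reaper.reap(commands, now)
```

Because Doctor.detect and CommandReaper.reap consult the same is_stale method, changing the policy in the reaper changes what doctor reports without any edit on the doctor side.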

Pages

Detailed pages will land as the surface grows. Start with the in-repo README.

Cross-references
