agent-orchestrator

A gateway-first agent orchestration service inspired by OpenClaw patterns, adapted for remote clients (mobile/web/desktop).

Why this repo exists

This repo captures the implementation notes and architecture decisions from our research and discussion:

How OpenClaw handles agent-to-agent communication
How spawned CLI agents are executed and resumed
How provider routing chooses CLI vs embedded runtime
Where orchestration state should live
What changes are needed for mobile/remote clients

Recommended Architecture (v1)

One Gateway process owns sessions, runs, auth, and state.
Local-first runtime: single process + SQLite file DB (no required Docker setup).
Clients are thin remotes (mobile/web/desktop).
Agent execution uses an in-process async run queue in v1.
Runs stream events over WebSocket/SSE; clients also support cursor sync.
Per-session/provider thread IDs are persisted for resume semantics.

Target API Surface

POST /sessions
POST /sessions/:id/messages (idempotency key required)
POST /sessions/:id/spawn
POST /runs/:id/cancel
GET /runs/:id
GET /sessions/:id/events?cursor=...
WS /stream?sessionId=... (or SSE)

Implemented API (Current)

GET /health
POST /sessions
GET /sessions
GET /sessions/:id
POST /sessions/:id/messages
GET /sessions/:id/events?cursor=...&limit=...
POST /runs
GET /runs/:id
POST /runs/:id/transition
POST /runs/:id/spawn
POST /sessions/:id/send
GET /runs/:id/wait?timeout_s=...&interval_ms=...
GET /runs/:id/diagnose
GET /daemon/status
POST /runs/:id/reply-parent (child approval/permission escalation to parent session)

When daemon auth tokens are configured, all endpoints except GET /health require Authorization: Bearer <token>.

Docs

docs/summary.md: condensed technical summary of OpenClaw behavior and implementation guidance.
docs/v1-decisions.md: concrete v1 product and system defaults, including ownership and ACL model.
docs/requirements-and-plan.md: locked requirements and phased implementation plan for the first dogfoodable slice.
docs/api-contract.md: endpoint-level request/response/auth/error contract for agentd.
docs/runbook.md: operator runbook for startup, preflight, troubleshooting, and stale-daemon recovery.
skills/agent-orchestrator-cli/SKILL.md: operator skill for using agentd/agentctl workflows.

Phase 2 Scaffold (Implemented)

This repo now includes a runnable daemon + CLI with queued codex execution:

agentd: ./bin/agentd (wrapper for python3 -m orchestrator.server)
agentctl: ./bin/agentctl (wrapper for python3 -m orchestrator.cli)

Quickstart

Start daemon:
- ./bin/agentd
- Optional auth token:
  - AGENTD_AUTH_TOKEN=dev-token ./bin/agentd
Check daemon/worker-pool status:
- export AGENTCTL_AUTH_TOKEN=dev-token (if daemon auth is enabled)
- ./bin/agentctl daemon status
Create a session:
- ./bin/agentctl session create
Send user message (auto-enqueues run):
- ./bin/agentctl message send --session-id <SESSION_ID> --text "Respond with exactly: hi"
- Retry-safe:
  - ./bin/agentctl message send --session-id <SESSION_ID> --text "Respond with exactly: hi" --idempotency-key <REQUEST_KEY>
Wait for run completion:
- ./bin/agentctl run wait --run-id <RUN_ID>
Inspect events / tool calls:
- ./bin/agentctl events list --session-id <SESSION_ID>
- ./bin/agentctl events tail --session-id <SESSION_ID> --follow
- ./bin/agentctl run diagnose --run-id <RUN_ID>
Spawn and coordinate child sessions:
- ./bin/agentctl tool session-spawn --parent-run-id <PARENT_RUN_ID> --task "..."
- ./bin/agentctl tool session-send --session-id <CHILD_SESSION_ID> --text "..."
- ./bin/agentctl tool session-reply-parent --run-id <CHILD_RUN_ID> --text "Need approval to run: <command>" --request-kind approval_request
- Retry-safe escalation:
  - ./bin/agentctl tool session-reply-parent --run-id <CHILD_RUN_ID> --text "Need approval to run: <command>" --request-kind approval_request --idempotency-key <REQUEST_KEY>
- ./bin/agentctl tool session-wait --run-id <CHILD_RUN_ID>

Child-to-Parent Escalation Flow

Spawn child:
- SPAWN=$(./bin/agentctl tool session-spawn --parent-run-id <PARENT_RUN_ID> --task "Ask for approval before risky actions")
Capture child IDs:
- CHILD_SESSION_ID=$(printf '%s' "$SPAWN" | python3 -c 'import json,sys; print(json.load(sys.stdin)["session"]["id"])')
- CHILD_RUN_ID=$(printf '%s' "$SPAWN" | python3 -c 'import json,sys; print(json.load(sys.stdin)["run"]["id"])')
Child escalates to parent session without passing parent session ID:
- ./bin/agentctl tool session-reply-parent --run-id "$CHILD_RUN_ID" --request-kind approval_request --text "Need approval to run: rm -rf ./tmp-cache"
Parent observes escalation event/message in parent session stream and responds (for example by sending follow-up to child session):
- ./bin/agentctl events list --session-id <PARENT_SESSION_ID>
- ./bin/agentctl tool session-send --session-id "$CHILD_SESSION_ID" --text "Approved. Proceed with cleanup only in ./tmp-cache."

Default daemon URL is http://127.0.0.1:8765 and default DB path is .data/agent-orchestrator.db.

Codex Daemon Defaults

Optional daemon-wide codex profile:
- ./bin/agentd --codex-profile mobile
Env fallback for profile:
- AGENTD_CODEX_PROFILE=mobile ./bin/agentd
Optional repeatable codex args injected before exec:
- ./bin/agentd --codex-extra-arg=--verbose --codex-extra-arg=--color --codex-extra-arg never
Optional worker pool size (concurrent runs):
- ./bin/agentd --max-concurrency 4
Optional bearer auth tokens:
- ./bin/agentd --auth-token dev-token
- ./bin/agentd --auth-token token-a --auth-token token-b
- AGENTD_AUTH_TOKEN=dev-token ./bin/agentd
- AGENTD_AUTH_TOKENS=token-a,token-b ./bin/agentd
If one or more auth tokens are configured, all non-/health endpoints require a matching bearer token.
agentctl auth token options:
- ./bin/agentctl --auth-token dev-token daemon status
- AGENTCTL_AUTH_TOKEN=dev-token ./bin/agentctl daemon status
If no profile is configured, codex runs with your existing local codex config.
Default concurrency is 4 workers.

Machine-friendly CLI Output

Print compact JSON:
- ./bin/agentctl --format json session create
Print compact daemon/pool status JSON:
- ./bin/agentctl --format json daemon status
Extract a field directly:
- ./bin/agentctl --field session.id session create
- ./bin/agentctl --field run.id message send --session-id <SESSION_ID> --text "hi"
- ./bin/agentctl --field alive_workers daemon status
- ./bin/agentctl --field queue_counts.queued_count daemon status

Idempotency Keys

For retryable mutating requests, provide one idempotency_key and reuse it on retries.
Supported API endpoints:
- POST /sessions/:session_id/messages (role=user)
- POST /sessions/:session_id/send
- POST /runs/:parent_run_id/spawn
- POST /runs/:child_run_id/reply-parent
CLI flags:
- message send --idempotency-key <REQUEST_KEY>
- tool session-send --idempotency-key <REQUEST_KEY>
- tool session-spawn --idempotency-key <REQUEST_KEY>
- tool session-reply-parent --idempotency-key <REQUEST_KEY>

Run Diagnostics

Diagnose a run:
- ./bin/agentctl run diagnose --run-id <RUN_ID>
Useful fields:
- ./bin/agentctl --field diagnostics.worker_id run diagnose --run-id <RUN_ID>
- ./bin/agentctl --field diagnostics.worker_alive run diagnose --run-id <RUN_ID>
- ./bin/agentctl --field diagnostics.alive_workers run diagnose --run-id <RUN_ID>
- ./bin/agentctl --field diagnostics.max_concurrency run diagnose --run-id <RUN_ID>
- ./bin/agentctl --field diagnostics.queue_counts.queued_count run diagnose --run-id <RUN_ID>
- ./bin/agentctl --field diagnostics.queue_position run diagnose --run-id <RUN_ID>
- ./bin/agentctl --field diagnostics.likely_issue run diagnose --run-id <RUN_ID>
- ./bin/agentctl --field diagnostics.active_pid run diagnose --run-id <RUN_ID>
- ./bin/agentctl --field diagnostics.codex_invocation.profile run diagnose --run-id <RUN_ID>
- ./bin/agentctl --field diagnostics.codex_invocation.extra_args run diagnose --run-id <RUN_ID>

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
bin		bin
docs		docs
orchestrator		orchestrator
skills/agent-orchestrator-cli		skills/agent-orchestrator-cli
tests		tests
.gitignore		.gitignore
AGENTS.md		AGENTS.md
Makefile		Makefile
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

agent-orchestrator

Why this repo exists

Recommended Architecture (v1)

Target API Surface

Implemented API (Current)

Docs

Phase 2 Scaffold (Implemented)

Quickstart

Child-to-Parent Escalation Flow

Codex Daemon Defaults

Machine-friendly CLI Output

Idempotency Keys

Run Diagnostics

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

agent-orchestrator

Why this repo exists

Recommended Architecture (v1)

Target API Surface

Implemented API (Current)

Docs

Phase 2 Scaffold (Implemented)

Quickstart

Child-to-Parent Escalation Flow

Codex Daemon Defaults

Machine-friendly CLI Output

Idempotency Keys

Run Diagnostics

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages