
feat(orchestration): preamble hardening + heartbeat detector + ask verb #1391

Closed
Jinwoo-H wants to merge 3 commits into main from Jinwoo-H/orca-improvements-preamble


Jinwoo-H commented May 4, 2026

Summary

  • Fixes ORCHESTRATOR_FEEDBACK.md items 15, 7, 9
  • Item 15 — AskUserQuestion reroute: preamble BEHAVIOR RULE 1 forbids local AskUserQuestion tool; new orca orchestration ask CLI verb (~60 LOC thin wrapper over send --type decision_gate + check --wait + parse) gives workers a single-command substitute. Thread-scoped wait, group-address reject, bare --json bypass of printResult.
  • Item 7 — worker_done body shape: preamble template now shows non-empty --body "3-sentence summary" + reportPath in payload example.
  • Item 9 — heartbeat end-to-end: new heartbeat MessageType + preamble BEHAVIOR RULE "send every 5 min" + coordinator-side last_heartbeat_at column on dispatch_contexts (schema v2 migration with user_version gate, explicit CREATE INDEX preservation for idx_messages_id / idx_inbox / idx_thread) + 10-min stale-warn in coordinator tick. Heartbeat carries dispatchId (not just taskId) for retry-race correctness.
  • Scope-violation guardrail: explicitly NOT shipped this PR. PR 1375 was user-authorized (not a real violation). Prose-honor-system rules were considered and rejected; a future PATH-shim PR can tackle that if a real need emerges.
  • Design doc at DESIGN_DOC_PREAMBLE_FIX.md (local-only)
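For readers who want a concrete picture of the two payloads described in items 7 and 9, here is a minimal sketch. It is not taken from the PR diff: the interface and function names are hypothetical, and the only details grounded in the summary are the non-empty body, `reportPath`, and the `taskId` + `dispatchId` pair on heartbeats.

```typescript
// Illustrative payload shapes. Only taskId, dispatchId, reportPath, and the
// non-empty body come from the PR text; every other name is an assumption.
interface HeartbeatPayload {
  taskId: string;
  dispatchId: string; // identifies the specific dispatch attempt, not just the task
}

interface WorkerDonePayload {
  taskId: string; // assumption: worker_done also names its task
  reportPath: string;
}

interface OrchestrationMessage<T extends string, P> {
  type: T;
  body: string;
  payload: P;
}

function buildHeartbeat(
  taskId: string,
  dispatchId: string
): OrchestrationMessage<"heartbeat", HeartbeatPayload> {
  return { type: "heartbeat", body: "", payload: { taskId, dispatchId } };
}

function buildWorkerDone(
  taskId: string,
  summary: string,
  reportPath: string
): OrchestrationMessage<"worker_done", WorkerDonePayload> {
  // Item 7 asks for a non-empty body, so this sketch refuses a blank summary.
  if (summary.trim().length === 0) {
    throw new Error("worker_done requires a non-empty summary body");
  }
  return { type: "worker_done", body: summary, payload: { taskId, reportPath } };
}
```

Rejecting a blank summary at construction time is one possible way to enforce item 7; the PR itself only changes the preamble template.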

Test plan

  • pnpm typecheck clean
  • 118 tests pass across preamble/db/coordinator/rpc-orchestration
  • Manual smoke: dispatch to a test terminal — verify preamble has all 3 BEHAVIOR RULEs
  • Manual smoke: orca orchestration ask --help — verify verb registered
  • Manual smoke: start a dispatch then don't heartbeat for >10 min — verify coordinator tick emits stale warning

Notes

  • Three logical commits in this PR: (1) preamble rules + schema v2 migration + DB helpers, (2) coordinator heartbeat handler + 10-min stale detector + dispatchId threading, (3) orca orchestration ask verb with thread-scoped wait

Jinwoo-H and others added 3 commits May 4, 2026 04:27
- Preamble (#7, #15, #9): worker_done body ("3-sentence summary" + reportPath),
  BEHAVIOR RULE #1 forbidding AskUserQuestion, heartbeat every 5 minutes with
  taskId+dispatchId payload, AFTER YOU SEND grace window.
- Schema v2 migration: adds 'heartbeat' to messages.type CHECK, adds
  dispatch_contexts.last_heartbeat_at, gated by user_version PRAGMA with
  transactional rebuild + explicit CREATE INDEX to avoid silent perf regress.
- DB helpers: recordHeartbeat (dispatched-only), getStaleDispatches,
  getThreadMessagesFor (thread+handle scoped for ask).

Co-authored-by: Orca <help@stably.ai>

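The migration described in this commit can be sketched roughly as follows. SQLite cannot change a CHECK constraint in place, so the table is rebuilt, and dropping the old table also drops its indexes, which is why they are recreated explicitly. This is a sketch under stated assumptions rather than the PR's code: the `SqlRunner` interface, the column lists, the index columns, the set of message types other than `heartbeat`, and the column type of `last_heartbeat_at` are all hypothetical.

```typescript
// Minimal surface this sketch needs from a SQLite wrapper (hypothetical).
interface SqlRunner {
  getUserVersion(): number; // value of PRAGMA user_version
  exec(sql: string): void;
}

const TARGET_VERSION = 2;

// Returns true if the migration ran, false if the database was already at v2.
function migrateToV2(db: SqlRunner): boolean {
  if (db.getUserVersion() >= TARGET_VERSION) {
    return false; // user_version gate: never run the rebuild twice
  }
  db.exec("BEGIN");
  try {
    // Rebuild messages with 'heartbeat' allowed by the CHECK constraint.
    // Column names and the other message types are assumptions.
    db.exec(
      "CREATE TABLE messages_new (" +
        "id TEXT NOT NULL, thread_id TEXT NOT NULL, recipient TEXT NOT NULL, " +
        "type TEXT NOT NULL CHECK (type IN ('decision_gate', 'worker_done', 'heartbeat')), " +
        "body TEXT NOT NULL)"
    );
    db.exec(
      "INSERT INTO messages_new SELECT id, thread_id, recipient, type, body FROM messages"
    );
    db.exec("DROP TABLE messages");
    db.exec("ALTER TABLE messages_new RENAME TO messages");
    // DROP TABLE removed the old indexes, so recreate each one by name.
    db.exec("CREATE INDEX idx_messages_id ON messages (id)");
    db.exec("CREATE INDEX idx_inbox ON messages (recipient)");
    db.exec("CREATE INDEX idx_thread ON messages (thread_id)");
    // New coordinator-side column; the TEXT type is an assumption.
    db.exec("ALTER TABLE dispatch_contexts ADD COLUMN last_heartbeat_at TEXT");
    db.exec("PRAGMA user_version = 2");
    db.exec("COMMIT");
    return true;
  } catch (err) {
    db.exec("ROLLBACK");
    throw err;
  }
}
```

Without the three `CREATE INDEX` statements the rebuilt table would still return correct results, only more slowly, which is the silent regression the commit message refers to.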
Handle incoming 'heartbeat' messages by calling recordHeartbeat keyed on
payload.dispatchId (strict — log-and-skip if missing, no taskId fallback so
a straggler heartbeat from a previously-failed dispatch cannot mask a hung
retry per §5.3.4). On every tick after the 10-minute threshold, emit one
log per stale dispatched row — no auto-fail.

Also threads dispatchId through buildDispatchPreamble so workers can
attribute their heartbeats back to the correct dispatch context.

Co-authored-by: Orca <help@stably.ai>

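To illustrate why the handler keys strictly on `dispatchId`, here is an in-memory sketch of the heartbeat handler and the 10-minute stale check. The row shape, the status names other than `dispatched`, the log wording, and the fallback to the dispatch time for rows that have never sent a heartbeat are assumptions made for illustration; the real helpers operate on the `dispatch_contexts` table.

```typescript
// Only "dispatched" is named in the PR; the other statuses are placeholders.
type DispatchStatus = "dispatched" | "completed" | "failed";

interface DispatchContext {
  dispatchId: string;
  taskId: string;
  status: DispatchStatus;
  dispatchedAt: number; // epoch milliseconds
  lastHeartbeatAt: number | null;
}

interface IncomingHeartbeat {
  taskId?: string;
  dispatchId?: string;
}

const STALE_AFTER_MS = 10 * 60 * 1000;

// Updates the row only if it is still in the dispatched state.
function recordHeartbeat(rows: DispatchContext[], dispatchId: string, now: number): boolean {
  for (const row of rows) {
    if (row.dispatchId === dispatchId && row.status === "dispatched") {
      row.lastHeartbeatAt = now;
      return true;
    }
  }
  return false;
}

// Strict handling: a heartbeat without dispatchId is logged and skipped.
// There is deliberately no lookup by taskId.
function handleHeartbeat(
  rows: DispatchContext[],
  payload: IncomingHeartbeat,
  now: number,
  log: (line: string) => void
): boolean {
  if (!payload.dispatchId) {
    log("heartbeat missing dispatchId; skipped");
    return false;
  }
  return recordHeartbeat(rows, payload.dispatchId, now);
}

// Dispatched rows whose last sign of life is older than the threshold.
function getStaleDispatches(rows: DispatchContext[], now: number): DispatchContext[] {
  return rows.filter((row) => {
    if (row.status !== "dispatched") {
      return false;
    }
    const lastSeen = row.lastHeartbeatAt ?? row.dispatchedAt;
    return now - lastSeen > STALE_AFTER_MS;
  });
}

// Warn-only tick: one log line per stale row, no automatic failure.
function tick(rows: DispatchContext[], now: number, log: (line: string) => void): void {
  for (const row of getStaleDispatches(rows, now)) {
    log("stale dispatch " + row.dispatchId + " for task " + row.taskId);
  }
}
```

If the handler fell back to `taskId`, a late heartbeat from a failed first attempt would refresh the retry's row and hide the fact that the retry itself had gone quiet.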
Adds a CLI verb that sends a decision_gate message and blocks on the
coordinator's reply, scoped to the outbound message's thread. Group
addresses (@ALL, @idle, …) are rejected — fan-out questions must use
send --type decision_gate explicitly.

--json emits bare single-line JSON (bypassing printResult) so workers can
pipe `orca orchestration ask … --json | jq -r .answer` without unwrapping
an RPC envelope; human mode prints just the answer. On timeout the verb
exits 1 and returns {answer: null, timedOut: true}.

This is the CLI surface BEHAVIOR RULE #1 in the dispatch preamble points
workers at instead of AskUserQuestion.

Co-authored-by: Orca <help@stably.ai>
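The control flow of the verb can be sketched as below. This is a simplified, synchronous model and not the CLI's implementation: `AskTransport`, `ask`, and `renderAsk` are hypothetical names, the group-address list contains only the two addresses named above (the real list is longer), and the `timedOut: false` field on success and the empty human-mode output on timeout are assumptions.

```typescript
interface AskResult {
  answer: string | null;
  timedOut: boolean;
}

// Hypothetical transport standing in for send --type decision_gate and check --wait.
interface AskTransport {
  sendDecisionGate(to: string, question: string): string; // returns the thread id
  waitForReply(threadId: string, timeoutMs: number): string | null; // null on timeout
}

// Only the addresses named in the commit message; the full set is elided there.
const KNOWN_GROUP_ADDRESSES = ["@ALL", "@idle"];

function isGroupAddress(to: string): boolean {
  return KNOWN_GROUP_ADDRESSES.indexOf(to) !== -1;
}

function ask(
  transport: AskTransport,
  to: string,
  question: string,
  timeoutMs: number
): { result: AskResult; exitCode: number } {
  if (isGroupAddress(to)) {
    throw new Error(
      "ask does not accept group address " + to + "; use send --type decision_gate"
    );
  }
  const threadId = transport.sendDecisionGate(to, question);
  // The wait is scoped to the thread of the outbound message, so replies
  // on other threads cannot be mistaken for the answer.
  const answer = transport.waitForReply(threadId, timeoutMs);
  if (answer === null) {
    return { result: { answer: null, timedOut: true }, exitCode: 1 };
  }
  return { result: { answer, timedOut: false }, exitCode: 0 };
}

// --json prints one bare JSON line; human mode prints just the answer.
function renderAsk(result: AskResult, json: boolean): string {
  if (json) {
    return JSON.stringify(result);
  }
  return result.answer ?? "";
}
```

Keeping the JSON output flat is what lets a worker pipe the result straight into `jq -r .answer` without unwrapping an envelope first.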
Jinwoo-H commented May 4, 2026

Superseded by #1403, which bundles this PR with the other three orchestration improvement PRs and resolves the merge conflicts they shared. Branch preserved for diff comparison — not deleting.

Jinwoo-H closed this May 4, 2026