feat(harness-node): reactive approval resume + orchestrator cleanup by ytallo · Pull Request #156 · iii-hq/workers

ytallo · 2026-05-18T21:13:04Z

Summary

Approval gate becomes iii-native and fully reactive. policy::approval_gate replies synchronously with allow / deny / pending; the polling await-loop is gone. approval::resolve writes {decision, reason} to approvals/<sid>/<cid> and a state trigger on the approvals scope (approval::is_decision_write → approval::on_decision_written) wakes the orchestrator by republishing turn::step_requested.
Orchestrator FSM grows a function_awaiting_approval state (now 11 states). Pending calls accumulate on TurnStateRecord.awaiting_approval; PreparedEntry.pre_approved lets resumed turns dispatch without re-asking the gate. Aborts during pause write per-call {decision: 'aborted'} records so the same trigger releases the parked step.
Three more poll loops swapped for reactive triggers:
- run::start_and_wait → new turn::on_terminal_state state trigger + in-process waiter; drops sync_poll_interval_ms.
- hook-fanout::publish_collect → stream trigger on agent::hook_reply routes replies into an in-process collector; drops poll_interval_ms.
- Harness sessions fanout → state trigger on session/<id>/turn_state creates instead of a 1 Hz state::list diff loop.
- Orchestrator abort path → reactive turn::on_abort_signal trigger.
hook-fanout::publish_collect response now carries publish.{ok, error} and publish_failed so consultBefore can fail closed on bus unavailability with a gate_unavailable denial. first_block_wins preserves the full blocking reply (status, subscriber, approval_gate markers).
Session entries gain an explicit timestamp and IiiStateSessionStore.loadEntries sorts by (timestamp, id) so resumed approval replies land in the right transcript position.
Console wiring: model picker is driven by the harness-node models-catalog (composite provider::id keys); the real backend implements resolveApproval so the chat UI can approve/deny in-line.
Architecture docs refreshed to track everything above (system diagram, 11-state FSM, approval sequence, per-worker pages).

Test plan

pnpm exec tsc --noEmit clean
pnpm exec vitest run — 289 tests across 49 files green, including 11 new on-terminal tests, awaiting-approval, abort, on-decision-written, and the approval-resume e2e
Manual smoke: trigger an approval-gated tool from the chat UI, approve from the console, confirm the turn resumes and emits approval_resolved + the function result
Manual smoke: deny the same call, confirm function_call_denied event and orchestrator continues with denial result
Manual smoke: abort a session paused on approvals, confirm pending calls resolve with aborted and the turn tears down

Summary by CodeRabbit

Release Notes

New Features
- Approval workflow now processes synchronously with reactive decision resolution, eliminating polling delays.
- Added model grouping by provider in the model picker dropdown.
- Function call approvals display real-time status feedback (approving, denying, waiting for agent).
- Last-used model is now persisted and restored across sessions.
Bug Fixes
- Fixed streaming indicator accuracy across multi-turn conversations.
- Deduplicates repeated function-call events with the same ID.
Refactor
- Converted turn orchestrator and approval systems from polling to event-driven state management.

…152) * feat(console): drive model picker from harness-node models-catalog Replace the hardcoded `ModelId` union and four-entry `MODELS` constant with composite `provider::<catalog_model_id>` keys sourced from the harness-node models-catalog worker. - `lib/catalog-model-key.ts`: parse/format `provider::id` (uses `::` separator constant `CATALOG_MODEL_KEY_SEP`). - `lib/models-catalog.ts`: client wrapper around `models::list` / `models::get` for the picker. - `hooks/use-model-picker-source.ts`: live-updating picker source with a static fallback for when the engine is down or the mock backend is in use. - `types/chat.ts`: `ModelId` widened to `string`; `STATIC_MODEL_OPTIONS` + `DEFAULT_MODEL` replace the old enum-shaped `MODELS`. - `components/chat/ModelPicker.tsx`: accept `options` + `loading` props; fall back to `{id, label}` from the current value if the catalog is empty so the picker is never blank. - `components/chat/Composer.tsx`, `components/chat/ChatView.tsx`, `pages/Chat.tsx`, `hooks/use-conversations.ts`, `pages/Examples/sections/*`, `pages/Playground/index.tsx`: thread the catalog-shaped model ids through; legacy heuristic ids without `::` still get the old `claude*` / `gemini*` provider guess in `resolveRunParams`. - `lib/backend/real.ts`: `resolveRunParams` parses `provider::id` and falls back to the heuristic for legacy ids; doc comment updated. * refactor(real): remove debug instrumentation from session event handling This commit removes the debug instrumentation code that was previously used to log session events to a local ingestion endpoint. The removal streamlines the event handling process in the `realStream` function, improving code clarity and maintainability. * feat(approval-gate): port PR #150 with iii-native resume + console wiring Port the fail-closed approval-gate redesign from the Rust workspace to harness-node and finish wiring the console/web UI so chat sessions can approve/deny intercepted tool calls without polling. Backend: - intercept-at-gate: subscriber returns block_reply with status pending|denied; resolved decisions are drained later via approval::consume; pending entries can be force-resolved via approval::sweep_session - run::resume rebuilds terminal TurnStateRecord (preserving turn_count + max_turns) and publishes a fresh turn::step_requested - handle_resolve emits approval_resolved + asks the orchestrator to resume via a single iii::durable::publish on turn::step_requested; no polling loop and no explicit timeout (the bus default is the right backstop) - subscriber detects terminal records and rebuilds via build_resume_plan before stepping - hook-fanout merge_first_block_wins preserves the full blocking reply (status, function_call_id, subscriber, approval_gate, denial) - publish-collect emits publish.{ok,error} + publish_failed so the orchestrator can fail closed on bus unavailability - handle_awaiting + tearing_down each call approval::consume first; on non-empty result they stage the resolved entries as prepared calls and transition (or resurrect) into FunctionExecute - router::abort composes [state::set abort, approval::sweep_session] - session-tree sorts entries by (timestamp, id) so replies always appear in causal order Console/web: - approve/deny buttons call approval::resolve via the iii client - the ui::session::event stream stays subscribed for the chat lifetime (no auto-close on first agent_end) so multi-turn approvals continue to flow - approval prompts auto-scroll into view; submitting state shows disabled buttons + visible feedback while waiting for the agent to resume - fcall-start events deduplicate on functionCallId; pending_approval placeholders suppress their fcall-end until the real result lands Tests: TDD'd helpers, intercept/consume/sweep handlers, resume plumbing, the orchestrator integration points, and an end-to-end approval-resume.e2e covering intercept -> resolve -> publish -> subscriber wake -> rebuild -> consume -> execute. * feat(sessions-poll): implement reactive sessions fanout with state triggers Replaces the polling mechanism for session updates with a reactive approach using state triggers. The new implementation filters state events for session creation and directly pushes updates to subscribers without polling. This change enhances efficiency and reduces latency in session updates, streamlining the fanout process. Key components include condition and handler functions for session creation events, and a direct trigger registration for state changes. * Simplify iii-native approval gate flow (#154) * Make approval pause explicit in the turn state machine The iii-native approval flow needs a non-terminal state where the orchestrator can wait without tearing down the turn or rebuilding from provisioning. This adds the state value and locks its non-terminal behavior in the state test. Constraint: Plan task 1 targets harness-node TypeScript only; Rust workers remain untouched. Rejected: Keep using stopped/resume as the pause marker | later tasks remove that detour. Confidence: high Scope-risk: narrow Tested: pnpm test tests/turn-orchestrator/state.test.ts Not-tested: Full harness-node suite deferred until the batch verification step * Let the turn record own approval waits The pause-in-place design needs the orchestrator record to carry the function calls that are waiting on human decisions. This adds the awaiting_approval entry shape without changing any behavior yet. Constraint: Field remains optional so fresh and existing records continue to deserialize without migration. Rejected: Store pending calls only in approval-gate state | the simplification moves ownership into the orchestrator FSM. Confidence: high Scope-risk: narrow Tested: pnpm test tests/turn-orchestrator/state.test.ts Tested: pnpm typecheck Not-tested: Full harness-node suite deferred until the batch verification step * Make prepared function calls extensible for approvals The upcoming pause-in-place approval handler needs to persist per-call metadata without overloading tuple positions. This converts prepared calls to an object shape and adds the optional pre_approved flag while preserving the current execution behavior. Constraint: Existing staged values may use function_call or legacy tool_call keys, so the loader keeps both forms. Rejected: Add a parallel pre-approved lookup table | it would split one prepared-call concern across two state keys. Confidence: medium Scope-risk: moderate Directive: Keep PreparedEntry as the single persisted shape for prepared-call execution metadata. Tested: pnpm test tests/turn-orchestrator/persistence-prepared.test.ts tests/turn-orchestrator/functions.test.ts tests/turn-orchestrator/assistant.test.ts tests/turn-orchestrator/tearing-down.test.ts Tested: pnpm test tests/turn-orchestrator Tested: pnpm test Tested: pnpm typecheck Tested: pnpm lint * Define dispatch outcomes before approval cutover The approval pause cutover needs dispatchWithHook to distinguish successful execution, hard denial, and pending approval. This commit introduces the discriminated result type before changing runtime dispatch behavior. Constraint: Task 4 is type-only; dispatchWithHook still returns FunctionResult until the later refactor task. Rejected: Change dispatch behavior now | the plan keeps the behavioral cutover staged. Confidence: high Scope-risk: narrow Tested: pnpm test tests/turn-orchestrator/agent-call.test.ts Tested: pnpm typecheck Not-tested: Full harness-node suite deferred until batch verification * Resume approved calls from an explicit wait state The new approval pause state needs a handler that stays idle until every awaited decision is present, then updates prepared calls with either pre_approved metadata or a denial result before returning to function execution. Constraint: Decisions are read from the approvals state scope using the existing session/function-call key shape. Rejected: Poll approval::consume from generic assistant or teardown states | the simplification makes awaiting approval an explicit FSM state. Confidence: medium Scope-risk: moderate Directive: Keep missing decisions as a no-op so durable turn::step_requested wakeups remain idempotent. Tested: pnpm test tests/turn-orchestrator/awaiting-approval.test.ts Tested: pnpm test tests/turn-orchestrator Tested: pnpm typecheck Not-tested: Full harness-node suite deferred until batch verification * Route approval waits through the turn transition switch The FSM can now enter function_awaiting_approval, so the central step dispatcher must route that state to the new handler instead of silently doing nothing. Constraint: This only wires the state; handleExecute has not yet been cut over to produce it. Rejected: Depend on fallthrough/no-op behavior | the wait state needs explicit wake handling from turn::step_requested. Confidence: high Scope-risk: narrow Tested: pnpm test tests/turn-orchestrator/state.test.ts Tested: pnpm test tests/turn-orchestrator Tested: pnpm typecheck Not-tested: Full harness-node suite deferred until batch verification * Normalize awaiting-approval files for Biome The new wait-state handler passed behavior and type checks, but Biome required import ordering and line wrapping before the batch could be called clean. Constraint: Formatting-only follow-up after Task 5 and Task 6 verification. Confidence: high Scope-risk: narrow Tested: pnpm test Tested: pnpm typecheck Tested: pnpm lint * Return structured outcomes from hooked dispatch The approval cutover needs dispatchWithHook to report whether a call ran, was denied, or is waiting for human approval. This changes the dispatcher API to return DispatchResult and keeps the LLM-facing agent::call wrapper unwrapped as a FunctionResult. Constraint: handleExecute still consumes the old return shape until the next cutover task. Rejected: Keep pending as a placeholder FunctionResult | the new FSM state owns approval waiting explicitly. Confidence: medium Scope-risk: moderate Directive: Callers must branch on DispatchResult.kind before treating the value as a FunctionResult. Tested: pnpm test tests/turn-orchestrator/agent-call.test.ts tests/turn-orchestrator/functions.test.ts Not-tested: pnpm typecheck fails until handleExecute is cut over in Task 9 * Pause function execution in the approval wait state handleExecute now treats pending approval as a real FSM pause instead of manufacturing a placeholder function result. It also resumes pre-approved calls without re-entering the hook and emits denial results without dispatching blocked calls. Constraint: Old placeholder helper tests are skipped until the later cleanup task removes that machinery. Rejected: Keep pending approval as a terminating FunctionResult | that leaks an artificial result into the transcript and keeps the resume detour alive. Confidence: medium Scope-risk: moderate Directive: Preserve the existing executed-call check before dispatching so retrying a step does not duplicate approved function execution. Tested: pnpm test tests/turn-orchestrator/functions.test.ts Tested: pnpm test tests/turn-orchestrator Tested: pnpm typecheck Tested: pnpm test tests/integration/approval-resume.e2e.test.ts Tested: pnpm test Tested: pnpm lint * Keep approval gate pending state in the orchestrator The gate no longer writes a pending state-bus record when policy asks for human approval. It emits approval_requested and returns a pending hook reply; the turn orchestrator owns the pending-call list on its state record. Constraint: The approvals state bus now stores resolved decisions only. Rejected: Keep handleIntercept in the gate path | it preserves the obsolete pending-record lifecycle and duplicates orchestrator state. Confidence: medium Scope-risk: moderate Directive: Do not add state-bus pending writes back to policy::approval_gate; pending ownership belongs to TurnStateRecord.awaiting_approval. Tested: pnpm test tests/approval-gate/gate-subscriber.test.ts tests/approval-gate/events.test.ts Tested: pnpm typecheck * Drop approval request expiry from event types Approval prompts no longer time out through the gate, so the shared event types should not promise an expires_at timestamp. The console typecheck also needed the currently-unused mode parameter marked intentionally unused. Constraint: No timeout sweep behavior remains in the target approval-request event contract. Rejected: Keep expires_at as optional | it would preserve dead UI expectations around automatic expiry. Confidence: high Scope-risk: narrow Tested: pnpm typecheck in harness-node Tested: pnpm typecheck in console/web * Remove obsolete approval intercept lifecycle The gate-subscriber now handles needs_approval directly and the orchestrator owns pending approval state, so the old intercept helper and its state-writing tests are dead. The old approval-resume integration is paused until the new pause-in-place integration lands later in the plan. Constraint: Task 26 is responsible for replacing the legacy integration coverage. Rejected: Keep intercept.ts as compatibility shim | no production caller remains and it would preserve obsolete pending-record semantics. Confidence: medium Scope-risk: moderate Directive: Do not reintroduce approval-gate/intercept; route policy::approval_gate pending replies through gate-subscriber directly. Tested: pnpm typecheck in harness-node Tested: pnpm test tests/approval-gate tests/integration/approval-resume.e2e.test.ts * Normalize console backend imports Biome sorted the touched console backend imports after the approval event type cleanup. This keeps the batch verification clean without changing runtime behavior. Constraint: Formatting-only follow-up from console/web verification. Confidence: high Scope-risk: narrow Tested: pnpm typecheck in console/web Tested: pnpm exec biome check src/lib/backend/real.ts src/types/iii-agent-event.ts in console/web * Let approval resolution overwrite the wait record The orchestrator now pauses in a native awaiting-approval state and reads the final decision directly from the state bus, so resolve no longer needs to validate or mutate a pending lifecycle record. It writes the compact decision payload and wakes the turn subscriber once the write succeeds. Constraint: Keep approval::list_pending export until its registration is removed in the later registry cleanup task. Rejected: Preserve pending/status/resolved_at transitions | the paused turn only needs decision and reason, and stale lifecycle fields recreate the old consume path. Confidence: high Scope-risk: narrow Directive: Do not reintroduce preexisting pending-record checks here; the paused turn owns waiting semantics. Tested: pnpm --filter harness-node test tests/approval-gate/pending.test.ts * Retire wake-failed approval events Resolve now writes a decision and publishes the turn wake directly. There is no retry wrapper that converts wake errors into stream events, so keeping approval_wake_failed advertised a lifecycle branch that no longer exists. Constraint: Preserve approval_requested and approval_resolved event envelopes unchanged for existing consumers. Rejected: Keep an unused event helper for compatibility | no caller remains and the event would only describe obsolete resume failure handling. Confidence: high Scope-risk: narrow Directive: Add a new lifecycle event only when a real runtime branch emits it. Tested: rg -n "emitApprovalWakeFailed|approval_wake_failed" harness-node/src harness-node/tests Tested: pnpm --filter harness-node typecheck && pnpm --filter harness-node test tests/approval-gate * Remove approval placeholder resurrection helpers Approval waits now stay inside the function FSM, so the old placeholder, consume, and resurrection helpers no longer describe the runtime path. Finalize appends real function results directly, awaiting-assistant no longer polls approval::consume, and teardown no longer resurrects terminal turns. Constraint: Keep the paused-turn approval path in function_awaiting_approval; resolved decisions are read from state during function execution resumption. Rejected: Leave consume helpers until a later pass | deleting the helpers without their callers would break typecheck, and keeping them would preserve dead behavior. Confidence: high Scope-risk: moderate Directive: Do not add transcript placeholder replacement back unless function_awaiting_approval is removed or the transcript again stores synthetic pending results. Tested: pnpm --filter harness-node test tests/turn-orchestrator Tested: pnpm --filter harness-node typecheck Tested: pnpm --filter harness-node lint * Drop obsolete approval list and consume APIs Approval decisions are now read directly by the paused function state, so the separate list_pending and consume endpoints are dead runtime surface. Removing them keeps the approval gate contract to resolve, sweep-on-abort, and the policy subscriber. Constraint: Keep approval::sweep_session registered until abort cleanup is converted to inline state cleanup. Rejected: Remove sweep in the same pass | router::abort still calls approval::sweep_session and that cleanup belongs to the abort task. Confidence: high Scope-risk: narrow Directive: Do not re-add list_pending or consume for resume; function_awaiting_approval is the resume boundary. Tested: pnpm --filter harness-node test tests/approval-gate tests/harness/policy.test.ts tests/turn-orchestrator/assistant.test.ts tests/turn-orchestrator/tearing-down.test.ts Tested: pnpm --filter harness-node lint && pnpm --filter harness-node typecheck && pnpm --filter harness-node test * Wake turns from approval state writes Approval decisions are now intended to be reactive: the state write is the event that wakes the orchestrator. This adds the adapter that turns an approvals-scope state trigger into the existing turn::step_requested durable publish. Constraint: State trigger payload keys use the approval key shape <session_id>/<function_call_id>. Rejected: Debounce approval decision writes | repeated wakes are harmless and debouncing would add state to a pure adapter. Confidence: high Scope-risk: narrow Directive: Keep this adapter thin; orchestration policy belongs in function_awaiting_approval. Tested: pnpm --filter harness-node test tests/approval-gate/on-decision-written.test.ts * Register approval decision state trigger The approval gate now exposes the on-decision adapter as a state trigger on the approvals scope. Any write to an approval decision key can wake the turn orchestrator without the writer knowing about turn::step_requested. Constraint: Register with scope only and no key filter so every approvals/<session>/<call> decision write is observed. Rejected: Keep wake-up only in approval::resolve | abort and future override writers would still need duplicated publish logic. Confidence: high Scope-risk: narrow Directive: Keep the trigger scope tied to loadApprovalGateConfig so custom approval scopes keep working. Tested: pnpm --filter harness-node typecheck Tested: pnpm --filter harness-node test tests/approval-gate * Let approval resolve rely on reactive wakeups The approvals state trigger now owns turn wake-up, so approval::resolve should only persist the decision and emit approval_resolved. Removing resumeSession eliminates the duplicated explicit durable publish from the resolve path. Constraint: approval::on_decision_written is registered on the approvals scope and publishes turn::step_requested after the state write. Rejected: Keep resolve publishing as a fallback | it would double-wake sessions once the state trigger is active and blur responsibility for the wake-up. Confidence: high Scope-risk: narrow Directive: Writers to the approvals scope should not publish turn::step_requested directly; the state trigger is the boundary. Tested: pnpm --filter harness-node test tests/approval-gate/pending.test.ts Tested: pnpm --filter harness-node typecheck && pnpm --filter harness-node test tests/approval-gate && pnpm --filter harness-node lint * Abort approvals through state decisions The abort path now writes aborted approval decisions directly for turns paused in function_awaiting_approval. Those writes are enough to wake the orchestrator through the approvals state trigger, so abort side effects no longer call sweep_session or publish turn::step_requested. Constraint: This branch did not yet have performAbortSideEffects, so this applies the missing inline cleanup and the reactive no-publish behavior together. Rejected: Keep approval::sweep_session in abort side effects | it preserves the old lifecycle endpoint and bypasses the decision-write trigger boundary. Confidence: high Scope-risk: moderate Directive: Abort approval cleanup should write approvals/<session>/<call> decisions and let the state trigger wake the FSM. Tested: pnpm --filter harness-node test tests/turn-orchestrator/abort.test.ts Tested: pnpm --filter harness-node typecheck && pnpm --filter harness-node test tests/turn-orchestrator/abort.test.ts && pnpm --filter harness-node lint * Cover reactive approval wake integration The integration fixture now simulates the engine firing a state trigger after approvals-scope writes. A direct decision write proves the adapter publishes turn::step_requested without approval::resolve doing it explicitly. Constraint: The fake state trigger runs on a microtask to match the engine's async trigger behavior closely enough for this unit-level integration test. Rejected: Restore the skipped legacy approval-resume contract | it tested the old explicit resume path rather than the reactive state-trigger boundary. Confidence: high Scope-risk: narrow Directive: Approval resume integration should model approvals state writes as the wake source. Tested: pnpm --filter harness-node test tests/integration/approval-resume.e2e.test.ts Tested: pnpm --filter harness-node typecheck && pnpm --filter harness-node test tests/integration/approval-resume.e2e.test.ts && pnpm --filter harness-node lint * Remove terminal resume path from turn stepping Approval waits now pause inside function_awaiting_approval and wake through approvals state writes. Terminal-session resurrection via run::resume and buildResumePlan is dead behavior, so turn::step now treats terminal records as terminal and no longer republishes while waiting for approval. Constraint: Keep publishStep for the normal FSM pump after non-terminal transitions. Rejected: Preserve run::resume for compatibility | no active approval path calls it and it would revive the old stopped-session model. Confidence: high Scope-risk: moderate Directive: Do not reintroduce terminal resurrection unless approval waiting leaves the FSM again. Tested: pnpm --filter harness-node test tests/turn-orchestrator Tested: pnpm --filter harness-node typecheck Tested: pnpm --filter harness-node lint * Drop approval sweep and timeout remnants The gate no longer owns pending-record lifecycles or timeouts. Abort writes aborted decisions inline and approval decisions wake through the state trigger, so approval::sweep_session, buildPendingRecord, and approval_gate.default_timeout_ms are dead surface. Constraint: Leave unrelated hook-fanout and turn-orchestrator timeout config intact. Rejected: Keep sweep as an internal fallback | no caller remains and it depends on the deleted pending record shape. Confidence: high Scope-risk: narrow Directive: Approval cleanup should write decision records directly; do not restore pending/resolved sweep state. Tested: pnpm --filter harness-node test tests/approval-gate tests/turn-orchestrator/abort.test.ts Tested: pnpm --filter harness-node typecheck Tested: pnpm --filter harness-node lint * Remove stale approval placeholder branches The approval wait path no longer emits synthetic pending function results, so console translation does not need to suppress pending_approval function_execution_end events. The hook comment now describes the actual pending dispatch result instead of the removed placeholder flow. Constraint: Preserve approval_requested and approval_resolved UI event translation. Rejected: Keep placeholder suppression defensively | it preserves a deleted wire shape and hides unexpected function results. Confidence: high Scope-risk: narrow Directive: Pending approval UI should be driven by approval_requested, not synthetic function_result placeholders. Tested: pnpm --filter chat-app typecheck Tested: pnpm --filter chat-app exec biome check src/lib/backend/translate.ts Tested: pnpm --filter harness-node test tests/turn-orchestrator/agent-call.test.ts && pnpm --filter harness-node lint * Clean console approval lint drift Repo-wide lint surfaced small console issues around approval resolve handlers and import ordering. Removing non-null assertions and stale suppressions keeps the cleanup branch lintable without changing approval UI behavior. Constraint: Keep existing approval resolve callback contract unchanged. Rejected: Leave console lint failures as unrelated | the approval callback warnings are on this feature surface and are cheap to make safe. Confidence: high Scope-risk: narrow Directive: Keep approval IDs narrowed before invoking resolve callbacks instead of using non-null assertions. Tested: pnpm --filter chat-app lint && pnpm --filter chat-app typecheck Tested: pnpm -r lint * fix(approval-gate): match engine state:* event_type values * test(approval-gate): use engine state:* event_type values * test(integration): fake-iii emits engine-correct event_type for state writes * feat(approval-gate): filter approvals state trigger with condition function Add isDecisionWrite/CONDITION_FN_ID to on-decision-written.ts and wire condition_function_id into the state trigger config in register.ts so the iii engine skips the handler on state:deleted and writes lacking a decision field, without touching the in-handler guard (removed in Task 5). * refactor(approval-gate): handler relies on engine-side condition for filtering * feat(turn-orchestrator): adapter + condition for reactive abort wake * feat(turn-orchestrator): register reactive abort_signal state trigger * test(integration): abort_signal write triggers reactive wake * chore: final cleanup after reactive trigger follow-up Fix biome import-order lint: move `import type { ISdk }` before the value imports in on-abort-signal.test.ts. * Remove obsolete approval event handling and tests This commit deletes the `events.ts` file, which contained the `emitApprovalRequested` and `emitApprovalResolved` functions, as their functionality has been integrated directly into the `gate-subscriber` and `pending` modules. Corresponding test cases in `events.test.ts` have also been removed to reflect this change. The new implementation uses `streamSet` directly for emitting approval events, streamlining the approval gate flow.

Sync the harness-node docs introduced on main with this branch's pause-and-resume refactor: 11-state FSM with function_awaiting_approval, state-trigger wakes on the approvals and agent scopes, hook-fanout's stream-trigger reply path, and the harness sessions fanout moving from a 1 Hz state::list diff to a state trigger. Also drop poll_interval_ms from PublishCollectConfig — it's leftover from the old polling implementation and unused in the reactive path.

…ate trigger executeSync now installs an in-process waiter keyed by session_id, kicks the run, and races the waiter against sync_default_timeout_ms instead of polling persistence every sync_poll_interval_ms. The waiter is resolved by a new `state` trigger on scope=agent gated by turn::is_terminal_state_write, which fires the moment the FSM lands `session/<sid>/turn_state` with state=stopped. Drops sync_poll_interval_ms from the config type, default loader, and config.yaml — the only consumer is gone.

- Add realResolveApproval calling approval::resolve over the iii bus and expose it via ChatBackend.resolveApproval so the chat UI can approve or deny pending function calls. - Thread sessionId into translateAgentEvent so approval-related stream events carry their session context. - Drop the stale "Phase 3 adds approve/deny buttons" comment now that the flow is live end-to-end.

vercel · 2026-05-18T21:13:11Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
workers	Ready	Preview, Comment	May 18, 2026 9:13pm

github-actions · 2026-05-18T21:13:22Z

skill-check — worker

0 verified, 10 skipped (no docs/).

Layer	Result
structure	✓
vale	✓
ai	✓

Three for three. Nicely done.

coderabbitai · 2026-05-18T21:13:34Z

📝 Walkthrough

Walkthrough

This PR implements a reactive, event-driven approval flow for the turn orchestrator, replacing polling-based decision waiting with state-triggered resumption. The approval gate now returns synchronously with allow/deny/pending outcomes; pending calls are parked in a new function_awaiting_approval turn state; decisions written to durable state trigger turn::step_requested publications to resume parked turns; and the chat UI gains interactive approval resolution with async callback support and feedback.

Changes

Reactive approval flow

Layer / File(s)	Summary
Approval data contracts `console/web/src/lib/backend/types.ts`, `console/web/src/types/chat.ts`, `console/web/src/types/iii-agent-event.ts`, `harness-node/src/turn-orchestrator/state.ts`, `harness-node/src/types/agent-event.ts`	`fcall-start` events now carry optional `functionCallId` and `sessionId` for approval resolution pairing. `AgentEvent`'s `approval_requested` variant removes `expires_at` (synchronous gate behavior). `TurnStateRecord` adds optional `awaiting_approval` array of `AwaitingApprovalEntry` to track parked function calls.
Approval gate synchronous responses `harness-node/src/approval-gate/gate-subscriber.ts`, `harness-node/src/approval-gate/types.ts`, `harness-node/src/approval-gate/config.ts`	Gate subscriber now returns `allow`/`deny`/`pending` immediately without polling or state-backed pending records. Pending path emits `approval_requested` event; deny path emits `function_call_denied`. `blockReplyFor` helper standardizes gate response envelope with `subscriber` and `approval_gate` flags. Config removes `default_timeout_ms`.
Decision-written trigger and event emission `harness-node/src/approval-gate/on-decision-written.ts`, `harness-node/src/approval-gate/pending.ts`, `harness-node/src/approval-gate/register.ts`	New `on-decision-written` adapter detects decision writes to `approvals` state scope and publishes `turn::step_requested` to resume orchestrator. `handleResolveWithEvents` persists decision and emits `approval_resolved` event to `agent::events`. Registration wires decision-written trigger on `approval_state_scope`.
Function dispatch result types and hook routing `harness-node/src/turn-orchestrator/agent-call.ts`, `harness-node/src/turn-orchestrator/hook.ts`	`DispatchResult` union models three outcomes: `result`, `deny`, `pending`. `dispatchWithHook` now accepts optional `session_id` and returns `pending` when gate indicates needs_approval. `consultBefore` hook routes `session_id` into fanout payload to enable decision correlation.
Prepared entries and function execution `harness-node/src/turn-orchestrator/persistence.ts`, `harness-node/src/turn-orchestrator/states/functions.ts`	`PreparedEntry` type adds optional `pre_approved` flag for short-circuit execution. `handlePrepare` builds entry objects with `blocked: null` (replacing tuples). `handleExecute` introduces early bypass paths: pre_approved entries execute directly via `iii.trigger`; blocked entries persist as errors. Pending outcomes enqueue calls into `rec.awaiting_approval` and transition to `function_awaiting_approval`.
Awaiting-approval state handler `harness-node/src/turn-orchestrator/states/functions.ts`, `harness-node/src/turn-orchestrator/transitions.ts`	New `handleAwaitingApproval` loads awaiting entries, reads approval decisions in parallel from persisted state, updates prepared calls with either `pre_approved: true` or `blocked: <denial>`, persists updates, clears `awaiting_approval`, and transitions back to `function_execute`. Router includes handler in `function_awaiting_approval` case.
Terminal waiter and abort signal adapters `harness-node/src/turn-orchestrator/on-terminal.ts`, `harness-node/src/turn-orchestrator/on-abort-signal.ts`, `harness-node/src/turn-orchestrator/abort.ts`, `harness-node/src/turn-orchestrator/run-start.ts`, `harness-node/src/turn-orchestrator/register.ts`	Terminal waiter mechanism replaces polling: `installTerminalWaiter` registers promise resolver, `isTerminalStateWrite` filters for `session/<id>/turn_state` writes with `state='stopped'`, `handleTerminalStateWrite` resolves waiter. Abort adapter detects abort signal writes and publishes `turn::step_requested`. `performAbortSideEffects` writes aborted decisions for pending approvals. `run-start.ts` installs terminal waiter and races timeout.
Chat approval resolution UI `console/web/src/components/chat/FunctionCallMessage.tsx`, `console/web/src/components/chat/Message.tsx`, `console/web/src/components/chat/MessageList.tsx`, `console/web/src/components/chat/ChatView.tsx`, `console/web/src/lib/backend/real.ts`, `console/web/src/lib/backend/translate.ts`	`FunctionCallMessage` adds async `runResolve` helper with `submitting` and `submitError` state; buttons show "approving…/denying…" during in-flight and render error/waiting messages. `Message` derives `onApprove`/`onDeny` from `onResolveApproval` prop. `MessageList` forwards callback and auto-scrolls newest pending approval. `ChatView` wires `backend.resolveApproval`. `realResolveApproval` calls `approval::resolve` backend method. `translateAgentEvent` threads `sessionId` into `fcall-start` payloads.
Hook-fanout event-driven collection `harness-node/src/hook-fanout/publish-collect.ts`, `harness-node/src/hook-fanout/types.ts`, `harness-node/src/hook-fanout/merge.ts`	Shifts from polling `stream::list` loop to in-process `pending` map keyed by `event_id`. `handleStreamReply` stream trigger routes incoming replies to matching `Collector`. Response shape includes `publish` status object and `publish_failed` flag. `mergeFirstBlockWins` now preserves full blocking reply verbatim (including `status`, `subscriber`, `approval_gate` markers).
Session creation fanout event-driven `harness-node/src/harness/fanout/sessions-poll.ts`	Replaces 1Hz state-list polling diff loop with state-triggered condition/handler on `state:created` events matching `session/<id>/turn_state` keys. Extracts newly-created `session_id` and fans out to subscribers without storing polling state.
Session entry sorting and timestamp helper `harness-node/src/session/tree/types.ts`, `harness-node/src/session/tree/store.ts`	Adds `entryTimestamp(e)` helper to return `SessionEntry.timestamp` field. `loadEntries` now sorts by `(timestamp, id)` instead of id alone for correct transcript ordering on resumed replies.
Select component grouped optgroups `console/web/src/components/ui/Select.tsx`	Introduces `SelectGroup<T>` type and optional `groups` prop; `options` becomes optional. Conditional rendering: if `groups` present, renders `<optgroup>` + nested `<option>`s; otherwise maps `options`.
Model selection persistence `console/web/src/lib/storage.ts`, `console/web/src/hooks/use-conversations.ts`, `console/web/src/components/chat/ModelPicker.tsx`, `console/web/src/pages/Chat.tsx`	New `loadLastModel`/`saveLastModel` helpers persist selected model to `localStorage`. `useConversations` accepts optional `catalogReady` flag and initializes conversations with persisted last model or fallback. `setModel` persists selection. `Chat` passes `!catalogLoading` to gate model migration. `ModelPicker` groups options by provider via `SelectGroup`.
Docs and configuration `harness-node/config.yaml`, `harness-node/README.md`, `harness-node/docs/architecture.md`, `harness-node/docs/workers/*.md`, `iii-permissions.yaml`	Architecture docs updated to reflect reactive approval flow (synchronous gate, state-trigger resumption, no polling). Worker docs describe new trigger registration, decision-written wakeup, terminal waiter. Config removes `default_timeout_ms` and polling intervals. Permissions removes `approval::list_pending` default grant. Turn FSM updated to 10-state with explicit `function_awaiting_approval`.
Test coverage `harness-node/tests/approval-gate/.test.ts`, `harness-node/tests/hook-fanout/.test.ts`, `harness-node/tests/turn-orchestrator/.test.ts`, `harness-node/tests/session/tree/.test.ts`, `harness-node/tests/integration/approval-resume.e2e.test.ts`, `harness-node/tests/harness/policy.test.ts`	Added/updated 30+ test files covering gate-subscriber behavior, decision-written trigger, pending resolve, awaiting-approval state handler, terminal waiter, abort signal, stream-reply routing, session-entry sorting, and end-to-end approval resumption flow. Tests validate reactive behavior, event emission, state transitions, and integration scenarios.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

This is a large, multi-layer architectural refactor spanning both web console and harness orchestration: the approval flow shifts from polling to event-driven state triggers; turn orchestration gains a new function_awaiting_approval state with reactive resumption; hook-fanout and session polling convert from scheduled loops to stream/state triggers; chat UI gains approval resolution with async callbacks and streaming feedback; and configuration, permissions, and documentation are updated throughout. The changes are tightly coupled across multiple domains (types, data structures, event flow, state machines, UI, persistence, triggers) and introduce new async patterns and reactive trigger coordination that require careful review for correctness, concurrency safety, and test coverage completeness.

Possibly related PRs

iii-hq/workers#151: Both PRs modify console/web/src/components/chat/ChatView.tsx's streaming event handling around fcall-start and assistant-end (one adds function-call dedup/approval mapping, the other adds defensive boundary/stream reset behavior), so the changes are directly related at the same event-handling code level.
iii-hq/workers#106: Both PRs are related because they both shift the approval/resolve flow to function-call identifiers (function_call_id instead of tool-call ids) and update the corresponding UI/backend wiring to pair approvals with the correct call.

Suggested reviewers

sergiofilhowz

🐰 A harness turns from polls to events,
Approvals park and wait with debts,
State triggers wake the sleeping turn,
Chat buttons show decisions burn—
Reactive flow: the future's bet! ✨

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/harness-node

coderabbitai

Actionable comments posted: 9

🧹 Nitpick comments (6)

harness-node/src/approval-gate/state-bus.ts (1)

43-49: 💤 Low value

Consider simplifying the response shape detection.

The nested conditional with multiple type assertions works but could be more readable.

♻️ Alternative implementation

-      // The engine's state::list returns the values directly as an array.
-      // Earlier harness versions returned `{items: [...]}` so accept both.
-      const items = Array.isArray(resp)
-        ? resp
-        : Array.isArray((resp as { items?: unknown[] } | null)?.items)
-          ? ((resp as { items: unknown[] }).items as unknown[])
-          : [];
+      // The engine's state::list returns the values directly as an array.
+      // Earlier harness versions returned `{items: [...]}` so accept both.
+      let items: unknown[];
+      if (Array.isArray(resp)) {
+        items = resp;
+      } else if (resp && typeof resp === 'object' && 'items' in resp) {
+        const maybeItems = (resp as Record<string, unknown>).items;
+        items = Array.isArray(maybeItems) ? maybeItems : [];
+      } else {
+        items = [];
+      }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/approval-gate/state-bus.ts` around lines 43 - 49, The nested
conditional creating items from resp is verbose; simplify the shape detection by
checking Array.isArray(resp) first, then Array.isArray((resp as any)?.items) and
falling back to [] so you can remove the repeated type assertions and
nesting—update the logic that assigns items (the const items = ... block) to use
a single ternary or short if/else that returns resp when it's an array,
resp.items when that is an array, or [] otherwise, referencing resp and items in
state-bus.ts.

harness-node/src/turn-orchestrator/states/assistant.ts (1)

312-312: ⚡ Quick win

Remove development artifact comment.

The comment // reload 1779112003 appears to be a development or debugging marker with no documentation value.

🧹 Proposed cleanup

-// reload 1779112003

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/turn-orchestrator/states/assistant.ts` at line 312, Remove
the development artifact comment "// reload 1779112003" from assistant.ts;
locate the exact comment string in the file and delete it, ensuring no
surrounding code relies on it (it's a pure comment), then run linters/tests to
confirm no trailing whitespace or formatting issues remain after removal.

harness-node/tests/hook-fanout/merge.test.ts (1)

25-44: ⚡ Quick win

Consider verifying full object preservation.

The test asserts that individual known fields are preserved but doesn't verify that the merged result is deeply equal to the original blocking reply. If the blocking reply contains additional fields beyond the six checked, they wouldn't be validated.

🧪 Suggested enhancement for "verbatim" claim

   const merged = mergeFirstBlockWins(replies);
-  expect(merged.block).toBe(true);
-  expect(merged.status).toBe('pending');
-  expect(merged.function_call_id).toBe('call-1');
-  expect(merged.function_id).toBe('shell::exec');
-  expect(merged.subscriber).toBe('approval-gate');
-  expect(merged.approval_gate).toBe(true);
+  // Verify full object is preserved verbatim
+  expect(merged).toEqual(replies[1]);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/tests/hook-fanout/merge.test.ts` around lines 25 - 44, The test
for mergeFirstBlockWins currently checks only individual fields; update it to
assert the merged result is deeply equal to the original blocking reply to
guarantee full object preservation (including any extra fields). Locate the test
case using mergeFirstBlockWins and replace or augment the individual expects
with a single deep-equality assertion (e.g.,
expect(merged).toStrictEqual(replies[1]) or equivalent) so the merged object is
verified verbatim against the blocking reply.

harness-node/src/approval-gate/on-decision-written.ts (1)

33-54: 💤 Low value

Consider validating the extracted session_id.

The function extracts session_id from the key by taking the substring before the first slash, but doesn't validate that the extracted value is non-empty or well-formed. While the slash <= 0 check prevents keys starting with /, a key like "a/" would produce session_id = "a" without further validation.

🛡️ Suggested validation

   const session_id = key.slice(0, slash);
+  if (session_id.length === 0) return;

   try {
     await iii.trigger<unknown, unknown>({

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/approval-gate/on-decision-written.ts` around lines 33 - 54,
The extracted session_id in handleDecisionWritten is not validated beyond
slicing before the first slash; after computing session_id = key.slice(0, slash)
add a validation step to ensure session_id is non-empty and well-formed (e.g.,
not just whitespace and matching your expected pattern such as
/^[A-Za-z0-9_-]+$/ or UUID format if applicable). If validation fails, bail out
early (return) and optionally emit a debug/warn via logger.warn indicating an
invalid session_id with the original key; keep the existing iii.trigger call and
error handling unchanged when session_id is valid.

console/web/src/lib/backend/types.ts (1)

20-29: 💤 Low value

Clarify the semantic distinction between pendingApproval and the new approval resolution fields.

The pendingApproval boolean and the new functionCallId + sessionId fields work together but aren't redundant:

pendingApproval is a UI state flag indicating the message awaits user approval (pending → running → done lifecycle).

functionCallId + sessionId are identifiers needed to resolve the approval when the user clicks approve/deny.

These fields appear together in approval_requested events, but functionCallId + sessionId also appear independently in regular function_execution_start events (without pendingApproval). The boolean flag cannot be reliably inferred from the presence of IDs.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@console/web/src/lib/backend/types.ts` around lines 20 - 29, Clarify in the
type definition for the 'fcall-start' message that pendingApproval is a UI
lifecycle flag and not derivable from the presence of IDs: update the inline
comments for pendingApproval, functionCallId, and sessionId in the 'fcall-start'
union branch to state that pendingApproval indicates the message is awaiting
user approval (pending → running → done), while functionCallId and sessionId are
identifiers used to resolve an approval action (approve/deny) and may appear
independently in other events such as function_execution_start; ensure the
comments mention that the boolean cannot be inferred from the presence of the
IDs and that approval resolution code should use functionCallId+sessionId when
handling user actions.

harness-node/tests/integration/approval-resume.e2e.test.ts (1)

22-22: 💤 Low value

Prefer strict equality (===) over loose equality (==).

Lines 22 and 33 use old_value == null. Since line 17 explicitly coerces undefined to null with ?? null, old_value will always be either the stored value or null. Using strict equality (old_value === null) is more conventional and makes the intent clearer.
♻️ Proposed fix
-          event_type: old_value == null ? 'state:created' : 'state:updated',
+          event_type: old_value === null ? 'state:created' : 'state:updated',
Apply the same change at line 33.
Also applies to: 33-33
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/tests/integration/approval-resume.e2e.test.ts` at line 22,
Change the loose null check `old_value == null` to a strict comparison
`old_value === null` wherever it appears (specifically the ternary that sets
`event_type` and the similar check later), since `old_value` is already
normalized to either a value or null; update both instances to use `===` to make
the intent explicit and avoid loose equality.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@harness-node/docs/workers/turn-orchestrator.md`:
- Line 56: Update the FSM docs row for the function_finalize state to use the
exact emitted event name: replace the incorrect `function_call_end` with
`function_execution_end` so the table reads that `function_finalize` persists
results and emits `function_execution_end` + `turn_end`; reference the
`function_finalize` state and the `function_execution_end` event to keep the
contract consistent with the orchestrator implementation.

In `@harness-node/src/approval-gate/gate-subscriber.ts`:
- Around line 65-108: The current logic calls streamSet before writeHookReply so
any streamSet error will prevent the gate response from being sent; to fix, wrap
the streamSet calls in a try/catch in both the deny branch (where
permissionsDenyEnvelope is used and reply is returned) and the pending branch
(approval_requested), log or handle the streamSet error but always proceed to
call writeHookReply with the constructed reply (and return it in the deny case);
reference streamSet, writeHookReply, permissionsDenyEnvelope, SUBSCRIBER_NAME,
uuidLike and ensure the deny branch still returns the reply after writeHookReply
even if streamSet failed.

In `@harness-node/src/approval-gate/pending.ts`:
- Around line 68-79: handleResolve currently calls streamSet to emit an
'approval_resolved' event (stream_name 'agent::events', item_id
`approval-${uuidLike()}`) but doesn't handle failures, so a thrown error can
make the API call fail even after the decision is committed; wrap the
streamSet(...) call in a try/catch inside handleResolve, and on error do not
rethrow—instead log the failure (using the module's existing logger or
console.error) with contextual details (session_id, function_call_id, decision,
and the caught error) so the resolve flow remains successful while still
recording the event-write failure.

In `@harness-node/src/hook-fanout/publish-collect.ts`:
- Around line 188-201: The publish try/catch silently records publish_error but
lets the IIFE continue waiting until the deadline; modify the catch in the async
IIFE that calls buildPublishEnvelope and iii.trigger (function_id
'iii::durable::publish') so it fails fast: after logging and setting
publish_error (and keeping the existing logger.warn and publish_error
assignment), immediately rethrow the caught error (or otherwise reject/abort the
surrounding wait) so the caller stops waiting instead of stalling until the
deadline.

In `@harness-node/src/turn-orchestrator/agent-call.ts`:
- Around line 152-157: The code is incorrectly classifying failures from
iii.trigger as gate denials by returning kind: 'deny' with
denialResult(gateUnavailableEnvelope(...)); change this path so arbitrary
runtime/tool errors from iii.trigger are not mapped to authorization denials:
replace the deny/denialResult/gateUnavailableEnvelope return with an
error/failure variant (for example kind: 'error' or 'failure') and wrap the
original error (String(err)) in an execution-specific envelope/result (e.g.,
executionErrorEnvelope or errorResult) that references
function_call.function_id; ensure callers that handle denials versus execution
errors can distinguish the new kind.

In `@harness-node/src/turn-orchestrator/hook.ts`:
- Around line 50-58: The handler currently marks session_id optional which
allows a missing session_id to create an inner payload without it and
accidentally bypass approval; change the function signature to require
session_id (remove the optional ?), and add a runtime guard in the function that
throws or returns a failing HookOutcome when session_id is falsy to enforce
fail-closed behavior; ensure the inner payload construction (the inner variable
that includes session_id, function_call, approval_required) always includes
session_id and update any callsites if needed so extractCall/approval-gate
always receive a non-null session_id.

In `@harness-node/src/turn-orchestrator/states/functions.ts`:
- Around line 82-86: The resumed pre_approval branch dispatches iii.trigger
using raw fc.arguments (entry.pre_approved path) which loses the session/call
metadata added in the normal path; modify the entry.pre_approved === true branch
to build the same augmented payload as the regular path (merge fc.arguments with
the session/call context fields used in the regular flow—e.g., call/session IDs,
caller info, metadata) and pass that merged object to iii.trigger with
fc.function_id so resumed executions receive the same enriched context as
first-pass executions.

In `@harness-node/src/turn-orchestrator/states/tearing-down.ts`:
- Line 27: Remove the development artifact comment "// reload 1779112003" from
the tearing-down.ts file; locate the exact trailing comment text and delete that
line (and any other identical "reload" marker comments if present) so the file
contains no leftover dev reload markers before merging.

In `@harness-node/tests/approval-gate/pending.test.ts`:
- Around line 96-109: The test currently only asserts that
'iii::durable::publish' wasn't called but that function is never invoked by
handleResolveWithEvents here; update the failure-path assertion to ensure no
side-effect triggers occur when InMemoryStateBus.set rejects: after mocking
bus.set to throw, assert that iii.trigger was never called (or specifically
assert that no call had function_id 'stream::set') — locate the InMemoryStateBus
instance 'bus', the mocked 'set' call, and the SDK stub 'iii.trigger' in the
test and replace the current fns check with an assertion on iii.trigger mock
call count or absence of 'stream::set'.

---

Nitpick comments:
In `@console/web/src/lib/backend/types.ts`:
- Around line 20-29: Clarify in the type definition for the 'fcall-start'
message that pendingApproval is a UI lifecycle flag and not derivable from the
presence of IDs: update the inline comments for pendingApproval, functionCallId,
and sessionId in the 'fcall-start' union branch to state that pendingApproval
indicates the message is awaiting user approval (pending → running → done),
while functionCallId and sessionId are identifiers used to resolve an approval
action (approve/deny) and may appear independently in other events such as
function_execution_start; ensure the comments mention that the boolean cannot be
inferred from the presence of the IDs and that approval resolution code should
use functionCallId+sessionId when handling user actions.

In `@harness-node/src/approval-gate/on-decision-written.ts`:
- Around line 33-54: The extracted session_id in handleDecisionWritten is not
validated beyond slicing before the first slash; after computing session_id =
key.slice(0, slash) add a validation step to ensure session_id is non-empty and
well-formed (e.g., not just whitespace and matching your expected pattern such
as /^[A-Za-z0-9_-]+$/ or UUID format if applicable). If validation fails, bail
out early (return) and optionally emit a debug/warn via logger.warn indicating
an invalid session_id with the original key; keep the existing iii.trigger call
and error handling unchanged when session_id is valid.

In `@harness-node/src/approval-gate/state-bus.ts`:
- Around line 43-49: The nested conditional creating items from resp is verbose;
simplify the shape detection by checking Array.isArray(resp) first, then
Array.isArray((resp as any)?.items) and falling back to [] so you can remove the
repeated type assertions and nesting—update the logic that assigns items (the
const items = ... block) to use a single ternary or short if/else that returns
resp when it's an array, resp.items when that is an array, or [] otherwise,
referencing resp and items in state-bus.ts.

In `@harness-node/src/turn-orchestrator/states/assistant.ts`:
- Line 312: Remove the development artifact comment "// reload 1779112003" from
assistant.ts; locate the exact comment string in the file and delete it,
ensuring no surrounding code relies on it (it's a pure comment), then run
linters/tests to confirm no trailing whitespace or formatting issues remain
after removal.

In `@harness-node/tests/hook-fanout/merge.test.ts`:
- Around line 25-44: The test for mergeFirstBlockWins currently checks only
individual fields; update it to assert the merged result is deeply equal to the
original blocking reply to guarantee full object preservation (including any
extra fields). Locate the test case using mergeFirstBlockWins and replace or
augment the individual expects with a single deep-equality assertion (e.g.,
expect(merged).toStrictEqual(replies[1]) or equivalent) so the merged object is
verified verbatim against the blocking reply.

In `@harness-node/tests/integration/approval-resume.e2e.test.ts`:
- Line 22: Change the loose null check `old_value == null` to a strict
comparison `old_value === null` wherever it appears (specifically the ternary
that sets `event_type` and the similar check later), since `old_value` is
already normalized to either a value or null; update both instances to use `===`
to make the intent explicit and avoid loose equality.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 374c6ace-4e44-4d62-9e1b-bc120413b032

📥 Commits

Reviewing files that changed from the base of the PR and between fce3152 and c7b7012.

📒 Files selected for processing (78)

.gitignore
console/web/src/components/chat/ChatView.tsx
console/web/src/components/chat/FunctionCallMessage.tsx
console/web/src/components/chat/Message.tsx
console/web/src/components/chat/MessageList.tsx
console/web/src/components/chat/ModelPicker.tsx
console/web/src/components/ui/Select.tsx
console/web/src/hooks/use-conversations.ts
console/web/src/index.css
console/web/src/lib/backend/real.ts
console/web/src/lib/backend/translate.ts
console/web/src/lib/backend/types.ts
console/web/src/lib/models-catalog.ts
console/web/src/lib/storage.ts
console/web/src/pages/Chat.tsx
console/web/src/pages/Playground/index.tsx
console/web/src/types/chat.ts
console/web/src/types/iii-agent-event.ts
harness-node/README.md
harness-node/config.yaml
harness-node/docs/architecture.md
harness-node/docs/workers/approval-gate.md
harness-node/docs/workers/harness.md
harness-node/docs/workers/hook-fanout.md
harness-node/docs/workers/session.md
harness-node/docs/workers/turn-orchestrator.md
harness-node/src/approval-gate/config.ts
harness-node/src/approval-gate/gate-subscriber.ts
harness-node/src/approval-gate/on-decision-written.ts
harness-node/src/approval-gate/pending.ts
harness-node/src/approval-gate/register.ts
harness-node/src/approval-gate/state-bus.ts
harness-node/src/approval-gate/types.ts
harness-node/src/harness/fanout/sessions-poll.ts
harness-node/src/hook-fanout/config.ts
harness-node/src/hook-fanout/merge.ts
harness-node/src/hook-fanout/publish-collect.ts
harness-node/src/hook-fanout/types.ts
harness-node/src/session/tree/store.ts
harness-node/src/session/tree/types.ts
harness-node/src/turn-orchestrator/abort.ts
harness-node/src/turn-orchestrator/agent-call.ts
harness-node/src/turn-orchestrator/config.ts
harness-node/src/turn-orchestrator/hook.ts
harness-node/src/turn-orchestrator/on-abort-signal.ts
harness-node/src/turn-orchestrator/on-terminal.ts
harness-node/src/turn-orchestrator/persistence.ts
harness-node/src/turn-orchestrator/register.ts
harness-node/src/turn-orchestrator/run-start.ts
harness-node/src/turn-orchestrator/state.ts
harness-node/src/turn-orchestrator/states/assistant.ts
harness-node/src/turn-orchestrator/states/functions.ts
harness-node/src/turn-orchestrator/states/tearing-down.ts
harness-node/src/turn-orchestrator/subscriber.ts
harness-node/src/turn-orchestrator/transitions.ts
harness-node/src/types/agent-event.ts
harness-node/tests/approval-gate/gate-subscriber.test.ts
harness-node/tests/approval-gate/on-decision-written.test.ts
harness-node/tests/approval-gate/pending.test.ts
harness-node/tests/approval-gate/types.test.ts
harness-node/tests/harness/policy.test.ts
harness-node/tests/hook-fanout/merge.test.ts
harness-node/tests/hook-fanout/publish-collect.test.ts
harness-node/tests/integration/approval-resume.e2e.test.ts
harness-node/tests/session/tree/store.test.ts
harness-node/tests/session/tree/types.test.ts
harness-node/tests/turn-orchestrator/abort.test.ts
harness-node/tests/turn-orchestrator/agent-call.test.ts
harness-node/tests/turn-orchestrator/assistant.test.ts
harness-node/tests/turn-orchestrator/awaiting-approval.test.ts
harness-node/tests/turn-orchestrator/functions.test.ts
harness-node/tests/turn-orchestrator/on-abort-signal.test.ts
harness-node/tests/turn-orchestrator/on-terminal.test.ts
harness-node/tests/turn-orchestrator/persistence-prepared.test.ts
harness-node/tests/turn-orchestrator/run-start.test.ts
harness-node/tests/turn-orchestrator/state.test.ts
harness-node/tests/turn-orchestrator/tearing-down.test.ts
iii-permissions.yaml

💤 Files with no reviewable changes (7)

harness-node/src/types/agent-event.ts
harness-node/config.yaml
iii-permissions.yaml
harness-node/src/turn-orchestrator/config.ts
harness-node/src/hook-fanout/config.ts
console/web/src/types/iii-agent-event.ts
harness-node/tests/harness/policy.test.ts

coderabbitai · 2026-05-18T21:21:58Z

-| `function_finalize` | same | Persist results; emit `function_call_end` events. |
+| `function_execute` | same | Run each call via `dispatchWithHook` → `agent::call`. If the gate returns `pending`, append the call to `awaiting_approval` and transition to `function_awaiting_approval` (the rest of the batch is left for the resumed step). |
+| `function_awaiting_approval` | same (`handleAwaitingApproval`) | Read `approvals/<sid>/<cid>` for every entry in `awaiting_approval`. While any decision is still missing, return without stepping (the approval-gate's state trigger will wake us). When all decisions are present, fold them into the prepared snapshot — `allow` → `pre_approved: true`, `deny`/`aborted` → `blocked` with a denial result — clear `awaiting_approval`, and transition back to `function_execute`. |
+| `function_finalize` | same | Persist results; emit `function_call_end` + `turn_end` events. |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the emitted event name in the FSM table.

Line 56 says function_call_end, but the orchestrator emits function_execution_end. Keeping the exact event name avoids contract confusion in downstream consumers.

Suggested doc fix

-| `function_finalize` | same | Persist results; emit `function_call_end` + `turn_end` events. | +| `function_finalize` | same | Persist results; emit `function_execution_end` + `turn_end` events. |

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness-node/docs/workers/turn-orchestrator.md` at line 56, Update the FSM docs row for the function_finalize state to use the exact emitted event name: replace the incorrect `function_call_end` with `function_execution_end` so the table reads that `function_finalize` persists results and emits `function_execution_end` + `turn_end`; reference the `function_finalize` state and the `function_execution_end` event to keep the contract consistent with the orchestrator implementation.

coderabbitai · 2026-05-18T21:21:58Z

  if (outcome.kind === 'deny') {
    const env = permissionsDenyEnvelope(
      call.function_id,
      outcome.rule_id,
      outcome.matched_constraint,
      call.args,
    );
-    await writeEvent(ctx.iii, call.session_id, {
-      type: 'function_call_denied',
-      function_call_id: call.function_call_id,
-      tool_call_id: call.function_call_id,
-      function_id: call.function_id,
-      tool_name: call.function_id,
+    const reply = {
+      block: true,
+      reason: env.reason,
      denial: env,
+      subscriber: SUBSCRIBER_NAME,
+      approval_gate: true,
+    };
+    await streamSet(ctx.iii, {
+      stream_name: 'agent::events',
+      group_id: call.session_id,
+      item_id: `approval-${uuidLike()}`,
+      data: {
+        type: 'function_call_denied',
+        function_call_id: call.function_call_id,
+        tool_call_id: call.function_call_id,
+        function_id: call.function_id,
+        tool_name: call.function_id,
+        denial: env,
+      },
    });
-    const reply = blockReplyForDeny(env);
    await writeHookReply(ctx.iii, call.reply_stream, call.event_id, reply);
    return reply;
  }

-  // Step 2: pause-and-wait flow.
-  const now = Date.now();
-  const expires_at = now + ctx.cfg.default_timeout_ms;
-  const record = buildPendingRecord(
-    call.function_call_id,
-    call.function_id,
-    call.args,
-    now,
-    ctx.cfg.default_timeout_ms,
-  );
-  try {
-    await ctx.bus.set(
-      ctx.cfg.approval_state_scope,
-      pendingKey(call.session_id, call.function_call_id),
-      record,
-    );
-  } catch {
-    const reply = blockReplyForAllow();
-    await writeHookReply(ctx.iii, call.reply_stream, call.event_id, reply);
-    return reply;
-  }
-  await writeEvent(ctx.iii, call.session_id, {
-    type: 'approval_requested',
-    function_call_id: call.function_call_id,
-    tool_call_id: call.function_call_id,
-    function_id: call.function_id,
-    tool_name: call.function_id,
-    args: call.args,
-    expires_at,
-  });
-  const awaited = await awaitDecision(
-    ctx.bus,
-    ctx.cfg.approval_state_scope,
-    call.session_id,
-    call.function_call_id,
-    expires_at,
-  );
-  let reply: { block: false } | { block: true; denial: DenialEnvelope };
-  let decisionStr: 'allow' | 'deny';
-  let reasonForEvent: string | null;
-  if (awaited.kind === 'allow') {
-    reply = blockReplyForAllow();
-    decisionStr = 'allow';
-    reasonForEvent = null;
-  } else {
-    const env = userDenyEnvelope(call.function_id, awaited.reason, call.args);
-    reply = blockReplyForDeny(env);
-    decisionStr = 'deny';
-    reasonForEvent = awaited.reason;
-  }
-  await writeEvent(ctx.iii, call.session_id, {
-    type: 'approval_resolved',
-    function_call_id: call.function_call_id,
-    tool_call_id: call.function_call_id,
-    decision: decisionStr,
-    reason: reasonForEvent,
+  await streamSet(ctx.iii, {
+    stream_name: 'agent::events',
+    group_id: call.session_id,
+    item_id: `approval-${uuidLike()}`,
+    data: {
+      type: 'approval_requested',
+      function_call_id: call.function_call_id,
+      tool_call_id: call.function_call_id,
+      function_id: call.function_id,
+      tool_name: call.function_id,
+      args: call.args,
+    },
  });


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t let agent::events failures suppress gate replies.

In both deny and pending branches, a streamSet failure throws before writeHookReply, so the orchestrator may miss the actual gate response.

🧯 Suggested hardening

+import { logger } from '../runtime/otel.js'; @@ - await streamSet(ctx.iii, { - stream_name: 'agent::events', - group_id: call.session_id, - item_id: `approval-${uuidLike()}`, - data: { - type: 'function_call_denied', - function_call_id: call.function_call_id, - tool_call_id: call.function_call_id, - function_id: call.function_id, - tool_name: call.function_id, - denial: env, - }, - }); + try { + await streamSet(ctx.iii, { + stream_name: 'agent::events', + group_id: call.session_id, + item_id: `approval-${uuidLike()}`, + data: { + type: 'function_call_denied', + function_call_id: call.function_call_id, + tool_call_id: call.function_call_id, + function_id: call.function_id, + tool_name: call.function_id, + denial: env, + }, + }); + } catch (err) { + logger.warn('approval-gate: failed to emit function_call_denied event', { + err: String(err), + }); + } @@ - await streamSet(ctx.iii, { - stream_name: 'agent::events', - group_id: call.session_id, - item_id: `approval-${uuidLike()}`, - data: { - type: 'approval_requested', - function_call_id: call.function_call_id, - tool_call_id: call.function_call_id, - function_id: call.function_id, - tool_name: call.function_id, - args: call.args, - }, - }); + try { + await streamSet(ctx.iii, { + stream_name: 'agent::events', + group_id: call.session_id, + item_id: `approval-${uuidLike()}`, + data: { + type: 'approval_requested', + function_call_id: call.function_call_id, + tool_call_id: call.function_call_id, + function_id: call.function_id, + tool_name: call.function_id, + args: call.args, + }, + }); + } catch (err) { + logger.warn('approval-gate: failed to emit approval_requested event', { + err: String(err), + }); + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (outcome.kind === 'deny') {

const env = permissionsDenyEnvelope(

call.function_id,

outcome.rule_id,

outcome.matched_constraint,

call.args,

);

await writeEvent(ctx.iii, call.session_id, {

type: 'function_call_denied',

function_call_id: call.function_call_id,

tool_call_id: call.function_call_id,

function_id: call.function_id,

tool_name: call.function_id,

const reply = {

block: true,

reason: env.reason,

denial: env,

subscriber: SUBSCRIBER_NAME,

approval_gate: true,

};

await streamSet(ctx.iii, {

stream_name: 'agent::events',

group_id: call.session_id,

item_id: `approval-${uuidLike()}`,

data: {

type: 'function_call_denied',

function_call_id: call.function_call_id,

tool_call_id: call.function_call_id,

function_id: call.function_id,

tool_name: call.function_id,

denial: env,

},

});

const reply = blockReplyForDeny(env);

await writeHookReply(ctx.iii, call.reply_stream, call.event_id, reply);

return reply;

}

// Step 2: pause-and-wait flow.

const now = Date.now();

const expires_at = now + ctx.cfg.default_timeout_ms;

const record = buildPendingRecord(

call.function_call_id,

call.function_id,

call.args,

now,

ctx.cfg.default_timeout_ms,

);

try {

await ctx.bus.set(

ctx.cfg.approval_state_scope,

pendingKey(call.session_id, call.function_call_id),

record,

);

} catch {

const reply = blockReplyForAllow();

await writeHookReply(ctx.iii, call.reply_stream, call.event_id, reply);

return reply;

}

await writeEvent(ctx.iii, call.session_id, {

type: 'approval_requested',

function_call_id: call.function_call_id,

tool_call_id: call.function_call_id,

function_id: call.function_id,

tool_name: call.function_id,

args: call.args,

expires_at,

});

const awaited = await awaitDecision(

ctx.bus,

ctx.cfg.approval_state_scope,

call.session_id,

call.function_call_id,

expires_at,

);

let reply: { block: false } | { block: true; denial: DenialEnvelope };

let decisionStr: 'allow' | 'deny';

let reasonForEvent: string | null;

if (awaited.kind === 'allow') {

reply = blockReplyForAllow();

decisionStr = 'allow';

reasonForEvent = null;

} else {

const env = userDenyEnvelope(call.function_id, awaited.reason, call.args);

reply = blockReplyForDeny(env);

decisionStr = 'deny';

reasonForEvent = awaited.reason;

}

await writeEvent(ctx.iii, call.session_id, {

type: 'approval_resolved',

function_call_id: call.function_call_id,

tool_call_id: call.function_call_id,

decision: decisionStr,

reason: reasonForEvent,

await streamSet(ctx.iii, {

stream_name: 'agent::events',

group_id: call.session_id,

item_id: `approval-${uuidLike()}`,

data: {

type: 'approval_requested',

function_call_id: call.function_call_id,

tool_call_id: call.function_call_id,

function_id: call.function_id,

tool_name: call.function_id,

args: call.args,

},

});

if (outcome.kind === 'deny') {

const env = permissionsDenyEnvelope(

call.function_id,

outcome.rule_id,

outcome.matched_constraint,

call.args,

);

const reply = {

block: true,

reason: env.reason,

denial: env,

subscriber: SUBSCRIBER_NAME,

approval_gate: true,

};

try {

await streamSet(ctx.iii, {

stream_name: 'agent::events',

group_id: call.session_id,

item_id: `approval-${uuidLike()}`,

data: {

type: 'function_call_denied',

function_call_id: call.function_call_id,

tool_call_id: call.function_call_id,

function_id: call.function_id,

tool_name: call.function_id,

denial: env,

},

});

} catch (err) {

logger.warn('approval-gate: failed to emit function_call_denied event', {

err: String(err),

});

}

await writeHookReply(ctx.iii, call.reply_stream, call.event_id, reply);

return reply;

}

try {

await streamSet(ctx.iii, {

stream_name: 'agent::events',

group_id: call.session_id,

item_id: `approval-${uuidLike()}`,

data: {

type: 'approval_requested',

function_call_id: call.function_call_id,

tool_call_id: call.function_call_id,

function_id: call.function_id,

tool_name: call.function_id,

args: call.args,

},

});

} catch (err) {

logger.warn('approval-gate: failed to emit approval_requested event', {

err: String(err),

});

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness-node/src/approval-gate/gate-subscriber.ts` around lines 65 - 108, The current logic calls streamSet before writeHookReply so any streamSet error will prevent the gate response from being sent; to fix, wrap the streamSet calls in a try/catch in both the deny branch (where permissionsDenyEnvelope is used and reply is returned) and the pending branch (approval_requested), log or handle the streamSet error but always proceed to call writeHookReply with the constructed reply (and return it in the deny case); reference streamSet, writeHookReply, permissionsDenyEnvelope, SUBSCRIBER_NAME, uuidLike and ensure the deny branch still returns the reply after writeHookReply even if streamSet failed.

coderabbitai · 2026-05-18T21:21:58Z

+  await streamSet(iii, {
+    stream_name: 'agent::events',
+    group_id: session_id,
+    item_id: `approval-${uuidLike()}`,
+    data: {
+      type: 'approval_resolved',
+      function_call_id,
+      tool_call_id: function_call_id,
+      decision,
+      reason,
+    },
+  });


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle approval_resolved event-write failures without failing resolve.

handleResolve already committed the decision. If streamSet throws here, the API path fails after a successful write, which invites duplicate retries and misleading client errors.

🛠️ Suggested resilience patch

- await streamSet(iii, { - stream_name: 'agent::events', - group_id: session_id, - item_id: `approval-${uuidLike()}`, - data: { - type: 'approval_resolved', - function_call_id, - tool_call_id: function_call_id, - decision, - reason, - }, - }); + try { + await streamSet(iii, { + stream_name: 'agent::events', + group_id: session_id, + item_id: `approval-${uuidLike()}`, + data: { + type: 'approval_resolved', + function_call_id, + tool_call_id: function_call_id, + decision, + reason, + }, + }); + } catch (err) { + logger.warn('approval-gate: resolved state written but event emit failed', { + err: String(err), + session_id, + function_call_id, + }); + }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness-node/src/approval-gate/pending.ts` around lines 68 - 79, handleResolve currently calls streamSet to emit an 'approval_resolved' event (stream_name 'agent::events', item_id `approval-${uuidLike()}`) but doesn't handle failures, so a thrown error can make the API call fail even after the decision is committed; wrap the streamSet(...) call in a try/catch inside handleResolve, and on error do not rethrow—instead log the failure (using the module's existing logger or console.error) with contextual details (session_id, function_call_id, decision, and the caught error) so the resolve flow remains successful while still recording the event-write failure.

coderabbitai · 2026-05-18T21:21:58Z

+      (async () => {
+        try {
+          const envelope = buildPublishEnvelope(topic, event_id, inner);
+          await iii.trigger<unknown, unknown>({
+            function_id: 'iii::durable::publish',
+            payload: envelope,
+          });
+        } catch (err) {
+          logger.warn('hook-fanout::publish_collect: publish trigger failed', {
+            topic,
+            err: String(err),
+          });
+          publish_error = err instanceof Error ? err.message : String(err);
+        }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when iii::durable::publish fails.

The catch block only sets publish_error; the call still waits until deadline. That stalls requests unnecessarily even though publish already failed.

⚡ Suggested fix

(async () => { try { const envelope = buildPublishEnvelope(topic, event_id, inner); await iii.trigger<unknown, unknown>({ function_id: 'iii::durable::publish', payload: envelope, }); } catch (err) { logger.warn('hook-fanout::publish_collect: publish trigger failed', { topic, err: String(err), }); publish_error = err instanceof Error ? err.message : String(err); + settle( + collector, + collector.replies.length === 0 ? 'deadline_no_replies' : 'deadline', + ); } })();

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

(async () => {

try {

const envelope = buildPublishEnvelope(topic, event_id, inner);

await iii.trigger<unknown, unknown>({

function_id: 'iii::durable::publish',

payload: envelope,

});

} catch (err) {

logger.warn('hook-fanout::publish_collect: publish trigger failed', {

topic,

err: String(err),

});

publish_error = err instanceof Error ? err.message : String(err);

}

(async () => {

try {

const envelope = buildPublishEnvelope(topic, event_id, inner);

await iii.trigger<unknown, unknown>({

function_id: 'iii::durable::publish',

payload: envelope,

});

} catch (err) {

logger.warn('hook-fanout::publish_collect: publish trigger failed', {

topic,

err: String(err),

});

publish_error = err instanceof Error ? err.message : String(err);

settle(

collector,

collector.replies.length === 0 ? 'deadline_no_replies' : 'deadline',

);

}

})();

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness-node/src/hook-fanout/publish-collect.ts` around lines 188 - 201, The publish try/catch silently records publish_error but lets the IIFE continue waiting until the deadline; modify the catch in the async IIFE that calls buildPublishEnvelope and iii.trigger (function_id 'iii::durable::publish') so it fails fast: after logging and setting publish_error (and keeping the existing logger.warn and publish_error assignment), immediately rethrow the caught error (or otherwise reject/abort the surrounding wait) so the caller stops waiting instead of stalling until the deadline.

coderabbitai · 2026-05-18T21:21:58Z

+    return {
+      kind: 'deny',
+      result: denialResult(
+        gateUnavailableEnvelope(function_call.function_id, `trigger_failed: ${String(err)}`),
+      ),
+    };


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not classify target function execution errors as gate denials.

This path wraps arbitrary iii.trigger execution failures as gate_unavailable denials, which conflates runtime/tool errors with authorization failures and can mislead downstream handling.

Suggested fix

- return { - kind: 'deny', - result: denialResult( - gateUnavailableEnvelope(function_call.function_id, `trigger_failed: ${String(err)}`), - ), - }; + return { + kind: 'result', + result: errorResult({ + error: 'trigger_failed', + function: function_call.function_id, + message: String(err), + }), + };

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness-node/src/turn-orchestrator/agent-call.ts` around lines 152 - 157, The code is incorrectly classifying failures from iii.trigger as gate denials by returning kind: 'deny' with denialResult(gateUnavailableEnvelope(...)); change this path so arbitrary runtime/tool errors from iii.trigger are not mapped to authorization denials: replace the deny/denialResult/gateUnavailableEnvelope return with an error/failure variant (for example kind: 'error' or 'failure') and wrap the original error (String(err)) in an execution-specific envelope/result (e.g., executionErrorEnvelope or errorResult) that references function_call.function_id; ensure callers that handle denials versus execution errors can distinguish the new kind.

coderabbitai · 2026-05-18T21:21:58Z

+  session_id?: string,
 ): Promise<HookOutcome> {
-  const inner = { function_call, approval_required };
+  // PR #150: include session_id at the inner-payload root so approval-gate's
+  // extractCall can route it. Without this, the gate sees a null session_id,
+  // returns {block:false}, and the call passes through unapproved.
+  const inner = session_id
+    ? { session_id, function_call, approval_required }
+    : { function_call, approval_required };
  const payload = {


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Require session_id to preserve fail-closed approval behavior

Line 50 makes session_id optional, but Lines 52-54 explicitly document that missing session_id can produce block:false, which becomes allow at Line 88. That creates a bypass path if any caller omits it.

Suggested fix

export async function consultBefore( iii: ISdk, function_call: FunctionCall, approval_required: string[], - session_id?: string, + session_id: string, ): Promise<HookOutcome> { + if (!session_id) { + return { + kind: 'deny', + denial: gateUnavailableEnvelope( + function_call.function_id, + 'permission gate rejected request: missing session_id', + ), + }; + } - const inner = session_id - ? { session_id, function_call, approval_required } - : { function_call, approval_required }; + const inner = { session_id, function_call, approval_required };

Also applies to: 87-90

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness-node/src/turn-orchestrator/hook.ts` around lines 50 - 58, The handler currently marks session_id optional which allows a missing session_id to create an inner payload without it and accidentally bypass approval; change the function signature to require session_id (remove the optional ?), and add a runtime guard in the function that throws or returns a failing HookOutcome when session_id is falsy to enforce fail-closed behavior; ensure the inner payload construction (the inner variable that includes session_id, function_call, approval_required) always includes session_id and update any callsites if needed so extractCall/approval-gate always receive a non-null session_id.

coderabbitai · 2026-05-18T21:21:58Z

+    if (entry.pre_approved === true) {
+      const value = await iii.trigger<unknown, unknown>({
+        function_id: fc.function_id,
+        payload: fc.arguments ?? {},
+      });


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve augmented call context on resumed approvals.

Line 83 dispatches pre_approved calls with raw fc.arguments, but the regular path (Lines 104–126) augments args with session/call metadata. That makes resumed approved executions semantically different from first-pass executions.

Suggested fix

- if (entry.pre_approved === true) { - const value = await iii.trigger<unknown, unknown>({ - function_id: fc.function_id, - payload: fc.arguments ?? {}, - }); + if (entry.pre_approved === true) { + const augmentedPayload = + fc.arguments && typeof fc.arguments === 'object' && !Array.isArray(fc.arguments) + ? { ...(fc.arguments as Record<string, unknown>) } + : { arguments: fc.arguments }; + augmentedPayload.session_id = rec.session_id; + augmentedPayload.function_call_id = fc.id; + augmentedPayload.function_id = fc.function_id; + augmentedPayload.function_call = { + id: fc.id, + function_id: fc.function_id, + arguments: fc.arguments, + }; + const value = await iii.trigger<unknown, unknown>({ + function_id: fc.function_id, + payload: augmentedPayload, + });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness-node/src/turn-orchestrator/states/functions.ts` around lines 82 - 86, The resumed pre_approval branch dispatches iii.trigger using raw fc.arguments (entry.pre_approved path) which loses the session/call metadata added in the normal path; modify the entry.pre_approved === true branch to build the same augmented payload as the regular path (merge fc.arguments with the session/call context fields used in the regular flow—e.g., call/session IDs, caller info, metadata) and pass that merged object to iii.trigger with fc.function_id so resumed executions receive the same enriched context as first-pass executions.

coderabbitai · 2026-05-18T21:21:58Z

  await emit(iii, rec.session_id, { type: 'agent_end', messages });
  transitionTo(rec, 'stopped');
 }
+// reload 1779112003


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the reload marker comment.

The trailing // reload 1779112003 comment appears to be a development artifact with no functional purpose and should be removed before merge.

🧹 Proposed fix

-// reload 1779112003

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// reload 1779112003

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness-node/src/turn-orchestrator/states/tearing-down.ts` at line 27, Remove the development artifact comment "// reload 1779112003" from the tearing-down.ts file; locate the exact trailing comment text and delete that line (and any other identical "reload" marker comments if present) so the file contains no leftover dev reload markers before merging.

coderabbitai · 2026-05-18T21:21:58Z

+  it('does not publish when state write fails', async () => {
    const bus = new InMemoryStateBus();
-    const out = await awaitDecision(bus, 'approvals', 's1', 'tc-1', Date.now() + 1_000);
-    expect(out.kind).toBe('deny');
-    if (out.kind === 'deny') expect(out.reason).toBe('state_unavailable');
+    vi.spyOn(bus, 'set').mockRejectedValue(new Error('boom'));
+    const iii = { trigger: vi.fn() } as unknown as ISdk;
+    const out = await handleResolveWithEvents(iii, bus, 'approvals', {
+      session_id: 's1',
+      function_call_id: 'fc-1',
+      decision: 'allow',
+    });
+    expect((out as Record<string, unknown>).ok).toBe(false);
+    const fns = (iii.trigger as ReturnType<typeof vi.fn>).mock.calls.map(
+      (c) => (c[0] as { function_id: string }).function_id,
+    );
+    expect(fns).not.toContain('iii::durable::publish');


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Strengthen the failure-path side-effect assertion.

This assertion won’t catch regressions: handleResolveWithEvents never calls iii::durable::publish here, even on success. Assert no trigger calls (or specifically no stream::set) when bus.set fails.

✅ Suggested test fix

- const fns = (iii.trigger as ReturnType<typeof vi.fn>).mock.calls.map( - (c) => (c[0] as { function_id: string }).function_id, - ); - expect(fns).not.toContain('iii::durable::publish'); + const fns = (iii.trigger as ReturnType<typeof vi.fn>).mock.calls.map( + (c) => (c[0] as { function_id: string }).function_id, + ); + expect(fns).toHaveLength(0); + expect(fns).not.toContain('stream::set');

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

it('does not publish when state write fails', async () => {

const bus = new InMemoryStateBus();

const out = await awaitDecision(bus, 'approvals', 's1', 'tc-1', Date.now() + 1_000);

expect(out.kind).toBe('deny');

if (out.kind === 'deny') expect(out.reason).toBe('state_unavailable');

vi.spyOn(bus, 'set').mockRejectedValue(new Error('boom'));

const iii = { trigger: vi.fn() } as unknown as ISdk;

const out = await handleResolveWithEvents(iii, bus, 'approvals', {

session_id: 's1',

function_call_id: 'fc-1',

decision: 'allow',

});

expect((out as Record<string, unknown>).ok).toBe(false);

const fns = (iii.trigger as ReturnType<typeof vi.fn>).mock.calls.map(

(c) => (c[0] as { function_id: string }).function_id,

);

expect(fns).not.toContain('iii::durable::publish');

it('does not publish when state write fails', async () => {

const bus = new InMemoryStateBus();

vi.spyOn(bus, 'set').mockRejectedValue(new Error('boom'));

const iii = { trigger: vi.fn() } as unknown as ISdk;

const out = await handleResolveWithEvents(iii, bus, 'approvals', {

session_id: 's1',

function_call_id: 'fc-1',

decision: 'allow',

});

expect((out as Record<string, unknown>).ok).toBe(false);

const fns = (iii.trigger as ReturnType<typeof vi.fn>).mock.calls.map(

(c) => (c[0] as { function_id: string }).function_id,

);

expect(fns).toHaveLength(0);

expect(fns).not.toContain('stream::set');

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness-node/tests/approval-gate/pending.test.ts` around lines 96 - 109, The test currently only asserts that 'iii::durable::publish' wasn't called but that function is never invoked by handleResolveWithEvents here; update the failure-path assertion to ensure no side-effect triggers occur when InMemoryStateBus.set rejects: after mocking bus.set to throw, assert that iii.trigger was never called (or specifically assert that no call had function_id 'stream::set') — locate the InMemoryStateBus instance 'bus', the mocked 'set' call, and the SDK stub 'iii.trigger' in the test and replace the current fns check with an assertion on iii.trigger mock call count or absence of 'stream::set'.

ytallo added 4 commits May 18, 2026 17:43

sergiofilhowz approved these changes May 18, 2026

View reviewed changes

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

ytallo merged commit 0d41773 into main May 18, 2026
11 checks passed

ytallo mentioned this pull request May 19, 2026

feat(approval-gate): state-as-truth refactor — direct policy consult + reactive turn_state event + reload recovery #159

Merged

7 tasks

Conversation

ytallo commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Release Notes

Uh oh!

vercel Bot commented May 18, 2026

Uh oh!

github-actions Bot commented May 18, 2026

skill-check — worker

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ytallo commented May 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading