Skip to content

feat(harness-node): reactive approval resume + orchestrator cleanup#156

Merged
ytallo merged 4 commits into
mainfrom
feat/harness-node
May 18, 2026
Merged

feat(harness-node): reactive approval resume + orchestrator cleanup#156
ytallo merged 4 commits into
mainfrom
feat/harness-node

Conversation

@ytallo
Copy link
Copy Markdown
Contributor

@ytallo ytallo commented May 18, 2026

Summary

  • Approval gate becomes iii-native and fully reactive. policy::approval_gate replies synchronously with allow / deny / pending; the polling await-loop is gone. approval::resolve writes {decision, reason} to approvals/<sid>/<cid> and a state trigger on the approvals scope (approval::is_decision_writeapproval::on_decision_written) wakes the orchestrator by republishing turn::step_requested.
  • Orchestrator FSM grows a function_awaiting_approval state (now 11 states). Pending calls accumulate on TurnStateRecord.awaiting_approval; PreparedEntry.pre_approved lets resumed turns dispatch without re-asking the gate. Aborts during pause write per-call {decision: 'aborted'} records so the same trigger releases the parked step.
  • Three more poll loops swapped for reactive triggers:
    • run::start_and_wait → new turn::on_terminal_state state trigger + in-process waiter; drops sync_poll_interval_ms.
    • hook-fanout::publish_collect → stream trigger on agent::hook_reply routes replies into an in-process collector; drops poll_interval_ms.
    • Harness sessions fanout → state trigger on session/<id>/turn_state creates instead of a 1 Hz state::list diff loop.
    • Orchestrator abort path → reactive turn::on_abort_signal trigger.
  • hook-fanout::publish_collect response now carries publish.{ok, error} and publish_failed so consultBefore can fail closed on bus unavailability with a gate_unavailable denial. first_block_wins preserves the full blocking reply (status, subscriber, approval_gate markers).
  • Session entries gain an explicit timestamp and IiiStateSessionStore.loadEntries sorts by (timestamp, id) so resumed approval replies land in the right transcript position.
  • Console wiring: model picker is driven by the harness-node models-catalog (composite provider::id keys); the real backend implements resolveApproval so the chat UI can approve/deny in-line.
  • Architecture docs refreshed to track everything above (system diagram, 11-state FSM, approval sequence, per-worker pages).

Test plan

  • pnpm exec tsc --noEmit clean
  • pnpm exec vitest run — 289 tests across 49 files green, including 11 new on-terminal tests, awaiting-approval, abort, on-decision-written, and the approval-resume e2e
  • Manual smoke: trigger an approval-gated tool from the chat UI, approve from the console, confirm the turn resumes and emits approval_resolved + the function result
  • Manual smoke: deny the same call, confirm function_call_denied event and orchestrator continues with denial result
  • Manual smoke: abort a session paused on approvals, confirm pending calls resolve with aborted and the turn tears down

Summary by CodeRabbit

Release Notes

  • New Features

    • Approval workflow now processes synchronously with reactive decision resolution, eliminating polling delays.
    • Added model grouping by provider in the model picker dropdown.
    • Function call approvals display real-time status feedback (approving, denying, waiting for agent).
    • Last-used model is now persisted and restored across sessions.
  • Bug Fixes

    • Fixed streaming indicator accuracy across multi-turn conversations.
    • Deduplicates repeated function-call events with the same ID.
  • Refactor

    • Converted turn orchestrator and approval systems from polling to event-driven state management.

ytallo added 4 commits May 18, 2026 17:43
…152)

* feat(console): drive model picker from harness-node models-catalog

Replace the hardcoded `ModelId` union and four-entry `MODELS` constant
with composite `provider::<catalog_model_id>` keys sourced from the
harness-node models-catalog worker.

- `lib/catalog-model-key.ts`: parse/format `provider::id` (uses `::`
  separator constant `CATALOG_MODEL_KEY_SEP`).
- `lib/models-catalog.ts`: client wrapper around `models::list` /
  `models::get` for the picker.
- `hooks/use-model-picker-source.ts`: live-updating picker source with a
  static fallback for when the engine is down or the mock backend is in
  use.
- `types/chat.ts`: `ModelId` widened to `string`; `STATIC_MODEL_OPTIONS`
  + `DEFAULT_MODEL` replace the old enum-shaped `MODELS`.
- `components/chat/ModelPicker.tsx`: accept `options` + `loading` props;
  fall back to `{id, label}` from the current value if the catalog is
  empty so the picker is never blank.
- `components/chat/Composer.tsx`, `components/chat/ChatView.tsx`,
  `pages/Chat.tsx`, `hooks/use-conversations.ts`,
  `pages/Examples/sections/*`, `pages/Playground/index.tsx`: thread the
  catalog-shaped model ids through; legacy heuristic ids without `::`
  still get the old `claude*` / `gemini*` provider guess in
  `resolveRunParams`.
- `lib/backend/real.ts`: `resolveRunParams` parses `provider::id` and
  falls back to the heuristic for legacy ids; doc comment updated.

* refactor(real): remove debug instrumentation from session event handling

This commit removes the debug instrumentation code that was previously used to log session events to a local ingestion endpoint. The removal streamlines the event handling process in the `realStream` function, improving code clarity and maintainability.

* feat(approval-gate): port PR #150 with iii-native resume + console wiring

Port the fail-closed approval-gate redesign from the Rust workspace to
harness-node and finish wiring the console/web UI so chat sessions can
approve/deny intercepted tool calls without polling.

Backend:
- intercept-at-gate: subscriber returns block_reply with status
  pending|denied; resolved decisions are drained later via
  approval::consume; pending entries can be force-resolved via
  approval::sweep_session
- run::resume rebuilds terminal TurnStateRecord (preserving turn_count
  + max_turns) and publishes a fresh turn::step_requested
- handle_resolve emits approval_resolved + asks the orchestrator to
  resume via a single iii::durable::publish on turn::step_requested;
  no polling loop and no explicit timeout (the bus default is the
  right backstop)
- subscriber detects terminal records and rebuilds via
  build_resume_plan before stepping
- hook-fanout merge_first_block_wins preserves the full blocking reply
  (status, function_call_id, subscriber, approval_gate, denial)
- publish-collect emits publish.{ok,error} + publish_failed so the
  orchestrator can fail closed on bus unavailability
- handle_awaiting + tearing_down each call approval::consume first;
  on non-empty result they stage the resolved entries as prepared
  calls and transition (or resurrect) into FunctionExecute
- router::abort composes [state::set abort, approval::sweep_session]
- session-tree sorts entries by (timestamp, id) so replies always
  appear in causal order

Console/web:
- approve/deny buttons call approval::resolve via the iii client
- the ui::session::event stream stays subscribed for the chat
  lifetime (no auto-close on first agent_end) so multi-turn
  approvals continue to flow
- approval prompts auto-scroll into view; submitting state shows
  disabled buttons + visible feedback while waiting for the agent
  to resume
- fcall-start events deduplicate on functionCallId; pending_approval
  placeholders suppress their fcall-end until the real result lands

Tests: TDD'd helpers, intercept/consume/sweep handlers, resume
plumbing, the orchestrator integration points, and an end-to-end
approval-resume.e2e covering intercept -> resolve -> publish ->
subscriber wake -> rebuild -> consume -> execute.

* feat(sessions-poll): implement reactive sessions fanout with state triggers

Replaces the polling mechanism for session updates with a reactive approach using state triggers. The new implementation filters state events for session creation and directly pushes updates to subscribers without polling. This change enhances efficiency and reduces latency in session updates, streamlining the fanout process. Key components include condition and handler functions for session creation events, and a direct trigger registration for state changes.

* Simplify iii-native approval gate flow (#154)

* Make approval pause explicit in the turn state machine

The iii-native approval flow needs a non-terminal state where the orchestrator can wait without tearing down the turn or rebuilding from provisioning. This adds the state value and locks its non-terminal behavior in the state test.

Constraint: Plan task 1 targets harness-node TypeScript only; Rust workers remain untouched.

Rejected: Keep using stopped/resume as the pause marker | later tasks remove that detour.

Confidence: high

Scope-risk: narrow

Tested: pnpm test tests/turn-orchestrator/state.test.ts

Not-tested: Full harness-node suite deferred until the batch verification step

* Let the turn record own approval waits

The pause-in-place design needs the orchestrator record to carry the function calls that are waiting on human decisions. This adds the awaiting_approval entry shape without changing any behavior yet.

Constraint: Field remains optional so fresh and existing records continue to deserialize without migration.

Rejected: Store pending calls only in approval-gate state | the simplification moves ownership into the orchestrator FSM.

Confidence: high

Scope-risk: narrow

Tested: pnpm test tests/turn-orchestrator/state.test.ts

Tested: pnpm typecheck

Not-tested: Full harness-node suite deferred until the batch verification step

* Make prepared function calls extensible for approvals

The upcoming pause-in-place approval handler needs to persist per-call metadata without overloading tuple positions. This converts prepared calls to an object shape and adds the optional pre_approved flag while preserving the current execution behavior.

Constraint: Existing staged values may use function_call or legacy tool_call keys, so the loader keeps both forms.

Rejected: Add a parallel pre-approved lookup table | it would split one prepared-call concern across two state keys.

Confidence: medium

Scope-risk: moderate

Directive: Keep PreparedEntry as the single persisted shape for prepared-call execution metadata.

Tested: pnpm test tests/turn-orchestrator/persistence-prepared.test.ts tests/turn-orchestrator/functions.test.ts tests/turn-orchestrator/assistant.test.ts tests/turn-orchestrator/tearing-down.test.ts

Tested: pnpm test tests/turn-orchestrator

Tested: pnpm test

Tested: pnpm typecheck

Tested: pnpm lint

* Define dispatch outcomes before approval cutover

The approval pause cutover needs dispatchWithHook to distinguish successful execution, hard denial, and pending approval. This commit introduces the discriminated result type before changing runtime dispatch behavior.

Constraint: Task 4 is type-only; dispatchWithHook still returns FunctionResult until the later refactor task.

Rejected: Change dispatch behavior now | the plan keeps the behavioral cutover staged.

Confidence: high

Scope-risk: narrow

Tested: pnpm test tests/turn-orchestrator/agent-call.test.ts

Tested: pnpm typecheck

Not-tested: Full harness-node suite deferred until batch verification

* Resume approved calls from an explicit wait state

The new approval pause state needs a handler that stays idle until every awaited decision is present, then updates prepared calls with either pre_approved metadata or a denial result before returning to function execution.

Constraint: Decisions are read from the approvals state scope using the existing session/function-call key shape.

Rejected: Poll approval::consume from generic assistant or teardown states | the simplification makes awaiting approval an explicit FSM state.

Confidence: medium

Scope-risk: moderate

Directive: Keep missing decisions as a no-op so durable turn::step_requested wakeups remain idempotent.

Tested: pnpm test tests/turn-orchestrator/awaiting-approval.test.ts

Tested: pnpm test tests/turn-orchestrator

Tested: pnpm typecheck

Not-tested: Full harness-node suite deferred until batch verification

* Route approval waits through the turn transition switch

The FSM can now enter function_awaiting_approval, so the central step dispatcher must route that state to the new handler instead of silently doing nothing.

Constraint: This only wires the state; handleExecute has not yet been cut over to produce it.

Rejected: Depend on fallthrough/no-op behavior | the wait state needs explicit wake handling from turn::step_requested.

Confidence: high

Scope-risk: narrow

Tested: pnpm test tests/turn-orchestrator/state.test.ts

Tested: pnpm test tests/turn-orchestrator

Tested: pnpm typecheck

Not-tested: Full harness-node suite deferred until batch verification

* Normalize awaiting-approval files for Biome

The new wait-state handler passed behavior and type checks, but Biome required import ordering and line wrapping before the batch could be called clean.

Constraint: Formatting-only follow-up after Task 5 and Task 6 verification.

Confidence: high

Scope-risk: narrow

Tested: pnpm test

Tested: pnpm typecheck

Tested: pnpm lint

* Return structured outcomes from hooked dispatch

The approval cutover needs dispatchWithHook to report whether a call ran, was denied, or is waiting for human approval. This changes the dispatcher API to return DispatchResult and keeps the LLM-facing agent::call wrapper unwrapped as a FunctionResult.

Constraint: handleExecute still consumes the old return shape until the next cutover task.

Rejected: Keep pending as a placeholder FunctionResult | the new FSM state owns approval waiting explicitly.

Confidence: medium

Scope-risk: moderate

Directive: Callers must branch on DispatchResult.kind before treating the value as a FunctionResult.

Tested: pnpm test tests/turn-orchestrator/agent-call.test.ts tests/turn-orchestrator/functions.test.ts

Not-tested: pnpm typecheck fails until handleExecute is cut over in Task 9

* Pause function execution in the approval wait state

handleExecute now treats pending approval as a real FSM pause instead of manufacturing a placeholder function result. It also resumes pre-approved calls without re-entering the hook and emits denial results without dispatching blocked calls.

Constraint: Old placeholder helper tests are skipped until the later cleanup task removes that machinery.

Rejected: Keep pending approval as a terminating FunctionResult | that leaks an artificial result into the transcript and keeps the resume detour alive.

Confidence: medium

Scope-risk: moderate

Directive: Preserve the existing executed-call check before dispatching so retrying a step does not duplicate approved function execution.

Tested: pnpm test tests/turn-orchestrator/functions.test.ts

Tested: pnpm test tests/turn-orchestrator

Tested: pnpm typecheck

Tested: pnpm test tests/integration/approval-resume.e2e.test.ts

Tested: pnpm test

Tested: pnpm lint

* Keep approval gate pending state in the orchestrator

The gate no longer writes a pending state-bus record when policy asks for human approval. It emits approval_requested and returns a pending hook reply; the turn orchestrator owns the pending-call list on its state record.

Constraint: The approvals state bus now stores resolved decisions only.

Rejected: Keep handleIntercept in the gate path | it preserves the obsolete pending-record lifecycle and duplicates orchestrator state.

Confidence: medium

Scope-risk: moderate

Directive: Do not add state-bus pending writes back to policy::approval_gate; pending ownership belongs to TurnStateRecord.awaiting_approval.

Tested: pnpm test tests/approval-gate/gate-subscriber.test.ts tests/approval-gate/events.test.ts

Tested: pnpm typecheck

* Drop approval request expiry from event types

Approval prompts no longer time out through the gate, so the shared event types should not promise an expires_at timestamp. The console typecheck also needed the currently-unused mode parameter marked intentionally unused.

Constraint: No timeout sweep behavior remains in the target approval-request event contract.

Rejected: Keep expires_at as optional | it would preserve dead UI expectations around automatic expiry.

Confidence: high

Scope-risk: narrow

Tested: pnpm typecheck in harness-node

Tested: pnpm typecheck in console/web

* Remove obsolete approval intercept lifecycle

The gate-subscriber now handles needs_approval directly and the orchestrator owns pending approval state, so the old intercept helper and its state-writing tests are dead. The old approval-resume integration is paused until the new pause-in-place integration lands later in the plan.

Constraint: Task 26 is responsible for replacing the legacy integration coverage.

Rejected: Keep intercept.ts as compatibility shim | no production caller remains and it would preserve obsolete pending-record semantics.

Confidence: medium

Scope-risk: moderate

Directive: Do not reintroduce approval-gate/intercept; route policy::approval_gate pending replies through gate-subscriber directly.

Tested: pnpm typecheck in harness-node

Tested: pnpm test tests/approval-gate tests/integration/approval-resume.e2e.test.ts

* Normalize console backend imports

Biome sorted the touched console backend imports after the approval event type cleanup. This keeps the batch verification clean without changing runtime behavior.

Constraint: Formatting-only follow-up from console/web verification.

Confidence: high

Scope-risk: narrow

Tested: pnpm typecheck in console/web

Tested: pnpm exec biome check src/lib/backend/real.ts src/types/iii-agent-event.ts in console/web

* Let approval resolution overwrite the wait record

The orchestrator now pauses in a native awaiting-approval state and reads the final decision directly from the state bus, so resolve no longer needs to validate or mutate a pending lifecycle record. It writes the compact decision payload and wakes the turn subscriber once the write succeeds.

Constraint: Keep approval::list_pending export until its registration is removed in the later registry cleanup task.

Rejected: Preserve pending/status/resolved_at transitions | the paused turn only needs decision and reason, and stale lifecycle fields recreate the old consume path.

Confidence: high

Scope-risk: narrow

Directive: Do not reintroduce preexisting pending-record checks here; the paused turn owns waiting semantics.

Tested: pnpm --filter harness-node test tests/approval-gate/pending.test.ts

* Retire wake-failed approval events

Resolve now writes a decision and publishes the turn wake directly. There is no retry wrapper that converts wake errors into stream events, so keeping approval_wake_failed advertised a lifecycle branch that no longer exists.

Constraint: Preserve approval_requested and approval_resolved event envelopes unchanged for existing consumers.

Rejected: Keep an unused event helper for compatibility | no caller remains and the event would only describe obsolete resume failure handling.

Confidence: high

Scope-risk: narrow

Directive: Add a new lifecycle event only when a real runtime branch emits it.

Tested: rg -n "emitApprovalWakeFailed|approval_wake_failed" harness-node/src harness-node/tests

Tested: pnpm --filter harness-node typecheck && pnpm --filter harness-node test tests/approval-gate

* Remove approval placeholder resurrection helpers

Approval waits now stay inside the function FSM, so the old placeholder, consume, and resurrection helpers no longer describe the runtime path. Finalize appends real function results directly, awaiting-assistant no longer polls approval::consume, and teardown no longer resurrects terminal turns.

Constraint: Keep the paused-turn approval path in function_awaiting_approval; resolved decisions are read from state during function execution resumption.

Rejected: Leave consume helpers until a later pass | deleting the helpers without their callers would break typecheck, and keeping them would preserve dead behavior.

Confidence: high

Scope-risk: moderate

Directive: Do not add transcript placeholder replacement back unless function_awaiting_approval is removed or the transcript again stores synthetic pending results.

Tested: pnpm --filter harness-node test tests/turn-orchestrator

Tested: pnpm --filter harness-node typecheck

Tested: pnpm --filter harness-node lint

* Drop obsolete approval list and consume APIs

Approval decisions are now read directly by the paused function state, so the separate list_pending and consume endpoints are dead runtime surface. Removing them keeps the approval gate contract to resolve, sweep-on-abort, and the policy subscriber.

Constraint: Keep approval::sweep_session registered until abort cleanup is converted to inline state cleanup.

Rejected: Remove sweep in the same pass | router::abort still calls approval::sweep_session and that cleanup belongs to the abort task.

Confidence: high

Scope-risk: narrow

Directive: Do not re-add list_pending or consume for resume; function_awaiting_approval is the resume boundary.

Tested: pnpm --filter harness-node test tests/approval-gate tests/harness/policy.test.ts tests/turn-orchestrator/assistant.test.ts tests/turn-orchestrator/tearing-down.test.ts

Tested: pnpm --filter harness-node lint && pnpm --filter harness-node typecheck && pnpm --filter harness-node test

* Wake turns from approval state writes

Approval decisions are now intended to be reactive: the state write is the event that wakes the orchestrator. This adds the adapter that turns an approvals-scope state trigger into the existing turn::step_requested durable publish.

Constraint: State trigger payload keys use the approval key shape <session_id>/<function_call_id>.

Rejected: Debounce approval decision writes | repeated wakes are harmless and debouncing would add state to a pure adapter.

Confidence: high

Scope-risk: narrow

Directive: Keep this adapter thin; orchestration policy belongs in function_awaiting_approval.

Tested: pnpm --filter harness-node test tests/approval-gate/on-decision-written.test.ts

* Register approval decision state trigger

The approval gate now exposes the on-decision adapter as a state trigger on the approvals scope. Any write to an approval decision key can wake the turn orchestrator without the writer knowing about turn::step_requested.

Constraint: Register with scope only and no key filter so every approvals/<session>/<call> decision write is observed.

Rejected: Keep wake-up only in approval::resolve | abort and future override writers would still need duplicated publish logic.

Confidence: high

Scope-risk: narrow

Directive: Keep the trigger scope tied to loadApprovalGateConfig so custom approval scopes keep working.

Tested: pnpm --filter harness-node typecheck

Tested: pnpm --filter harness-node test tests/approval-gate

* Let approval resolve rely on reactive wakeups

The approvals state trigger now owns turn wake-up, so approval::resolve should only persist the decision and emit approval_resolved. Removing resumeSession eliminates the duplicated explicit durable publish from the resolve path.

Constraint: approval::on_decision_written is registered on the approvals scope and publishes turn::step_requested after the state write.

Rejected: Keep resolve publishing as a fallback | it would double-wake sessions once the state trigger is active and blur responsibility for the wake-up.

Confidence: high

Scope-risk: narrow

Directive: Writers to the approvals scope should not publish turn::step_requested directly; the state trigger is the boundary.

Tested: pnpm --filter harness-node test tests/approval-gate/pending.test.ts

Tested: pnpm --filter harness-node typecheck && pnpm --filter harness-node test tests/approval-gate && pnpm --filter harness-node lint

* Abort approvals through state decisions

The abort path now writes aborted approval decisions directly for turns paused in function_awaiting_approval. Those writes are enough to wake the orchestrator through the approvals state trigger, so abort side effects no longer call sweep_session or publish turn::step_requested.

Constraint: This branch did not yet have performAbortSideEffects, so this applies the missing inline cleanup and the reactive no-publish behavior together.

Rejected: Keep approval::sweep_session in abort side effects | it preserves the old lifecycle endpoint and bypasses the decision-write trigger boundary.

Confidence: high

Scope-risk: moderate

Directive: Abort approval cleanup should write approvals/<session>/<call> decisions and let the state trigger wake the FSM.

Tested: pnpm --filter harness-node test tests/turn-orchestrator/abort.test.ts

Tested: pnpm --filter harness-node typecheck && pnpm --filter harness-node test tests/turn-orchestrator/abort.test.ts && pnpm --filter harness-node lint

* Cover reactive approval wake integration

The integration fixture now simulates the engine firing a state trigger after approvals-scope writes. A direct decision write proves the adapter publishes turn::step_requested without approval::resolve doing it explicitly.

Constraint: The fake state trigger runs on a microtask to match the engine's async trigger behavior closely enough for this unit-level integration test.

Rejected: Restore the skipped legacy approval-resume contract | it tested the old explicit resume path rather than the reactive state-trigger boundary.

Confidence: high

Scope-risk: narrow

Directive: Approval resume integration should model approvals state writes as the wake source.

Tested: pnpm --filter harness-node test tests/integration/approval-resume.e2e.test.ts

Tested: pnpm --filter harness-node typecheck && pnpm --filter harness-node test tests/integration/approval-resume.e2e.test.ts && pnpm --filter harness-node lint

* Remove terminal resume path from turn stepping

Approval waits now pause inside function_awaiting_approval and wake through approvals state writes. Terminal-session resurrection via run::resume and buildResumePlan is dead behavior, so turn::step now treats terminal records as terminal and no longer republishes while waiting for approval.

Constraint: Keep publishStep for the normal FSM pump after non-terminal transitions.

Rejected: Preserve run::resume for compatibility | no active approval path calls it and it would revive the old stopped-session model.

Confidence: high

Scope-risk: moderate

Directive: Do not reintroduce terminal resurrection unless approval waiting leaves the FSM again.

Tested: pnpm --filter harness-node test tests/turn-orchestrator

Tested: pnpm --filter harness-node typecheck

Tested: pnpm --filter harness-node lint

* Drop approval sweep and timeout remnants

The gate no longer owns pending-record lifecycles or timeouts. Abort writes aborted decisions inline and approval decisions wake through the state trigger, so approval::sweep_session, buildPendingRecord, and approval_gate.default_timeout_ms are dead surface.

Constraint: Leave unrelated hook-fanout and turn-orchestrator timeout config intact.

Rejected: Keep sweep as an internal fallback | no caller remains and it depends on the deleted pending record shape.

Confidence: high

Scope-risk: narrow

Directive: Approval cleanup should write decision records directly; do not restore pending/resolved sweep state.

Tested: pnpm --filter harness-node test tests/approval-gate tests/turn-orchestrator/abort.test.ts

Tested: pnpm --filter harness-node typecheck

Tested: pnpm --filter harness-node lint

* Remove stale approval placeholder branches

The approval wait path no longer emits synthetic pending function results, so console translation does not need to suppress pending_approval function_execution_end events. The hook comment now describes the actual pending dispatch result instead of the removed placeholder flow.

Constraint: Preserve approval_requested and approval_resolved UI event translation.

Rejected: Keep placeholder suppression defensively | it preserves a deleted wire shape and hides unexpected function results.

Confidence: high

Scope-risk: narrow

Directive: Pending approval UI should be driven by approval_requested, not synthetic function_result placeholders.

Tested: pnpm --filter chat-app typecheck

Tested: pnpm --filter chat-app exec biome check src/lib/backend/translate.ts

Tested: pnpm --filter harness-node test tests/turn-orchestrator/agent-call.test.ts && pnpm --filter harness-node lint

* Clean console approval lint drift

Repo-wide lint surfaced small console issues around approval resolve handlers and import ordering. Removing non-null assertions and stale suppressions keeps the cleanup branch lintable without changing approval UI behavior.

Constraint: Keep existing approval resolve callback contract unchanged.

Rejected: Leave console lint failures as unrelated | the approval callback warnings are on this feature surface and are cheap to make safe.

Confidence: high

Scope-risk: narrow

Directive: Keep approval IDs narrowed before invoking resolve callbacks instead of using non-null assertions.

Tested: pnpm --filter chat-app lint && pnpm --filter chat-app typecheck

Tested: pnpm -r lint

* fix(approval-gate): match engine state:* event_type values

* test(approval-gate): use engine state:* event_type values

* test(integration): fake-iii emits engine-correct event_type for state writes

* feat(approval-gate): filter approvals state trigger with condition function

Add isDecisionWrite/CONDITION_FN_ID to on-decision-written.ts and wire
condition_function_id into the state trigger config in register.ts so the
iii engine skips the handler on state:deleted and writes lacking a decision
field, without touching the in-handler guard (removed in Task 5).

* refactor(approval-gate): handler relies on engine-side condition for filtering

* feat(turn-orchestrator): adapter + condition for reactive abort wake

* feat(turn-orchestrator): register reactive abort_signal state trigger

* test(integration): abort_signal write triggers reactive wake

* chore: final cleanup after reactive trigger follow-up

Fix biome import-order lint: move `import type { ISdk }` before the
value imports in on-abort-signal.test.ts.

* Remove obsolete approval event handling and tests

This commit deletes the `events.ts` file, which contained the `emitApprovalRequested` and `emitApprovalResolved` functions, as their functionality has been integrated directly into the `gate-subscriber` and `pending` modules. Corresponding test cases in `events.test.ts` have also been removed to reflect this change. The new implementation uses `streamSet` directly for emitting approval events, streamlining the approval gate flow.
Sync the harness-node docs introduced on main with this branch's
pause-and-resume refactor: 11-state FSM with function_awaiting_approval,
state-trigger wakes on the approvals and agent scopes, hook-fanout's
stream-trigger reply path, and the harness sessions fanout moving from a
1 Hz state::list diff to a state trigger.

Also drop poll_interval_ms from PublishCollectConfig — it's leftover from
the old polling implementation and unused in the reactive path.
…ate trigger

executeSync now installs an in-process waiter keyed by session_id, kicks
the run, and races the waiter against sync_default_timeout_ms instead of
polling persistence every sync_poll_interval_ms. The waiter is resolved
by a new `state` trigger on scope=agent gated by
turn::is_terminal_state_write, which fires the moment the FSM lands
`session/<sid>/turn_state` with state=stopped.

Drops sync_poll_interval_ms from the config type, default loader, and
config.yaml — the only consumer is gone.
- Add realResolveApproval calling approval::resolve over the iii bus and
  expose it via ChatBackend.resolveApproval so the chat UI can approve or
  deny pending function calls.
- Thread sessionId into translateAgentEvent so approval-related stream
  events carry their session context.
- Drop the stale "Phase 3 adds approve/deny buttons" comment now that the
  flow is live end-to-end.
@vercel
Copy link
Copy Markdown

vercel Bot commented May 18, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
workers Ready Ready Preview, Comment May 18, 2026 9:13pm

Request Review

@github-actions
Copy link
Copy Markdown
Contributor

skill-check — worker

0 verified, 10 skipped (no docs/).

Layer Result
structure
vale
ai

Three for three. Nicely done.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

📝 Walkthrough

Walkthrough

This PR implements a reactive, event-driven approval flow for the turn orchestrator, replacing polling-based decision waiting with state-triggered resumption. The approval gate now returns synchronously with allow/deny/pending outcomes; pending calls are parked in a new function_awaiting_approval turn state; decisions written to durable state trigger turn::step_requested publications to resume parked turns; and the chat UI gains interactive approval resolution with async callback support and feedback.

Changes

Reactive approval flow

Layer / File(s) Summary
Approval data contracts
console/web/src/lib/backend/types.ts, console/web/src/types/chat.ts, console/web/src/types/iii-agent-event.ts, harness-node/src/turn-orchestrator/state.ts, harness-node/src/types/agent-event.ts
fcall-start events now carry optional functionCallId and sessionId for approval resolution pairing. AgentEvent's approval_requested variant removes expires_at (synchronous gate behavior). TurnStateRecord adds optional awaiting_approval array of AwaitingApprovalEntry to track parked function calls.
Approval gate synchronous responses
harness-node/src/approval-gate/gate-subscriber.ts, harness-node/src/approval-gate/types.ts, harness-node/src/approval-gate/config.ts
Gate subscriber now returns allow/deny/pending immediately without polling or state-backed pending records. Pending path emits approval_requested event; deny path emits function_call_denied. blockReplyFor helper standardizes gate response envelope with subscriber and approval_gate flags. Config removes default_timeout_ms.
Decision-written trigger and event emission
harness-node/src/approval-gate/on-decision-written.ts, harness-node/src/approval-gate/pending.ts, harness-node/src/approval-gate/register.ts
New on-decision-written adapter detects decision writes to approvals state scope and publishes turn::step_requested to resume orchestrator. handleResolveWithEvents persists decision and emits approval_resolved event to agent::events. Registration wires decision-written trigger on approval_state_scope.
Function dispatch result types and hook routing
harness-node/src/turn-orchestrator/agent-call.ts, harness-node/src/turn-orchestrator/hook.ts
DispatchResult union models three outcomes: result, deny, pending. dispatchWithHook now accepts optional session_id and returns pending when gate indicates needs_approval. consultBefore hook routes session_id into fanout payload to enable decision correlation.
Prepared entries and function execution
harness-node/src/turn-orchestrator/persistence.ts, harness-node/src/turn-orchestrator/states/functions.ts
PreparedEntry type adds optional pre_approved flag for short-circuit execution. handlePrepare builds entry objects with blocked: null (replacing tuples). handleExecute introduces early bypass paths: pre_approved entries execute directly via iii.trigger; blocked entries persist as errors. Pending outcomes enqueue calls into rec.awaiting_approval and transition to function_awaiting_approval.
Awaiting-approval state handler
harness-node/src/turn-orchestrator/states/functions.ts, harness-node/src/turn-orchestrator/transitions.ts
New handleAwaitingApproval loads awaiting entries, reads approval decisions in parallel from persisted state, updates prepared calls with either pre_approved: true or blocked: <denial>, persists updates, clears awaiting_approval, and transitions back to function_execute. Router includes handler in function_awaiting_approval case.
Terminal waiter and abort signal adapters
harness-node/src/turn-orchestrator/on-terminal.ts, harness-node/src/turn-orchestrator/on-abort-signal.ts, harness-node/src/turn-orchestrator/abort.ts, harness-node/src/turn-orchestrator/run-start.ts, harness-node/src/turn-orchestrator/register.ts
Terminal waiter mechanism replaces polling: installTerminalWaiter registers promise resolver, isTerminalStateWrite filters for session/<id>/turn_state writes with state='stopped', handleTerminalStateWrite resolves waiter. Abort adapter detects abort signal writes and publishes turn::step_requested. performAbortSideEffects writes aborted decisions for pending approvals. run-start.ts installs terminal waiter and races timeout.
Chat approval resolution UI
console/web/src/components/chat/FunctionCallMessage.tsx, console/web/src/components/chat/Message.tsx, console/web/src/components/chat/MessageList.tsx, console/web/src/components/chat/ChatView.tsx, console/web/src/lib/backend/real.ts, console/web/src/lib/backend/translate.ts
FunctionCallMessage adds async runResolve helper with submitting and submitError state; buttons show "approving…/denying…" during in-flight and render error/waiting messages. Message derives onApprove/onDeny from onResolveApproval prop. MessageList forwards callback and auto-scrolls newest pending approval. ChatView wires backend.resolveApproval. realResolveApproval calls approval::resolve backend method. translateAgentEvent threads sessionId into fcall-start payloads.
Hook-fanout event-driven collection
harness-node/src/hook-fanout/publish-collect.ts, harness-node/src/hook-fanout/types.ts, harness-node/src/hook-fanout/merge.ts
Shifts from polling stream::list loop to in-process pending map keyed by event_id. handleStreamReply stream trigger routes incoming replies to matching Collector. Response shape includes publish status object and publish_failed flag. mergeFirstBlockWins now preserves full blocking reply verbatim (including status, subscriber, approval_gate markers).
Session creation fanout event-driven
harness-node/src/harness/fanout/sessions-poll.ts
Replaces 1Hz state-list polling diff loop with state-triggered condition/handler on state:created events matching session/<id>/turn_state keys. Extracts newly-created session_id and fans out to subscribers without storing polling state.
Session entry sorting and timestamp helper
harness-node/src/session/tree/types.ts, harness-node/src/session/tree/store.ts
Adds entryTimestamp(e) helper to return SessionEntry.timestamp field. loadEntries now sorts by (timestamp, id) instead of id alone for correct transcript ordering on resumed replies.
Select component grouped optgroups
console/web/src/components/ui/Select.tsx
Introduces SelectGroup<T> type and optional groups prop; options becomes optional. Conditional rendering: if groups present, renders <optgroup> + nested <option>s; otherwise maps options.
Model selection persistence
console/web/src/lib/storage.ts, console/web/src/hooks/use-conversations.ts, console/web/src/components/chat/ModelPicker.tsx, console/web/src/pages/Chat.tsx
New loadLastModel/saveLastModel helpers persist selected model to localStorage. useConversations accepts optional catalogReady flag and initializes conversations with persisted last model or fallback. setModel persists selection. Chat passes !catalogLoading to gate model migration. ModelPicker groups options by provider via SelectGroup.
Docs and configuration
harness-node/config.yaml, harness-node/README.md, harness-node/docs/architecture.md, harness-node/docs/workers/*.md, iii-permissions.yaml
Architecture docs updated to reflect reactive approval flow (synchronous gate, state-trigger resumption, no polling). Worker docs describe new trigger registration, decision-written wakeup, terminal waiter. Config removes default_timeout_ms and polling intervals. Permissions removes approval::list_pending default grant. Turn FSM updated to 10-state with explicit function_awaiting_approval.
Test coverage
harness-node/tests/approval-gate/*.test.ts, harness-node/tests/hook-fanout/*.test.ts, harness-node/tests/turn-orchestrator/*.test.ts, harness-node/tests/session/tree/*.test.ts, harness-node/tests/integration/approval-resume.e2e.test.ts, harness-node/tests/harness/policy.test.ts
Added/updated 30+ test files covering gate-subscriber behavior, decision-written trigger, pending resolve, awaiting-approval state handler, terminal waiter, abort signal, stream-reply routing, session-entry sorting, and end-to-end approval resumption flow. Tests validate reactive behavior, event emission, state transitions, and integration scenarios.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

This is a large, multi-layer architectural refactor spanning both web console and harness orchestration: the approval flow shifts from polling to event-driven state triggers; turn orchestration gains a new function_awaiting_approval state with reactive resumption; hook-fanout and session polling convert from scheduled loops to stream/state triggers; chat UI gains approval resolution with async callbacks and streaming feedback; and configuration, permissions, and documentation are updated throughout. The changes are tightly coupled across multiple domains (types, data structures, event flow, state machines, UI, persistence, triggers) and introduce new async patterns and reactive trigger coordination that require careful review for correctness, concurrency safety, and test coverage completeness.

Possibly related PRs

  • iii-hq/workers#151: Both PRs modify console/web/src/components/chat/ChatView.tsx's streaming event handling around fcall-start and assistant-end (one adds function-call dedup/approval mapping, the other adds defensive boundary/stream reset behavior), so the changes are directly related at the same event-handling code level.
  • iii-hq/workers#106: Both PRs are related because they both shift the approval/resolve flow to function-call identifiers (function_call_id instead of tool-call ids) and update the corresponding UI/backend wiring to pair approvals with the correct call.

Suggested reviewers

  • sergiofilhowz

🐰 A harness turns from polls to events,
Approvals park and wait with debts,
State triggers wake the sleeping turn,
Chat buttons show decisions burn—
Reactive flow: the future's bet!

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/harness-node

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (6)
harness-node/src/approval-gate/state-bus.ts (1)

43-49: 💤 Low value

Consider simplifying the response shape detection.

The nested conditional with multiple type assertions works but could be more readable.

♻️ Alternative implementation
-      // The engine's state::list returns the values directly as an array.
-      // Earlier harness versions returned `{items: [...]}` so accept both.
-      const items = Array.isArray(resp)
-        ? resp
-        : Array.isArray((resp as { items?: unknown[] } | null)?.items)
-          ? ((resp as { items: unknown[] }).items as unknown[])
-          : [];
+      // The engine's state::list returns the values directly as an array.
+      // Earlier harness versions returned `{items: [...]}` so accept both.
+      let items: unknown[];
+      if (Array.isArray(resp)) {
+        items = resp;
+      } else if (resp && typeof resp === 'object' && 'items' in resp) {
+        const maybeItems = (resp as Record<string, unknown>).items;
+        items = Array.isArray(maybeItems) ? maybeItems : [];
+      } else {
+        items = [];
+      }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/approval-gate/state-bus.ts` around lines 43 - 49, The nested
conditional creating items from resp is verbose; simplify the shape detection by
checking Array.isArray(resp) first, then Array.isArray((resp as any)?.items) and
falling back to [] so you can remove the repeated type assertions and
nesting—update the logic that assigns items (the const items = ... block) to use
a single ternary or short if/else that returns resp when it's an array,
resp.items when that is an array, or [] otherwise, referencing resp and items in
state-bus.ts.
harness-node/src/turn-orchestrator/states/assistant.ts (1)

312-312: ⚡ Quick win

Remove development artifact comment.

The comment // reload 1779112003 appears to be a development or debugging marker with no documentation value.

🧹 Proposed cleanup
-// reload 1779112003
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/turn-orchestrator/states/assistant.ts` at line 312, Remove
the development artifact comment "// reload 1779112003" from assistant.ts;
locate the exact comment string in the file and delete it, ensuring no
surrounding code relies on it (it's a pure comment), then run linters/tests to
confirm no trailing whitespace or formatting issues remain after removal.
harness-node/tests/hook-fanout/merge.test.ts (1)

25-44: ⚡ Quick win

Consider verifying full object preservation.

The test asserts that individual known fields are preserved but doesn't verify that the merged result is deeply equal to the original blocking reply. If the blocking reply contains additional fields beyond the six checked, they wouldn't be validated.

🧪 Suggested enhancement for "verbatim" claim
   const merged = mergeFirstBlockWins(replies);
-  expect(merged.block).toBe(true);
-  expect(merged.status).toBe('pending');
-  expect(merged.function_call_id).toBe('call-1');
-  expect(merged.function_id).toBe('shell::exec');
-  expect(merged.subscriber).toBe('approval-gate');
-  expect(merged.approval_gate).toBe(true);
+  // Verify full object is preserved verbatim
+  expect(merged).toEqual(replies[1]);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/tests/hook-fanout/merge.test.ts` around lines 25 - 44, The test
for mergeFirstBlockWins currently checks only individual fields; update it to
assert the merged result is deeply equal to the original blocking reply to
guarantee full object preservation (including any extra fields). Locate the test
case using mergeFirstBlockWins and replace or augment the individual expects
with a single deep-equality assertion (e.g.,
expect(merged).toStrictEqual(replies[1]) or equivalent) so the merged object is
verified verbatim against the blocking reply.
harness-node/src/approval-gate/on-decision-written.ts (1)

33-54: 💤 Low value

Consider validating the extracted session_id.

The function extracts session_id from the key by taking the substring before the first slash, but doesn't validate that the extracted value is non-empty or well-formed. While the slash <= 0 check prevents keys starting with /, a key like "a/" would produce session_id = "a" without further validation.

🛡️ Suggested validation
   const session_id = key.slice(0, slash);
+  if (session_id.length === 0) return;

   try {
     await iii.trigger<unknown, unknown>({
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/approval-gate/on-decision-written.ts` around lines 33 - 54,
The extracted session_id in handleDecisionWritten is not validated beyond
slicing before the first slash; after computing session_id = key.slice(0, slash)
add a validation step to ensure session_id is non-empty and well-formed (e.g.,
not just whitespace and matching your expected pattern such as
/^[A-Za-z0-9_-]+$/ or UUID format if applicable). If validation fails, bail out
early (return) and optionally emit a debug/warn via logger.warn indicating an
invalid session_id with the original key; keep the existing iii.trigger call and
error handling unchanged when session_id is valid.
console/web/src/lib/backend/types.ts (1)

20-29: 💤 Low value

Clarify the semantic distinction between pendingApproval and the new approval resolution fields.

The pendingApproval boolean and the new functionCallId + sessionId fields work together but aren't redundant:

  • pendingApproval is a UI state flag indicating the message awaits user approval (pending → running → done lifecycle).
  • functionCallId + sessionId are identifiers needed to resolve the approval when the user clicks approve/deny.

These fields appear together in approval_requested events, but functionCallId + sessionId also appear independently in regular function_execution_start events (without pendingApproval). The boolean flag cannot be reliably inferred from the presence of IDs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@console/web/src/lib/backend/types.ts` around lines 20 - 29, Clarify in the
type definition for the 'fcall-start' message that pendingApproval is a UI
lifecycle flag and not derivable from the presence of IDs: update the inline
comments for pendingApproval, functionCallId, and sessionId in the 'fcall-start'
union branch to state that pendingApproval indicates the message is awaiting
user approval (pending → running → done), while functionCallId and sessionId are
identifiers used to resolve an approval action (approve/deny) and may appear
independently in other events such as function_execution_start; ensure the
comments mention that the boolean cannot be inferred from the presence of the
IDs and that approval resolution code should use functionCallId+sessionId when
handling user actions.
harness-node/tests/integration/approval-resume.e2e.test.ts (1)

22-22: 💤 Low value

Prefer strict equality (===) over loose equality (==).

Lines 22 and 33 use old_value == null. Since line 17 explicitly coerces undefined to null with ?? null, old_value will always be either the stored value or null. Using strict equality (old_value === null) is more conventional and makes the intent clearer.

♻️ Proposed fix
-          event_type: old_value == null ? 'state:created' : 'state:updated',
+          event_type: old_value === null ? 'state:created' : 'state:updated',

Apply the same change at line 33.

Also applies to: 33-33

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/tests/integration/approval-resume.e2e.test.ts` at line 22,
Change the loose null check `old_value == null` to a strict comparison
`old_value === null` wherever it appears (specifically the ternary that sets
`event_type` and the similar check later), since `old_value` is already
normalized to either a value or null; update both instances to use `===` to make
the intent explicit and avoid loose equality.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@harness-node/docs/workers/turn-orchestrator.md`:
- Line 56: Update the FSM docs row for the function_finalize state to use the
exact emitted event name: replace the incorrect `function_call_end` with
`function_execution_end` so the table reads that `function_finalize` persists
results and emits `function_execution_end` + `turn_end`; reference the
`function_finalize` state and the `function_execution_end` event to keep the
contract consistent with the orchestrator implementation.

In `@harness-node/src/approval-gate/gate-subscriber.ts`:
- Around line 65-108: The current logic calls streamSet before writeHookReply so
any streamSet error will prevent the gate response from being sent; to fix, wrap
the streamSet calls in a try/catch in both the deny branch (where
permissionsDenyEnvelope is used and reply is returned) and the pending branch
(approval_requested), log or handle the streamSet error but always proceed to
call writeHookReply with the constructed reply (and return it in the deny case);
reference streamSet, writeHookReply, permissionsDenyEnvelope, SUBSCRIBER_NAME,
uuidLike and ensure the deny branch still returns the reply after writeHookReply
even if streamSet failed.

In `@harness-node/src/approval-gate/pending.ts`:
- Around line 68-79: handleResolve currently calls streamSet to emit an
'approval_resolved' event (stream_name 'agent::events', item_id
`approval-${uuidLike()}`) but doesn't handle failures, so a thrown error can
make the API call fail even after the decision is committed; wrap the
streamSet(...) call in a try/catch inside handleResolve, and on error do not
rethrow—instead log the failure (using the module's existing logger or
console.error) with contextual details (session_id, function_call_id, decision,
and the caught error) so the resolve flow remains successful while still
recording the event-write failure.

In `@harness-node/src/hook-fanout/publish-collect.ts`:
- Around line 188-201: The publish try/catch silently records publish_error but
lets the IIFE continue waiting until the deadline; modify the catch in the async
IIFE that calls buildPublishEnvelope and iii.trigger (function_id
'iii::durable::publish') so it fails fast: after logging and setting
publish_error (and keeping the existing logger.warn and publish_error
assignment), immediately rethrow the caught error (or otherwise reject/abort the
surrounding wait) so the caller stops waiting instead of stalling until the
deadline.

In `@harness-node/src/turn-orchestrator/agent-call.ts`:
- Around line 152-157: The code is incorrectly classifying failures from
iii.trigger as gate denials by returning kind: 'deny' with
denialResult(gateUnavailableEnvelope(...)); change this path so arbitrary
runtime/tool errors from iii.trigger are not mapped to authorization denials:
replace the deny/denialResult/gateUnavailableEnvelope return with an
error/failure variant (for example kind: 'error' or 'failure') and wrap the
original error (String(err)) in an execution-specific envelope/result (e.g.,
executionErrorEnvelope or errorResult) that references
function_call.function_id; ensure callers that handle denials versus execution
errors can distinguish the new kind.

In `@harness-node/src/turn-orchestrator/hook.ts`:
- Around line 50-58: The handler currently marks session_id optional which
allows a missing session_id to create an inner payload without it and
accidentally bypass approval; change the function signature to require
session_id (remove the optional ?), and add a runtime guard in the function that
throws or returns a failing HookOutcome when session_id is falsy to enforce
fail-closed behavior; ensure the inner payload construction (the inner variable
that includes session_id, function_call, approval_required) always includes
session_id and update any callsites if needed so extractCall/approval-gate
always receive a non-null session_id.

In `@harness-node/src/turn-orchestrator/states/functions.ts`:
- Around line 82-86: The resumed pre_approval branch dispatches iii.trigger
using raw fc.arguments (entry.pre_approved path) which loses the session/call
metadata added in the normal path; modify the entry.pre_approved === true branch
to build the same augmented payload as the regular path (merge fc.arguments with
the session/call context fields used in the regular flow—e.g., call/session IDs,
caller info, metadata) and pass that merged object to iii.trigger with
fc.function_id so resumed executions receive the same enriched context as
first-pass executions.

In `@harness-node/src/turn-orchestrator/states/tearing-down.ts`:
- Line 27: Remove the development artifact comment "// reload 1779112003" from
the tearing-down.ts file; locate the exact trailing comment text and delete that
line (and any other identical "reload" marker comments if present) so the file
contains no leftover dev reload markers before merging.

In `@harness-node/tests/approval-gate/pending.test.ts`:
- Around line 96-109: The test currently only asserts that
'iii::durable::publish' wasn't called but that function is never invoked by
handleResolveWithEvents here; update the failure-path assertion to ensure no
side-effect triggers occur when InMemoryStateBus.set rejects: after mocking
bus.set to throw, assert that iii.trigger was never called (or specifically
assert that no call had function_id 'stream::set') — locate the InMemoryStateBus
instance 'bus', the mocked 'set' call, and the SDK stub 'iii.trigger' in the
test and replace the current fns check with an assertion on iii.trigger mock
call count or absence of 'stream::set'.

---

Nitpick comments:
In `@console/web/src/lib/backend/types.ts`:
- Around line 20-29: Clarify in the type definition for the 'fcall-start'
message that pendingApproval is a UI lifecycle flag and not derivable from the
presence of IDs: update the inline comments for pendingApproval, functionCallId,
and sessionId in the 'fcall-start' union branch to state that pendingApproval
indicates the message is awaiting user approval (pending → running → done),
while functionCallId and sessionId are identifiers used to resolve an approval
action (approve/deny) and may appear independently in other events such as
function_execution_start; ensure the comments mention that the boolean cannot be
inferred from the presence of the IDs and that approval resolution code should
use functionCallId+sessionId when handling user actions.

In `@harness-node/src/approval-gate/on-decision-written.ts`:
- Around line 33-54: The extracted session_id in handleDecisionWritten is not
validated beyond slicing before the first slash; after computing session_id =
key.slice(0, slash) add a validation step to ensure session_id is non-empty and
well-formed (e.g., not just whitespace and matching your expected pattern such
as /^[A-Za-z0-9_-]+$/ or UUID format if applicable). If validation fails, bail
out early (return) and optionally emit a debug/warn via logger.warn indicating
an invalid session_id with the original key; keep the existing iii.trigger call
and error handling unchanged when session_id is valid.

In `@harness-node/src/approval-gate/state-bus.ts`:
- Around line 43-49: The nested conditional creating items from resp is verbose;
simplify the shape detection by checking Array.isArray(resp) first, then
Array.isArray((resp as any)?.items) and falling back to [] so you can remove the
repeated type assertions and nesting—update the logic that assigns items (the
const items = ... block) to use a single ternary or short if/else that returns
resp when it's an array, resp.items when that is an array, or [] otherwise,
referencing resp and items in state-bus.ts.

In `@harness-node/src/turn-orchestrator/states/assistant.ts`:
- Line 312: Remove the development artifact comment "// reload 1779112003" from
assistant.ts; locate the exact comment string in the file and delete it,
ensuring no surrounding code relies on it (it's a pure comment), then run
linters/tests to confirm no trailing whitespace or formatting issues remain
after removal.

In `@harness-node/tests/hook-fanout/merge.test.ts`:
- Around line 25-44: The test for mergeFirstBlockWins currently checks only
individual fields; update it to assert the merged result is deeply equal to the
original blocking reply to guarantee full object preservation (including any
extra fields). Locate the test case using mergeFirstBlockWins and replace or
augment the individual expects with a single deep-equality assertion (e.g.,
expect(merged).toStrictEqual(replies[1]) or equivalent) so the merged object is
verified verbatim against the blocking reply.

In `@harness-node/tests/integration/approval-resume.e2e.test.ts`:
- Line 22: Change the loose null check `old_value == null` to a strict
comparison `old_value === null` wherever it appears (specifically the ternary
that sets `event_type` and the similar check later), since `old_value` is
already normalized to either a value or null; update both instances to use `===`
to make the intent explicit and avoid loose equality.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 374c6ace-4e44-4d62-9e1b-bc120413b032

📥 Commits

Reviewing files that changed from the base of the PR and between fce3152 and c7b7012.

📒 Files selected for processing (78)
  • .gitignore
  • console/web/src/components/chat/ChatView.tsx
  • console/web/src/components/chat/FunctionCallMessage.tsx
  • console/web/src/components/chat/Message.tsx
  • console/web/src/components/chat/MessageList.tsx
  • console/web/src/components/chat/ModelPicker.tsx
  • console/web/src/components/ui/Select.tsx
  • console/web/src/hooks/use-conversations.ts
  • console/web/src/index.css
  • console/web/src/lib/backend/real.ts
  • console/web/src/lib/backend/translate.ts
  • console/web/src/lib/backend/types.ts
  • console/web/src/lib/models-catalog.ts
  • console/web/src/lib/storage.ts
  • console/web/src/pages/Chat.tsx
  • console/web/src/pages/Playground/index.tsx
  • console/web/src/types/chat.ts
  • console/web/src/types/iii-agent-event.ts
  • harness-node/README.md
  • harness-node/config.yaml
  • harness-node/docs/architecture.md
  • harness-node/docs/workers/approval-gate.md
  • harness-node/docs/workers/harness.md
  • harness-node/docs/workers/hook-fanout.md
  • harness-node/docs/workers/session.md
  • harness-node/docs/workers/turn-orchestrator.md
  • harness-node/src/approval-gate/config.ts
  • harness-node/src/approval-gate/gate-subscriber.ts
  • harness-node/src/approval-gate/on-decision-written.ts
  • harness-node/src/approval-gate/pending.ts
  • harness-node/src/approval-gate/register.ts
  • harness-node/src/approval-gate/state-bus.ts
  • harness-node/src/approval-gate/types.ts
  • harness-node/src/harness/fanout/sessions-poll.ts
  • harness-node/src/hook-fanout/config.ts
  • harness-node/src/hook-fanout/merge.ts
  • harness-node/src/hook-fanout/publish-collect.ts
  • harness-node/src/hook-fanout/types.ts
  • harness-node/src/session/tree/store.ts
  • harness-node/src/session/tree/types.ts
  • harness-node/src/turn-orchestrator/abort.ts
  • harness-node/src/turn-orchestrator/agent-call.ts
  • harness-node/src/turn-orchestrator/config.ts
  • harness-node/src/turn-orchestrator/hook.ts
  • harness-node/src/turn-orchestrator/on-abort-signal.ts
  • harness-node/src/turn-orchestrator/on-terminal.ts
  • harness-node/src/turn-orchestrator/persistence.ts
  • harness-node/src/turn-orchestrator/register.ts
  • harness-node/src/turn-orchestrator/run-start.ts
  • harness-node/src/turn-orchestrator/state.ts
  • harness-node/src/turn-orchestrator/states/assistant.ts
  • harness-node/src/turn-orchestrator/states/functions.ts
  • harness-node/src/turn-orchestrator/states/tearing-down.ts
  • harness-node/src/turn-orchestrator/subscriber.ts
  • harness-node/src/turn-orchestrator/transitions.ts
  • harness-node/src/types/agent-event.ts
  • harness-node/tests/approval-gate/gate-subscriber.test.ts
  • harness-node/tests/approval-gate/on-decision-written.test.ts
  • harness-node/tests/approval-gate/pending.test.ts
  • harness-node/tests/approval-gate/types.test.ts
  • harness-node/tests/harness/policy.test.ts
  • harness-node/tests/hook-fanout/merge.test.ts
  • harness-node/tests/hook-fanout/publish-collect.test.ts
  • harness-node/tests/integration/approval-resume.e2e.test.ts
  • harness-node/tests/session/tree/store.test.ts
  • harness-node/tests/session/tree/types.test.ts
  • harness-node/tests/turn-orchestrator/abort.test.ts
  • harness-node/tests/turn-orchestrator/agent-call.test.ts
  • harness-node/tests/turn-orchestrator/assistant.test.ts
  • harness-node/tests/turn-orchestrator/awaiting-approval.test.ts
  • harness-node/tests/turn-orchestrator/functions.test.ts
  • harness-node/tests/turn-orchestrator/on-abort-signal.test.ts
  • harness-node/tests/turn-orchestrator/on-terminal.test.ts
  • harness-node/tests/turn-orchestrator/persistence-prepared.test.ts
  • harness-node/tests/turn-orchestrator/run-start.test.ts
  • harness-node/tests/turn-orchestrator/state.test.ts
  • harness-node/tests/turn-orchestrator/tearing-down.test.ts
  • iii-permissions.yaml
💤 Files with no reviewable changes (7)
  • harness-node/src/types/agent-event.ts
  • harness-node/config.yaml
  • iii-permissions.yaml
  • harness-node/src/turn-orchestrator/config.ts
  • harness-node/src/hook-fanout/config.ts
  • console/web/src/types/iii-agent-event.ts
  • harness-node/tests/harness/policy.test.ts

| `function_finalize` | same | Persist results; emit `function_call_end` events. |
| `function_execute` | same | Run each call via `dispatchWithHook` → `agent::call`. If the gate returns `pending`, append the call to `awaiting_approval` and transition to `function_awaiting_approval` (the rest of the batch is left for the resumed step). |
| `function_awaiting_approval` | same (`handleAwaitingApproval`) | Read `approvals/<sid>/<cid>` for every entry in `awaiting_approval`. While any decision is still missing, return without stepping (the approval-gate's state trigger will wake us). When all decisions are present, fold them into the prepared snapshot — `allow` → `pre_approved: true`, `deny`/`aborted` → `blocked` with a denial result — clear `awaiting_approval`, and transition back to `function_execute`. |
| `function_finalize` | same | Persist results; emit `function_call_end` + `turn_end` events. |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the emitted event name in the FSM table.

Line 56 says function_call_end, but the orchestrator emits function_execution_end. Keeping the exact event name avoids contract confusion in downstream consumers.

Suggested doc fix
-| `function_finalize` | same | Persist results; emit `function_call_end` + `turn_end` events. |
+| `function_finalize` | same | Persist results; emit `function_execution_end` + `turn_end` events. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/docs/workers/turn-orchestrator.md` at line 56, Update the FSM
docs row for the function_finalize state to use the exact emitted event name:
replace the incorrect `function_call_end` with `function_execution_end` so the
table reads that `function_finalize` persists results and emits
`function_execution_end` + `turn_end`; reference the `function_finalize` state
and the `function_execution_end` event to keep the contract consistent with the
orchestrator implementation.

Comment on lines 65 to 108
if (outcome.kind === 'deny') {
const env = permissionsDenyEnvelope(
call.function_id,
outcome.rule_id,
outcome.matched_constraint,
call.args,
);
await writeEvent(ctx.iii, call.session_id, {
type: 'function_call_denied',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
function_id: call.function_id,
tool_name: call.function_id,
const reply = {
block: true,
reason: env.reason,
denial: env,
subscriber: SUBSCRIBER_NAME,
approval_gate: true,
};
await streamSet(ctx.iii, {
stream_name: 'agent::events',
group_id: call.session_id,
item_id: `approval-${uuidLike()}`,
data: {
type: 'function_call_denied',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
function_id: call.function_id,
tool_name: call.function_id,
denial: env,
},
});
const reply = blockReplyForDeny(env);
await writeHookReply(ctx.iii, call.reply_stream, call.event_id, reply);
return reply;
}

// Step 2: pause-and-wait flow.
const now = Date.now();
const expires_at = now + ctx.cfg.default_timeout_ms;
const record = buildPendingRecord(
call.function_call_id,
call.function_id,
call.args,
now,
ctx.cfg.default_timeout_ms,
);
try {
await ctx.bus.set(
ctx.cfg.approval_state_scope,
pendingKey(call.session_id, call.function_call_id),
record,
);
} catch {
const reply = blockReplyForAllow();
await writeHookReply(ctx.iii, call.reply_stream, call.event_id, reply);
return reply;
}
await writeEvent(ctx.iii, call.session_id, {
type: 'approval_requested',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
function_id: call.function_id,
tool_name: call.function_id,
args: call.args,
expires_at,
});
const awaited = await awaitDecision(
ctx.bus,
ctx.cfg.approval_state_scope,
call.session_id,
call.function_call_id,
expires_at,
);
let reply: { block: false } | { block: true; denial: DenialEnvelope };
let decisionStr: 'allow' | 'deny';
let reasonForEvent: string | null;
if (awaited.kind === 'allow') {
reply = blockReplyForAllow();
decisionStr = 'allow';
reasonForEvent = null;
} else {
const env = userDenyEnvelope(call.function_id, awaited.reason, call.args);
reply = blockReplyForDeny(env);
decisionStr = 'deny';
reasonForEvent = awaited.reason;
}
await writeEvent(ctx.iii, call.session_id, {
type: 'approval_resolved',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
decision: decisionStr,
reason: reasonForEvent,
await streamSet(ctx.iii, {
stream_name: 'agent::events',
group_id: call.session_id,
item_id: `approval-${uuidLike()}`,
data: {
type: 'approval_requested',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
function_id: call.function_id,
tool_name: call.function_id,
args: call.args,
},
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t let agent::events failures suppress gate replies.

In both deny and pending branches, a streamSet failure throws before writeHookReply, so the orchestrator may miss the actual gate response.

🧯 Suggested hardening
+import { logger } from '../runtime/otel.js';
@@
-    await streamSet(ctx.iii, {
-      stream_name: 'agent::events',
-      group_id: call.session_id,
-      item_id: `approval-${uuidLike()}`,
-      data: {
-        type: 'function_call_denied',
-        function_call_id: call.function_call_id,
-        tool_call_id: call.function_call_id,
-        function_id: call.function_id,
-        tool_name: call.function_id,
-        denial: env,
-      },
-    });
+    try {
+      await streamSet(ctx.iii, {
+        stream_name: 'agent::events',
+        group_id: call.session_id,
+        item_id: `approval-${uuidLike()}`,
+        data: {
+          type: 'function_call_denied',
+          function_call_id: call.function_call_id,
+          tool_call_id: call.function_call_id,
+          function_id: call.function_id,
+          tool_name: call.function_id,
+          denial: env,
+        },
+      });
+    } catch (err) {
+      logger.warn('approval-gate: failed to emit function_call_denied event', {
+        err: String(err),
+      });
+    }
@@
-  await streamSet(ctx.iii, {
-    stream_name: 'agent::events',
-    group_id: call.session_id,
-    item_id: `approval-${uuidLike()}`,
-    data: {
-      type: 'approval_requested',
-      function_call_id: call.function_call_id,
-      tool_call_id: call.function_call_id,
-      function_id: call.function_id,
-      tool_name: call.function_id,
-      args: call.args,
-    },
-  });
+  try {
+    await streamSet(ctx.iii, {
+      stream_name: 'agent::events',
+      group_id: call.session_id,
+      item_id: `approval-${uuidLike()}`,
+      data: {
+        type: 'approval_requested',
+        function_call_id: call.function_call_id,
+        tool_call_id: call.function_call_id,
+        function_id: call.function_id,
+        tool_name: call.function_id,
+        args: call.args,
+      },
+    });
+  } catch (err) {
+    logger.warn('approval-gate: failed to emit approval_requested event', {
+      err: String(err),
+    });
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (outcome.kind === 'deny') {
const env = permissionsDenyEnvelope(
call.function_id,
outcome.rule_id,
outcome.matched_constraint,
call.args,
);
await writeEvent(ctx.iii, call.session_id, {
type: 'function_call_denied',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
function_id: call.function_id,
tool_name: call.function_id,
const reply = {
block: true,
reason: env.reason,
denial: env,
subscriber: SUBSCRIBER_NAME,
approval_gate: true,
};
await streamSet(ctx.iii, {
stream_name: 'agent::events',
group_id: call.session_id,
item_id: `approval-${uuidLike()}`,
data: {
type: 'function_call_denied',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
function_id: call.function_id,
tool_name: call.function_id,
denial: env,
},
});
const reply = blockReplyForDeny(env);
await writeHookReply(ctx.iii, call.reply_stream, call.event_id, reply);
return reply;
}
// Step 2: pause-and-wait flow.
const now = Date.now();
const expires_at = now + ctx.cfg.default_timeout_ms;
const record = buildPendingRecord(
call.function_call_id,
call.function_id,
call.args,
now,
ctx.cfg.default_timeout_ms,
);
try {
await ctx.bus.set(
ctx.cfg.approval_state_scope,
pendingKey(call.session_id, call.function_call_id),
record,
);
} catch {
const reply = blockReplyForAllow();
await writeHookReply(ctx.iii, call.reply_stream, call.event_id, reply);
return reply;
}
await writeEvent(ctx.iii, call.session_id, {
type: 'approval_requested',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
function_id: call.function_id,
tool_name: call.function_id,
args: call.args,
expires_at,
});
const awaited = await awaitDecision(
ctx.bus,
ctx.cfg.approval_state_scope,
call.session_id,
call.function_call_id,
expires_at,
);
let reply: { block: false } | { block: true; denial: DenialEnvelope };
let decisionStr: 'allow' | 'deny';
let reasonForEvent: string | null;
if (awaited.kind === 'allow') {
reply = blockReplyForAllow();
decisionStr = 'allow';
reasonForEvent = null;
} else {
const env = userDenyEnvelope(call.function_id, awaited.reason, call.args);
reply = blockReplyForDeny(env);
decisionStr = 'deny';
reasonForEvent = awaited.reason;
}
await writeEvent(ctx.iii, call.session_id, {
type: 'approval_resolved',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
decision: decisionStr,
reason: reasonForEvent,
await streamSet(ctx.iii, {
stream_name: 'agent::events',
group_id: call.session_id,
item_id: `approval-${uuidLike()}`,
data: {
type: 'approval_requested',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
function_id: call.function_id,
tool_name: call.function_id,
args: call.args,
},
});
if (outcome.kind === 'deny') {
const env = permissionsDenyEnvelope(
call.function_id,
outcome.rule_id,
outcome.matched_constraint,
call.args,
);
const reply = {
block: true,
reason: env.reason,
denial: env,
subscriber: SUBSCRIBER_NAME,
approval_gate: true,
};
try {
await streamSet(ctx.iii, {
stream_name: 'agent::events',
group_id: call.session_id,
item_id: `approval-${uuidLike()}`,
data: {
type: 'function_call_denied',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
function_id: call.function_id,
tool_name: call.function_id,
denial: env,
},
});
} catch (err) {
logger.warn('approval-gate: failed to emit function_call_denied event', {
err: String(err),
});
}
await writeHookReply(ctx.iii, call.reply_stream, call.event_id, reply);
return reply;
}
try {
await streamSet(ctx.iii, {
stream_name: 'agent::events',
group_id: call.session_id,
item_id: `approval-${uuidLike()}`,
data: {
type: 'approval_requested',
function_call_id: call.function_call_id,
tool_call_id: call.function_call_id,
function_id: call.function_id,
tool_name: call.function_id,
args: call.args,
},
});
} catch (err) {
logger.warn('approval-gate: failed to emit approval_requested event', {
err: String(err),
});
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/approval-gate/gate-subscriber.ts` around lines 65 - 108, The
current logic calls streamSet before writeHookReply so any streamSet error will
prevent the gate response from being sent; to fix, wrap the streamSet calls in a
try/catch in both the deny branch (where permissionsDenyEnvelope is used and
reply is returned) and the pending branch (approval_requested), log or handle
the streamSet error but always proceed to call writeHookReply with the
constructed reply (and return it in the deny case); reference streamSet,
writeHookReply, permissionsDenyEnvelope, SUBSCRIBER_NAME, uuidLike and ensure
the deny branch still returns the reply after writeHookReply even if streamSet
failed.

Comment on lines +68 to +79
await streamSet(iii, {
stream_name: 'agent::events',
group_id: session_id,
item_id: `approval-${uuidLike()}`,
data: {
type: 'approval_resolved',
function_call_id,
tool_call_id: function_call_id,
decision,
reason,
},
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle approval_resolved event-write failures without failing resolve.

handleResolve already committed the decision. If streamSet throws here, the API path fails after a successful write, which invites duplicate retries and misleading client errors.

🛠️ Suggested resilience patch
-  await streamSet(iii, {
-    stream_name: 'agent::events',
-    group_id: session_id,
-    item_id: `approval-${uuidLike()}`,
-    data: {
-      type: 'approval_resolved',
-      function_call_id,
-      tool_call_id: function_call_id,
-      decision,
-      reason,
-    },
-  });
+  try {
+    await streamSet(iii, {
+      stream_name: 'agent::events',
+      group_id: session_id,
+      item_id: `approval-${uuidLike()}`,
+      data: {
+        type: 'approval_resolved',
+        function_call_id,
+        tool_call_id: function_call_id,
+        decision,
+        reason,
+      },
+    });
+  } catch (err) {
+    logger.warn('approval-gate: resolved state written but event emit failed', {
+      err: String(err),
+      session_id,
+      function_call_id,
+    });
+  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/approval-gate/pending.ts` around lines 68 - 79,
handleResolve currently calls streamSet to emit an 'approval_resolved' event
(stream_name 'agent::events', item_id `approval-${uuidLike()}`) but doesn't
handle failures, so a thrown error can make the API call fail even after the
decision is committed; wrap the streamSet(...) call in a try/catch inside
handleResolve, and on error do not rethrow—instead log the failure (using the
module's existing logger or console.error) with contextual details (session_id,
function_call_id, decision, and the caught error) so the resolve flow remains
successful while still recording the event-write failure.

Comment on lines +188 to +201
(async () => {
try {
const envelope = buildPublishEnvelope(topic, event_id, inner);
await iii.trigger<unknown, unknown>({
function_id: 'iii::durable::publish',
payload: envelope,
});
} catch (err) {
logger.warn('hook-fanout::publish_collect: publish trigger failed', {
topic,
err: String(err),
});
publish_error = err instanceof Error ? err.message : String(err);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when iii::durable::publish fails.

The catch block only sets publish_error; the call still waits until deadline. That stalls requests unnecessarily even though publish already failed.

⚡ Suggested fix
       (async () => {
         try {
           const envelope = buildPublishEnvelope(topic, event_id, inner);
           await iii.trigger<unknown, unknown>({
             function_id: 'iii::durable::publish',
             payload: envelope,
           });
         } catch (err) {
           logger.warn('hook-fanout::publish_collect: publish trigger failed', {
             topic,
             err: String(err),
           });
           publish_error = err instanceof Error ? err.message : String(err);
+          settle(
+            collector,
+            collector.replies.length === 0 ? 'deadline_no_replies' : 'deadline',
+          );
         }
       })();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
(async () => {
try {
const envelope = buildPublishEnvelope(topic, event_id, inner);
await iii.trigger<unknown, unknown>({
function_id: 'iii::durable::publish',
payload: envelope,
});
} catch (err) {
logger.warn('hook-fanout::publish_collect: publish trigger failed', {
topic,
err: String(err),
});
publish_error = err instanceof Error ? err.message : String(err);
}
(async () => {
try {
const envelope = buildPublishEnvelope(topic, event_id, inner);
await iii.trigger<unknown, unknown>({
function_id: 'iii::durable::publish',
payload: envelope,
});
} catch (err) {
logger.warn('hook-fanout::publish_collect: publish trigger failed', {
topic,
err: String(err),
});
publish_error = err instanceof Error ? err.message : String(err);
settle(
collector,
collector.replies.length === 0 ? 'deadline_no_replies' : 'deadline',
);
}
})();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/hook-fanout/publish-collect.ts` around lines 188 - 201, The
publish try/catch silently records publish_error but lets the IIFE continue
waiting until the deadline; modify the catch in the async IIFE that calls
buildPublishEnvelope and iii.trigger (function_id 'iii::durable::publish') so it
fails fast: after logging and setting publish_error (and keeping the existing
logger.warn and publish_error assignment), immediately rethrow the caught error
(or otherwise reject/abort the surrounding wait) so the caller stops waiting
instead of stalling until the deadline.

Comment on lines +152 to +157
return {
kind: 'deny',
result: denialResult(
gateUnavailableEnvelope(function_call.function_id, `trigger_failed: ${String(err)}`),
),
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not classify target function execution errors as gate denials.

This path wraps arbitrary iii.trigger execution failures as gate_unavailable denials, which conflates runtime/tool errors with authorization failures and can mislead downstream handling.

Suggested fix
-    return {
-      kind: 'deny',
-      result: denialResult(
-        gateUnavailableEnvelope(function_call.function_id, `trigger_failed: ${String(err)}`),
-      ),
-    };
+    return {
+      kind: 'result',
+      result: errorResult({
+        error: 'trigger_failed',
+        function: function_call.function_id,
+        message: String(err),
+      }),
+    };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/turn-orchestrator/agent-call.ts` around lines 152 - 157, The
code is incorrectly classifying failures from iii.trigger as gate denials by
returning kind: 'deny' with denialResult(gateUnavailableEnvelope(...)); change
this path so arbitrary runtime/tool errors from iii.trigger are not mapped to
authorization denials: replace the deny/denialResult/gateUnavailableEnvelope
return with an error/failure variant (for example kind: 'error' or 'failure')
and wrap the original error (String(err)) in an execution-specific
envelope/result (e.g., executionErrorEnvelope or errorResult) that references
function_call.function_id; ensure callers that handle denials versus execution
errors can distinguish the new kind.

Comment on lines +50 to 58
session_id?: string,
): Promise<HookOutcome> {
const inner = { function_call, approval_required };
// PR #150: include session_id at the inner-payload root so approval-gate's
// extractCall can route it. Without this, the gate sees a null session_id,
// returns {block:false}, and the call passes through unapproved.
const inner = session_id
? { session_id, function_call, approval_required }
: { function_call, approval_required };
const payload = {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Require session_id to preserve fail-closed approval behavior

Line 50 makes session_id optional, but Lines 52-54 explicitly document that missing session_id can produce block:false, which becomes allow at Line 88. That creates a bypass path if any caller omits it.

Suggested fix
 export async function consultBefore(
   iii: ISdk,
   function_call: FunctionCall,
   approval_required: string[],
-  session_id?: string,
+  session_id: string,
 ): Promise<HookOutcome> {
+  if (!session_id) {
+    return {
+      kind: 'deny',
+      denial: gateUnavailableEnvelope(
+        function_call.function_id,
+        'permission gate rejected request: missing session_id',
+      ),
+    };
+  }
-  const inner = session_id
-    ? { session_id, function_call, approval_required }
-    : { function_call, approval_required };
+  const inner = { session_id, function_call, approval_required };

Also applies to: 87-90

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/turn-orchestrator/hook.ts` around lines 50 - 58, The handler
currently marks session_id optional which allows a missing session_id to create
an inner payload without it and accidentally bypass approval; change the
function signature to require session_id (remove the optional ?), and add a
runtime guard in the function that throws or returns a failing HookOutcome when
session_id is falsy to enforce fail-closed behavior; ensure the inner payload
construction (the inner variable that includes session_id, function_call,
approval_required) always includes session_id and update any callsites if needed
so extractCall/approval-gate always receive a non-null session_id.

Comment on lines +82 to +86
if (entry.pre_approved === true) {
const value = await iii.trigger<unknown, unknown>({
function_id: fc.function_id,
payload: fc.arguments ?? {},
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve augmented call context on resumed approvals.

Line 83 dispatches pre_approved calls with raw fc.arguments, but the regular path (Lines 104–126) augments args with session/call metadata. That makes resumed approved executions semantically different from first-pass executions.

Suggested fix
-    if (entry.pre_approved === true) {
-      const value = await iii.trigger<unknown, unknown>({
-        function_id: fc.function_id,
-        payload: fc.arguments ?? {},
-      });
+    if (entry.pre_approved === true) {
+      const augmentedPayload =
+        fc.arguments && typeof fc.arguments === 'object' && !Array.isArray(fc.arguments)
+          ? { ...(fc.arguments as Record<string, unknown>) }
+          : { arguments: fc.arguments };
+      augmentedPayload.session_id = rec.session_id;
+      augmentedPayload.function_call_id = fc.id;
+      augmentedPayload.function_id = fc.function_id;
+      augmentedPayload.function_call = {
+        id: fc.id,
+        function_id: fc.function_id,
+        arguments: fc.arguments,
+      };
+      const value = await iii.trigger<unknown, unknown>({
+        function_id: fc.function_id,
+        payload: augmentedPayload,
+      });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/turn-orchestrator/states/functions.ts` around lines 82 - 86,
The resumed pre_approval branch dispatches iii.trigger using raw fc.arguments
(entry.pre_approved path) which loses the session/call metadata added in the
normal path; modify the entry.pre_approved === true branch to build the same
augmented payload as the regular path (merge fc.arguments with the session/call
context fields used in the regular flow—e.g., call/session IDs, caller info,
metadata) and pass that merged object to iii.trigger with fc.function_id so
resumed executions receive the same enriched context as first-pass executions.

await emit(iii, rec.session_id, { type: 'agent_end', messages });
transitionTo(rec, 'stopped');
}
// reload 1779112003
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the reload marker comment.

The trailing // reload 1779112003 comment appears to be a development artifact with no functional purpose and should be removed before merge.

🧹 Proposed fix
-// reload 1779112003
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// reload 1779112003
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/src/turn-orchestrator/states/tearing-down.ts` at line 27, Remove
the development artifact comment "// reload 1779112003" from the tearing-down.ts
file; locate the exact trailing comment text and delete that line (and any other
identical "reload" marker comments if present) so the file contains no leftover
dev reload markers before merging.

Comment on lines +96 to +109
it('does not publish when state write fails', async () => {
const bus = new InMemoryStateBus();
const out = await awaitDecision(bus, 'approvals', 's1', 'tc-1', Date.now() + 1_000);
expect(out.kind).toBe('deny');
if (out.kind === 'deny') expect(out.reason).toBe('state_unavailable');
vi.spyOn(bus, 'set').mockRejectedValue(new Error('boom'));
const iii = { trigger: vi.fn() } as unknown as ISdk;
const out = await handleResolveWithEvents(iii, bus, 'approvals', {
session_id: 's1',
function_call_id: 'fc-1',
decision: 'allow',
});
expect((out as Record<string, unknown>).ok).toBe(false);
const fns = (iii.trigger as ReturnType<typeof vi.fn>).mock.calls.map(
(c) => (c[0] as { function_id: string }).function_id,
);
expect(fns).not.toContain('iii::durable::publish');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Strengthen the failure-path side-effect assertion.

This assertion won’t catch regressions: handleResolveWithEvents never calls iii::durable::publish here, even on success. Assert no trigger calls (or specifically no stream::set) when bus.set fails.

✅ Suggested test fix
-    const fns = (iii.trigger as ReturnType<typeof vi.fn>).mock.calls.map(
-      (c) => (c[0] as { function_id: string }).function_id,
-    );
-    expect(fns).not.toContain('iii::durable::publish');
+    const fns = (iii.trigger as ReturnType<typeof vi.fn>).mock.calls.map(
+      (c) => (c[0] as { function_id: string }).function_id,
+    );
+    expect(fns).toHaveLength(0);
+    expect(fns).not.toContain('stream::set');
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
it('does not publish when state write fails', async () => {
const bus = new InMemoryStateBus();
const out = await awaitDecision(bus, 'approvals', 's1', 'tc-1', Date.now() + 1_000);
expect(out.kind).toBe('deny');
if (out.kind === 'deny') expect(out.reason).toBe('state_unavailable');
vi.spyOn(bus, 'set').mockRejectedValue(new Error('boom'));
const iii = { trigger: vi.fn() } as unknown as ISdk;
const out = await handleResolveWithEvents(iii, bus, 'approvals', {
session_id: 's1',
function_call_id: 'fc-1',
decision: 'allow',
});
expect((out as Record<string, unknown>).ok).toBe(false);
const fns = (iii.trigger as ReturnType<typeof vi.fn>).mock.calls.map(
(c) => (c[0] as { function_id: string }).function_id,
);
expect(fns).not.toContain('iii::durable::publish');
it('does not publish when state write fails', async () => {
const bus = new InMemoryStateBus();
vi.spyOn(bus, 'set').mockRejectedValue(new Error('boom'));
const iii = { trigger: vi.fn() } as unknown as ISdk;
const out = await handleResolveWithEvents(iii, bus, 'approvals', {
session_id: 's1',
function_call_id: 'fc-1',
decision: 'allow',
});
expect((out as Record<string, unknown>).ok).toBe(false);
const fns = (iii.trigger as ReturnType<typeof vi.fn>).mock.calls.map(
(c) => (c[0] as { function_id: string }).function_id,
);
expect(fns).toHaveLength(0);
expect(fns).not.toContain('stream::set');
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness-node/tests/approval-gate/pending.test.ts` around lines 96 - 109, The
test currently only asserts that 'iii::durable::publish' wasn't called but that
function is never invoked by handleResolveWithEvents here; update the
failure-path assertion to ensure no side-effect triggers occur when
InMemoryStateBus.set rejects: after mocking bus.set to throw, assert that
iii.trigger was never called (or specifically assert that no call had
function_id 'stream::set') — locate the InMemoryStateBus instance 'bus', the
mocked 'set' call, and the SDK stub 'iii.trigger' in the test and replace the
current fns check with an assertion on iii.trigger mock call count or absence of
'stream::set'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants