Skip to content

Auto-accept toggle, LM Studio + llama.cpp providers, /review hardening#184

Merged
andersonleal merged 5 commits into
mainfrom
wip/auto-accept-and-lmstudio
May 22, 2026
Merged

Auto-accept toggle, LM Studio + llama.cpp providers, /review hardening#184
andersonleal merged 5 commits into
mainfrom
wip/auto-accept-and-lmstudio

Conversation

@andersonleal
Copy link
Copy Markdown
Collaborator

@andersonleal andersonleal commented May 22, 2026

Summary

Four cohesive workstreams on this branch:

  1. Auto-accept toggle for the chat UI — per-conversation policy
    to auto-resolve safe approval prompts, with a client-side
    defense-in-depth deny filter for state-mutating / destructive /
    egress / recursive-dispatch / credential surfaces.
  2. provider-lmstudio — local LM Studio Chat Completions
    provider (load/unload/refresh + chat) on the iii bus.
  3. provider-llamacpp — sibling provider for llama-server
    (single-model-per-process runtime; OpenAI-compat chat path).
  4. /review hardening — security, perf, a11y, and test
    improvements across the three above + the orchestrator and
    context-compaction handlers.

A small skill-id↔function-id disambiguation pass and a CI
language=javascript removal landed alongside in the base WIP
commit; they're not gating but the review thread suggested splitting
them out in follow-ups.

What's in each commit

  • 6c0c8b9c wip: auto-accept toggle, LM Studio provider, gemma routing
    — initial WIP. Adds AutoAcceptToggle, the LM Studio provider with
    load/unload/refresh + auto-load retry, gemma routing intent, the
    skill-id disambiguation prompt hints, and compaction_done events.
  • 604f3130 fix(chat): defense-in-depth auto-accept policy + extract hook + a11y
    auto-accept-policy.ts deny filter, extracted
    useAutoAcceptApprovals hook (with hot-path early exit),
    <LiveRegion> + useLiveAnnouncer, AutoAcceptToggle a11y rework
    (focus-visible, aria-describedby, aria-disabled, high-contrast
    ON state per WCAG 1.4.3), handleSubmit deps cleanup,
    MessageList reverse-loop, formatStopReason extraction.
  • 9ffef133 fix(harness): security, perf, and test hardening from /review
    — LM Studio + kimi SSE DoS guard (clamp tool-call index to ≤256),
    errors moved to error_message only (no text ContentBlock
    injection sink), O(1) Map-based wire-message dedup, parallel
    discovery via Promise.allSettled, loopback-aware auth (omit
    Authorization on non-loopback without an explicit credential),
    validated LMSTUDIO_BASE_URL parsing, non-2xx body truncation +
    control-char strip, bounded existingResultIds rebuild in the
    orchestrator, extracted emitCompactionDone shared helper. Test
    improvements: fake-timer stream timeout, tightened error_kind
    assertion, abortedFromController assertion in load timeout,
    Anthropic dedup scenarios ported to LM Studio, full coverage of
    the new auth / config shapes.
  • d6157847 feat(harness): add provider-llamacpp + bump default fetch timeout
    — entire new harness/src/provider-llamacpp/ worker (14 source
    files + 8 test files + docs), wiring across router /
    auth-credentials / models-catalog / harness index. Default fetch
    timeout 30s → 120s with LLAMACPP_FETCH_TIMEOUT_MS env override
    (reads fresh per-call). 35B+ GGUFs over LAN need it.

Diff scope

125 files changed, ~11,900 insertions, ~60 deletions. New worker
(provider-llamacpp) and a sibling design doc / Open Design artifact
under console/web/DESIGN.md account for most of the volume.

Test plan

  • cd harness && bun run test — 821 passing across 105 files
    (provider-llamacpp + new auth/catalog seeds + provider-router
    widening on top of the 789 we landed at /review time).
  • cd console/web && bun run test — 235 passing across 17 files
    (auto-accept-policy regression tests for the shell:: namespace
    fix, useAutoAcceptApprovals wiring-guard, format-stop-reason).
  • cd harness && bunx tsc --noEmit — clean.
  • cd console/web && bunx tsc --noEmit -p . — clean.
  • Manual: open the console, flip auto-accept ON, send a turn
    that triggers shell::fs::read or another safe read. The card
    should auto-resolve with an SR announcement. A shell::fs::write
    / agent::trigger should still require a click.
  • Manual: start llama-server -m <model.gguf> on :8080,
    configure the console to point at it (or set LLAMACPP_BASE_URL),
    pick the model in the picker, send a message. Tool calls should
    flow through. For 35B+ models also set LLAMACPP_FETCH_TIMEOUT_MS=180000.
  • Manual: revisit AutoAcceptToggle keyboard nav — Tab into it,
    Space toggles, focus ring is visible in both light and dark
    themes.

Follow-ups (out of scope here per /review)

  • Shared openai-compat-sse kernel extraction across
    kimi/lmstudio/llamacpp (the duplication ratchet got tighter with
    llamacpp landing). Belongs in its own PR.
  • Split the bundled skill-id↔function-id disambiguation + CI
    language-removal into separate PRs.
  • The base WIP commit's "gemma routing" claim has no gemma
    references in the diff. Either land the routing or correct the
    message before merge.

Summary by CodeRabbit

  • New Features

    • Added auto-accept toggle for function approvals in conversations
    • Introduced LM Studio and Llama.cpp provider support with model discovery and loading
    • Added visual notifications for context compaction events
    • Published comprehensive design system documentation with component specifications
  • Bug Fixes

    • Fixed idempotency issues preventing duplicate messages on retry
    • Improved tool-call boundary handling for provider compatibility
    • Enhanced error messages for function-not-found scenarios
  • Documentation

    • Updated skill documentation with explicit callable function identifiers
    • Added architecture diagrams and configuration guides for new providers

Review Change Stack

Squashed bundle of in-flight work on top of origin/main (post
harness-node -> harness rename). Several independent concerns; split
into focused PRs before shipping.

console/web - auto-accept per-conversation toggle:
- New AutoAcceptToggle component + auto-accept storage helper with
  tests; ChatPanel/ChatView/Composer wire it in.
- Storage layer persists per-conversation flag; translate layer
  honors it for incoming approval events.
- Chat type + iii-agent-event additions; Examples & Playground
  updates.

harness - provider-lmstudio worker:
- New worker under src/provider-lmstudio/ (auth, complete, config,
  discover, load/unload, refresh, register, sse, stream, types,
  wire-messages, wire-tools) with accompanying tests.
- iii.worker.yaml + docs/workers/provider-lmstudio.md.
- Manual refresh after loading/unloading a model in LM Studio is
  available via the existing CLI (no helper script needed):
    iii trigger provider::lmstudio::refresh_models --json '{}'

harness - models catalog + routing for gemma:
- models-catalog/models.json adds google/gemma-4-e4b (and related).
- turn-orchestrator/provider-router routes gemma to lmstudio instead
  of falling back to anthropic.

harness - misc refinements:
- context-compaction handlers emit a done event.
- turn-orchestrator agent-trigger + states (assistant, functions)
  + system-prompt updates.
- Provider wire-messages tweaks across anthropic, kimi, openai.

harness - QA docs:
- docs/qa/lmstudio-chat-e2e.md: long-form Python runbook for
  CLI re-play of the chat path (kickoff payload, poll loop,
  failure modes).
- docs/qa/agent-howto.md: terse quickstart pointing at the runbook,
  with pass criteria + fail-diagnosis table for the canonical
  sandbox+exec+stop scenario.

iii-directory:
- skills/directory/**/*.md re-document function_id argument shapes.
- fs_source.rs + functions/skills.rs source updates.

Misc: Cargo.lock bumps (database, iii-lsp, image-resize, mcp),
image-resize/example/config.yaml, README + harness/docs/architecture.md
updates (provider-lmstudio row added to worker catalogue, mermaid
diagram includes provLms node).
Hardens the per-conversation auto-accept toggle in three layers:

Safety. Adds `auto-accept-policy.ts` — a defense-in-depth deny filter
that refuses to silently click "allow" on calls whose function id
matches state-mutating, destructive, process-spawn, external-egress,
recursive-dispatch, or credential patterns. The server-side approval
gate is still in the chain regardless; the policy is a client-side
guard against prompt-injection-driven escalation when auto-accept is
on. Deny verbs are boundary-anchored so the `shell::` worker
namespace doesn't poison safe reads like `shell::fs::read` /
`shell::fs::ls`.

Hot path. Extracts the inline auto-accept effect from `ChatView` into
`useAutoAcceptApprovals` with a hot-path early exit when the last
message isn't a function-call. Drops `conversation.messages` from
`handleSubmit`'s deps and reads through a ref instead so the callback
identity is stable across token-by-token renders (previously rebuilt
every token, which thrashed Lexical's `KEY_ENTER_COMMAND` listener).
`MessageList` swaps `[...messages].reverse().find(...)` for a reverse
for-loop to drop a per-token full-array allocation.

A11y. Reworks `AutoAcceptToggle`: high-contrast `bg-ink text-bg` ON
state (WCAG 1.4.3), `focus-visible` outline ring, `aria-describedby`
for the explanatory text (the `title` tooltip was keyboard/SR
invisible), `aria-disabled` instead of `disabled` so the policy
control stays in the tab order while streaming, and the visible
"on"/"off" text now carries the accessible name (was overridden by
an `aria-label` that doubled with `role="switch"`). Adds a shared
`<LiveRegion>` + `useLiveAnnouncer` so SR users hear auto-accept
events, denial reasons, stop-reasons (assertive on error), and
compaction markers.

Extracts `formatStopReason` to its own module with tests.

Tests: 235 passing (was 233 pre-fix).
Provider-lmstudio (and provider-kimi for the shared SSE DoS guard):

- SSE: clamp attacker-controlled tool-call `index` to [0, 256] so a
  hostile or buggy server can't force unbounded `state.tool_calls`
  allocation via `{"index": 1e9}` in a delta. Same guard added to
  provider-kimi/sse.ts.
- SSE: render provider errors via `error_message` ONLY, not as a
  `text` ContentBlock. Pre-fix a malicious LM Studio backend or MITM
  could smuggle "tool approved" lines / markup into the persisted
  assistant content stream where it would be re-fed to the next turn
  as trusted context.
- wire-messages: replace `out.findIndex` tool-result dedup with an
  `O(1)` `Map<tool_call_id, idx>` lookup. Long histories with many
  tool calls were O(M²) per translation, amplified on LM Studio
  because the auto-load retry can re-translate the same history
  twice in one user turn.
- discover: `Promise.allSettled` the per-model `models::register`
  fan-out so startup-blocking time is bounded by one round-trip's
  timeout instead of N × REGISTER_TIMEOUT_MS. Move full id list to
  DEBUG (count stays at INFO) so fine-tuned customer ids don't leak
  into shared observability tooling.
- auth: loopback-aware key selection. Omit `Authorization` entirely
  on non-loopback hosts without an explicit credential (don't ship a
  synthetic `Bearer lm-studio` fingerprint to whoever operates a
  misconfigured remote). Log auth-fetch failures at WARN with a
  stable code so monitoring can alert on a sustained fallback rate
  (DoS-the-auth-worker → force-fallback attack).
- config: validate `LMSTUDIO_BASE_URL` as http(s) via `new URL()`.
  Reject malformed values + non-http(s) schemes; fall through to
  yaml/default. Warn when the resolved host is non-loopback.
- stream: truncate non-2xx response bodies to 256 bytes and strip
  control chars before surfacing — non-2xx can carry HTML auth-
  redirect pages, proxy error bodies, or attacker-controlled text.

Turn-orchestrator:

- `states/functions.ts`: bound the `existingResultIds` rebuild by
  walking messages backwards and stopping at the assistant boundary
  once the incoming function_result ids are covered. Pre-fix this
  was O(history) per finalize.

Context-compaction:

- Extract `emitCompactionDone` helper. The sync + async handlers
  carried byte-for-byte identical try/catch/emit blocks — collapsed
  to one call each, with a stable failure code.

Tests:

- stream timeout: switch to fake timers (was real-timer-bound,
  flaky on slow CI). Eagerly attach a no-op `.catch()` to silence a
  fake-timer-ordering "unhandled rejection" warning.
- sse: tighten `error_kind` assertion (`toBeDefined` → `toBe('permanent')`).
- load timeout: assert `abortedFromController` so a refactor that
  surfaces a timeout error WITHOUT actually calling `controller.abort()`
  is caught.
- wire-messages: port the three Anthropic dedup scenarios (collapse,
  distinct-id ordering, across-batch separation) to keep the providers
  in lockstep.
- auth/config: cover the new `isLoopbackUrl`, `selectAuthKey`,
  `buildAuthHeaders`, and URL-validation shapes.

Tests: 789 passing (was 710 pre-review).
Adds a llama.cpp `llama-server` provider as a peer to provider-lmstudio.
Same OpenAI-compatible chat path; runtime-management surface trimmed
because llama-server hosts exactly one model per process.

What's the same as provider-lmstudio:

- OpenAI Chat Completions wire format with SSE streaming.
- `reasoning_content` round-trip for thinking-mode models (DeepSeek-R1,
  Qwen-thinking) when llama-server is started with
  `--jinja --reasoning-format deepseek`.
- Tool calls via `--jinja` + a tool-aware chat template.
- The DoS guard (tool-call index clamp), security posture (error in
  `error_message` only, non-2xx truncation), and perf shape (O(1)
  tool-result dedup, parallel discovery via Promise.allSettled).

What's different:

- Default port 8080 (vs LM Studio's 1234).
- No fallback bearer string. Unlike LM Studio there is no documented
  default token; on loopback without a credential we omit
  `Authorization` entirely. On non-loopback without a credential we
  also omit AND log a WARN with code `llamacpp_auth_omitted_nonloopback`.
- No `load_model` / `unload_model` bus functions. llama-server can't
  switch models at runtime (`-m` is set at process start). Only
  `complete`, `stream`, and `refresh_models` are registered.
- Discovery uses OpenAI-compat `/v1/models` (id only) rather than LM
  Studio's `/api/v0/models` (rich metadata, all-downloaded). The
  startup fire-and-forget call registers the single loaded model.
- `llamacpp-local` catalog placeholder (alongside `lmstudio-local`)
  so capability gating (supports_tools etc.) works for any
  user-loaded model id.

Bumps the fetch timeout default 30s → 120s. 30s was too short for
large GGUFs (35B+ Q4) where prompt-processing alone can take tens of
seconds before the first token arrives, especially over LAN. Adds
`LLAMACPP_FETCH_TIMEOUT_MS` env override (positive integer ms;
values <1000 ignored). Read fresh per-call so operators can adjust
without restarting the worker, and so tests can mutate it.

Wiring:

- `harness/src/index.ts` registers the worker entry.
- `turn-orchestrator/provider-router.ts` adds the `'llamacpp'` route
  + `targetFunctionId` branch. Same "no model-name heuristic"
  posture as lmstudio (id namespaces overlap with HF-style ids from
  other services; require explicit `provider='llamacpp'`).
- `auth-credentials/types.ts` ENV_VAR_MAP: `LLAMACPP_API_KEY`.
- `models-catalog/catalog.ts` + `models.json`: `llamacpp-local`
  placeholder fallback.

Docs at `harness/docs/workers/provider-llamacpp.md` (configuration,
auth policy, defenses, side-by-side diff against provider-lmstudio).

Tests: 789 → 821 (+32 new across 9 test files + provider-router
broadening for the new branch).
@vercel
Copy link
Copy Markdown

vercel Bot commented May 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
workers Ready Ready Preview, Comment May 22, 2026 6:35pm

Request Review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 22, 2026

Warning

Rate limit exceeded

@andersonleal has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 48 minutes and 21 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c415530c-f396-48e4-9920-c9f2f0e9a714

📥 Commits

Reviewing files that changed from the base of the PR and between d615784 and 0254648.

📒 Files selected for processing (36)
  • console/web/src/hooks/use-auto-accept-approvals.test.ts
  • harness/src/provider-anthropic/wire-messages.ts
  • harness/src/provider-kimi/sse.ts
  • harness/src/provider-llamacpp/auth.ts
  • harness/src/provider-llamacpp/config.ts
  • harness/src/provider-llamacpp/discover.ts
  • harness/src/provider-llamacpp/refresh-fn.ts
  • harness/src/provider-llamacpp/sse.ts
  • harness/src/provider-llamacpp/stream.ts
  • harness/src/provider-llamacpp/types.ts
  • harness/src/provider-lmstudio/auth.ts
  • harness/src/provider-lmstudio/config.ts
  • harness/src/provider-lmstudio/discover.ts
  • harness/src/provider-lmstudio/refresh-fn.ts
  • harness/src/provider-lmstudio/sse.ts
  • harness/src/provider-lmstudio/stream.ts
  • harness/src/provider-lmstudio/unload-fn.ts
  • harness/src/turn-orchestrator/agent-trigger.ts
  • harness/tests/context-compaction/compaction-done-emit.test.ts
  • harness/tests/provider-anthropic/wire-messages.test.ts
  • harness/tests/provider-kimi/sse.test.ts
  • harness/tests/provider-llamacpp/auth.test.ts
  • harness/tests/provider-llamacpp/discover.test.ts
  • harness/tests/provider-llamacpp/sse.test.ts
  • harness/tests/provider-llamacpp/stream.test.ts
  • harness/tests/provider-llamacpp/wire-messages.test.ts
  • harness/tests/provider-lmstudio/auth.test.ts
  • harness/tests/provider-lmstudio/discover.test.ts
  • harness/tests/provider-lmstudio/load.test.ts
  • harness/tests/provider-lmstudio/sse.test.ts
  • harness/tests/provider-lmstudio/stream.test.ts
  • harness/tests/provider-lmstudio/wire-messages.test.ts
  • harness/tests/provider-openai/wire-messages.test.ts
  • harness/tests/turn-orchestrator/agent-trigger.test.ts
  • harness/tests/turn-orchestrator/functions.test.ts
  • harness/tests/turn-orchestrator/provider-router.test.ts
📝 Walkthrough

Walkthrough

Adds per-conversation auto-accept with safety policy and UI, ARIA live announcements, new compaction and stop-reason events, token estimation from compaction, and LM Studio/llama.cpp providers with SSE streaming, discovery, load/unload, routing, and docs; directory surfaces function_id in skills.

Changes

Auto-accept approvals, compaction, and local providers

Layer / File(s) Summary
Auto-accept contracts and hook
console/web/src/lib/backend/auto-accept*.ts, console/web/src/hooks/use-auto-accept-approvals*.ts, tests
Defines policy, candidate selection, and a hook to auto-resolve safe function-call approvals; includes wiring-guard and behavior tests.
Chat wiring and accessibility
console/web/src/components/chat/*, console/web/src/hooks/use-live-announcer.ts, console/web/src/components/ui/LiveRegion.tsx
Adds AutoAcceptToggle, propagates autoAccept state, ARIA live region/announcer, and new stream event handling for compaction and stop-reason.
Stream contracts and formatting
console/web/src/lib/backend/types.ts, translate.ts/tests, format-stop-reason*.ts, token estimate
Extends StreamEvent with compaction and stop-reason; adds reason formatter; token counts start at last compaction marker; tests updated.
Local providers (LM Studio, llama.cpp)
harness/src/provider-*/**, models-catalog, routing
Implements providers with SSE streaming, discovery, load/unload, auth, wire adapters, configs, CLI manifests; seeds catalog; updates router; extensive tests.
Compaction emission and UI surfacing
harness/src/context-compaction/*, console/web/src/types/iii-agent-event.ts
Emits compaction_done on sync/async compaction; frontend type and translation produce compaction markers; tests validate emission.
Directory skills metadata and guidance
iii-directory/*
Surfaces optional function_id in list/get responses; improves not-found suggestions and doc callouts; updates parsing and tests.

Sequence Diagram(s)

sequenceDiagram
  participant ChatClient
  participant HarnessRouter
  participant Provider(lmstudio/llamacpp)
  participant ModelAPI

  ChatClient->>HarnessRouter: provider::(lmstudio|llamacpp)::stream (writer_ref, messages, tools)
  HarnessRouter->>Provider(lmstudio/llamacpp): register handler invoked
  Provider(lmstudio/llamacpp)->>ModelAPI: POST /v1/chat/completions (SSE)
  ModelAPI-->>Provider(lmstudio/llamacpp): SSE chunks (text/thinking/tool_calls)
  Provider(lmstudio/llamacpp)->>HarnessRouter: stream events (start, deltas, done/error)
  HarnessRouter-->>ChatClient: JSON events via writer_ref
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • iii-hq/workers#140 — Introduces context-compaction; this PR consumes and surfaces compaction_done as compaction events.
  • iii-hq/workers#156 — Adds approval resolve plumbing used here by auto-accept to call backend.resolveApproval.
  • iii-hq/workers#177 — Touches directory skills get/not-found; complements this PR’s function_id exposure and suggestion logic.

Suggested reviewers

  • sergiofilhowz

Poem

a toggle clicks, approvals pass,
i thump-thump tests, then sip my grass.
compact the trail, the tokens lean,
two local burrows join the scene.
events stream warm like sunlight’s flow—
with twitching ears, i ship and go. 🥕🐇

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch wip/auto-accept-and-lmstudio

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 22, 2026

skill-check — worker

0 verified, 12 skipped (no docs/).

Layer Result
structure
vale
ai

Three for three. Nicely done.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
harness/README.md (1)

26-46: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document provider-llamacpp alongside LM Studio in Workers and Quickstart.

This README update adds LM Studio, but the new llama.cpp worker from this PR is still missing from both the workers table and run commands, leaving setup docs incomplete.

Proposed doc update
 | `src/provider-kimi/` | `provider::kimi::stream`, `provider::kimi::complete` | Kimi (Moonshot) Chat Completions SSE → channel writer. |
 | `src/provider-lmstudio/` | `provider::lmstudio::stream`, `provider::lmstudio::complete` | LM Studio (localhost) Chat Completions SSE → channel writer. |
+| `src/provider-llamacpp/` | `provider::llamacpp::stream`, `provider::llamacpp::complete` | llama.cpp (llama-server) OpenAI-compatible Chat Completions SSE → channel writer. |
 | `src/context-compaction/` | (none — pure side-car on `agent::events`) | Optional out-of-band session-history compactor. |
@@
 node dist/provider-kimi/main.js         --url ws://127.0.0.1:49134 --config ./config.yaml
 node dist/provider-lmstudio/main.js     --url ws://127.0.0.1:49134 --config ./config.yaml
+node dist/provider-llamacpp/main.js     --url ws://127.0.0.1:49134 --config ./config.yaml
 node dist/llm-budget/main.js            --url ws://127.0.0.1:49134
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/README.md` around lines 26 - 46, Add documentation for the new
provider-llamacpp worker: insert a table row alongside the LM Studio entry
referencing src/provider-llamacpp/ and the worker capabilities (e.g.
provider::llamacpp::stream, provider::llamacpp::complete) in the workers table,
and add the corresponding Quickstart run command to the example commands: node
dist/provider-llamacpp/main.js --url ws://127.0.0.1:49134 --config ./config.yaml
(match the same flags pattern used by provider-lmstudio and other providers).
console/web/src/hooks/use-conversations.ts (1)

72-80: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Backfill autoAccept for previously persisted conversations.

loadConversations() can return legacy records without autoAccept; those flow through as undefined even though Conversation.autoAccept is now required. Normalize on load to keep runtime shape stable.

💡 Suggested patch
   const [conversations, setConversations] = useState<Conversation[]>(() => {
-    const loaded = loadConversations()
+    const loaded = loadConversations().map((c) => ({
+      ...c,
+      autoAccept: c.autoAccept ?? false,
+    }))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@console/web/src/hooks/use-conversations.ts` around lines 72 - 80, Loaded
conversations may lack the required Conversation.autoAccept field; modify the
initializer that calls loadConversations() so it normalizes/backfills autoAccept
on each record before returning—e.g., map over loaded results and for any
conversation with autoAccept === undefined set it to the default value used by
emptyConversation (you can derive that by calling emptyConversation(model) where
model is convo.model || loadLastModel() || DEFAULT_MODEL); then return the
mapped list (or the single emptyConversation when loaded is empty) so runtime
shape always includes autoAccept.
harness/src/provider-llamacpp/discover.ts (1)

1-202: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix formatting in this file before merge.

The harness: node lint + test check is currently failing on formatting for this file, so this will block CI until it’s reformatted.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/provider-llamacpp/discover.ts` around lines 1 - 202, This file
fails CI due to formatting; run the project's auto-formatter/linter fix (e.g.,
prettier/eslint with --fix or the repo's "node lint + test" task) on
harness/src/provider-llamacpp/discover.ts and commit the changes so the file
matches repo style. Focus on normalizing whitespace/indentation, trailing
commas/semicolons, and import/exports formatting around functions such as
modelsUrl, fetchWithTimeout, fetchModelIds, toCatalogModel, registerDiscovered,
and discoverAndRegister so the file passes lint/format checks. Ensure no
functional changes are made—only formatting fixes—and re-run the CI lint check
locally before pushing.
🧹 Nitpick comments (7)
console/web/DESIGN.md (2)

1731-1736: 💤 Low value

Add language specifier to fenced code block.

This text diagram should have an explicit language specifier for better tool support.

📝 Proposed fix
-```
+```text
 section eyebrow (label-caps)            <- bg page
   card        bg-panel head + bg body   <- per function/trigger
     pane      bg-paper-2 head + bg body <- per schema (request/response)
       tree    flat type-table           <- per field
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@console/web/DESIGN.md` around lines 1731 - 1736, The fenced code block that
contains the ASCII diagram (the triple-backtick block shown with the lines
starting "section eyebrow (label-caps)...") lacks a language specifier; change
the opening fence from ``` to ```text so tools recognize it as plain text
(update the code fence surrounding the diagram accordingly).

641-645: 💤 Low value

Add language specifier to fenced code block.

This text diagram would benefit from an explicit language specifier for better tool support and clarity. Add text after the opening fence.

📝 Proposed fix
-```
+```text
 bg            page (cream paper)
  └─ panel        header strips, focused cards, code-block chrome
      └─ paper-2     nested separation when needed
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@console/web/DESIGN.md` around lines 641 - 645, Add the language specifier
"text" to the fenced code block that currently begins with ``` in DESIGN.md (the
small ASCII diagram block shown under the bg → panel → paper-2 example) by
changing the opening fence to ```text so tools and renderers recognize it as
plain text; keep the block contents unchanged and only update the opening fence.
harness/src/provider-openai/wire-messages.ts (1)

52-66: ⚖️ Poor tradeoff

Consider O(n) deduplication with a Map to avoid O(n²) complexity.

The current implementation calls findIndex (O(n)) for each function_result message, resulting in O(n²) complexity when processing large conversation histories with many tool calls. For better performance, track tool_call_id → index in a Map during the main loop, then perform a single O(1) lookup and update per message.

⚡ Proposed optimization
 export function toOpenaiMessages(messages: AgentMessage[], system_prompt: string): unknown[] {
   const out: unknown[] = [];
+  const toolIdxMap = new Map<string, number>();
   if (system_prompt.length > 0) {
     out.push({ role: 'system', content: system_prompt });
   }
   for (const m of messages) {
     if (m.role === 'user') {
       const text = m.content
         .filter(
           (c): c is Extract<(typeof m.content)[number], { type: 'text' }> => c.type === 'text',
         )
         .map((c) => c.text)
         .join('\n');
       out.push({ role: 'user', content: text });
     } else if (m.role === 'assistant') {
       const text = m.content
         .filter(
           (c): c is Extract<(typeof m.content)[number], { type: 'text' }> => c.type === 'text',
         )
         .map((c) => c.text)
         .join('\n');
       const tool_calls = m.content
         .filter(
           (c): c is Extract<(typeof m.content)[number], { type: 'function_call' }> =>
             c.type === 'function_call',
         )
         .map((c) => ({
           id: c.id,
           type: 'function',
           function: { name: c.function_id, arguments: JSON.stringify(c.arguments) },
         }));
       const entry: Record<string, unknown> = { role: 'assistant' };
       if (text.length > 0) entry.content = text;
       if (tool_calls.length > 0) entry.tool_calls = tool_calls;
       out.push(entry);
     } else if (m.role === 'function_result') {
       const text = formatFunctionResultContent(m);
       const row: Record<string, unknown> = {
         role: 'tool',
         tool_call_id: m.function_call_id,
         content: text,
       };
       if (m.is_error) row.is_error = true;
-      // Boundary dedup: never ship two tool messages with the same
-      // tool_call_id. Some OpenAI-compatible servers accept duplicates
-      // and silently overwrite; others (Anthropic's compat shim,
-      // strict gateways) reject. Latest-wins: replace any prior tool
-      // message with the same id rather than appending.
-      const existingIdx = out.findIndex(
-        (e) =>
-          (e as { role?: string }).role === 'tool' &&
-          (e as { tool_call_id?: string }).tool_call_id === m.function_call_id,
-      );
-      if (existingIdx >= 0) {
-        out[existingIdx] = row;
-      } else {
-        out.push(row);
-      }
+      // Boundary dedup: never ship two tool messages with the same
+      // tool_call_id. Some OpenAI-compatible servers accept duplicates
+      // and silently overwrite; others (Anthropic's compat shim,
+      // strict gateways) reject. Latest-wins: replace any prior tool
+      // message with the same id rather than appending.
+      const existingIdx = toolIdxMap.get(m.function_call_id);
+      if (existingIdx !== undefined) {
+        out[existingIdx] = row;
+      } else {
+        toolIdxMap.set(m.function_call_id, out.length);
+        out.push(row);
+      }
     }
     // custom messages are skipped
   }
   return out;
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/provider-openai/wire-messages.ts` around lines 52 - 66, The dedup
block using out.findIndex leads to O(n²); replace it by maintaining a
Map<string, number> (e.g. toolIndexMap) keyed by m.function_call_id that you
populate during the main loop, then use toolIndexMap.get(m.function_call_id)
instead of computing existingIdx via findIndex; if an index exists assign
out[index] = row (no map change), otherwise push row and set
toolIndexMap.set(m.function_call_id, out.length - 1); update the map whenever
you mutate out so subsequent iterations are O(1).
console/web/src/components/ui/LiveRegion.tsx (1)

21-31: 💤 Low value

Consider modernizing the visually-hidden pattern.

The clip: 'rect(0, 0, 0, 0)' property is deprecated in favor of clip-path: 'inset(50%)'. While clip still works in most browsers, the modern approach is more future-proof.

♻️ Proposed modernization
 const SR_ONLY_STYLE: React.CSSProperties = {
   position: 'absolute',
   width: 1,
   height: 1,
   padding: 0,
   margin: -1,
   overflow: 'hidden',
-  clip: 'rect(0, 0, 0, 0)',
+  clipPath: 'inset(50%)',
   whiteSpace: 'nowrap',
   border: 0,
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@console/web/src/components/ui/LiveRegion.tsx` around lines 21 - 31,
SR_ONLY_STYLE uses the deprecated CSS clip property; update the visually-hidden
style in SR_ONLY_STYLE to use modern clip-path by adding clipPath: 'inset(50%)'
while keeping clip: 'rect(0, 0, 0, 0)' as a fallback for older browsers, and
maintain the other properties (position, width, height, padding, margin,
overflow, whiteSpace, border) so the LiveRegion behavior remains unchanged;
modify the SR_ONLY_STYLE constant accordingly (referencing the SR_ONLY_STYLE
symbol).
harness/src/provider-kimi/wire-messages.ts (1)

65-74: ⚡ Quick win

Use O(1) tool-result dedup indexing in this hot path.

Per-message findIndex turns dedup into O(n²) as history grows. Track tool row indices in a map to keep replacement O(1).

♻️ Proposed refactor
 export function toOpenaiMessages(messages: AgentMessage[], system_prompt: string): unknown[] {
   const out: unknown[] = [];
+  const toolRowById = new Map<string, number>();
   if (system_prompt.length > 0) {
     out.push({ role: 'system', content: system_prompt });
   }
@@
-      const existingIdx = out.findIndex(
-        (e) =>
-          (e as { role?: string }).role === 'tool' &&
-          (e as { tool_call_id?: string }).tool_call_id === m.function_call_id,
-      );
+      const existingIdx = toolRowById.get(m.function_call_id) ?? -1;
       if (existingIdx >= 0) {
         out[existingIdx] = row;
       } else {
         out.push(row);
+        toolRowById.set(m.function_call_id, out.length - 1);
       }
harness/src/provider-lmstudio/config.ts (1)

21-93: ⚡ Quick win

Extract URL validation/loopback resolution into a shared helper.

This block is effectively duplicated in the llama.cpp provider config. Centralizing it will reduce security/behavior drift and keep warning semantics aligned across local providers.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/provider-lmstudio/config.ts` around lines 21 - 93, The URL
validation and loopback-check logic (validatedUrl, isLoopbackHost, and
resolveApiUrl) is duplicated in the llama.cpp provider; extract these into a
shared helper module (e.g., provider-local-utils) that exports validatedUrl,
isLoopbackHost, and resolveApiUrl (or a single resolveApiUrlWithLogging) so both
harness/src/provider-lmstudio/config.ts and the llama.cpp provider import and
call the same functions; ensure the shared helper preserves the same logger
usage and warning codes ('lmstudio_base_url_invalid',
'lmstudio_api_url_invalid', 'lmstudio_api_url_remote') or make them generic and
map them when importing, and keep DEFAULT_API_URL behavior intact so callers
(resolveApiUrl) still return the original fallback string.
iii-directory/src/fs_source.rs (1)

772-779: ⚡ Quick win

Assert function_id in frontmatter tests to lock the new contract.

The new function_id field is parsed but not asserted in the updated tests, so regressions could slip through unnoticed.

✅ Proposed test additions
     fn read_with_frontmatter_defaults_when_missing() {
@@
         let (fm, body) = read_skill_with_frontmatter(&path).unwrap();
         assert!(fm.title.is_none());
         assert!(fm.kind.is_none());
+        assert!(fm.function_id.is_none());
         assert_eq!(body, "# Plain\n\nbody\n");
     }
@@
     fn read_with_frontmatter_tolerates_unrelated_yaml_keys() {
@@
         let (fm, _) = read_skill_with_frontmatter(&path).unwrap();
         assert_eq!(fm.title.as_deref(), Some("Hi"));
         assert_eq!(fm.kind.as_deref(), Some("index"));
+        assert_eq!(fm.function_id.as_deref(), Some("c::d"));
     }

Also applies to: 783-794

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@iii-directory/src/fs_source.rs` around lines 772 - 779, The tests for
read_skill_with_frontmatter are missing assertions for the new frontmatter field
function_id; update the test function
read_with_frontmatter_defaults_when_missing() to assert fm.function_id.is_none()
for the plain.md case, and similarly update the other test (the one between
lines 783-794 that reads frontmatter-present content) to assert fm.function_id
== Some("expected-id") (or the exact string used in that test) so the new
contract is locked; reference the fm variable produced by
read_skill_with_frontmatter and add the appropriate .is_none() or equality
assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@console/web/src/hooks/use-auto-accept-approvals.ts`:
- Around line 61-77: The onDenied callback is being called repeatedly for the
same pending denied approvals; add deduplication using a ref-set (reuse
autoResolvedRef or add a new ref like autoDeniedRef) to track seen denied keys
(e.g. `${functionId}:${functionCallId}`) inside the useEffect that processes
selection.deniedByPolicy so you only call onDenied for entries not already in
the set, then add them to the set after invoking onDenied; keep the existing
reset of the ref on conversationId change so deduplication is per-conversation
and still allow nextApprovalsToAutoResolve/selection.candidates to work as
before.

In `@console/web/src/lib/backend/translate.ts`:
- Around line 55-58: The code emits the terminal "assistant-end" twice via
translateAssistantMessageEnd() in the 'message_end' case and again in the
'agent_end' handling; update the logic so only one code path emits the terminal
signal: pick a single emitter (preferably translateAssistantMessageEnd for
message-based termination) and remove or guard the duplicate emission in the
agent_end branch (or vice versa). Concretely, modify the agent_end handling to
check whether the event stream already had a message_end/assistant emission (or
whether the last message role was 'assistant') before calling the terminal emit,
or move the terminal emission exclusively into translateAssistantMessageEnd and
ensure agent_end no longer emits assistant-end; adjust corresponding checks in
the functions handling 'message_end', 'agent_end' and any code paths referenced
at lines around translateAssistantMessageEnd and agent_end to prevent duplicate
terminal signals.

In `@harness/docs/architecture.md`:
- Around line 32-33: Update the architecture doc to keep the provider listing
and counts in sync: add a new provider entry for llama.cpp (mirror the format of
provider-lmstudio row, e.g., add a "provider-llamacpp |
[src/provider-llamacpp/](harness/src/provider-llamacpp/) | llama.cpp (localhost)
Chat Completions SSE → channel writer. |
[workers/provider-llamacpp.md](harness/docs/workers/provider-llamacpp.md) |"),
update the opening "worker-count" statement and any catalogue/diagram counts to
include both local providers, and update the other affected sections referenced
(lines 53-54, 70-71, 209-220) so the documented worker/service counts and
diagrams match the new provider list; keep formatting consistent with the
existing provider-lmstudio and context-compaction rows and link to the
corresponding workers docs.

In `@harness/docs/workers/provider-llamacpp.md`:
- Around line 34-38: The fenced code block containing the environment variable
LLAMACPP_BASE_URL is missing a language tag which triggers markdownlint MD040;
update the fenced block around the lines with LLAMACPP_BASE_URL to include a
language identifier (e.g., bash) so the opening triple-backticks become ```bash
and the closing remains ```, ensuring the block renders and satisfies linting.

In `@harness/src/context-compaction/handler-sync.ts`:
- Around line 132-141: The emitCompactionDone call is awaited and can throw,
which contradicts the “best-effort” comment; change it so publish failures do
not abort the sync compaction path by isolating the call in its own try/catch
(or fire-and-forget promise) and logging the error instead of rethrowing.
Specifically, locate the emitCompactionDone(...) invocation in handler-sync.ts
(inside the sync compaction flow) and wrap it so any exception is caught and
sent to the logger (or swallowed) while allowing the surrounding function to
continue and return normally.

In `@harness/src/provider-llamacpp/auth.ts`:
- Around line 1-6: The import block at the top (symbols: logger, Credential,
ISdk, WorkerConfig, ChatCompletionsConfig, configFromCredential) is
misformatted; run the project's formatter (e.g. the repo's format script or
Prettier/ESLint autofix) against this file to normalize imports and whitespace
so the linter warning is resolved.

In `@harness/src/provider-llamacpp/refresh-fn.ts`:
- Around line 11-48: This file fails the formatter check; run the repository
formatter and commit the normalized output. Locate the register function and
surrounding code (symbols: FUNCTION_ID, RefreshResult, register) in
harness/src/provider-llamacpp/refresh-fn.ts, run the project's formatting
command (e.g., npm run format or the repo's configured formatter like
prettier/clang-format), ensure whitespace/indentation and import ordering are
fixed, then stage and commit the formatted file so CI passes.

In `@harness/src/provider-llamacpp/register.ts`:
- Around line 36-45: The catch in runStartupDiscovery currently always logs
"header build failed" even if discoverAndRegister throws; update the function to
either split into two try/catch blocks—one around buildAuthHeaders(iii, chatUrl)
that logs header-specific errors and a second around discoverAndRegister(iii,
chatUrl, headers) that logs discovery-specific errors—or change the single catch
message to a generic "llamacpp startup discovery failed" and include the error
and context; reference runStartupDiscovery, buildAuthHeaders and
discoverAndRegister when making the change so the log accurately reflects which
step failed.

In `@harness/src/provider-llamacpp/stream.ts`:
- Around line 139-142: The regex in the safeText construction uses literal
control characters and trips lint; update the replace to use explicit Unicode
escapes instead (e.g. replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g,
'') ) so you can remove the eslint-disable comment while preserving the same
character ranges and the .slice(0, 256) behavior for the safeText variable.

In `@harness/src/provider-lmstudio/auth.ts`:
- Around line 52-54: The current validator returns any object with a type
property which can lead to runtime crashes when the code later assumes a
specific string API key; update the validation around the returned cred so it
ensures cred.type is a string and that the expected secret field used later
(e.g., cred.key or cred.api_key — whichever the code expects where the comment
pointed to) exists and is a string before returning; in practice, inside the
same function that currently checks "if (!cred || typeof cred !== 'object' ||
!('type' in cred)) return null;" replace that broad check with explicit checks:
typeof cred.type === 'string' and typeof cred.<expectedSecretField> === 'string'
(or check multiple candidate fields and pick the valid string), and return null
if these checks fail so the downstream code that references the key string
cannot crash.

In `@harness/src/provider-lmstudio/load-fn.ts`:
- Around line 37-43: The current validation for payload.model only checks length
and allows whitespace-only strings; update the guard in the load-fn.ts where
payload is validated (the if block that currently checks payload.model, used
before calling buildAuthHeaders and loadModel) to reject strings that are empty
after trimming (e.g., ensure typeof payload.model === 'string' and
payload.model.trim().length > 0) so whitespace-only model ids are treated as
invalid and return the same { ok: false, error: ... } response.

In `@harness/src/provider-lmstudio/load.ts`:
- Around line 103-114: The error includes the full resp.text() body which can be
very large; change the code that reads the response body (the const text = await
resp.text().catch(() => '') in this file and the similar usage around the
154-157 block) to truncate the captured body to a safe maximum (e.g. 1KB) before
embedding in the thrown Error—read the text as before but assign a truncated
version (add "…(truncated)" when truncated) and use that truncated text in the
Error message that references model and resp.status so logs/events and memory
stay bounded.

In `@harness/src/provider-lmstudio/register.ts`:
- Around line 35-43: The current try/catch wraps both buildAuthHeaders and
discoverAndRegister so any error from discoverAndRegister is mislogged as a
header build failure; change the structure so only buildAuthHeaders is wrapped
by the try/catch (call discoverAndRegister after the try), or give
discoverAndRegister its own specific error handling/logging. In practice, move
await discoverAndRegister(iii, chatUrl, headers) out of the try that catches
errors for buildAuthHeaders (or add a separate catch for discoverAndRegister)
and ensure logger.warn('lmstudio startup discovery: header build failed', { err:
String(err) }) only runs for errors thrown by buildAuthHeaders.

In `@harness/src/provider-lmstudio/stream.ts`:
- Around line 230-236: The non-SSE error path calls syntheticErrorEvent with
cfg.model but other error paths use the resolved effectiveModel; update the
syntheticErrorEvent invocation (the 200 non-SSE body branch) to pass
effectiveModel instead of cfg.model so the reported model ID matches the
resolved model used elsewhere (refer to syntheticErrorEvent, cfg.model,
effectiveModel).
- Around line 148-154: The error event uses the placeholder cfg.model instead of
the resolved model id, causing misleading reports; update the
syntheticErrorEvent call to pass effectiveModel (the resolved model id used
elsewhere) instead of cfg.model so errors report the actual model, ensuring
effectiveModel is in scope for the call that also passes safeText, resp.status,
and classifyLmstudioError.

In `@harness/src/turn-orchestrator/states/functions.ts`:
- Around line 396-409: The dedup scan in the loop over prior messages uses the
wrong break condition: when encountering an assistant boundary (m.role ===
'assistant') you should stop scanning further only if any incoming ID has NOT
been seen yet (i.e., there exists id in incomingIds that is not in
existingResultIds) — the current logic continues scanning in that common path
and can O(history) and mis-dedupe; update the branch in the assistant-boundary
block (the code referencing m.role, incomingIds, existingResultIds, and unseen)
so that you break out of the outer loop when an unseen id is found (invert the
condition) instead of continuing, ensuring earlier turns are not incorrectly
considered duplicates.

In `@harness/tests/context-compaction/compaction-done-emit.test.ts`:
- Around line 95-100: The test stub computes increments against the original
oldValue for each op, so multiple sequential `increment` ops don't compose;
update the loop in compaction-done-emit.test.ts so increments use the latest
computed value rather than oldValue: initialize newValue to oldValue (or set a
current value variable) before processing ops, and when handling an `increment`
op compute prev from the current/newValue (falling back to 0 only if neither
oldValue nor newValue is a number) so successive increments accumulate
correctly; refer to the loop using vars ops, oldValue, and newValue to locate
and change the logic.

In `@harness/tests/provider-lmstudio/wire-messages.test.ts`:
- Around line 239-265: Test title/comments currently claim cross-batch repeats
stay separate but the assertions expect a single merged tool; update the test
text to reflect the actual invariant enforced by toOpenaiMessages/mkResult:
dedup is scoped to the pending batch so within-this-batch repeats (even with an
assistant turn between) collapse to one tool entry while cross-batch repeats
remain separate—adjust the it(...) title and inline comments to state
"within-batch repeats dedupe; cross-batch repeats stay separate" so they match
the existing expect(...) checks.

---

Outside diff comments:
In `@console/web/src/hooks/use-conversations.ts`:
- Around line 72-80: Loaded conversations may lack the required
Conversation.autoAccept field; modify the initializer that calls
loadConversations() so it normalizes/backfills autoAccept on each record before
returning—e.g., map over loaded results and for any conversation with autoAccept
=== undefined set it to the default value used by emptyConversation (you can
derive that by calling emptyConversation(model) where model is convo.model ||
loadLastModel() || DEFAULT_MODEL); then return the mapped list (or the single
emptyConversation when loaded is empty) so runtime shape always includes
autoAccept.

In `@harness/README.md`:
- Around line 26-46: Add documentation for the new provider-llamacpp worker:
insert a table row alongside the LM Studio entry referencing
src/provider-llamacpp/ and the worker capabilities (e.g.
provider::llamacpp::stream, provider::llamacpp::complete) in the workers table,
and add the corresponding Quickstart run command to the example commands: node
dist/provider-llamacpp/main.js --url ws://127.0.0.1:49134 --config ./config.yaml
(match the same flags pattern used by provider-lmstudio and other providers).

In `@harness/src/provider-llamacpp/discover.ts`:
- Around line 1-202: This file fails CI due to formatting; run the project's
auto-formatter/linter fix (e.g., prettier/eslint with --fix or the repo's "node
lint + test" task) on harness/src/provider-llamacpp/discover.ts and commit the
changes so the file matches repo style. Focus on normalizing
whitespace/indentation, trailing commas/semicolons, and import/exports
formatting around functions such as modelsUrl, fetchWithTimeout, fetchModelIds,
toCatalogModel, registerDiscovered, and discoverAndRegister so the file passes
lint/format checks. Ensure no functional changes are made—only formatting
fixes—and re-run the CI lint check locally before pushing.

---

Nitpick comments:
In `@console/web/DESIGN.md`:
- Around line 1731-1736: The fenced code block that contains the ASCII diagram
(the triple-backtick block shown with the lines starting "section eyebrow
(label-caps)...") lacks a language specifier; change the opening fence from ```
to ```text so tools recognize it as plain text (update the code fence
surrounding the diagram accordingly).
- Around line 641-645: Add the language specifier "text" to the fenced code
block that currently begins with ``` in DESIGN.md (the small ASCII diagram block
shown under the bg → panel → paper-2 example) by changing the opening fence to
```text so tools and renderers recognize it as plain text; keep the block
contents unchanged and only update the opening fence.

In `@console/web/src/components/ui/LiveRegion.tsx`:
- Around line 21-31: SR_ONLY_STYLE uses the deprecated CSS clip property; update
the visually-hidden style in SR_ONLY_STYLE to use modern clip-path by adding
clipPath: 'inset(50%)' while keeping clip: 'rect(0, 0, 0, 0)' as a fallback for
older browsers, and maintain the other properties (position, width, height,
padding, margin, overflow, whiteSpace, border) so the LiveRegion behavior
remains unchanged; modify the SR_ONLY_STYLE constant accordingly (referencing
the SR_ONLY_STYLE symbol).

In `@harness/src/provider-lmstudio/config.ts`:
- Around line 21-93: The URL validation and loopback-check logic (validatedUrl,
isLoopbackHost, and resolveApiUrl) is duplicated in the llama.cpp provider;
extract these into a shared helper module (e.g., provider-local-utils) that
exports validatedUrl, isLoopbackHost, and resolveApiUrl (or a single
resolveApiUrlWithLogging) so both harness/src/provider-lmstudio/config.ts and
the llama.cpp provider import and call the same functions; ensure the shared
helper preserves the same logger usage and warning codes
('lmstudio_base_url_invalid', 'lmstudio_api_url_invalid',
'lmstudio_api_url_remote') or make them generic and map them when importing, and
keep DEFAULT_API_URL behavior intact so callers (resolveApiUrl) still return the
original fallback string.

In `@harness/src/provider-openai/wire-messages.ts`:
- Around line 52-66: The dedup block using out.findIndex leads to O(n²); replace
it by maintaining a Map<string, number> (e.g. toolIndexMap) keyed by
m.function_call_id that you populate during the main loop, then use
toolIndexMap.get(m.function_call_id) instead of computing existingIdx via
findIndex; if an index exists assign out[index] = row (no map change), otherwise
push row and set toolIndexMap.set(m.function_call_id, out.length - 1); update
the map whenever you mutate out so subsequent iterations are O(1).

In `@iii-directory/src/fs_source.rs`:
- Around line 772-779: The tests for read_skill_with_frontmatter are missing
assertions for the new frontmatter field function_id; update the test function
read_with_frontmatter_defaults_when_missing() to assert fm.function_id.is_none()
for the plain.md case, and similarly update the other test (the one between
lines 783-794 that reads frontmatter-present content) to assert fm.function_id
== Some("expected-id") (or the exact string used in that test) so the new
contract is locked; reference the fm variable produced by
read_skill_with_frontmatter and add the appropriate .is_none() or equality
assertions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 152cc76f-230f-47eb-825d-739fc414d578

📥 Commits

Reviewing files that changed from the base of the PR and between c9d4dd9 and d615784.

📒 Files selected for processing (125)
  • console/web/DESIGN.md
  • console/web/src/components/chat/AutoAcceptToggle.tsx
  • console/web/src/components/chat/ChatPanel.tsx
  • console/web/src/components/chat/ChatView.tsx
  • console/web/src/components/chat/Composer.tsx
  • console/web/src/components/chat/MessageList.tsx
  • console/web/src/components/ui/LiveRegion.tsx
  • console/web/src/hooks/use-auto-accept-approvals.test.ts
  • console/web/src/hooks/use-auto-accept-approvals.ts
  • console/web/src/hooks/use-conversations.ts
  • console/web/src/hooks/use-live-announcer.ts
  • console/web/src/lib/backend/auto-accept-policy.test.ts
  • console/web/src/lib/backend/auto-accept-policy.ts
  • console/web/src/lib/backend/auto-accept.test.ts
  • console/web/src/lib/backend/auto-accept.ts
  • console/web/src/lib/backend/translate.test.ts
  • console/web/src/lib/backend/translate.ts
  • console/web/src/lib/backend/types.ts
  • console/web/src/lib/format-stop-reason.test.ts
  • console/web/src/lib/format-stop-reason.ts
  • console/web/src/lib/storage.ts
  • console/web/src/lib/token-estimate.test.ts
  • console/web/src/lib/token-estimate.ts
  • console/web/src/pages/Examples/sections/composer-variants.tsx
  • console/web/src/pages/Playground/EventLog.tsx
  • console/web/src/pages/Playground/index.tsx
  • console/web/src/types/chat.ts
  • console/web/src/types/iii-agent-event.ts
  • harness/README.md
  • harness/docs/architecture.md
  • harness/docs/workers/provider-llamacpp.md
  • harness/docs/workers/provider-lmstudio.md
  • harness/package.json
  • harness/src/auth-credentials/types.ts
  • harness/src/context-compaction/emit.ts
  • harness/src/context-compaction/handler-async.ts
  • harness/src/context-compaction/handler-sync.ts
  • harness/src/index.ts
  • harness/src/models-catalog/catalog.ts
  • harness/src/models-catalog/models.json
  • harness/src/provider-anthropic/wire-messages.ts
  • harness/src/provider-kimi/sse.ts
  • harness/src/provider-kimi/wire-messages.ts
  • harness/src/provider-llamacpp/auth.ts
  • harness/src/provider-llamacpp/complete.ts
  • harness/src/provider-llamacpp/config.ts
  • harness/src/provider-llamacpp/discover.ts
  • harness/src/provider-llamacpp/iii.worker.yaml
  • harness/src/provider-llamacpp/main.ts
  • harness/src/provider-llamacpp/refresh-fn.ts
  • harness/src/provider-llamacpp/register.ts
  • harness/src/provider-llamacpp/sse.ts
  • harness/src/provider-llamacpp/stream-fn.ts
  • harness/src/provider-llamacpp/stream.ts
  • harness/src/provider-llamacpp/types.ts
  • harness/src/provider-llamacpp/wire-messages.ts
  • harness/src/provider-llamacpp/wire-tools.ts
  • harness/src/provider-lmstudio/auth.ts
  • harness/src/provider-lmstudio/complete.ts
  • harness/src/provider-lmstudio/config.ts
  • harness/src/provider-lmstudio/discover.ts
  • harness/src/provider-lmstudio/iii.worker.yaml
  • harness/src/provider-lmstudio/load-fn.ts
  • harness/src/provider-lmstudio/load.ts
  • harness/src/provider-lmstudio/main.ts
  • harness/src/provider-lmstudio/refresh-fn.ts
  • harness/src/provider-lmstudio/register.ts
  • harness/src/provider-lmstudio/sse.ts
  • harness/src/provider-lmstudio/stream-fn.ts
  • harness/src/provider-lmstudio/stream.ts
  • harness/src/provider-lmstudio/types.ts
  • harness/src/provider-lmstudio/unload-fn.ts
  • harness/src/provider-lmstudio/wire-messages.ts
  • harness/src/provider-lmstudio/wire-tools.ts
  • harness/src/provider-openai/wire-messages.ts
  • harness/src/turn-orchestrator/agent-trigger.ts
  • harness/src/turn-orchestrator/provider-router.ts
  • harness/src/turn-orchestrator/states/assistant.ts
  • harness/src/turn-orchestrator/states/functions.ts
  • harness/src/turn-orchestrator/system-prompt.ts
  • harness/src/types/agent-event.ts
  • harness/tests/auth-credentials/env-map-llamacpp.test.ts
  • harness/tests/auth-credentials/env-map-lmstudio.test.ts
  • harness/tests/context-compaction/compaction-done-emit.test.ts
  • harness/tests/models-catalog/seed-llamacpp.test.ts
  • harness/tests/models-catalog/seed-lmstudio.test.ts
  • harness/tests/provider-anthropic/wire-messages.test.ts
  • harness/tests/provider-kimi/sse.test.ts
  • harness/tests/provider-kimi/wire-messages.test.ts
  • harness/tests/provider-llamacpp/auth.test.ts
  • harness/tests/provider-llamacpp/config.test.ts
  • harness/tests/provider-llamacpp/discover.test.ts
  • harness/tests/provider-llamacpp/sse.test.ts
  • harness/tests/provider-llamacpp/stream.test.ts
  • harness/tests/provider-llamacpp/wire-messages.test.ts
  • harness/tests/provider-llamacpp/wire-tools.test.ts
  • harness/tests/provider-lmstudio/auth.test.ts
  • harness/tests/provider-lmstudio/config.test.ts
  • harness/tests/provider-lmstudio/discover.test.ts
  • harness/tests/provider-lmstudio/load.test.ts
  • harness/tests/provider-lmstudio/sse.test.ts
  • harness/tests/provider-lmstudio/stream.test.ts
  • harness/tests/provider-lmstudio/wire-messages.test.ts
  • harness/tests/provider-lmstudio/wire-tools.test.ts
  • harness/tests/provider-openai/wire-messages.test.ts
  • harness/tests/turn-orchestrator/agent-trigger.test.ts
  • harness/tests/turn-orchestrator/functions.test.ts
  • harness/tests/turn-orchestrator/provider-router.test.ts
  • harness/tests/turn-orchestrator/system-prompt.test.ts
  • iii-directory/skills/directory/engine/functions/info.md
  • iii-directory/skills/directory/engine/functions/list.md
  • iii-directory/skills/directory/engine/registered-triggers/info.md
  • iii-directory/skills/directory/engine/registered-triggers/list.md
  • iii-directory/skills/directory/engine/triggers/info.md
  • iii-directory/skills/directory/engine/triggers/list.md
  • iii-directory/skills/directory/engine/workers/info.md
  • iii-directory/skills/directory/engine/workers/list.md
  • iii-directory/skills/directory/registry/workers/info.md
  • iii-directory/skills/directory/registry/workers/list.md
  • iii-directory/skills/directory/skills/download.md
  • iii-directory/skills/directory/skills/get.md
  • iii-directory/skills/directory/skills/index.md
  • iii-directory/skills/directory/skills/list.md
  • iii-directory/src/fs_source.rs
  • iii-directory/src/functions/skills.rs

Comment on lines +61 to +77
const autoResolvedRef = useRef<Set<string>>(new Set())

useEffect(() => {
autoResolvedRef.current = new Set()
}, [conversationId])

useEffect(() => {
if (!enabled) return
if (!resolveApproval) return
const selection = selectAutoAcceptCandidates(messages, policy)
if (selection === null) return // hot-path early exit
if (onDenied) {
for (const d of selection.deniedByPolicy) {
onDenied(d.functionId, d.functionCallId)
}
}
const todo = nextApprovalsToAutoResolve(selection.candidates, autoResolvedRef.current)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Deduplicate policy-denied callbacks to avoid repeated side effects.

onDenied currently fires every effect pass for the same pending denied approval. This can repeatedly announce/log the same denial while the card remains pending.

♻️ Proposed fix
 export function useAutoAcceptApprovals({
@@
 }: UseAutoAcceptApprovalsArgs): void {
   const autoResolvedRef = useRef<Set<string>>(new Set())
+  const deniedNotifiedRef = useRef<Set<string>>(new Set())
 
   useEffect(() => {
     autoResolvedRef.current = new Set()
+    deniedNotifiedRef.current = new Set()
   }, [conversationId])
@@
     if (selection === null) return // hot-path early exit
     if (onDenied) {
       for (const d of selection.deniedByPolicy) {
+        if (deniedNotifiedRef.current.has(d.functionCallId)) continue
+        deniedNotifiedRef.current.add(d.functionCallId)
         onDenied(d.functionId, d.functionCallId)
       }
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@console/web/src/hooks/use-auto-accept-approvals.ts` around lines 61 - 77, The
onDenied callback is being called repeatedly for the same pending denied
approvals; add deduplication using a ref-set (reuse autoResolvedRef or add a new
ref like autoDeniedRef) to track seen denied keys (e.g.
`${functionId}:${functionCallId}`) inside the useEffect that processes
selection.deniedByPolicy so you only call onDenied for entries not already in
the set, then add them to the set after invoking onDenied; keep the existing
reset of the ref on conversationId change so deduplication is per-conversation
and still allow nextApprovalsToAutoResolve/selection.candidates to work as
before.

Comment on lines 55 to 58
case 'message_end':
if (event.message.role === 'assistant') {
return [{ kind: 'assistant-end' }]
}
return []
if (event.message.role !== 'assistant') return []
return translateAssistantMessageEnd(event.message)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid duplicate terminal assistant-end emission paths.

Line 56/57 now emits assistant-end via translateAssistantMessageEnd, while Line 91/92 still emits assistant-end on agent_end. When both events are present, the UI receives duplicate terminal signals.

Suggested minimal fix
 function translateAssistantMessageEnd(message: AssistantMessage): StreamEvent[] {
-  const out: StreamEvent[] = [{ kind: 'assistant-end' }]
+  const out: StreamEvent[] = []
   const stop = message.stop_reason as
     | 'end'
     | 'length'
     | 'error'
     | 'aborted'

Also applies to: 91-93, 156-158

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@console/web/src/lib/backend/translate.ts` around lines 55 - 58, The code
emits the terminal "assistant-end" twice via translateAssistantMessageEnd() in
the 'message_end' case and again in the 'agent_end' handling; update the logic
so only one code path emits the terminal signal: pick a single emitter
(preferably translateAssistantMessageEnd for message-based termination) and
remove or guard the duplicate emission in the agent_end branch (or vice versa).
Concretely, modify the agent_end handling to check whether the event stream
already had a message_end/assistant emission (or whether the last message role
was 'assistant') before calling the terminal emit, or move the terminal emission
exclusively into translateAssistantMessageEnd and ensure agent_end no longer
emits assistant-end; adjust corresponding checks in the functions handling
'message_end', 'agent_end' and any code paths referenced at lines around
translateAssistantMessageEnd and agent_end to prevent duplicate terminal
signals.

Comment on lines +32 to 33
| provider-lmstudio | [src/provider-lmstudio/](harness/src/provider-lmstudio/) | LM Studio (localhost) Chat Completions SSE → channel writer. | [workers/provider-lmstudio.md](harness/docs/workers/provider-lmstudio.md) |
| context-compaction | [src/context-compaction/](harness/src/context-compaction/) | Optional `agent::events` side-car that compacts session history when running token count crosses a threshold. | [workers/context-compaction.md](harness/docs/workers/context-compaction.md) |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep architecture docs consistent with both new local providers.

These sections now include LM Studio, but llama.cpp is still missing, and the opening worker-count statement is stale. Please update the catalogue/diagrams/count together so this doc remains source-of-truth.

Also applies to: 53-54, 70-71, 209-220

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/docs/architecture.md` around lines 32 - 33, Update the architecture
doc to keep the provider listing and counts in sync: add a new provider entry
for llama.cpp (mirror the format of provider-lmstudio row, e.g., add a
"provider-llamacpp | [src/provider-llamacpp/](harness/src/provider-llamacpp/) |
llama.cpp (localhost) Chat Completions SSE → channel writer. |
[workers/provider-llamacpp.md](harness/docs/workers/provider-llamacpp.md) |"),
update the opening "worker-count" statement and any catalogue/diagram counts to
include both local providers, and update the other affected sections referenced
(lines 53-54, 70-71, 209-220) so the documented worker/service counts and
diagrams match the new provider list; keep formatting consistent with the
existing provider-lmstudio and context-compaction rows and link to the
corresponding workers docs.

Comment on lines +34 to +38
```
LLAMACPP_BASE_URL=http://localhost:8080
# or:
LLAMACPP_BASE_URL=http://localhost:8080/v1/chat/completions
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language tag to the fenced env block.

This block is missing a fence language, which triggers markdownlint (MD040).

Suggested fix
-```
+```bash
 LLAMACPP_BASE_URL=http://localhost:8080
 # or:
 LLAMACPP_BASE_URL=http://localhost:8080/v1/chat/completions
</details>

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>

[warning] 34-34: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @harness/docs/workers/provider-llamacpp.md around lines 34 - 38, The fenced
code block containing the environment variable LLAMACPP_BASE_URL is missing a
language tag which triggers markdownlint MD040; update the fenced block around
the lines with LLAMACPP_BASE_URL to include a language identifier (e.g., bash)
so the opening triple-backticks become bash and the closing remains ,
ensuring the block renders and satisfies linting.


</details>

<!-- fingerprinting:phantom:poseidon:hawk -->

<!-- This is an auto-generated comment by CodeRabbit -->

Comment on lines +132 to +141
// Tell the UI we just compacted so it can render a marker and
// re-estimate context usage. Best-effort: a publish failure must
// not derail the in-flight turn — the orchestrator is waiting on
// our return.
await emitCompactionDone(iii, input.session_id, 'sync', {
summary_text: result.summary_text,
tokens_before: result.tokens_before,
compaction_entry_id: result.compaction_entry_id,
tail_start_id: result.tail_start_id,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Best-effort compaction event can currently fail the sync compaction path.

Lines 136-141 await emitCompactionDone without isolation. If publish fails, the catch block rethrows and aborts the operation, which contradicts the “must not derail” requirement in the comment.

Proposed fix
-      await emitCompactionDone(iii, input.session_id, 'sync', {
-        summary_text: result.summary_text,
-        tokens_before: result.tokens_before,
-        compaction_entry_id: result.compaction_entry_id,
-        tail_start_id: result.tail_start_id,
-      });
+      try {
+        await emitCompactionDone(iii, input.session_id, 'sync', {
+          summary_text: result.summary_text,
+          tokens_before: result.tokens_before,
+          compaction_entry_id: result.compaction_entry_id,
+          tail_start_id: result.tail_start_id,
+        });
+      } catch (err) {
+        logger.warn('handler-sync: failed to emit compaction_done (non-fatal)', {
+          session_id: input.session_id,
+          err: String(err),
+        });
+      }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Tell the UI we just compacted so it can render a marker and
// re-estimate context usage. Best-effort: a publish failure must
// not derail the in-flight turn — the orchestrator is waiting on
// our return.
await emitCompactionDone(iii, input.session_id, 'sync', {
summary_text: result.summary_text,
tokens_before: result.tokens_before,
compaction_entry_id: result.compaction_entry_id,
tail_start_id: result.tail_start_id,
});
// Tell the UI we just compacted so it can render a marker and
// re-estimate context usage. Best-effort: a publish failure must
// not derail the in-flight turn — the orchestrator is waiting on
// our return.
try {
await emitCompactionDone(iii, input.session_id, 'sync', {
summary_text: result.summary_text,
tokens_before: result.tokens_before,
compaction_entry_id: result.compaction_entry_id,
tail_start_id: result.tail_start_id,
});
} catch (err) {
logger.warn('handler-sync: failed to emit compaction_done (non-fatal)', {
session_id: input.session_id,
err: String(err),
});
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/context-compaction/handler-sync.ts` around lines 132 - 141, The
emitCompactionDone call is awaited and can throw, which contradicts the
“best-effort” comment; change it so publish failures do not abort the sync
compaction path by isolating the call in its own try/catch (or fire-and-forget
promise) and logging the error instead of rethrowing. Specifically, locate the
emitCompactionDone(...) invocation in handler-sync.ts (inside the sync
compaction flow) and wrap it so any exception is caught and sent to the logger
(or swallowed) while allowing the surrounding function to continue and return
normally.

Comment thread harness/src/provider-lmstudio/stream.ts
Comment on lines +230 to +236
yield syntheticErrorEvent(
'lmstudio returned a 200 response with a non-SSE body (no parseable chunks). The endpoint URL is likely wrong, or LM Studio served its dashboard page in place of an SSE stream.',
cfg.model,
cfg.provider_name,
);
return;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Same model id inconsistency.

Line 232 uses cfg.model instead of effectiveModel for the non-SSE body error. For consistency with other error paths in this function, use the resolved model.

Proposed fix
     yield syntheticErrorEvent(
       'lmstudio returned a 200 response with a non-SSE body (no parseable chunks). The endpoint URL is likely wrong, or LM Studio served its dashboard page in place of an SSE stream.',
-      cfg.model,
+      effectiveModel,
       cfg.provider_name,
     );
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
yield syntheticErrorEvent(
'lmstudio returned a 200 response with a non-SSE body (no parseable chunks). The endpoint URL is likely wrong, or LM Studio served its dashboard page in place of an SSE stream.',
cfg.model,
cfg.provider_name,
);
return;
}
yield syntheticErrorEvent(
'lmstudio returned a 200 response with a non-SSE body (no parseable chunks). The endpoint URL is likely wrong, or LM Studio served its dashboard page in place of an SSE stream.',
effectiveModel,
cfg.provider_name,
);
return;
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/provider-lmstudio/stream.ts` around lines 230 - 236, The non-SSE
error path calls syntheticErrorEvent with cfg.model but other error paths use
the resolved effectiveModel; update the syntheticErrorEvent invocation (the 200
non-SSE body branch) to pass effectiveModel instead of cfg.model so the reported
model ID matches the resolved model used elsewhere (refer to
syntheticErrorEvent, cfg.model, effectiveModel).

Comment on lines +396 to +409
if (m.role === 'assistant') {
// Once we cross an assistant boundary BEFORE seeing any
// pending incoming id we've passed the turn this finalize
// is writing for — earlier function_result blocks can't be
// duplicates of `function_results`.
let unseen = false;
for (const id of incomingIds) {
if (!existingResultIds.has(id)) {
unseen = true;
break;
}
}
if (!unseen) break;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Assistant-boundary break condition is inverted in dedup scan.

The current condition keeps scanning older history when incoming IDs have not been seen yet, which is the common first-run path. That makes the guard degrade to O(history) and can falsely dedupe against older turns if an ID repeats.

Suggested fix
     if (m.role === 'assistant') {
       // Once we cross an assistant boundary BEFORE seeing any
       // pending incoming id we've passed the turn this finalize
       // is writing for — earlier function_result blocks can't be
       // duplicates of `function_results`.
       let unseen = false;
       for (const id of incomingIds) {
         if (!existingResultIds.has(id)) {
           unseen = true;
           break;
         }
       }
-      if (!unseen) break;
+      if (unseen) break;
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/src/turn-orchestrator/states/functions.ts` around lines 396 - 409,
The dedup scan in the loop over prior messages uses the wrong break condition:
when encountering an assistant boundary (m.role === 'assistant') you should stop
scanning further only if any incoming ID has NOT been seen yet (i.e., there
exists id in incomingIds that is not in existingResultIds) — the current logic
continues scanning in that common path and can O(history) and mis-dedupe; update
the branch in the assistant-boundary block (the code referencing m.role,
incomingIds, existingResultIds, and unseen) so that you break out of the outer
loop when an unseen id is found (invert the condition) instead of continuing,
ensuring earlier turns are not incorrectly considered duplicates.

Comment on lines +95 to +100
for (const op of ops) {
if (op.type === 'set') newValue = op.value;
if (op.type === 'increment') {
const prev = typeof oldValue === 'number' ? oldValue : 0;
newValue = prev + (op.by ?? 1);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix sequential state::update semantics in the test stub.

increment currently reads from the original value each time, so multi-op updates don’t compose correctly.

Proposed fix
-          if (op.type === 'increment') {
-            const prev = typeof oldValue === 'number' ? oldValue : 0;
-            newValue = prev + (op.by ?? 1);
-          }
+          if (op.type === 'increment') {
+            const prev = typeof newValue === 'number' ? newValue : 0;
+            newValue = prev + (op.by ?? 1);
+          }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for (const op of ops) {
if (op.type === 'set') newValue = op.value;
if (op.type === 'increment') {
const prev = typeof oldValue === 'number' ? oldValue : 0;
newValue = prev + (op.by ?? 1);
}
for (const op of ops) {
if (op.type === 'set') newValue = op.value;
if (op.type === 'increment') {
const prev = typeof newValue === 'number' ? newValue : 0;
newValue = prev + (op.by ?? 1);
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/tests/context-compaction/compaction-done-emit.test.ts` around lines
95 - 100, The test stub computes increments against the original oldValue for
each op, so multiple sequential `increment` ops don't compose; update the loop
in compaction-done-emit.test.ts so increments use the latest computed value
rather than oldValue: initialize newValue to oldValue (or set a current value
variable) before processing ops, and when handling an `increment` op compute
prev from the current/newValue (falling back to 0 only if neither oldValue nor
newValue is a number) so successive increments accumulate correctly; refer to
the loop using vars ops, oldValue, and newValue to locate and change the logic.

Comment on lines +239 to +265
it('dedup is scoped to the pending batch (across-batch repeats stay separate for openai-format too)', () => {
// OpenAI/LM Studio's wire format is flat (no batched user-msg
// wrapper like Anthropic), so duplicates here just become a
// single tool entry regardless of an assistant gap. We still
// verify the assistant boundary doesn't merge tool entries from
// different turns into one row.
const msgs: AgentMessage[] = [
mkResult('a', 'r1'),
{
role: 'assistant',
content: [{ type: 'text', text: 'thinking…' }],
stop_reason: 'end',
model: 'qwen/qwen3-4b-2507',
provider: 'lmstudio',
timestamp: 0,
},
mkResult('a', 'r2'),
];
const out = toOpenaiMessages(msgs, '') as Array<Record<string, unknown>>;
const tools = out.filter((m) => m.role === 'tool') as Array<
{ tool_call_id: string; content: string }
>;
// Latest-wins: a single tool row carries the most recent body.
// The assistant message between them survives intact.
expect(tools).toHaveLength(1);
expect(tools[0]?.content).toBe('r2');
expect(out.filter((m) => m.role === 'assistant')).toHaveLength(1);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align the cross-batch dedup invariant in this test.

The test title/comments say cross-batch repeats should stay separate, but the assertion expects a single merged tool message. Please make the name/comment/assertions agree on one invariant to avoid baking in the wrong behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harness/tests/provider-lmstudio/wire-messages.test.ts` around lines 239 -
265, Test title/comments currently claim cross-batch repeats stay separate but
the assertions expect a single merged tool; update the test text to reflect the
actual invariant enforced by toOpenaiMessages/mkResult: dedup is scoped to the
pending batch so within-this-batch repeats (even with an assistant turn between)
collapse to one tool entry while cross-batch repeats remain separate—adjust the
it(...) title and inline comments to state "within-batch repeats dedupe;
cross-batch repeats stay separate" so they match the existing expect(...)
checks.

Three CI failures on the initial push:

1. `noControlCharactersInRegex` (Biome lint, harness): the eslint-
   disable comment on `stream.ts` was ignored by Biome 2.4.10. The
   `/[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]/g` literal flagged on each
   escape inside the character class. Replaced with a
   `stripControlChars` helper that iterates by `charCodeAt` (allows
   TAB/LF/CR, strips other C0 + DEL). Same fix in
   `provider-lmstudio/stream.ts` and `provider-llamacpp/stream.ts`.

2. Format drift (Biome format, harness): three test files added in
   the base WIP commit (`agent-trigger.test.ts`, `functions.test.ts`,
   `provider-router.test.ts`) and several files we touched
   for the /review and llamacpp work didn't match `biome format`'s
   output (long-line wrapping, joined chain calls, single-line
   expressions). Ran `biome check --write` against `harness/`;
   biome ci now exits 0.

3. SPA build (console rust lint+test, `tsc -b` step): the
   `use-auto-accept-approvals.test.ts` "wiring guard" test imported
   `node:fs/promises` to read its own source as a sanity check.
   The console TypeScript project doesn't include node types, so
   `tsc -b` failed with TS2591. Dropped the source-introspection
   test; the typechecker already catches drift if the hook stops
   calling the policy-aware selector (its return shape would
   change).

Tests: 789 harness (no change) + 234 console (was 235; dropped the
one fs-using guard). `biome ci harness` and `tsc -b` both exit 0.

No behavior change. Helper functions and formatting only.
@andersonleal andersonleal merged commit 06df663 into main May 22, 2026
14 checks passed
@andersonleal andersonleal deleted the wip/auto-accept-and-lmstudio branch May 22, 2026 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants