Auto-accept toggle, LM Studio + llama.cpp providers, /review hardening#184
Conversation
Squashed bundle of in-flight work on top of origin/main (post
harness-node -> harness rename). Several independent concerns; split
into focused PRs before shipping.
console/web - auto-accept per-conversation toggle:
- New AutoAcceptToggle component + auto-accept storage helper with
tests; ChatPanel/ChatView/Composer wire it in.
- Storage layer persists per-conversation flag; translate layer
honors it for incoming approval events.
- Chat type + iii-agent-event additions; Examples & Playground
updates.
harness - provider-lmstudio worker:
- New worker under src/provider-lmstudio/ (auth, complete, config,
discover, load/unload, refresh, register, sse, stream, types,
wire-messages, wire-tools) with accompanying tests.
- iii.worker.yaml + docs/workers/provider-lmstudio.md.
- Manual refresh after loading/unloading a model in LM Studio is
available via the existing CLI (no helper script needed):
iii trigger provider::lmstudio::refresh_models --json '{}'
harness - models catalog + routing for gemma:
- models-catalog/models.json adds google/gemma-4-e4b (and related).
- turn-orchestrator/provider-router routes gemma to lmstudio instead
of falling back to anthropic.
harness - misc refinements:
- context-compaction handlers emit a done event.
- turn-orchestrator agent-trigger + states (assistant, functions)
+ system-prompt updates.
- Provider wire-messages tweaks across anthropic, kimi, openai.
harness - QA docs:
- docs/qa/lmstudio-chat-e2e.md: long-form Python runbook for
CLI re-play of the chat path (kickoff payload, poll loop,
failure modes).
- docs/qa/agent-howto.md: terse quickstart pointing at the runbook,
with pass criteria + fail-diagnosis table for the canonical
sandbox+exec+stop scenario.
iii-directory:
- skills/directory/**/*.md re-document function_id argument shapes.
- fs_source.rs + functions/skills.rs source updates.
Misc: Cargo.lock bumps (database, iii-lsp, image-resize, mcp),
image-resize/example/config.yaml, README + harness/docs/architecture.md
updates (provider-lmstudio row added to worker catalogue, mermaid
diagram includes provLms node).
Hardens the per-conversation auto-accept toggle in three layers: Safety. Adds `auto-accept-policy.ts` — a defense-in-depth deny filter that refuses to silently click "allow" on calls whose function id matches state-mutating, destructive, process-spawn, external-egress, recursive-dispatch, or credential patterns. The server-side approval gate is still in the chain regardless; the policy is a client-side guard against prompt-injection-driven escalation when auto-accept is on. Deny verbs are boundary-anchored so the `shell::` worker namespace doesn't poison safe reads like `shell::fs::read` / `shell::fs::ls`. Hot path. Extracts the inline auto-accept effect from `ChatView` into `useAutoAcceptApprovals` with a hot-path early exit when the last message isn't a function-call. Drops `conversation.messages` from `handleSubmit`'s deps and reads through a ref instead so the callback identity is stable across token-by-token renders (previously rebuilt every token, which thrashed Lexical's `KEY_ENTER_COMMAND` listener). `MessageList` swaps `[...messages].reverse().find(...)` for a reverse for-loop to drop a per-token full-array allocation. A11y. Reworks `AutoAcceptToggle`: high-contrast `bg-ink text-bg` ON state (WCAG 1.4.3), `focus-visible` outline ring, `aria-describedby` for the explanatory text (the `title` tooltip was keyboard/SR invisible), `aria-disabled` instead of `disabled` so the policy control stays in the tab order while streaming, and the visible "on"/"off" text now carries the accessible name (was overridden by an `aria-label` that doubled with `role="switch"`). Adds a shared `<LiveRegion>` + `useLiveAnnouncer` so SR users hear auto-accept events, denial reasons, stop-reasons (assertive on error), and compaction markers. Extracts `formatStopReason` to its own module with tests. Tests: 235 passing (was 233 pre-fix).
Provider-lmstudio (and provider-kimi for the shared SSE DoS guard):
- SSE: clamp attacker-controlled tool-call `index` to [0, 256] so a
hostile or buggy server can't force unbounded `state.tool_calls`
allocation via `{"index": 1e9}` in a delta. Same guard added to
provider-kimi/sse.ts.
- SSE: render provider errors via `error_message` ONLY, not as a
`text` ContentBlock. Pre-fix a malicious LM Studio backend or MITM
could smuggle "tool approved" lines / markup into the persisted
assistant content stream where it would be re-fed to the next turn
as trusted context.
- wire-messages: replace `out.findIndex` tool-result dedup with an
`O(1)` `Map<tool_call_id, idx>` lookup. Long histories with many
tool calls were O(M²) per translation, amplified on LM Studio
because the auto-load retry can re-translate the same history
twice in one user turn.
- discover: `Promise.allSettled` the per-model `models::register`
fan-out so startup-blocking time is bounded by one round-trip's
timeout instead of N × REGISTER_TIMEOUT_MS. Move full id list to
DEBUG (count stays at INFO) so fine-tuned customer ids don't leak
into shared observability tooling.
- auth: loopback-aware key selection. Omit `Authorization` entirely
on non-loopback hosts without an explicit credential (don't ship a
synthetic `Bearer lm-studio` fingerprint to whoever operates a
misconfigured remote). Log auth-fetch failures at WARN with a
stable code so monitoring can alert on a sustained fallback rate
(DoS-the-auth-worker → force-fallback attack).
- config: validate `LMSTUDIO_BASE_URL` as http(s) via `new URL()`.
Reject malformed values + non-http(s) schemes; fall through to
yaml/default. Warn when the resolved host is non-loopback.
- stream: truncate non-2xx response bodies to 256 bytes and strip
control chars before surfacing — non-2xx can carry HTML auth-
redirect pages, proxy error bodies, or attacker-controlled text.
Turn-orchestrator:
- `states/functions.ts`: bound the `existingResultIds` rebuild by
walking messages backwards and stopping at the assistant boundary
once the incoming function_result ids are covered. Pre-fix this
was O(history) per finalize.
Context-compaction:
- Extract `emitCompactionDone` helper. The sync + async handlers
carried byte-for-byte identical try/catch/emit blocks — collapsed
to one call each, with a stable failure code.
Tests:
- stream timeout: switch to fake timers (was real-timer-bound,
flaky on slow CI). Eagerly attach a no-op `.catch()` to silence a
fake-timer-ordering "unhandled rejection" warning.
- sse: tighten `error_kind` assertion (`toBeDefined` → `toBe('permanent')`).
- load timeout: assert `abortedFromController` so a refactor that
surfaces a timeout error WITHOUT actually calling `controller.abort()`
is caught.
- wire-messages: port the three Anthropic dedup scenarios (collapse,
distinct-id ordering, across-batch separation) to keep the providers
in lockstep.
- auth/config: cover the new `isLoopbackUrl`, `selectAuthKey`,
`buildAuthHeaders`, and URL-validation shapes.
Tests: 789 passing (was 710 pre-review).
Adds a llama.cpp `llama-server` provider as a peer to provider-lmstudio. Same OpenAI-compatible chat path; runtime-management surface trimmed because llama-server hosts exactly one model per process. What's the same as provider-lmstudio: - OpenAI Chat Completions wire format with SSE streaming. - `reasoning_content` round-trip for thinking-mode models (DeepSeek-R1, Qwen-thinking) when llama-server is started with `--jinja --reasoning-format deepseek`. - Tool calls via `--jinja` + a tool-aware chat template. - The DoS guard (tool-call index clamp), security posture (error in `error_message` only, non-2xx truncation), and perf shape (O(1) tool-result dedup, parallel discovery via Promise.allSettled). What's different: - Default port 8080 (vs LM Studio's 1234). - No fallback bearer string. Unlike LM Studio there is no documented default token; on loopback without a credential we omit `Authorization` entirely. On non-loopback without a credential we also omit AND log a WARN with code `llamacpp_auth_omitted_nonloopback`. - No `load_model` / `unload_model` bus functions. llama-server can't switch models at runtime (`-m` is set at process start). Only `complete`, `stream`, and `refresh_models` are registered. - Discovery uses OpenAI-compat `/v1/models` (id only) rather than LM Studio's `/api/v0/models` (rich metadata, all-downloaded). The startup fire-and-forget call registers the single loaded model. - `llamacpp-local` catalog placeholder (alongside `lmstudio-local`) so capability gating (supports_tools etc.) works for any user-loaded model id. Bumps the fetch timeout default 30s → 120s. 30s was too short for large GGUFs (35B+ Q4) where prompt-processing alone can take tens of seconds before the first token arrives, especially over LAN. Adds `LLAMACPP_FETCH_TIMEOUT_MS` env override (positive integer ms; values <1000 ignored). Read fresh per-call so operators can adjust without restarting the worker, and so tests can mutate it. Wiring: - `harness/src/index.ts` registers the worker entry. - `turn-orchestrator/provider-router.ts` adds the `'llamacpp'` route + `targetFunctionId` branch. Same "no model-name heuristic" posture as lmstudio (id namespaces overlap with HF-style ids from other services; require explicit `provider='llamacpp'`). - `auth-credentials/types.ts` ENV_VAR_MAP: `LLAMACPP_API_KEY`. - `models-catalog/catalog.ts` + `models.json`: `llamacpp-local` placeholder fallback. Docs at `harness/docs/workers/provider-llamacpp.md` (configuration, auth policy, defenses, side-by-side diff against provider-lmstudio). Tests: 789 → 821 (+32 new across 9 test files + provider-router broadening for the new branch).
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (36)
📝 WalkthroughWalkthroughAdds per-conversation auto-accept with safety policy and UI, ARIA live announcements, new compaction and stop-reason events, token estimation from compaction, and LM Studio/llama.cpp providers with SSE streaming, discovery, load/unload, routing, and docs; directory surfaces function_id in skills. ChangesAuto-accept approvals, compaction, and local providers
Sequence Diagram(s)sequenceDiagram
participant ChatClient
participant HarnessRouter
participant Provider(lmstudio/llamacpp)
participant ModelAPI
ChatClient->>HarnessRouter: provider::(lmstudio|llamacpp)::stream (writer_ref, messages, tools)
HarnessRouter->>Provider(lmstudio/llamacpp): register handler invoked
Provider(lmstudio/llamacpp)->>ModelAPI: POST /v1/chat/completions (SSE)
ModelAPI-->>Provider(lmstudio/llamacpp): SSE chunks (text/thinking/tool_calls)
Provider(lmstudio/llamacpp)->>HarnessRouter: stream events (start, deltas, done/error)
HarnessRouter-->>ChatClient: JSON events via writer_ref
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
|
skill-check — worker0 verified, 12 skipped (no docs/).
Three for three. Nicely done. |
There was a problem hiding this comment.
Actionable comments posted: 18
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
harness/README.md (1)
26-46:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDocument
provider-llamacppalongside LM Studio in Workers and Quickstart.This README update adds LM Studio, but the new llama.cpp worker from this PR is still missing from both the workers table and run commands, leaving setup docs incomplete.
Proposed doc update
| `src/provider-kimi/` | `provider::kimi::stream`, `provider::kimi::complete` | Kimi (Moonshot) Chat Completions SSE → channel writer. | | `src/provider-lmstudio/` | `provider::lmstudio::stream`, `provider::lmstudio::complete` | LM Studio (localhost) Chat Completions SSE → channel writer. | +| `src/provider-llamacpp/` | `provider::llamacpp::stream`, `provider::llamacpp::complete` | llama.cpp (llama-server) OpenAI-compatible Chat Completions SSE → channel writer. | | `src/context-compaction/` | (none — pure side-car on `agent::events`) | Optional out-of-band session-history compactor. | @@ node dist/provider-kimi/main.js --url ws://127.0.0.1:49134 --config ./config.yaml node dist/provider-lmstudio/main.js --url ws://127.0.0.1:49134 --config ./config.yaml +node dist/provider-llamacpp/main.js --url ws://127.0.0.1:49134 --config ./config.yaml node dist/llm-budget/main.js --url ws://127.0.0.1:49134🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/README.md` around lines 26 - 46, Add documentation for the new provider-llamacpp worker: insert a table row alongside the LM Studio entry referencing src/provider-llamacpp/ and the worker capabilities (e.g. provider::llamacpp::stream, provider::llamacpp::complete) in the workers table, and add the corresponding Quickstart run command to the example commands: node dist/provider-llamacpp/main.js --url ws://127.0.0.1:49134 --config ./config.yaml (match the same flags pattern used by provider-lmstudio and other providers).console/web/src/hooks/use-conversations.ts (1)
72-80:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winBackfill
autoAcceptfor previously persisted conversations.
loadConversations()can return legacy records withoutautoAccept; those flow through asundefinedeven thoughConversation.autoAcceptis now required. Normalize on load to keep runtime shape stable.💡 Suggested patch
const [conversations, setConversations] = useState<Conversation[]>(() => { - const loaded = loadConversations() + const loaded = loadConversations().map((c) => ({ + ...c, + autoAccept: c.autoAccept ?? false, + }))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@console/web/src/hooks/use-conversations.ts` around lines 72 - 80, Loaded conversations may lack the required Conversation.autoAccept field; modify the initializer that calls loadConversations() so it normalizes/backfills autoAccept on each record before returning—e.g., map over loaded results and for any conversation with autoAccept === undefined set it to the default value used by emptyConversation (you can derive that by calling emptyConversation(model) where model is convo.model || loadLastModel() || DEFAULT_MODEL); then return the mapped list (or the single emptyConversation when loaded is empty) so runtime shape always includes autoAccept.harness/src/provider-llamacpp/discover.ts (1)
1-202:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix formatting in this file before merge.
The
harness: node lint + testcheck is currently failing on formatting for this file, so this will block CI until it’s reformatted.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/provider-llamacpp/discover.ts` around lines 1 - 202, This file fails CI due to formatting; run the project's auto-formatter/linter fix (e.g., prettier/eslint with --fix or the repo's "node lint + test" task) on harness/src/provider-llamacpp/discover.ts and commit the changes so the file matches repo style. Focus on normalizing whitespace/indentation, trailing commas/semicolons, and import/exports formatting around functions such as modelsUrl, fetchWithTimeout, fetchModelIds, toCatalogModel, registerDiscovered, and discoverAndRegister so the file passes lint/format checks. Ensure no functional changes are made—only formatting fixes—and re-run the CI lint check locally before pushing.
🧹 Nitpick comments (7)
console/web/DESIGN.md (2)
1731-1736: 💤 Low valueAdd language specifier to fenced code block.
This text diagram should have an explicit language specifier for better tool support.
📝 Proposed fix
-``` +```text section eyebrow (label-caps) <- bg page card bg-panel head + bg body <- per function/trigger pane bg-paper-2 head + bg body <- per schema (request/response) tree flat type-table <- per field🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@console/web/DESIGN.md` around lines 1731 - 1736, The fenced code block that contains the ASCII diagram (the triple-backtick block shown with the lines starting "section eyebrow (label-caps)...") lacks a language specifier; change the opening fence from ``` to ```text so tools recognize it as plain text (update the code fence surrounding the diagram accordingly).
641-645: 💤 Low valueAdd language specifier to fenced code block.
This text diagram would benefit from an explicit language specifier for better tool support and clarity. Add
textafter the opening fence.📝 Proposed fix
-``` +```text bg page (cream paper) └─ panel header strips, focused cards, code-block chrome └─ paper-2 nested separation when needed🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@console/web/DESIGN.md` around lines 641 - 645, Add the language specifier "text" to the fenced code block that currently begins with ``` in DESIGN.md (the small ASCII diagram block shown under the bg → panel → paper-2 example) by changing the opening fence to ```text so tools and renderers recognize it as plain text; keep the block contents unchanged and only update the opening fence.harness/src/provider-openai/wire-messages.ts (1)
52-66: ⚖️ Poor tradeoffConsider O(n) deduplication with a Map to avoid O(n²) complexity.
The current implementation calls
findIndex(O(n)) for eachfunction_resultmessage, resulting in O(n²) complexity when processing large conversation histories with many tool calls. For better performance, tracktool_call_id → indexin a Map during the main loop, then perform a single O(1) lookup and update per message.⚡ Proposed optimization
export function toOpenaiMessages(messages: AgentMessage[], system_prompt: string): unknown[] { const out: unknown[] = []; + const toolIdxMap = new Map<string, number>(); if (system_prompt.length > 0) { out.push({ role: 'system', content: system_prompt }); } for (const m of messages) { if (m.role === 'user') { const text = m.content .filter( (c): c is Extract<(typeof m.content)[number], { type: 'text' }> => c.type === 'text', ) .map((c) => c.text) .join('\n'); out.push({ role: 'user', content: text }); } else if (m.role === 'assistant') { const text = m.content .filter( (c): c is Extract<(typeof m.content)[number], { type: 'text' }> => c.type === 'text', ) .map((c) => c.text) .join('\n'); const tool_calls = m.content .filter( (c): c is Extract<(typeof m.content)[number], { type: 'function_call' }> => c.type === 'function_call', ) .map((c) => ({ id: c.id, type: 'function', function: { name: c.function_id, arguments: JSON.stringify(c.arguments) }, })); const entry: Record<string, unknown> = { role: 'assistant' }; if (text.length > 0) entry.content = text; if (tool_calls.length > 0) entry.tool_calls = tool_calls; out.push(entry); } else if (m.role === 'function_result') { const text = formatFunctionResultContent(m); const row: Record<string, unknown> = { role: 'tool', tool_call_id: m.function_call_id, content: text, }; if (m.is_error) row.is_error = true; - // Boundary dedup: never ship two tool messages with the same - // tool_call_id. Some OpenAI-compatible servers accept duplicates - // and silently overwrite; others (Anthropic's compat shim, - // strict gateways) reject. Latest-wins: replace any prior tool - // message with the same id rather than appending. - const existingIdx = out.findIndex( - (e) => - (e as { role?: string }).role === 'tool' && - (e as { tool_call_id?: string }).tool_call_id === m.function_call_id, - ); - if (existingIdx >= 0) { - out[existingIdx] = row; - } else { - out.push(row); - } + // Boundary dedup: never ship two tool messages with the same + // tool_call_id. Some OpenAI-compatible servers accept duplicates + // and silently overwrite; others (Anthropic's compat shim, + // strict gateways) reject. Latest-wins: replace any prior tool + // message with the same id rather than appending. + const existingIdx = toolIdxMap.get(m.function_call_id); + if (existingIdx !== undefined) { + out[existingIdx] = row; + } else { + toolIdxMap.set(m.function_call_id, out.length); + out.push(row); + } } // custom messages are skipped } return out; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/provider-openai/wire-messages.ts` around lines 52 - 66, The dedup block using out.findIndex leads to O(n²); replace it by maintaining a Map<string, number> (e.g. toolIndexMap) keyed by m.function_call_id that you populate during the main loop, then use toolIndexMap.get(m.function_call_id) instead of computing existingIdx via findIndex; if an index exists assign out[index] = row (no map change), otherwise push row and set toolIndexMap.set(m.function_call_id, out.length - 1); update the map whenever you mutate out so subsequent iterations are O(1).console/web/src/components/ui/LiveRegion.tsx (1)
21-31: 💤 Low valueConsider modernizing the visually-hidden pattern.
The
clip: 'rect(0, 0, 0, 0)'property is deprecated in favor ofclip-path: 'inset(50%)'. Whileclipstill works in most browsers, the modern approach is more future-proof.♻️ Proposed modernization
const SR_ONLY_STYLE: React.CSSProperties = { position: 'absolute', width: 1, height: 1, padding: 0, margin: -1, overflow: 'hidden', - clip: 'rect(0, 0, 0, 0)', + clipPath: 'inset(50%)', whiteSpace: 'nowrap', border: 0, }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@console/web/src/components/ui/LiveRegion.tsx` around lines 21 - 31, SR_ONLY_STYLE uses the deprecated CSS clip property; update the visually-hidden style in SR_ONLY_STYLE to use modern clip-path by adding clipPath: 'inset(50%)' while keeping clip: 'rect(0, 0, 0, 0)' as a fallback for older browsers, and maintain the other properties (position, width, height, padding, margin, overflow, whiteSpace, border) so the LiveRegion behavior remains unchanged; modify the SR_ONLY_STYLE constant accordingly (referencing the SR_ONLY_STYLE symbol).harness/src/provider-kimi/wire-messages.ts (1)
65-74: ⚡ Quick winUse O(1) tool-result dedup indexing in this hot path.
Per-message
findIndexturns dedup into O(n²) as history grows. Track tool row indices in a map to keep replacement O(1).♻️ Proposed refactor
export function toOpenaiMessages(messages: AgentMessage[], system_prompt: string): unknown[] { const out: unknown[] = []; + const toolRowById = new Map<string, number>(); if (system_prompt.length > 0) { out.push({ role: 'system', content: system_prompt }); } @@ - const existingIdx = out.findIndex( - (e) => - (e as { role?: string }).role === 'tool' && - (e as { tool_call_id?: string }).tool_call_id === m.function_call_id, - ); + const existingIdx = toolRowById.get(m.function_call_id) ?? -1; if (existingIdx >= 0) { out[existingIdx] = row; } else { out.push(row); + toolRowById.set(m.function_call_id, out.length - 1); }harness/src/provider-lmstudio/config.ts (1)
21-93: ⚡ Quick winExtract URL validation/loopback resolution into a shared helper.
This block is effectively duplicated in the llama.cpp provider config. Centralizing it will reduce security/behavior drift and keep warning semantics aligned across local providers.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@harness/src/provider-lmstudio/config.ts` around lines 21 - 93, The URL validation and loopback-check logic (validatedUrl, isLoopbackHost, and resolveApiUrl) is duplicated in the llama.cpp provider; extract these into a shared helper module (e.g., provider-local-utils) that exports validatedUrl, isLoopbackHost, and resolveApiUrl (or a single resolveApiUrlWithLogging) so both harness/src/provider-lmstudio/config.ts and the llama.cpp provider import and call the same functions; ensure the shared helper preserves the same logger usage and warning codes ('lmstudio_base_url_invalid', 'lmstudio_api_url_invalid', 'lmstudio_api_url_remote') or make them generic and map them when importing, and keep DEFAULT_API_URL behavior intact so callers (resolveApiUrl) still return the original fallback string.iii-directory/src/fs_source.rs (1)
772-779: ⚡ Quick winAssert
function_idin frontmatter tests to lock the new contract.The new
function_idfield is parsed but not asserted in the updated tests, so regressions could slip through unnoticed.✅ Proposed test additions
fn read_with_frontmatter_defaults_when_missing() { @@ let (fm, body) = read_skill_with_frontmatter(&path).unwrap(); assert!(fm.title.is_none()); assert!(fm.kind.is_none()); + assert!(fm.function_id.is_none()); assert_eq!(body, "# Plain\n\nbody\n"); } @@ fn read_with_frontmatter_tolerates_unrelated_yaml_keys() { @@ let (fm, _) = read_skill_with_frontmatter(&path).unwrap(); assert_eq!(fm.title.as_deref(), Some("Hi")); assert_eq!(fm.kind.as_deref(), Some("index")); + assert_eq!(fm.function_id.as_deref(), Some("c::d")); }Also applies to: 783-794
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@iii-directory/src/fs_source.rs` around lines 772 - 779, The tests for read_skill_with_frontmatter are missing assertions for the new frontmatter field function_id; update the test function read_with_frontmatter_defaults_when_missing() to assert fm.function_id.is_none() for the plain.md case, and similarly update the other test (the one between lines 783-794 that reads frontmatter-present content) to assert fm.function_id == Some("expected-id") (or the exact string used in that test) so the new contract is locked; reference the fm variable produced by read_skill_with_frontmatter and add the appropriate .is_none() or equality assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@console/web/src/hooks/use-auto-accept-approvals.ts`:
- Around line 61-77: The onDenied callback is being called repeatedly for the
same pending denied approvals; add deduplication using a ref-set (reuse
autoResolvedRef or add a new ref like autoDeniedRef) to track seen denied keys
(e.g. `${functionId}:${functionCallId}`) inside the useEffect that processes
selection.deniedByPolicy so you only call onDenied for entries not already in
the set, then add them to the set after invoking onDenied; keep the existing
reset of the ref on conversationId change so deduplication is per-conversation
and still allow nextApprovalsToAutoResolve/selection.candidates to work as
before.
In `@console/web/src/lib/backend/translate.ts`:
- Around line 55-58: The code emits the terminal "assistant-end" twice via
translateAssistantMessageEnd() in the 'message_end' case and again in the
'agent_end' handling; update the logic so only one code path emits the terminal
signal: pick a single emitter (preferably translateAssistantMessageEnd for
message-based termination) and remove or guard the duplicate emission in the
agent_end branch (or vice versa). Concretely, modify the agent_end handling to
check whether the event stream already had a message_end/assistant emission (or
whether the last message role was 'assistant') before calling the terminal emit,
or move the terminal emission exclusively into translateAssistantMessageEnd and
ensure agent_end no longer emits assistant-end; adjust corresponding checks in
the functions handling 'message_end', 'agent_end' and any code paths referenced
at lines around translateAssistantMessageEnd and agent_end to prevent duplicate
terminal signals.
In `@harness/docs/architecture.md`:
- Around line 32-33: Update the architecture doc to keep the provider listing
and counts in sync: add a new provider entry for llama.cpp (mirror the format of
provider-lmstudio row, e.g., add a "provider-llamacpp |
[src/provider-llamacpp/](harness/src/provider-llamacpp/) | llama.cpp (localhost)
Chat Completions SSE → channel writer. |
[workers/provider-llamacpp.md](harness/docs/workers/provider-llamacpp.md) |"),
update the opening "worker-count" statement and any catalogue/diagram counts to
include both local providers, and update the other affected sections referenced
(lines 53-54, 70-71, 209-220) so the documented worker/service counts and
diagrams match the new provider list; keep formatting consistent with the
existing provider-lmstudio and context-compaction rows and link to the
corresponding workers docs.
In `@harness/docs/workers/provider-llamacpp.md`:
- Around line 34-38: The fenced code block containing the environment variable
LLAMACPP_BASE_URL is missing a language tag which triggers markdownlint MD040;
update the fenced block around the lines with LLAMACPP_BASE_URL to include a
language identifier (e.g., bash) so the opening triple-backticks become ```bash
and the closing remains ```, ensuring the block renders and satisfies linting.
In `@harness/src/context-compaction/handler-sync.ts`:
- Around line 132-141: The emitCompactionDone call is awaited and can throw,
which contradicts the “best-effort” comment; change it so publish failures do
not abort the sync compaction path by isolating the call in its own try/catch
(or fire-and-forget promise) and logging the error instead of rethrowing.
Specifically, locate the emitCompactionDone(...) invocation in handler-sync.ts
(inside the sync compaction flow) and wrap it so any exception is caught and
sent to the logger (or swallowed) while allowing the surrounding function to
continue and return normally.
In `@harness/src/provider-llamacpp/auth.ts`:
- Around line 1-6: The import block at the top (symbols: logger, Credential,
ISdk, WorkerConfig, ChatCompletionsConfig, configFromCredential) is
misformatted; run the project's formatter (e.g. the repo's format script or
Prettier/ESLint autofix) against this file to normalize imports and whitespace
so the linter warning is resolved.
In `@harness/src/provider-llamacpp/refresh-fn.ts`:
- Around line 11-48: This file fails the formatter check; run the repository
formatter and commit the normalized output. Locate the register function and
surrounding code (symbols: FUNCTION_ID, RefreshResult, register) in
harness/src/provider-llamacpp/refresh-fn.ts, run the project's formatting
command (e.g., npm run format or the repo's configured formatter like
prettier/clang-format), ensure whitespace/indentation and import ordering are
fixed, then stage and commit the formatted file so CI passes.
In `@harness/src/provider-llamacpp/register.ts`:
- Around line 36-45: The catch in runStartupDiscovery currently always logs
"header build failed" even if discoverAndRegister throws; update the function to
either split into two try/catch blocks—one around buildAuthHeaders(iii, chatUrl)
that logs header-specific errors and a second around discoverAndRegister(iii,
chatUrl, headers) that logs discovery-specific errors—or change the single catch
message to a generic "llamacpp startup discovery failed" and include the error
and context; reference runStartupDiscovery, buildAuthHeaders and
discoverAndRegister when making the change so the log accurately reflects which
step failed.
In `@harness/src/provider-llamacpp/stream.ts`:
- Around line 139-142: The regex in the safeText construction uses literal
control characters and trips lint; update the replace to use explicit Unicode
escapes instead (e.g. replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g,
'') ) so you can remove the eslint-disable comment while preserving the same
character ranges and the .slice(0, 256) behavior for the safeText variable.
In `@harness/src/provider-lmstudio/auth.ts`:
- Around line 52-54: The current validator returns any object with a type
property which can lead to runtime crashes when the code later assumes a
specific string API key; update the validation around the returned cred so it
ensures cred.type is a string and that the expected secret field used later
(e.g., cred.key or cred.api_key — whichever the code expects where the comment
pointed to) exists and is a string before returning; in practice, inside the
same function that currently checks "if (!cred || typeof cred !== 'object' ||
!('type' in cred)) return null;" replace that broad check with explicit checks:
typeof cred.type === 'string' and typeof cred.<expectedSecretField> === 'string'
(or check multiple candidate fields and pick the valid string), and return null
if these checks fail so the downstream code that references the key string
cannot crash.
In `@harness/src/provider-lmstudio/load-fn.ts`:
- Around line 37-43: The current validation for payload.model only checks length
and allows whitespace-only strings; update the guard in the load-fn.ts where
payload is validated (the if block that currently checks payload.model, used
before calling buildAuthHeaders and loadModel) to reject strings that are empty
after trimming (e.g., ensure typeof payload.model === 'string' and
payload.model.trim().length > 0) so whitespace-only model ids are treated as
invalid and return the same { ok: false, error: ... } response.
In `@harness/src/provider-lmstudio/load.ts`:
- Around line 103-114: The error includes the full resp.text() body which can be
very large; change the code that reads the response body (the const text = await
resp.text().catch(() => '') in this file and the similar usage around the
154-157 block) to truncate the captured body to a safe maximum (e.g. 1KB) before
embedding in the thrown Error—read the text as before but assign a truncated
version (add "…(truncated)" when truncated) and use that truncated text in the
Error message that references model and resp.status so logs/events and memory
stay bounded.
In `@harness/src/provider-lmstudio/register.ts`:
- Around line 35-43: The current try/catch wraps both buildAuthHeaders and
discoverAndRegister so any error from discoverAndRegister is mislogged as a
header build failure; change the structure so only buildAuthHeaders is wrapped
by the try/catch (call discoverAndRegister after the try), or give
discoverAndRegister its own specific error handling/logging. In practice, move
await discoverAndRegister(iii, chatUrl, headers) out of the try that catches
errors for buildAuthHeaders (or add a separate catch for discoverAndRegister)
and ensure logger.warn('lmstudio startup discovery: header build failed', { err:
String(err) }) only runs for errors thrown by buildAuthHeaders.
In `@harness/src/provider-lmstudio/stream.ts`:
- Around line 230-236: The non-SSE error path calls syntheticErrorEvent with
cfg.model but other error paths use the resolved effectiveModel; update the
syntheticErrorEvent invocation (the 200 non-SSE body branch) to pass
effectiveModel instead of cfg.model so the reported model ID matches the
resolved model used elsewhere (refer to syntheticErrorEvent, cfg.model,
effectiveModel).
- Around line 148-154: The error event uses the placeholder cfg.model instead of
the resolved model id, causing misleading reports; update the
syntheticErrorEvent call to pass effectiveModel (the resolved model id used
elsewhere) instead of cfg.model so errors report the actual model, ensuring
effectiveModel is in scope for the call that also passes safeText, resp.status,
and classifyLmstudioError.
In `@harness/src/turn-orchestrator/states/functions.ts`:
- Around line 396-409: The dedup scan in the loop over prior messages uses the
wrong break condition: when encountering an assistant boundary (m.role ===
'assistant') you should stop scanning further only if any incoming ID has NOT
been seen yet (i.e., there exists id in incomingIds that is not in
existingResultIds) — the current logic continues scanning in that common path
and can O(history) and mis-dedupe; update the branch in the assistant-boundary
block (the code referencing m.role, incomingIds, existingResultIds, and unseen)
so that you break out of the outer loop when an unseen id is found (invert the
condition) instead of continuing, ensuring earlier turns are not incorrectly
considered duplicates.
In `@harness/tests/context-compaction/compaction-done-emit.test.ts`:
- Around line 95-100: The test stub computes increments against the original
oldValue for each op, so multiple sequential `increment` ops don't compose;
update the loop in compaction-done-emit.test.ts so increments use the latest
computed value rather than oldValue: initialize newValue to oldValue (or set a
current value variable) before processing ops, and when handling an `increment`
op compute prev from the current/newValue (falling back to 0 only if neither
oldValue nor newValue is a number) so successive increments accumulate
correctly; refer to the loop using vars ops, oldValue, and newValue to locate
and change the logic.
In `@harness/tests/provider-lmstudio/wire-messages.test.ts`:
- Around line 239-265: Test title/comments currently claim cross-batch repeats
stay separate but the assertions expect a single merged tool; update the test
text to reflect the actual invariant enforced by toOpenaiMessages/mkResult:
dedup is scoped to the pending batch so within-this-batch repeats (even with an
assistant turn between) collapse to one tool entry while cross-batch repeats
remain separate—adjust the it(...) title and inline comments to state
"within-batch repeats dedupe; cross-batch repeats stay separate" so they match
the existing expect(...) checks.
---
Outside diff comments:
In `@console/web/src/hooks/use-conversations.ts`:
- Around line 72-80: Loaded conversations may lack the required
Conversation.autoAccept field; modify the initializer that calls
loadConversations() so it normalizes/backfills autoAccept on each record before
returning—e.g., map over loaded results and for any conversation with autoAccept
=== undefined set it to the default value used by emptyConversation (you can
derive that by calling emptyConversation(model) where model is convo.model ||
loadLastModel() || DEFAULT_MODEL); then return the mapped list (or the single
emptyConversation when loaded is empty) so runtime shape always includes
autoAccept.
In `@harness/README.md`:
- Around line 26-46: Add documentation for the new provider-llamacpp worker:
insert a table row alongside the LM Studio entry referencing
src/provider-llamacpp/ and the worker capabilities (e.g.
provider::llamacpp::stream, provider::llamacpp::complete) in the workers table,
and add the corresponding Quickstart run command to the example commands: node
dist/provider-llamacpp/main.js --url ws://127.0.0.1:49134 --config ./config.yaml
(match the same flags pattern used by provider-lmstudio and other providers).
In `@harness/src/provider-llamacpp/discover.ts`:
- Around line 1-202: This file fails CI due to formatting; run the project's
auto-formatter/linter fix (e.g., prettier/eslint with --fix or the repo's "node
lint + test" task) on harness/src/provider-llamacpp/discover.ts and commit the
changes so the file matches repo style. Focus on normalizing
whitespace/indentation, trailing commas/semicolons, and import/exports
formatting around functions such as modelsUrl, fetchWithTimeout, fetchModelIds,
toCatalogModel, registerDiscovered, and discoverAndRegister so the file passes
lint/format checks. Ensure no functional changes are made—only formatting
fixes—and re-run the CI lint check locally before pushing.
---
Nitpick comments:
In `@console/web/DESIGN.md`:
- Around line 1731-1736: The fenced code block that contains the ASCII diagram
(the triple-backtick block shown with the lines starting "section eyebrow
(label-caps)...") lacks a language specifier; change the opening fence from ```
to ```text so tools recognize it as plain text (update the code fence
surrounding the diagram accordingly).
- Around line 641-645: Add the language specifier "text" to the fenced code
block that currently begins with ``` in DESIGN.md (the small ASCII diagram block
shown under the bg → panel → paper-2 example) by changing the opening fence to
```text so tools and renderers recognize it as plain text; keep the block
contents unchanged and only update the opening fence.
In `@console/web/src/components/ui/LiveRegion.tsx`:
- Around line 21-31: SR_ONLY_STYLE uses the deprecated CSS clip property; update
the visually-hidden style in SR_ONLY_STYLE to use modern clip-path by adding
clipPath: 'inset(50%)' while keeping clip: 'rect(0, 0, 0, 0)' as a fallback for
older browsers, and maintain the other properties (position, width, height,
padding, margin, overflow, whiteSpace, border) so the LiveRegion behavior
remains unchanged; modify the SR_ONLY_STYLE constant accordingly (referencing
the SR_ONLY_STYLE symbol).
In `@harness/src/provider-lmstudio/config.ts`:
- Around line 21-93: The URL validation and loopback-check logic (validatedUrl,
isLoopbackHost, and resolveApiUrl) is duplicated in the llama.cpp provider;
extract these into a shared helper module (e.g., provider-local-utils) that
exports validatedUrl, isLoopbackHost, and resolveApiUrl (or a single
resolveApiUrlWithLogging) so both harness/src/provider-lmstudio/config.ts and
the llama.cpp provider import and call the same functions; ensure the shared
helper preserves the same logger usage and warning codes
('lmstudio_base_url_invalid', 'lmstudio_api_url_invalid',
'lmstudio_api_url_remote') or make them generic and map them when importing, and
keep DEFAULT_API_URL behavior intact so callers (resolveApiUrl) still return the
original fallback string.
In `@harness/src/provider-openai/wire-messages.ts`:
- Around line 52-66: The dedup block using out.findIndex leads to O(n²); replace
it by maintaining a Map<string, number> (e.g. toolIndexMap) keyed by
m.function_call_id that you populate during the main loop, then use
toolIndexMap.get(m.function_call_id) instead of computing existingIdx via
findIndex; if an index exists assign out[index] = row (no map change), otherwise
push row and set toolIndexMap.set(m.function_call_id, out.length - 1); update
the map whenever you mutate out so subsequent iterations are O(1).
In `@iii-directory/src/fs_source.rs`:
- Around line 772-779: The tests for read_skill_with_frontmatter are missing
assertions for the new frontmatter field function_id; update the test function
read_with_frontmatter_defaults_when_missing() to assert fm.function_id.is_none()
for the plain.md case, and similarly update the other test (the one between
lines 783-794 that reads frontmatter-present content) to assert fm.function_id
== Some("expected-id") (or the exact string used in that test) so the new
contract is locked; reference the fm variable produced by
read_skill_with_frontmatter and add the appropriate .is_none() or equality
assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 152cc76f-230f-47eb-825d-739fc414d578
📒 Files selected for processing (125)
console/web/DESIGN.mdconsole/web/src/components/chat/AutoAcceptToggle.tsxconsole/web/src/components/chat/ChatPanel.tsxconsole/web/src/components/chat/ChatView.tsxconsole/web/src/components/chat/Composer.tsxconsole/web/src/components/chat/MessageList.tsxconsole/web/src/components/ui/LiveRegion.tsxconsole/web/src/hooks/use-auto-accept-approvals.test.tsconsole/web/src/hooks/use-auto-accept-approvals.tsconsole/web/src/hooks/use-conversations.tsconsole/web/src/hooks/use-live-announcer.tsconsole/web/src/lib/backend/auto-accept-policy.test.tsconsole/web/src/lib/backend/auto-accept-policy.tsconsole/web/src/lib/backend/auto-accept.test.tsconsole/web/src/lib/backend/auto-accept.tsconsole/web/src/lib/backend/translate.test.tsconsole/web/src/lib/backend/translate.tsconsole/web/src/lib/backend/types.tsconsole/web/src/lib/format-stop-reason.test.tsconsole/web/src/lib/format-stop-reason.tsconsole/web/src/lib/storage.tsconsole/web/src/lib/token-estimate.test.tsconsole/web/src/lib/token-estimate.tsconsole/web/src/pages/Examples/sections/composer-variants.tsxconsole/web/src/pages/Playground/EventLog.tsxconsole/web/src/pages/Playground/index.tsxconsole/web/src/types/chat.tsconsole/web/src/types/iii-agent-event.tsharness/README.mdharness/docs/architecture.mdharness/docs/workers/provider-llamacpp.mdharness/docs/workers/provider-lmstudio.mdharness/package.jsonharness/src/auth-credentials/types.tsharness/src/context-compaction/emit.tsharness/src/context-compaction/handler-async.tsharness/src/context-compaction/handler-sync.tsharness/src/index.tsharness/src/models-catalog/catalog.tsharness/src/models-catalog/models.jsonharness/src/provider-anthropic/wire-messages.tsharness/src/provider-kimi/sse.tsharness/src/provider-kimi/wire-messages.tsharness/src/provider-llamacpp/auth.tsharness/src/provider-llamacpp/complete.tsharness/src/provider-llamacpp/config.tsharness/src/provider-llamacpp/discover.tsharness/src/provider-llamacpp/iii.worker.yamlharness/src/provider-llamacpp/main.tsharness/src/provider-llamacpp/refresh-fn.tsharness/src/provider-llamacpp/register.tsharness/src/provider-llamacpp/sse.tsharness/src/provider-llamacpp/stream-fn.tsharness/src/provider-llamacpp/stream.tsharness/src/provider-llamacpp/types.tsharness/src/provider-llamacpp/wire-messages.tsharness/src/provider-llamacpp/wire-tools.tsharness/src/provider-lmstudio/auth.tsharness/src/provider-lmstudio/complete.tsharness/src/provider-lmstudio/config.tsharness/src/provider-lmstudio/discover.tsharness/src/provider-lmstudio/iii.worker.yamlharness/src/provider-lmstudio/load-fn.tsharness/src/provider-lmstudio/load.tsharness/src/provider-lmstudio/main.tsharness/src/provider-lmstudio/refresh-fn.tsharness/src/provider-lmstudio/register.tsharness/src/provider-lmstudio/sse.tsharness/src/provider-lmstudio/stream-fn.tsharness/src/provider-lmstudio/stream.tsharness/src/provider-lmstudio/types.tsharness/src/provider-lmstudio/unload-fn.tsharness/src/provider-lmstudio/wire-messages.tsharness/src/provider-lmstudio/wire-tools.tsharness/src/provider-openai/wire-messages.tsharness/src/turn-orchestrator/agent-trigger.tsharness/src/turn-orchestrator/provider-router.tsharness/src/turn-orchestrator/states/assistant.tsharness/src/turn-orchestrator/states/functions.tsharness/src/turn-orchestrator/system-prompt.tsharness/src/types/agent-event.tsharness/tests/auth-credentials/env-map-llamacpp.test.tsharness/tests/auth-credentials/env-map-lmstudio.test.tsharness/tests/context-compaction/compaction-done-emit.test.tsharness/tests/models-catalog/seed-llamacpp.test.tsharness/tests/models-catalog/seed-lmstudio.test.tsharness/tests/provider-anthropic/wire-messages.test.tsharness/tests/provider-kimi/sse.test.tsharness/tests/provider-kimi/wire-messages.test.tsharness/tests/provider-llamacpp/auth.test.tsharness/tests/provider-llamacpp/config.test.tsharness/tests/provider-llamacpp/discover.test.tsharness/tests/provider-llamacpp/sse.test.tsharness/tests/provider-llamacpp/stream.test.tsharness/tests/provider-llamacpp/wire-messages.test.tsharness/tests/provider-llamacpp/wire-tools.test.tsharness/tests/provider-lmstudio/auth.test.tsharness/tests/provider-lmstudio/config.test.tsharness/tests/provider-lmstudio/discover.test.tsharness/tests/provider-lmstudio/load.test.tsharness/tests/provider-lmstudio/sse.test.tsharness/tests/provider-lmstudio/stream.test.tsharness/tests/provider-lmstudio/wire-messages.test.tsharness/tests/provider-lmstudio/wire-tools.test.tsharness/tests/provider-openai/wire-messages.test.tsharness/tests/turn-orchestrator/agent-trigger.test.tsharness/tests/turn-orchestrator/functions.test.tsharness/tests/turn-orchestrator/provider-router.test.tsharness/tests/turn-orchestrator/system-prompt.test.tsiii-directory/skills/directory/engine/functions/info.mdiii-directory/skills/directory/engine/functions/list.mdiii-directory/skills/directory/engine/registered-triggers/info.mdiii-directory/skills/directory/engine/registered-triggers/list.mdiii-directory/skills/directory/engine/triggers/info.mdiii-directory/skills/directory/engine/triggers/list.mdiii-directory/skills/directory/engine/workers/info.mdiii-directory/skills/directory/engine/workers/list.mdiii-directory/skills/directory/registry/workers/info.mdiii-directory/skills/directory/registry/workers/list.mdiii-directory/skills/directory/skills/download.mdiii-directory/skills/directory/skills/get.mdiii-directory/skills/directory/skills/index.mdiii-directory/skills/directory/skills/list.mdiii-directory/src/fs_source.rsiii-directory/src/functions/skills.rs
| const autoResolvedRef = useRef<Set<string>>(new Set()) | ||
|
|
||
| useEffect(() => { | ||
| autoResolvedRef.current = new Set() | ||
| }, [conversationId]) | ||
|
|
||
| useEffect(() => { | ||
| if (!enabled) return | ||
| if (!resolveApproval) return | ||
| const selection = selectAutoAcceptCandidates(messages, policy) | ||
| if (selection === null) return // hot-path early exit | ||
| if (onDenied) { | ||
| for (const d of selection.deniedByPolicy) { | ||
| onDenied(d.functionId, d.functionCallId) | ||
| } | ||
| } | ||
| const todo = nextApprovalsToAutoResolve(selection.candidates, autoResolvedRef.current) |
There was a problem hiding this comment.
Deduplicate policy-denied callbacks to avoid repeated side effects.
onDenied currently fires every effect pass for the same pending denied approval. This can repeatedly announce/log the same denial while the card remains pending.
♻️ Proposed fix
export function useAutoAcceptApprovals({
@@
}: UseAutoAcceptApprovalsArgs): void {
const autoResolvedRef = useRef<Set<string>>(new Set())
+ const deniedNotifiedRef = useRef<Set<string>>(new Set())
useEffect(() => {
autoResolvedRef.current = new Set()
+ deniedNotifiedRef.current = new Set()
}, [conversationId])
@@
if (selection === null) return // hot-path early exit
if (onDenied) {
for (const d of selection.deniedByPolicy) {
+ if (deniedNotifiedRef.current.has(d.functionCallId)) continue
+ deniedNotifiedRef.current.add(d.functionCallId)
onDenied(d.functionId, d.functionCallId)
}
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@console/web/src/hooks/use-auto-accept-approvals.ts` around lines 61 - 77, The
onDenied callback is being called repeatedly for the same pending denied
approvals; add deduplication using a ref-set (reuse autoResolvedRef or add a new
ref like autoDeniedRef) to track seen denied keys (e.g.
`${functionId}:${functionCallId}`) inside the useEffect that processes
selection.deniedByPolicy so you only call onDenied for entries not already in
the set, then add them to the set after invoking onDenied; keep the existing
reset of the ref on conversationId change so deduplication is per-conversation
and still allow nextApprovalsToAutoResolve/selection.candidates to work as
before.
| case 'message_end': | ||
| if (event.message.role === 'assistant') { | ||
| return [{ kind: 'assistant-end' }] | ||
| } | ||
| return [] | ||
| if (event.message.role !== 'assistant') return [] | ||
| return translateAssistantMessageEnd(event.message) | ||
|
|
There was a problem hiding this comment.
Avoid duplicate terminal assistant-end emission paths.
Line 56/57 now emits assistant-end via translateAssistantMessageEnd, while Line 91/92 still emits assistant-end on agent_end. When both events are present, the UI receives duplicate terminal signals.
Suggested minimal fix
function translateAssistantMessageEnd(message: AssistantMessage): StreamEvent[] {
- const out: StreamEvent[] = [{ kind: 'assistant-end' }]
+ const out: StreamEvent[] = []
const stop = message.stop_reason as
| 'end'
| 'length'
| 'error'
| 'aborted'Also applies to: 91-93, 156-158
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@console/web/src/lib/backend/translate.ts` around lines 55 - 58, The code
emits the terminal "assistant-end" twice via translateAssistantMessageEnd() in
the 'message_end' case and again in the 'agent_end' handling; update the logic
so only one code path emits the terminal signal: pick a single emitter
(preferably translateAssistantMessageEnd for message-based termination) and
remove or guard the duplicate emission in the agent_end branch (or vice versa).
Concretely, modify the agent_end handling to check whether the event stream
already had a message_end/assistant emission (or whether the last message role
was 'assistant') before calling the terminal emit, or move the terminal emission
exclusively into translateAssistantMessageEnd and ensure agent_end no longer
emits assistant-end; adjust corresponding checks in the functions handling
'message_end', 'agent_end' and any code paths referenced at lines around
translateAssistantMessageEnd and agent_end to prevent duplicate terminal
signals.
| | provider-lmstudio | [src/provider-lmstudio/](harness/src/provider-lmstudio/) | LM Studio (localhost) Chat Completions SSE → channel writer. | [workers/provider-lmstudio.md](harness/docs/workers/provider-lmstudio.md) | | ||
| | context-compaction | [src/context-compaction/](harness/src/context-compaction/) | Optional `agent::events` side-car that compacts session history when running token count crosses a threshold. | [workers/context-compaction.md](harness/docs/workers/context-compaction.md) | |
There was a problem hiding this comment.
Keep architecture docs consistent with both new local providers.
These sections now include LM Studio, but llama.cpp is still missing, and the opening worker-count statement is stale. Please update the catalogue/diagrams/count together so this doc remains source-of-truth.
Also applies to: 53-54, 70-71, 209-220
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@harness/docs/architecture.md` around lines 32 - 33, Update the architecture
doc to keep the provider listing and counts in sync: add a new provider entry
for llama.cpp (mirror the format of provider-lmstudio row, e.g., add a
"provider-llamacpp | [src/provider-llamacpp/](harness/src/provider-llamacpp/) |
llama.cpp (localhost) Chat Completions SSE → channel writer. |
[workers/provider-llamacpp.md](harness/docs/workers/provider-llamacpp.md) |"),
update the opening "worker-count" statement and any catalogue/diagram counts to
include both local providers, and update the other affected sections referenced
(lines 53-54, 70-71, 209-220) so the documented worker/service counts and
diagrams match the new provider list; keep formatting consistent with the
existing provider-lmstudio and context-compaction rows and link to the
corresponding workers docs.
| ``` | ||
| LLAMACPP_BASE_URL=http://localhost:8080 | ||
| # or: | ||
| LLAMACPP_BASE_URL=http://localhost:8080/v1/chat/completions | ||
| ``` |
There was a problem hiding this comment.
Add a language tag to the fenced env block.
This block is missing a fence language, which triggers markdownlint (MD040).
Suggested fix
-```
+```bash
LLAMACPP_BASE_URL=http://localhost:8080
# or:
LLAMACPP_BASE_URL=http://localhost:8080/v1/chat/completions</details>
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>
[warning] 34-34: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @harness/docs/workers/provider-llamacpp.md around lines 34 - 38, The fenced
code block containing the environment variable LLAMACPP_BASE_URL is missing a
language tag which triggers markdownlint MD040; update the fenced block around
the lines with LLAMACPP_BASE_URL to include a language identifier (e.g., bash)
so the opening triple-backticks become bash and the closing remains ,
ensuring the block renders and satisfies linting.
</details>
<!-- fingerprinting:phantom:poseidon:hawk -->
<!-- This is an auto-generated comment by CodeRabbit -->
| // Tell the UI we just compacted so it can render a marker and | ||
| // re-estimate context usage. Best-effort: a publish failure must | ||
| // not derail the in-flight turn — the orchestrator is waiting on | ||
| // our return. | ||
| await emitCompactionDone(iii, input.session_id, 'sync', { | ||
| summary_text: result.summary_text, | ||
| tokens_before: result.tokens_before, | ||
| compaction_entry_id: result.compaction_entry_id, | ||
| tail_start_id: result.tail_start_id, | ||
| }); |
There was a problem hiding this comment.
Best-effort compaction event can currently fail the sync compaction path.
Lines 136-141 await emitCompactionDone without isolation. If publish fails, the catch block rethrows and aborts the operation, which contradicts the “must not derail” requirement in the comment.
Proposed fix
- await emitCompactionDone(iii, input.session_id, 'sync', {
- summary_text: result.summary_text,
- tokens_before: result.tokens_before,
- compaction_entry_id: result.compaction_entry_id,
- tail_start_id: result.tail_start_id,
- });
+ try {
+ await emitCompactionDone(iii, input.session_id, 'sync', {
+ summary_text: result.summary_text,
+ tokens_before: result.tokens_before,
+ compaction_entry_id: result.compaction_entry_id,
+ tail_start_id: result.tail_start_id,
+ });
+ } catch (err) {
+ logger.warn('handler-sync: failed to emit compaction_done (non-fatal)', {
+ session_id: input.session_id,
+ err: String(err),
+ });
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Tell the UI we just compacted so it can render a marker and | |
| // re-estimate context usage. Best-effort: a publish failure must | |
| // not derail the in-flight turn — the orchestrator is waiting on | |
| // our return. | |
| await emitCompactionDone(iii, input.session_id, 'sync', { | |
| summary_text: result.summary_text, | |
| tokens_before: result.tokens_before, | |
| compaction_entry_id: result.compaction_entry_id, | |
| tail_start_id: result.tail_start_id, | |
| }); | |
| // Tell the UI we just compacted so it can render a marker and | |
| // re-estimate context usage. Best-effort: a publish failure must | |
| // not derail the in-flight turn — the orchestrator is waiting on | |
| // our return. | |
| try { | |
| await emitCompactionDone(iii, input.session_id, 'sync', { | |
| summary_text: result.summary_text, | |
| tokens_before: result.tokens_before, | |
| compaction_entry_id: result.compaction_entry_id, | |
| tail_start_id: result.tail_start_id, | |
| }); | |
| } catch (err) { | |
| logger.warn('handler-sync: failed to emit compaction_done (non-fatal)', { | |
| session_id: input.session_id, | |
| err: String(err), | |
| }); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@harness/src/context-compaction/handler-sync.ts` around lines 132 - 141, The
emitCompactionDone call is awaited and can throw, which contradicts the
“best-effort” comment; change it so publish failures do not abort the sync
compaction path by isolating the call in its own try/catch (or fire-and-forget
promise) and logging the error instead of rethrowing. Specifically, locate the
emitCompactionDone(...) invocation in handler-sync.ts (inside the sync
compaction flow) and wrap it so any exception is caught and sent to the logger
(or swallowed) while allowing the surrounding function to continue and return
normally.
| yield syntheticErrorEvent( | ||
| 'lmstudio returned a 200 response with a non-SSE body (no parseable chunks). The endpoint URL is likely wrong, or LM Studio served its dashboard page in place of an SSE stream.', | ||
| cfg.model, | ||
| cfg.provider_name, | ||
| ); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Same model id inconsistency.
Line 232 uses cfg.model instead of effectiveModel for the non-SSE body error. For consistency with other error paths in this function, use the resolved model.
Proposed fix
yield syntheticErrorEvent(
'lmstudio returned a 200 response with a non-SSE body (no parseable chunks). The endpoint URL is likely wrong, or LM Studio served its dashboard page in place of an SSE stream.',
- cfg.model,
+ effectiveModel,
cfg.provider_name,
);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| yield syntheticErrorEvent( | |
| 'lmstudio returned a 200 response with a non-SSE body (no parseable chunks). The endpoint URL is likely wrong, or LM Studio served its dashboard page in place of an SSE stream.', | |
| cfg.model, | |
| cfg.provider_name, | |
| ); | |
| return; | |
| } | |
| yield syntheticErrorEvent( | |
| 'lmstudio returned a 200 response with a non-SSE body (no parseable chunks). The endpoint URL is likely wrong, or LM Studio served its dashboard page in place of an SSE stream.', | |
| effectiveModel, | |
| cfg.provider_name, | |
| ); | |
| return; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@harness/src/provider-lmstudio/stream.ts` around lines 230 - 236, The non-SSE
error path calls syntheticErrorEvent with cfg.model but other error paths use
the resolved effectiveModel; update the syntheticErrorEvent invocation (the 200
non-SSE body branch) to pass effectiveModel instead of cfg.model so the reported
model ID matches the resolved model used elsewhere (refer to
syntheticErrorEvent, cfg.model, effectiveModel).
| if (m.role === 'assistant') { | ||
| // Once we cross an assistant boundary BEFORE seeing any | ||
| // pending incoming id we've passed the turn this finalize | ||
| // is writing for — earlier function_result blocks can't be | ||
| // duplicates of `function_results`. | ||
| let unseen = false; | ||
| for (const id of incomingIds) { | ||
| if (!existingResultIds.has(id)) { | ||
| unseen = true; | ||
| break; | ||
| } | ||
| } | ||
| if (!unseen) break; | ||
| } |
There was a problem hiding this comment.
Assistant-boundary break condition is inverted in dedup scan.
The current condition keeps scanning older history when incoming IDs have not been seen yet, which is the common first-run path. That makes the guard degrade to O(history) and can falsely dedupe against older turns if an ID repeats.
Suggested fix
if (m.role === 'assistant') {
// Once we cross an assistant boundary BEFORE seeing any
// pending incoming id we've passed the turn this finalize
// is writing for — earlier function_result blocks can't be
// duplicates of `function_results`.
let unseen = false;
for (const id of incomingIds) {
if (!existingResultIds.has(id)) {
unseen = true;
break;
}
}
- if (!unseen) break;
+ if (unseen) break;
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@harness/src/turn-orchestrator/states/functions.ts` around lines 396 - 409,
The dedup scan in the loop over prior messages uses the wrong break condition:
when encountering an assistant boundary (m.role === 'assistant') you should stop
scanning further only if any incoming ID has NOT been seen yet (i.e., there
exists id in incomingIds that is not in existingResultIds) — the current logic
continues scanning in that common path and can O(history) and mis-dedupe; update
the branch in the assistant-boundary block (the code referencing m.role,
incomingIds, existingResultIds, and unseen) so that you break out of the outer
loop when an unseen id is found (invert the condition) instead of continuing,
ensuring earlier turns are not incorrectly considered duplicates.
| for (const op of ops) { | ||
| if (op.type === 'set') newValue = op.value; | ||
| if (op.type === 'increment') { | ||
| const prev = typeof oldValue === 'number' ? oldValue : 0; | ||
| newValue = prev + (op.by ?? 1); | ||
| } |
There was a problem hiding this comment.
Fix sequential state::update semantics in the test stub.
increment currently reads from the original value each time, so multi-op updates don’t compose correctly.
Proposed fix
- if (op.type === 'increment') {
- const prev = typeof oldValue === 'number' ? oldValue : 0;
- newValue = prev + (op.by ?? 1);
- }
+ if (op.type === 'increment') {
+ const prev = typeof newValue === 'number' ? newValue : 0;
+ newValue = prev + (op.by ?? 1);
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for (const op of ops) { | |
| if (op.type === 'set') newValue = op.value; | |
| if (op.type === 'increment') { | |
| const prev = typeof oldValue === 'number' ? oldValue : 0; | |
| newValue = prev + (op.by ?? 1); | |
| } | |
| for (const op of ops) { | |
| if (op.type === 'set') newValue = op.value; | |
| if (op.type === 'increment') { | |
| const prev = typeof newValue === 'number' ? newValue : 0; | |
| newValue = prev + (op.by ?? 1); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@harness/tests/context-compaction/compaction-done-emit.test.ts` around lines
95 - 100, The test stub computes increments against the original oldValue for
each op, so multiple sequential `increment` ops don't compose; update the loop
in compaction-done-emit.test.ts so increments use the latest computed value
rather than oldValue: initialize newValue to oldValue (or set a current value
variable) before processing ops, and when handling an `increment` op compute
prev from the current/newValue (falling back to 0 only if neither oldValue nor
newValue is a number) so successive increments accumulate correctly; refer to
the loop using vars ops, oldValue, and newValue to locate and change the logic.
| it('dedup is scoped to the pending batch (across-batch repeats stay separate for openai-format too)', () => { | ||
| // OpenAI/LM Studio's wire format is flat (no batched user-msg | ||
| // wrapper like Anthropic), so duplicates here just become a | ||
| // single tool entry regardless of an assistant gap. We still | ||
| // verify the assistant boundary doesn't merge tool entries from | ||
| // different turns into one row. | ||
| const msgs: AgentMessage[] = [ | ||
| mkResult('a', 'r1'), | ||
| { | ||
| role: 'assistant', | ||
| content: [{ type: 'text', text: 'thinking…' }], | ||
| stop_reason: 'end', | ||
| model: 'qwen/qwen3-4b-2507', | ||
| provider: 'lmstudio', | ||
| timestamp: 0, | ||
| }, | ||
| mkResult('a', 'r2'), | ||
| ]; | ||
| const out = toOpenaiMessages(msgs, '') as Array<Record<string, unknown>>; | ||
| const tools = out.filter((m) => m.role === 'tool') as Array< | ||
| { tool_call_id: string; content: string } | ||
| >; | ||
| // Latest-wins: a single tool row carries the most recent body. | ||
| // The assistant message between them survives intact. | ||
| expect(tools).toHaveLength(1); | ||
| expect(tools[0]?.content).toBe('r2'); | ||
| expect(out.filter((m) => m.role === 'assistant')).toHaveLength(1); |
There was a problem hiding this comment.
Align the cross-batch dedup invariant in this test.
The test title/comments say cross-batch repeats should stay separate, but the assertion expects a single merged tool message. Please make the name/comment/assertions agree on one invariant to avoid baking in the wrong behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@harness/tests/provider-lmstudio/wire-messages.test.ts` around lines 239 -
265, Test title/comments currently claim cross-batch repeats stay separate but
the assertions expect a single merged tool; update the test text to reflect the
actual invariant enforced by toOpenaiMessages/mkResult: dedup is scoped to the
pending batch so within-this-batch repeats (even with an assistant turn between)
collapse to one tool entry while cross-batch repeats remain separate—adjust the
it(...) title and inline comments to state "within-batch repeats dedupe;
cross-batch repeats stay separate" so they match the existing expect(...)
checks.
Three CI failures on the initial push: 1. `noControlCharactersInRegex` (Biome lint, harness): the eslint- disable comment on `stream.ts` was ignored by Biome 2.4.10. The `/[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]/g` literal flagged on each escape inside the character class. Replaced with a `stripControlChars` helper that iterates by `charCodeAt` (allows TAB/LF/CR, strips other C0 + DEL). Same fix in `provider-lmstudio/stream.ts` and `provider-llamacpp/stream.ts`. 2. Format drift (Biome format, harness): three test files added in the base WIP commit (`agent-trigger.test.ts`, `functions.test.ts`, `provider-router.test.ts`) and several files we touched for the /review and llamacpp work didn't match `biome format`'s output (long-line wrapping, joined chain calls, single-line expressions). Ran `biome check --write` against `harness/`; biome ci now exits 0. 3. SPA build (console rust lint+test, `tsc -b` step): the `use-auto-accept-approvals.test.ts` "wiring guard" test imported `node:fs/promises` to read its own source as a sanity check. The console TypeScript project doesn't include node types, so `tsc -b` failed with TS2591. Dropped the source-introspection test; the typechecker already catches drift if the hook stops calling the policy-aware selector (its return shape would change). Tests: 789 harness (no change) + 234 console (was 235; dropped the one fs-using guard). `biome ci harness` and `tsc -b` both exit 0. No behavior change. Helper functions and formatting only.
Summary
Four cohesive workstreams on this branch:
to auto-resolve safe approval prompts, with a client-side
defense-in-depth deny filter for state-mutating / destructive /
egress / recursive-dispatch / credential surfaces.
provider-lmstudio— local LM Studio Chat Completionsprovider (load/unload/refresh + chat) on the iii bus.
provider-llamacpp— sibling provider forllama-server(single-model-per-process runtime; OpenAI-compat chat path).
/reviewhardening — security, perf, a11y, and testimprovements across the three above + the orchestrator and
context-compaction handlers.
A small skill-id↔function-id disambiguation pass and a CI
language=javascriptremoval landed alongside in the base WIPcommit; they're not gating but the review thread suggested splitting
them out in follow-ups.
What's in each commit
6c0c8b9c wip: auto-accept toggle, LM Studio provider, gemma routing— initial WIP. Adds
AutoAcceptToggle, the LM Studio provider withload/unload/refresh + auto-load retry, gemma routing intent, the
skill-id disambiguation prompt hints, and
compaction_doneevents.604f3130 fix(chat): defense-in-depth auto-accept policy + extract hook + a11y—
auto-accept-policy.tsdeny filter, extracteduseAutoAcceptApprovalshook (with hot-path early exit),<LiveRegion>+useLiveAnnouncer, AutoAcceptToggle a11y rework(focus-visible,
aria-describedby,aria-disabled, high-contrastON state per WCAG 1.4.3),
handleSubmitdeps cleanup,MessageListreverse-loop,formatStopReasonextraction.9ffef133 fix(harness): security, perf, and test hardening from /review— LM Studio + kimi SSE DoS guard (clamp tool-call
indexto ≤256),errors moved to
error_messageonly (notextContentBlockinjection sink), O(1) Map-based wire-message dedup, parallel
discovery via
Promise.allSettled, loopback-aware auth (omitAuthorizationon non-loopback without an explicit credential),validated
LMSTUDIO_BASE_URLparsing, non-2xx body truncation +control-char strip, bounded
existingResultIdsrebuild in theorchestrator, extracted
emitCompactionDoneshared helper. Testimprovements: fake-timer stream timeout, tightened
error_kindassertion,
abortedFromControllerassertion in load timeout,Anthropic dedup scenarios ported to LM Studio, full coverage of
the new auth / config shapes.
d6157847 feat(harness): add provider-llamacpp + bump default fetch timeout— entire new
harness/src/provider-llamacpp/worker (14 sourcefiles + 8 test files + docs), wiring across router /
auth-credentials / models-catalog / harness index. Default fetch
timeout 30s → 120s with
LLAMACPP_FETCH_TIMEOUT_MSenv override(reads fresh per-call). 35B+ GGUFs over LAN need it.
Diff scope
125 files changed, ~11,900 insertions, ~60 deletions. New worker
(provider-llamacpp) and a sibling design doc / Open Design artifact
under
console/web/DESIGN.mdaccount for most of the volume.Test plan
cd harness && bun run test— 821 passing across 105 files(provider-llamacpp + new auth/catalog seeds + provider-router
widening on top of the 789 we landed at /review time).
cd console/web && bun run test— 235 passing across 17 files(auto-accept-policy regression tests for the
shell::namespacefix,
useAutoAcceptApprovalswiring-guard,format-stop-reason).cd harness && bunx tsc --noEmit— clean.cd console/web && bunx tsc --noEmit -p .— clean.that triggers
shell::fs::reador another safe read. The cardshould auto-resolve with an SR announcement. A
shell::fs::write/
agent::triggershould still require a click.llama-server -m <model.gguf>on :8080,configure the console to point at it (or set
LLAMACPP_BASE_URL),pick the model in the picker, send a message. Tool calls should
flow through. For 35B+ models also set
LLAMACPP_FETCH_TIMEOUT_MS=180000.Space toggles, focus ring is visible in both light and dark
themes.
Follow-ups (out of scope here per /review)
openai-compat-ssekernel extraction acrosskimi/lmstudio/llamacpp (the duplication ratchet got tighter with
llamacpp landing). Belongs in its own PR.
language-removal into separate PRs.
gemmareferences in the diff. Either land the routing or correct the
message before merge.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation