Skip to content

Proj/brv setting#673

Merged
bao-byterover merged 52 commits into
mainfrom
proj/brv-setting
May 20, 2026
Merged

Proj/brv setting#673
bao-byterover merged 52 commits into
mainfrom
proj/brv-setting

Conversation

@bao-byterover
Copy link
Copy Markdown
Collaborator

Summary

  • Problem:
    Users had no unified way to inspect and override daemon/runtime settings, the TUI and Web UI settings surfaces were incomplete, and the CLI wait logic still relied on a legacy wall-clock timeout model.

  • Why it matters:
    Slow local LLMs and long-running tasks need configurable budgets and more reliable task waiting. Without this, users hit hard-to-tune timeouts, fragmented settings UX, and confusing behavior across CLI/TUI/Web UI.

  • What changed:
    Added persistent global settings storage/validation and daemon bootstrap support; introduced brv settings (list/get/set/reset); added grouped settings UX in TUI /settings; added Web UI configuration sections and settings panels; plumbed LLM iteration/request timeout settings through the agent/provider stack; replaced the CLI wall-clock waiter with heartbeat-based task monitoring; soft-deprecated --timeout on brv query, brv curate, and brv dream; refreshed the configuration/connectors Web UI layout and connector artwork.

  • What did NOT change (scope boundary):
    No live-reload of settings was added; changes still require brv restart. No new billing/provider auth flows were introduced. The settings surface remains limited to the registered global keys and does not add per-project overrides.

Type of change

  • Bug fix
  • New feature
  • Refactor (no behavior change)
  • Documentation
  • Test
  • Chore (build, dependencies, CI)

Scope (select all touched areas)

  • TUI / REPL
  • Agent / Tools
  • LLM Providers
  • Server / Daemon
  • Shared (constants, types, transport events)
  • CLI Commands (oclif)
  • Hub / Connectors
  • Cloud Sync
  • CI/CD / Infra

Linked issues

  • Closes # N/A
  • Related ENG-2789, ENG-2790, ENG-2791, ENG-2792, ENG-2793, ENG-2794, ENG-2795, ENG-2796, ENG-2797, ENG-2798, ENG-2799, ENG-2800, ENG-2801, ENG-2802, ENG-2816, ENG-2817, ENG-2818, ENG-2819

Root cause (bug fixes only, otherwise write N/A)

  • Root cause:
    N/A

  • Why this was not caught earlier:
    N/A

Test plan

  • Coverage added:

    • Unit test
    • Integration test
    • Manual verification only
  • Test file(s):
    test/commands/settings/index.test.ts
    test/commands/settings/get.test.ts
    test/commands/settings/set.test.ts
    test/commands/settings/reset.test.ts
    test/unit/infra/storage/file-settings-store.test.ts
    test/unit/infra/storage/settings-validator.test.ts
    test/unit/infra/transport/handlers/settings-handler.test.ts
    test/unit/oclif/lib/task-client-heartbeat.test.ts
    test/unit/oclif/lib/timeout-deprecation.test.ts
    test/unit/agent/settings/agent-settings-snapshot.test.ts
    test/unit/agent/llm/generators/ai-sdk-abort-helper.test.ts
    test/unit/shared/utils/format-duration.test.ts
    test/unit/shared/utils/format-settings.test.ts
    test/unit/tui/features/settings/format-settings.test.ts
    test/unit/webui/features/settings/api/list-settings.test.ts
    test/unit/webui/features/settings/api/reset-setting.test.ts
    test/unit/webui/features/settings/api/set-setting.test.ts
    test/unit/webui/features/settings/stores/restart-banner-store.test.ts

  • Key scenario(s) covered:
    brv settings grouped output, get/set/reset flows, and JSON/text behavior
    Settings persistence, validation, and startup snapshot filtering
    Coupling rule enforcement for llm.requestTimeoutMs <= llm.iterationBudgetMs
    Heartbeat-based task waiting and clean handling of cancelled tasks
    --timeout deprecation messaging
    TUI/Web UI settings formatting and API wiring

User-visible changes

  • New brv settings command group to list, inspect, set, and reset global settings.
  • New TUI /settings page with grouped rows, in-place editing, reset support, and inline validation errors.
  • Web UI Configuration is split into General, Connectors, and Version control, with restart banner and settings panels.
  • New configurable keys for concurrency, LLM loop/request budgets, and task history retention.
  • --timeout on brv query, brv curate, and brv dream is now deprecated and has no effect, but remains accepted for compatibility.
  • CLI task waiting now uses daemon heartbeat activity instead of a fixed wall-clock timeout.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
$ npx mocha --forbid-only test/unit/oclif/lib/timeout-deprecation.test.ts
5 passing (4ms)

$ npx mocha --forbid-only test/commands/settings/index.test.ts
8 passing (55ms)

$ npx mocha --forbid-only test/unit/oclif/lib/task-client-heartbeat.test.ts
6 passing (13ms)

$ npm run typecheck
tsc --noEmit && tsc --noEmit -p src/webui/tsconfig.json
(exit 0)

$ npm run build
vite build ... ✓ built in 5.54s
PWA generateSW complete
(exit 0)

Checklist

  • Tests added or updated and passing (npm test)
  • Lint passes (npm run lint)
  • Type check passes (npm run typecheck)
  • Build succeeds (npm run build)
  • Commits follow Conventional Commits format
  • Documentation updated (if applicable)
  • No breaking changes (or clearly documented above)
  • Branch is up to date with main

Risks and mitigations

  • Risk:
    Users may expect settings changes to apply immediately.
    Mitigation:
    All surfaces explicitly show restart-required messaging and the Web UI adds a restart banner.
  • Risk:
    Existing scripts may still pass --timeout and assume it changes CLI wait behavior.
    Mitigation:
    The flag is preserved for compatibility and now prints a deprecation warning instead of breaking callers.
  • Risk:
    Heartbeat-based waiting changes task-completion behavior for slow or quiet daemon sessions.
    Mitigation:
    Added targeted tests for heartbeat progress tracking and clean cancellation handling.
  • Risk:
    The branch is behind main, so merge conflicts or regressions may appear during rebase.
    Mitigation:
    Rebase or merge main before final review and rerun the full npm test suite afterward.

Introduce FileSettingsStore + SettingsValidator backed by
<BRV_DATA_DIR>/settings.json with atomic temp+rename writes. Schema is
version-tagged and seeded with three initial keys (agentPool.maxSize,
agentPool.maxConcurrentTasksPerProject, taskHistory.maxEntries) whose
defaults reference the existing constants module so a constant change
flows through automatically. Validator centralises type and range
rules so future coupling rules (M2/M3) plug in without touching the
store. Unknown keys and out-of-range values surface as typed errors
that callers map at the transport boundary in M1 T3.
The daemon now reads the settings file once at boot and feeds the
resolved values into AgentPool (maxSize, maxConcurrentTasks) and the
per-project task-history cache (maxEntries). A new bootstrapSettings()
helper drives the wiring as a pure function so the failure paths
(missing file, corrupt JSON, partial validation failures) are
unit-tested without spinning up the daemon. FileSettingsStore gains a
readStartupSnapshot() method (and a tagged readRawValuesOrError() under
the hood) that surfaces parseError plus the list of rejected entries
so the bootstrap can log one warning per problem before falling back
to the registered defaults.

task-history-store-cache adopts a configureTaskHistoryStoreCache()
setter the daemon calls once before the first getStore() lookup so the
maxEntries override is observed by every per-project store the cache
creates afterwards. Existing stores keep their configured retention.
Surfaces (CLI, TUI, WebUI) gain a transport contract for reading and
writing settings without crossing the server import boundary. The
SettingsEvents module declares the four wire events (list, get, set,
reset) and a stable SettingsItemDTO that bundles current/default plus
descriptor metadata (description, range, type, restartRequired) so
clients can render a complete settings page from a single LIST round
trip.

SettingsHandler delegates persistence and validation to the existing
FileSettingsStore and SettingsValidator and maps every validator
exception to a typed structured response carrying key, value, and a
human-readable message. The daemon now shares a single FileSettingsStore
instance between the startup bootstrap and the transport surface so
clients observe the same on-disk state the daemon resolved at boot.
Adds the user-facing CLI surface for the configurable settings feature:
brv settings (list), brv settings list (alias), brv settings get,
brv settings set, brv settings reset. Each command speaks to the
daemon's SettingsEvents over the transport client, so the CLI never
reads or writes settings.json directly.

list renders a fixed-width table with KEY, CURRENT, DEFAULT, and
RESTART columns. set parses its argument as a number when finite so
the daemon validator surfaces range-style errors instead of always
reporting "expected integer"; non-numeric input still flows through
as a string. Validation failures bubble up as exit-code-1 with the
daemon's structured message, and every command supports --format json
with a stable shape suitable for scripting and CI use.
Adds a self-contained Settings section under the Commands collapsible
in README so a new user can discover the brv settings command group
without reading source. The section names the three M1 keys with their
defaults and what each one controls, points at the platform-specific
locations of settings.json plus the BRV_DATA_DIR override, and shows a
copy-paste example session (list, set, get) whose output matches the
CLI's actual format.

Restart-required semantics are stated once, explicitly, with the
rationale that every agent observes the same configuration for the
lifetime of the daemon. Validation-error behaviour (exit code 1 plus
the daemon's message) and the --format json flag are mentioned at the
end so scripting and CI consumers know how to detect failures.
…onfig

The agent's service-initializer now constructs AgentLLMService with a
timeout field sourced from an agent-side settings snapshot. The snapshot
is loaded once during agent bootstrap (alongside the existing
GET_PROJECT_CONFIG / GET_AUTH / GET_PROVIDER_CONFIG fan-out) and cached
for the lifetime of the agent process, so per-session services and the
hot-swap path observe a stable configuration.

The new settings/agent-settings-snapshot module exposes a synchronous
getAgentSettingValue accessor that CipherAgent reads when building
sessionLLMConfig in both start() and refreshProviderConfig(). When the
key is absent (M2 T2 has not registered it yet, or the user has not
overridden the default), the snapshot returns undefined and
AgentLLMService falls back to its existing 10-minute internal default.
M3 will reuse the same loader for llm.requestTimeoutMs.
Adds the agentic-loop wall-clock budget to the settings registry with
default 600000 ms (10 minutes, matches the existing hardcoded
fallback) and a 60_000 to 7_200_000 ms range. The default lives in a
new AGENT_LLM_ITERATION_BUDGET_MS constant in server/constants.ts so a
single source of truth feeds both the registry and the AgentLLMService
internal fallback; the previously hardcoded 600_000 literal in
agent-llm-service.ts is replaced with the constant.

brv settings list, get, set, reset all pick up the key automatically
because the daemon-side handler walks SETTINGS_REGISTRY. The agent-side
snapshot loader from ENG-2794 now sees a real value for the key when a
user has overridden it, so the agentic loop honours the new budget
after a restart. The README's Settings table gains a row plus a
paragraph explaining when to raise the value (slow local LLMs) and
when to lower it (cloud providers, faster failure detection).
A new ai-sdk-abort-helper module exposes withRequestTimeout for
one-shot calls and createAbortContext for the streaming iterator
path. Both keyed on a configurable timeout, both translate a
timer-driven abort into a typed LlmRequestTimeoutError. The retry
layer already classifies that error as retryable via the existing
"timeout" substring in DEFAULT_RETRY_POLICY.retryableErrors, so a
hung Ollama or LM Studio connection now aborts cleanly and retries
instead of consuming the full agentic loop budget.

AiSdkContentGenerator gains a requestTimeoutMs config field that
flows into generateText (via withRequestTimeout) and streamText (via
createAbortContext). When the field is undefined no signal is built
and behaviour matches today. M3 T2 wires the value from the
llm.requestTimeoutMs setting; until then the field is plumbed but
unused in production.
Registers the per-request HTTP timeout setting (default 120000 ms, 2
minutes; range 10000 to 7200000 ms) with the agentic-loop budget as
its coupling partner: SettingsValidator.validateCoupling enforces
llm.requestTimeoutMs <= llm.iterationBudgetMs, and FileSettingsStore
gates writes through both per-key range checks and the coupling rule
so violations like raising requestTimeoutMs above iterationBudgetMs
are rejected with a message naming both keys and their values.

partition() applies the same rule to startup reads: when the on-disk
file violates the coupling rule, both coupled keys are demoted to the
invalid list and the bootstrap warning loop logs one entry per key,
matching the project AC of falling back to defaults for both. The
agent-side snapshot now flows requestTimeoutMs through
SessionLLMConfig and SessionManager into the provider registry, which
hands it to each direct AI SDK provider's AiSdkContentGenerator; the
ByteRover provider keeps its existing transport-layer timeout. The
README's Settings section gains the new key, the coupling rule, and a
three-row preset table covering cloud, fast-GPU local, and CPU local
profiles.
Adds an interactive /settings slash command that lets TUI users view
and edit every registered setting without leaving the REPL. The page
fetches the full list via SettingsEvents.LIST, renders KEY / CURRENT /
DEFAULT / RESTART columns, and lets the user move the cursor with
arrow keys, press Enter to edit a value inline, Esc to abort an edit
or close the page, and R to reset the selected key to its default.

A page-level state machine keeps the React tree consistent across the
browse, edit, and saving phases. validateSettingInput rejects
non-integer or out-of-range entries locally before the round-trip,
and any structured error returned by SettingsHandler renders inline
on the affected row. After any successful save or reset the page
shows the `Run brv restart to apply.` banner so users know the change
is persisted but not yet applied. format-settings.ts plus its unit
tests cover the pure formatter and input validator; the Ink component
itself follows the project's no-render-test convention.
Introduces TaskHeartbeatManager — a per-task liveness ticker the task
router installs at task:started and tears down on the three terminal
events (task:completed, task:error, task:cancelled). The manager
debounces a setTimeout keyed on a 10-second interval so a quiet task
fires task:heartbeat exactly once per quiet window and a noisy task
that keeps forwarding LLM events never produces redundant heartbeats.

The router calls recordActivity inside routeLlmEvent so every
forwarded LlmEvent resets the timer, register inside handleTaskStarted
when the agent confirms the task is running, and recordTermination at
each of the three terminal lifecycle paths plus the agent-disconnect
failTask path and the local-cancel branch. The daemon constructs one
shared manager in brv-server.ts and threads it through
TransportHandlers; the emit callback hands the heartbeat payload to
transport.sendTo plus broadcastToProjectRoom so the same fanout the
router already uses for other task-scoped events delivers the new
event without bespoke plumbing. The CLI watcher consuming this event
lands in ENG-2802.
The legacy --timeout flag drove a CLI wall-clock timer that fired
before the daemon's real iteration budget on slow LLM calls. Removing
the flag would break existing scripts and CI jobs, so this change
keeps it accepted but neutralises it: the flag definition stays, the
help text is now the canonical "(deprecated, no effect, kept for
compatibility)" string, and passing a non-default value prints a
single deprecation warning pointing at llm.iterationBudgetMs as the
replacement.

The value is no longer threaded into waitForTaskCompletion — every
call site now passes DEFAULT_TIMEOUT_SECONDS * 1000 unconditionally so
the wait-for-task wall-clock stops responding to user input. ENG-2802
removes the timer outright once the heartbeat watcher lands. A shared
timeout-deprecation helper centralises the message and the
user-passed-vs-default detection so the three commands stay in sync.
waitForTaskCompletion no longer enforces a wall-clock timeout. The
setTimeout block that emitted "Task timed out after Xs" is gone; the
unused timeoutMs option is dropped from WaitForTaskOptions and from
curate / query / dream call sites. The disconnect-grace path stays
intact and keeps mapping socket death to a retryable
AGENT_DISCONNECTED error.

In its place, a setInterval-driven stale watcher tracks
lastActivityAt: every LLM event, the new TaskEvents.HEARTBEAT
subscription, and the existing task lifecycle events refresh the
timestamp. When 30s elapse without any activity (3x the daemon's
10s heartbeat cadence so a single missed beat is not a false
positive), the CLI rejects with "Daemon is unresponsive on this
task". Slow but progressing tasks ride a continuous stream of
heartbeats and never trip the watcher; truly stuck daemons surface
quickly without the misleading wall-clock message.
M-1 (logic flaw, FileSettingsStore.reset): The previous implementation
ran partition() on the remaining raw values after deleting the target
key, which silently demoted (and then dropped) any OTHER invalid
pre-existing entries — a single `brv settings reset` could wipe the
whole settings file when the user only asked to reset one key. reset
now writes back the raw values verbatim minus the deleted key, so
unrelated invalid entries survive until the startup loader surfaces
them as warnings. SettingsFile.values is retyped as
Record<string, unknown> to reflect that the persisted shape can
legitimately retain such entries.

M-2 (TUI UX gap): SettingsPage gated both useInput handlers on browse
/ edit, leaving the user trapped if a SET write hung. A third handler
active during 'saving' now exits on Esc; the in-flight mutation
resolves in the background.

M-3 (Outside-In, magic strings): The five setting key names were
duplicated as inline string literals across SettingsValidator,
settings-bootstrap, cipher-agent, and the registry itself. A single
SETTINGS_KEYS const in core/domain/entities/settings.ts now feeds all
call sites — a rename is now a compile error everywhere instead of a
silent miss in getAgentSettingValue.

L-1 (CLI heartbeat correctness): The watcher's LLM event handlers
checked `if (!data.taskId) return` (presence only) instead of equality
against the awaited taskId. In multi-task scenarios where the CLI
received broadcast events for other tasks, markActive() falsely
refreshed our stale clock. Handlers now mirror the HEARTBEAT filter
(`data.taskId !== taskId` → return). Added a regression test that
streams events for a different taskId and asserts the watcher still
surfaces the stuck-daemon error.
ENG-2802 (M6 T3): CLI heartbeat watcher missed the task:cancelled
terminal event. A cancel arriving with no following activity would
trip the stale-check and falsely reject with "Daemon is unresponsive
on this task". Add a CANCELLED subscriber that disposes cleanly and
covers the 4th AC path with a unit test.

ENG-2801 (M6 T2): the deprecation message embedded the
llm.iterationBudgetMs setting key. M6 T2 AC explicitly requires the
wording to omit any specific setting key so M6 ships independently
of M1/M2/M3 and survives setting renames. Drop the suffix and lock
the new wording in both the dedicated test and the per-command
deprecation tests.
Pure-utility module at src/shared/utils/format-duration.ts exporting
formatDuration(ms), formatCount(n), and parseDuration(input). Foundation
for M7 T3 (oclif renderers) and M7 T4 (TUI page redesign): both surfaces
will import these helpers so values render identically and human input
syntax stays consistent.

parseDuration accepts s / m / h / ms parts (case-insensitive, multi-part
with optional whitespace) plus bare ms integers for back-compat with
existing scripts. Fractions, unknown units, empty, and malformed inputs
return a discriminated DurationParseError carrying a user-facing hint.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 18, 2026

Claude finished @bao-byterover's task in 8m 2s —— View job


Review — PR #673 (Proj/brv settings + heartbeat-based waiter + WebUI configuration revamp)

Overall score: 4 / 5 — Substantial, well-tested, mostly clean. A few real correctness concerns flagged inline before merge; nothing structurally wrong with the design.

  • Read CLAUDE.md + skim PR body
  • Review settings store/validator + bootstrap
  • Review task-client heartbeat + heartbeat manager
  • Review LLM abort helper + ai-sdk-content-generator
  • Review oclif brv settings commands
  • Review TUI/WebUI settings surfaces
  • Review transport handler + shared events
  • Post inline comments
  • Post summary review

Summary

A large, well-scoped change that adds (a) a registry-driven global settings system with persistence/validation/transport plumbing, (b) heartbeat-based CLI task waiting that replaces fixed wall-clock --timeout, (c) per-request requestTimeoutMs plumbed through every AI SDK provider, and (d) a refreshed WebUI configuration layout. The Outside-In pattern is respected (commands drive the DTOs, not the reverse), the test coverage on the new units is solid, and the import boundaries between oclif/, tui/, webui/, and server/ are preserved.

Strengths

  • praise: SettingsValidator.partition() cleanly separates validation from persistence, with a deliberate "preserve invalid entries on reset" rule that protects the user's file from collateral damage. Coupling rules (requestTimeout ≤ iterationBudget) live there too, not scattered across consumers.
  • praise: The dispatch on descriptor.unit ('ms' | 'count') rather than parsing key suffixes is the right call — adding a non-ms setting won't break formatting.
  • praise: TaskHeartbeatManager is transport-agnostic and unit-testable, with idempotent register() and a debounce-on-activity pattern that avoids redundant emissions on noisy tasks.
  • praise: Settings file path is atomic-write-via-temp (rename), which is the right choice for daemon-shared state. The schema version field gives you a clean migration hook later.
  • praise: Test coverage on the heartbeat watcher is thorough — covers stale-fire, foreign-task filtering, cancellation cleanup, and the disconnect → AGENT_DISCONNECTED retryable path.

Findings (inline comments)

# Severity File Issue
1 issue src/oclif/lib/task-client.ts (cold-start) STALE_THRESHOLD_MS = 30 s is measured from waitForTaskCompletion() return, but task:started (which triggers heartbeat registration) only fires after agent fork + provider/auth/settings RTTs. Cold-start on Windows/AV-throttled hosts can exceed 30 s → false "Daemon is unresponsive" rejection. Regression vs. the old 300 s wall-clock for cold starts.
2 issue src/agent/infra/llm/generators/ai-sdk-content-generator.ts In generateContentStream, the AbortSignal is attached for the entire stream lifetime. A long-form completion that streams chunks steadily for >120 s aborts as LlmRequestTimeoutError, even though data is flowing. Slow-local-LLM users (the exact users this PR targets) will hit this. Consider idle-deadline semantics or document the descriptor as a total stream deadline.
3 issue src/server/infra/storage/file-settings-store.ts set() reads via readOverrides() (silently drops invalid entries) and writes back — destroying any pre-existing invalid sibling entries. reset() uses the raw-merge pattern that preserves them. Asymmetric semantics on the same file.
4 issue (docs) src/oclif/commands/curate/index.ts The pre-existing --timeout 600 example in the examples array contradicts the deprecation introduced by this PR. Same audit needed for dream.ts / query.ts.
5 suggestion src/oclif/lib/timeout-deprecation.ts userValue === defaultValue cannot tell "user passed --timeout 300" from "oclif filled the default." A script pinning the default explicitly never sees the deprecation message.
6 nitpick src/server/infra/daemon/brv-server.ts taskHeartbeatManager never plumbed into ShutdownHandler → no dispose() on graceful shutdown. Cosmetic; Node exit reaps timers.
7 nitpick src/webui/features/settings/components/settings-row.tsx Toast renders "1,800,000 milliseconds" instead of "30m" for ms-typed settings. formatDuration(value) already exists in shared utils.

Smaller observations (not posted inline)

  • note: SETTINGS_REGISTRY allows llm.requestTimeoutMs and llm.iterationBudgetMs to both reach 3 600 000 ms — a user could set requestTimeoutMs = 3 500 000 and the coupling check still passes. Probably fine since the upper bound is a sanity cap rather than a meaningful budget, but worth confirming nothing downstream interprets a 58-minute single-request timeout as "infinity".
  • note: format-duration.ts parseDuration happily accepts unordered parts (30s30m parses fine). Not a bug — additive grammar — but means error messages can't reject obvious typos for free.
  • note: bootstrapSettings log prefix is [settings] while the surrounding daemon log lines use [Daemon]. Cosmetic.
  • note: agent-settings-snapshot.ts correctly isolates per-process module state, but the "cache never refreshes during agent lifetime" contract means a brv settings set followed by brv curate (without restart) silently uses stale values. The restart-required banner across UIs makes this discoverable, so this is fine — just worth noting in tests that future "live reload" work touches both this module and task-history-store-cache.ts (same staleness shape).
  • note: settings-handler.ts LIST re-derives DTOs from SETTINGS_REGISTRY and overlays store.list() results — but store.list() is contractually one-row-per-descriptor, so the ?? descriptor.default fallback at line 50 is dead. Trivial.
  • chore: Branch is behind main (PR body acknowledges); rebase + re-run npm test before merge per your own checklist. The "Tests added or updated and passing (npm test)" box is unchecked — make sure CI confirms green on the merged tip.

Verdict

Strongly recommend addressing #1 (cold-start regression) and #2 (stream-deadline semantics) before merge — both are user-visible regressions for the exact slow-LLM scenario the PR is designed to support. #3 is an inconsistency I'd want resolved either way (decide whether set is destructive). The rest are polish.
· branch proj/brv-setting

Comment thread src/oclif/commands/curate/index.ts
Comment thread src/oclif/lib/task-client.ts
Comment thread src/oclif/lib/timeout-deprecation.ts
Comment thread src/agent/infra/llm/generators/ai-sdk-content-generator.ts
Comment thread src/server/infra/daemon/brv-server.ts
Comment thread src/webui/features/settings/components/settings-row.tsx
Comment thread src/server/infra/storage/file-settings-store.ts
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

Claude finished @bao-byterover's task in 3m 19s —— View job


Review — PR #673 (synchronize follow-up)

Overall score: 4 / 5 — Unchanged from prior review. Branch is now up-to-date with main (✅ one of the previous chore items), but the seven inline findings from the 2026-05-18 review have NOT been addressed in this sync. The only new commits since that review are bd7c365 (ENG-2854 — fork-PR validation gating, unrelated) and c0b6fce (merge from main — CI + unrelated context-tree event).


Status of prior findings

All seven prior inline comments still apply at the exact same lines. None of the source files involved (task-client.ts, ai-sdk-content-generator.ts, ai-sdk-abort-helper.ts, file-settings-store.ts, curate/index.ts, timeout-deprecation.ts, brv-server.ts, settings-row.tsx) were touched in the two commits added since the previous review.

# Severity File / line Verified still present in the latest tip?
1 issue (cold-start regression) src/oclif/lib/task-client.ts:184 lastActivityAt = Date.now() initialized at watcher creation; STALE_THRESHOLD_MS = 30_000; heartbeat manager schedules first heartbeat at intervalMs after register() (task-heartbeat-manager.ts:104-108) — no immediate fire. Daemon task:started still fires only after fork + init pipeline (task-router.ts:1571-1573). Cold-start window unchanged.
2 issue (stream-deadline semantics) src/agent/infra/llm/generators/ai-sdk-content-generator.ts:148 createAbortContext(this.requestTimeoutMs) (ai-sdk-abort-helper.ts:44-64) sets a single setTimeout at construction; nothing inside the for await loop on result.fullStream resets it. Total stream deadline behavior unchanged.
3 issue (asymmetric set/reset semantics) src/server/infra/storage/file-settings-store.ts:110-121 set() still reads via readOverrides() (which strips invalid via partition() at line 129) and writes the result back. reset() (line 89) still uses readRawValues() to preserve invalid siblings. Asymmetry unchanged.
4 issue (stale --timeout example) src/oclif/commands/curate/index.ts:66-67 ✅ Example block # Increase timeout for slow models (in seconds)\n... --timeout 600 still present in the examples array. dream.ts / query.ts examples have no --timeout text (verified clean).
5 suggestion (deprecation gap) src/oclif/lib/timeout-deprecation.ts:37 userValue === defaultValue early-return unchanged. Explicit --timeout 300 invocations still never see the deprecation message.
6 nitpick (resource hygiene) src/server/infra/daemon/brv-server.ts:445-452 taskHeartbeatManager.dispose() is still never called from TransportHandlers.cleanup() (only taskRouter.clearTasks() runs) or ShutdownHandler.
7 nitpick (UX polish) src/webui/features/settings/components/settings-row.tsx:41 toastValue still renders "${formatCount(value)} milliseconds" instead of using formatDuration from shared/utils/format-duration.ts.

No new findings introduced by this sync

The merge commit c0b6fce touched only:

  • .github/workflows/pr-*.yml and CONTRIBUTING.md (CI / docs)
  • packages/byterover-packages submodule bump
  • src/shared/transport/events/context-tree-events.ts and one webui context-tree panel (unrelated to the settings/heartbeat surfaces)

None of those changes interact with the settings, heartbeat, or LLM-abort code paths under review. bd7c365 is workflow-only.

Recommendation

Re-recommend addressing #1 (cold-start regression) and #2 (stream-deadline semantics) before merge — these are the two findings that materially regress the slow-LLM scenario this PR is designed to support. #3 should be resolved either way (decide whether set is destructive to invalid siblings, then document or fix). #4 is a 2-line examples array deletion. #5/#6/#7 are polish and can ship as-is or be batched into a follow-up.

The "Tests added or updated and passing (npm test)" checkbox in the PR body is still unchecked — please confirm CI is green on the merged tip before requesting final approval.
· branch proj/brv-setting

Two regressions surfaced by the PR review on #673:

- CLI task watcher subscribed only to heartbeat/LLM events, none of
  which fire during the agent fork + auth/provider init pipeline.
  Cold starts that exceeded 30s (Windows under AV, slow disks) triggered
  a false "Daemon is unresponsive" rejection. Watcher now also listens
  for task:created (synchronous at T+0), task:ack, and task:started,
  all filtered by taskId so foreign-task broadcasts do not pollute.

- The streaming AbortSignal was a total deadline measured from
  construction; a long-form completion that streamed chunks steadily for
  longer than llm.requestTimeoutMs aborted as LlmRequestTimeoutError
  even while data was flowing. Added AbortContext.recordActivity() that
  resets the timer; the streaming generator loop calls it on every chunk
  so the deadline becomes idle-deadline for streams. Non-streaming
  generateContent uses withRequestTimeout unchanged, keeping the
  total-deadline semantic correct for one-shot Promises.

Note: llm.requestTimeoutMs description in SETTINGS_REGISTRY still reads
"Max wait time per LLM response", which is now ambiguous for streams.
Description update is intentionally NOT in this commit -- needs design
call (update description vs revert to total vs split into two settings).

Tests: 1 cold-start regression test on task-client-heartbeat (the
existing 30s-no-event reject test stays as-is to lock the
no-activity stall behavior); 4 idle-deadline contract tests on
createAbortContext (idle reset, stall still detected, post-fire no-op,
undefined-timeoutMs no-op).
…dline

Adversarial gap analysis on top of 9da640a uncovered four scenarios the
existing tests did not lock down. All new tests fail without the prior
fixes and pass against the current implementation.

task-client heartbeat watcher (+3 cases):
- foreign-task task:created / task:ack / task:started must NOT bump our
  watcher (regression guard for the new lifecycle subscriptions: without
  the taskId filter, a noisy peer task in the same project room would
  keep our watcher alive forever and mask real stalls)
- single task:created bump followed by 30s of silence still triggers
  the stale rejection (proves the bump shifts the deadline but does not
  permanently disable the watcher)
- two concurrent watchers for different taskIds do not bump each other
  on lifecycle events — per-watcher taskId filter isolates them

ai-sdk-abort-helper (+1 case):
- many rapid recordActivity() calls do not leak setTimeout handles;
  the stream loop calls recordActivity on EVERY chunk, so a slow model
  emitting hundreds of chunks per second must not accumulate timers.
  Asserts clock.countTimers() stays at 1 across 500 rapid bumps and
  drops to 0 after cleanup()
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

Claude finished @bao-byterover's task in 4m 3s —— View job


Review — PR #673 (sync — 9da640a + 3f0e239)

Overall score: 4 / 5 — Materially better than the prior sync. The two regressions flagged as blocking before merge (#1 cold-start, #2 stream-deadline) are now properly addressed with focused fixes and good test coverage. The remaining findings (#3#7) carry over unchanged.


What changed in this sync

Two commits on top of the previous review:

Commit Targets prior finding Verdict
9da640a fix: cold-start false-reject and stream idle-deadline for slow LLMs #1, #2 ✅ Both addressed
3f0e239 test: add edge-case coverage for heartbeat watcher and abort idle-deadline follow-up ✅ Useful adversarial tests

Status of each prior finding on the latest tip

# Severity File Status Notes
1 issue (cold-start) src/oclif/lib/task-client.ts:228-246 fixed Watcher now subscribes to task:created, task:ack, and task:started with strict taskId filtering. task:created is emitted synchronously by task-router.ts:938 before any await, so the original "30 s from waitForTaskCompletion() Promise creation with zero events" window is closed. The new 30 s budget is now measured between task:ack and task:started (i.e., the agent fork/boot window), which is the realistic worst-case anyway. New cold-start regression test at test/unit/oclif/lib/task-client-heartbeat.test.ts:325 locks the behavior, plus 3 cross-task-isolation tests confirm peer broadcasts don't accidentally bump our watcher.
2 issue (stream deadline) src/agent/infra/llm/generators/ai-sdk-content-generator.ts:148-169, src/agent/infra/llm/generators/ai-sdk-abort-helper.ts:82-86 fixed createAbortContext().recordActivity() clears and re-arms the underlying setTimeout on every chunk; the streaming for await loop calls it on each event. Non-streaming generateContent still uses withRequestTimeout (total deadline), which is the right semantic for one-shot Promises. Idle-deadline contract is covered by 5 tests in ai-sdk-abort-helper.test.ts including a "500 rapid bumps stays at 1 timer handle" leak guard. Caveat the author already called out in the commit message: SETTINGS_REGISTRY description for llm.requestTimeoutMs still reads "Max wait time per LLM response", which is now ambiguous (idle for streams, total for one-shots). A follow-up should either reword to e.g. "Max idle time during streams / max total time for one-shot completions" or split into two settings.
3 issue (set/reset asymmetry) src/server/infra/storage/file-settings-store.ts:110-121 ❌ unchanged set() still reads via readOverrides() → invalid sibling entries are silently dropped on write. reset() correctly uses readRawValues(). Decide whether set should be destructive to invalid siblings; either fix or document.
4 issue (stale docs) src/oclif/commands/curate/index.ts:66-67 ❌ unchanged # Increase timeout for slow models (in seconds)\n... --timeout 600 is still in the examples array. dream.ts / query.ts remain clean. Two-line delete (or rewrite as brv settings set llm.iterationBudgetMs ... to demonstrate the new path).
5 suggestion (deprecation gap) src/oclif/lib/timeout-deprecation.ts:37 ❌ unchanged userValue === defaultValue early-return still hides the warning from explicit --timeout 300 callers.
6 nitpick (resource hygiene) src/server/infra/daemon/brv-server.ts:445-509, src/server/infra/process/transport-handlers.ts:67-117, src/server/infra/process/task-router.ts:342-350 ❌ unchanged TaskHeartbeatManager has a dispose() method (task-heartbeat-manager.ts:49), but it is never called. Note the stale comment in transport-handlers.ts:67-68 already claims clearTasks() "can dispose its timers" — the contract is documented but not implemented. One-line fix: options.taskHeartbeatManager?.dispose() inside TransportHandlers.cleanup().
7 nitpick (UX polish) src/webui/features/settings/components/settings-row.tsx:41 ❌ unchanged Toast still renders "1,800,000 milliseconds". formatDuration helper exists in shared/utils/format-duration.ts.

New observations on this sync

  • praise: The taskId-filter on task:created / task:ack / task:started is consistently applied in task-client.ts:235-246, and the cross-task-isolation tests (task-client-heartbeat.test.ts:213, plus the "two concurrent watchers" test) explicitly guard against the broadcast-to-project-room behavior in task-router.ts:941-948. That was the exact failure mode I would have flagged on a naïve subscription; nicely pre-empted.
  • praise: The leak-guard test (ai-sdk-abort-helper.test.ts — 500 rapid recordActivity() calls, clock.countTimers() stays at 1) is the right adversarial assertion for the "clear + re-arm on every chunk" idiom. A streaming local model can emit hundreds of chunks/sec; without this test a future refactor could trivially leak.
  • note (carried from author's commit message): The llm.requestTimeoutMs descriptor in SETTINGS_REGISTRY is intentionally left ambiguous in this commit. Worth a follow-up before the next release so users sizing the setting for cold-start latency understand they're now sizing it for max-idle-between-chunks instead.
  • note: The new cold-start budget is task:ack → task:started rather than Promise creation → first event. On extreme Windows-under-AV hosts where agent fork + ESM boot + 4 daemon RTTs (auth/provider/billing/project config) collectively exceed 30 s, you could still see a false-reject. Mitigation: have agent-process.ts emit a single task:ack-style activity ping back to the originating client right after agent:register succeeds but before the slow init pipeline runs. Not blocking — the realistic worst case is well inside 30 s — but worth a Linear ticket if Windows users surface this in telemetry.

Recommendation

Approve once #3 (set/reset asymmetry — real correctness issue) and #4 (2-line docs delete) are addressed. #5/#6/#7 are polish and can ship as-is or be batched into a follow-up. The two blocking findings from the prior reviews are well-resolved with appropriate test coverage; this is a much stronger state than the 2026-05-19 sync.

branch proj/brv-setting ·

@bao-byterover bao-byterover merged commit f14047c into main May 20, 2026
7 checks passed
@bao-byterover bao-byterover deleted the proj/brv-setting branch May 20, 2026 02:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants