Skip to content

Feat/eng 2512#655

Merged
ncnthien merged 12 commits into
proj/rework-llm-billingfrom
feat/ENG-2512
May 14, 2026
Merged

Feat/eng 2512#655
ncnthien merged 12 commits into
proj/rework-llm-billingfrom
feat/ENG-2512

Conversation

@ncnthien
Copy link
Copy Markdown
Collaborator

Summary

  • Problem:
  • Why it matters:
  • What changed:
  • What did NOT change (scope boundary):

Type of change

  • Bug fix
  • New feature
  • Refactor (no behavior change)
  • Documentation
  • Test
  • Chore (build, dependencies, CI)

Scope (select all touched areas)

  • TUI / REPL
  • Agent / Tools
  • LLM Providers
  • Server / Daemon
  • Shared (constants, types, transport events)
  • CLI Commands (oclif)
  • Hub / Connectors
  • Cloud Sync
  • CI/CD / Infra

Linked issues

  • Closes #
  • Related #

Root cause (bug fixes only, otherwise write N/A)

  • Root cause:
  • Why this was not caught earlier:

Test plan

  • Coverage added:
    • Unit test
    • Integration test
    • Manual verification only
  • Test file(s):
  • Key scenario(s) covered:

User-visible changes

List user-visible changes (including defaults, config, or CLI output).
If none, write None.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording

Checklist

  • Tests added or updated and passing (npm test)
  • Lint passes (npm run lint)
  • Type check passes (npm run typecheck)
  • Build succeeds (npm run build)
  • Commits follow Conventional Commits format
  • Documentation updated (if applicable)
  • No breaking changes (or clearly documented above)
  • Branch is up to date with main

Risks and mitigations

List real risks for this PR. If none, write None.

  • Risk:
    • Mitigation:

leehpham and others added 11 commits May 7, 2026 22:16
* feat: [ENG-2602] write semantic frontmatter on synthesized files

Synthesized files (created by `brv dream`'s synthesize operation) wrote
only four frontmatter fields (confidence, sources, synthesized_at, type).
The cogit backend therefore exposed empty title / short_description /
tags in DtoV3MemoryCardResource and the web UI's card-mode display
rendered empty preview slots.

writeSynthesisFile now also emits the seven semantic fields that regular
`brv save` files write (title, summary, tags, related, keywords,
createdAt, updatedAt). The four synthesis markers stay at the top of the
frontmatter in their existing on-disk order so re-generating an old
synthesized file does not produce a mechanical reorder diff. The seven
new fields are appended in the same order generateFrontmatter uses in
markdown-writer.ts so the on-disk shape matches regular files.

The synthesis LLM prompt and SynthesisCandidateSchema are extended so
the model produces meaningful summary / tags / keywords as part of the
synthesis JSON; related is left empty since synthesized files use
`sources` for provenance.

Forward-looking only — existing synthesized files keep their four-field
shape until manually regenerated. `needsReview` gating (confidence < 0.7)
is unchanged.

* fix: [ENG-2602] align YAML flow style across dream writers

Synthesized files (`synthesize.ts`) and consolidate's `addFrontmatterFields`
helper produced block-style array frontmatter (`tags:\n  - a\n  - b`)
instead of the flow style (`tags: [a, b, c]`) that `markdown-writer.ts`
uses for regular `brv curate` output. Result: three on-disk dialects in
the same context tree — regular files vs synthesized files vs any file
consolidate had rewritten.

- synthesize.ts: pass `flowLevel: 1` to yamlDump so newly synthesized
  files emit flow-style arrays matching markdown-writer.
- consolidate.ts addFrontmatterFields: drop both yamlDump sites from
  flowLevel: 2 to flowLevel: 1 so TEMPORAL_UPDATE / MERGE rewrites stop
  reflowing brackets into block style on every consolidate pass.

New regression test asserts a TEMPORAL_UPDATE round-trip preserves
flow-style arrays.

* refactor: [ENG-2602] address review-agent feedback on synthesize schema and writer

- SynthesisCandidateSchema: add max() bounds (summary 500, tags 8,
  keywords 15) so a misbehaving model cannot land oversized text or
  unbounded arrays directly in YAML frontmatter and on the card. Bounds
  sit slightly above the prompt's soft targets to avoid false rejects.
- writeSynthesisFile: normalize tags to lowercase kebab-case in code
  rather than trusting the LLM to honor the prompt rule, so card chips
  and BM25 search see consistent labels.
- synthesize.test.ts: pin flow-style array output with a
  `^tags: \[/m` regex assertion so a future flowLevel revert in the
  writer fails the synthesis test directly, not only the consolidate
  test.
#618)

* fix: [ENG-2676] strip derived-artifact paths from related: frontmatter

`*.abstract.md` and `*.overview.md` (and other derived artifacts) are
excluded from the push payload via isExcludedFromSync(). When they
landed in a doc's `related:` field, the parent doc shipped to remote
with dangling references to siblings that were never sent.

Two write paths could produce them; both now filter:

- consolidate.ts `addRelatedLinks` (dream's CROSS_REFERENCE) filters
  incoming AND pre-existing entries, so each touch opportunistically
  cleans up legacy dangling refs.
- curate-tool.ts `filterValidFiles` drops excluded paths from
  `relations` before MarkdownWriter writes them — covers both ADD and
  UPDATE since both call sites share the helper.

Pre-existing dangling refs in untouched docs are out of scope; a
separate migration can scrub them.

* fix: [ENG-2676] address PR #618 review — filter existing relations on UPDATE

PR #618 review flagged that the curate UPDATE path could still leak
derived-artifact entries into `related:` because `existingParsed.relations`
flowed unfiltered into both `detectStructuralLoss` (false-positive
`lostRelations`, bumping impact to high) and `resolveStructuralLoss`
(union-merging the legacy bad data back into the final write).

Filter `existingParsed.relations` through isExcludedFromSync immediately
after parse so all downstream conflict logic sees a clean view. Adds an
UPDATE-path test that pre-seeds a stale `.abstract.md` entry and asserts
it is stripped.

Also fixes the reviewer nit on consolidate.ts:681 — early-return when
`incoming` is empty and there's no existing frontmatter, so we don't
wrap an otherwise-clean file in `---\nrelated: []\n---\n`.

* fix: [ENG-2676] address PR #618 follow-up — tighten 4 non-blocking items

Round 2 of review-driven cleanup. None blocking, all small.

- Filter executeCrossReference target files: skip derived-artifact paths
  in `action.files` so addRelatedLinks never writes `related:` onto a
  `.abstract.md` / `.overview.md` if the LLM ever proposes one.
- addRelatedLinks: don't introduce `related: []` into a file whose
  frontmatter never had the key, when the merged set is empty.
- Tighten existingParsed.relations guard from truthy to length-aware so
  the no-op filter is actually skipped on empty arrays.
- UPDATE relations test: assert `impact === 'low'` to lock in the
  no-false-positive structural-loss-elevation invariant. Also retains
  the legitimate sibling in proposed so the only "lost" entry would be
  the filtered derived one — the assertion now genuinely exercises the
  filter's effect on conflict-detector counting.
* feat: [ENG-2486] Extend transport schemas with provider/model (#554)

* feat: [ENG-2487] TaskRouter stamps provider + model on task:create (#555)

* feat: [ENG-2488] Wire resolveActiveProvider in daemon bootstrap (#556)

* feat: [ENG-2491] Domain entity + store interface (Level 2 schema) (#559)

* feat: [ENG-2492] FileTaskHistoryStore: Index + Data API (basic save/get/list)

* feat: [ENG-2498] Transport schemas (pagination, get, delete events) (#560)

* feat: [ENG-2495] FileTaskHistoryStore: Delete + Clear API (#561)

* feat: [ENG-2494] FileTaskHistoryStore: Stale recovery on read (#562)

* feat: [ENG-2496] Daemon llmservice accumulator + real-time TaskHistoryHook (#565)

* feat: [ENG-2497] Daemon bootstrap wiring + per-project store factory + startup audit (#567)

* feat: [ENG-2499] Daemon handlers (list paginated, get, delete, bulk, clear) + task:deleted broadcast (#568)

* feat: [ENG-2493] FileTaskHistoryStore: Prune + Compaction (#569)

* Feat/eng 2515 (#575)

* feat: [ENG-2515] Task history persistence — correctness fixes before WebUI ships

* fix: [ENG-2515] retention design + WebUI provider/model payload

Two correctness fixes surfaced while preparing the M2.10–M2.13 WebUI
handover:

1. Disable age-based task history retention by default. Age prune (30
   days) was a design mistake: it silently deletes user tasks even when
   the count cap (1000) has plenty of room left. Task history is a
   business artifact for audit/review, not a log — count-based rotation
   is the correct retention policy. Set the default to 0; deployments
   that want time-based eviction can still opt in via the constructor
   option.

2. Expose provider/model on the WebUI task:created subscriber. Daemon
   stamps both fields on every task:create (M1) but the local
   TaskCreatedPayload omitted them, so the chip M1.04 builds on would
   have no source. Extend the payload type and forward both fields into
   upsertStatus.

* Feat/eng 2515 (#576)

* feat: [ENG-2515] Task history persistence — correctness fixes before WebUI ships

* fix: [ENG-2515] retention design + WebUI provider/model payload

Two correctness fixes surfaced while preparing the M2.10–M2.13 WebUI
handover:

1. Disable age-based task history retention by default. Age prune (30
   days) was a design mistake: it silently deletes user tasks even when
   the count cap (1000) has plenty of room left. Task history is a
   business artifact for audit/review, not a log — count-based rotation
   is the correct retention policy. Set the default to 0; deployments
   that want time-based eviction can still opt in via the constructor
   option.

2. Expose provider/model on the WebUI task:created subscriber. Daemon
   stamps both fields on every task:create (M1) but the local
   TaskCreatedPayload omitted them, so the chip M1.04 builds on would
   have no source. Extend the payload type and forward both fields into
   upsertStatus.

* fix: [ENG-2515] eliminate task-history test flake — close cache-invalidation race and add deterministic flush API

Two pathologies surfaced as ~13% flake on local and similar on CI:

1. Cache-invalidation race in production code. A firePrune pass that
   started before a save() could finish AFTER the save's
   indexCache = undefined, overwriting the invalidation with a snapshot
   missing the just-saved row. Symptom: list(), clear(), deleteMany()
   returning N-1 entries after N awaited saves. Fix: bump indexEpoch on
   every index write (save / tombstone / recovery / compaction);
   doReadIndexDedup samples the epoch at the start and skips its
   setCache step if the epoch advanced. Also clear indexDedupInFlight
   on every write so a list() cannot pick up a pre-write in-flight
   promise.

2. Timing-based polling in tests. waitForPruneToSettle polled at 80 x
   4ms with a 4-stable-check exit, which under CI parallel-worker load
   missed the setTimeout(0) macrotask window. Fix: expose
   flushPendingOperations() — each firePrune wraps its setTimeout in a
   promise assigned to pruneChain; the public method awaits the chain
   (loops because pruneRequested re-extends it) and drains
   operationLock. Test callers replace polling with this single
   await store.flushPendingOperations().

Plus: tempDir entropy upgraded from Math.random() to randomUUID() in
two test files to remove a residual collision risk in CI parallel runs.

Verification: a new race-regression test
(file-task-history-store-cache-race.test.ts) reproduces the race at
~27% rate on production code and drops to 0% post-fix; running each
affected test file 30x consecutively yields 120/120 pass; full suite
npm test reports 7103 passing, 0 failing.

* fix: [ENG-2515] FileTaskHistoryStore — chunk tombstone appends under 4 KB to keep concurrent saves from corrupting JSON lines

Large clear() / deleteMany() batches previously issued a single appendFile
of all tombstones joined. POSIX guarantees nothing about regular-file write
atomicity (PIPE_BUF applies to FIFOs/sockets only). On filesystems that
don't serialize appends per inode, an unlocked concurrent save() can
interleave content into the middle of a multi-line tombstone write,
corrupting a JSON line. The line is then silently skipped by
IndexLineSchema.safeParse, so the tombstone is "missing" from the dedup
map even though the data file was already unlinked → list() returns a
ghost row whose getById() returns undefined.

Fix: introduce chunkLinesByBytes (exported for direct unit-test) and
issue tombstone appends in chunks under MAX_APPEND_CHUNK_BYTES (3.5 KB).
Each individual appendFile call stays under the 4 KB page boundary so
the kernel page-cache write path treats it as a single sector-level
write on common filesystems. Chunks are sequential under the existing
withOperationLock so concurrent saves can only land BETWEEN chunks,
never WITHIN one.

Tests: chunker unit test covers boundary + oversized-line + 200-tombstone
realistic case; integration test covers (a) clear() with 200 entries
removing all tombstones + data files, (b) every appendFile chunk well-formed
JSON post-clear, (c) clear() interleaved with 50 concurrent fresh saves
keeping the index parseable.

* refactor: [ENG-2515] move TaskHistoryEntry into shared/ + runtime layer-isolation test

The persisted-entry shape lived in src/server/core/domain/entities/. The
WebUI Tasks tab needs the same TS type (TaskGetResponse, stored detail),
which forced a webui→server import — inverting the boundary the ESLint
rule on tui/ already enforces.

Move the TS shape (TaskHistoryEntry, TaskHistoryEntryBase, TaskErrorData,
TaskHistoryStatus, TASK_HISTORY_SCHEMA_VERSION) into
src/shared/transport/events/task-events.ts. The Zod schema stays in
src/server/core/domain/entities/task-history-entry.ts and now carries
`satisfies z.ZodType<TaskHistoryEntry>`, so any drift between the runtime
schema and the type is a typecheck error. The entity file re-exports the
type so existing server-side imports continue to compile unchanged.

test/unit/shared/layer-isolation.test.ts: runtime walk over src/shared/
asserting no `from '...server/...'` (or other higher layers) appears in
any .ts/.tsx file. Catches regressions even if a future ESLint rule is
bypassed.

Incidental: src/server/infra/process/task-router.ts picks up formatting
churn from the prettier/eslint-fix pass on the same change set (line
wrapping only, no logic change).

* feat: [ENG-2490] add provider chip column to WebUI task list

* feat: [ENG-2500] wire WebUI delete + bulk delete + clear-completed mutations

* feat: [ENG-2501] render task detail from disk via task:get for cold tasks

* feat: [ENG-2503] dim styling + tooltip for daemon-interrupted task entries

* feat: [ENG-2490] cover (provider, '') edge case in formatProviderModel

* feat: [ENG-2500] keep selection on bulk-delete failure + cover Delete failed fallback

* feat: [ENG-2501] gate task:get on live status + pin enabled + infinite staleTime

* feat: [ENG-2503] tighten isInterrupted status type + cover precedence

* feat: [ENG-2503] use design-system Tooltip for interrupted-row hint

* feat: [ENG-2539] M2.16 task:list server-side filter/search + numbered… (#594)

* feat: [ENG-2539] M2.16 task:list server-side filter/search + numbered pagination

Schema: drop cursor (before/limit/nextCursor), add page/pageSize/total/pageCount
+ searchText, provider[], model[], createdAfter/Before, minDuration/maxDurationMs.
Response adds counts (Model A: matches current filter scope), availableProviders,
availableModels (history-derived, exclude pivots so dropdowns dont shrink).

Handler: 2-pass search. Pass-1 over index summary (content + error.message).
Pass-2 lazy-cracks data file via store.getById for completed tasks where full
result text might match. In-memory tasks read task.result directly (no I/O).
Race-safe: getById errors swallowed with log-once-per-query.

Store: ITaskHistoryStore.list rename after/before -> createdAfter/createdBefore,
add provider/model coarse pre-filter, drop limit (handler paginates).

Tests: 5 new test files updated. New cases cover all 7 filter dims, AND
combination, pass-2 happy path (full result match past byte 1500), pass-2
in-memory path, file-race swallow, counts Model A invariant counts.all === total,
availableProviders/Models exclude pivots + symmetric empty-string guard.

* fix: [ENG-2539] M2.16 PR review fixes

Address all 8 inline comments from PR #594:

- Docs: counts is Model A post-filter, counts.all === total invariant
  (was incorrectly described as pre-status-filter)
- Docs: page is echoed back as-sent without upper-bound clamp against
  pageCount; documented contract on JSDoc
- Interface: document why provider/model/status options remain on
  ITaskHistoryStore.list (direct-caller use; handler bypasses pivots)
- mapBounded signature: explicit fn returns R | undefined
- handleTaskList: refactor 2-pass to single-pass over candidatesNoSearch;
  derive merged inline; pass-2 reuses the map (was 2xN, now 1xN)
- Comment: clarify pass2Filter spread auto-inherits new non-pivot dims
- Schemas: bind via satisfies z.ZodType<X> to task-events.ts interfaces;
  re-export TaskListRequest/Response/Counts/AvailableModel from shared so
  the interface is single source of truth (compile-time drift prevention)
- Tests: add maxDurationMs upper-bound + no-store fallback coverage

* feat: [ENG-2502] paginate task list with useInfiniteQuery + Load more table row

* feat: [ENG-2502] keyboard-accessible Load more row + indent describe blocks

* feat: [ENG-2547] M2.17 WebUI numbered pagination + Filter dropdown + active filter tags

* feat: [ENG-2547] address bot review: memo serverStatus + closeTask, dedup useGetProviders + isDurationPreset, memo tags

* feat: [ENG-2610] M2.18 pagination polish + default page size 20 + friendly provider names

* feat: [ENG-2610] memoize providerNames against providersResponse, not derived array

---------

Co-authored-by: ncnthien <nhatthien185@gmail.com>
* chore: Bump veversion to 3.12.0

* chore: Update CLAUDE.md

* chore: Update CHANGELOG.md
- Drop clawhub preflight; install skill via `npx clawhub@latest`
- Auto-install brv CLI via install.sh when missing instead of erroring
- Remove provider preflight check; surface "connect a provider" as a
  post-install next step in print_success
- Remove Daily Knowledge Mining cron feature (configure_daily_mining,
  check_cron_exists, find_cron_job_id, remove_cron_job)
- Remove ByteRover Onboarding Plugin feature (file generation, config
  enable, --reset-onboarding flag, reset_onboarding)
- Make @byterover/byterover plugin install failure fatal (was warn)
- Drop unbound $ONBOARDED_MARKER reference in print_success that
  crashed every install under `set -eu`
- Remove leftover onboarding dead code: ONBOARDING_PLUGIN_DIR const,
  enable_onboarding_plugin_in_config, disable_onboarding_plugin_in_config,
  remove_onboarding_plugin_files, and the byterover-onboarding grep in
  ensure_plugin_active
- Drop orphaned "--- Onboarding Options ---" header in main()
- Announce the install.sh URL before piping to sh for auditability
- Remove stale "Daily Knowledge Mining" header comment
- Add trailing newline at EOF
feat: [ENG-2764] simplify openclaw-setup installer flow
* fix: [ENG-2221] disambiguate shared rules files via footer agent name

When Amp / Codex / OpenCode share AGENTS.md (and Claude Code / OpenClaude
share CLAUDE.md), the rules connector's status check previously reported
installed:true for every agent that maps to the file as soon as the BRV
markers were present. The user only installed once, but `brv connectors
list` showed up to three rules entries.

Read the existing `Generated by ByteRover CLI for X` footer that the
template service already writes inside the BRV section, and require the
extracted agent name to match the agent being queried. Footer-less
files (legacy installs predating the footer) keep today's behavior so
existing users are not surprised.

The disambiguation lives only in the rules connector. The MCP, hook,
and skill connectors do not have a shared-file collision today: the only
manual-mode MCP that reads AGENTS.md content is OpenCode, and no other
agent overlaps with it on that path. The new helper
`extractInstalledAgentFromBrvSection` is generic and can be reused if
that situation changes.

* refactor: [ENG-2221] address review feedback on tag boundary and test descriptions

- Tighten `BRV_RULE_TAG` matching to require the trailing space delimiter
  so a malformed footer like `Generated by ByteRover CLIforXxx` no longer
  matches. Add a unit test pinning the new boundary.
- Rename five "returns null" test cases to "returns undefined" to match
  the helper's actual return type and the project's nullish convention.

---------

Co-authored-by: bao-byterover <bao@byterover.dev>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 14, 2026

Claude finished @ncnthien's task in 6m 32s —— View job


Claude Code Review — PR #655

Overall score: ⭐⭐⭐⭐ 4 / 5 — Solid, well-architected feature with comprehensive coverage; a few non-blocking refinement opportunities flagged inline.

Todo

  • Read repo CLAUDE.md and survey PR diff
  • Review domain entities + transport schemas
  • Review FileTaskHistoryStore + chunk/race/tombstone tests
  • Review TaskHistoryHook / cache / entry-builder
  • Review task-router additions (M2.16 pagination, M2.09 wire handlers)
  • Review brv-server.ts startup wiring
  • Review connectors (rules-connector, footer agent disambiguation)
  • Review webui APIs / components / hooks
  • Review dream operations (consolidate, synthesize)
  • Review curate-tool derived-artifact filter
  • Post inline comments + summary

What I checked

This PR layers a per-project task history journal (Level 1 index + Level 2 detail), the wire handlers / Web-UI surface, dream/curate derived-artifact filtering, and a connector-footer fix for shared rule files. The bulk is FileTaskHistoryStore (1042 lines, 1700+ test lines), TaskRouter extensions (~800 added), and the Web UI Tasks tab refactor.

Correctness

  • FileTaskHistoryStore race-handling is genuinely thoughtful — the indexEpoch monotonic counter, the C0 daemon-startup gate (lastSavedAt < daemonStartedAt), and the recoverPreRenameSaves re-read after compaction are exactly the right invariants. The test suite covers B2 / C1 / N2 races by name.
  • TASK_ID_PATTERN = /^[\w-]+$/ is applied before every dataPath(), blocking path traversal at the boundary.
  • handleTaskDelete.removed cleanly distinguishes real deletion from idempotent no-op, fixing the C4 inflate-on-bulk concern.
  • handleTaskSkippedByPreCheck correctly avoids agentPool.notifyTaskCompleted and the onTaskCompleted hooks for pre-dispatch skips — this avoids the activeTasks-counter undercount that would have let drainQueue over-dispatch.

Security

  • Transport schemas are .strict() where appropriate (TaskListRequestSchema, TaskListResponseSchema). taskIds are pattern-checked before any disk path is constructed.
  • Discriminated union on TaskHistoryEntry with schemaVersion literal catches on-disk drift at the parse boundary.
  • extractInstalledAgentFromBrvSection correctly slices only inside BRV_RULE_MARKERS, so user content outside the section can't spoof the footer.

Code quality

  • Strict TDD shows — nearly every new module ships with a sibling test file, including failure-path coverage (file-task-history-store-cache-race.test.ts, file-task-history-store-tombstone-chunking.test.ts, task-history-startup-audit.test.ts).
  • shared/ layer-isolation test (test/unit/shared/layer-isolation.test.ts) enforces the foundation rule from CLAUDE.md — nice touch.
  • The Outside-In flow is visible: TaskInfoTaskHistoryEntry is driven by what the Web UI actually consumes (Sheet detail view, paginated list); no speculative entity types.

Inline findings (all non-blocking)

  1. task-history-store-cache.ts — module-scoped stores / auditedProjects maps grow unbounded; consider LRU/TTL eviction.
  2. task-router.ts:482accumulateLlmEvent CHUNK reasoning subtly preserves isThinking: true when content is appended to a non-empty thinking marker, diverging from the TUI store's behavior.
  3. file-task-history-store.ts:379save() writes pretty-printed JSON of the full entry on every 100 ms throttle; minor at typical sizes, worth tuning for long streamed tasks.
  4. use-task-filter-params.ts:123 — no client-side upper clamp on pageSize parsed from the URL; server clamps to 1000 but the FE state can hold an arbitrary number.
  5. task-router.ts:1264handleTaskList is a 220-line method with disabled complexity rule; suggestion to split into mergeCandidates / runPass2Search / derivePivots helpers for testability.

What I did NOT review in depth

  • package-lock.json changes — assumed safe (typecheck/lint/test gates cover this).
  • scripts/openclaw-setup.sh (large deletion from prior merge of feat: [ENG-2764] simplify openclaw-setup installer flow #635, not part of ENG-2512 scope).
  • WebUI styling details (Tailwind class strings, design fidelity) — assumed visually verified in browser.

• branch feat/ENG-2512

Comment thread src/server/infra/process/task-history-store-cache.ts
Comment thread src/server/infra/process/task-router.ts
Comment thread src/server/infra/storage/file-task-history-store.ts
Comment thread src/webui/features/tasks/hooks/use-task-filter-params.ts
Comment thread src/server/infra/process/task-router.ts
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 14, 2026

Claude finished @ncnthien's task in 3m 31s —— View job


Claude Code Review — PR #655 (re-run)

Overall score: ⭐⭐⭐⭐ 4 / 5 — Solid, well-architected feature with comprehensive coverage; a few non-blocking refinement opportunities flagged inline.

Todo

  • Survey diff vs origin/proj/rework-llm-billing
  • Re-check FileTaskHistoryStore (race + chunking + tombstones)
  • Re-check TaskRouter (accumulator, pagination, lifecycle hooks)
  • Re-check task-history-hook / task-history-store-cache
  • Re-check dream / curate derived-artifact filter
  • Re-check connectors (rules-connector, footer agent disambiguation)
  • Re-check Web UI Tasks tab (filter params, pagination, table)
  • Confirm prior inline findings vs current branch state

What changed since the prior review

The only commits added on top of the previously reviewed state are two upstream merges:

No new behavioral or schema changes were introduced on the ENG-2512 line itself between the two reviews. Re-scanning the touched modules confirms the same findings still apply, all non-blocking.

What I checked (re-confirmation)

Correctness

  • FileTaskHistoryStoreindexEpoch monotonic counter, the C0 daemon-startup gate, and recoverPreRenameSaves post-rewrite re-read remain in place at file-task-history-store.ts:202, :238, :758. Test names still cover B2/C1/N2 races.
  • TASK_ID_PATTERN = /^[\w-]+$/ enforced at every disk-path boundary (file-task-history-store.ts:93, applied in save/delete/getById).
  • handleTaskDelete.removed still distinguishes real deletion from idempotent no-op (task-router.ts:1106-1113), so deleteBulk.deletedCount cannot inflate on unknown ids.
  • handleTaskSkippedByPreCheck still bypasses agentPool.notifyTaskCompleted and onTaskCompleted hooks for pre-dispatch skips — activeTasks counter stays correct.
  • flushTimer.unref?.() (task-router.ts:1571) keeps a pending throttle flush from blocking daemon shutdown.

Security

  • Transport schemas remain .strict() for TaskListRequest/Response.
  • extractInstalledAgentFromBrvSection only slices inside BRV_RULE_MARKERS (shared/constants.ts:15-20), so user content outside the markers can't spoof the footer.
  • Discriminated union on TaskHistoryEntrySchema (task-history-entry.ts:85-105) still rejects on-disk drift at parse time.

Code quality

  • Strict TDD coverage holds — every new module has a sibling test.
  • shared/ layer-isolation test still enforces foundation-only imports.
  • buildTaskHistoryEntryCandidate continues to dedupe what was previously duplicated between task-router.handleTaskGet and TaskHistoryHook.persist.

Inline findings (still applicable, non-blocking)

The four inline comments posted on the prior pass still hold against the current tree:

  1. task-history-store-cache.ts:22 — module-scoped stores / auditedProjects maps grow unbounded for the daemon's lifetime; LRU/TTL eviction or ProjectRegistry.onProjectClosed wiring is a worthwhile follow-up. Fix this →
  2. task-router.ts:497accumulateLlmEvent CHUNK Case B appends body content to a non-empty {isThinking: true} marker while preserving isThinking: true, diverging from the TUI store (tui/features/tasks/stores/tasks-store.ts) which unconditionally flips to isThinking: false on chunk arrival.
  3. file-task-history-store.ts:379save() writes pretty-printed JSON (null, 2) of the FULL entry on every 100ms throttle. For long streamed tasks this is O(N²) cumulative bytes. Dropping pretty-print is a free 2× win; skipping the data-file write entirely for in-flight created/started and persisting only on terminal status is a larger win the throttle currently bounds. Fix this →
  4. use-task-filter-params.ts:123pageSize parsed from URL has no upper clamp; server clamps to 1000 (task-router.ts:1479) but the FE state can hold an arbitrary number from ?pageSize=99999, polluting the React Query cache key.
  5. task-router.ts:1265handleTaskList is a 220-line method with eslint-disable-next-line complexity; extracting mergeCandidates / runPass2Search / derivePivots helpers would let each get unit-tested in isolation rather than only via end-to-end task-router-extended.test.ts.

What I did NOT review in depth


— branch feat/ENG-2512

@ncnthien ncnthien merged commit b7660fb into proj/rework-llm-billing May 14, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants