Feat/eng 2515 by bao-byterover · Pull Request #576 · campfirein/byterover-cli

bao-byterover · 2026-04-29T00:59:43Z

No description provided.

…WebUI ships

Two correctness fixes surfaced while preparing the M2.10–M2.13 WebUI handover: 1. Disable age-based task history retention by default. Age prune (30 days) was a design mistake: it silently deletes user tasks even when the count cap (1000) has plenty of room left. Task history is a business artifact for audit/review, not a log — count-based rotation is the correct retention policy. Set the default to 0; deployments that want time-based eviction can still opt in via the constructor option. 2. Expose provider/model on the WebUI task:created subscriber. Daemon stamps both fields on every task:create (M1) but the local TaskCreatedPayload omitted them, so the chip M1.04 builds on would have no source. Extend the payload type and forward both fields into upsertStatus.

…idation race and add deterministic flush API Two pathologies surfaced as ~13% flake on local and similar on CI: 1. Cache-invalidation race in production code. A firePrune pass that started before a save() could finish AFTER the save's indexCache = undefined, overwriting the invalidation with a snapshot missing the just-saved row. Symptom: list(), clear(), deleteMany() returning N-1 entries after N awaited saves. Fix: bump indexEpoch on every index write (save / tombstone / recovery / compaction); doReadIndexDedup samples the epoch at the start and skips its setCache step if the epoch advanced. Also clear indexDedupInFlight on every write so a list() cannot pick up a pre-write in-flight promise. 2. Timing-based polling in tests. waitForPruneToSettle polled at 80 x 4ms with a 4-stable-check exit, which under CI parallel-worker load missed the setTimeout(0) macrotask window. Fix: expose flushPendingOperations() — each firePrune wraps its setTimeout in a promise assigned to pruneChain; the public method awaits the chain (loops because pruneRequested re-extends it) and drains operationLock. Test callers replace polling with this single await store.flushPendingOperations(). Plus: tempDir entropy upgraded from Math.random() to randomUUID() in two test files to remove a residual collision risk in CI parallel runs. Verification: a new race-regression test (file-task-history-store-cache-race.test.ts) reproduces the race at ~27% rate on production code and drops to 0% post-fix; running each affected test file 30x consecutively yields 120/120 pass; full suite npm test reports 7103 passing, 0 failing.

…4 KB to keep concurrent saves from corrupting JSON lines Large clear() / deleteMany() batches previously issued a single appendFile of all tombstones joined. POSIX guarantees nothing about regular-file write atomicity (PIPE_BUF applies to FIFOs/sockets only). On filesystems that don't serialize appends per inode, an unlocked concurrent save() can interleave content into the middle of a multi-line tombstone write, corrupting a JSON line. The line is then silently skipped by IndexLineSchema.safeParse, so the tombstone is "missing" from the dedup map even though the data file was already unlinked → list() returns a ghost row whose getById() returns undefined. Fix: introduce chunkLinesByBytes (exported for direct unit-test) and issue tombstone appends in chunks under MAX_APPEND_CHUNK_BYTES (3.5 KB). Each individual appendFile call stays under the 4 KB page boundary so the kernel page-cache write path treats it as a single sector-level write on common filesystems. Chunks are sequential under the existing withOperationLock so concurrent saves can only land BETWEEN chunks, never WITHIN one. Tests: chunker unit test covers boundary + oversized-line + 200-tombstone realistic case; integration test covers (a) clear() with 200 entries removing all tombstones + data files, (b) every appendFile chunk well-formed JSON post-clear, (c) clear() interleaved with 50 concurrent fresh saves keeping the index parseable.

…er-isolation test The persisted-entry shape lived in src/server/core/domain/entities/. The WebUI Tasks tab needs the same TS type (TaskGetResponse, stored detail), which forced a webui→server import — inverting the boundary the ESLint rule on tui/ already enforces. Move the TS shape (TaskHistoryEntry, TaskHistoryEntryBase, TaskErrorData, TaskHistoryStatus, TASK_HISTORY_SCHEMA_VERSION) into src/shared/transport/events/task-events.ts. The Zod schema stays in src/server/core/domain/entities/task-history-entry.ts and now carries `satisfies z.ZodType<TaskHistoryEntry>`, so any drift between the runtime schema and the type is a typecheck error. The entity file re-exports the type so existing server-side imports continue to compile unchanged. test/unit/shared/layer-isolation.test.ts: runtime walk over src/shared/ asserting no `from '...server/...'` (or other higher layers) appears in any .ts/.tsx file. Catches regressions even if a future ESLint rule is bypassed. Incidental: src/server/infra/process/task-router.ts picks up formatting churn from the prettier/eslint-fix pass on the same change set (line wrapping only, no logic change).

Conflicts: src/server/infra/process/task-router.ts src/server/infra/storage/file-task-history-store.ts test/integration/infra/storage/file-task-history-store.test.ts test/integration/scenarios/task-history-webui-flow.test.ts

github-actions · 2026-04-29T02:01:22Z

Claude finished @bao-byterover's task in 3m 29s —— View job

PR Review: Feat/ENG-2515

Gather context and read changed files
Review task-history-entry.ts and task-events.ts
Review file-task-history-store.ts (core logic)
Review task-router.ts
Review test files
Post final review summary

Overall Score: 4 / 5

Good, well-documented work. The three concurrent-access fixes (Category B cache-invalidation race, tombstone chunking, and flushPendingOperations) are technically sound and backed by focused regression tests. The TaskHistoryEntry migration to shared/ is the right move and is guarded by a static layer-isolation test. Minor issues noted below.

What's Good

suggestion (non-blocking): The epoch-based cache invalidation in doReadIndexDedup (indexEpoch) is an elegant and correct solution to the Category B race. Sampling the epoch before I/O and refusing to cache if it advanced during the pass is the canonical generation-counter pattern for this class of problem.

suggestion (non-blocking): chunkLinesByBytes is a clean pure function, easy to test in isolation, and the "oversized line emitted as its own chunk" behaviour is both correct and documented. The unit test covering 200 realistic tombstone lines gives real confidence against regression.

suggestion (non-blocking): Moving TaskHistoryEntry / TaskHistoryStatus / ToolCallEvent etc. into shared/transport/events/task-events.ts is the right layer placement. The layer-isolation.test.ts enforces the invariant statically without needing CI to exercise a runtime import.

Issues

issue — flushPendingOperations() is test-only but part of the public API (src/server/infra/storage/file-task-history-store.ts:301)

The doc comment says "Test-only contract: production daemon never calls this", but the method is async flushPendingOperations() on the exported class. Any production consumer of ITaskHistoryStore could accidentally call it. TypeScript has no @internal access modifier, but marking it with a /** @internal */ JSDoc annotation and omitting it from the ITaskHistoryStore interface would make the contract clear.

Fix this →

issue — sweepOrphanData can delete a data file for a save() that races between recoverPreRenameSaves's readdir and sweepOrphanData's readdir (src/server/infra/storage/file-task-history-store.ts:637-640)

Sequence:

maybeCompact() rewrites the index; post-rewrite re-read populates postRewriteLiveIds.
recoverPreRenameSaves calls readdir — misses a concurrent save() that writes its data file after this point.
save() appends its index line.
sweepOrphanData calls its own readdir, finds the new data file, does not see its taskId in postRewriteLiveIds, and deletes it.
The index now has a live entry whose data file is gone → getById returns undefined while list() returns the row.

This is a pre-existing window (not introduced here), but the C1 fix closes the pre-rename half of the race without closing this post-recover-readdir half. The simplest mitigation is to re-read the index once more inside sweepOrphanData before issuing unlinks, or to make sweepOrphanData accept the tombstone set and skip taskIds that have an index line.

Fix this →

nitpick — flushPendingOperations() spin-loop has no iteration ceiling (src/server/infra/storage/file-task-history-store.ts:301-311)

let last = this.pruneChain
await last
while (last !== this.pruneChain) {
  last = this.pruneChain
  await last
}

In a test that fires saves rapidly and continuously, this loop could iterate for a very long time. The documented invariant (at most one re-queued pass via pruneRequested) bounds it to two passes in normal operation, but there's no guard. A max-iteration count or a timeout would make misuse more obvious.

nitpick — IndexTombstoneSchema optional schemaVersion accepts any future schema version (src/server/infra/storage/file-task-history-store.ts:82-87)

schemaVersion: z.literal(TASK_HISTORY_SCHEMA_VERSION).optional(),

z.literal(1).optional() means a tombstone with schemaVersion: 2 (from a future schema bump) would fail to parse (literal mismatch). The intent is back-compat with bare {_deleted, taskId} lines (no schemaVersion key), but if the schema version is ever bumped you'll also need to update this schema or tombstones from the new version become invisible. A comment noting this is worth adding; the current doc comment only mentions the back-compat direction.

nitpick — toListItem in task-router.ts treats task.files inconsistently with task.folderPath (src/server/infra/process/task-router.ts:196-211)

...(task.files && task.files.length > 0 ? {files: task.files} : {}),
...(task.folderPath ? {folderPath: task.folderPath} : {}),

files requires non-empty length, but folderPath only requires truthiness. An empty-array files is silently dropped. The FileTaskHistoryStore's IndexDataLineSchema accepts files: z.array(z.string()).optional() without a length guard, so a round-trip through the store would restore an empty array, creating a subtle inconsistency between in-memory and persisted list items. This is pre-existing but worth noting.

Test Observations

praise — The file-task-history-store-cache-race.test.ts correctly uses 30 × 10 iterations to expose the ~10–20% per-cycle flake rate. The makeEntry factory with a staleThresholdMs: Number.POSITIVE_INFINITY guard is exactly right.

praise — layer-isolation.test.ts using readFileSync + regex on the actual .ts source is a lightweight and reliable static-analysis approach. Walking the entire src/shared/ tree means new files are automatically covered.

praise — The tombstone-chunking test suite covers all three failure modes: correct deletion count, JSON integrity, and interleaved-save parsability.

* feat: [ENG-2486] Extend transport schemas with provider/model (#554) * feat: [ENG-2487] TaskRouter stamps provider + model on task:create (#555) * feat: [ENG-2488] Wire resolveActiveProvider in daemon bootstrap (#556) * feat: [ENG-2491] Domain entity + store interface (Level 2 schema) (#559) * feat: [ENG-2492] FileTaskHistoryStore: Index + Data API (basic save/get/list) * feat: [ENG-2498] Transport schemas (pagination, get, delete events) (#560) * feat: [ENG-2495] FileTaskHistoryStore: Delete + Clear API (#561) * feat: [ENG-2494] FileTaskHistoryStore: Stale recovery on read (#562) * feat: [ENG-2496] Daemon llmservice accumulator + real-time TaskHistoryHook (#565) * feat: [ENG-2497] Daemon bootstrap wiring + per-project store factory + startup audit (#567) * feat: [ENG-2499] Daemon handlers (list paginated, get, delete, bulk, clear) + task:deleted broadcast (#568) * feat: [ENG-2493] FileTaskHistoryStore: Prune + Compaction (#569) * Feat/eng 2515 (#575) * feat: [ENG-2515] Task history persistence — correctness fixes before WebUI ships * fix: [ENG-2515] retention design + WebUI provider/model payload Two correctness fixes surfaced while preparing the M2.10–M2.13 WebUI handover: 1. Disable age-based task history retention by default. Age prune (30 days) was a design mistake: it silently deletes user tasks even when the count cap (1000) has plenty of room left. Task history is a business artifact for audit/review, not a log — count-based rotation is the correct retention policy. Set the default to 0; deployments that want time-based eviction can still opt in via the constructor option. 2. Expose provider/model on the WebUI task:created subscriber. Daemon stamps both fields on every task:create (M1) but the local TaskCreatedPayload omitted them, so the chip M1.04 builds on would have no source. Extend the payload type and forward both fields into upsertStatus. * Feat/eng 2515 (#576) * feat: [ENG-2515] Task history persistence — correctness fixes before WebUI ships * fix: [ENG-2515] retention design + WebUI provider/model payload Two correctness fixes surfaced while preparing the M2.10–M2.13 WebUI handover: 1. Disable age-based task history retention by default. Age prune (30 days) was a design mistake: it silently deletes user tasks even when the count cap (1000) has plenty of room left. Task history is a business artifact for audit/review, not a log — count-based rotation is the correct retention policy. Set the default to 0; deployments that want time-based eviction can still opt in via the constructor option. 2. Expose provider/model on the WebUI task:created subscriber. Daemon stamps both fields on every task:create (M1) but the local TaskCreatedPayload omitted them, so the chip M1.04 builds on would have no source. Extend the payload type and forward both fields into upsertStatus. * fix: [ENG-2515] eliminate task-history test flake — close cache-invalidation race and add deterministic flush API Two pathologies surfaced as ~13% flake on local and similar on CI: 1. Cache-invalidation race in production code. A firePrune pass that started before a save() could finish AFTER the save's indexCache = undefined, overwriting the invalidation with a snapshot missing the just-saved row. Symptom: list(), clear(), deleteMany() returning N-1 entries after N awaited saves. Fix: bump indexEpoch on every index write (save / tombstone / recovery / compaction); doReadIndexDedup samples the epoch at the start and skips its setCache step if the epoch advanced. Also clear indexDedupInFlight on every write so a list() cannot pick up a pre-write in-flight promise. 2. Timing-based polling in tests. waitForPruneToSettle polled at 80 x 4ms with a 4-stable-check exit, which under CI parallel-worker load missed the setTimeout(0) macrotask window. Fix: expose flushPendingOperations() — each firePrune wraps its setTimeout in a promise assigned to pruneChain; the public method awaits the chain (loops because pruneRequested re-extends it) and drains operationLock. Test callers replace polling with this single await store.flushPendingOperations(). Plus: tempDir entropy upgraded from Math.random() to randomUUID() in two test files to remove a residual collision risk in CI parallel runs. Verification: a new race-regression test (file-task-history-store-cache-race.test.ts) reproduces the race at ~27% rate on production code and drops to 0% post-fix; running each affected test file 30x consecutively yields 120/120 pass; full suite npm test reports 7103 passing, 0 failing. * fix: [ENG-2515] FileTaskHistoryStore — chunk tombstone appends under 4 KB to keep concurrent saves from corrupting JSON lines Large clear() / deleteMany() batches previously issued a single appendFile of all tombstones joined. POSIX guarantees nothing about regular-file write atomicity (PIPE_BUF applies to FIFOs/sockets only). On filesystems that don't serialize appends per inode, an unlocked concurrent save() can interleave content into the middle of a multi-line tombstone write, corrupting a JSON line. The line is then silently skipped by IndexLineSchema.safeParse, so the tombstone is "missing" from the dedup map even though the data file was already unlinked → list() returns a ghost row whose getById() returns undefined. Fix: introduce chunkLinesByBytes (exported for direct unit-test) and issue tombstone appends in chunks under MAX_APPEND_CHUNK_BYTES (3.5 KB). Each individual appendFile call stays under the 4 KB page boundary so the kernel page-cache write path treats it as a single sector-level write on common filesystems. Chunks are sequential under the existing withOperationLock so concurrent saves can only land BETWEEN chunks, never WITHIN one. Tests: chunker unit test covers boundary + oversized-line + 200-tombstone realistic case; integration test covers (a) clear() with 200 entries removing all tombstones + data files, (b) every appendFile chunk well-formed JSON post-clear, (c) clear() interleaved with 50 concurrent fresh saves keeping the index parseable. * refactor: [ENG-2515] move TaskHistoryEntry into shared/ + runtime layer-isolation test The persisted-entry shape lived in src/server/core/domain/entities/. The WebUI Tasks tab needs the same TS type (TaskGetResponse, stored detail), which forced a webui→server import — inverting the boundary the ESLint rule on tui/ already enforces. Move the TS shape (TaskHistoryEntry, TaskHistoryEntryBase, TaskErrorData, TaskHistoryStatus, TASK_HISTORY_SCHEMA_VERSION) into src/shared/transport/events/task-events.ts. The Zod schema stays in src/server/core/domain/entities/task-history-entry.ts and now carries `satisfies z.ZodType<TaskHistoryEntry>`, so any drift between the runtime schema and the type is a typecheck error. The entity file re-exports the type so existing server-side imports continue to compile unchanged. test/unit/shared/layer-isolation.test.ts: runtime walk over src/shared/ asserting no `from '...server/...'` (or other higher layers) appears in any .ts/.tsx file. Catches regressions even if a future ESLint rule is bypassed. Incidental: src/server/infra/process/task-router.ts picks up formatting churn from the prettier/eslint-fix pass on the same change set (line wrapping only, no logic change). * feat: [ENG-2490] add provider chip column to WebUI task list * feat: [ENG-2500] wire WebUI delete + bulk delete + clear-completed mutations * feat: [ENG-2501] render task detail from disk via task:get for cold tasks * feat: [ENG-2503] dim styling + tooltip for daemon-interrupted task entries * feat: [ENG-2490] cover (provider, '') edge case in formatProviderModel * feat: [ENG-2500] keep selection on bulk-delete failure + cover Delete failed fallback * feat: [ENG-2501] gate task:get on live status + pin enabled + infinite staleTime * feat: [ENG-2503] tighten isInterrupted status type + cover precedence * feat: [ENG-2503] use design-system Tooltip for interrupted-row hint * feat: [ENG-2539] M2.16 task:list server-side filter/search + numbered… (#594) * feat: [ENG-2539] M2.16 task:list server-side filter/search + numbered pagination Schema: drop cursor (before/limit/nextCursor), add page/pageSize/total/pageCount + searchText, provider[], model[], createdAfter/Before, minDuration/maxDurationMs. Response adds counts (Model A: matches current filter scope), availableProviders, availableModels (history-derived, exclude pivots so dropdowns dont shrink). Handler: 2-pass search. Pass-1 over index summary (content + error.message). Pass-2 lazy-cracks data file via store.getById for completed tasks where full result text might match. In-memory tasks read task.result directly (no I/O). Race-safe: getById errors swallowed with log-once-per-query. Store: ITaskHistoryStore.list rename after/before -> createdAfter/createdBefore, add provider/model coarse pre-filter, drop limit (handler paginates). Tests: 5 new test files updated. New cases cover all 7 filter dims, AND combination, pass-2 happy path (full result match past byte 1500), pass-2 in-memory path, file-race swallow, counts Model A invariant counts.all === total, availableProviders/Models exclude pivots + symmetric empty-string guard. * fix: [ENG-2539] M2.16 PR review fixes Address all 8 inline comments from PR #594: - Docs: counts is Model A post-filter, counts.all === total invariant (was incorrectly described as pre-status-filter) - Docs: page is echoed back as-sent without upper-bound clamp against pageCount; documented contract on JSDoc - Interface: document why provider/model/status options remain on ITaskHistoryStore.list (direct-caller use; handler bypasses pivots) - mapBounded signature: explicit fn returns R | undefined - handleTaskList: refactor 2-pass to single-pass over candidatesNoSearch; derive merged inline; pass-2 reuses the map (was 2xN, now 1xN) - Comment: clarify pass2Filter spread auto-inherits new non-pivot dims - Schemas: bind via satisfies z.ZodType<X> to task-events.ts interfaces; re-export TaskListRequest/Response/Counts/AvailableModel from shared so the interface is single source of truth (compile-time drift prevention) - Tests: add maxDurationMs upper-bound + no-store fallback coverage * feat: [ENG-2502] paginate task list with useInfiniteQuery + Load more table row * feat: [ENG-2502] keyboard-accessible Load more row + indent describe blocks * feat: [ENG-2547] M2.17 WebUI numbered pagination + Filter dropdown + active filter tags * feat: [ENG-2547] address bot review: memo serverStatus + closeTask, dedup useGetProviders + isDurationPreset, memo tags * feat: [ENG-2610] M2.18 pagination polish + default page size 20 + friendly provider names * feat: [ENG-2610] memoize providerNames against providersResponse, not derived array --------- Co-authored-by: ncnthien <nhatthien185@gmail.com>

bao-byterover added 5 commits April 28, 2026 11:11

feat: [ENG-2515] Task history persistence — correctness fixes before …

3a65702

…WebUI ships

bao-byterover requested review from leehpham, ncnthien and wzlng as code owners April 29, 2026 00:59

Merge branch 'proj/persis-task-history' into feat/ENG-2515

253a7ab

Conflicts: src/server/infra/process/task-router.ts src/server/infra/storage/file-task-history-store.ts test/integration/infra/storage/file-task-history-store.test.ts test/integration/scenarios/task-history-webui-flow.test.ts

bao-byterover merged commit 9828bb4 into proj/persis-task-history Apr 29, 2026
6 checks passed

bao-byterover deleted the feat/ENG-2515 branch April 29, 2026 02:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feat/eng 2515#576

Feat/eng 2515#576
bao-byterover merged 6 commits into
proj/persis-task-historyfrom
feat/ENG-2515

bao-byterover commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bao-byterover commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: Feat/ENG-2515

Overall Score: 4 / 5

What's Good

Issues

Test Observations

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented Apr 29, 2026 •

edited

Loading