Problem
The reward scoring pipeline processes only 7 episodes on bootstrap and then goes permanently idle, leaving the vast majority of traces unscored. On a 43 MB database with 3,600 traces, only 45 (1.3%) had r_human scores before the fix.
Root Cause
Three interacting bugs in memory-core.ts:
1. episodeRewardIsDirty() excludes abandoned episodes (line ~1274)
The dirty-check condition only matches episodes with closeReason === "finalized" or recoveryReason === "missed_session_end". But 219 of 224 closed episodes had closeReason: "abandoned", so they silently failed the dirty check on every bootstrap scan.
2. No polling fallback after bootstrap
autoRescoreDirtyClosedEpisodes() is called once during init() and never again. After bootstrap completes, the daemon bridge sits permanently idle with no mechanism to retry missed episodes.
3. Two-bridge isolation (contributing factor)
The viewer daemon bridge (--daemon) and JSON-RPC bridge (--no-viewer) are separate processes with separate in-memory event buses. New captures flow through the JSON-RPC bridge's pipeline; the daemon's reward subscriber never sees those events. The daemon only activates its own pipeline during init().
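The isolation can be illustrated with a minimal sketch. The `InMemoryBus` class, the bus names, and the episode id below are hypothetical stand-ins, not the project's actual event bus; the two instances live in one process here only to keep the example short, whereas the real bridges are separate processes.

```typescript
// Hypothetical minimal bus; the real event bus in the project is not shown in this report.
type Listener = (episodeId: string) => void;

class InMemoryBus {
  private listeners: Listener[] = [];

  subscribe(fn: Listener): void {
    this.listeners.push(fn);
  }

  publish(episodeId: string): void {
    for (const fn of this.listeners) {
      fn(episodeId);
    }
  }
}

// Each bridge owns its own bus instance, so nothing is shared between them.
const daemonBus = new InMemoryBus();
const rpcBus = new InMemoryBus();

// The daemon's reward subscriber listens only on the daemon's bus.
const seenByDaemon: string[] = [];
daemonBus.subscribe((id) => {
  seenByDaemon.push(id);
});

// A new capture arrives through the JSON-RPC bridge.
rpcBus.publish("ep_example");

// The daemon subscriber was never invoked.
const daemonSawRpcEvent = seenByDaemon.length > 0;
```

Because the subscriber never fires for events published on the other bus, the daemon can only find work by scanning the database itself, which is why the bootstrap scan and the periodic timer matter.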
Why exactly 7 episodes got scored
Those 7 were open episodes with traces at init-time, processed by recoverOpenEpisodesAsSessionEnd() as "stale." All 219 closed abandoned episodes were silently excluded.
Fix (two changes, both in memory-core.ts)
1. Add closeReason === "abandoned" to episodeRewardIsDirty()
- (meta.closeReason === "finalized" || meta.recoveryReason === "missed_session_end")
+ (meta.closeReason === "finalized" ||
+ meta.closeReason === "abandoned" ||
+ meta.recoveryReason === "missed_session_end")
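To make the effect of the change concrete, here is a small sketch that isolates just this clause. The `EpisodeMeta` shape and the two function names are assumptions for illustration; the real `episodeRewardIsDirty()` presumably checks more than the close reason.

```typescript
// Hypothetical metadata shape; the real type in memory-core.ts likely has more fields.
interface EpisodeMeta {
  closeReason?: string;
  recoveryReason?: string;
}

// The clause as it was before the fix: abandoned episodes never match.
function closeClauseBefore(meta: EpisodeMeta): boolean {
  return (
    meta.closeReason === "finalized" ||
    meta.recoveryReason === "missed_session_end"
  );
}

// The clause after the fix: abandoned episodes are eligible too.
function closeClauseAfter(meta: EpisodeMeta): boolean {
  return (
    meta.closeReason === "finalized" ||
    meta.closeReason === "abandoned" ||
    meta.recoveryReason === "missed_session_end"
  );
}

const abandonedEpisode: EpisodeMeta = { closeReason: "abandoned" };
const finalizedEpisode: EpisodeMeta = { closeReason: "finalized" };

const abandonedBefore = closeClauseBefore(abandonedEpisode); // false: skipped on every scan
const abandonedAfter = closeClauseAfter(abandonedEpisode);   // true: picked up for scoring
const finalizedAfter = closeClauseAfter(finalizedEpisode);   // true: unchanged behaviour
```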
2. Add periodic rescore timer in init()
const rescoreInterval = setInterval(() => {
  void autoRescoreDirtyClosedEpisodes().catch((err) => {
    log.debug("periodic_rescore.error", {
      err: err instanceof Error ? err.message : String(err),
    });
  });
}, 10 * 60 * 1000);
(rescoreInterval as unknown as { unref?: () => void }).unref?.();
This covers episodes that miss the startup scan: those closed after init, and those whose reward run failed and needs a retry. The 10-minute interval is safe because autoRescoreDirtyClosedEpisodes has its own 30-second dedup guard.
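The report does not show how the dedup guard inside autoRescoreDirtyClosedEpisodes is implemented. The sketch below is one plausible time-window guard, with a hypothetical `makeDedupGuard` helper and an injectable clock, to show why a timer tick and a near-simultaneous bootstrap call would not both trigger a scan.

```typescript
// Hypothetical helper: returns a function that answers "should this call run?".
// A call is allowed only if at least `windowMs` has passed since the last allowed call.
function makeDedupGuard(
  windowMs: number,
  now: () => number = Date.now,
): () => boolean {
  let lastRun = Number.NEGATIVE_INFINITY;
  return () => {
    if (now() - lastRun < windowMs) {
      return false; // too soon after the previous run; skip
    }
    lastRun = now();
    return true;
  };
}

// Drive the guard with a fake clock so the behaviour is deterministic.
let clock = 0;
const shouldRun = makeDedupGuard(30 * 1000, () => clock);

const decisions: boolean[] = [];
decisions.push(shouldRun()); // bootstrap scan: allowed

clock = 10 * 1000;
decisions.push(shouldRun()); // 10 s later: inside the 30 s window, skipped

clock = 10 * 60 * 1000;
decisions.push(shouldRun()); // first periodic tick at 10 min: allowed
```

With a guard of this kind, the 10-minute tick sits far outside the 30-second window, so each tick runs at most one scan and overlapping triggers collapse into one.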
Results After Fix
Before (3,600 traces):
- r_human scored: 45 (1.3%)
- Large episodes completely skipped: ep_0f9jsh492n40 (1,367 traces), ep_x95apvvw7gdx (444 traces), ep_kt2ds1afhssq (222 traces)
Within 3 minutes of restart:
- r_human: 45 -> 257 (212 freshly scored)
- ep_kt2ds1afhssq (222 traces): scored with rHuman=1.0 in 1.1 seconds
- Estimated time to clear the full backlog: 60-90 minutes
Environment
- Version: v2.0.5
- DB: 43 MB, WAL mode
- LLM: qwen/qwen3-235b-a22b-2507 via OpenRouter
- Config: lightweightMemory.enabled: false