fix: downweight knowledge in recall when session content exists by BYK · Pull Request #432 · BYK/loreai

BYK · 2026-05-20T21:20:40Z

Summary

When scope=all and temporal/distillation results exist, apply 0.6x weight to knowledge BM25 and vector RRF lists. This deprioritizes cross-session LTM entries when session-specific content is available.

Problem

When the model invokes recall with a query like "alternative approach flock locking", an LTM knowledge entry about proper-lockfile (which contains terms like "flock", "advisory locking") outranks the actual session temporal message about "lock file with staleness check". The LTM entry has perfect keyword overlap with the query while the temporal message has weaker BM25 relevance.

Fix

Track whether temporal/distillation results exist across the query expansion loop (hasSessionResults flag). When scope === 'all' and session-specific results exist:

Knowledge BM25 list: weight: 0.6 (was implicit 1.0)
Knowledge vector list: weight: vectorWeight * 0.6 (was vectorWeight)

This ensures session-specific content ranks higher when both sources match, while knowledge still surfaces when no session content exists (e.g., cross-session queries, scope: 'knowledge').
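The weighting rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: only `hasSessionResults`, `scope`, `vectorWeight`, and the 0.6 factor come from the description. The `RankedList` type, the `knowledgeWeights` and `fuse` helpers, and the RRF constant `RRF_K = 60` are assumptions made for the example.

```typescript
// Hypothetical shapes; the real loreai types are not shown in this PR.
interface RankedList {
  ids: string[]; // result ids, best match first
  weight: number; // multiplier applied to every RRF contribution of this list
}

const RRF_K = 60; // assumed RRF smoothing constant
const KNOWLEDGE_DOWNWEIGHT = 0.6; // factor named in the PR

// Pick the weights for the two knowledge lists (BM25 and vector).
// The downweight applies only when scope is 'all' AND session results exist.
function knowledgeWeights(
  scope: string,
  hasSessionResults: boolean,
  vectorWeight: number,
): { bm25: number; vector: number } {
  const factor =
    scope === "all" && hasSessionResults ? KNOWLEDGE_DOWNWEIGHT : 1.0;
  return { bm25: factor, vector: vectorWeight * factor };
}

// Weighted reciprocal rank fusion: each list adds weight / (k + rank)
// to the score of every id it contains (rank is 0-based here).
function fuse(lists: RankedList[]): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.ids.forEach((id, rank) => {
      const contribution = list.weight / (RRF_K + rank);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  return scores;
}
```

In this sketch a caller would set `hasSessionResults` to true as soon as any temporal or distillation list returns a hit during query expansion, then build the knowledge lists with `knowledgeWeights(scope, hasSessionResults, vectorWeight)` before calling `fuse`.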

RRF Score Impact (both at rank 0, 2-term query)

Source	Before	After
Knowledge (BM25 + vector)	0.0167 + 0.0250 = 0.042	0.0100 + 0.0150 = 0.025
Temporal (BM25 + recency + vector)	0.0167 + 0.0167 + 0.0250 = 0.058	unchanged
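The figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes an RRF constant of 60 with 0-based ranks (so rank 0 contributes 1/60 ≈ 0.0167) and a `vectorWeight` of 1.5 (since 0.0250 / 0.0167 ≈ 1.5). Both values are inferred from the table rather than stated in the PR, and the function names are hypothetical.

```typescript
const ASSUMED_K = 60; // inferred: 1 / 60 ≈ 0.0167
const ASSUMED_VECTOR_WEIGHT = 1.5; // inferred: 0.0250 / 0.0167 ≈ 1.5
const DOWNWEIGHT = 0.6; // from the PR description

// Weighted RRF contribution of a single list at a given 0-based rank.
const rrf = (rank: number, weight = 1): number => weight / (ASSUMED_K + rank);

// Knowledge entry at rank 0 in both its BM25 and vector lists.
function knowledgeScore(downweighted: boolean): number {
  const factor = downweighted ? DOWNWEIGHT : 1;
  return rrf(0, factor) + rrf(0, ASSUMED_VECTOR_WEIGHT * factor);
}

// Temporal message at rank 0 in its BM25, recency, and vector lists.
function temporalScore(): number {
  return rrf(0) + rrf(0) + rrf(0, ASSUMED_VECTOR_WEIGHT);
}

console.log(knowledgeScore(false).toFixed(3)); // before the fix
console.log(knowledgeScore(true).toFixed(3)); // after the fix
console.log(temporalScore().toFixed(3)); // unchanged by the fix
```

Under these assumptions the temporal-to-knowledge score ratio at equal rank grows from 1.4 to about 2.33, which gives a temporal message more room to outrank a knowledge entry even when the temporal message sits at a lower BM25 rank.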

Eval Results (CM-1, 400K inflation)

Score: 4.39 (up from 3.69 baseline). 12 of 15 questions >= 4.0.

Remaining failures (m3, h4) are a query generation problem, not ranking: the distillation compresses away the distinguishing term ("staleness check") so the model can't include it in its recall query.

Tests

1752 pass, 0 fail
Typecheck clean across all 4 packages

When scope=all and temporal/distillation results exist, apply 0.6x weight to knowledge BM25 and vector RRF lists. This deprioritizes cross-session LTM entries when session-specific content is available — temporal details about what actually happened are more likely the answer than general knowledge. Also adds session-affinity RRF boost and scripted assistant storage in eval. Eval score: 3.69 → 4.39 at 400K inflation.

…indow (#435)

Updates marketing copy with the latest eval results from the recall quality + distillation transparency work (#428, #430, #431, #432, #433, #434).

### README.md
- Context retention table: Medium 2.3→4.1, Hard 3.3→4.8, Average 3.9→4.6
- Lore vs tail-window delta: +50%→+77%
- Added footnote: Lore scores averaged across multiple runs; TW/compaction baselines from a prior eval run with the same scenarios
- Added v6 to version history

### docs/index.html
- Hero stat: +50%→+77% vs tail-window
- Detail retention: 4.8→4.6 (overall average across difficulty levels, multiple runs)

### Review corrections
- Fixed Medium from 4.3→4.1 (honest multi-run average, not cherry-picked)
- Average row (4.6) now self-consistent with column values: (5.0+4.1+4.8)/3=4.63≈4.6
- Added footnote clarifying that TW/compaction columns are from a prior eval run

BYK self-assigned this May 20, 2026

BYK merged commit 26fe451 into main May 20, 2026
10 checks passed

BYK deleted the fix-recall-ranking branch May 20, 2026 21:56

BYK mentioned this pull request May 20, 2026

docs: update eval results — context retention 3.9→4.6, +77% vs tail-window #435

Merged

This was referenced May 21, 2026

publish: BYK/loreai@0.23.0 #439

Closed

publish: BYK/loreai@0.23.0 #448

Closed

BYK merged 1 commit into main from fix-recall-ranking
