bug: recall fails to find late-session details (easy questions score worst in CM-1) #410

## Problem

In the CM-1 eval, "easy" questions about late-session details (turns 38-45) average **1.9**, worse than "hard" questions about early-session details (turns 0-10), which average **3.6**. This is the reverse of what the difficulty labels predict.

## Evidence

| Question | Expected Answer | Model Answer | Score |
|---|---|---|---|
| cm-1-e1 | `src/__tests__/upload-abort.test.ts` | (marker text, not a real answer) | 1 |
| cm-1-e2 | `fix: resolve stale temp file accumulation and ENOSPC handling` | "PR #342" (hallucinated) | 1 |
| cm-1-e3 | 17 tests passed | "not explicitly listed" | 2 |
| cm-1-e4 | `fix/upload-cleanup-lock` | "branch name was not recorded" | 1 |
| cm-1-e5 | unused 'open' import in cleanup.ts | (correct) | 4.3 |

Meanwhile, hard questions (Sentry issue ID, user who reported the bug, stack trace) scored 4-5.
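
As a sanity check on the reported easy-question average, the scores in the table can be averaged directly. This is only arithmetic over the five rows above; the variable names are illustrative.

```python
# Scores copied from the Evidence table above.
easy_scores = {
    "cm-1-e1": 1.0,
    "cm-1-e2": 1.0,
    "cm-1-e3": 2.0,
    "cm-1-e4": 1.0,
    "cm-1-e5": 4.3,
}

easy_avg = sum(easy_scores.values()) / len(easy_scores)
print(f"easy average: {easy_avg:.2f}")  # 1.86, reported as 1.9

# Questions scoring 2 or lower are the outright misses.
misses = sorted(q for q, s in easy_scores.items() if s <= 2)
print(misses)
```

Four of the five easy questions are outright misses; only cm-1-e5 was answered correctly.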

## Root Cause Hypothesis

The easy questions ask about details from the **last few turns** of the conversation. In the eval, the QA phase runs in a **new session** after the conversation phase. The late-session details exist in:
1. The temporal messages table (but these rows may be excluded by the `distilled=0` filter if distillation ran)
2. Distillation summaries (but specific details like branch names and PR titles may be compressed away)
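
A minimal sketch of how a `distilled=0` filter could hide late-session rows, using an in-memory SQLite table. The schema (`messages(turn, content, distilled)`), the row contents, and the `search` helper are all assumptions made for illustration; they are not the project's actual table or query.

```python
import sqlite3

# Hypothetical schema; the real temporal messages table may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (turn INTEGER, content TEXT, distilled INTEGER)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        # Illustrative late-session rows already marked as distilled.
        (41, "pushed branch fix/upload-cleanup-lock", 1),
        (43, "17 tests passed", 1),
        # Illustrative row that distillation has not processed yet.
        (45, "placeholder message not yet distilled", 0),
    ],
)


def search(term, only_undistilled):
    """Substring search over messages, optionally restricted to distilled = 0."""
    sql = "SELECT turn, content FROM messages WHERE content LIKE ?"
    if only_undistilled:
        sql += " AND distilled = 0"
    sql += " ORDER BY turn"
    return conn.execute(sql, (f"%{term}%",)).fetchall()


with_filter = search("upload-cleanup-lock", only_undistilled=True)
without_filter = search("upload-cleanup-lock", only_undistilled=False)
print(with_filter)     # the distilled row is excluded
print(without_filter)  # the row is found once the filter is dropped
```

If the real query behaves like the filtered variant, the branch name is still in storage but unreachable through temporal search, which would match the "branch name was not recorded" answer for cm-1-e4.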

The recall search may not rank these recent-but-distilled details highly enough, or the distillation may lose granular facts (exact branch names, test counts, PR titles).
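
One way to test the "distillation loses granular facts" half of this hypothesis is to extract identifier-like tokens from the raw turns and check whether each survives verbatim in the summary. The sketch below is a rough heuristic under stated assumptions: the regex patterns, the function names, and the sample turn and summary text are hypothetical, not taken from the codebase or the eval transcript.

```python
import re

# Heuristic patterns for "specific identifiers"; deliberately narrow.
IDENTIFIER_PATTERNS = [
    re.compile(r"[\w.-]+(?:/[\w.-]+)+"),  # paths and branch names containing "/"
    re.compile(r"\b\d+ tests?\b"),        # test counts such as "17 tests"
]


def extract_identifiers(text):
    """Return the set of identifier-like substrings found in text."""
    found = set()
    for pattern in IDENTIFIER_PATTERNS:
        found.update(pattern.findall(text))
    return found


def lost_identifiers(raw_turns, summary):
    """Identifiers present in the raw turns but missing verbatim from the summary."""
    wanted = set()
    for turn in raw_turns:
        wanted |= extract_identifiers(turn)
    return sorted(i for i in wanted if i not in summary)


# Hypothetical late-session turns and a hypothetical lossy summary.
raw_turns = [
    "Created src/__tests__/upload-abort.test.ts",
    "Pushed to fix/upload-cleanup-lock; 17 tests passed",
]
summary = "Added an abort test and pushed the cleanup fix; all tests passed."

lost = lost_identifiers(raw_turns, summary)
print(lost)
```

A non-empty result for real session data would point at the distillation step rather than at recall ranking. The patterns would need extending (PR titles, version numbers) and trimming of trailing punctuation before use on real transcripts.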

## Possible Fixes

1. Ensure distillation preserves specific identifiers (branch names, file paths, version numbers)
2. Check whether the `distilled=0` filter on temporal search is too aggressive for QA questions
3. Boost recency in recall scoring for within-session queries
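
Fix 3 could take roughly the following shape: a multiplicative recency boost with exponential decay over turn distance. This is a sketch under assumptions, not the existing scorer; the `Hit` type, the `weight` and `half_life` parameters, and the example relevance values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    turn: int          # turn index the hit came from
    relevance: float   # base score from the existing ranker (assumed to be in [0, 1])


def boosted_score(hit, last_turn, weight=0.3, half_life=10.0):
    """Scale relevance by up to (1 + weight); the boost halves every half_life turns."""
    age = max(0, last_turn - hit.turn)
    recency = 0.5 ** (age / half_life)
    return hit.relevance * (1.0 + weight * recency)


def rank(hits, last_turn, **params):
    """Sort hits by boosted score, highest first."""
    return sorted(hits, key=lambda h: boosted_score(h, last_turn, **params), reverse=True)


# Hypothetical hits: an early-session hit with slightly higher base relevance
# and a late-session hit that the unboosted ranking would place second.
early = Hit(turn=5, relevance=0.60)
late = Hit(turn=42, relevance=0.55)

unboosted = rank([early, late], last_turn=45, weight=0.0)
boosted = rank([early, late], last_turn=45)
print([h.turn for h in unboosted])
print([h.turn for h in boosted])
```

A multiplicative boost keeps irrelevant recent hits low, since a relevance near zero stays near zero. Whether this helps depends on the late-session rows reaching the ranker at all, so it is worth checking the `distilled=0` filter first.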

## Impact

- Easy questions: avg 1.9 (should be ~4+)
- Overall CM-1 impact: dragging score from ~3.6 down to 2.8

## Context

Discovered during live eval of #404 (multi-turn recall). The multi-turn recall mechanism is working well for hard questions (early-session detail retrieval), but late-session details are paradoxically harder to find.
