bug: model writes marker-like text instead of calling recall tool

## Problem

In the CM-1 eval (live mode), the model sometimes writes text that resembles a recall marker (e.g. `📚 Fetching details for t:... and t:... simultaneously…`) as its **final answer** instead of actually calling the recall tool.

## Evidence

Question cm-1-e1: "What test file was created for the upload abort handling?"
- Expected: `src/__tests__/upload-abort.test.ts`
- Got: `📚 Fetching details for t:e8a949d7... and t:36d96080... simultaneously…`
- Score: 1/5

The model generated free-form text that looks like a marker but never invoked the recall tool. The actual `buildRecallMarker()` produces `📚 Fetching detail for <id>…` (singular), so this isn't a marker leak — the model composed this text itself.

## Root Cause Hypothesis

The model sees recall markers from previous turns in the conversation and mimics the format instead of using the tool. This could be addressed by:
1. Making the marker format less "tool-like" so the model doesn't confuse it with an action
2. Adding explicit instructions in the recall tool description to always use the tool, never write markers manually
3. Adjusting the QA prompt to discourage this behavior

## Impact

This affected 1 of 15 CM-1 questions in live eval. Score impact: ~0.27 points on the overall CM-1 average.

## Context

Discovered during live eval of #404 (multi-turn recall).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bug: model writes marker-like text instead of calling recall tool #409

Problem

Evidence

Root Cause Hypothesis

Impact

Context

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

bug: model writes marker-like text instead of calling recall tool #409

Description

Problem

Evidence

Root Cause Hypothesis

Impact

Context

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions