Context
We already have a similarity graph: `chunk_links` stores pre-computed cosine similarity edges between chunks across sessions (top-3 neighbors per chunk, min similarity 0.35). This is traversed at query time via `_expand_cross_session()` with a 0.7 discount factor.
The current edges are untyped — they say "these are related" but not HOW (supports, contradicts, supersedes, extends, etc.). The question: does adding edge types improve retrieval quality enough to justify the cost?
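For concreteness, the query-time hop can be sketched as follows. This is a hypothetical stand-in for `_expand_cross_session()`, not the real implementation: the in-memory dict shapes for `seed_results` and `chunk_links` are assumptions, and only the constants (0.7 discount, top-3 expansion, 0.35 minimum similarity) come from the doc.

```python
# Assumed constants from the doc; names are illustrative.
CROSS_LINK_DISCOUNT = 0.7   # score discount applied to hopped-in chunks
CROSS_LINK_MAX_EXPAND = 3   # top-3 pre-computed neighbors per chunk
MIN_SIMILARITY = 0.35       # build-time floor, re-checked here defensively

def expand_cross_session(seed_results, chunk_links):
    """Expand seed hits one hop along pre-computed similarity edges.

    seed_results: {chunk_id: retrieval_score}
    chunk_links:  {chunk_id: [(neighbor_id, cosine_similarity), ...]}
    """
    expanded = dict(seed_results)
    for chunk_id, score in seed_results.items():
        for neighbor_id, sim in chunk_links.get(chunk_id, [])[:CROSS_LINK_MAX_EXPAND]:
            if sim < MIN_SIMILARITY:
                continue
            # Hopped chunks inherit a discounted fraction of the seed's score.
            hop_score = score * sim * CROSS_LINK_DISCOUNT
            if hop_score > expanded.get(neighbor_id, 0.0):
                expanded[neighbor_id] = hop_score
    return expanded
```

Note that nothing in this traversal looks at why two chunks are linked, which is exactly the gap typed edges would address.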
What we already have (5 expansion mechanisms)
- Cross-session chunk links — `chunk_links` table, cosine similarity, pre-computed at build
- Same-session context — surrounding turns via `turn_index` lookup
- Knowledge node source expansion — `source_sessions`/`source_turns` back-references
- Topic clustering — Jaccard-based clusters in `cluster` table
- Hybrid RRF — BM25 + embedding fusion with auto-boost
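The last mechanism, reciprocal rank fusion, is standard and worth pinning down since the decision criteria compare against it. A minimal sketch (the `k=60` constant is the common default from the RRF literature, not necessarily what the codebase uses, and auto-boost is omitted):

```python
def rrf_fuse(bm25_ranking, embed_ranking, k=60):
    """Fuse two ranked lists of doc ids: score(d) = sum over lists of 1/(k + rank)."""
    scores = {}
    for ranking in (bm25_ranking, embed_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Higher fused score first; docs appearing in both lists float to the top.
    return sorted(scores, key=scores.get, reverse=True)
```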
Proposed evaluation
Phase 1: Measure the gap (no code changes)
Add a "multi-hop" category to CodeMemo with 10-15 questions that require connecting information across sessions where the connection isn't obvious from keywords or embeddings alone. Examples:
- "We decided X in March, then reversed it in April. What's the current state?"
- "Agent A proposed a fix. Agent B found a problem with it. What was the problem?"
- Questions where following a typed edge (supersedes, contradicts) would help but generic similarity wouldn't
Run these against the current system. If flat retrieval already scores >85%, typed edges may not be worth the cost.
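One possible shape for the new CodeMemo category, sketched below; the field names, the substring-match scorer, and the `edge_type_hint` field are all assumptions about how the benchmark might record which typed edge a question exercises:

```python
from dataclasses import dataclass

@dataclass
class MultiHopCase:
    question: str
    expected_answer: str     # substring the retrieved answer must contain
    sessions: list           # session ids that must be connected to answer
    edge_type_hint: str      # typed edge that would help: supersedes, contradicts, ...

CASES = [
    MultiHopCase(
        question="We decided X in March, then reversed it in April. What's the current state?",
        expected_answer="reversed",
        sessions=["2026-03", "2026-04"],
        edge_type_hint="supersedes",
    ),
]

def score(answers):
    """Fraction of cases whose answer contains the expected substring."""
    hits = sum(
        1 for case, ans in zip(CASES, answers) if case.expected_answer in ans.lower()
    )
    return hits / len(CASES)
```

Recording `edge_type_hint` per case would let Phase 2 report accuracy per edge type, not just the aggregate.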
Phase 2: Prototype typed edges (if Phase 1 shows a gap)
Option A — Classify at build time: During build_cross_session_links(), run a lightweight classifier on each (source, target) pair to assign a type. Could be rule-based (temporal ordering → supersedes, high similarity + different conclusion → contradicts) or a small model.
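A rule-based classifier for Option A could be as small as the sketch below. The 0.75 contradiction threshold, the one-week supersede gap, and the `conclusion`/`timestamp` fields on each chunk are all assumptions chosen to illustrate the rules named above, not tuned values:

```python
SUPERSEDE_GAP = 7 * 24 * 3600  # assumed: a week's gap suggests revision, not restatement

def classify_edge(source, target, similarity):
    """Assign a type to one pre-computed (source, target) link, no model call.

    source/target: {"timestamp": float (epoch seconds), "conclusion": str}
    similarity: the pair's stored cosine similarity
    """
    same_conclusion = source["conclusion"] == target["conclusion"]
    # Rule from the doc: high similarity + different conclusion -> contradicts.
    if similarity >= 0.75 and not same_conclusion:
        return "contradicts"
    # Rule from the doc: temporal ordering -> supersedes (target revisits much later).
    if target["timestamp"] - source["timestamp"] > SUPERSEDE_GAP:
        return "supersedes"
    if same_conclusion:
        return "supports"
    return "extends"
```

Rules like these run in microseconds per pair, so Option A's cost is dominated by extracting a per-chunk "conclusion" signal, which is where a small local model might still be needed.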
Option B — Classify at query time: Keep untyped edges, but when expanding, use query intent to filter. A "what changed" query only follows edges where timestamps differ significantly. A "what supports" query only follows high-similarity same-conclusion edges.
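Option B's intent filter might look like the sketch below. The intent keywords, the one-day "significant" timestamp gap, and the edge dict shape are illustrative assumptions:

```python
import re

MIN_TEMPORAL_GAP = 24 * 3600  # assumed: "significantly different" timestamps = 1 day

def edge_passes(query, edge):
    """Decide at query time whether to follow an untyped edge.

    edge: {"similarity": float, "ts_delta": float, "same_conclusion": bool}
    """
    if re.search(r"\bwhat changed\b|\bcurrent state\b", query, re.I):
        # "What changed" queries only follow edges spanning a real time gap.
        return edge["ts_delta"] >= MIN_TEMPORAL_GAP
    if re.search(r"\bsupports?\b", query, re.I):
        # "What supports" queries only follow high-similarity, same-conclusion edges.
        return edge["similarity"] >= 0.75 and edge["same_conclusion"]
    return True  # no recognized intent: fall back to untyped behavior
```

The appeal of Option B is that it needs no schema change and can be shipped or reverted without rebuilding the graph; the risk is that intent detection from keywords is brittle.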
Option C — Add link_type column to chunk_links: Enrichment model outputs edge type alongside knowledge nodes in the same pass. No extra LLM call — just an extra field in the extraction prompt.
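Option C's schema change is a one-line, backward-compatible migration, since SQLite's `ALTER TABLE ... ADD COLUMN` leaves existing rows with a NULL `link_type`. A sketch against the documented `chunk_links` schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# chunk_links schema as documented in this doc.
conn.execute("CREATE TABLE chunk_links (source_id TEXT, target_id TEXT, similarity REAL)")
# Proposed migration: nullable column, so pre-migration rows stay valid (untyped).
conn.execute("ALTER TABLE chunk_links ADD COLUMN link_type TEXT")
conn.execute(
    "INSERT INTO chunk_links VALUES (?, ?, ?, ?)",
    ("c1", "c2", 0.82, "supersedes"),
)
row = conn.execute(
    "SELECT link_type FROM chunk_links WHERE source_id = 'c1'"
).fetchone()
```

Query-time expansion can then treat NULL as "untyped, always follow", which makes the rollout incremental.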
Speed constraints (local-first)
- Current `build_cross_session_links()` time: needs benchmarking
- Budget: typed edge classification should add <20% to build time
- Query-time expansion must stay <50ms for the full graph hop
- No external API calls for edge classification — must run on local models or rules
- `CROSS_LINK_MAX_EXPAND=3` means we're only traversing 3 edges per result — typed filtering on 3 edges is essentially free
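Since the build time "needs benchmarking", a minimal harness for the <20% budget check could look like this; `bench` and `within_budget` are illustrative helpers, not existing code:

```python
import time

def bench(fn, repeats=5):
    """Median wall-clock seconds for fn(), to smooth out cold-cache noise."""
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return sorted(samples)[len(samples) // 2]

def within_budget(baseline_s, typed_s, budget=0.20):
    """True if the typed-edge build adds less than `budget` over the untyped baseline."""
    return (typed_s - baseline_s) / baseline_s < budget
```

Running `bench` on the current `build_cross_session_links()` before prototyping gives the baseline that the <20% and <50ms thresholds below are measured against.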
Decision criteria
| Metric | Threshold to ship |
| --- | --- |
| Multi-hop accuracy improvement | >5% on new CodeMemo category |
| Build time increase | <20% |
| Query latency increase | <10ms |
| Cold start impact | <2s additional |
If we can't hit all four, the current untyped graph is good enough.
References
- r/AIMemory thread on knowledge graph expectations (2026-04-06)
- Current expansion: `core.py:1160-1321` (build), `core.py:2068` (query-time expand)
- `chunk_links` schema: `(source_id TEXT, target_id TEXT, similarity REAL)`