ai-partner: client-side retrieval with sqlite-vec #1451

**Parent epic:** #1446 (Amicus — AI Study Partner v1)
**Phase:** 1 · **Size:** M · **Depends on:** #1448, #1450

Client-side retrieval pipeline: embed user query via proxy → vector search against local `scripture.db` → apply profile + context boosts → return top-10 chunks for the Amicus call.

---

## Files to create

- `app/src/services/amicus/retrieval.ts` — main retrieval orchestrator
- `app/src/services/amicus/vectorSearch.ts` — wraps sqlite-vec queries against scripture.db
- `app/src/services/amicus/rerank.ts` — profile/context boosting + diversity filter
- `app/src/services/amicus/embed.ts` — calls `/ai/embed` via proxy
- `app/src/services/amicus/types.ts` — shared types for the amicus service layer
- `app/src/services/amicus/__tests__/retrieval.test.ts` — unit tests

## Files to modify

- `app/src/db/index.ts` — ensure sqlite-vec extension is loaded when opening scripture.db (see #1448)
- `app/src/services/index.ts` — re-export amicus service entry point

## Conventions to follow

- Service layer pattern: follow existing `app/src/services/` structure (small modules, pure functions where possible, side effects isolated to the orchestrator)
- Logging: use `app/src/utils/logger.ts` — never `console.log`
- TypeScript strict mode; no `any` types (matches the Phase 4 architectural polish work)
- Imports use `@/` path alias where available

---

## Core API surface

```ts
// retrieval.ts
export interface RetrievedChunk {
  chunk_id: string;
  source_type: "section_panel" | "chapter_panel" | "word_study" | "lexicon_entry"
             | "debate_topic" | "cross_ref_thread_note" | "journey_stop" | "meta_faq";
  source_id: string;
  text: string;
  score: number;             // final score after boosts
  metadata: {
    scholar_id?: string;
    tradition?: string;
    book_id?: string;
    chapter_num?: number;
    verse_start?: number;
    verse_end?: number;
    panel_type?: string;
  };
}

export interface RetrievalContext {
  query: string;
  profile: CompressedProfile;   // from #1452
  currentChapterRef: ChapterRef | null;   // e.g. { book_id: "romans", chapter_num: 9 }
}

export async function retrieve(ctx: RetrievalContext): Promise<RetrievedChunk[]>;
```

Returns up to 10 chunks, sorted by final score desc.

## Algorithm (match the plan's §6 design)

1. **Embed query** — call proxy `/ai/embed` with query text; receive 1536-dim vector
2. **Vector search** — `SELECT rowid, distance FROM embeddings WHERE embedding MATCH ? ORDER BY distance LIMIT 40`; join to `chunk_text` and `chunk_metadata` on rowid
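3. **Normalize scores** — convert distance to similarity (`1 - distance`), so higher = more relevant. This assumes `embeddings` uses cosine distance, where `1 - distance` is the cosine similarity; under an L2 metric the value can go negative, and the multiplicative boosts below would then lower a chunk's rank instead of raising it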
4. **Boost current chapter** — if `ctx.currentChapterRef` is non-null and `metadata.book_id === ctx.currentChapterRef.book_id && metadata.chapter_num === ctx.currentChapterRef.chapter_num`, multiply score × 1.5
5. **Boost preferred scholars** — if `metadata.scholar_id ∈ profile.preferred_scholars`, multiply × 1.1
6. **Boost preferred tradition** — if `metadata.tradition ∈ profile.preferred_traditions`, multiply × 1.05
7. **Diversity filter** — no more than 2 chunks per `scholar_id` in final output (avoid single-scholar dominance)
8. **Return top 10** — sorted by final score desc
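
Step 2 could look roughly like the sketch below. The `SqlRunner` interface is a hypothetical stand-in for whatever SQLite wrapper `app/src/db/index.ts` exposes, and `vectorSearch`, `VectorHit`, and `toVecBlob` are illustrative names, not settled API. The query vector is bound as a float32 BLOB, one of the input formats sqlite-vec accepts; the join to `chunk_text` and `chunk_metadata` is left out for brevity.

```ts
// Hypothetical minimal driver surface; the app's real SQLite wrapper may differ.
interface SqlRunner {
  all(
    sql: string,
    params: readonly unknown[],
  ): Promise<ReadonlyArray<Record<string, unknown>>>;
}

interface VectorHit {
  rowid: number;
  distance: number;
}

const EMBEDDING_DIM = 1536; // dimension returned by /ai/embed (step 1)
const CANDIDATE_LIMIT = 40; // over-fetch so re-ranking has room to reorder

// KNN query from step 2; the candidate limit is a fixed integer constant.
const KNN_SQL =
  "SELECT rowid, distance FROM embeddings " +
  `WHERE embedding MATCH ? ORDER BY distance LIMIT ${CANDIDATE_LIMIT}`;

// Pack the vector as raw float32 bytes for binding as a BLOB parameter.
function toVecBlob(vec: readonly number[]): Uint8Array {
  return new Uint8Array(new Float32Array(vec).buffer);
}

async function vectorSearch(
  db: SqlRunner,
  queryVec: readonly number[],
): Promise<VectorHit[]> {
  if (queryVec.length !== EMBEDDING_DIM) {
    throw new Error(
      `expected ${EMBEDDING_DIM}-dim vector, got ${queryVec.length}`,
    );
  }
  const rows = await db.all(KNN_SQL, [toVecBlob(queryVec)]);
  // Rows arrive nearest-first because of ORDER BY distance.
  return rows.map((r) => ({
    rowid: Number(r.rowid),
    distance: Number(r.distance),
  }));
}
```

Keeping the driver behind a narrow interface lets the unit tests pass in a fake `SqlRunner` without loading the extension.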

Steps 3-8 are pure functions in `rerank.ts` for testability; the I/O steps (1 and 2) are isolated in `embed.ts` and `vectorSearch.ts`.
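
A minimal sketch of steps 3 to 8 as pure functions follows. The type names (`ScoredChunk`, `ProfilePrefs`) and function names are illustrative assumptions; the real `CompressedProfile` comes from #1452 and may be shaped differently. Two choices here are not specified above and are assumptions: chunks are sorted by boosted score before the diversity filter runs, so each scholar keeps their two highest-scoring chunks, and chunks with no `scholar_id` are never capped.

```ts
interface ChunkMeta {
  scholar_id?: string;
  tradition?: string;
  book_id?: string;
  chapter_num?: number;
}
interface ScoredChunk { chunk_id: string; score: number; metadata: ChunkMeta }
interface ChapterRef { book_id: string; chapter_num: number }
// Assumed subset of CompressedProfile (#1452) used by the boosts.
interface ProfilePrefs {
  preferred_scholars: readonly string[];
  preferred_traditions: readonly string[];
}

const CHAPTER_BOOST = 1.5;
const SCHOLAR_BOOST = 1.1;
const TRADITION_BOOST = 1.05;

// Step 3: distance -> similarity, higher is more relevant.
function toSimilarity(distance: number): number {
  return 1 - distance;
}

// Steps 4-6: multiplicative boosts; returns a new chunk, input is untouched.
function boostScore(
  chunk: ScoredChunk,
  profile: ProfilePrefs,
  ref: ChapterRef | null,
): ScoredChunk {
  const m = chunk.metadata;
  let score = chunk.score;
  if (ref !== null && m.book_id === ref.book_id && m.chapter_num === ref.chapter_num) {
    score *= CHAPTER_BOOST;
  }
  if (m.scholar_id !== undefined && profile.preferred_scholars.includes(m.scholar_id)) {
    score *= SCHOLAR_BOOST;
  }
  if (m.tradition !== undefined && profile.preferred_traditions.includes(m.tradition)) {
    score *= TRADITION_BOOST;
  }
  return { ...chunk, score };
}

// Step 7: keep at most `maxPerScholar` chunks per scholar_id.
// Expects input already sorted by score descending.
function diversityFilter(
  sorted: readonly ScoredChunk[],
  maxPerScholar = 2,
): ScoredChunk[] {
  const seen = new Map<string, number>();
  return sorted.filter((c) => {
    const id = c.metadata.scholar_id;
    if (id === undefined) return true; // assumption: uncapped when no scholar
    const n = seen.get(id) ?? 0;
    if (n >= maxPerScholar) return false;
    seen.set(id, n + 1);
    return true;
  });
}

// Steps 4-8 composed: boost, sort desc, diversify, truncate.
function rerank(
  candidates: readonly ScoredChunk[],
  profile: ProfilePrefs,
  ref: ChapterRef | null,
  topK = 10,
): ScoredChunk[] {
  const boosted = candidates
    .map((c) => boostScore(c, profile, ref))
    .sort((a, b) => b.score - a.score);
  return diversityFilter(boosted).slice(0, topK);
}
```

Because every function takes plain data and returns new values, the boost math and the per-scholar cap can be unit-tested with hand-written scores and no database.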

## Cost & latency targets

- Embed call: ~100-200ms via proxy
- sqlite-vec search: <50ms on-device (local)
- Re-rank + diversity: <10ms (pure compute)
- **Total retrieval latency target: <250ms p50, <500ms p95**

Measure and log these as perf metrics.
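
One way to capture these numbers is a small timing wrapper around each stage plus a percentile helper for checking the p50/p95 targets in tests or a benchmark script. This is a sketch under assumptions: `timed`, `StageTiming`, and `percentile` are hypothetical names, the `sink` callback stands in for a call into `app/src/utils/logger.ts` (whose API is not shown here), and the percentile uses the nearest-rank method.

```ts
interface StageTiming {
  stage: string;
  ms: number;
}

// Runs `fn`, then reports its wall-clock duration to `sink`,
// whether `fn` resolved or rejected.
async function timed<T>(
  stage: string,
  sink: (t: StageTiming) => void,
  fn: () => Promise<T>,
): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    sink({ stage, ms: performance.now() - start });
  }
}

// Nearest-rank percentile over collected samples (p in 0..100).
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new Error("percentile of empty sample set");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}
```

In the orchestrator this might wrap each stage, for example `timed("embed", sink, () => embed(query))`, so the embed, search, and re-rank budgets can be compared against their individual targets.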

## Error handling

- Proxy embed call fails → retry once with 1s backoff → if still fails, return `AmicusError.EMBED_FAILED` (upstream surfaces graceful fallback UI)
- sqlite-vec extension not loaded → throw `AmicusError.EXTENSION_NOT_LOADED` with clear message (developer error, app should have failed earlier)
- Zero results from vector search → return empty array (upstream treats as "out-of-corpus" case)
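
The retry-once rule for the embed call could be sketched as below. The result-union shape (`EmbedResult`) is an assumption chosen so the failure is returned as a typed value rather than thrown; the actual `AmicusError` type may be modelled differently. `EmbedFn` and `embedWithRetry` are hypothetical names, and the backoff is a parameter so tests do not have to wait a full second.

```ts
// Assumed signature for the embed.ts call; the real one may take more options.
type EmbedFn = (query: string) => Promise<number[]>;

// Typed outcome: either a vector or the EMBED_FAILED code.
type EmbedResult =
  | { ok: true; vector: number[] }
  | { ok: false; error: "EMBED_FAILED" };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// One initial attempt plus one retry after `backoffMs` (default 1s).
async function embedWithRetry(
  embed: EmbedFn,
  query: string,
  backoffMs = 1000,
): Promise<EmbedResult> {
  for (let attempt = 0; attempt < 2; attempt += 1) {
    try {
      return { ok: true, vector: await embed(query) };
    } catch {
      // Back off only before the retry, not after the final failure.
      if (attempt === 0) await sleep(backoffMs);
    }
  }
  return { ok: false, error: "EMBED_FAILED" };
}
```

Callers branch on `result.ok`, which keeps the "does not throw uncaught" acceptance criterion checkable by the type system.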

## Offline behavior

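If the proxy embed call fails because the device has no network connectivity (as opposed to a proxy-side error, which maps to `AmicusError.EMBED_FAILED`), the entire feature is unavailable, because retrieval cannot proceed without a query vector. Return `AmicusError.OFFLINE` so the UI can show the appropriate disabled state. No silent degraded mode.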
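
Telling the offline case apart from other embed failures needs some classification rule. The sketch below assumes `embed.ts` uses `fetch`, which rejects with a `TypeError` on network-level failures, and treats every other error as a proxy-side failure. `classifyEmbedFailure` is a hypothetical helper, and a real implementation might instead consult a platform connectivity API.

```ts
type EmbedFailureCode = "OFFLINE" | "EMBED_FAILED";

// Assumption: network-level fetch failures surface as TypeError;
// anything else (HTTP error mapped to Error, bad payload) is EMBED_FAILED.
function classifyEmbedFailure(err: unknown): EmbedFailureCode {
  return err instanceof TypeError ? "OFFLINE" : "EMBED_FAILED";
}
```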

---

## Acceptance criteria

- [ ] `retrieve()` returns up to 10 chunks for a test query in <500ms p95 on a mid-tier device
- [ ] Current-chapter boost applied correctly (verified by unit test with mock scores)
- [ ] Diversity filter enforces max 2 chunks per scholar
- [ ] Profile boosts applied conditionally and do not cause score inversions
- [ ] Embed failure returns typed error, does not throw uncaught
- [ ] Offline mode returns typed OFFLINE error
- [ ] Unit tests cover: boost math, diversity filter, score ordering, error paths
- [ ] No `any` types; strict TypeScript passes
- [ ] Matches existing service-layer conventions; no circular deps

## Out of scope

- Compressed profile generation — that's #1452
- The actual LLM call — separate service, wired in Phase 2 (#1455)
- Caching retrieval results — defer; most queries are unique

