ai-partner: client-side retrieval with sqlite-vec #1451

@CraigBuckmaster

Description

Parent epic: #1446 (Amicus — AI Study Partner v1)
Phase: 1 · Size: M · Depends on: #1448, #1450

Client-side retrieval pipeline: embed user query via proxy → vector search against local scripture.db → apply profile + context boosts → return top-10 chunks for the Amicus call.


Files to create

  • app/src/services/amicus/retrieval.ts — main retrieval orchestrator
  • app/src/services/amicus/vectorSearch.ts — wraps sqlite-vec queries against scripture.db
  • app/src/services/amicus/rerank.ts — profile/context boosting + diversity filter
  • app/src/services/amicus/embed.ts — calls /ai/embed via proxy
  • app/src/services/amicus/types.ts — shared types for the amicus service layer
  • app/src/services/amicus/__tests__/retrieval.test.ts — unit tests

Files to modify

Conventions to follow

  • Service layer pattern: follow existing app/src/services/ structure (small modules, pure functions where possible, side effects isolated to the orchestrator)
  • Logging: use app/src/utils/logger.ts — never console.log
  • TypeScript strict mode; no any types (matches Phase 4 architectural polish work)
  • Imports use @/ path alias where available

Core API surface

// retrieval.ts
export interface RetrievedChunk {
  chunk_id: string;
  source_type: "section_panel" | "chapter_panel" | "word_study" | "lexicon_entry"
             | "debate_topic" | "cross_ref_thread_note" | "journey_stop" | "meta_faq";
  source_id: string;
  text: string;
  score: number;             // final score after boosts
  metadata: {
    scholar_id?: string;
    tradition?: string;
    book_id?: string;
    chapter_num?: number;
    verse_start?: number;
    verse_end?: number;
    panel_type?: string;
  };
}

export interface RetrievalContext {
  query: string;
  profile: CompressedProfile;   // from #1452
  currentChapterRef: ChapterRef | null;   // e.g. { book_id: "romans", chapter_num: 9 }
}

export async function retrieve(ctx: RetrievalContext): Promise<RetrievedChunk[]>;

Returns up to 10 chunks, sorted by final score desc.
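
As a rough illustration of how the orchestrator could compose the pieces, here is a minimal sketch with injected dependencies. The names `RetrievalDeps`, `Candidate`, and `retrieveWith` are hypothetical and not part of this issue; the real `retrieve()` would wire in embed.ts, vectorSearch.ts, and rerank.ts, and would also handle the error cases described further down.

```typescript
// Hypothetical shapes for illustration only; the real types live in types.ts.
type Candidate = { chunk_id: string; score: number };

interface RetrievalDeps<C extends Candidate> {
  embed: (query: string) => Promise<number[]>;                 // stands in for embed.ts
  search: (vector: number[], limit: number) => Promise<C[]>;   // stands in for vectorSearch.ts
  rerank: (candidates: C[]) => C[];                            // stands in for rerank.ts
}

const CANDIDATE_LIMIT = 40; // LIMIT used by the vector search step
const MAX_RESULTS = 10;     // retrieve() returns up to 10 chunks

async function retrieveWith<C extends Candidate>(
  query: string,
  deps: RetrievalDeps<C>,
): Promise<C[]> {
  const vector = await deps.embed(query);
  const candidates = await deps.search(vector, CANDIDATE_LIMIT);
  if (candidates.length === 0) return []; // "out-of-corpus" case
  // Copy before sorting so the caller's array is not mutated.
  return [...deps.rerank(candidates)]
    .sort((a, b) => b.score - a.score)
    .slice(0, MAX_RESULTS);
}
```

Passing the three stages in as functions keeps the side effects at the edges, so the orchestrator can be unit-tested with fakes instead of a live proxy or database.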

Algorithm (match the plan's §6 design)

  1. Embed query — call proxy /ai/embed with query text; receive 1536-dim vector
  2. Vector search — SELECT rowid, distance FROM embeddings WHERE embedding MATCH ? ORDER BY distance LIMIT 40; join to chunk_text and chunk_metadata on rowid
  3. Normalize scores — convert distance to similarity (1 - distance), so higher = more relevant; this assumes a cosine distance metric, where distance stays in a bounded range
  4. Boost current chapter — if ctx.currentChapterRef is non-null and metadata.book_id == ctx.currentChapterRef.book_id && metadata.chapter_num == ctx.currentChapterRef.chapter_num, multiply score × 1.5
  5. Boost preferred scholars — if metadata.scholar_id ∈ profile.preferred_scholars, multiply × 1.1
  6. Boost preferred tradition — if metadata.tradition ∈ profile.preferred_traditions, multiply × 1.05
  7. Diversity filter — no more than 2 chunks per scholar_id in final output (avoid single-scholar dominance)
  8. Return top 10 — sorted by final score desc

Each re-ranking step (3–8) is a pure function for testability (see rerank.ts).
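
Steps 3 through 8 could be sketched roughly as follows. The type names (`Hit`, `Scored`, `Profile`) and function names are placeholders, not the final rerank.ts API. Two behaviors here are assumptions rather than anything this issue specifies: the diversity filter runs after sorting so that each scholar's two highest-scoring chunks are the ones kept, and chunks without a scholar_id are never capped.

```typescript
// Placeholder types for illustration; the real ones belong in types.ts.
interface Meta { scholar_id?: string; tradition?: string; book_id?: string; chapter_num?: number }
interface Hit { chunk_id: string; distance: number; metadata: Meta }
interface Scored { chunk_id: string; score: number; metadata: Meta }
interface ChapterRef { book_id: string; chapter_num: number }
interface Profile { preferred_scholars: string[]; preferred_traditions: string[] }

const BOOST = { chapter: 1.5, scholar: 1.1, tradition: 1.05 } as const;

// Step 3: distance -> similarity, so higher means more relevant.
function toSimilarity(h: Hit): Scored {
  return { chunk_id: h.chunk_id, score: 1 - h.distance, metadata: h.metadata };
}

// Steps 4-6: multiply in every boost that applies to this chunk.
function applyBoosts(c: Scored, profile: Profile, current: ChapterRef | null): Scored {
  const m = c.metadata;
  let factor = 1;
  if (current !== null && m.book_id === current.book_id && m.chapter_num === current.chapter_num) {
    factor *= BOOST.chapter;
  }
  if (m.scholar_id !== undefined && profile.preferred_scholars.includes(m.scholar_id)) {
    factor *= BOOST.scholar;
  }
  if (m.tradition !== undefined && profile.preferred_traditions.includes(m.tradition)) {
    factor *= BOOST.tradition;
  }
  return { ...c, score: c.score * factor };
}

// Step 7: keep at most maxPerScholar chunks per scholar_id; input must already be sorted.
function limitPerScholar(sorted: Scored[], maxPerScholar = 2): Scored[] {
  const seen = new Map<string, number>();
  return sorted.filter((c) => {
    const id = c.metadata.scholar_id;
    if (id === undefined) return true; // assumption: unattributed chunks are not capped
    const count = seen.get(id) ?? 0;
    if (count >= maxPerScholar) return false;
    seen.set(id, count + 1);
    return true;
  });
}

// Steps 3-8 composed: normalize, boost, sort desc, diversify, take top K.
function rerank(hits: Hit[], profile: Profile, current: ChapterRef | null, topK = 10): Scored[] {
  const boosted = hits.map(toSimilarity).map((c) => applyBoosts(c, profile, current));
  boosted.sort((a, b) => b.score - a.score);
  return limitPerScholar(boosted).slice(0, topK);
}
```

One caveat with multiplicative boosts: if a distance ever exceeds 1, the similarity goes negative and a boost would push that chunk further down instead of up. The sketch does not guard against this.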

Cost & latency targets

  • Embed call: ~100-200ms via proxy
  • sqlite-vec search: <50ms on-device (local)
  • Re-rank + diversity: <10ms (pure compute)
  • Total retrieval latency target: <250ms p50, <500ms p95

Measure and log these as perf metrics.
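
To check recorded timings against the p50/p95 targets, something along these lines might work. It uses the nearest-rank percentile definition, which is one of several common choices and is an assumption here. `percentile` and `checkLatency` are illustrative names; where the samples come from and how they reach the logger is left open.

```typescript
// Targets from this issue: total retrieval latency <250ms p50, <500ms p95.
const TARGETS_MS = { p50: 250, p95: 500 } as const;

// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samplesMs: readonly number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("percentile of empty sample set");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.min(sorted.length, Math.max(1, Math.ceil((p * sorted.length) / 100)));
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("unreachable: rank is within bounds");
  return value;
}

// Summarize a batch of total-latency samples and compare against the targets.
function checkLatency(samplesMs: readonly number[]): { p50: number; p95: number; ok: boolean } {
  const p50 = percentile(samplesMs, 50);
  const p95 = percentile(samplesMs, 95);
  return { p50, p95, ok: p50 < TARGETS_MS.p50 && p95 < TARGETS_MS.p95 };
}
```

With only a handful of samples the p95 is effectively the slowest run, so a meaningful check needs a reasonably large batch.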

Error handling

  • Proxy embed call fails → retry once with 1s backoff → if still fails, return AmicusError.EMBED_FAILED (upstream surfaces graceful fallback UI)
  • sqlite-vec extension not loaded → throw AmicusError.EXTENSION_NOT_LOADED with clear message (developer error, app should have failed earlier)
  • Zero results from vector search → return empty array (upstream treats as "out-of-corpus" case)
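
The retry-once rule for the embed call might look like the sketch below. The result shape (`EmbedResult`), the function name `embedWithRetry`, and the representation of `AmicusError` as a const object are all assumptions for illustration; the issue only fixes the error names. The wait function is injectable so tests do not have to sleep for a real second.

```typescript
// Assumed representation; the issue names AmicusError.EMBED_FAILED but not its shape.
const AmicusError = { EMBED_FAILED: "EMBED_FAILED" } as const;

type EmbedResult =
  | { ok: true; vector: number[] }
  | { ok: false; error: typeof AmicusError.EMBED_FAILED };

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Try the embed call, wait backoffMs after the first failure, then try exactly once more.
async function embedWithRetry(
  embedOnce: () => Promise<number[]>,
  backoffMs = 1000,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<EmbedResult> {
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      return { ok: true, vector: await embedOnce() };
    } catch {
      // Only back off between the first and second attempt.
      if (attempt === 0) await wait(backoffMs);
    }
  }
  // Both attempts failed: return a typed error instead of throwing.
  return { ok: false, error: AmicusError.EMBED_FAILED };
}
```

Returning a discriminated union instead of throwing means the caller has to handle the failure branch explicitly, which fits the requirement that an embed failure never surfaces as an uncaught exception.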

Offline behavior

If proxy embed fails due to network, the entire feature is unavailable — retrieval cannot proceed without a query vector. Return AmicusError.OFFLINE so the UI can show the appropriate disabled state. No silent degraded mode.
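
The issue does not say how a network failure is told apart from other embed failures. One possible approach, assuming embed.ts uses `fetch`, relies on the fact that `fetch` rejects with a `TypeError` when the request cannot be made at all, while an HTTP error status still resolves normally. The names `classifyEmbedFailure` and `tryEmbed` and the `AmicusError` representation are hypothetical.

```typescript
// Assumed representation of the error codes named in this issue.
const AmicusError = { EMBED_FAILED: "EMBED_FAILED", OFFLINE: "OFFLINE" } as const;
type AmicusErrorCode = (typeof AmicusError)[keyof typeof AmicusError];

// Assumption: a TypeError from the embed call means fetch could not reach the network.
function classifyEmbedFailure(err: unknown): AmicusErrorCode {
  return err instanceof TypeError ? AmicusError.OFFLINE : AmicusError.EMBED_FAILED;
}

type EmbedOutcome =
  | { ok: true; vector: number[] }
  | { ok: false; error: AmicusErrorCode };

// Wrap a single embed attempt so the caller always receives a typed outcome.
async function tryEmbed(embedOnce: () => Promise<number[]>): Promise<EmbedOutcome> {
  try {
    return { ok: true, vector: await embedOnce() };
  } catch (err: unknown) {
    return { ok: false, error: classifyEmbedFailure(err) };
  }
}
```

Whether an OFFLINE result should still go through the retry-once path or short-circuit immediately is a design decision this sketch leaves open.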


Acceptance criteria

  • retrieve() returns up to 10 chunks for a test query in <500ms p95 on a mid-tier device
  • Current-chapter boost applied correctly (verified by unit test with mock scores)
  • Diversity filter enforces max 2 chunks per scholar
  • Profile boosts applied conditionally and do not cause score inversions
  • Embed failure returns typed error, does not throw uncaught
  • Offline mode returns typed OFFLINE error
  • Unit tests cover: boost math, diversity filter, score ordering, error paths
  • No any types; strict TypeScript passes
  • Matches existing service-layer conventions; no circular deps

Out of scope
