ai-partner: add sqlite-vec to scripture.db build

**Parent epic:** #1446 (Amicus — AI Study Partner v1)
**Phase:** 1 · **Size:** S · **Depends on:** #1447

Integrate the `sqlite-vec` extension into the scripture.db build pipeline so vector search runs on-device with no extra setup for the user. Merge `embeddings.db` (output of #1447) into `scripture.db` as a vec0 virtual table.

---

## Files to modify

- `_tools/build_sqlite.py` — add new phase after schema creation that loads sqlite-vec and populates the `embeddings` vec0 virtual table from `embeddings.db`
- `_tools/build_sqlite_schema.py` — add the vec0 table DDL and the companion `chunk_text` + `chunk_metadata` tables
- `_tools/build_sqlite_loaders.py` — add `populate_embeddings(conn)` loader function matching the existing loader signature pattern
- `_tools/validate_sqlite.py` — add integrity checks (see acceptance)
- `app/package.json` — add native deps needed for sqlite-vec (see "Client-side loading")

## Files to create

- `_tools/sqlite_vec_loader.py` — small helper that wraps `sqlite_vec.loadable_path()` and handles cross-platform loading for the build machine

---

## Schema additions (in `build_sqlite_schema.py`)

Add these to the SCHEMA string. The vec0 table is declared as a virtual table; companion tables hold text + metadata keyed by chunk_id.

```sql
-- Vector embeddings for Amicus retrieval
CREATE VIRTUAL TABLE embeddings USING vec0(
  embedding FLOAT[1536]
);

CREATE TABLE chunk_text (
  chunk_id     TEXT PRIMARY KEY,
  text         TEXT NOT NULL
);

CREATE TABLE chunk_metadata (
  chunk_id      TEXT PRIMARY KEY,
  source_type   TEXT NOT NULL,
  source_id     TEXT NOT NULL,
  scholar_id    TEXT,
  tradition     TEXT,
  book_id       TEXT,
  chapter_num   INTEGER,
  verse_start   INTEGER,
  verse_end     INTEGER,
  panel_type    TEXT
);

CREATE INDEX idx_chunk_metadata_source ON chunk_metadata(source_type, source_id);
CREATE INDEX idx_chunk_metadata_chapter ON chunk_metadata(book_id, chapter_num);
```

The vec0 virtual table has an implicit `rowid` that maps to chunk_id via the order of insertion. `populate_embeddings()` inserts into vec0 and chunk_text/chunk_metadata in matching order so rowid ↔ chunk_id is consistent.

## Build-time loader (`populate_embeddings`)

Match the signature + style of existing loaders (`populate_books`, `populate_verses`, etc.). Pseudocode:

```python
def populate_embeddings(conn: sqlite3.Connection) -> None:
    embeddings_db = ROOT / 'embeddings.db'
    if not embeddings_db.exists():
        print("  [WARN] embeddings.db not found — skipping embeddings table")
        print("         Run: python _tools/build_embeddings.py")
        return

    # Load sqlite-vec extension into the target conn
    conn.enable_load_extension(True)
    import sqlite_vec
    sqlite_vec.load(conn)
    conn.enable_load_extension(False)

    # Attach embeddings.db read-only; copy rows into scripture.db target tables
    src = sqlite3.connect(f"file:{embeddings_db}?mode=ro", uri=True)
    cur = conn.cursor()
    rows = src.execute("SELECT chunk_id, source_type, source_id, text, metadata_json, embedding FROM embedding_chunks").fetchall()

    for chunk_id, src_type, src_id, text, meta_json, emb_blob in rows:
        meta = json.loads(meta_json)
        cur.execute("INSERT INTO embeddings(embedding) VALUES (?)", (emb_blob,))
        cur.execute("INSERT INTO chunk_text VALUES (?, ?)", (chunk_id, text))
        cur.execute("""INSERT INTO chunk_metadata VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
            (chunk_id, src_type, src_id, meta.get('scholar_id'), meta.get('tradition'),
             meta.get('book_id'), meta.get('chapter_num'), meta.get('verse_start'),
             meta.get('verse_end'), meta.get('panel_type')))

    conn.commit()
    src.close()
    print(f"  [OK] Embeddings: {len(rows)} chunks populated")
```

## Client-side loading (`app/`)

React Native needs a native module for sqlite-vec. Add to `app/package.json`:

```
"expo-sqlite": "(latest matching Expo SDK 54)",
"@op-engineering/op-sqlite": "(if needed for extension loading, eval at implementation time)"
```

Verify at implementation which sqlite-vec approach works with Expo SDK 54. Document the chosen path in a code comment in `app/src/db/index.ts` where the scripture.db connection is opened.

## Validation (`_tools/validate_sqlite.py`)

Add a new section `6. EMBEDDINGS` that verifies:
- `embeddings` virtual table exists and has rows
- Row counts match across `embeddings`, `chunk_text`, `chunk_metadata`
- No orphan chunk_ids (every chunk_text row has matching chunk_metadata)
- Sample vector search returns results (smoke test with a canned query vector)
- Total DB size under 150MB (hard cap before we need further optimization)

---

## Acceptance criteria

- [ ] `python _tools/build_sqlite.py` completes successfully after `build_embeddings.py` has run
- [ ] `scripture.db` contains populated `embeddings`, `chunk_text`, `chunk_metadata` tables
- [ ] `python _tools/validate_sqlite.py` passes with new section 6 EMBEDDINGS green
- [ ] Smoke test query via sqlite CLI: `SELECT rowid, distance FROM embeddings WHERE embedding MATCH ? ORDER BY distance LIMIT 5` returns results
- [ ] DB size under 150MB
- [ ] Skip-with-warning behavior works when `embeddings.db` absent (doesn't break existing build)
- [ ] Client-side sqlite-vec loading documented in `app/src/db/index.ts`
- [ ] Works on Windows build machine

## Out of scope

- Runtime retrieval logic in the app — that's #1451
- Re-embedding flow for user-edited content — handled by #1447 incremental mode


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ai-partner: add sqlite-vec to scripture.db build #1448

Files to modify

Files to create

Schema additions (in `build_sqlite_schema.py`)

Build-time loader (`populate_embeddings`)

Client-side loading (`app/`)

Validation (`_tools/validate_sqlite.py`)

Acceptance criteria

Out of scope

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

ai-partner: add sqlite-vec to scripture.db build #1448

Description

Files to modify

Files to create

Schema additions (in build_sqlite_schema.py)

Build-time loader (populate_embeddings)

Client-side loading (app/)

Validation (_tools/validate_sqlite.py)

Acceptance criteria

Out of scope

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Schema additions (in `build_sqlite_schema.py`)

Build-time loader (`populate_embeddings`)

Client-side loading (`app/`)

Validation (`_tools/validate_sqlite.py`)