Merged
7 changes: 7 additions & 0 deletions CLAUDE.md
@@ -824,6 +824,13 @@ A schema-apply failure in CI fails the whole deploy on purpose — we'd
rather catch a bad migration than ship an app whose code expects
columns the DB doesn't have.

+**Do NOT tell the user to run `mise run sync` after a cloud merge to
+`main`.** The deploy's `sync-supabase` job applies the schema
+automatically on every merge, so a "remember to sync" reminder in an
+end-of-task summary is redundant noise. `mise run sync` is only worth
+mentioning as the way to try a schema change against the linked project
+*before* merging (step 3 above) — not as a post-merge step.
+
The sync job also merges the fork's Pages URL into the auth allowlist,
but that's orthogonal to schema — don't dwell on it in PR descriptions
for schema changes.
15 changes: 5 additions & 10 deletions docs/dev/embeddings.md
@@ -1,8 +1,7 @@
# Embeddings

The pipeline that vectorizes memories, thread summaries, recipes,
-wiki articles, samskara substrate, and Library document chunks so
-semantic search works.
+wiki articles, and samskara substrate so semantic search works.
Backfill (turning `embedding is null` rows into vectors) runs
server-side on a `pg_cron` schedule behind the `venice` edge
function; the browser still embeds *search queries* synchronously at
@@ -18,10 +17,9 @@ protocol the backfill drains through.

A memory that's just been written is `embedding is null`; so is a
thread whose title or summary changed (trigger-invalidated), and the
-same for recipes, wiki articles, substrate rows, and freshly-inserted
-document chunks. A `pg_cron` job
+same for recipes, wiki articles, and substrate rows. A `pg_cron` job
fires every 5 minutes, POSTs to the edge function's `/backfill`
-route, and the function claims pending rows across all six tables,
+route, and the function claims pending rows across all five tables,
asks Venice's `/embeddings` endpoint for vectors, and writes them
back under a claim guard - all server-side, no open tab required.
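
The bounded round-robin drain this file describes (one claim attempt per source per pass, a 50-row batch cap, a 25s time budget) can be sketched as below. This is a minimal, synchronous illustration and not the project's `runBackfill`: `Source`, `DrainOptions`, and `drain` are hypothetical names, in-memory queues stand in for the tables, the embed-and-save step is elided, and the "stop when a full pass claims nothing" rule is an assumption.

```typescript
// Hypothetical in-memory stand-in for a table with pending
// (`embedding is null`) rows.
interface Source {
  name: string;
  pending: string[]; // row ids still waiting for a vector
}

interface DrainOptions {
  batchCap: number; // max rows handled per invocation (the doc cites 50)
  timeBudgetMs: number; // max wall time per invocation (the doc cites 25s)
  now: () => number; // injected clock so the sketch is deterministic
}

// Round-robin: one claim attempt per source per pass. Stops when the cap or
// the time budget is hit, or (an assumption) when a full pass claims nothing.
function drain(sources: Source[], opts: DrainOptions): string[] {
  const started = opts.now();
  const handled: string[] = [];
  while (true) {
    let claimedThisPass = 0;
    for (const source of sources) {
      if (handled.length >= opts.batchCap) return handled;
      if (opts.now() - started >= opts.timeBudgetMs) return handled;
      const row = source.pending.shift(); // "claim" one row, if any
      if (row === undefined) continue;
      handled.push(`${source.name}:${row}`); // embed + save would happen here
      claimedThisPass += 1;
    }
    if (claimedThisPass === 0) return handled; // nothing pending anywhere
  }
}
```

Interleaving sources this way keeps one busy table from starving the others within a single capped invocation; whatever is left over waits for the next scheduled tick.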

@@ -77,7 +75,7 @@ later milestone (phase 4).
the service role by decoding the bearer JWT's `role` claim and
requiring `role === 'service_role'` (the gateway's verify_jwt has
already validated the signature). It then drives `runBackfill` over
-the six sources, bounded by a batch cap (50 rows) and a time budget
+the five sources, bounded by a batch cap (50 rows) and a time budget
(25s) per invocation. The schedule resumes the drain next tick.
- **`runBackfill(deps, opts)`** - round-robins one claim attempt per
source per pass; embeds and saves whatever it claims; stops when a
@@ -167,7 +165,7 @@ under the supervisor) still does, importing `LeaseCoordinator` from

## Interactions with other features

-- **Memory** - `memories` is one of the six backfill sources. The
+- **Memory** - `memories` is one of the five backfill sources. The
`clear_memory_embedding_on_change` trigger reselects edited rows.
`memory_search`'s vector path reads `memories.embedding`; ILIKE
fallback covers unembedded rows. See `./memory.md`.
@@ -185,9 +183,6 @@ under the supervisor) still does, importing `LeaseCoordinator` from
- **Wiki / Samskara** - `wiki_articles` and `samskara_substrate` are
sources; the substrate claim skips unassimilated rows
(`situation is null`). See `./wiki.md`, `./samskara.md`.
-- **Library** - `document_chunks` is the sixth source; chunk rows are
-inserted `embedding is null` at upload and drained by the same sweep.
-Chunk content embeds verbatim (no title prefix). See `./library.md`.
- **Shared config** - the function reads the project-global Venice
key from `app_config` server-side (service role). The browser's
`app.serverConfig` copy is the same row, fetched post-auth, staged
270 changes: 116 additions & 154 deletions docs/dev/library.md

Large diffs are not rendered by default.

25 changes: 13 additions & 12 deletions docs/user/library.md
@@ -33,11 +33,12 @@ different jobs:
description helps both you and the assistant find it later.
4. Save it to the Library.

-When you upload, Nak extracts the document's text and, a few minutes
-later, indexes it for search in the background. A document shows
-**Processing** until its text is extracted, and **Not searchable** if
-the file had no extractable text (a scanned image with no text layer,
-for example) - the original is still downloadable in that case.
+When you upload, Nak extracts the document's text, which usually takes
+a few seconds. A document shows **Processing** until extraction
+finishes, then it's immediately searchable. A document is marked **Not
+searchable** if the file had no extractable text (a scanned image with
+no text layer, for example) - the original is still downloadable in
+that case.

Supported files are anything text can be extracted from: plain text,
Word documents, PDFs, and similar. The original file is stored
@@ -48,21 +49,21 @@ privately and is only reachable by you.
You don't have to do anything special. Just ask. When a question would
be answered by your paperwork - "what's my deductible?", "does my
policy cover water damage?", "what does the HOA say about fences?" -
-the assistant searches your Library passage-by-passage and answers from
-the exact relevant section, even inside a long PDF. It cites which
-document the answer came from.
+the assistant searches inside your documents for the relevant section
+and answers from it, even inside a long PDF. It cites which document
+the answer came from.

-Because search works on individual passages, a forty-page contract is
-just as findable as a one-page letter - the answer doesn't get lost.
+The assistant searches the full text, so a forty-page contract is just
+as findable as a one-page letter - the answer doesn't get lost.

## Managing documents

From a document's page in the Library panel you can:

- **Edit the description** to clarify what it's for.
- **Download the original** file.
-- **Delete** the document - this permanently removes its text, its
-search index, and the stored original.
+- **Delete** the document - this permanently removes its text and the
+stored original.

The assistant can also help you manage the Library in conversation. It
can:
22 changes: 3 additions & 19 deletions src/components/LibraryList.svelte
@@ -5,9 +5,8 @@
* `documentStore.query` so a search keystroke filters the listing in place.
*
* Browse order is newest-first (the Library is curated reference material,
-* read most-recent-first, not browsed alphabetically). Search is the same
-* debounced passage-search pipeline the assistant uses for `doc_search` -
-* see `searchDocumentsSemantic` in `$lib/documents` - deduped to documents.
+* read most-recent-first, not browsed alphabetically). Search is a debounced
+* substring match over the user's documents (`SupabaseService.searchDocuments`).
* Clicking a row sets `route.document_id` so the main panel renders that
* document.
*/
@@ -42,7 +41,7 @@
debounceTimer = setTimeout(() => {
debounceTimer = null;
if (!app.supabase) return;
-void runDocumentSearch(app.supabase, app.venice);
+void runDocumentSearch(app.supabase);
}, SEARCH_DEBOUNCE_MS);
return () => {
if (debounceTimer !== null) {
@@ -103,11 +102,6 @@
>{statusLabel(d.extraction_status)}</span>
{/if}
</span>
-{#if documentStore.snippets[d.id]}
-<!-- Snippet shown only in search mode - the passage that matched,
-so the user can see why a document came up. -->
-<span class="library-row-snippet">{documentStore.snippets[d.id]}</span>
-{/if}
</button>
</div>
{/each}
@@ -172,16 +166,6 @@
color: var(--error, #c0392b);
opacity: 0.9;
}
-.library-row-snippet {
-font-size: 0.76rem;
-opacity: 0.6;
-display: -webkit-box;
--webkit-line-clamp: 2;
-line-clamp: 2;
--webkit-box-orient: vertical;
-overflow: hidden;
-max-width: 100%;
-}
.library-list-sentinel {
min-height: 1px;
padding: 0.5rem 0;
12 changes: 6 additions & 6 deletions src/lib/chat-prompt.ts
@@ -189,12 +189,12 @@ You cannot edit wiki articles directly. When the user asks to consolidate duplic
// via search rather than auto-injection.
const LIBRARY_BLOCK = `\
The application also maintains a document Library: whole files the user has uploaded to keep as long-term reference material - insurance policies, contracts, HOA agreements, tax documents, anything text can be extracted from. Unlike message attachments (which expire), Library documents are permanent and fully searchable.
-Document contents are NEVER auto-injected. To work with a long document, use the same loop you would on a large file: find the right place, then read around it.
-- doc_search: fuzzy/semantic search across all documents, passage by passage. Use it when you do NOT know the exact wording - "does my policy cover water damage". Returns the most relevant passages with their source document.
-- doc_grep: exact regex search inside a document (like grep -n), returning matching lines with line numbers and context. Use it when you DO know a keyword or phrase - "late fee", "quorum", a section number. This is usually the fastest way to pin down a specific clause in a big document.
-- doc_read: read a range of lines by number. Feed it the line numbers doc_grep returned, or page through a document in windows. doc_get tells you the total line count so you know the range you can address.
-- doc_list / doc_get: list documents, and fetch one document's metadata + line count (not its text - use doc_read for that).
-Typical flow: doc_search or doc_list to land on the right document, then doc_grep for the exact clause, then doc_read the surrounding lines. Prefer doc_grep + doc_read over dumping a whole document into context.
+Document contents are NEVER auto-injected. Work a document the same way you would a large source file: find which document, find the right place in it, then read around that place.
+- doc_list: list the user's documents with their titles and descriptions. This is how you pick which document a question is about - read the descriptions and choose.
+- doc_grep: exact regex search inside a document (like grep -n), returning matching lines with line numbers and context. This is the primary way to find a specific clause - "late fee", "quorum", a section number. Omit the document id to grep across every document at once. Broaden the regex with alternations (e.g. "water|flood|leak|seepage") when the user's wording might differ from the document's.
+- doc_read: read a range of lines by number. Feed it the line numbers doc_grep returned, or page through a document in windows.
+- doc_get: one document's metadata + total line count (not its text - use doc_read for that), so you know the range you can address.
+Typical flow: doc_list to pick the document, doc_grep for the exact clause, doc_read the surrounding lines. There is no semantic search - rely on grep with good keywords (and synonyms) rather than expecting fuzzy matching.
To save a file the user attached to THIS conversation as a permanent document, enable the \`library\` toolbox and call doc_create (identify the file by its filename, and always give it a clear description of what it is for). Use doc_update to rename a document or fix its description, and doc_delete when the user says a document is obsolete (e.g. they changed insurers and the old policy should go).
`;
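
The grep-then-read loop the prompt describes can be illustrated with a small sketch. It is not the project's `doc_grep` or `doc_read` implementation: `grepLines`, `readLines`, and `GrepHit` are hypothetical names, and the sketch assumes a document's extracted text is a single newline-separated string addressed by 1-based line numbers.

```typescript
// Hypothetical shape of one match: 1-based line number, the matching line,
// and a window of surrounding lines (like `grep -n -C`).
interface GrepHit {
  line: number;
  text: string;
  context: string[];
}

// Return every line matching `pattern`, with `contextLines` lines either side.
function grepLines(text: string, pattern: RegExp, contextLines = 1): GrepHit[] {
  const lines = text.split("\n");
  // Drop the g/y flags so RegExp.test stays stateless from line to line.
  const re = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ""));
  const hits: GrepHit[] = [];
  lines.forEach((lineText, i) => {
    if (!re.test(lineText)) return;
    const from = Math.max(0, i - contextLines);
    const to = Math.min(lines.length, i + contextLines + 1);
    hits.push({ line: i + 1, text: lineText, context: lines.slice(from, to) });
  });
  return hits;
}

// Read an inclusive, 1-based range of lines, e.g. around a grep hit.
function readLines(text: string, start: number, end: number): string[] {
  return text.split("\n").slice(start - 1, end);
}
```

Called with an alternation such as `/water|flood|leak|seepage/i`, `grepLines` finds a clause even when the user's word ("water") differs from the document's ("flood"), which is the substitute for fuzzy matching that the prompt recommends.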

57 changes: 16 additions & 41 deletions src/lib/documents-store.svelte.ts
@@ -7,30 +7,25 @@
*
* Parallel to `wiki-store.svelte.ts`, with two differences:
* - Browse order is newest-first (created_at desc), not alphabetical.
-* - Search is passage-level: searchDocumentsSemantic returns chunk hits,
-* which we dedupe to unique documents in relevance order and resolve back
-* to full rows via getDocumentsByIds. `snippets` carries the best-matching
-* passage per document so the list can show why a doc matched.
+* - Search is a plain substring match over the user's documents
+* (`SupabaseService.searchDocuments`) returning whole documents - there is
+* no embedding/passage layer. The chat model's precise in-document search
+* is doc_grep; this drawer surface is browse-by-keyword.
*/
import {
DEFAULT_LIST_PAGE_SIZE,
type Document,
type SupabaseService,
} from './supabase';
-import type { VeniceClient } from './venice';
-import { searchDocumentsSemantic } from './documents';

interface DocumentStore {
/**
* The documents currently shown. Browse (empty query) is an offset window
* paged newest-first by `listDocumentsPage`, grown by the sidebar's
* infinite-scroll sentinel. Search (non-empty query) is a capped, unpaged
-* relevance set; `hasMore` is forced false there.
+* match set; `hasMore` is forced false there.
*/
results: Document[];
-/** Best-matching passage per document id, populated during a search so the
-* list can show the snippet that matched. Empty in browse mode. */
-snippets: Record<string, string>;
loading: boolean;
loaded: boolean;
error: string | null;
@@ -42,7 +37,6 @@ interface DocumentStore {

export const documentStore = $state<DocumentStore>({
results: [],
-snippets: {},
loading: false,
loaded: false,
error: null,
@@ -52,9 +46,8 @@ export const documentStore = $state<DocumentStore>({
loadingMore: false,
});

-// Match the assistant's doc_search reach so a drawer search never hides a
-// document the assistant can find passages in.
-const DOCUMENT_SEARCH_CHUNK_LIMIT = 60;
+// Cap on the drawer's keyword-search result set.
+const DOCUMENT_SEARCH_LIMIT = 100;

let currentAbort: AbortController | null = null;

@@ -65,7 +58,6 @@ async function loadDocumentsFirstPage(supabase: SupabaseService): Promise<void>
if (currentAbort) currentAbort.abort();
documentStore.loading = true;
documentStore.error = null;
-documentStore.snippets = {};
try {
const page = await supabase.listDocumentsPage({
offset: 0,
@@ -104,14 +96,11 @@ export async function loadMoreDocuments(supabase: SupabaseService): Promise<void

/**
* Drive `documentStore` from the bound query. Empty query -> the newest-first
-* browse list; non-empty -> the passage search, deduped to documents in
-* relevance order. Callers should debounce - this runs immediately. Cancels
-* any in-flight search so a stale result can't clobber the latest query.
+* browse list; non-empty -> a substring match over the user's documents.
+* Callers should debounce - this runs immediately. Cancels any in-flight load
+* so a stale result can't clobber the latest query.
*/
-export async function runDocumentSearch(
-supabase: SupabaseService,
-venice: VeniceClient | null
-): Promise<void> {
+export async function runDocumentSearch(supabase: SupabaseService): Promise<void> {
if (documentStore.query.trim().length === 0) {
return loadDocumentsFirstPage(supabase);
}
@@ -121,28 +110,14 @@ export async function runDocumentSearch(
documentStore.loading = true;
documentStore.error = null;
try {
-const hits = await searchDocumentsSemantic(
-documentStore.query.trim(),
-DOCUMENT_SEARCH_CHUNK_LIMIT,
-{ supabase, venice, signal: ctl.signal }
-);
-if (ctl.signal.aborted) return;
-
-// Dedupe chunk hits to unique documents, preserving relevance order, and
-// keep the first (best) passage per document as the snippet.
-const orderedIds: string[] = [];
-const snippets: Record<string, string> = {};
-for (const hit of hits) {
-if (!(hit.document_id in snippets)) {
-orderedIds.push(hit.document_id);
-snippets[hit.document_id] = hit.content;
-}
-}
-const docs = await supabase.getDocumentsByIds(orderedIds);
+const docs = await supabase.searchDocuments({
+query: documentStore.query.trim(),
+limit: DOCUMENT_SEARCH_LIMIT,
+});
if (ctl.signal.aborted) return;
documentStore.results = docs;
-documentStore.snippets = snippets;
documentStore.offset = docs.length;
+// Search results are capped, not paged - close the sentinel.
documentStore.hasMore = false;
} catch (err) {
if (ctl.signal.aborted) return;
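
The "cancel any in-flight load so a stale result can't clobber the latest query" guard used in this file can be shown in isolation. The sketch below is a simplified illustration, not the store's actual code: `latestAbort`, `searchState`, and `runLatestWins` are hypothetical names, and the `search` callback stands in for the Supabase call.

```typescript
// Hypothetical module-level state: the controller of the most recent run and
// the results that the UI would render.
let latestAbort: AbortController | null = null;
const searchState = { results: [] as string[] };

// Latest-wins: starting a new run aborts the previous one, and each run
// re-checks its own signal after every await before touching shared state.
async function runLatestWins(
  query: string,
  search: (q: string) => Promise<string[]>
): Promise<void> {
  if (latestAbort) latestAbort.abort();
  const ctl = new AbortController();
  latestAbort = ctl;
  try {
    const docs = await search(query);
    if (ctl.signal.aborted) return; // superseded while awaiting: drop result
    searchState.results = docs;
  } catch (err) {
    if (ctl.signal.aborted) return; // errors from superseded runs are ignored
    throw err;
  }
}
```

The check after the `await` is what matters here: even if the underlying request cannot actually be cancelled, a slow response for an older query is discarded instead of overwriting the newer one.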