Full-text Search for Note Content
Build a full-text search index over note content, enabling fast keyword-based search across all notes.
Approach: Inverted Index + Vector Index
Use a dual approach:
- Inverted Index — Tokenize content, build term->doc mapping for exact keyword search
- Vector Index — Use the existing VectorIndex with text embeddings for semantic search
Inverted Index Design
Storage schema:
cf "fts":
fts:{term} -> JSON array of { note_path, positions, score }
fts:meta:{note_path} -> JSON { checksum, word_count, last_indexed }
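As a rough illustration of the key layout above, the sketch below models the "fts" store as a plain Python dict of string keys to JSON strings. The helper names (`posting_key`, `meta_key`, `add_posting`) are hypothetical and not part of any existing code; a real implementation would write to the actual key-value store instead.

```python
import json


def posting_key(term: str) -> str:
    # Key for a term's posting list, following the fts:{term} layout.
    return f"fts:{term}"


def meta_key(note_path: str) -> str:
    # Key for per-note metadata, following the fts:meta:{note_path} layout.
    return f"fts:meta:{note_path}"


def add_posting(store: dict, term: str, note_path: str,
                positions: list, score: float) -> None:
    # `store` is a stand-in (assumption) for the "fts" store.
    key = posting_key(term)
    postings = json.loads(store.get(key, "[]"))
    # Drop any stale entry for this note so re-indexing is idempotent.
    postings = [p for p in postings if p["note_path"] != note_path]
    postings.append(
        {"note_path": note_path, "positions": positions, "score": score}
    )
    store[key] = json.dumps(postings)
```

Because tokenization splits on punctuation, a term should never contain ":", so the posting key for the term "meta" (`fts:meta`) stays distinct from every `fts:meta:{note_path}` key.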
Tokenization
- Split on whitespace and punctuation
- Lowercase all terms
- Stemming (Porter stemmer or a basic suffix stripper)
- Stop word removal (configurable list)
- Min word length: 2 chars
- Max word length: 50 chars
- Record each term's position to support phrase and proximity search
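The tokenization rules above could be sketched roughly as follows. The stop word list and the `basic_stem` helper are placeholders chosen for illustration; `basic_stem` is a crude suffix stripper, not the Porter algorithm. Positions are word ordinals in the original text, so skipped words still leave gaps that phrase and proximity matching can rely on.

```python
import re

# Illustrative stop word list (assumption); the spec calls for a configurable one.
STOP_WORDS = frozenset({"a", "an", "and", "are", "the", "of", "to", "in", "is"})
MIN_LEN, MAX_LEN = 2, 50


def basic_stem(term: str) -> str:
    # Crude suffix stripping as a stand-in for a real stemmer (not Porter).
    for suffix in ("ing", "ed", "es", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def tokenize(text: str, stop_words=STOP_WORDS) -> list:
    """Return (term, position) pairs for the indexable words in `text`."""
    tokens = []
    # [^\W_]+ matches runs of letters and digits, which splits on
    # whitespace and punctuation (including underscores).
    for position, match in enumerate(re.finditer(r"[^\W_]+", text.lower())):
        word = match.group()
        if not (MIN_LEN <= len(word) <= MAX_LEN):
            continue
        if word in stop_words:
            continue
        tokens.append((basic_stem(word), position))
    return tokens
```

The same function would need to run over search queries as well, so that query terms and indexed terms are normalized identically.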
Search Query Syntax
keyword — basic term search
"exact phrase" — phrase search
keyword1 AND keyword2 — boolean AND
keyword1 OR keyword2 — boolean OR
-keyword — exclusion
prefix* — prefix wildcard
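A minimal sketch of how the syntax above might be split into typed tokens is shown below. The `QueryToken` type and the `kind` names are hypothetical. The sketch only classifies tokens; operator precedence between AND and OR, and what an implicit space between two keywords means, are left open because the syntax list does not define them.

```python
import re
from typing import NamedTuple


class QueryToken(NamedTuple):
    kind: str   # "term", "phrase", "operator", "exclude", or "prefix"
    value: str


# A double-quoted phrase, or any run of non-whitespace characters.
_TOKEN_RE = re.compile(r'"[^"]*"|\S+')


def parse_query(query: str) -> list:
    tokens = []
    for raw in _TOKEN_RE.findall(query):
        if raw in ("AND", "OR"):
            # Operators are recognized only in upper case, as written in the syntax list.
            tokens.append(QueryToken("operator", raw))
        elif len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            tokens.append(QueryToken("phrase", raw[1:-1].lower()))
        elif raw.startswith("-") and len(raw) > 1:
            tokens.append(QueryToken("exclude", raw[1:].lower()))
        elif raw.endswith("*") and len(raw) > 1:
            tokens.append(QueryToken("prefix", raw[:-1].lower()))
        else:
            tokens.append(QueryToken("term", raw.lower()))
    return tokens
```

For example, `parse_query('rust AND "exact phrase" -draft note*')` yields a term, an operator, a phrase, an exclusion, and a prefix token, in that order.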
Ranking
- TF-IDF scoring
- Boost for title matches (frontmatter title)
- Boost for heading matches
- Recent notes ranked higher (time decay)
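One possible way to combine these ranking signals is sketched below. The boost factors, the 30-day half-life, and the smoothed IDF formula are all assumptions picked for illustration; none of them are specified above and they would need tuning against real notes.

```python
import math
from datetime import datetime, timezone

# Hypothetical tuning constants (assumptions, not from the spec).
TITLE_BOOST = 2.0
HEADING_BOOST = 1.5
HALF_LIFE_DAYS = 30.0


def tf_idf(term_count: int, doc_length: int,
           docs_with_term: int, total_docs: int) -> float:
    # Term frequency normalized by document length.
    tf = term_count / doc_length
    # Smoothed inverse document frequency; the +1 terms avoid division by zero.
    idf = math.log((1 + total_docs) / (1 + docs_with_term)) + 1.0
    return tf * idf


def time_decay(updated_at: datetime, now: datetime) -> float:
    # Exponential decay: the multiplier halves every HALF_LIFE_DAYS.
    age_days = max((now - updated_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def rank_score(base: float, in_title: bool, in_heading: bool,
               updated_at: datetime, now: datetime) -> float:
    score = base
    if in_title:
        score *= TITLE_BOOST
    if in_heading:
        score *= HEADING_BOOST
    return score * time_decay(updated_at, now)
```

Multiplying by the decay pushes very old notes toward zero. If old but highly relevant notes should stay findable, a floor such as `0.5 + 0.5 * time_decay(...)` may be a better fit.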
API Endpoint
GET /search?q=query&mode=fulltext&cursor=&limit=20
Response:
{
"results": [
{
"path": "note-path",
"title": "Note Title",
"snippet": "...keyword in **context**...",
"score": 0.85,
"updated_at": "2026-05-25T10:00:00Z"
}
],
"cursor": "base64-encoded-cursor",
"total_estimate": 42
}
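The response above only says the cursor is base64-encoded, so its contents are open. One option, sketched here as an assumption, is keyset pagination: the cursor carries the score and path of the last result returned, and the next page resumes strictly after that position. The function names and the cursor payload are hypothetical.

```python
import base64
import json


def encode_cursor(last_score: float, last_path: str) -> str:
    payload = json.dumps({"score": last_score, "path": last_path})
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_cursor(cursor: str):
    # An empty cursor (as in `cursor=&limit=20`) means "start from the top".
    if not cursor:
        return None
    data = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return data["score"], data["path"]


def paginate(ranked: list, cursor: str = "", limit: int = 20) -> dict:
    """`ranked` must be sorted by descending score, then ascending path."""
    position = decode_cursor(cursor)
    remaining = ranked
    if position is not None:
        score, path = position
        # Keep only results that sort strictly after the cursor position.
        remaining = [r for r in ranked
                     if (-r["score"], r["path"]) > (-score, path)]
    page = remaining[:limit]
    next_cursor = ""
    if len(remaining) > limit:
        next_cursor = encode_cursor(page[-1]["score"], page[-1]["path"])
    return {"results": page, "cursor": next_cursor,
            "total_estimate": len(ranked)}
```

Keying the cursor on (score, path) rather than a numeric offset keeps pages stable when notes are indexed between requests, at the cost of requiring a deterministic tie-break on path.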
Acceptance Criteria
Parent Epic
#275