Full-text Search for Note Content #283

@ElioNeto

Build a full-text search index over note content, enabling fast keyword-based search across all notes.

Approach: Inverted Index + Vector Index

Use a dual approach:

  1. Inverted Index — Tokenize content, build term->doc mapping for exact keyword search
  2. Vector Index — Use existing VectorIndex with text embeddings for semantic search

Inverted Index Design

Storage schema:

cf "fts":
  fts:{term}            -> JSON array of { note_path, positions, score }
  fts:meta:{note_path}  -> JSON { checksum, word_count, last_indexed }
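A minimal sketch of how one note could be written into this schema. It assumes a plain dict as a stand-in for the "fts" column family, and the names `fts_cf` and `index_note` are hypothetical. The `score` stored here is only a placeholder (term frequency normalized by note length); the proposal does not define it.

```python
import hashlib
import json
import time
from typing import Optional

# Stand-in for the "fts" column family: key -> JSON string (assumption).
fts_cf: dict[str, str] = {}

def index_note(note_path: str, tokens: list[str], now: Optional[float] = None) -> None:
    """Write postings and metadata for one note; tokens are already normalized."""
    positions: dict[str, list[int]] = {}
    for pos, term in enumerate(tokens):
        positions.setdefault(term, []).append(pos)

    for term, pos_list in positions.items():
        key = f"fts:{term}"
        postings = json.loads(fts_cf.get(key, "[]"))
        # Drop any stale posting for this note before appending the new one.
        postings = [p for p in postings if p["note_path"] != note_path]
        postings.append({
            "note_path": note_path,
            "positions": pos_list,
            # Placeholder score: term frequency normalized by note length.
            "score": len(pos_list) / len(tokens),
        })
        fts_cf[key] = json.dumps(postings)

    fts_cf[f"fts:meta:{note_path}"] = json.dumps({
        "checksum": hashlib.sha256(" ".join(tokens).encode()).hexdigest(),
        "word_count": len(tokens),
        "last_indexed": now if now is not None else time.time(),
    })
```

Note that this sketch does not remove postings for terms that disappeared from a re-indexed note; a real implementation would need to track the note's previous terms or scan for them.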

Tokenization

  • Split on whitespace and punctuation
  • Lowercase all terms
  • Stemming (Porter stemmer or a simpler suffix-stripping stemmer)
  • Stop word removal (configurable list)
  • Min word length: 2 chars
  • Max word length: 50 chars
  • Record each term's position (needed for phrase and proximity search)
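The tokenization rules above could be sketched roughly as follows. The stop word list and the `basic_stem` helper are illustrative assumptions, not part of the proposal; `basic_stem` is a crude suffix stripper standing in for a real Porter stemmer.

```python
import re

# Illustrative default; the proposal calls for a configurable list.
STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is"})
MIN_LEN, MAX_LEN = 2, 50

def basic_stem(term: str) -> str:
    """Very rough suffix stripping; a stand-in for a real Porter stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term

def tokenize(text: str, stop_words=STOP_WORDS) -> list[tuple[str, int]]:
    """Return (term, position) pairs; positions count every raw token,
    so gaps left by dropped stop words are preserved."""
    out = []
    # [^\W_]+ matches runs of letters and digits, splitting on whitespace,
    # punctuation, and underscores.
    for pos, raw in enumerate(re.findall(r"[^\W_]+", text.lower())):
        if not MIN_LEN <= len(raw) <= MAX_LEN or raw in stop_words:
            continue
        out.append((basic_stem(raw), pos))
    return out
```

Keeping positions relative to the raw token stream, rather than the filtered one, is one possible choice; it lets phrase matching account for skipped stop words.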

Search Query Syntax

  • keyword — basic term search
  • "exact phrase" — phrase search
  • keyword1 AND keyword2 — boolean AND
  • keyword1 OR keyword2 — boolean OR
  • -keyword — exclusion
  • prefix* — prefix wildcard
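One way to parse this syntax is into an OR-of-ANDs structure. The sketch below assumes that AND binds tighter than OR and that adjacent clauses without an operator are ANDed; the proposal does not specify either. `Clause` and `parse_query` are hypothetical names.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Clause:
    kind: str            # "term", "phrase", or "prefix"
    value: str
    negated: bool = False

# A quoted phrase (optionally preceded by "-") or a run of non-space characters.
TOKEN_RE = re.compile(r'-?"[^"]*"|\S+')

def parse_query(q: str) -> list[list[Clause]]:
    """Outer list: OR groups. Inner list: clauses that are ANDed together."""
    groups: list[list[Clause]] = [[]]
    for tok in TOKEN_RE.findall(q):
        if tok == "OR":
            if groups[-1]:
                groups.append([])
            continue
        if tok == "AND":
            continue  # AND is also the assumed default between clauses
        negated = tok.startswith("-") and len(tok) > 1
        if negated:
            tok = tok[1:]
        if len(tok) >= 2 and tok[0] == '"' and tok[-1] == '"':
            clause = Clause("phrase", tok[1:-1].lower(), negated)
        elif tok.endswith("*") and len(tok) > 1:
            clause = Clause("prefix", tok[:-1].lower(), negated)
        else:
            clause = Clause("term", tok.lower(), negated)
        groups[-1].append(clause)
    return [g for g in groups if g]
```

For example, `rust AND "full text" OR index* -draft` yields two groups: one requiring `rust` and the phrase `full text`, the other requiring the prefix `index` while excluding `draft`.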

Ranking

  • TF-IDF scoring
  • Boost for title matches (frontmatter title)
  • Boost for heading matches
  • Recent notes ranked higher (time decay)
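A rough sketch of how these ranking signals might combine into a single per-term score. The boost weights, the half-life, the multiplicative combination, and the name `rank_score` are all assumptions for illustration; the proposal only names the signals.

```python
import math

# Assumed weights, not taken from the proposal.
TITLE_BOOST = 2.0
HEADING_BOOST = 1.5
HALF_LIFE_DAYS = 180.0

def rank_score(tf: int, doc_len: int, df: int, n_docs: int,
               in_title: bool, in_heading: bool, age_days: float) -> float:
    """TF-IDF for one term in one note, scaled by boosts and recency."""
    term_freq = tf / doc_len                 # normalized term frequency
    idf = math.log(1 + n_docs / df)          # rarer terms weigh more
    boost = (TITLE_BOOST if in_title else 1.0) * (HEADING_BOOST if in_heading else 1.0)
    decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
    # Recency ranges from 1.0 (new) down toward 0.5 (very old),
    # so old notes are demoted but never zeroed out.
    recency = 0.5 + 0.5 * decay
    return term_freq * idf * boost * recency
```

A multi-term query would typically sum this value over the matched terms for each note.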

API Endpoint

GET /search?q=query&mode=fulltext&cursor=&limit=20

Response:

{
  "results": [
    {
      "path": "note-path",
      "title": "Note Title",
      "snippet": "...keyword in **context**...",
      "score": 0.85,
      "updated_at": "2026-05-25T10:00:00Z"
    }
  ],
  "cursor": "base64-encoded-cursor",
  "total_estimate": 42
}
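The `cursor` and `snippet` fields in this response could be produced along these lines. This is a sketch under assumptions: the cursor payload (last score plus last path) and the helper names are hypothetical, and the snippet matches raw query terms as substrings rather than stemmed tokens.

```python
import base64
import json
import re

def encode_cursor(last_score: float, last_path: str) -> str:
    """Pack the position of the last returned result into an opaque string."""
    payload = json.dumps({"score": last_score, "path": last_path})
    return base64.urlsafe_b64encode(payload.encode()).decode()

def decode_cursor(cursor: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))

def make_snippet(content: str, terms: list[str], radius: int = 40) -> str:
    """Return text around the first match, with every match wrapped in **...**."""
    if not terms:
        return content[: 2 * radius]
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    m = pattern.search(content)
    if m is None:
        return content[: 2 * radius]
    start = max(0, m.start() - radius)
    end = min(len(content), m.end() + radius)
    window = pattern.sub(lambda hit: f"**{hit.group(0)}**", content[start:end])
    return ("..." if start > 0 else "") + window + ("..." if end < len(content) else "")
```

Encoding the last `(score, path)` pair rather than an offset keeps pagination stable when notes are added between requests, at the cost of needing a deterministic tie-break on `path`.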

Acceptance Criteria

  • Inverted index built and updated on note write
  • Basic keyword search returns relevant results
  • Phrase search returns exact matches
  • Prefix wildcard works
  • Boolean operators (AND, OR) and exclusion (-keyword) work
  • Search 10K notes in < 100ms
  • Snippet generation with highlighted terms
  • Pagination via cursor
  • Integration tests with sample note corpus
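Several of these criteria could be exercised against a tiny in-memory corpus, as in the sketch below. It uses a positional inverted index without stemming or stop words, and all function names and the sample corpus are hypothetical; it is meant as a shape for the integration tests, not the real index.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[^\W_]+")

def build_index(corpus: dict[str, str]) -> dict[str, dict[str, list[int]]]:
    """Map term -> {note_path -> list of positions}."""
    index: dict[str, dict[str, list[int]]] = defaultdict(dict)
    for path, text in corpus.items():
        for pos, term in enumerate(WORD_RE.findall(text.lower())):
            index[term].setdefault(path, []).append(pos)
    return index

def search_term(index, term: str) -> set[str]:
    return set(index.get(term.lower(), {}))

def search_prefix(index, prefix: str) -> set[str]:
    prefix = prefix.lower()
    return {p for term, docs in index.items() if term.startswith(prefix) for p in docs}

def search_phrase(index, phrase: str) -> set[str]:
    """A note matches when the words occur at consecutive positions."""
    words = WORD_RE.findall(phrase.lower())
    if not words:
        return set()
    hits = set()
    for path, starts in index.get(words[0], {}).items():
        for s in starts:
            if all(s + i in index.get(w, {}).get(path, []) for i, w in enumerate(words)):
                hits.add(path)
                break
    return hits

CORPUS = {
    "a.md": "Full-text search with an inverted index",
    "b.md": "Vector index for semantic search",
    "c.md": "Text search is full of surprises",
}
INDEX = build_index(CORPUS)
```

Boolean AND and exclusion then reduce to set intersection and difference, for example `search_term(INDEX, "search") & search_term(INDEX, "vector")`.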

Parent Epic

#275

Labels: feat, notes (Note storage and indexing), obsidian (Obsidian-like note-taking features)
