Fix search performance: skip LLM query expansion and reranking by default #49
Merged
antoninbas merged 5 commits into main on Apr 18, 2026
…ault

Hybrid search was running three local GGUF models on every query (1.7B query expansion + 300M embed + 0.6B reranker), making searches take 30-80s on CPU. Now defaults to fast BM25+vector hybrid by passing pre-built queries to qmd, skipping LLM inference entirely.

Two new config options (both default false):

    knotes config set queryExpand true   # enable LLM query expansion
    knotes config set rerank true        # enable LLM reranking

Also fix a race condition in getStore(): concurrent callers (background embed + user search on startup) previously created separate store instances, loading models twice in parallel.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
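One common way to close this kind of race is to cache the in-flight promise rather than the resolved store, so that a second caller arriving during the first load receives the same pending promise. The sketch below illustrates that pattern only; `Store`, `createStore`, and `loadCount` are hypothetical stand-ins, and the actual fix in this PR may be structured differently.

```typescript
// Hypothetical stand-in for the real store; loading it is assumed to be expensive.
interface Store {
  id: number;
}

// Counts how many times a store was actually constructed (for illustration only).
let loadCount = 0;

// Hypothetical loader that simulates asynchronous model loading.
async function createStore(): Promise<Store> {
  loadCount += 1;
  await Promise.resolve(); // yield once, standing in for slow I/O
  return { id: loadCount };
}

// Cache the promise, not the resolved value: callers that arrive while the
// first load is still pending share the same promise and the same instance.
let storePromise: Promise<Store> | null = null;

function getStore(): Promise<Store> {
  if (storePromise === null) {
    storePromise = createStore().catch((err: unknown) => {
      // Reset on failure so a later call can retry instead of caching the error.
      storePromise = null;
      throw err;
    });
  }
  return storePromise;
}
```

With this shape, a background embed job and a user search that both call `getStore()` at startup would await one shared load instead of triggering two.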
- Replace search overlay with a full search screen (Ctrl+K or Search button toggles it)
- Search only fires on Enter or Search button click, not while typing
- Per-search mode selector: BM25 / Vector / Hybrid (segmented buttons)
- Query Expansion and Reranking checkboxes (Hybrid only, labelled "slow")
- Results show full snippet (500 chars), score badge, clickable to open note
- API: rerank and queryExpand params now accepted per-request, overriding config defaults
- CLI: --rerank and --expand flags added to knotes search
- MCP: rerank and queryExpand parameters added to knotes_search tool

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
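The per-request override behaviour described for the API can be sketched as a small merge step. This is a minimal illustration, not the project's code: `SearchOptions` and `resolveSearchOptions` are hypothetical names, and only the two option names and their `false` defaults come from the commit message.

```typescript
// Hypothetical option shape; the option names match the PR, the type does not.
interface SearchOptions {
  rerank: boolean;
  queryExpand: boolean;
}

// Both options default to false, as stated in the PR.
const configDefaults: SearchOptions = { rerank: false, queryExpand: false };

// Per-request values win when present; otherwise fall back to the config.
// `??` is used instead of `||` so that an explicit `false` in the request
// still overrides a `true` in the config.
function resolveSearchOptions(
  config: SearchOptions,
  request: Partial<SearchOptions> = {},
): SearchOptions {
  return {
    rerank: request.rerank ?? config.rerank,
    queryExpand: request.queryExpand ?? config.queryExpand,
  };
}
```

A request that omits both fields gets the fast defaults, while a single slow query can opt in with `{ rerank: true }` without changing the global config.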
- notes.getNote: add case-insensitive filename fallback so search results with qmd-lowercased paths (e.g. "notes/test") resolve to actual files (e.g. "notes/Test")
- search: use full bestChunk/body content in snippet (no backend truncation); CLI/MCP now receive the complete matched text
- SearchView: truncate snippet display at 600 chars with ellipsis so long notes don't overwhelm the results list

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
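The case-insensitive fallback can be illustrated as a lookup over the names that actually exist. This is a sketch under assumptions: `resolveNoteName` is a hypothetical helper, and the real `getNote` presumably works against the filesystem rather than an in-memory list.

```typescript
// Resolve a requested note name against the names that actually exist.
// Returns the real name, or undefined when nothing matches.
function resolveNoteName(requested: string, existing: string[]): string | undefined {
  // An exact match wins, so notes that differ only by case stay addressable.
  if (existing.includes(requested)) {
    return requested;
  }
  // Fallback: compare case-insensitively and return the first name found.
  const wanted = requested.toLowerCase();
  return existing.find((name) => name.toLowerCase() === wanted);
}
```

Trying the exact name first keeps the fallback from changing behaviour for callers that already pass the correct casing.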
BM25 and vector modes were using searchLex/searchVector, which return the full document body. Switching to store.search() with pre-built lex/vec queries gives the same search behavior but returns bestChunk (the specific matching section), consistent with how vector search actually works (chunk-level matching, not document-level).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
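The snippet handling described across these commits (prefer the matching chunk, send it untruncated from the backend, shorten it only for display) might look roughly like the following. `SearchHit`, `snippetFor`, and `truncateForDisplay` are hypothetical names, the real qmd result type may differ, and whether the ellipsis counts toward the 600-character limit is an assumption.

```typescript
// Hypothetical result shape; only the bestChunk and body names come from the commits.
interface SearchHit {
  body: string;
  bestChunk?: string;
}

// Backend side: prefer the matching chunk and fall back to the full body.
// No truncation here, so CLI and MCP callers receive the complete text.
function snippetFor(hit: SearchHit): string {
  return hit.bestChunk ?? hit.body;
}

// UI side: cap what the results list shows. Assumes the ellipsis is
// appended after the first maxChars characters rather than counted in them.
function truncateForDisplay(text: string, maxChars = 600): string {
  return text.length > maxChars ? text.slice(0, maxChars) + "…" : text;
}
```

Keeping truncation in the view layer means each consumer can choose its own limit without the backend discarding matched text.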
Summary
Performance fix
- Hybrid search now defaults to `rerank: false` and `queryExpand: false` (can be overridden globally via `knotes config set rerank true` / `knotes config set queryExpand true`)
- Fixed a race condition in `getStore()`: concurrent callers (background embed + user search on startup) previously created separate store instances, loading models twice
Search UX overhaul
- `rerank` and `queryExpand` options exposed in CLI (`--rerank`, `--expand`) and MCP (`rerank`, `queryExpand` params)
Search result improvements
- BM25 and vector modes now go through `store.search()` with pre-built queries, so results always return the specific matching section of the document rather than the full body from the start. This is semantically correct: vector search matches at chunk level, so the result should be the matching chunk.
- qmd lowercases paths in search results (`notes/Test` → `notes/test`), causing `getNote` to fail. Added case-insensitive filename fallback in `getNote`.
Test plan
- `knotes search <query> --rerank --expand` works from CLI
- `knotes_search` accepts `rerank` and `queryExpand` params
- `config show` displays `rerank: false` and `queryExpand: false`

🤖 Generated with Claude Code