Add minScore filter across CLI, HTTP, MCP, and web UI #52

Merged
antoninbas merged 1 commit into main from search-min-score
Apr 19, 2026

@antoninbas
Owner

Summary

Now that bm25 / vector / hybrid all return real relevance scores (PR #51), expose a minScore floor so callers can drop low-relevance results.

Default is 0 (no filtering). A non-zero default would hide legitimate matches on small corpora, and qmd's three modes have different score scales, so there's no single good value.

Surfaces

  • Core: search({ minScore }) — post-hoc filter for bm25/vector (qmd's searchLex/searchVector don't accept it); passed through to qmd for hybrid (qmd filters against the real fused score before the 1/rank overwrite)
  • HTTP API: ?minScore=<float>, validated as a non-negative finite number; invalid input returns 400
  • CLI: knotes search <q> --min-score <n> with per-mode scale docs in --help
  • MCP: knotes_search tool, same description in the schema
  • Web UI: number input next to the mode selector, with a hint that updates when the mode changes
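
For bm25 and vector, the post-hoc filter described above amounts to a single pass over the results. The following is a minimal sketch, not the code from this PR: the `SearchResult` shape, the `applyMinScore` name, and the inclusive `>=` comparison are assumptions made for illustration.

```typescript
// Hypothetical result shape; the real knotes/qmd types may differ.
interface SearchResult {
  path: string;
  score: number;
}

// Keep only results at or above the floor. A floor of 0 (the default)
// keeps everything, since all three modes return non-negative scores.
function applyMinScore(results: SearchResult[], minScore = 0): SearchResult[] {
  if (minScore <= 0) return results;
  return results.filter((r) => r.score >= minScore);
}

const hits: SearchResult[] = [
  { path: "notes/a.md", score: 0.91 },
  { path: "notes/b.md", score: 0.58 },
  { path: "notes/c.md", score: 0.27 },
];

const strong = applyMinScore(hits, 0.6); // only notes/a.md clears the floor
const everything = applyMinScore(hits);  // default 0: nothing is dropped
```

With the default of 0 the function returns its input unchanged, which matches the "no filtering" default chosen above.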

Per-mode scale documented in help text

  • bm25: sigmoid of BM25 in [0, 1) — ~0.3 weak, ~0.6 medium, ~0.9 strong
  • vector: cosine similarity in [0, 1] — ~0.3 noise floor, ~0.5 related
  • hybrid: fused RRF — much smaller, ~0.02–0.08 for good matches
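
Because the scale differs per mode, the help text and the web UI hint can be driven from one lookup keyed by mode. This is a sketch under assumptions: the `SearchMode` type, `MIN_SCORE_HINTS`, and `minScoreHint` are hypothetical names, and only the hint wording is taken from the list above.

```typescript
// The three mode names come from the PR; the type and constant names are illustrative.
type SearchMode = "bm25" | "vector" | "hybrid";

const MIN_SCORE_HINTS: Record<SearchMode, string> = {
  bm25: "sigmoid of BM25 in [0, 1): ~0.3 weak, ~0.6 medium, ~0.9 strong",
  vector: "cosine similarity in [0, 1]: ~0.3 noise floor, ~0.5 related",
  hybrid: "fused RRF: ~0.02–0.08 for good matches (much smaller scale)",
};

// Return the hint for the currently selected mode, e.g. to re-render
// the text next to the number input whenever the mode selector changes.
function minScoreHint(mode: SearchMode): string {
  return MIN_SCORE_HINTS[mode];
}

const hybridHint = minScoreHint("hybrid");
```

Typing the table as `Record<SearchMode, string>` makes the compiler flag a missing hint if a fourth mode is ever added.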

Test plan

  • npx vitest run — 146/146 pass (new: minScore threshold test + HTTP validation test)
  • npx tsc --noEmit clean for both backend and frontend
  • CI green before merging

Expose a score floor for search results, wired through every consumer.
Default 0 (no filtering) because the meaningful threshold varies by
mode and by corpus — any non-zero default risks hiding legitimate
matches on small corpora. Users can opt in; help text documents the
per-mode scale.

Per-mode behavior:
- bm25: searchLex has no minScore option in qmd, so filter post-hoc.
- vector: same — searchVector has no minScore option, filter post-hoc.
- hybrid: pass minScore to store.search(). qmd applies the filter
  against the real fused RRF score before overwriting score with
  1/(rank+1), so the filter works on real relevance even though we
  surface the value from explain.rrf.totalScore.
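
The order of operations in hybrid mode matters: once the score is overwritten with 1/(rank+1), a small floor such as 0.02 no longer removes anything. The sketch below illustrates that point only; it is not qmd's implementation, and the `Fused` and `Ranked` types and both function names are hypothetical.

```typescript
interface Fused { id: string; fusedScore: number; }
interface Ranked { id: string; score: number; }

// Sort by fused score, best first, without mutating the input.
const byFusedDesc = (items: Fused[]): Fused[] =>
  [...items].sort((a, b) => b.fusedScore - a.fusedScore);

// Filter on the fused score first, then overwrite with 1/(rank+1).
function filterThenRank(items: Fused[], minScore: number): Ranked[] {
  return byFusedDesc(items)
    .filter((r) => r.fusedScore >= minScore)
    .map((r, rank) => ({ id: r.id, score: 1 / (rank + 1) }));
}

// Overwrite first, then filter: the floor now compares against rank-based values.
function rankThenFilter(items: Fused[], minScore: number): Ranked[] {
  return byFusedDesc(items)
    .map((r, rank) => ({ id: r.id, score: 1 / (rank + 1) }))
    .filter((r) => r.score >= minScore);
}

const fused: Fused[] = [
  { id: "a", fusedScore: 0.06 },
  { id: "b", fusedScore: 0.03 },
  { id: "c", fusedScore: 0.01 },
];

const kept = filterThenRank(fused, 0.02);  // drops "c", whose fused score is 0.01
const naive = rankThenFilter(fused, 0.02); // keeps all three: 1, 0.5, 0.333... all exceed 0.02
```

Filtering before the overwrite is what lets a threshold in the ~0.02–0.08 range act on relevance rather than on position in the list.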

Scale documentation (CLI help, MCP description, web UI hint):
- bm25: sigmoid of BM25 in [0, 1); ~0.3 weak, ~0.6 medium, ~0.9 strong
- vector: cosine similarity in [0, 1]; ~0.3 noise floor, ~0.5 related
- hybrid: fused RRF; ~0.02–0.08 for good matches (much smaller scale)

HTTP API and CLI validate the value (non-negative finite number) and
return 400 / throw on garbage input. Web UI shows a dynamic hint that
updates with the selected mode.
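
The validation rule (non-negative finite number) can be sketched as a small parser shared by the HTTP and CLI layers. `parseMinScore` is a hypothetical helper, not a function from this PR, and treating a missing or empty value as the default 0 is an assumption.

```typescript
// Parse a raw query-string or CLI value into a score floor.
// Throws on anything that is not a non-negative finite number.
function parseMinScore(raw: string | undefined): number {
  // Assumption: an absent or blank value means "use the default of 0".
  if (raw === undefined || raw.trim() === "") return 0;

  const value = Number(raw);
  if (!Number.isFinite(value) || value < 0) {
    throw new RangeError(
      `minScore must be a non-negative finite number, got "${raw}"`,
    );
  }
  return value;
}

const floor = parseMinScore("0.5");       // 0.5
const fallback = parseMinScore(undefined); // 0
```

An HTTP handler could catch the `RangeError` and respond with status 400, while the CLI could let it propagate as a usage error.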

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@antoninbas merged commit 437f294 into main on Apr 19, 2026
6 checks passed
@antoninbas deleted the search-min-score branch on April 19, 2026 at 04:12