fix: split multi-word queries in standalone memory_smart_search (#148)
String.includes(query) required the full query to appear as a contiguous substring, so "BEAM bi-level LLM" returned nothing even when each word existed across title/content/concepts. Split on whitespace and require every token to match (AND semantics). Single-word queries are unaffected — a one-element array with .every behaves identically to the old .includes. Fixes #147
Walkthrough
The search filter in the standalone memory search function now tokenizes the query into individual words and verifies that every word appears in the indexed text, rather than requiring the entire query string to match as a contiguous substring.
Estimated code review effort: 1 (Trivial), ~3 minutes
Pre-merge checks: 4 passed, 1 failed (1 warning)
🧹 Nitpick comments (1)
src/mcp/standalone.ts (1)
131-131: Precompute query tokens once per request.
Line 131 splits `query` for every memory row. Move tokenization outside `.filter(...)` to avoid repeated regex work.
♻️ Suggested tweak
     const query = rawQuery.trim().toLowerCase();
    +const queryWords = query.split(/\s+/);
     const limit = parseLimit(args.limit);
     const all = await kvInstance.list<Record<string, unknown>>("mem:memories");
     const results = all
       .filter((m) => {
    @@
    -    return query.split(/\s+/).every((word) => text.includes(word));
    +    return queryWords.every((word) => text.includes(word));
       })
       .slice(0, limit);

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mcp/standalone.ts` at line 131, the filter currently calls `query.split(/\s+/)` for every memory row, which repeats regex work. Compute the query tokens once per request (e.g., `const queryTokens = query.split(/\s+/)`) outside the `.filter(...)` callback, then replace the per-row check with `queryTokens.every(word => text.includes(word))` in the function that performs the memory filtering (the location using `query.split(/\s+/)` and the return line). This moves tokenization out of the per-row loop and avoids repeated regex execution.
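The hoisting suggested above can be sketched as follows. This is a minimal illustration, not the code in `standalone.ts`: the `Row` type, the `searchRows` function, and the in-memory `rows` array are hypothetical stand-ins for the KV-backed memory list.

```typescript
// Hypothetical row shape; the real records carry more fields.
type Row = { title: string; content: string };

// Assumed helper for illustration: filters rows with AND semantics over tokens.
function searchRows(rows: Row[], rawQuery: string, limit: number): Row[] {
  const query = rawQuery.trim().toLowerCase();
  // Tokenize once per request instead of once per row.
  const queryWords = query.split(/\s+/);
  return rows
    .filter((row) => {
      const text = `${row.title} ${row.content}`.toLowerCase();
      // Every token must appear somewhere in the row's searchable text.
      return queryWords.every((word) => text.includes(word));
    })
    .slice(0, limit);
}

const rows: Row[] = [
  { title: "BEAM planner", content: "uses an LLM heuristic" },
  { title: "BEAM only", content: "no model here" },
  { title: "Unrelated", content: "LLM notes" },
];

// Extra whitespace around and between words does not produce empty tokens,
// because the query is trimmed first and the split pattern is /\s+/.
const hits = searchRows(rows, "  BEAM   LLM ", 10);
```

Behavior is unchanged by the hoist; the only difference is that the regex split runs once rather than once per memory row.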
Fixes #147
Root cause
`standalone.ts` builds a concatenated searchable string from title, content, files, concepts, and sessionIds, then checks whether that string contains the whole query via `String.includes(query)`.
A multi-word query like `"BEAM bi-level LLM heuristic"` requires that exact substring to appear contiguously, so it won't match when the words are spread across different fields.
Fix
Each token in the query must appear somewhere in the searchable text (AND semantics). Single-word queries are unaffected.
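A minimal sketch of the difference between the two checks, assuming a hypothetical `MemoryRecord` shape and a `buildText` helper that joins fields into one lowercase string (neither is taken from `standalone.ts`; they exist only for illustration):

```typescript
// Hypothetical record shape, for illustration only.
interface MemoryRecord {
  title: string;
  content: string;
  concepts: string[];
}

// Assumed helper: joins the searchable fields into one lowercase string.
function buildText(m: MemoryRecord): string {
  return [m.title, m.content, ...m.concepts].join(" ").toLowerCase();
}

// Old behavior: the whole query must appear as one contiguous substring.
function matchesContiguous(text: string, query: string): boolean {
  return text.includes(query);
}

// New behavior: every whitespace-separated token must appear somewhere (AND).
function matchesAllTokens(text: string, query: string): boolean {
  return query.split(/\s+/).every((word) => text.includes(word));
}

const record: MemoryRecord = {
  title: "BEAM search notes",
  content: "A bi-level planner that calls an LLM",
  concepts: ["heuristic"],
};
const text = buildText(record);
const query = "BEAM bi-level LLM".trim().toLowerCase();

// The words are spread across title and content, so only the token check matches.
const oldResult = matchesContiguous(text, query);
const newResult = matchesAllTokens(text, query);
```

Note that tokens are still matched as substrings, so a token such as `beam` would also match inside a longer word; the change only relaxes the contiguity requirement between tokens.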
Test
- `memory_smart_search({ query: "BEAM" })`: ✅ still works (single word, `.every` on a one-element array)
- `memory_smart_search({ query: "BEAM bi-level LLM heuristic" })`: ✅ now returns matching entries