fix: split multi-word queries in standalone memory_smart_search #148

Merged
rohitg00 merged 1 commit into main from fix/multi-word-smart-search on Apr 15, 2026

@rohitg00 (Owner) commented Apr 15, 2026

Fixes #147

Root cause

standalone.ts builds a concatenated searchable string from title, content, files, concepts, and sessionIds, then checks:

return text.includes(query);

A multi-word query like "BEAM bi-level LLM heuristic" requires that exact substring to appear contiguously — it won't match when the words are spread across different fields.

Fix

// before
return text.includes(query);

// after
return query.split(/\s+/).every((word) => text.includes(word));

Each token in the query must appear somewhere in the searchable text (AND semantics). Single-word queries are unaffected.
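
The before/after behaviour can be sketched in isolation. This is a minimal illustration, not the code from standalone.ts: the function names `matchesPhrase` and `matchesAllTokens` and the sample text are hypothetical, and the way the real handler joins and lowercases its fields is assumed.

```typescript
// Old behaviour: the whole query must appear as one contiguous substring.
function matchesPhrase(text: string, query: string): boolean {
  return text.includes(query);
}

// New behaviour: every whitespace-separated token must appear somewhere (AND).
function matchesAllTokens(text: string, query: string): boolean {
  return query.split(/\s+/).every((word) => text.includes(word));
}

// Hypothetical searchable text: title, content and concepts joined with spaces
// and lowercased. The real concatenation in standalone.ts may differ.
const searchableText = [
  "beam search notes",
  "a bi-level planner that uses an llm to rank candidates",
  "heuristic",
].join(" ");

console.log(matchesPhrase(searchableText, "beam bi-level llm heuristic"));    // false
console.log(matchesAllTokens(searchableText, "beam bi-level llm heuristic")); // true
console.log(matchesAllTokens(searchableText, "beam"));                        // true
```

The query words are all present but separated by other text, so the phrase check fails while the per-token check succeeds.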

Test

  • memory_smart_search({ query: "BEAM" }) — ✅ still works (single word, .every on one-element array)
  • memory_smart_search({ query: "BEAM bi-level LLM heuristic" }) — ✅ now returns matching entries
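
The two manual checks above could also be written as automated assertions. The sketch below is an assumption about how that might look rather than a test from the repository: `matchesEveryToken`, `check`, and `indexedText` are hypothetical names, and it exercises only the predicate, not the full memory_smart_search tool.

```typescript
// Same predicate as the fix; `matchesEveryToken` is a hypothetical name.
function matchesEveryToken(text: string, query: string): boolean {
  return query.split(/\s+/).every((word) => text.includes(word));
}

// Tiny assertion helper so the sketch needs no test framework.
function check(label: string, actual: boolean, expected: boolean): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

const indexedText = "beam search notes bi-level planner llm heuristic";

check("single word", matchesEveryToken(indexedText, "beam"), true);
check("words out of order", matchesEveryToken(indexedText, "heuristic beam"), true);
check("one missing word", matchesEveryToken(indexedText, "beam transformer"), false);
check("extra spaces", matchesEveryToken(indexedText, "beam   llm"), true);
// Tokens match as substrings, not whole words.
check("partial token", matchesEveryToken(indexedText, "heur"), true);
```

Because each token is matched with `includes`, a token such as "heur" matches "heuristic"; the fix changes how the query is split, not how individual tokens are compared.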

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Enhanced search matching: search results now require all individual query terms to be present in the content, improving result relevance and accuracy.

String.includes(query) required the full query to appear as a contiguous
substring, so "BEAM bi-level LLM" returned nothing even when each word
existed across title/content/concepts.

Split on whitespace and require every token to match (AND semantics).
Single-word queries are unaffected — a one-element array with .every
behaves identically to the old .includes.

Fixes #147

coderabbitai Bot commented Apr 15, 2026

Walkthrough

The search filter logic in the standalone memory search function has been updated to tokenize the query into individual words and verify that all words appear in the indexed text, rather than requiring the entire query string to match as a contiguous substring.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Standalone Search Logic (src/mcp/standalone.ts) | Updated the search filter to split queries into individual words and check that every word appears in the searchable text using .split(/\s+/).every(...), replacing the previous .includes(query) check that required the full query as a contiguous substring. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰 A rabbit hops through words so spry,
No longer chained to substrings nigh,
Each token now can find its way,
Multi-word searches dance and play! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: splitting multi-word queries in standalone memory_smart_search. |
| Linked Issues check | ✅ Passed | The code change implements the exact fix proposed in issue #147: splitting queries on whitespace and requiring every token to appear in the searchable text. |
| Out of Scope Changes check | ✅ Passed | The change is narrowly scoped to the memory_smart_search filter logic in standalone.ts, directly addressing the issue without introducing unrelated modifications. |

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/mcp/standalone.ts (1)

131-131: Precompute query tokens once per request.

Line 131 splits query for every memory row. Move tokenization outside .filter(...) to avoid repeated regex work.

♻️ Suggested tweak
       const query = rawQuery.trim().toLowerCase();
+      const queryWords = query.split(/\s+/);
       const limit = parseLimit(args.limit);
       const all =
         await kvInstance.list<Record<string, unknown>>("mem:memories");
       const results = all
         .filter((m) => {
@@
-          return query.split(/\s+/).every((word) => text.includes(word));
+          return queryWords.every((word) => text.includes(word));
         })
         .slice(0, limit);
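
For readers who want to see the suggestion outside of diff form, here is a self-contained sketch of the filter with tokenization hoisted out of the callback. The `MemoryRow` shape, the `smartSearch` name, and the field concatenation are assumptions for illustration; the real handler reads `Record<string, unknown>` rows from `kvInstance` and includes more fields.

```typescript
// Hypothetical row shape; the real handler reads Record<string, unknown> rows.
interface MemoryRow {
  title: string;
  content: string;
  concepts: string[];
}

function smartSearch(rows: MemoryRow[], rawQuery: string, limit: number): MemoryRow[] {
  const query = rawQuery.trim().toLowerCase();
  // Tokenize once per request instead of once per row.
  const queryWords = query.split(/\s+/);
  return rows
    .filter((m) => {
      // Assumption: fields are joined with spaces and lowercased.
      const text = [m.title, m.content, ...m.concepts].join(" ").toLowerCase();
      return queryWords.every((word) => text.includes(word));
    })
    .slice(0, limit);
}

const rows: MemoryRow[] = [
  { title: "BEAM notes", content: "bi-level planner with an LLM", concepts: ["heuristic"] },
  { title: "Unrelated", content: "release checklist", concepts: [] },
];

// Logs only the title of the first row, "BEAM notes".
console.log(smartSearch(rows, "  BEAM bi-level LLM heuristic ", 10).map((m) => m.title));
```

Trimming before the split matters: without it, leading or trailing whitespace would produce an empty token. An empty token is harmless for matching, since `includes("")` is always true, but it also means a blank query matches every row, so this sketch assumes the caller rejects empty queries beforehand.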
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mcp/standalone.ts` at line 131, The filter currently calls
query.split(/\s+/) for every memory row which repeats regex work; compute the
query tokens once per request (e.g., const queryTokens = query.split(/\s+/))
outside the .filter/... callback and then replace the per-row check with
queryTokens.every(word => text.includes(word)) in the function that performs the
memory filtering (the location using query.split(/\s+/) and the return line).
This moves tokenization out of the per-row loop and avoids repeated regex
execution.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 46b9e82a-6e50-4ac0-a046-f2ee86570bd8

📥 Commits

Reviewing files that changed from the base of the PR and between 2c9911d and 9865eea.

📒 Files selected for processing (1)
  • src/mcp/standalone.ts

@rohitg00 rohitg00 merged commit 5a599c6 into main Apr 15, 2026
3 checks passed
@rohitg00 rohitg00 deleted the fix/multi-word-smart-search branch April 15, 2026 20:51

Development

Successfully merging this pull request may close these issues.

memory_smart_search fails for multi-word queries (String.includes on full query)
