fix: Overhaul search relevance — prioritize exact matches, eliminate false positives by shellcorpnet · Pull Request #266 · openclaw/clawhub

shellcorpnet · 2026-02-13T19:12:25Z

Problem

Searching for a skill by its exact name returns it far down the results list. For example, searching "Remind Me" shows the actual "Remind Me" skill at position #71 — the #1 result doesn't even mention "Remind Me" anywhere in its name or description.

Reported in #15.

Root Cause

Three compounding issues in the search pipeline:

1. matchesExactTokens only required ONE query token to match, so "Remind Me" matched any skill containing the word "me" (or any word starting with "me"). Since "me" is extremely common, nearly every skill passed the "exact match" filter, defeating its purpose entirely.
2. Lexical boosts were too weak relative to vector similarity scores: even when a skill's name exactly matched the query, the boost was only +1.1 to +1.4. This was easily overwhelmed by high vector cosine similarity scores from semantically adjacent but differently named skills.
3. Summary/description text wasn't factored into lexical scoring: only displayName and slug were checked for lexical boosts, missing cases where the query terms appear prominently in the skill's summary.

Changes

`convex/lib/searchText.ts` — Core matching logic

`matchesExactTokens` — now requires ALL tokens (was: any one)

- // Require at least one token to prefix-match
- return queryTokens.some((queryToken) =>
+ // Require ALL query tokens to prefix-match at least one text token
+ return queryTokens.every((queryToken) =>
    textTokens.some((textToken) => textToken.startsWith(queryToken)),
  )

Before: "Remind Me" → matches anything with "remind" OR "me"
After: "Remind Me" → only matches skills containing BOTH "remind" AND "me"

This single change eliminates the vast majority of false positives.
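
To make the difference concrete, here is a minimal, self-contained sketch of the two behaviors. The `tokenize` helper and the names `matchesAnyToken` / `matchesAllTokens` are hypothetical stand-ins for illustration; the real tokenizer and signature in `convex/lib/searchText.ts` may differ.

```typescript
// Hypothetical tokenizer (assumption): lowercase, split on non-alphanumerics.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Old behavior: one prefix-matching query token is enough.
function matchesAnyToken(text: string, query: string): boolean {
  const textTokens = tokenize(text);
  const queryTokens = tokenize(query);
  return queryTokens.some((q) => textTokens.some((t) => t.startsWith(q)));
}

// New behavior: every query token must prefix-match some text token.
function matchesAllTokens(text: string, query: string): boolean {
  const textTokens = tokenize(text);
  const queryTokens = tokenize(query);
  // Assumption: an empty query matches nothing ([].every(...) would be true).
  if (queryTokens.length === 0) return false;
  return queryTokens.every((q) => textTokens.some((t) => t.startsWith(q)));
}

// "media" starts with "me", so the old check lets this unrelated name through.
console.log(matchesAnyToken("Media Converter", "Remind Me")); // true
console.log(matchesAllTokens("Media Converter", "Remind Me")); // false
console.log(matchesAllTokens("Remind Me", "Remind Me")); // true
```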

New: `scoreTokenMatch` — granular lexical scoring

A new scoring function that returns a numeric relevance score instead of a boolean:

- Exact token match: +2 per token (e.g., query "remind" matches text token "remind")
- Prefix match: +1 per token (e.g., query "remind" matches text token "reminder")
- All-tokens bonus: +3 when every query token matches
- Threshold: For 1-2 token queries, ALL tokens must match. For 3+ tokens, 60% must match.

This enables ranked results within the set of matching skills.
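
The rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual code: the name `scoreTokenMatchSketch`, the `tokenize` helper, the argument order, and the choice to return 0 when the threshold is not met are all assumptions.

```typescript
// Hypothetical tokenizer (assumption): lowercase, split on non-alphanumerics.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

function scoreTokenMatchSketch(text: string, query: string): number {
  const textTokens = tokenize(text);
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0) return 0;

  let score = 0;
  let matched = 0;
  for (const q of queryTokens) {
    if (textTokens.includes(q)) {
      score += 2; // exact token match
      matched += 1;
    } else if (textTokens.some((t) => t.startsWith(q))) {
      score += 1; // prefix match
      matched += 1;
    }
  }

  // Threshold: all tokens for 1-2 token queries, at least 60% for 3+ tokens.
  // Integer arithmetic avoids floating-point rounding on the 60% check.
  const meetsThreshold =
    queryTokens.length <= 2
      ? matched === queryTokens.length
      : matched * 10 >= queryTokens.length * 6;
  if (!meetsThreshold) return 0; // assumption: below threshold scores zero

  if (matched === queryTokens.length) score += 3; // all-tokens bonus
  return score;
}

console.log(scoreTokenMatchSketch("Remind Me", "remind me")); // 7 (2 + 2 + 3)
console.log(scoreTokenMatchSketch("Reminder for meetings", "remind me")); // 5 (1 + 1 + 3)
console.log(scoreTokenMatchSketch("Reminder app", "remind me")); // 0 ("me" unmatched)
console.log(scoreTokenMatchSketch("Remind me", "remind me tomorrow")); // 4 (2 of 3 tokens)
```

Under these assumptions an exact match always outscores a prefix-only match for the same query, which is the ordering the tests in this PR are described as checking.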

`convex/search.ts` — Search scoring weights

Lexical boost weights increased ~2x

| Boost | Before | After |
| --- | --- | --- |
| Slug exact match | 1.4 | 3.0 |
| Slug prefix match | 0.8 | 1.5 |
| Name exact match | 1.1 | 2.5 |
| Name prefix match | 0.6 | 1.2 |

These higher weights ensure that a skill literally named "Remind Me" will rank above a skill that's merely semantically similar (e.g., a "Notifications" skill with high vector similarity but no lexical match).
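
As a rough illustration of how the new weights affect ordering, the sketch below adds the "After" boosts to a vector score. The additive combination, the stacking of slug and name boosts, the `Candidate` shape, and the vector scores themselves are assumptions made for the example, not values or code taken from `convex/search.ts`.

```typescript
type MatchKind = "exact" | "prefix" | "none";

interface Candidate {
  name: string;
  vectorScore: number; // hypothetical similarity value, not measured data
  slugMatch: MatchKind;
  nameMatch: MatchKind;
}

// "After" weights from the table above.
const SLUG_EXACT = 3.0;
const SLUG_PREFIX = 1.5;
const NAME_EXACT = 2.5;
const NAME_PREFIX = 1.2;

function lexicalBoost(c: Candidate): number {
  const slug =
    c.slugMatch === "exact" ? SLUG_EXACT : c.slugMatch === "prefix" ? SLUG_PREFIX : 0;
  const name =
    c.nameMatch === "exact" ? NAME_EXACT : c.nameMatch === "prefix" ? NAME_PREFIX : 0;
  return slug + name; // assumption: slug and name boosts stack
}

function finalScore(c: Candidate): number {
  return c.vectorScore + lexicalBoost(c); // assumption: boosts are additive
}

const candidates: Candidate[] = [
  { name: "Notifications", vectorScore: 0.75, slugMatch: "none", nameMatch: "none" },
  { name: "Remind Me", vectorScore: 0.5, slugMatch: "exact", nameMatch: "exact" },
];

// Sort descending by final score; the literal match ranks first here.
const ranked = [...candidates].sort((a, b) => finalScore(b) - finalScore(a));
console.log(ranked.map((c) => c.name));
```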

New: Summary text matching

Added SUMMARY_MATCH_WEIGHT = 0.3 — the skill's summary is now scored using scoreTokenMatch and contributes to the final ranking. This helps surface skills where the query terms appear in the description even if the name/slug don't match exactly.

`convex/lib/searchText.test.ts` — Updated tests

Updated matchesExactTokens tests to verify ALL-token matching behavior
Added scoreTokenMatch tests verifying exact > prefix scoring and threshold behavior

`convex/search.test.ts` — All existing tests pass unchanged

The scoreSkillResult function's new summary parameter is optional, so all 10 existing search tests continue to pass without modification.
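
A minimal sketch of why an optional summary argument keeps existing call sites working. Only `SUMMARY_MATCH_WEIGHT = 0.3` comes from this PR; `scoreResultSketch`, `summaryTokenScore`, and `tokenize` are hypothetical simplifications, and the real `scoreSkillResult` signature is not shown in this description.

```typescript
const SUMMARY_MATCH_WEIGHT = 0.3; // value stated in this PR

// Hypothetical tokenizer (assumption): lowercase, split on non-alphanumerics.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Simplified stand-in for scoreTokenMatch: +2 per query token found verbatim.
function summaryTokenScore(summary: string, query: string): number {
  const textTokens = new Set(tokenize(summary));
  return tokenize(query).reduce((sum, q) => sum + (textTokens.has(q) ? 2 : 0), 0);
}

// baseScore stands for whatever the vector and name/slug scoring produced.
function scoreResultSketch(baseScore: number, query: string, summary?: string): number {
  // Callers that omit the summary get exactly the score they got before.
  if (summary === undefined) return baseScore;
  return baseScore + SUMMARY_MATCH_WEIGHT * summaryTokenScore(summary, query);
}

const legacy = scoreResultSketch(2.5, "remind me");
const withSummary = scoreResultSketch(2.5, "remind me", "Remind me about tasks later");
console.log(legacy); // 2.5
console.log(withSummary > legacy); // true
```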

Impact

Before (searching "Remind Me")

1. ❌ Unrelated skill (matched on "me" alone, high vector score)
2. ❌ Another unrelated skill
...
71. ✅ Actual "Remind Me" skill

After

1. ✅ "Remind Me" skill (exact name match: +2.5 name boost + summary score)
2. Skills with "remind" and "me" in their text
3. Semantically similar skills via vector search

Test Results

Test Files  58 passed (58)
     Tests  393 passed (393)
  Duration  7.47s

All 393 tests pass, including 16 search-specific tests.

Fixes #15

Greptile Overview

Greptile Summary

This PR tightens lexical matching and rebalances scoring so exact-name queries rank as expected.

Updates matchesExactTokens to require all query tokens to prefix-match across a skill’s displayName/slug/summary, reducing false positives for common tokens.
Introduces scoreTokenMatch for graded lexical scoring and uses it in searchSkills as an additional (lightweight) summary-based boost.
Increases slug/name lexical boost weights so literal matches can outrank purely semantic (vector) similarity.
Updates unit tests in convex/lib/searchText.test.ts to reflect the stricter token matching and validate scoreTokenMatch behavior.

No functional regressions or runtime errors were found in the changed code paths; existing search tests remain compatible with the optional summary parameter in scoreSkillResult.

Confidence Score: 5/5

This PR is safe to merge with minimal risk.
Changes are localized to search tokenization/matching and scoring; updated tests cover the stricter matching behavior, and the search scoring update is additive (optional summary) without breaking existing call sites.
No files require special attention

Last reviewed commit: 025f665


- Require ALL query tokens to match (was: only ONE), preventing false positives
- Add scoreTokenMatch for granular lexical scoring (exact > prefix > partial)
- Double+ lexical boost weights for slug/name matches
- Add summary text matching to scoring pipeline
- Update tests for new matching behavior

Fixes openclaw#15

vercel · 2026-02-13T19:12:30Z

@shellcorpnet is attempting to deploy a commit to the Amantus Machina Team on Vercel.

A member of the Team first needs to authorize it.

steipete · 2026-02-14T00:45:57Z

Closing this one to keep fix scoped.

Needed behavior:
- short exact-name queries only (1-2 tokens): require all query tokens
- longer queries: keep semantic/vector recall; no strict lexical gate
- skip summary-weight + broader scoring rebalance in this PR

Please re-open with a narrow patch for #15.
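
One possible reading of the narrower behavior requested above, as a hedged sketch. This is not code from this PR or from any follow-up patch; `passesLexicalGate` and `tokenize` are hypothetical names, and treating an empty query as ungated is an assumption.

```typescript
// Hypothetical tokenizer (assumption): lowercase, split on non-alphanumerics.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Strict all-token gate for short queries only; longer queries are left to
// semantic/vector recall with no lexical filtering.
function passesLexicalGate(text: string, query: string): boolean {
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0 || queryTokens.length > 2) return true;
  const textTokens = tokenize(text);
  return queryTokens.every((q) => textTokens.some((t) => t.startsWith(q)));
}

console.log(passesLexicalGate("Media Converter", "Remind Me")); // false
console.log(passesLexicalGate("Remind Me", "remind me")); // true
console.log(passesLexicalGate("Media Converter", "remind me every morning")); // true
```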

steipete closed this Feb 14, 2026

clawsweeper Bot mentioned this pull request Apr 28, 2026

Clawdhub search is not useful #15

Closed

shellcorpnet wants to merge 1 commit into openclaw:main from shellcorpnet:fix/search-relevance-overhaul
