feat(search): field-qualified queries (kind:/lang:/path:/name:) + fuzzy typo fallback by andreinknv · Pull Request #131 · colbymchenry/codegraph

andreinknv · 2026-04-28T22:23:15Z

Summary

Two UX improvements that turn free-text search into something a user can drive precisely.

1. Field-qualified queries

A new query parser splits the raw query into structured filters and a free-text remainder:

kind:function name:auth path:src/api authenticate

becomes:

{ kinds: ['function'], nameFilters: ['auth'],
  pathFilters: ['src/api'], text: 'authenticate' }

Filters compose with the `SearchOptions` argument by intersection. Unknown prefixes pass through as plain text, so `query "TODO:"` keeps working. Quoted values (`path:"my dir"`) handle whitespace. When the user supplies only filters and no text, the search runs a filter-only candidate scan instead of bailing out.

Recognised fields:

| Prefix | Value |
| --- | --- |
| `kind:` | any `NodeKind` value (`function`, `method`, `class`, ...) |
| `lang:` (alias `language:`) | any `Language` value |
| `path:` | case-insensitive substring of `file_path` |
| `name:` | case-insensitive substring of `node.name` |
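
To make the parsing rules above concrete, here is a minimal sketch of a field-qualified parser. It is not the code in `src/search/query-parser.ts`: the trimmed `KINDS` / `LANGS` lists, the `languages` field name, the regex tokenizer, and the helper names are assumptions made for illustration. Only the prefixes and the passthrough behaviour come from the description.

```typescript
// Hypothetical, trimmed-down value lists; the real NodeKind / Language sets are larger.
const KINDS: readonly string[] = ["function", "method", "class"];
const LANGS: readonly string[] = ["typescript", "go", "python"];

interface ParsedQuery {
  kinds: string[];
  languages: string[]; // field name is an assumption; the PR only shows kinds/nameFilters/pathFilters/text
  pathFilters: string[];
  nameFilters: string[];
  text: string;
}

// A token is a run of non-space characters and/or double-quoted spans,
// so path:"my dir" stays one token even though it contains a space.
const TOKEN = /(?:"[^"]*"|[^\s"])+/g;

function parseQuery(raw: string): ParsedQuery {
  const out: ParsedQuery = { kinds: [], languages: [], pathFilters: [], nameFilters: [], text: "" };
  const rest: string[] = [];

  for (const token of raw.match(TOKEN) ?? []) {
    const colon = token.indexOf(":");
    const field = colon > 0 ? token.slice(0, colon).toLowerCase() : "";
    const value = colon > 0 ? token.slice(colon + 1).replace(/"/g, "") : "";

    // No prefix, or an empty value such as "TODO:", is plain text.
    if (value === "") {
      rest.push(token);
      continue;
    }

    switch (field) {
      case "kind":
        // Unknown values fall through to free text instead of being dropped.
        if (KINDS.includes(value)) out.kinds.push(value);
        else rest.push(token);
        break;
      case "lang":
      case "language":
        if (LANGS.includes(value)) out.languages.push(value);
        else rest.push(token);
        break;
      case "path":
        out.pathFilters.push(value);
        break;
      case "name":
        out.nameFilters.push(value);
        break;
      default:
        // Unknown prefix (e.g. a URL scheme) is kept as plain text.
        rest.push(token);
    }
  }

  out.text = rest.join(" ");
  return out;
}
```

With this sketch, `parseQuery('path:"my dir" TODO:')` yields one path filter (`my dir`) and leaves `TODO:` in `text`, which is the passthrough behaviour described above.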

2. Fuzzy typo fallback

When both FTS and LIKE return nothing AND the text is at least 3 chars, scan the distinct-name set with a bounded edit distance (≤2 for ≥5-char queries, ≤1 for 4-char). Bounded edit distance early-exits once the row min exceeds maxDist, so the per-query cost stays O(distinct-names × avg-name-length) with a very low constant.
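
The early exit is easiest to see in code. The following is a minimal sketch of a bounded edit distance, assuming plain Levenshtein (no transpositions) and case-sensitive comparison; `boundedEditDistance` matches the name used in the unit tests, while `maxDistFor` and `fuzzyMatches` are hypothetical helpers. The behaviour for 3-char queries is not spelled out, so the sketch treats them as exact-match only.

```typescript
// Returns the edit distance between a and b, or maxDist + 1 as soon as it
// is certain that the distance exceeds maxDist.
function boundedEditDistance(a: string, b: string, maxDist: number): number {
  // Length-difference shortcut: the distance is at least |len(a) - len(b)|.
  if (Math.abs(a.length - b.length) > maxDist) return maxDist + 1;

  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);

  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    let rowMin = i;
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charCodeAt(i - 1) === b.charCodeAt(j - 1) ? 0 : 1;
      const del = (prev[j] ?? Infinity) + 1;
      const ins = (curr[j - 1] ?? Infinity) + 1;
      const sub = (prev[j - 1] ?? Infinity) + cost;
      const best = Math.min(del, ins, sub);
      curr.push(best);
      if (best < rowMin) rowMin = best;
    }
    // Row minima never decrease, so no later row can get back under maxDist.
    if (rowMin > maxDist) return maxDist + 1;
    prev = curr;
  }

  const dist = prev[b.length] ?? maxDist + 1;
  return dist > maxDist ? maxDist + 1 : dist;
}

// Thresholds from the description: <=2 for >=5 chars, <=1 for 4 chars.
// Returning 0 for shorter queries is an assumption of this sketch.
function maxDistFor(text: string): number {
  if (text.length >= 5) return 2;
  if (text.length === 4) return 1;
  return 0;
}

// Hypothetical driver: scan a distinct-name list and keep names within the bound.
function fuzzyMatches(text: string, names: string[]): string[] {
  const maxDist = maxDistFor(text);
  return names.filter((name) => boundedEditDistance(text, name, maxDist) <= maxDist);
}
```

Because the comparison is case-sensitive in this sketch, `confg` reaches `Config` at distance 2 (one substitution for the capital, one insertion), which still fits the bound for a 5-char query.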

Test plan

Verified live against ollama/ollama@v0.22.0:

| Query | Result |
| --- | --- |
| `kind:function auth` | only function-kind hits |
| `lang:go path:server route` | Go files under `server/` |
| `getUssr` (typo) | finds `getUser`, `SetUser` |
| `confg` (typo) | finds `Config` |

npx vitest run — 380 passed
npx tsc --noEmit clean
npm run build succeeds

🤖 Generated with Claude Code

…zy typo fallback

Two UX improvements that turn a free-text search into something a real user can drive precisely.

1) Field-qualified queries. A new query parser (src/search/query-parser.ts) splits the raw query into structured filters and a free-text remainder:

kind:function name:auth path:src/api authenticate

becomes

{ kinds: ['function'], nameFilters: ['auth'], pathFilters: ['src/api'], text: 'authenticate' }

Filters compose with the SearchOptions arg (intersection). Unknown prefixes pass through as plain text so `query "TODO:"` keeps working. Quoted values (`path:"my dir"`) handle whitespace. When the user specifies only filters with no text, the search uses a filter-only candidate scan instead of bailing out.

Recognised today:

kind: any NodeKind value
lang: any Language value (alias: language:)
path: case-insensitive substring of file_path
name: case-insensitive substring of node.name

2) Fuzzy fallback. When BOTH FTS and LIKE return nothing AND the text is at least 3 chars, the resolver scans the distinct-name set with a bounded Damerau-Levenshtein-style edit distance (≤2 for ≥5 chars, ≤1 for 4-char queries, off for shorter). Bounded edit-distance early-exits once the row min exceeds maxDist, so this stays O(distinct-names * avg-name-length) with a very low constant.

Verified live against ollama/ollama@v0.22.0:

query "kind:function auth" → only function-kind hits
query "lang:go path:server route" → Go files under server/
query "getUssr" (typo) → finds getUser, SetUser
query "confg" (typo) → finds Config

Full test suite: 380 passed.

…fuzzy fan-out cap, larger filter-only over-fetch, unit tests

Five fixes from independent review:

- parseQuery tokenizer: quotes that appear MID-token (path:"my dir/ file") were not being recognised; only quotes at the start of a token were treated as quoted spans. The fixture path:"my dir" parsed as ['path:"my', 'dir"'] instead of ['path:"my dir"']. The tokenizer is now a single state machine that scans into a token until whitespace OR a quote, and recognises quotes anywhere within the token (skips to the matching close quote).
- searchNodesFuzzy: cap the per-name follow-up SQL queries at Math.max(limit*2, 50) AFTER edit-distance filtering. Without this, a project with many similar names (getUser1, getUser2...) could fan out far beyond limit queries before the inner-loop break kicks in.
- searchAllByFilters (filter-only no-text path): bumped over-fetch multiplier from 2× to 5× so a selective post-filter (e.g. path:src/very/specific/file.ts) doesn't return fewer than limit results despite the DB having matches.
- 23 new unit tests in __tests__/search-query-parser.test.ts:
  - parseQuery covers known-field filter, lang/language alias, multiple kind: ORs, quoted spans (incl. mid-token), URL passthrough, empty-value passthrough, unknown prefix passthrough, unknown value passthrough, all-filters-no-text, empty input, 20k-char input.
  - boundedEditDistance covers identity, single insertion/deletion/substitution, length-difference shortcut, empty inputs, case-sensitivity, early-exit correctness.

Full test suite: 853 passed (up from 830).
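
The fan-out cap in the searchNodesFuzzy fix can be sketched as below. Only the `Math.max(limit*2, 50)` bound comes from the commit message; the `FuzzyCandidate` shape, the function name, and sorting by distance before truncating are assumptions of this sketch.

```typescript
// Hypothetical shape for a name that survived edit-distance filtering.
interface FuzzyCandidate {
  name: string;
  distance: number;
}

// Limit how many names get a follow-up SQL lookup. Closest matches are kept
// first (an assumption; the commit message only states the cap itself).
function capFuzzyCandidates(candidates: FuzzyCandidate[], limit: number): FuzzyCandidate[] {
  const cap = Math.max(limit * 2, 50);
  return [...candidates].sort((x, y) => x.distance - y.distance).slice(0, cap);
}
```

The floor of 50 keeps small `limit` values from starving the follow-up lookups, while `limit * 2` leaves headroom for names that resolve to no matching nodes once the other filters apply.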

# Conflicts: # src/db/queries.ts

Adds Steps K-O to walk the new PRs in dependency order:

K: bug-fix wave (clean): colbymchenry#128, colbymchenry#129
L: resolution + search: colbymchenry#130 (resolve), colbymchenry#131 (resolve)
M: extraction edges: colbymchenry#134 (resolve)
N: biomarker stack: colbymchenry#132, colbymchenry#133 (both resolve, on top of colbymchenry#125)
O: search advanced: colbymchenry#135 (resolve, on top of colbymchenry#131)

Also flips colbymchenry#125 from merge_clean to merge_resolve - it now hits a queries.ts conflict after the Phase-4 stack lands (colbymchenry#111/colbymchenry#112/colbymchenry#123/colbymchenry#124 all extend the same QueryBuilder surface, so colbymchenry#125's biomarker columns no longer apply cleanly without a resolution).

Validated end-to-end against colbymchenry/main HEAD: script ran clean through all 43 PRs, npm run build succeeded, full test suite reports 877/877 passing (was 829 before this wave: +48 from new tests added by the new PRs plus the reviewer-driven follow-ups).

Convert NodeKind and Language to runtime-iterable as const arrays (NODE_KINDS, LANGUAGES) so the query parser imports the canonical list instead of duplicating it. Also fix the path: JSDoc to say substring (matches the .includes() impl). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
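
The `as const` pattern named in this commit can be sketched as follows. The member lists are a hypothetical subset and the type-guard helpers are assumptions; only the `NODE_KINDS` / `LANGUAGES` names and the idea of deriving the types from runtime arrays come from the commit message.

```typescript
// Runtime-iterable source of truth (hypothetical subset of the real lists).
const NODE_KINDS = ["function", "method", "class"] as const;
const LANGUAGES = ["typescript", "go", "unknown"] as const;

// The union types are derived from the arrays, so adding an entry to an
// array updates both the type and anything that iterates the array.
type NodeKind = (typeof NODE_KINDS)[number];
type Language = (typeof LANGUAGES)[number];

// Hypothetical type guards a parser could use to validate kind:/lang: values.
function isNodeKind(value: string): value is NodeKind {
  return (NODE_KINDS as readonly string[]).includes(value);
}

function isLanguage(value: string): value is Language {
  return (LANGUAGES as readonly string[]).includes(value);
}
```

Deriving the union from the array is what removes the duplicated list: a parser that validates against `NODE_KINDS` cannot drift from the `NodeKind` type.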

colbymchenry · 2026-05-08T01:35:42Z

Reviewed and merging. Pushed a small polish commit:

- Derived KIND_VALUES / LANGUAGE_VALUES from new NODE_KINDS / LANGUAGES as const arrays in types.ts so the parser stays in sync if a new kind or language gets added (e.g. lang:unknown now works because the type already had it).
- Fixed the path: JSDoc: it claimed prefix match but the implementation is substring.

The field-qualified syntax and the bounded-edit-distance fuzzy fallback are both clean, well-tested wins. Thanks for the contribution.

andreinknv added 2 commits April 28, 2026 18:23

andreinknv mentioned this pull request Apr 28, 2026

feat(search): signature search + callers-of / callees-of graph qualifiers #135

Closed

5 tasks

andreinknv added a commit to andreinknv/codegraph that referenced this pull request Apr 28, 2026

Merge PR colbymchenry#131: feat(search) field-qualified + fuzzy fallback

8578c48

# Conflicts: # src/db/queries.ts

andreinknv mentioned this pull request Apr 28, 2026

Merge-order guide for the open PR backlog (4 refactor PRs unblock the rest) #120

Open

This was referenced May 6, 2026

feat(search): signature search + callers-of / callees-of graph qualifiers mschreib28/codegraph#2

Open

feat(search): field-qualified queries (kind:/lang:/path:/name:) + fuzzy typo fallback mschreib28/codegraph#6

Open

colbymchenry merged commit 56f6b3b into colbymchenry:main May 8, 2026

colbymchenry merged 3 commits into colbymchenry:main from andreinknv:feat/search-fields-and-fuzzy
