feat(skill-search): multi-query merge matching upstream gh skill#393
Merged
Conversation
The single content-search query missed skills like `plugins/cargowise/skills/...`
whose SKILL.md text doesn't contain the literal "cargowise" — the directory
name is part of the path, not the body. Upstream `gh skill search` solves
this by dispatching up to four parallel Code Search queries with descending
priority and merging by priority. Implements the same approach.
- `buildSearchQueries(query, owner)` returns up to four `SkillSearchQuery`
entries with explicit priorities:
P1 `filename:SKILL.md path:<pathTerm>` (+ optional user clause)
P2 `filename:SKILL.md <pathTerm>` (only when query has
spaces, i.e. pathTerm
differs from query)
P3 `filename:SKILL.md user:<query>` (only when no explicit
`--owner` AND
`couldBeOwner(query)`)
P4 `filename:SKILL.md <query>` (+ optional user clause)
pathTerm is `query.replace(/ /g, '-')`.
- `couldBeOwner(query)` exported helper — matches GitHub's login pattern
`^[a-zA-Z0-9]([a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$`.
- `searchSkills` now dispatches all queries in parallel with
`Promise.allSettled`. Non-primary failures are logged (configurable
`deps.logger`, defaults to stderr) and the surviving buckets still merge;
only a P4 failure throws. Items are concatenated in priority order so
path-bucket hits sort ahead of primary-bucket hits, then deduped by
`repo + qualifiedName` keeping the first (higher-priority) occurrence.
`total` reports the final deduped count rather than summing upstream
totals (which would multi-count across queries).
- Test suite reorganised around a small `makeFakeFetch` helper that
dispatches per Code Search query string, so the multi-query path is
exercised realistically. 15 new cases: `couldBeOwner` (5),
`buildSearchQueries` (7), and merge/dedup behaviour for the cargowise
scenario, priority ordering, P1-wins-over-P4 dedup, and the
hyphen-bucket placement for multi-word queries. The pre-existing single-
call tests are migrated to the flat-array shorthand that the helper
applies to every bucket.
`bun test`: 1232 pass / 0 fail. Focused: 31 pass / 0 fail.
Closes #392
Deploying allagents with
|
| Latest commit: |
9a8aab6
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://d3596af9.allagents.pages.dev |
| Branch Preview URL: | https://feat-392-skill-search-multi.allagents.pages.dev |
Live verification against GitHub's real Code Search API showed that the
spec's `filename:SKILL.md path:<term>` query returns zero results for the
cargowise case — `path:` is a prefix-match qualifier on path components, so
`path:cargowise` only matches files whose path starts with `cargowise/`,
not files at `plugins/cargowise/skills/...`. The right qualifier for the
substring semantics the issue actually wanted is `in:path <term>`.
Probing seven path-qualifier variants against `--owner WiseTechGlobal`:
```
path:cargowise → 0
path:plugins/cargowise → 38 (prefix works when given the full prefix)
path:**/cargowise/** → 0 (no glob support)
in:path cargowise → 103 ← substring match, what we want
```
After flipping P1 from `path:<pathTerm>` → `in:path <pathTerm>`, the
cargowise --owner WiseTechGlobal search returns 23 unique skills (up from
8 with the old query), including entire trees the content-only P4 missed:
- `WiseTechGlobal/WTG.AI.Prompts plugins/cargowise/skills/cw-gui` (+9 more
cargowise plugin skills)
- `WiseTechGlobal/WZG.Playbook.Content plugins/cargowise-customs/skills/*`
(4 skills under a cargowise-customs subtree the content query didn't
surface at all)
- `src/core/skill-search.ts`: change P1's query string from
`path:${pathTerm}` to `in:path ${pathTerm}`. Doc-comment updated to
explain the difference between the two qualifiers.
- `tests/unit/core/skill-search.test.ts`: assert the new `in:path <term>`
form everywhere we previously asserted `path:<term>`. The
`logs-and-continues` and merge/dedup matchers that dispatched on
`q.includes('path:')` now dispatch on `q.includes('in:path')`. Added a
negative assertion (`not.toContain('path:cargowise')`) on the
no-owner cargowise case to lock in the qualifier choice.
Builds clean, full test suite 1232 pass / 0 fail, focused suite 31 pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The single content-search query missed skills like
plugins/cargowise/skills/...whose SKILL.md text doesn't contain the literalcargowise— the folder name is part of the path, not the body. Upstreamgh skill searchsolves this by dispatching up to four parallel Code Search queries and merging by priority. Implements the same approach.Query set
buildSearchQueries(query, owner)returns up to four entries:pathfilename:SKILL.md path:<pathTerm>(+user:<owner>if set)hyphenfilename:SKILL.md <pathTerm>(+user:<owner>if set)query.replace(/ /g, '-'), so it differs fromquery)ownerfilename:SKILL.md user:<query>--ownerANDcouldBeOwner(query)primaryfilename:SKILL.md <query>(+user:<owner>if set)couldBeOwner(query)matches GitHub's login pattern^[a-zA-Z0-9]([a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$.Dispatch + merge
Promise.allSettled.deps.logger, defaults to stderr) and the surviving buckets still merge.priority: 4failure throws — therate-limit/ 422 paths still hit the right error kind because all queries fail together when the API itself is sick.repo + qualifiedNamekeeping the first (higher-priority) occurrence.totalreports the final deduped count rather than summing upstream totals (which would multi-count across overlapping queries).Example query shapes
Test plan
bun run build,bun run typecheckbun test tests/unit/core/skill-search.test.ts— 31 pass / 0 failbun test— 1232 pass / 0 failcouldBeOwneraccepts/rejects: plain, hyphenated, leading/trailing hyphens, slashes/spaces/dots, > 39 charsbuildSearchQueriesfor: bare single-word, single-word +--owner(P3 skipped), multi-word (P2 emitted),--owneralways skips P3, slash-containing queries skip P3, owner-shaped queries include P3, priorities sort ascendingsearchSkillscargowise path-only path (P4 empty), priority ordering (P1 ahead of P4), in-bucket dedup keeps higher-priority occurrence, distinct-repo same-name preservation, hyphen bucket placement for multi-word queries, non-primary failure → log + continue (test inspects capturedloggermessages)Closes #392