
feat(skill-search): multi-query merge matching upstream gh skill#393

Merged: christso merged 2 commits into main from feat/392-skill-search-multi-query on May 12, 2026


@christso
Contributor

Summary

The single content-search query missed skills like plugins/cargowise/skills/... whose SKILL.md text doesn't contain the literal cargowise — the folder name is part of the path, not the body. Upstream gh skill search solves this by dispatching up to four parallel Code Search queries and merging by priority. Implements the same approach.

Query set

buildSearchQueries(query, owner) returns up to four entries:

| Pri | Label | Query | Conditions |
| --- | --- | --- | --- |
| 1 | path | `filename:SKILL.md path:<pathTerm>` (+ `user:<owner>` if set) | always |
| 2 | hyphen | `filename:SKILL.md <pathTerm>` (+ `user:<owner>` if set) | only when the query contains spaces (`pathTerm = query.replace(/ /g, '-')`, so it differs from `query`) |
| 3 | owner | `filename:SKILL.md user:<query>` | only when no `--owner` AND `couldBeOwner(query)` |
| 4 | primary | `filename:SKILL.md <query>` (+ `user:<owner>` if set) | always |

couldBeOwner(query) matches GitHub's login pattern ^[a-zA-Z0-9]([a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$.
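
A minimal sketch of how the query set above could be built. The `SkillSearchQuery` field names (`priority`, `label`, `q`) are assumptions made for illustration, not the PR's actual type; only the query strings, the conditions, and the login pattern come from the description. It uses the `path:` form of P1 listed here; a later commit in this PR switches P1 to `in:path`.

```typescript
// Sketch only: field names are assumed; query strings and conditions follow the PR text.
interface SkillSearchQuery {
  priority: 1 | 2 | 3 | 4;
  label: "path" | "hyphen" | "owner" | "primary";
  q: string;
}

// Login pattern quoted in the PR: 1 to 39 characters, alphanumeric or hyphen,
// with no leading or trailing hyphen.
const OWNER_PATTERN = /^[a-zA-Z0-9]([a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$/;

function couldBeOwner(query: string): boolean {
  return OWNER_PATTERN.test(query);
}

function buildSearchQueries(query: string, owner?: string): SkillSearchQuery[] {
  const pathTerm = query.replace(/ /g, "-");
  const userClause = owner ? ` user:${owner}` : "";

  // P1 is always emitted.
  const queries: SkillSearchQuery[] = [
    { priority: 1, label: "path", q: `filename:SKILL.md path:${pathTerm}${userClause}` },
  ];
  // P2 only adds something when hyphenation changed the term (query had spaces).
  if (pathTerm !== query) {
    queries.push({ priority: 2, label: "hyphen", q: `filename:SKILL.md ${pathTerm}${userClause}` });
  }
  // P3 is skipped when --owner was given or the query cannot be a login.
  if (!owner && couldBeOwner(query)) {
    queries.push({ priority: 3, label: "owner", q: `filename:SKILL.md user:${query}` });
  }
  // P4 is always emitted.
  queries.push({ priority: 4, label: "primary", q: `filename:SKILL.md ${query}${userClause}` });
  return queries;
}
```

Because entries are pushed in priority order, the returned array is already sorted ascending, which is the property the merge step relies on.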

Dispatch + merge

  • All queries fire in parallel via Promise.allSettled.
  • Non-primary failures are logged (configurable deps.logger, defaults to stderr) and the surviving buckets still merge.
  • Only a priority: 4 failure throws. The rate-limit / 422 paths still surface the right error kind, because when the API itself is failing all queries fail together, the primary one included.
  • Items are concatenated in priority order so the path bucket wins over the primary bucket when both match the same skill.
  • Dedup is by repo + qualifiedName keeping the first (higher-priority) occurrence.
  • total reports the final deduped count rather than summing upstream totals (which would multi-count across overlapping queries).
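
The dispatch and merge rules above can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/core/skill-search.ts`: `SkillHit`, `PrioritizedQuery`, `searchAndMerge`, and the `runQuery` callback are hypothetical names, and the log message format is invented.

```typescript
// Sketch only: all type and function names here are hypothetical stand-ins.
interface SkillHit {
  repo: string;
  qualifiedName: string;
}
interface PrioritizedQuery {
  priority: number;
  q: string;
}
type Logger = (message: string) => void;

async function searchAndMerge(
  queries: PrioritizedQuery[],
  runQuery: (q: string) => Promise<SkillHit[]>,
  logger: Logger = (message) => console.error(message), // stderr by default
): Promise<{ items: SkillHit[]; total: number }> {
  // Lower priority number first, so earlier buckets win during dedup.
  const ordered = [...queries].sort((a, b) => a.priority - b.priority);
  // allSettled never rejects: every query gets a fulfilled or rejected record.
  const settled = await Promise.allSettled(ordered.map((query) => runQuery(query.q)));

  const seen = new Set<string>();
  const items: SkillHit[] = [];
  for (let i = 0; i < settled.length; i++) {
    const result = settled[i]!;
    const query = ordered[i]!;
    if (result.status === "rejected") {
      // Only the primary (priority 4) failure is fatal.
      if (query.priority === 4) throw result.reason;
      logger(`skill search: query "${query.q}" failed: ${String(result.reason)}`);
      continue;
    }
    for (const hit of result.value) {
      // Dedup key is repo + qualifiedName; the first occurrence is kept.
      const key = `${hit.repo}\u0000${hit.qualifiedName}`;
      if (seen.has(key)) continue;
      seen.add(key);
      items.push(hit);
    }
  }
  // total is the deduped count, not a sum of per-query totals.
  return { items, total: items.length };
}
```

Keying on both `repo` and `qualifiedName` keeps same-named skills from different repositories as separate results, while a skill found by both the path and primary queries appears once, in the path bucket's position.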

Example query shapes

$ allagents skill search cargowise --owner WiseTechGlobal
  P1: filename:SKILL.md path:cargowise user:WiseTechGlobal      ← finds the path
  P4: filename:SKILL.md cargowise      user:WiseTechGlobal      ← content (often empty)

$ allagents skill search "build worker"
  P1: filename:SKILL.md path:build-worker
  P2: filename:SKILL.md build-worker
  P4: filename:SKILL.md build worker

$ allagents skill search octocat
  P1: filename:SKILL.md path:octocat
  P3: filename:SKILL.md user:octocat
  P4: filename:SKILL.md octocat

Test plan

  • bun run build, bun run typecheck
  • bun test tests/unit/core/skill-search.test.ts — 31 pass / 0 fail
  • bun test — 1232 pass / 0 fail
  • 15 new cases beyond the coverage added in #390 (skill search: namespace-aware dedup and ranking):
    • couldBeOwner accepts/rejects: plain, hyphenated, leading/trailing hyphens, slashes/spaces/dots, > 39 chars
    • buildSearchQueries for: bare single-word, single-word + --owner (P3 skipped), multi-word (P2 emitted), --owner always skips P3, slash-containing queries skip P3, owner-shaped queries include P3, priorities sort ascending
    • searchSkills: cargowise path-only match (P4 empty), priority ordering (P1 ahead of P4), in-bucket dedup keeps higher-priority occurrence, distinct-repo same-name preservation, hyphen bucket placement for multi-word queries, non-primary failure → log + continue (test inspects captured logger messages)

Closes #392

The single content-search query missed skills like `plugins/cargowise/skills/...`
whose SKILL.md text doesn't contain the literal "cargowise" — the directory
name is part of the path, not the body. Upstream `gh skill search` solves
this by dispatching up to four parallel Code Search queries with descending
priority and merging by priority. Implements the same approach.

- `buildSearchQueries(query, owner)` returns up to four `SkillSearchQuery`
  entries with explicit priorities:
    P1 `filename:SKILL.md path:<pathTerm>`           (+ optional user clause)
    P2 `filename:SKILL.md <pathTerm>`                (only when query has
                                                      spaces, i.e. pathTerm
                                                      differs from query)
    P3 `filename:SKILL.md user:<query>`              (only when no explicit
                                                      `--owner` AND
                                                      `couldBeOwner(query)`)
    P4 `filename:SKILL.md <query>`                   (+ optional user clause)
  pathTerm is `query.replace(/ /g, '-')`.

- `couldBeOwner(query)` exported helper — matches GitHub's login pattern
  `^[a-zA-Z0-9]([a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$`.

- `searchSkills` now dispatches all queries in parallel with
  `Promise.allSettled`. Non-primary failures are logged (configurable
  `deps.logger`, defaults to stderr) and the surviving buckets still merge;
  only a P4 failure throws. Items are concatenated in priority order so
  path-bucket hits sort ahead of primary-bucket hits, then deduped by
  `repo + qualifiedName` keeping the first (higher-priority) occurrence.
  `total` reports the final deduped count rather than summing upstream
  totals (which would multi-count across queries).

- Test suite reorganised around a small `makeFakeFetch` helper that
  dispatches per Code Search query string, so the multi-query path is
  exercised realistically. 15 new cases: `couldBeOwner` (5),
  `buildSearchQueries` (7), and merge/dedup behaviour for the cargowise
  scenario, priority ordering, P1-wins-over-P4 dedup, and the
  hyphen-bucket placement for multi-word queries. The pre-existing single-
  call tests are migrated to the flat-array shorthand that the helper
  applies to every bucket.
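
The PR text does not show `makeFakeFetch` itself, so the following is only a guess at its shape. It assumes the helper accepts either a flat array (returned for every query) or a function of the decoded `q` string, and that it returns a fetch-like function yielding a Code Search style body with `total_count` and `items`. The `FakeItem` and `FakeResponse` types are hypothetical.

```typescript
// Hypothetical reconstruction: the real helper's signature is not shown in this PR.
interface FakeItem {
  path: string;
  repository: { full_name: string };
}
interface FakeSearchBody {
  total_count: number;
  incomplete_results: boolean;
  items: FakeItem[];
}
interface FakeResponse {
  ok: boolean;
  status: number;
  json: () => Promise<FakeSearchBody>;
}
// Flat array: same items for every bucket. Function: dispatch per query string.
type FakeSpec = FakeItem[] | ((q: string) => FakeItem[]);

function makeFakeFetch(spec: FakeSpec): (url: string) => Promise<FakeResponse> {
  return async (url: string): Promise<FakeResponse> => {
    // Recover the decoded Code Search query from the request URL.
    const q = new URL(url).searchParams.get("q") ?? "";
    const items = Array.isArray(spec) ? spec : spec(q);
    return {
      ok: true,
      status: 200,
      json: async () => ({ total_count: items.length, incomplete_results: false, items }),
    };
  };
}
```

A test could then route on the qualifier, for example `makeFakeFetch((q) => (q.includes("path:") ? pathHits : []))`, where `pathHits` is whatever fixture the test defines, to reproduce the cargowise case in which only the path bucket returns anything.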

`bun test`: 1232 pass / 0 fail. Focused: 31 pass / 0 fail.

Closes #392
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented May 12, 2026

Deploying allagents with Cloudflare Pages

Latest commit: 9a8aab6
Status: ✅  Deploy successful!
Preview URL: https://d3596af9.allagents.pages.dev
Branch Preview URL: https://feat-392-skill-search-multi.allagents.pages.dev


Live verification against GitHub's real Code Search API showed that the
spec's `filename:SKILL.md path:<term>` query returns zero results for the
cargowise case — `path:` is a prefix-match qualifier on path components, so
`path:cargowise` only matches files whose path starts with `cargowise/`,
not files at `plugins/cargowise/skills/...`. The right qualifier for the
substring semantics the issue actually wanted is `in:path <term>`.

Probing seven path-qualifier variants against `--owner WiseTechGlobal` (selected results):

```
path:cargowise                  →   0
path:plugins/cargowise          →  38   (prefix works when given the full prefix)
path:**/cargowise/**            →   0   (no glob support)
in:path cargowise               → 103   ← substring match, what we want
```

After flipping P1 from `path:<pathTerm>` → `in:path <pathTerm>`, the
cargowise --owner WiseTechGlobal search returns 23 unique skills (up from
8 with the old query), including entire trees the content-only P4 missed:

- `WiseTechGlobal/WTG.AI.Prompts plugins/cargowise/skills/cw-gui` (+9 more
  cargowise plugin skills)
- `WiseTechGlobal/WZG.Playbook.Content plugins/cargowise-customs/skills/*`
  (4 skills under a cargowise-customs subtree the content query didn't
  surface at all)

- `src/core/skill-search.ts`: change P1's query string from
  `path:${pathTerm}` to `in:path ${pathTerm}`. Doc-comment updated to
  explain the difference between the two qualifiers.
- `tests/unit/core/skill-search.test.ts`: assert the new `in:path <term>`
  form everywhere we previously asserted `path:<term>`. The
  `logs-and-continues` and merge/dedup matchers that dispatched on
  `q.includes('path:')` now dispatch on `q.includes('in:path')`. Added a
  negative assertion (`not.toContain('path:cargowise')`) on the
  no-owner cargowise case to lock in the qualifier choice.
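
As a rough illustration of the qualifier change, the sketch below builds the P1 query in its `in:path` form and wraps it in a request URL for GitHub's REST code search endpoint. The function names `buildPathQuery` and `codeSearchUrl` are hypothetical; only the qualifier strings are taken from this commit message.

```typescript
// Sketch only: function names are hypothetical; qualifier strings follow the commit text.
function buildPathQuery(query: string, owner?: string): string {
  const pathTerm = query.replace(/ /g, "-");
  // `in:path <term>` asks for a match anywhere in the file path,
  // whereas `path:<term>` behaved as a prefix match in the probes above.
  const parts = ["filename:SKILL.md", "in:path", pathTerm];
  if (owner) parts.push(`user:${owner}`);
  return parts.join(" ");
}

function codeSearchUrl(q: string): string {
  const url = new URL("https://api.github.com/search/code");
  // searchParams handles the URL encoding of spaces and colons.
  url.searchParams.set("q", q);
  return url.toString();
}

const q = buildPathQuery("cargowise", "WiseTechGlobal");
console.log(q);
console.log(codeSearchUrl(q));
```

Test matchers that route on the qualifier can then check `q.includes("in:path")`, as the updated tests in this commit do.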

Builds clean, full test suite 1232 pass / 0 fail, focused suite 31 pass.
@christso christso merged commit 8ab5987 into main May 12, 2026
1 check passed
@christso christso deleted the feat/392-skill-search-multi-query branch May 12, 2026 22:42
