skill search: multi-query approach matching upstream gh skill #392

@christso

Summary

skill search runs only a single content search query, whereas the upstream gh skill search runs up to 4 parallel queries and merges the results. As a result, our search misses skills where the query matches the file path but not the file content, which is the common case for domain-specific skills (e.g. "cargowise" in plugins/cargowise/skills/code-review/SKILL.md).

Problem

Current query: cargowise filename:SKILL.md path:SKILL.md

This searches for "cargowise" inside SKILL.md files. But domain skills like CargoWise have generic SKILL.md content (e.g. "Deep code review...") — the domain context is in the repo path, not the file content.

Upstream gh skill approach (from cli/cli trunk)

The upstream searchByKeyword function runs up to 4 parallel queries:

  1. Primary (content): filename:SKILL.md <query> — searches inside SKILL.md
  2. Path: filename:SKILL.md path:<pathTerm> — matches file path (spaces → hyphens)
  3. Owner (auto-detected): filename:SKILL.md user:<query> — when query looks like a GitHub username (no explicit --owner set)
  4. Hyphen (multi-word): filename:SKILL.md <pathTerm> — when query has spaces, also try hyphenated form ("mcp apps" → "mcp-apps")

Merge priority: path > hyphen > owner > primary content. Then deduplicate by repo/qualifiedName.
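
The merge rule above could be sketched roughly as follows. This is only an illustration, not the upstream code: the `SkillResult` and `QueryResults` shapes and the `mergeByPriority` name are assumptions, and the existing dedup logic in our codebase may key results differently.

```typescript
// Hypothetical result shape; the real type in src/core/skill-search.ts may differ.
interface SkillResult {
  repo: string;          // e.g. "owner/name"
  qualifiedName: string; // e.g. "plugins/cargowise/skills/code-review"
}

// Results of one query, tagged with that query's priority.
interface QueryResults {
  priority: number;      // 1 = path (highest) ... 4 = primary content (lowest)
  items: SkillResult[];
}

function mergeByPriority(batches: QueryResults[]): SkillResult[] {
  const seen = new Set<string>();
  const merged: SkillResult[] = [];
  // Copy before sorting so the caller's array is not mutated.
  const ordered = [...batches].sort((a, b) => a.priority - b.priority);
  for (const batch of ordered) {
    for (const item of batch.items) {
      const key = `${item.repo}#${item.qualifiedName}`;
      if (seen.has(key)) continue; // keep the higher-priority occurrence
      seen.add(key);
      merged.push(item);
    }
  }
  return merged;
}
```

Because higher-priority batches are visited first, a skill found by both the path query and the content query keeps its path-query position in the final list.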

Key helper: couldBeOwner(query) — returns true if the query looks like a valid GitHub username (alphanumeric + dashes, ≤ 39 chars). This is what triggers the auto-detected owner search.
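
A minimal sketch of such a helper, using the regex given in the Proposal section of this issue. The upstream implementation may check this differently, and the `OWNER_PATTERN` constant name is an assumption.

```typescript
// One alphanumeric character, optionally followed by up to 37 alphanumerics
// or dashes and a final alphanumeric: 1 to 39 characters in total, with no
// leading or trailing dash.
const OWNER_PATTERN = /^[a-zA-Z0-9]([a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$/;

function couldBeOwner(query: string): boolean {
  return OWNER_PATTERN.test(query);
}

couldBeOwner("cargowise"); // true
couldBeOwner("mcp apps");  // false (contains a space)
couldBeOwner("-leading");  // false (starts with a dash)
```

Note that this pattern is a cheap plausibility check only: a `true` result means the query has the shape of a username, not that the account exists.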

Proposal

Match upstream's multi-query approach in src/core/skill-search.ts:

  1. Add buildSearchQueries(query, owner) function that returns an array of { q: string, priority: number } objects:

    • Priority 1 (highest): filename:SKILL.md path:<pathTerm> where pathTerm = query.replace(/ /g, '-')
    • Priority 2: filename:SKILL.md <pathTerm> (hyphenated content search, only if query has spaces)
    • Priority 3: filename:SKILL.md user:<query> (only if no explicit --owner and couldBeOwner(query) is true)
    • Priority 4 (lowest): filename:SKILL.md <query> (primary content search)
  2. Add couldBeOwner(query) helper — regex ^[a-zA-Z0-9]([a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$ (GitHub username rules)

  3. Run queries in parallel using Promise.allSettled, so that one rejected query does not reject the whole batch (as it would with Promise.all). Each query is an independent fetch call to api.github.com/search/code.

  4. Merge results by priority order (path > hyphen > owner > primary), then deduplicate by repo + qualifiedName (existing dedup logic).

  5. Error handling: if a non-primary query fails, log and continue (don't fail the whole search). Only fail if the primary query fails.

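  6. Rate limit consideration: up to 4 parallel requests vs 1. GitHub's code search endpoint requires authentication and is limited to 10 req/min (the 30 req/min limit applies to the other search endpoints), so one search uses up to 4 of 10. Acceptable for a single search, though several searches within a minute could hit the limit.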
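
Steps 1, 3 and 5 above could be sketched roughly as follows. This is a sketch under assumptions, not a finished implementation: `SearchQuery`, `runQueries` and the `runQuery` callback are hypothetical names, with `runQuery` standing in for the actual fetch call to api.github.com/search/code. The issue does not say how an explicit owner is applied to the queries, so here it only suppresses the auto-detected owner query.

```typescript
interface SearchQuery {
  q: string;
  priority: number; // 1 = highest, 4 = primary content search
}

const PRIMARY_PRIORITY = 4;

// Same check as the couldBeOwner helper described in step 2.
const couldBeOwner = (query: string): boolean =>
  /^[a-zA-Z0-9]([a-zA-Z0-9-]{0,37}[a-zA-Z0-9])?$/.test(query);

function buildSearchQueries(query: string, owner?: string): SearchQuery[] {
  const pathTerm = query.replace(/ /g, "-");
  const queries: SearchQuery[] = [
    { q: `filename:SKILL.md path:${pathTerm}`, priority: 1 },
  ];
  if (query.includes(" ")) {
    // Hyphenated content search, only useful for multi-word queries.
    queries.push({ q: `filename:SKILL.md ${pathTerm}`, priority: 2 });
  }
  if (!owner && couldBeOwner(query)) {
    queries.push({ q: `filename:SKILL.md user:${query}`, priority: 3 });
  }
  queries.push({ q: `filename:SKILL.md ${query}`, priority: PRIMARY_PRIORITY });
  return queries;
}

// runQuery is a hypothetical callback that performs one search request.
async function runQueries<T>(
  queries: SearchQuery[],
  runQuery: (q: string) => Promise<T[]>,
): Promise<{ priority: number; items: T[] }[]> {
  // allSettled waits for every query, even if some of them reject.
  const settled = await Promise.allSettled(queries.map((sq) => runQuery(sq.q)));
  const batches: { priority: number; items: T[] }[] = [];
  settled.forEach((result, i) => {
    const { q, priority } = queries[i];
    if (result.status === "fulfilled") {
      batches.push({ priority, items: result.value });
    } else if (priority === PRIMARY_PRIORITY) {
      throw result.reason; // a primary failure fails the whole search
    } else {
      console.warn(`skill search: query failed, continuing: ${q}`, result.reason);
    }
  });
  return batches;
}
```

Passing the request function in as a parameter keeps the query builder and the failure handling testable without network access; the returned batches can then go through the existing merge and dedup logic.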

Scope

  • src/core/skill-search.ts — add query builder, parallel execution, merge logic
  • tests/unit/core/skill-search.test.ts — test each query type, merge order, dedup

Why

Without this, searching "cargowise" in WiseTechGlobal returns 0 results despite 37+ CargoWise skills existing at plugins/cargowise/skills/. The path-based search is the primary way users find domain-specific skills.
