Description-based skill matching: what's the activation threshold? #1409

K8Rayner · 2026-05-24T16:34:12Z


Environment

- Copilot CLI: 1.0.36-0
- Python SDK: github_copilot_sdk==0.3.0
- OS: Windows 11
- Discovery method: skill_directories=[...] on Session(...)

Question

What is the matching algorithm / threshold for description-based skill body injection? Natural user prompts with moderate keyword overlap don't trigger it, but heavily keyword-loaded prompts do. Is there guidance on how to write description fields that activate reliably for the kinds of prompts real users type?

What we observed

We have 6 skills with keyword-rich description fields (all under the 1024-char cap, all visible in skills.list()). We tested 12 sessions across 5 domains and found that description matching works, but the activation threshold is high:

| Prompt style | Body injected? | `SKILL_INVOKED` fires? | Sessions |
|---|---|---|---|
| Explicit `/skill-name ...` | ✅ Yes | ✅ Yes | 2 |
| Named in natural language (`Use the X skill to…`) | ✅ Yes | ✅ Yes | 1 |
| Natural question, ~3–6 keyword overlaps with `description` | ❌ No | ❌ No | 7 |
| Same question + appended clause stuffed with 12+ `description` keywords | ✅ Yes | ✅ Yes | 2 |

We confirmed injection / non-injection by crafting prompts targeting content that exists only in the SKILL.md body (not in our base instructions, tool definitions, or model training data). When the body was injected, responses contained our unique workflow guidance and tool-specific parameter values. When it wasn't, responses gave generic advice from training data, and context token usage matched that of a bare prompt with no skill content.
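For anyone who wants to reproduce the check, here is a minimal sketch of the marker-based detection described above. It is plain Python and does not call the SDK; `detect_injection` and the marker strings are hypothetical names made up for illustration, and it assumes you have already collected the response text yourself.

```python
def detect_injection(response_text, body_only_markers, min_hits=1):
    """Return (hits, injected) for one response.

    body_only_markers: phrases that appear ONLY in the SKILL.md body,
    never in base instructions or tool definitions (illustrative assumption).
    """
    lowered = response_text.lower()
    # Case-insensitive substring match against each body-only marker.
    hits = [m for m in body_only_markers if m.lower() in lowered]
    return hits, len(hits) >= min_hits


# Hypothetical markers and responses, for illustration only.
markers = ["retry_budget=7", "zebra-stage rollout"]
with_body = "Set retry_budget=7 and follow the Zebra-Stage rollout steps."
without_body = "You should generally configure retries and roll out gradually."

print(detect_injection(with_body, markers))
print(detect_injection(without_body, markers))
```

Markers work best when they are values the model could not plausibly guess, such as unusual parameter settings. Raising `min_hits` reduces false positives from a single lucky match.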

The gap

The natural-language prompts that real users type (row 3) sit well below the activation threshold. These prompts contain relevant keywords that appear in the description field, but the overlap isn't dense enough to trigger injection. Users only get the benefit of our skill content if they either:

- Know the skill name and invoke it explicitly, or
- Happen to write an unusually keyword-dense prompt

This makes the description field hard to tune. We don't know whether the matching is cosine similarity, keyword count, TF-IDF, or something else — so we can't optimize our descriptions for natural activation.
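To make the candidates concrete, the sketch below computes two of the measures mentioned above (raw keyword overlap and bag-of-words cosine similarity) for a prompt against a description. This is only a way to quantify overlap on our side; we have no evidence that either is what the CLI actually uses. The tokenizer, function names, and sample strings are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer: lowercase alphanumeric runs, no stop-word removal.
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_overlap(prompt, description):
    # Number of distinct tokens shared by prompt and description.
    return len(set(tokenize(prompt)) & set(tokenize(description)))


def cosine_similarity(prompt, description):
    # Cosine similarity over raw term-frequency vectors.
    a, b = Counter(tokenize(prompt)), Counter(tokenize(description))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical description and prompt, for illustration only.
description = "deploy kubernetes cluster helm chart rollback"
prompt = "how do I rollback a helm chart"

print(keyword_overlap(prompt, description))
print(round(cosine_similarity(prompt, description), 3))
```

Note that a long description dilutes the cosine score for a short prompt even when every prompt keyword matches, which would be consistent with what we see if the matcher is similarity-based. That is speculation on our part.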

What would help

Any of the following would make skills more useful for natural prompts:

- Document the matching algorithm: even a high-level description (e.g. "semantic similarity above threshold X" or "N keyword matches required") would let skill authors tune their descriptions.
- Lower the threshold for description-based injection, so natural prompts with moderate keyword overlap trigger it.
- Expose a confidence score (e.g. on skills.list() or a new skills.match(prompt) RPC) so hosts can see how close a prompt came to triggering a skill and surface that in their UI.
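As a rough illustration of the third option, this is how a host might use such a score if it existed. Everything here is hypothetical: `SkillMatch`, its fields, and `near_misses` are invented for this sketch, and no skills.match(prompt) RPC exists in the SDK today as far as we know.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillMatch:
    """Hypothetical shape of one entry returned by a skills.match(prompt) call."""

    skill_name: str
    score: float      # assumed: higher means a closer match
    threshold: float  # assumed: activation cutoff used by the matcher

    @property
    def activated(self):
        return self.score >= self.threshold


def near_misses(matches, margin=0.1):
    # Skills that did not activate but came within `margin` of the threshold,
    # best candidates first, so a UI could suggest them to the user.
    close = [
        m for m in matches
        if not m.activated and (m.threshold - m.score) <= margin
    ]
    return sorted(close, key=lambda m: m.score, reverse=True)


# Made-up scores, for illustration only.
matches = [
    SkillMatch("deploy-helper", score=0.75, threshold=0.8),
    SkillMatch("log-triage", score=0.30, threshold=0.8),
    SkillMatch("db-migrate", score=0.90, threshold=0.8),
]
print([m.skill_name for m in near_misses(matches)])
```

With something like this, a host could show "Did you mean to use the deploy-helper skill?" instead of silently falling back to generic advice.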

What we expected

We expected that a natural prompt sharing several keywords with a skill's description (e.g. 3–6 matches across a description that contains 50+ domain keywords) would be sufficient to trigger body injection. The current threshold seems to require near-exhaustive keyword overlap, which users won't naturally produce.

Thanks!
