A design-structure idea for large Skill libraries: sparse activation + missed-case sweep + budgeted references #1288
Replies: 3 comments
A small follow-up after thinking more about this design idea. The part that may be more useful than a simple "top-k skills" framing is the networked, point-to-point neighbor activation layer. In other words, a large Skill library does not have to behave like either:
A more scalable structure may be:
The key is that the graph is not traversed broadly. Each Skill or Skill-region can expose only a small set of typed neighbors, for example:
This is closer to a clinical workflow pattern: first identify the most likely problem, but still run a low-budget check for red flags and adjacent missed cases. It is also analogous to sparse expert activation: the expensive experts/references are not all loaded; only the local neighborhood around the routed hit is inspected.
So the proposal is not just "add more metadata" or "retrieve fewer skills." It is a runtime-aware Skill graph structure:
I tried to sketch this as a concrete public Skill package here: https://github.com/9s5bz2jvd2-lang/sparse-book-to-skill-distillation
Structure diagram:
The most important design principle from this experiment is:
Or, phrased another way:
This may be especially useful for large, high-risk Skill libraries where missing a nearby exception is worse than spending a few extra tokens on a focused neighbor sweep.
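The one-hop, typed-neighbor sweep described in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the neighbor types (`red_flag`, `exception`, `adjacent`) are borrowed from the wording above, and `SkillNode`, `neighbor_sweep`, and every Skill name are hypothetical, not part of any existing Skills runtime.

```python
from dataclasses import dataclass, field

# Hypothetical neighbor types, listed in the order they are inspected.
SWEEP_ORDER = ("red_flag", "exception", "adjacent")


@dataclass
class SkillNode:
    name: str
    # Maps neighbor type -> names of neighboring Skills (kept deliberately small).
    neighbors: dict[str, list[str]] = field(default_factory=dict)


def neighbor_sweep(graph: dict[str, SkillNode], hit: str, budget: int) -> list[tuple[str, str]]:
    """Return at most `budget` (type, skill) pairs exactly one hop from the routed hit.

    Neighbors of neighbors are never followed, so the cost stays bounded
    no matter how large the library grows.
    """
    selected: list[tuple[str, str]] = []
    for kind in SWEEP_ORDER:
        for name in graph[hit].neighbors.get(kind, []):
            if len(selected) >= budget:
                return selected
            if name in graph and name != hit:
                selected.append((kind, name))
    return selected


# Illustrative library; all Skill names are invented for this example.
names = ["meal-planning", "allergy-red-flags", "renal-diet-exception",
         "weight-management", "sports-nutrition"]
graph = {name: SkillNode(name) for name in names}
graph["meal-planning"].neighbors = {
    "red_flag": ["allergy-red-flags"],
    "exception": ["renal-diet-exception"],
    "adjacent": ["weight-management", "sports-nutrition"],
}

print(neighbor_sweep(graph, "meal-planning", budget=3))
```

With a budget of 3, the second `adjacent` neighbor is dropped, which is the intended trade-off: red flags and exceptions are checked before the budget runs out.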
Hello, this is Liu Sisi. I have received your email and will reply as soon as possible. Thank you!
One more extension of the analogy “large Skill ≈ small external task model”:
A mature Skill should probably have a proposal-based self-evolution loop. This should not be fully automatic mutation; that would be risky, because a Skill contains executable judgment: triggers, safety boundaries, references, scripts, and templates.
But every invocation can leave small evidence about how the Skill behaved:
So the Skill can “learn” from use, but in a reviewable way:
This makes the Skill graph cyclic rather than one-way: use produces logs; logs produce patches; patches make the next route/cache hit cheaper and safer.
I added a short note and diagram extension in the example repo:
The principle I would use is:
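As a rough sketch of the "use produces logs; logs produce patches" cycle described in this comment: the code below turns repeated routing misses into patch proposals that stay pending until a reviewer approves them. The choice of routing misses as the evidence, and the names `InvocationLog`, `PatchProposal`, `propose_trigger_patches`, and `apply_reviewed`, are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class InvocationLog:
    skill: str        # Skill that should have handled the request
    query_term: str   # term the user actually used
    routed: bool      # whether the Skill's triggers matched it


@dataclass
class PatchProposal:
    skill: str
    new_trigger: str
    evidence_count: int
    status: str = "pending"  # changes only when a human reviewer approves


def propose_trigger_patches(logs: list[InvocationLog], min_evidence: int = 2) -> list[PatchProposal]:
    """Turn repeated routing misses into reviewable proposals (no auto-mutation)."""
    misses = Counter((log.skill, log.query_term) for log in logs if not log.routed)
    return [
        PatchProposal(skill, term, count)
        for (skill, term), count in sorted(misses.items())
        if count >= min_evidence
    ]


def apply_reviewed(triggers: dict[str, set[str]], proposals: list[PatchProposal]) -> dict[str, set[str]]:
    """Return a copy of the trigger table with only approved proposals applied."""
    updated = {skill: set(terms) for skill, terms in triggers.items()}
    for proposal in proposals:
        if proposal.status == "approved":
            updated.setdefault(proposal.skill, set()).add(proposal.new_trigger)
    return updated


# Hypothetical usage: two misses on the same term yield one proposal.
logs = [
    InvocationLog("nutrition-basics", "macros", routed=False),
    InvocationLog("nutrition-basics", "macros", routed=False),
    InvocationLog("nutrition-basics", "fiber", routed=False),
    InvocationLog("nutrition-basics", "protein", routed=True),
]
triggers = {"nutrition-basics": {"protein"}}
proposals = propose_trigger_patches(logs)
unchanged = apply_reviewed(triggers, proposals)  # nothing approved yet
proposals[0].status = "approved"                 # the human review step
patched = apply_reviewed(triggers, proposals)
```

Keeping the approval flag outside the generation step means the Skill's triggers and safety boundaries never change without review, which is the point of the proposal-based framing.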
Hi, thank you for publishing and maintaining this Skills repository. The recent public writing from Anthropic and Perplexity helped me understand Skills as more than prompt snippets: a Skill can be a structured folder with SKILL.md, scripts, references, assets, configuration, and progressive disclosure.
I would like to share a design-structure idea for large Skill libraries. This is not a bug report, and I am not asking for an immediate Claude Code runtime change. It is a discussion proposal about how Skills might be structured when a library grows from a few Skills to many domain-specific Skills.
My inspiration comes from two places:
I am not suggesting that Skills are equivalent to model-weight experts. The analogy is at the Skill-design level: many Skills may exist, but each task should activate only a few relevant ones, while a small shared layer and a missed-case checklist help preserve safety and completeness.
The current foundation
From the public guidance, the foundation already seems clear:
SKILL.md should act as the entry point.
The question I am interested in is:
Proposed structure: Skill as a graph node
For large libraries, each Skill could optionally be treated as a node in a Skill graph rather than only as an isolated folder.
One possible structure:
These file names are only illustrative. The deeper idea is that a Skill can declare:
Two-stage Skill use pattern
Stage 1: sparse-first activation
Use trigger terms, descriptions, semantic matching, or a router Skill to activate only the most likely Skill region(s).
The leading Skill(s) receive the main attention/context budget.
Stage 2: associative missed-case sweep
After selecting the main route, run a short checklist over neighboring Skills and safety gates.
Examples:
This comes from clinical workflow: first consider the most likely problem, but still check the details that would be costly or unsafe to miss.
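A minimal sketch of the two stages, assuming plain trigger-term overlap stands in for the router and a capped cue-term checklist stands in for the sweep. The functions `route` and `missed_case_sweep`, the Skill names, and the cue terms are all hypothetical.

```python
def route(query: str, triggers: dict[str, set[str]], top_k: int = 1) -> list[str]:
    """Stage 1: keep only the top_k Skills whose trigger terms overlap the query."""
    words = set(query.lower().split())
    scored = [(len(words & terms), name) for name, terms in triggers.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [name for _, name in ranked[:top_k]]


def missed_case_sweep(query: str, checklist: dict[str, set[str]],
                      routed: list[str], max_checks: int = 3) -> list[str]:
    """Stage 2: run a short, capped checklist over neighboring Skills and safety gates."""
    words = set(query.lower().split())
    flagged = []
    for name, cues in list(checklist.items())[:max_checks]:
        if name not in routed and words & cues:
            flagged.append(name)
    return flagged


# Illustrative data only; none of these Skills exist in a real library.
triggers = {
    "meal-planning": {"meal", "plan", "menu"},
    "sports-nutrition": {"training", "protein", "athlete"},
}
checklist = {
    "allergy-gate": {"allergy", "allergic", "peanut"},
    "medication-gate": {"warfarin", "insulin"},
}

query = "weekly meal plan for a child with a peanut allergy"
main = route(query, triggers)                       # ['meal-planning']
extra = missed_case_sweep(query, checklist, main)   # ['allergy-gate']
```

The main route gets the bulk of the context, while the sweep only reports which neighbors deserve a quick look; it does not load them in full.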
Budgeted references and role-based context
Not every Skill or reference needs the same amount of context.
shared-core, routed-high, routed-low, missed-case-sweep, heavy-reference
This is slightly different from only saying “make Skills shorter.” It is a design habit for deciding how much of each Skill is needed for a specific task.
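One way to make role-based budgets concrete is a small planner that grants context per Skill according to its role. The role names come from the post; the token caps, the priority order, the function `plan_context`, and the Skill names are assumptions chosen only for illustration.

```python
# Hypothetical per-role token caps; role names are taken from the post.
ROLE_CAP = {
    "shared-core": 400,
    "routed-high": 2000,
    "missed-case-sweep": 200,
    "routed-low": 600,
    "heavy-reference": 0,  # assumed to load only on explicit request
}
# Assumed priority: safety-related roles are served before low-priority routes.
PRIORITY = ("shared-core", "routed-high", "missed-case-sweep", "routed-low", "heavy-reference")


def plan_context(assignments: list[tuple[str, str, int]], total_budget: int) -> dict[str, int]:
    """Grant tokens per Skill.

    Each assignment is (skill, role, size_in_tokens). A grant never exceeds
    the Skill's size, its role cap, or what is left of the total budget.
    """
    granted: dict[str, int] = {}
    remaining = total_budget
    for skill, role, size in sorted(assignments, key=lambda a: PRIORITY.index(a[1])):
        grant = min(size, ROLE_CAP[role], remaining)
        granted[skill] = grant
        remaining -= grant
    return granted


# Illustrative request; Skill names and sizes are invented.
plan = plan_context(
    [
        ("drug-interaction-table", "heavy-reference", 9000),
        ("meal-planning", "routed-high", 3000),
        ("safety-rules", "shared-core", 300),
        ("allergy-gate", "missed-case-sweep", 500),
        ("sports-nutrition", "routed-low", 1000),
    ],
    total_budget=3000,
)
print(plan)
```

In this run the low-priority route is trimmed to the 500 tokens left over, while the shared core and the sweep keep their full allowance.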
Example: nutrition / medical education Skills
A nutrition Skill library might have:
Shared always-on layer
Routed expert Skills
Missed-case sweep
This keeps safety and evidence active while avoiding loading every disease/task Skill for every request.
Possible documentation-only MVP
This repository might be able to explore the idea without any runtime change:
SKILL.md can remain a minimal entry while heavy material lives in references/.
Why I am sharing this
I noticed that many discussions around Skills focus on context pressure, progressive disclosure, and router Skills. Those are important. My suggestion is just a design-structure framing that connects those concerns with sparse activation and clinical-style missed-case checking.
In one sentence:
This may already overlap with internal best practices or planned work. If so, I would appreciate any pointer. Thank you again for the public Skills examples and guidance.