Closed
Labels
- feat
- priority:must (Must Have: required, cannot release without it)
- skill (New skill addition to .ai-rules/skills/)
- sub-issue (Sub-task of a parent issue)
Description
Parent: #738
Purpose
Write the core SKILL.md file for skill-creator. It includes the complete workflow for all 4 modes (Create/Eval/Improve/Benchmark), rewritten for the codingbuddy multi-tool context.
File Location
packages/rules/.ai-rules/skills/skill-creator/SKILL.md
Frontmatter
---
name: skill-creator
description: >-
  Create new skills, modify and improve existing skills,
  and measure skill performance with eval pipeline.
  Use when creating a skill from scratch, editing or optimizing
  an existing skill, running evals to test a skill,
  or benchmarking skill performance.
disable-model-invocation: true
argument-hint: [create|eval|improve|benchmark] [skill-name]
---
- disable-model-invocation: true: Skill creation has significant side effects, so only users can invoke it via /skill-creator
- argument-hint: Accepts mode and skill name as arguments
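As a rough illustration of how the required frontmatter fields could be checked before review, here is a minimal sketch. The helper names (`frontmatter_keys`, `missing_keys`) are hypothetical and not part of codingbuddy. The sketch only looks for top-level keys between the two `---` lines and does not parse YAML.

```python
import re

# Fields this issue expects in the v2.0 frontmatter.
REQUIRED_KEYS = {"name", "description", "disable-model-invocation", "argument-hint"}


def frontmatter_keys(text):
    """Return the top-level keys found between the first pair of '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Indented continuation lines (e.g. a folded description) do not match.
        match = re.match(r"([A-Za-z0-9_-]+):", line)
        if match:
            keys.add(match.group(1))
    return set()  # no closing delimiter: treat as missing frontmatter


def missing_keys(text):
    """Return the required keys that the frontmatter does not define."""
    return REQUIRED_KEYS - frontmatter_keys(text)


SAMPLE = """---
name: skill-creator
description: >-
  Create new skills, modify and improve existing skills,
  and measure skill performance with eval pipeline.
disable-model-invocation: true
argument-hint: [create|eval|improve|benchmark] [skill-name]
---
# skill body starts here
"""

print(sorted(missing_keys(SAMPLE)))
```

A real check would probably use a YAML parser. The key scan is used here only to keep the sketch dependency-free.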
Body Structure (target: under 500 lines)
1. Overview Section
- 1-2 sentence description of what skill-creator is
- Summary table of the 4 modes
- Link to codingbuddy skill structure rules (referencing the 29 existing skill patterns)
2. Create Mode
Workflow:
- Capture Intent: Understand what the skill should do, trigger conditions, output format
- Interview & Research: Edge cases, success criteria, check for existing similar skills
- Write SKILL.md: Apply Progressive Disclosure at 3 levels
  - Level 1, Metadata (~100 words): name + description, always loaded into context
  - Level 2, SKILL.md body (<500 lines): loaded when skill is triggered
  - Level 3, Bundled resources (unlimited): loaded on demand
- Generate Directory: Scaffold using scripts/init_skill.sh
- Create Test Cases: Define 2-3 realistic test prompts
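The size budgets in the Progressive Disclosure levels lend themselves to an automated check. The following is a sketch only: `check_budgets` is a hypothetical helper, and it treats "~100 words" as a hard cap of 100 applied to the description, which is an assumption rather than a rule stated in this issue.

```python
def check_budgets(description, body, max_words=100, max_lines=500):
    """Return a list of budget problems; an empty list means both budgets hold."""
    problems = []
    # Level 1: metadata is always loaded, so it should stay small.
    words = len(description.split())
    if words > max_words:
        problems.append(f"description has {words} words (budget ~{max_words})")
    # Level 2: the SKILL.md body should stay under the line limit.
    lines = len(body.splitlines())
    if lines >= max_lines:
        problems.append(f"body has {lines} lines (must be under {max_lines})")
    return problems


# A short description and a short body raise no problems.
ok = check_budgets("Create and evaluate skills.", "line\n" * 10)
# An oversized description and a 500-line body raise two.
too_big = check_budgets("word " * 150, "line\n" * 500)
print(ok, len(too_big))
```

Level 3 resources are unbounded in the plan above, so the sketch does not check them.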
codingbuddy skill writing rules (extracted from existing patterns):
- "Core principle" one-liner required
- "Iron Law" code block recommended
- "When to Use" / "When NOT to Use" sections required
- Step-by-step procedures structured as Phase or Step
- Examples: reference existing security-audit, test-driven-development, etc.
v2.0 Frontmatter Guide:
- Decision tree for which fields to set
→ references/frontmatter-guide.md (link)
Multi-tool Compatibility:
- Per-tool skill loading differences
→ references/multi-tool-compat.md (link)
3. Eval Mode
Workflow:
- Define Test Cases: Define test prompts + expected results using the evals/evals.json schema
- Spawn Runs: Compare with-skill / baseline (without-skill or previous version)
- Draft Assertions: Write objectively verifiable assertions
- Grade: Grade using the agents/grader.md agent, generate grading.json
- Aggregate: Compute pass_rate, tokens, time stats with scripts/aggregate_benchmark.py
- Launch Viewer: Open HTML viewer with eval-viewer/generate_review.py
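To make the Aggregate step concrete, here is a minimal sketch of computing pass_rate, token, and time statistics per configuration. The record keys (`config`, `passed`, `total`, `tokens`, `seconds`) are hypothetical; the actual grading.json and timing.json layouts belong in references/schemas.md, and scripts/aggregate_benchmark.py may work differently.

```python
from statistics import mean


def aggregate(runs):
    """Group graded runs by configuration and summarize each group."""
    by_config = {}
    for run in runs:
        by_config.setdefault(run["config"], []).append(run)

    summary = {}
    for config, group in by_config.items():
        total = sum(r["total"] for r in group)
        summary[config] = {
            # Pass rate is pooled over all assertions in the group.
            "pass_rate": sum(r["passed"] for r in group) / total if total else 0.0,
            "mean_tokens": mean(r["tokens"] for r in group),
            "mean_seconds": mean(r["seconds"] for r in group),
        }
    return summary


# Illustrative numbers only, not measured results.
runs = [
    {"config": "with_skill", "passed": 3, "total": 4, "tokens": 1200, "seconds": 30},
    {"config": "with_skill", "passed": 4, "total": 4, "tokens": 1000, "seconds": 20},
    {"config": "baseline", "passed": 2, "total": 4, "tokens": 900, "seconds": 18},
]
summary = aggregate(runs)
print(summary["with_skill"]["pass_rate"], summary["baseline"]["pass_rate"])
```

Pooling assertions across runs weights runs with more assertions more heavily. Averaging per-run pass rates would be an equally defensible choice.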
Core Principles:
- Subjective skills (design, writing) get qualitative evaluation only
- Assertions must be "objectively verifiable"
- Each run executes in an independent agent (prevents context contamination)
JSON Schema Reference:
→ references/schemas.md (link), covering evals.json, eval_metadata.json, grading.json, timing.json, feedback.json
4. Improve Mode
Workflow:
- Read Feedback: Read feedback.json collected from viewer
- Generalize: Generalize improvements, not just for specific test cases
- Apply Changes: Modify the skill
- Re-run Evals: Save new results in iteration-<N+1>/
- Compare: Blind A/B comparison with agents/comparator.md
- Analyze: Pattern analysis + improvement suggestions with agents/analyzer.md
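The blind A/B comparison in the Compare step can be sketched as follows. This is an assumption about how blinding might be set up, not a description of agents/comparator.md. The names `blind_pair` and `unblind` are hypothetical.

```python
import random


def blind_pair(old_output, new_output, rng):
    """Shuffle two outputs behind neutral labels and keep the mapping hidden."""
    pair = [("old", old_output), ("new", new_output)]
    rng.shuffle(pair)
    shown = {"A": pair[0][1], "B": pair[1][1]}  # what the comparator sees
    key = {"A": pair[0][0], "B": pair[1][0]}    # kept away from the comparator
    return shown, key


def unblind(verdict, key):
    """Translate the comparator's 'A' or 'B' verdict back to 'old' or 'new'."""
    return key[verdict]


rng = random.Random(0)  # seeded only to make the sketch reproducible
shown, key = blind_pair("output of iteration N", "output of iteration N+1", rng)
winner = unblind("A", key)  # pretend the comparator preferred A
print(winner in ("old", "new"))
```

Keeping `key` out of the comparator's context is the point of the exercise: the comparator judges the outputs without knowing which iteration produced them.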
Improvement Principles:
- Generalize from feedback (skill for 1M use cases, not just this example)
- Keep prompts concise (remove ineffective instructions)
- Explain "why" (theory of mind instead of overusing MUST)
- Consider bundling repetitive tasks into scripts/
Iteration Exit Conditions:
- User is satisfied
- All feedback is empty
- No meaningful improvements remain
5. Benchmark Mode
Workflow:
- Generate Trigger Queries: should-trigger (8-10) + should-not-trigger (8-10), about 20 queries in total
- Review with User: Review with the assets/eval_review.html viewer
- Run Optimization Loop: Optimize description with scripts/run_loop.py
  - 60/40 train/test split
  - Measure trigger rate → generate improved description → select best
- Apply Result: Apply optimized description to SKILL.md frontmatter
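The 60/40 split and trigger-rate measurement can be sketched as below. This is a sketch under stated assumptions, not the behavior of scripts/run_loop.py: `split_queries` and `trigger_accuracy` are hypothetical names, and "trigger rate" is interpreted here as the fraction of queries whose observed triggering matches the expected label.

```python
import random


def split_queries(queries, train_fraction=0.6, seed=0):
    """Shuffle the queries and split them into train and test sets."""
    shuffled = list(queries)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def trigger_accuracy(results):
    """results: list of (should_trigger, did_trigger) boolean pairs."""
    if not results:
        return 0.0
    return sum(expected == observed for expected, observed in results) / len(results)


# 10 should-trigger and 10 should-not-trigger placeholder queries.
queries = [(f"query {i}", i < 10) for i in range(20)]
train, test = split_queries(queries)
print(len(train), len(test))

# Two of four observations match the expected label.
score = trigger_accuracy([(True, True), (False, True), (False, False), (True, False)])
print(score)
```

A plain shuffle does not guarantee that both sets keep the same mix of should-trigger and should-not-trigger queries. A stratified split would address that if the imbalance matters.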
6. Additional Resources Section
Supporting file reference links at the end of SKILL.md:
## Additional resources
- For eval/benchmark JSON schemas, see [references/schemas.md](references/schemas.md)
- For v2.0 frontmatter field guide, see [references/frontmatter-guide.md](references/frontmatter-guide.md)
- For multi-tool compatibility matrix, see [references/multi-tool-compat.md](references/multi-tool-compat.md)
- For grading instructions, see [agents/grader.md](agents/grader.md)
- For analysis patterns, see [agents/analyzer.md](agents/analyzer.md)
- For blind comparison setup, see [agents/comparator.md](agents/comparator.md)
Writing Principles
- Do not copy the Anthropic original directly: rewrite it using codingbuddy patterns
- Follow existing codingbuddy skill patterns: Core principle, Iron Law, When to Use, etc.
- Under 500 lines: Separate detailed schemas/guides into references/
- Multi-tool perspective: Note per-tool support status for Claude Code-specific features
Acceptance Criteria
- SKILL.md file created (packages/rules/.ai-rules/skills/skill-creator/SKILL.md)
- v2.0 frontmatter included (name, description, disable-model-invocation, argument-hint)
- Complete workflow for all 4 modes included (Create, Eval, Improve, Benchmark)
- Progressive Disclosure 3-level explanation included
- codingbuddy skill writing rules reflected (Core principle, Iron Law, When to Use)
- Additional resources section with supporting file links
- Under 500 lines
- No scope overlap with existing skills (rule-authoring, agent-design, prompt-engineering)
Dependencies
- None (parallelizable with C)
- B (agents/) recommended to proceed after this issue is complete
References
- Anthropic skill-creator SKILL.md
- Existing codingbuddy skill patterns: packages/rules/.ai-rules/skills/security-audit/SKILL.md, test-driven-development/SKILL.md, etc.