"A true master teaches not by telling, but by refining." - The Skill Sensei
Sensei automates the improvement of Agent Skills frontmatter compliance using the Ralph loop pattern - iteratively improving skills until they reach Medium-High compliance with all tests passing.
- Overview
- Quick Start
- Prerequisites
- How It Works
- Configuration
- Scoring Criteria
- Examples
- Troubleshooting
- Contributing
Skills without proper frontmatter lead to skill collision - agents invoking the wrong skill for a given prompt. Common issues include:
- No triggers - Agent doesn't know when to activate the skill
- No anti-triggers - Agent doesn't know when NOT to use the skill
- Brief descriptions - Not enough context for accurate matching
- Token bloat - Oversized skills waste context window
Sensei implements the "Ralph Wiggum" technique:
- Read - Load the skill's current state and token count
- Score - Evaluate frontmatter compliance
- Improve - Add triggers, anti-triggers, compatibility
- Verify - Run tests to ensure changes work
- Check Tokens - Analyze token usage, gather suggestions
- Summary - Display before/after with suggestions
- Prompt - Ask user: Commit, Create Issue, or Skip?
- Repeat - Until target score reached
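The steps above can be sketched as a bounded loop. This is a minimal illustration under stated assumptions, not Sensei's actual implementation: `scoreSkill`, `improveSkill`, and `runTests` are hypothetical callbacks supplied by the caller, and the defaults (target `Medium-High`, at most 5 iterations) are taken from the Configuration table in this README.

```javascript
// Hypothetical sketch of the Ralph loop. The callback names are assumptions
// for illustration and are not part of Sensei's real code.
const COMPLIANCE_LEVELS = ['Low', 'Medium', 'Medium-High', 'High'];

function ralphLoop(skill, callbacks, options = {}) {
  const { scoreSkill, improveSkill, runTests } = callbacks;
  const target = options.target ?? 'Medium-High';
  const maxIterations = options.maxIterations ?? 5;

  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const level = scoreSkill(skill);
    const meetsTarget =
      COMPLIANCE_LEVELS.indexOf(level) >= COMPLIANCE_LEVELS.indexOf(target);

    // Stop as soon as the score target is met and the tests pass.
    if (meetsTarget && runTests(skill)) {
      return { status: 'complete', iterations: iteration, level };
    }
    skill = improveSkill(skill);
  }

  // Iteration budget exhausted: report a timeout so the caller can move on.
  return { status: 'timeout', iterations: maxIterations, level: scoreSkill(skill) };
}
```

The loop stops early when both conditions hold, which mirrors the "Score >= M-H AND tests pass" decision in the flow diagram; otherwise it gives up after the iteration limit.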
Run sensei on my-skill-name
Run sensei on my-skill-name --fast
Run sensei on skill-a, skill-b, skill-c
Run sensei on all Low-adherence skills
Run sensei on all skills
# Count tokens in all markdown files
npm run tokens -- count
# Count tokens in specific files
npm run tokens -- count SKILL.md references/*.md
# Check files against token limits
npm run tokens -- check
# Check with strict mode (exits 1 if limits exceeded)
npm run tokens -- check --strict
# Get optimization suggestions
npm run tokens -- suggest
# Compare with previous commit
npm run tokens -- compare HEAD~1

| Flag | Description |
|---|---|
| --fast | Skip tests for faster iteration |
| --skip-integration | Skip integration tests (unit + trigger tests only) |
⚠️ Note: Using --fast speeds up the loop significantly but may miss issues. Consider running the full tests before the final commit.
- Node.js 18+ - For running token management scripts
  node --version
- Git - For commits and comparisons
  git --version
- Test Framework - Jest, pytest, or similar for trigger tests
# Clone to your skills directory
git clone https://github.com/spboyer/sensei.git ~/.copilot/skills/sensei
# Install token CLI dependencies
cd ~/.copilot/skills/sensei/scripts && npm install
The skill is now available in Copilot CLI. Invoke with:
Run sensei on my-skill-name
For project-specific installation:
# From your project root
mkdir -p .github/skills
git clone https://github.com/spboyer/sensei.git .github/skills/sensei
# Install dependencies
cd .github/skills/sensei/scripts && npm install
# Test the token CLI
cd ~/.copilot/skills/sensei # or your install path
npm run tokens -- check
# Should output token counts for all markdown files
┌─────────────────────────────────────────────────────────┐
│ START: User invokes "Run sensei on {skill-name}" │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ 1. READ: Load skills/{skill-name}/SKILL.md │
│ Load tests/{skill-name}/ (if exists) │
│ Count tokens (baseline for comparison) │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ 2. SCORE: Run rule-based compliance check │
│ • Check description length (> 150 chars?) │
│ • Check for trigger phrases ("USE FOR:") │
│ • Check for anti-triggers ("DO NOT USE FOR:") │
│ • Check for compatibility field │
└─────────────────────┬───────────────────────────────────┘
▼
┌───────────────┐
│ Score >= M-H │──YES──▶ COMPLETE ✓
│ AND tests pass│ (next skill)
└───────┬───────┘
│ NO
▼
┌─────────────────────────────────────────────────────────┐
│ 3. SCAFFOLD: If tests/{skill-name}/ missing: │
│ Create tests from references/test-templates/ │
│ Creates prompts.md and framework-specific tests │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ 4. IMPROVE FRONTMATTER: │
│ • Add "USE FOR:" with trigger phrases │
│ • Add "DO NOT USE FOR:" with anti-triggers │
│ • Add compatibility if applicable │
│ • Keep description under 1024 chars │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ 5. IMPROVE TESTS: │
│ • Update shouldTriggerPrompts (5+ prompts) │
│ • Update shouldNotTriggerPrompts (5+ prompts) │
│ • Match prompts to new frontmatter triggers │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ 6. VERIFY: Run tests for the skill │
│ • If tests fail → fix and retry │
│ • If tests pass → continue │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ 7. CHECK TOKENS: │
│ npm run tokens count {skill}/SKILL.md │
│ Verify under 500 token soft limit │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ 8. SUMMARY: Display before/after comparison │
│ • Score change (Low → Medium-High) │
│ • Token delta (+/- tokens) │
│ • Unimplemented suggestions │
└─────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────┐
│ 9. PROMPT USER: Choose action │
│ [C] Commit changes │
│ [I] Create GitHub issue with suggestions │
│ [S] Skip (discard changes) │
└─────────────────────┬───────────────────────────────────┘
▼
┌───────────────┐
│ Iteration < 5 │──YES──▶ Go to step 2
└───────┬───────┘
│ NO
▼
TIMEOUT (move to next skill)
When running on multiple skills:
- Skills are processed sequentially
- Each skill goes through the full loop
- User prompted after each skill: Commit, Create Issue, or Skip
- Summary report at the end shows all results
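The batch behavior described above can be sketched as follows. This is only an illustration: `processSkill` is a hypothetical function standing in for one full pass of the loop on a single skill, and the result shape is an assumption.

```javascript
// Hypothetical batch runner: skills are handled one at a time, and a failure
// in one skill does not stop the remaining skills from being processed.
function runBatch(skillNames, processSkill) {
  const results = [];
  for (const name of skillNames) {
    try {
      results.push({ name, ...processSkill(name) });
    } catch (error) {
      // Record the failure and continue with the next skill.
      results.push({ name, status: 'failed', reason: error.message });
    }
  }
  return results; // one entry per skill, usable for an end-of-run summary
}
```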
| Setting | Default | Description |
|---|---|---|
| Skills directory | skills/ or .github/skills/ | Where SKILL.md files live |
| Tests directory | tests/ | Where test files live |
| Max iterations | 5 | Per-skill iteration limit before moving on |
| Target score | Medium-High | Minimum compliance level |
| Token soft limit | 500 | SKILL.md target token count |
| Token hard limit | 5000 | SKILL.md maximum token count |
| User prompt | After each skill | Commit, Create Issue, or Skip |
| Continue on failure | Yes | Process remaining skills if one fails |
Override defaults in your prompt:
Run sensei on my-skill with skills in src/ai/skills/ and tests in spec/
| Level | Description | Criteria |
|---|---|---|
| Low | Basic description | No explicit triggers, no anti-triggers, often < 150 chars |
| Medium | Has trigger keywords | Description > 150 chars, implicit or explicit trigger phrases |
| Medium-High | Has triggers + anti-triggers | "USE FOR:" present AND "DO NOT USE FOR:" present |
| High | Full compliance | Medium-High + routing clarity (INVOKES/FOR SINGLE OPERATIONS) |
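The table above can be approximated with a small rule-based function. This is a hedged sketch rather than Sensei's actual scoring logic: `scoreDescription` is a hypothetical name, the marker strings are the ones this README lists, and the Medium level is approximated by length alone because implicit triggers cannot be detected by simple string matching.

```javascript
// Illustrative rule-based scorer. The marker strings come from the criteria
// in this README; the function itself is a hypothetical sketch.
const TRIGGER_MARKERS = ['USE FOR:', 'TRIGGERS:', 'Use this skill when'];
const ANTI_TRIGGER_MARKERS = ['DO NOT USE FOR:', 'NOT FOR:'];

function scoreDescription(description) {
  const hasAntiTriggers = ANTI_TRIGGER_MARKERS.some((m) => description.includes(m));

  // "USE FOR:" is a substring of "DO NOT USE FOR:", so remove the
  // anti-trigger markers before looking for trigger markers.
  const withoutAntiMarkers = ANTI_TRIGGER_MARKERS.reduce(
    (text, marker) => text.split(marker).join(' '),
    description,
  );
  const hasTriggers = TRIGGER_MARKERS.some((m) => withoutAntiMarkers.includes(m));

  const hasRouting =
    description.includes('INVOKES:') && description.includes('FOR SINGLE OPERATIONS:');
  const longEnough = description.length > 150;

  if (longEnough && hasTriggers && hasAntiTriggers) {
    return hasRouting ? 'High' : 'Medium-High';
  }
  // Assumption: length alone stands in for the Medium level in this sketch.
  return longEnough ? 'Medium' : 'Low';
}
```

Stripping the anti-trigger markers first avoids a false positive where a description that only contains "DO NOT USE FOR:" would otherwise appear to contain "USE FOR:".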
- Name validation
  - Lowercase + hyphens only
  - Matches directory name
  - ≤ 64 characters
- Description length
  - Minimum: 150 characters (effective)
  - Maximum: 1024 characters (spec limit)
- Trigger phrases
  - Contains "USE FOR:", "TRIGGERS:", or "Use this skill when"
  - Lists specific keywords and phrases
- Anti-triggers
  - Contains "DO NOT USE FOR:" or "NOT FOR:"
  - Lists scenarios that should use other skills
- Routing clarity (for High score)
  - Skill type prefix: **WORKFLOW SKILL**, **UTILITY SKILL**, or **ANALYSIS SKILL**
  - INVOKES: lists tools/MCP servers the skill calls
  - FOR SINGLE OPERATIONS: guidance for when to bypass the skill
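As an example of one of these checks, the name validation rules could be expressed as below. This is a sketch under assumptions: `validateSkillName` is a hypothetical helper, and the pattern also accepts digits, which this README does not state explicitly.

```javascript
// Hypothetical name check: returns a list of problems (empty when valid).
// Assumption: digits are allowed alongside lowercase letters and hyphens.
function validateSkillName(name, directoryName) {
  const problems = [];
  if (!/^[a-z0-9]+(?:-[a-z0-9]+)*$/.test(name)) {
    problems.push('name must use lowercase letters and hyphens only');
  }
  if (name !== directoryName) {
    problems.push('name must match the directory name');
  }
  if (name.length > 64) {
    problems.push('name must be 64 characters or fewer');
  }
  return problems;
}
```

Returning a list of problems instead of a boolean lets a summary report every violation at once.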
To reach Medium-High, a skill must have:
- ✅ Description > 150 characters
- ✅ Explicit trigger phrases ("USE FOR:" or equivalent)
- ✅ Anti-triggers ("DO NOT USE FOR:" or clear scope limitation)
- ✅ SKILL.md < 500 tokens (soft limit, monitored)
To reach High, add routing clarity:
- ✅ All Medium-High criteria
- ✅ Skill type prefix (**WORKFLOW SKILL**, etc.)
- ✅ INVOKES: listing tools/MCP servers used
- ✅ FOR SINGLE OPERATIONS: bypass guidance
When a skill's description contains INVOKES:, Sensei performs additional checks based on the Skills, Tools & MCP Development Guide:
| Check | Purpose |
|---|---|
| MCP Tools Used table | Documents tool dependencies in skill body |
| Prerequisites section | Lists required tools and permissions |
| CLI fallback pattern | Provides fallback when MCP unavailable |
| Name collision detection | Warns when skill name matches MCP tool |
MCP Integration Score (0-4 points):
- 4/4 = Excellent MCP integration
- 3/4 = Good (minor gaps)
- 2/4 = Fair (needs improvement)
- 0-1/4 = Poor (missing key patterns)
See references/mcp-integration.md for detailed patterns.
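The point scale above can be illustrated with a small helper. This is a sketch that assumes each of the four checks contributes one point, which the README implies but does not state; the property names (`toolsTable`, `prerequisites`, `cliFallback`, `noNameCollision`) are hypothetical.

```javascript
// Hypothetical rating helper. Assumption: one point per passing check.
function rateMcpIntegration(checks) {
  const checkNames = ['toolsTable', 'prerequisites', 'cliFallback', 'noNameCollision'];
  const points = checkNames.filter((key) => checks[key] === true).length;

  let rating = 'Poor'; // 0-1 points
  if (points === 4) rating = 'Excellent';
  else if (points === 3) rating = 'Good';
  else if (points === 2) rating = 'Fair';

  return { points, rating };
}
```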
- SKILL.md: < 500 tokens (soft), < 5000 (hard)
- references/*.md: < 2000 tokens each
- Check with: npm run tokens -- check
- Get suggestions: npm run tokens -- suggest
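The limits above amount to a simple threshold check. The sketch below is illustrative only: `checkTokenLimit` is a hypothetical helper that takes an already computed token count (it does not count tokens itself), and it assumes any file other than SKILL.md is a reference file.

```javascript
// Hypothetical limit check using the thresholds listed in this README.
// Assumption: every file that is not SKILL.md is treated as a reference file.
function checkTokenLimit(filePath, tokenCount) {
  if (filePath.endsWith('SKILL.md')) {
    if (tokenCount >= 5000) return 'over-hard-limit';
    if (tokenCount >= 500) return 'over-soft-limit';
    return 'ok';
  }
  return tokenCount >= 2000 ? 'over-limit' : 'ok';
}
```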
---
name: pdf-processor
description: 'Process PDF files for various tasks'
---
Problems:
- Only 35 characters (well under the 150-character minimum)
- No trigger phrases
- No anti-triggers
- Agent doesn't know when to activate
---
name: pdf-processor
description: |
Process PDF files including text extraction, rotation, and merging.
USE FOR: "extract PDF text", "rotate PDF", "merge PDFs", "split PDF",
"PDF to text", "combine PDF files".
DO NOT USE FOR: creating new PDFs (use document-creator), extracting
images (use image-extractor), or OCR on scanned documents (use ocr-processor).
---
Improvements:
- ~350 characters (informative but under limit)
- Clear description of purpose
- Explicit trigger phrases
- Anti-triggers prevent collision with related skills
---
name: azure-deploy
description: |
**WORKFLOW SKILL** - Orchestrates deployment through preparation, validation,
and execution phases for Azure applications.
USE FOR: "deploy to Azure", "azd up", "push to Azure", "publish to Azure".
DO NOT USE FOR: preparing new apps (use azure-prepare), validating before
deploy (use azure-validate), Azure Functions specifically (use azure-functions).
INVOKES: azure-azd MCP (up, deploy, provision), azure-deploy MCP (plan_get).
FOR SINGLE OPERATIONS: Use azure-azd MCP directly for single azd commands.
---
High score achieved with:
- Skill type prefix (**WORKFLOW SKILL**)
- INVOKES: lists MCP tools used
- FOR SINGLE OPERATIONS: guides when to bypass the skill
Before (empty):
const shouldTriggerPrompts = [];
const shouldNotTriggerPrompts = [];
After:
const shouldTriggerPrompts = [
'Extract text from this PDF',
'Rotate this PDF 90 degrees',
'Merge these PDF files together',
'Split this PDF into pages',
'Convert PDF to text',
];
const shouldNotTriggerPrompts = [
'Create a new PDF document',
'Extract images from this PDF',
'OCR this scanned document',
'What is the weather today?',
'Help me with AWS S3',
];
Ensure shouldTriggerPrompts match "USE FOR:" phrases and shouldNotTriggerPrompts match "DO NOT USE FOR:" scenarios.
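When checking that prompts line up with the frontmatter, it can help to pull the quoted trigger phrases out of the description first. The helper below is a hypothetical sketch, not part of Sensei: it assumes trigger phrases are written in double quotes after "USE FOR:" and that the section ends at the next marker or at the end of the description.

```javascript
// Hypothetical helper: extract quoted phrases from the "USE FOR:" section.
// The lookbehind skips the "USE FOR:" inside "DO NOT USE FOR:".
function extractTriggerPhrases(description) {
  const section = description.match(
    /(?<!DO NOT )USE FOR:([\s\S]*?)(?=DO NOT USE FOR:|INVOKES:|FOR SINGLE OPERATIONS:|$)/,
  );
  if (!section) return [];
  return [...section[1].matchAll(/"([^"]+)"/g)].map((m) => m[1]);
}
```

Each extracted phrase can then be compared by hand, or by a test of your own, against the entries in shouldTriggerPrompts.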
Common causes: description > 1024 chars, anti-triggers not using "DO NOT USE FOR:" format, or conflicting triggers with other skills.
git reset --soft HEAD~1 # Undo last commit
- Edit SKILL.md for instruction changes
- Edit references/*.md for documentation changes
- Test tokens: npm run tokens -- check
- Test on a sample skill before committing
- Document the rule in references/scoring.md
- Add examples in references/examples.md
- Update scoring criteria in SKILL.md
- Create template in references/test-templates/{framework}.md
- Document usage in references/configuration.md
Sensei supports Waza for trigger accuracy testing. See references/test-templates/waza.md.
Open an issue that includes the skill name, its starting state, and the output of git log --oneline -10.
- Ralph Loop Pattern - Original Ralph loop implementation
- Anthropic Skills Documentation - Writing guidance
- Skills, Tools & MCP Development Guide - MCP integration best practices
- Waza Testing Framework - Skill trigger accuracy testing
- skill-creator - For creating new skills from scratch
Sensei - "The path to compliance begins with a single trigger." 🥋