skills-forge

A clean-architecture toolkit for crafting high-quality Claude Code skills.

Why

Writing a Claude Code skill is easy. Writing a good one is not. Skills that trigger unreliably, consume too much context, or try to do everything at once make Claude less effective, not more.

skills-forge applies software engineering principles (SRP, OCP, ISP, DIP) to the skill authoring process. It gives you a CLI to scaffold, lint, and install skills, with built-in validators that catch common anti-patterns before they reach production.

New here? Read docs/getting-started.md for a step-by-step walkthrough of the full authoring loop.

Quick start

pip install -e ".[dev]"

# Initialize a workspace
skills-forge init

# 1. Scaffold a skill
skills-forge create \
  --name python-tdd \
  --category development \
  --description "Use for TDD with Python. Triggers: pytest, test-first, red-green-refactor, .py files." \
  --emoji 🔴

# 2. Open the generated SKILL.md and write the actual content
#    (principles, workflow, constraints, hints, references...)
$EDITOR output_skills/development/python-tdd/SKILL.md

# 3. Lint until clean (fix every error, then warnings)
skills-forge lint output_skills/development/python-tdd

# 4. Install — default (Claude Code, global)
skills-forge install output_skills/development/python-tdd

# Or: universal project install — works with Gemini CLI, Codex, VS Code Copilot too
skills-forge install output_skills/development/python-tdd --target agents --scope project

# Or: write to every supported tool at once
skills-forge install output_skills/development/python-tdd --target all

# 5. Iterate: edit → re-lint → Claude picks up changes instantly

# 6. Bundle and share with your team
skills-forge pack output_skills/development/python-tdd
skills-forge publish ./python-tdd-0.1.0.skillpack \
  --registry ~/code/skill-registry \
  --base-url https://raw.githubusercontent.com/ficiverson/skill-registry/main \
  --push

# Or try installing a real pack from the live registry right now:
skills-forge install https://raw.githubusercontent.com/ficiverson/skill-registry/main/packs/evaluation/ai-eng-evaluator-1.0.0.skillpack \
  --sha256 10d16ba0db7b768219d0adb6c3dd8ea68b62e9f719a0132fdcd2bcf10271c0e6

The install command creates a symlink in the target tool's skills directory that points back to your skill directory. Because it's a symlink, you edit the source in output_skills/, re-lint, and the installed version updates without reinstalling.
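To see why a symlink removes the reinstall step, here is a minimal sketch of the idea. It is not the skills-forge implementation: `link_skill` and the directory names in the demo are hypothetical, and it assumes a POSIX-like filesystem where creating symlinks is allowed.

```python
import tempfile
from pathlib import Path


def link_skill(source: Path, skills_dir: Path) -> Path:
    """Create (or refresh) a symlink to `source` inside `skills_dir`."""
    skills_dir.mkdir(parents=True, exist_ok=True)
    link = skills_dir / source.name
    if link.is_symlink():
        link.unlink()  # replace a stale link left by an earlier install
    link.symlink_to(source.resolve(), target_is_directory=True)
    return link


# Demo in a throwaway directory: edit the source, then read it through the link.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "output_skills" / "python-tdd"
    source.mkdir(parents=True)
    (source / "SKILL.md").write_text("draft 1")
    link = link_skill(source, Path(tmp) / "skills")
    (source / "SKILL.md").write_text("draft 2")
    print((link / "SKILL.md").read_text())  # prints "draft 2"
```

The installed path never holds a copy, so there is nothing to get out of date.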

All targets share the same SKILL.md format — the agentskills.io open standard adopted by Claude Code, Gemini CLI, OpenAI Codex, VS Code Copilot, and 20+ other tools.

`--target`	Global path	Project path
`claude` (default)	`~/.claude/skills/`	`.claude/skills/`
`gemini`	`~/.gemini/skills/`	`.gemini/skills/`
`codex`	`~/.codex/skills/`	`.codex/skills/`
`vscode`	(no global)	`.github/skills/`
`agents`	`~/.agents/skills/`	`.agents/skills/`
`all`	all of the above	all of the above

agents is the recommended target for shared repos — every conforming tool scans .agents/skills/ at project scope, so teammates on Gemini CLI, Codex, or VS Code Copilot all pick up the same skills without any per-tool setup.

Authoring workflow

The authoring loop has five steps. Step 2 (the actual writing) is the one that matters most.

1. Scaffold with skills-forge create. This writes a starter SKILL.md plus empty companion directories (references/, examples/, assets/, scripts/) when the relevant fields are present.
2. Author the content. Open SKILL.md and fill in the description, principles, workflow, and constraints. Drop reference docs into references/, sample outputs into examples/, and static files into assets/.
3. Lint with skills-forge lint <path>. Fix every error and warning. The linter checks both the SKILL.md content (description length, vague language, token budget) and the filesystem (do all linked files actually exist?).
4. Install with skills-forge install <path>. Use --scope project for a project-local install, or omit it for global. Add --target agents to install into .agents/skills/, the universal cross-vendor path that works for every agentskills.io-conforming tool (Gemini CLI, Codex, VS Code Copilot, etc.). Use --target all to write to every supported tool at once.
5. Test and iterate. Open Claude Code, trigger the skill with a realistic prompt, and watch how it activates. Tweak the description and triggers until activation is reliable.

A complete minimal SKILL.md

This is what you should aim for when authoring a basic skill — copy this as a starting point:

---
name: python-tdd
description: |
  Use this skill when writing Python code with a test-first workflow.
  Triggers on: pytest, unittest, TDD, red-green-refactor, test-driven,
  .py files, "write tests first", "add a failing test".
---

STARTER_CHARACTER = 🔴

## Principles

- Write a failing test before any production code
- One assertion per test where possible
- Keep the red-green-refactor cycle under 5 minutes

## Workflow

1. Write the smallest failing test that captures the next behavior
2. Run pytest and confirm it fails for the right reason
3. Write the minimum code to make it pass
4. Refactor with the test as a safety net
5. Commit on green

## Constraints

- Never write production code without a failing test
- Never commit on red
- Never skip the refactor step

Writing a good description

The description is the skill's interface — Claude uses it to decide whether to activate. Aim for 30–150 tokens (≈ 20–100 words) and follow this formula:

Use this skill when [situation]. Triggers on: [keywords, file extensions, action verbs].

Good:

Use this skill when reviewing pull requests for security issues. Triggers on: PR review, security audit, OWASP, SQL injection, XSS, .py, .js, .ts files.

Bad:

A skill for code stuff. Use it for any kind of code review or whatever.

The bad example trips four validators at once: too short, vague language ("stuff", "whatever"), overly broad ("any kind"), and missing triggers (no extensions or action verbs).
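The four checks named above can be approximated in a few lines. This is a rough sketch, not the linter's actual logic: the word lists, the trigger heuristic, and the 1.5 tokens-per-word estimate (derived from "30–150 tokens ≈ 20–100 words") are all assumptions, and `lint_description` is a hypothetical name.

```python
import re

VAGUE_WORDS = {"stuff", "things", "whatever"}           # assumed word list
BROAD_PHRASES = ("any kind", "any task", "everything")  # assumed phrase list
FILE_EXTENSION = re.compile(r"(?<!\w)\.[a-z0-9]{1,5}\b")


def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly 1.5 tokens per whitespace-separated word."""
    return round(len(text.split()) * 1.5)


def lint_description(description: str) -> list[str]:
    """Return the rule names a description appears to violate."""
    lowered = description.lower()
    words = set(re.findall(r"[a-z']+", lowered))
    tokens = estimate_tokens(description)
    issues = []
    if tokens < 30:
        issues.append("description-too-short")
    if tokens > 150:
        issues.append("description-too-long")
    if words & VAGUE_WORDS:
        issues.append("description-vague-language")
    if any(phrase in lowered for phrase in BROAD_PHRASES):
        issues.append("description-overly-broad")
    # Assumed heuristic: a "Triggers" clause or a file extension counts as a trigger.
    if "triggers" not in lowered and not FILE_EXTENSION.search(lowered):
        issues.append("description-missing-triggers")
    return issues


bad = "A skill for code stuff. Use it for any kind of code review or whatever."
print(lint_description(bad))
```

Under these assumptions the bad example reports all four rules, while the good example above reports none.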

For AI agents authoring skills

If you are an AI agent (e.g. Claude) creating a skill on a user's behalf, follow this deterministic playbook:

1. Clarify the skill's single responsibility. Ask the user what trigger condition should activate the skill and what concrete action it should produce. If the answer covers more than one responsibility, split it into multiple skills.
2. Pick a category and a kebab-case name. Names must contain no spaces. Categories live under output_skills/<category>/<name>/.
3. Run skills-forge create with --name, --category, --description, and --emoji. The description must follow the formula above (30–150 tokens, includes triggers).
4. Write the SKILL.md body using the minimal example as a structural template. Required sections: Principles, Workflow (or Instructions), Constraints. Optional but recommended: Hints, References, Examples, Assets.
5. Add companion files when referenced from SKILL.md. Every link in ## References, ## Examples, ## Assets, and ## Scripts must resolve to a real file on disk; the linter fails otherwise.
6. Run skills-forge lint <path> and fix every issue. Iterate until the linter reports clean. Do not stop at the first warning; fix all of them.
7. Stop and report. Hand the skill back to the user with the path and a one-line summary. Do not install it without explicit user permission.

Definition of done

A skill is ready when all of these are true:

- skills-forge lint <path> reports clean (zero errors, zero warnings)
- Description is 30–150 tokens and follows the Use when... Triggers on... formula
- At least three concrete trigger keywords (file extensions, action verbs, or domain terms)
- Principles section has 3–7 imperative bullets
- Workflow has numbered steps that a junior could follow without guessing
- Constraints lists what the skill must never do
- Every link in References/Examples/Assets/Scripts resolves to a real file
- Total token estimate is under 1200 (warning threshold)

When to use which section

- Principles: universal rules that always apply ("write tests first")
- Workflow: the ordered steps Claude should follow when the skill activates
- Constraints: hard "never" rules ("never commit on red")
- Hints: situational branching ("if no tests directory exists, score code_quality ≤ 4"). Use hints, not principles, when guidance is conditional.
- References: long-form docs Claude reads only when a workflow step needs them. Use this to keep SKILL.md under the token budget.
- Examples: sample outputs that calibrate Claude's quality bar. The single highest-leverage thing you can add.
- Assets: static files (CSVs, templates, configs) that scripts or Claude reference at runtime.
- depends_on: declare another skill that should also be loaded when this one activates.

Architecture

The project follows clean architecture with four layers:

src/skill_forge/
├── domain/           # Core: models, validators, ports (zero dependencies)
├── application/      # Use cases: create, lint, install, pack/unpack, publish, install-from-url
├── infrastructure/   # Adapters: filesystem, markdown, symlinks, zip packer, git registry, http fetcher
└── cli/              # Entry point: typer CLI + composition root

Dependency rule: dependencies point inward. Domain knows nothing about infrastructure. Use cases depend on ports (abstractions), not adapters (implementations). The CLI's factory.py is the composition root that wires everything together.
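As an illustration of the dependency rule, the sketch below squeezes all four layers into one file. Every name in it (`SkillReader`, `LintSkill`, `InMemoryReader`, `build_lint_use_case`) is hypothetical and chosen for the example; the real modules in src/skill_forge/ may be shaped differently.

```python
from dataclasses import dataclass
from typing import Protocol


# domain/: a port is an abstraction owned by the core, with no outside imports.
class SkillReader(Protocol):
    def read(self, path: str) -> str: ...


# application/: the use case depends on the port, never on a concrete adapter.
@dataclass
class LintSkill:
    reader: SkillReader

    def execute(self, path: str) -> list[str]:
        text = self.reader.read(path)
        return [] if "## Principles" in text else ["missing-principles"]


# infrastructure/: an adapter implements the port (in-memory here for brevity).
class InMemoryReader:
    def __init__(self, files: dict[str, str]) -> None:
        self.files = files

    def read(self, path: str) -> str:
        return self.files[path]


# cli/: the composition root picks an adapter and wires it into the use case.
def build_lint_use_case(files: dict[str, str]) -> LintSkill:
    return LintSkill(reader=InMemoryReader(files))


use_case = build_lint_use_case({"SKILL.md": "## Principles\n- Write tests first"})
print(use_case.execute("SKILL.md"))  # prints []
```

Swapping the in-memory adapter for a filesystem one touches only the composition root, which is the point of keeping dependencies pointed inward.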

Skill anatomy

A skill is a directory with this structure:

my-skill/
├── SKILL.md              # Core: description, principles, workflow, constraints
├── references/           # On-demand docs loaded only when a step needs them
├── scripts/              # Executable automation (generators, validators)
├── examples/             # Sample outputs showing expected format and quality
└── assets/               # Static files (CSVs, templates, images, config)

The SKILL.md frontmatter supports these fields:

---
name: my-skill
description: |
  What this skill does and when to trigger it.
depends_on: other-skill (reason for dependency)
---

The body supports these sections: Principles, Workflow/Instructions, Constraints, Hints, References, Examples, Assets.

Hints

The ## Hints section contains conditional guidance that Claude applies only when relevant. Unlike principles (always apply) or constraints (hard rules), hints are situational branching logic:

## Hints

- If the repo has no tests directory, score code_quality <= 4
- If the project uses TypeScript, look for tsconfig.json instead of mypy
- If this is a monorepo, evaluate each service separately

depends_on

Skills can declare dependencies on other skills. This tells Claude to also read the dependency's SKILL.md when the skill activates:

depends_on: pdf (PDF generation for the final report)

Multiple dependencies use comma separation:

depends_on: pdf (report output), xlsx (data export)
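If you need to read this field in your own tooling, a parser for the `name (reason)` form could look like the sketch below. The grammar is inferred from the two examples above rather than taken from the skills-forge source, and `parse_depends_on` is a hypothetical helper.

```python
import re

# kebab-case name, optionally followed by a parenthesised reason
ENTRY = re.compile(
    r"^(?P<name>[a-z0-9]+(?:-[a-z0-9]+)*)(?:\s*\((?P<reason>[^()]*)\))?$"
)
# split on commas that are not inside parentheses
TOP_LEVEL_COMMA = re.compile(r",\s*(?![^()]*\))")


def parse_depends_on(value: str) -> list[tuple[str, str]]:
    """Parse 'pdf (why), xlsx (why)' into [(name, reason), ...]."""
    deps = []
    for part in TOP_LEVEL_COMMA.split(value):
        match = ENTRY.match(part.strip())
        if match is None:
            raise ValueError(f"malformed depends_on entry: {part!r}")
        deps.append((match["name"], (match["reason"] or "").strip()))
    return deps


print(parse_depends_on("pdf (report output), xlsx (data export)"))
# prints [('pdf', 'report output'), ('xlsx', 'data export')]
```

Raising on a malformed entry mirrors the spirit of the invalid-dependency-name rule, although the real validator may accept or reject different inputs.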

Examples

The ## Examples section links to sample outputs in examples/. These are the single most useful calibration tool for Claude — an example output is worth more than a page of specification:

## Examples

- [Sample evaluation JSON](examples/example-eval.json)
- [Example test report](examples/example-report.md)

Assets

The ## Assets section links to static files that scripts or Claude can reference at runtime:

## Assets

- [Level thresholds](assets/level-thresholds.csv)
- [Docker reference](assets/docker-cypress.md)

Clean principles applied to skills

The key insight: the same principles that make code maintainable also make skills effective.

Principle	Applied to skills
SRP	One skill, one responsibility. Split broad skills.
OCP	Extend via `references/`, don't bloat `SKILL.md`.
ISP	Keep the description lean — it's the skill's interface.
DIP	Skills define principles, not tool-specific commands.

See docs/clean-principles-for-skills.md for the full guide, and docs/getting-started.md for a step-by-step walkthrough of the complete workflow (create → validate → install → test → iterate).

Validators

The linter runs two types of validators:

Pure validators (check the Skill object):

Rule	Severity	What it checks
`description-too-short`	warning	Description < 30 tokens
`description-too-long`	error	Description > 150 tokens
`description-vague-language`	warning	Vague words like "stuff", "things"
`description-overly-broad`	error	Phrases like "any task", "everything"
`description-missing-triggers`	warning	No file extensions or action verbs
`missing-principles`	warning	No principles section
`context-budget-exceeded`	error	Total tokens > 2000
`context-budget-high`	warning	Total tokens > 1200
`reference-too-deep`	warning	Nested references (> 1 level)
`possible-srp-violation`	info	Instructions > 800 words
`missing-starter-character`	info	No STARTER_CHARACTER defined
`missing-examples`	info	Has scripts but no example outputs
`invalid-dependency-name`	error	Malformed depends_on entry

Path-aware validators (check the filesystem):

Rule	Severity	What it checks
`broken-reference-link`	error	Reference file doesn't exist on disk
`broken-example-link`	error	Example file doesn't exist on disk
`broken-asset-link`	error	Asset file doesn't exist on disk
`broken-script-link`	error	Script file doesn't exist on disk
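A path-aware check boils down to "extract the links, then ask the filesystem". The sketch below shows that shape under simplifying assumptions: it scans the whole SKILL.md rather than one section at a time, and `find_broken_links` is a hypothetical function, not part of the skills-forge API.

```python
import re
import tempfile
from pathlib import Path

MD_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_links(skill_dir: Path) -> list[str]:
    """Return relative link targets in SKILL.md that do not exist on disk."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    broken = []
    for target in MD_LINK.findall(text):
        if "://" in target or target.startswith("#"):
            continue  # skip web URLs and in-page anchors
        if not (skill_dir / target).exists():
            broken.append(target)
    return broken


# Demo: one link resolves, the other does not.
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp)
    (skill / "examples").mkdir()
    (skill / "examples" / "example-eval.json").write_text("{}")
    (skill / "SKILL.md").write_text(
        "## Examples\n- [Sample](examples/example-eval.json)\n"
        "## Assets\n- [Thresholds](assets/level-thresholds.csv)\n"
    )
    print(find_broken_links(skill))  # prints ['assets/level-thresholds.csv']
```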

Sharing skills across teams

Once a skill is good, you'll want to share it with your team. skills-forge ships with a .skillpack format — a single zip file containing one or more skills plus a JSON manifest — so you can distribute skills via Slack, Notion, email, GitHub releases, or any other channel that can move a file.

Per-skill versioning

Each skill carries its own semantic version in frontmatter:

---
name: ai-eng-evaluator
version: 1.0.0
description: |
  ...
---

The pack command auto-derives its version from the skill itself, so you don't need to pass --version for single-skill packs. Bump the skill's frontmatter version when you ship a change, and the next pack will use the new value. Use skills-forge create --version 0.1.0 to set an initial version when scaffolding.

# Bundle a single skill — version auto-derived from frontmatter
skills-forge pack output_skills/evaluation/ai-eng-evaluator \
  --output ./packs/
# → ./packs/ai-eng-evaluator-1.0.0.skillpack

# Bundle multiple skills together (explicit pack version recommended)
skills-forge pack \
  output_skills/evaluation/ai-eng-evaluator \
  output_skills/evaluation/user-story-test-cases \
  --name evaluation-bundle \
  --version 1.0.0 \
  --output evaluation-bundle.skillpack

# A teammate receives the file and unpacks it
skills-forge unpack evaluation-bundle.skillpack --output output_skills/

# Then lints and installs as usual
skills-forge lint output_skills/evaluation/ai-eng-evaluator
skills-forge install output_skills/evaluation/ai-eng-evaluator

Pack version precedence: an explicit --version always wins; otherwise a single-skill pack takes the skill's own version; multi-skill bundles without --version fall back to the default.
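The precedence rule reads naturally as a three-branch function. This is only a sketch of the rule as stated: `resolve_pack_version` is a hypothetical name, and the fallback value is passed in as a parameter because the default itself is not specified here.

```python
from typing import Optional, Sequence


def resolve_pack_version(
    explicit: Optional[str],
    skill_versions: Sequence[str],
    default: str,
) -> str:
    """Pick the pack version following the documented precedence."""
    if explicit is not None:
        return explicit            # 1. an explicit --version always wins
    if len(skill_versions) == 1:
        return skill_versions[0]   # 2. a single-skill pack takes the skill's version
    return default                 # 3. multi-skill bundles fall back to the default


# "0.0.0" is a placeholder default used only for this demo.
print(resolve_pack_version(None, ["1.0.0"], default="0.0.0"))           # prints 1.0.0
print(resolve_pack_version("2.0.0", ["1.0.0"], default="0.0.0"))        # prints 2.0.0
print(resolve_pack_version(None, ["1.0.0", "0.1.0"], default="0.0.0"))  # prints 0.0.0
```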

A .skillpack is just a zip you can inspect with any zip tool. The manifest at the root looks like:

{
  "format_version": "1",
  "name": "evaluation-bundle",
  "version": "1.0.0",
  "author": "me@fernandosouto.dev",
  "created_at": "2026-04-06T13:43:36+00:00",
  "description": "AI engineering evaluator + user-story test-case generator",
  "tags": ["evaluation", "ai-engineering", "test-cases"],
  "owner": {"name": "Fernando Souto", "email": "me@fernandosouto.dev"},
  "skills": [
    {"category": "evaluation", "name": "ai-eng-evaluator", "version": "1.0.0"},
    {"category": "evaluation", "name": "user-story-test-cases", "version": "0.1.0"}
  ]
}

Each skill records its own version in the manifest, so a multi-skill bundle can mix and match. The optional description, tags, owner, and deprecated fields travel with the pack and become the defaults when you publish it to a registry — passing the same flags on publish overrides them. Older packs without these fields still unpack and install fine; the codec fills in safe defaults on read.

The packer excludes __pycache__/, .DS_Store, .git/, and *.pyc files by default. Unpack rejects archives with ../ paths to defend against zip-slip attacks.
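For readers unfamiliar with zip-slip, the defence is to resolve every member path against the destination and refuse anything that escapes it. The sketch below shows one way to do that with the standard library (Python 3.9+ for `Path.is_relative_to`); `safe_members` is a hypothetical helper, not the check skills-forge actually runs.

```python
import io
import zipfile
from pathlib import Path


def safe_members(archive: zipfile.ZipFile, dest: Path) -> list[str]:
    """Return member names, raising ValueError if any would land outside dest."""
    root = dest.resolve()
    names = []
    for name in archive.namelist():
        target = (root / name).resolve()
        if not target.is_relative_to(root):
            raise ValueError(f"unsafe path in archive: {name}")
        names.append(name)
    return names


def make_zip(*names: str) -> zipfile.ZipFile:
    """Build a small in-memory archive for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name in names:
            zf.writestr(name, "demo")
    return zipfile.ZipFile(io.BytesIO(buffer.getvalue()))


print(safe_members(make_zip("evaluation/demo/SKILL.md"), Path("output_skills")))
try:
    safe_members(make_zip("../evil.txt"), Path("output_skills"))
except ValueError as error:
    print(error)  # prints "unsafe path in archive: ../evil.txt"
```

Checking every name before extracting anything means a hostile archive is rejected whole rather than partially unpacked.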

For broader distribution, drop the .skillpack into a shared Git repo with CI running skills-forge lint on every PR. That gives you version control, code review, and rollback for free without standing up any extra infrastructure.

Publishing to a git-backed registry

skills-forge publish turns any git repo into a free, CDN-backed skill registry. No GitHub Actions, no releases, no API server — just a normal repo where each pack lives at a stable raw URL. Teammates install directly from that URL.

A live example registry built with skills-forge lives at github.com/ficiverson/skill-registry — every URL in this section points at it, so you can curl the index, install a real pack, and see exactly what your own registry will look like.

The registry repo layout is fixed:

skill-registry/                 ← any git repo (GitHub, GitLab, self-hosted)
├── index.json                  ← machine catalog (auto-maintained)
└── packs/
    └── <category>/
        └── <name>-<version>.skillpack

One-time setup — create the registry repo and clone it locally:

git clone git@github.com:ficiverson/skill-registry.git

Publish a pack — point at the local clone and the public base URL:

skills-forge pack output_skills/evaluation/ai-eng-evaluator
# → ./ai-eng-evaluator-1.0.0.skillpack

skills-forge publish ./ai-eng-evaluator-1.0.0.skillpack \
  --registry ~/code/skill-registry \
  --base-url https://raw.githubusercontent.com/ficiverson/skill-registry/main \
  --message "ai-eng-evaluator 1.0.0" \
  --push

Output:

✔ Published ai-eng-evaluator v1.0.0
  path:    packs/evaluation/ai-eng-evaluator-1.0.0.skillpack
  sha256:  10d16ba0…
  git:     committed
  git:     pushed

  Install URL:
  https://raw.githubusercontent.com/ficiverson/skill-registry/main/packs/evaluation/ai-eng-evaluator-1.0.0.skillpack

  Teammates can install with:
    skills-forge install https://raw.githubusercontent.com/ficiverson/skill-registry/main/packs/evaluation/ai-eng-evaluator-1.0.0.skillpack --sha256 10d16ba0db7b768219d0adb6c3dd8ea68b62e9f719a0132fdcd2bcf10271c0e6

The publisher copies the pack into packs/<category>/<name>-<version>.skillpack, regenerates index.json, commits, and (with --push) pushes. Drop --push if you'd rather review the diff first; the commit is already on your local branch.
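Because the layout is fixed, the install URL is a pure function of the base URL and the pack's category, name, and version. A small sketch, with hypothetical helper names, for scripts that need to compute it without running the CLI:

```python
from pathlib import PurePosixPath


def pack_relative_path(category: str, name: str, version: str) -> str:
    """Path of a pack inside the registry repo: packs/<category>/<name>-<version>.skillpack."""
    return str(PurePosixPath("packs") / category / f"{name}-{version}.skillpack")


def install_url(base_url: str, category: str, name: str, version: str) -> str:
    """Join the public base URL with the pack's path in the registry."""
    return f"{base_url.rstrip('/')}/{pack_relative_path(category, name, version)}"


print(install_url(
    "https://raw.githubusercontent.com/ficiverson/skill-registry/main",
    "evaluation",
    "ai-eng-evaluator",
    "1.0.0",
))
```

For the values shown, this prints the same Install URL as the publish output above.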

Install from a URL — skills-forge install accepts a URL alongside the existing local-path form:

# Direct URL — works for any https:// pointing at a .skillpack
skills-forge install https://raw.githubusercontent.com/ficiverson/skill-registry/main/packs/evaluation/ai-eng-evaluator-1.0.0.skillpack

# With sha256 verification (recommended — digest from index.json)
skills-forge install https://raw.githubusercontent.com/ficiverson/skill-registry/main/packs/evaluation/ai-eng-evaluator-1.0.0.skillpack \
  --sha256 10d16ba0db7b768219d0adb6c3dd8ea68b62e9f719a0132fdcd2bcf10271c0e6

# Local install still works exactly as before
skills-forge install output_skills/evaluation/ai-eng-evaluator

Behind the scenes the URL form fetches the pack to a temp file, verifies the sha256 if you supplied one, unpacks it via the existing unpack flow, and then installs each contained skill into ~/.claude/skills/ (or .claude/skills/ with --scope project).

Multi-platform install (--target) — skills-forge supports every major agent-CLI tool. All targets use the same SKILL.md format; only the destination path differs:

# Install into Gemini CLI (global)
skills-forge install output_skills/development/python-tdd --target gemini

# Install into OpenAI Codex (global)
skills-forge install output_skills/development/python-tdd --target codex

# Universal project path — works for all tools (recommended for shared repos)
skills-forge install output_skills/development/python-tdd --target agents --scope project

# VS Code Copilot (project-only — no global skills dir in VS Code)
skills-forge install output_skills/development/python-tdd --target vscode --scope project

# Write to every supported tool at once
skills-forge install output_skills/development/python-tdd --target all

Export to chatbot / API platforms (export --format) — agent-CLI tools (Claude Code, Gemini CLI, Codex, VS Code) load SKILL.md natively via install --target. For platforms that have no file-system skill directory, use export to render the skill in their native format:

`--format`	Output file	Target platform
`system-prompt` (default)	`<name>.system-prompt.md`	Any chat UI system-prompt field
`gpt-json`	`<name>.gpt.json`	OpenAI Custom GPT / Assistants API
`gem-txt`	`<name>.gem.txt`	Google Gemini Gems
`bedrock-xml`	`<name>.bedrock.xml`	AWS Bedrock agent prompt template
`mcp-server`	`<name>-mcp-server.py`	Any MCP-capable host (Claude Desktop, Cursor, …)

# Plain system prompt — paste into any chat UI
skills-forge export ./packs/productivity-1.0.0.skillpack

# OpenAI Custom GPT config JSON
skills-forge export ./packs/productivity-1.0.0.skillpack --format gpt-json

# Gemini Gem instructions
skills-forge export ./packs/productivity-1.0.0.skillpack --format gem-txt

# AWS Bedrock XML prompt template
skills-forge export ./packs/productivity-1.0.0.skillpack --format bedrock-xml

# Self-contained Python MCP Prompts server
skills-forge export ./packs/productivity-1.0.0.skillpack --format mcp-server -o ./exports/
# → Run with: uvx --from "mcp[cli]" mcp run ./exports/productivity-mcp-server.py

The MCP server format deserves special mention: it generates a single runnable Python file that exposes the skill as an MCP Prompts primitive. Any MCP-compatible host (Claude Desktop, Cursor, VS Code, OpenAI desktop app) can connect via stdio and inject the skill at inference time — no installation step on the end user's machine. The generated file includes a ready-to-paste mcpServers configuration for Claude Desktop using uvx.

Private repos: set GITHUB_TOKEN in your environment and the fetcher will send it as a token in the Authorization header on raw.githubusercontent.com requests, so private registries work without any extra configuration.

Integrity — every published pack records its sha256 in index.json, and install --sha256 ... verifies the download against that digest before unpacking. The fetcher also caps downloads at 50 MB by default to refuse runaway responses.
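Both safeguards are easy to reproduce when you script your own downloads. The sketch below works on any binary file-like object; `read_capped` and `verify_sha256` are hypothetical names, and treating "50 MB" as 50 × 1024 × 1024 bytes is an assumption.

```python
import hashlib
import io
from typing import BinaryIO

MAX_BYTES = 50 * 1024 * 1024  # assumed meaning of the 50 MB cap


def read_capped(stream: BinaryIO, limit: int = MAX_BYTES) -> bytes:
    """Read the stream, refusing anything larger than `limit` bytes."""
    data = stream.read(limit + 1)  # one extra byte reveals an oversized body
    if len(data) > limit:
        raise ValueError(f"download exceeds {limit} bytes")
    return data


def verify_sha256(data: bytes, expected_hex: str) -> None:
    """Raise ValueError unless the digest of `data` matches `expected_hex`."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_hex.lower():
        raise ValueError(f"sha256 mismatch: expected {expected_hex}, got {actual}")


payload = read_capped(io.BytesIO(b"pretend this is a .skillpack"))
verify_sha256(payload, hashlib.sha256(payload).hexdigest())  # passes silently
```

Verifying before unpacking matters: a pack that fails the digest check should never reach the unpack step.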

Why this beats the alternatives

- vs gh release: no per-version release noise, plain file URLs are simpler to share, and the registry is one browsable folder.
- vs S3 / R2: no AWS account, no IAM, no boto3 dependency. Free for public registries.
- vs Slack uploads: discoverable. New teammates find every published skill in one place instead of digging through channel history.
- vs synced folders: works across orgs and works for open-source distribution, not just intra-team.

Templates

Four templates ship in templates/:

- minimal: just frontmatter, principles, instructions, constraints
- with-references: adds references, examples, hints, and depends_on
- with-scripts: adds scripts, examples, assets, and validation
- full-featured: all sections (workflow steps, references, examples, assets, hints, depends_on, validation scripts)

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests (178 tests)
pytest

# Lint code
ruff check src/ tests/

# Type check
mypy src/

License

MIT

I decided to create this repo after reading this eferro post on encoding experience into AI skills.
