dev-plugins

A Claude Code plugin marketplace for development tooling — with built-in evaluation harnesses for each plugin.

Designed as a reference implementation demonstrating how to build Claude Code plugins with rigorous, eval-driven development.

Quick Demo

# 1. Install dependencies
npm install

# 2. Set your API key (used by eval harness)
echo "ANTHROPIC_API_KEY=your-key-here" > .env

# 3. Run evals for one plugin and view results
npm run eval:readiness
npx promptfoo view

How Evals Work

┌──────────┐    ┌───────────┐    ┌──────────────────┐    ┌─────────┐
│   Task   │───▶│  Trial    │───▶│     Graders      │───▶│ Outcome │
│ (test    │    │  (single  │    │ • deterministic  │    │ pass@k  │
│  case in │    │  prompt-  │    │ • llm-rubric     │    │ pass^k  │
│  suite)  │    │  foo run) │    │ • transcript     │    │ scores  │
└──────────┘    └───────────┘    └──────────────────┘    └─────────┘

See BASELINE.md for current eval metrics and docs/EVAL_TAXONOMY.md for how our eval concepts map to the Anthropic "Demystifying Evals" article.
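
As a rough illustration of the two outcome metrics in the diagram, the sketch below estimates pass@k (at least one of k trials passes) and pass^k (all k trials pass) from n recorded trials with c passes. It uses the standard combinatorial estimators and is not the repo's `compute-pass-at-k.py`; the function names `pass_at_k` and `pass_hat_k` are assumptions made for this example.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate P(at least one of k trials passes), given c passes in n trials."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # 1 - (ways to pick k failing trials) / (ways to pick any k trials)
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Estimate P(all k trials pass), given c passes in n trials."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # (ways to pick k passing trials) / (ways to pick any k trials)
    return comb(c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical task: 5 trials, 3 of which passed.
    for k in (1, 3):
        print(f"k={k}: pass@k={pass_at_k(5, 3, k):.2f}  pass^k={pass_hat_k(5, 3, k):.2f}")
```

With 5 trials and 3 passes, pass@3 is 1.0 (any 3 trials must include a pass) while pass^3 is 0.1. The gap between the two is why a task can look solved under pass@k and still be unreliable under pass^k.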

Plugins

frontend-dev

React component scaffolding, accessibility audits, responsive design checks, component refactoring, and design system compliance.

Commands:

/frontend-dev:scaffold-component — Scaffold a React component with props, types, tests, and story
/frontend-dev:a11y-audit — WCAG 2.1 AA compliance audit using axe-core patterns
/frontend-dev:responsive-check — Responsive design audit (media queries, viewport, touch targets)
/frontend-dev:refactor — React component refactoring (decompose, extract hooks, reduce complexity)
/frontend-dev:design-system — Design system compliance (tokens vs hardcoded values)
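
To make "tokens vs hardcoded values" concrete, here is a minimal sketch of a deterministic check that flags raw hex colors and pixel values in CSS while ignoring the token definitions themselves. It is an illustration only, not the plugin's implementation; the patterns, the `find_hardcoded_values` helper, and the sample stylesheet are all assumptions.

```python
import re

# Hypothetical patterns for this sketch: hex colors and raw pixel lengths.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})\b")
PX_VALUE = re.compile(r"\b\d+(?:\.\d+)?px\b")

# Made-up stylesheet: one token definition, one token use, two hardcoded values.
SAMPLE_CSS = """\
:root {
  --color-primary: #0055ff;
}
.button {
  color: var(--color-primary);
  background: #fff;
  padding: 12px;
}
"""


def find_hardcoded_values(css: str) -> list[tuple[int, str]]:
    """Return (line number, value) pairs for values that bypass design tokens."""
    findings = []
    for lineno, line in enumerate(css.splitlines(), start=1):
        # Custom property declarations are where tokens get their raw values.
        if line.strip().startswith("--"):
            continue
        for pattern in (HEX_COLOR, PX_VALUE):
            for match in pattern.finditer(line):
                findings.append((lineno, match.group()))
    return findings


if __name__ == "__main__":
    for lineno, value in find_hardcoded_values(SAMPLE_CSS):
        print(f"line {lineno}: hardcoded value {value}")
```

A line-based regex check like this is cheap and repeatable, which suits a deterministic grader, but it would miss values in inline styles or CSS-in-JS; a real audit would likely need a parser.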

ai-readiness

Assess how ready a repository and its git history are for AI coding assistants. Audits cover code quality, security, testing, architecture, git health, and API design.

Commands:

/ai-readiness:full-audit — 10-section comprehensive AI readiness audit
/ai-readiness:git-health — 71 git anti-patterns with DORA-based severity scoring
/ai-readiness:code-review — 7-category weighted code review and static analysis
/ai-readiness:architecture — 6-category architecture review with SOLID principles
/ai-readiness:security — 6-category security review (OWASP, auto-fail on critical)
/ai-readiness:testing — Test quality: patterns, desiderata, pyramid analysis
/ai-readiness:api-review — 7-category API design and contract review
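
Several of these commands describe weighted category scoring, and the security review adds an auto-fail on critical findings. One way such a score could be combined is sketched below. The category names, weights, and the 0.7 pass threshold are hypothetical and are not taken from the plugin.

```python
from dataclasses import dataclass

# Hypothetical pass threshold for this sketch.
PASS_THRESHOLD = 0.7


@dataclass
class CategoryResult:
    name: str
    weight: float               # relative importance; weights need not sum to 1
    score: float                # category score between 0.0 and 1.0
    has_critical: bool = False  # True if any critical finding was raised


def overall_score(results: list[CategoryResult]) -> tuple[float, bool]:
    """Return (weighted average score, passed) for a set of category results."""
    total_weight = sum(r.weight for r in results)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    weighted = sum(r.weight * r.score for r in results) / total_weight
    # A single critical finding fails the review regardless of the average.
    auto_fail = any(r.has_critical for r in results)
    return weighted, (not auto_fail) and weighted >= PASS_THRESHOLD


if __name__ == "__main__":
    # Made-up category results for illustration.
    results = [
        CategoryResult("injection", weight=3, score=0.9),
        CategoryResult("secrets", weight=2, score=1.0),
        CategoryResult("dependencies", weight=1, score=0.5),
    ]
    score, passed = overall_score(results)
    print(f"score={score:.2f} passed={passed}")
```

Keeping the auto-fail separate from the weighted average means a high average cannot mask a critical finding.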

Project Structure

dev-plugins/
├── plugins/           # What ships to users (commands, skills, agents, hooks)
│   ├── frontend-dev/
│   └── ai-readiness/
├── evals/             # Per-plugin eval suites, graders, fixtures (stays in repo)
│   ├── frontend-dev/
│   └── ai-readiness/
├── eval-infra/        # Shared eval utilities, scripts, rubric templates
└── docs/              # Contributor and learner guides

Getting Started

# Install dependencies
npm install

# Set your Anthropic API key in .env (gitignored)
echo "ANTHROPIC_API_KEY=your-key-here" > .env

Run evals

# Single plugin
npm run eval:frontend
npm run eval:readiness

# All plugins
npm run eval:all

View results

# Interactive web viewer
npx promptfoo view

# Compute pass@k metrics
python eval-infra/scripts/compute-pass-at-k.py --results evals/ai-readiness/.promptfoo/output.json --k 1 3 5

See docs/GETTING_STARTED.md for detailed setup instructions.

Tooling

| Tool      | Purpose                             |
|-----------|-------------------------------------|
| Promptfoo | Eval harness + LLM grading          |
| ESLint    | Code-based grading (lint)           |
| Prettier  | Code-based grading (format)         |
| axe-core  | Accessibility assertion engine      |
| Vite      | Test fixture builds (frontend-dev)  |

Documentation

Getting Started — Setup and first eval run
Eval Philosophy — Principles of eval-driven development
Eval Taxonomy — Maps Anthropic article concepts to this repo
Writing Evals — How to write test suites
Grader Guide — Grader types and implementation patterns
Adding a Plugin — Step-by-step guide for new plugins

License

MIT
