An agent skill verification framework for testing AI coding assistants.
Skillcheck is a framework for verifying that AI coding assistants can successfully execute specific skills. It provides:
- Standardized test definitions - YAML-based skill test specifications
- Automated verification - Compare agent output against expected criteria
- Detailed reporting - JSON and Markdown reports for analysis
- Extensible architecture - Easy to add new skills and verifiers
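To illustrate what a YAML-based skill test specification might look like, here is a minimal sketch; the field names (`skill`, `prompt`, `expected`, `checks`) are illustrative assumptions, not the framework's actual schema:

```yaml
# Hypothetical test definition sketch -- field names are illustrative,
# not Skillcheck's actual schema.
skill: docx_create
prompt: "Create a Word document with a title page and two sections."
expected:
  output_file: agent_output.docx
  checks:
    - type: file_exists
    - type: paragraph_count
      min: 3
```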
| Skill | Description | Status |
|---|---|---|
| docx_create | Create Word documents from scratch | Available |
Skillcheck can test any AI coding assistant that can execute prompts. Tested with:
- Claude Code
- Codex CLI
- Gemini CLI
- OpenCrawl
Skillcheck uses a baseline comparison approach:
- Define - Create a test definition with a prompt and expected criteria
- Execute - Give the prompt to an AI agent and capture its output
- Verify - Compare the output against the expected criteria
- Report - Generate detailed pass/fail reports with metrics
This allows objective measurement of agent capabilities across different skills.
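The four steps above can be sketched in a few lines of Python. This is a simplified illustration of the baseline comparison idea, not Skillcheck's actual API; the `SkillTest` class and `verify` function are hypothetical names:

```python
"""Minimal sketch of the Define -> Execute -> Verify -> Report loop.
Class and function names are illustrative, not Skillcheck's real API."""
from dataclasses import dataclass


@dataclass
class SkillTest:
    prompt: str     # what to ask the agent (Define)
    expected: dict  # criteria the captured output must satisfy


def verify(test: SkillTest, output: dict) -> dict:
    """Compare agent output against expected criteria (Verify step)."""
    failures = [
        key for key, want in test.expected.items()
        if output.get(key) != want
    ]
    return {"passed": not failures, "failures": failures}


# Define the test; the Execute step (running the agent and capturing
# its output) happens elsewhere and yields the `output` dict.
test = SkillTest(
    prompt="Create a Word document with a title and one paragraph.",
    expected={"file_created": True, "paragraph_count": 1},
)
report = verify(test, {"file_created": True, "paragraph_count": 1})
print(report)  # {'passed': True, 'failures': []}
```

The Report step would then serialize such result dicts to JSON or Markdown.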
```bash
# Install
pip install -e .

# Show available skills
skillcheck --list-skills

# Run a skill test
cd skills/docx_create
python run_test.py --show-prompt  # See what to ask the agent
python run_test.py --output agent_output.docx --agent "Claude Code"
```

See docs/QUICK_START.md for detailed setup instructions.
```
skillcheck/
├── skillcheck/          # Core framework
│   ├── models.py        # Data models
│   ├── verifier.py      # Base verification
│   ├── runner.py        # Test orchestration
│   └── reporter.py      # Report generation
│
├── skills/              # Skill test suites
│   └── docx_create/     # DOCX creation skill
│
├── docs/                # Documentation
└── tests/               # Framework tests
```
Note on Skills Architecture: Each skill in the `skills/` directory is designed as a standalone test suite. Skills are intentionally not packaged as a Python module (no `__init__.py`) to keep them decoupled and independently distributable. Each skill's `run_test.py` serves as its entry point. The CLI auto-discovers skills by scanning for directories containing `test_definition.yaml`.
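The auto-discovery described above can be sketched as follows; `discover_skills` is an illustrative helper, not the framework's actual CLI code, but it follows the stated rule of scanning for directories that contain `test_definition.yaml`:

```python
from pathlib import Path


def discover_skills(skills_root: str = "skills") -> list[str]:
    """Find skill directories by scanning for test_definition.yaml,
    mirroring how the CLI auto-discovers skills. (Illustrative helper.)"""
    root = Path(skills_root)
    if not root.is_dir():
        return []
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / "test_definition.yaml").is_file()
    )
```

In the project layout shown above, `discover_skills()` would find `docx_create`; directories without a `test_definition.yaml` are ignored.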
See docs/SKILL_TESTS.md for a guide on creating new skill tests.
Skillcheck generates two report formats:
- JSON - Machine-readable for automation and dashboards
- Markdown - Human-readable for review
See docs/REPORT_TEMPLATE.md for example reports.
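A dual-format reporter might look like the sketch below. The `render_reports` function and the result fields (`skill`, `agent`, `passed`) are illustrative assumptions rather than the framework's real reporter API:

```python
import json


def render_reports(result: dict) -> tuple[str, str]:
    """Render one test result as JSON (machine-readable) and Markdown
    (human-readable). Field names are illustrative, not Skillcheck's schema."""
    as_json = json.dumps(result, indent=2)
    status = "PASS" if result["passed"] else "FAIL"
    as_md = (
        f"## {result['skill']}\n\n"
        f"- Agent: {result['agent']}\n"
        f"- Status: {status}\n"
    )
    return as_json, as_md


json_report, md_report = render_reports(
    {"skill": "docx_create", "agent": "Claude Code", "passed": True}
)
```

The JSON output feeds automation and dashboards, while the Markdown output is meant for human review.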
Future versions will include a badge system for verified skills:
```
[Claude Code] DOCX Create: VERIFIED
[Codex CLI]   DOCX Create: VERIFIED
```
MIT