SkillProof is an open-source coding-agent skill for verifying whether another coding-agent skill helps in a real repository on a real task slice with acceptable risk.
Instead of trusting README copy or prompt vibes, SkillProof is designed to make an agent:
- inspect the candidate skill
- inspect the target repo
- normalize a task suite and deterministic validators
- run baseline without the skill
- run treatment with the skill
- compare outcomes fairly
- analyze compatibility and risk
- emit an auditable report inside `.skillproof/`
A skill is just a directory with a SKILL.md entrypoint. The fastest path is to symlink this repo into your agent's skill folder.
Set this once:

```bash
export SKILLPROOF_DIR="/absolute/path/to/SkillProof"
```

Or skip manual symlinks and use the installer script:

```bash
python "$SKILLPROOF_DIR/scripts/install_skillproof.py" --agent codex
```

Codex-style personal skills install under `$CODEX_HOME/skills` by convention. The default is `~/.codex/skills`.
Fastest path:

```bash
python "$SKILLPROOF_DIR/scripts/install_skillproof.py" --agent codex
```

Or manually:

```bash
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
mkdir -p "$CODEX_HOME/skills"
ln -sfn "$SKILLPROOF_DIR" "$CODEX_HOME/skills/skillproof"
test -f "$CODEX_HOME/skills/skillproof/SKILL.md"
```

Then use it explicitly in a prompt:

```
Use $skillproof to verify whether this candidate skill helps in this repo on these tasks with acceptable risk.
```
Claude Code personal skills live at `~/.claude/skills/<skill-name>/SKILL.md`.
Fastest path:

```bash
python "$SKILLPROOF_DIR/scripts/install_skillproof.py" --agent claude-personal
```

Or manually:

```bash
mkdir -p "$HOME/.claude/skills"
ln -sfn "$SKILLPROOF_DIR" "$HOME/.claude/skills/skillproof"
test -f "$HOME/.claude/skills/skillproof/SKILL.md"
```

Then invoke it directly:

```
/skillproof Verify whether this candidate skill helps in this repo on these tasks with acceptable risk.
```

Or ask naturally:

```
Verify whether this candidate skill helps in this repo on these tasks with acceptable risk.
```
If you want the skill available only in one repository:
Fastest path:

```bash
python "$SKILLPROOF_DIR/scripts/install_skillproof.py" --agent claude-project --project-dir /path/to/repo
```

Or manually, from the repo root:

```bash
mkdir -p .claude/skills
ln -sfn "$SKILLPROOF_DIR" .claude/skills/skillproof
test -f .claude/skills/skillproof/SKILL.md
```

This is the cleanest option when you want SkillProof committed or shared as a project-specific workflow.
Once the skill is installed, the shortest path to a real report is:

- Scaffold `.skillproof/` into the target repo:

  ```bash
  python "$SKILLPROOF_DIR/scripts/bootstrap_skillproof.py" /path/to/target-repo
  ```

- Edit `/path/to/target-repo/.skillproof/verification.yaml`.
- Edit `/path/to/target-repo/.skillproof/tasks.yaml`.
- Run the verification loop:

  ```bash
  python "$SKILLPROOF_DIR/scripts/run_skillproof.py" --repo /path/to/target-repo
  ```

- Open the report:

  ```bash
  open /path/to/target-repo/.skillproof/reports/skill-report.md
  ```

If you only want to try SkillProof once, you can skip agent installation entirely and run the two bundled scripts directly.
Most users do not need to understand the internals first.
They need to know:
- where to install the skill
- how to invoke it
- how to get the first report
- where the outputs land
That is the whole adoption path.
SkillProof is a skill repository first.
The primary interface is SKILL.md. The bundled Python scripts are just reliability tooling that help the skill scaffold and execute the verification loop.
The recommended user path is:
- install or symlink the repo into the agent's skill folder
- invoke `skillproof`
- let the bundled scripts create and run `.skillproof/`
```
.
├── SKILL.md
├── PRD.md
├── README.md
├── LICENSE
├── .gitignore
├── agents/
│   └── openai.yaml
├── references/
│   ├── artifact-contract.md
│   ├── compatibility-model.md
│   ├── risk-model.md
│   └── verification-method.md
├── scripts/
│   ├── bootstrap_skillproof.py
│   ├── run_skillproof.py
│   └── skillproof_core.py
└── tests/
```
```
Use $skillproof to verify whether this candidate skill helps in this repo on these tasks with acceptable risk.
```

For Claude Code, direct invocation via `/skillproof` should also work because the skill name is `skillproof`.
SkillProof leaves durable evidence inside the target repo:
- `.skillproof/reports/skill-report.md`
- `.skillproof/reports/skill-report.json`
- `.skillproof/reports/run-manifest.json`
- `.skillproof/reports/task-results.json`
- `.skillproof/risk/risk-summary.json`
- `.skillproof/compatibility/compatibility-summary.json`
If the user can find those files, the run succeeded.
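That check is easy to make programmatic. A minimal sketch, assuming the artifact paths listed above; the `missing_artifacts` helper is illustrative, not part of SkillProof:

```python
from pathlib import Path

# Artifact paths from the list above, relative to the target repo root.
ARTIFACTS = [
    ".skillproof/reports/skill-report.md",
    ".skillproof/reports/skill-report.json",
    ".skillproof/reports/run-manifest.json",
    ".skillproof/reports/task-results.json",
    ".skillproof/risk/risk-summary.json",
    ".skillproof/compatibility/compatibility-summary.json",
]

def missing_artifacts(repo_root: Path) -> list[str]:
    """Return the expected artifacts that are absent under repo_root."""
    return [p for p in ARTIFACTS if not (repo_root / p).is_file()]
```

An empty return value means the run left a complete evidence trail.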
For v1, the `.yaml` files are JSON-compatible YAML so they can be parsed without third-party dependencies.
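Because every config is valid JSON, the standard library is enough to read one. A minimal sketch; `load_config` is an illustrative name, not SkillProof's actual API:

```python
import json
from pathlib import Path

def load_config(path: str) -> dict:
    """Parse a JSON-compatible .yaml file using only the stdlib json module."""
    return json.loads(Path(path).read_text())
```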
Example `.skillproof/verification.yaml`:

```json
{
  "adapter": "shell-command-template",
  "candidate_skill": "../skills/my-skill",
  "baseline_command_template": "codex exec --repo {repo} --task {task_prompt}",
  "treatment_command_template": "codex exec --repo {repo} --task {task_prompt} --skill {candidate_skill}",
  "timeout_seconds": 600,
  "repeats": 1,
  "artifact_root": ".skillproof"
}
```

Example `.skillproof/tasks.yaml`:
```json
{
  "tasks": [
    {
      "id": "bugfix-001",
      "prompt": "Fix the bug and preserve the existing contract.",
      "tags": ["backend", "bugfix"],
      "scope": ["src/"],
      "validation_commands": [
        "pytest tests/test_contract.py -q"
      ]
    }
  ]
}
```

SkillProof v1 supports one adapter: `shell-command-template`.
The command templates can interpolate:

- `{repo}`
- `{task_id}`
- `{task_prompt}`
- `{task_scope}`
- `{task_tags}`
- `{candidate_skill}`
- `{python}`
- `{condition}`
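Placeholder substitution of this kind maps naturally onto Python's `str.format`. A hedged sketch; the function name and the shell-quoting choice are assumptions, not SkillProof's actual adapter code:

```python
import shlex

def render_command(template: str, **values: str) -> str:
    # Quote each value so prompts containing spaces survive the shell.
    quoted = {key: shlex.quote(val) for key, val in values.items()}
    return template.format(**quoted)

cmd = render_command(
    "codex exec --repo {repo} --task {task_prompt}",
    repo="/path/to/target-repo",
    task_prompt="Fix the bug and preserve the existing contract.",
)
```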
Example baseline and treatment templates for Codex:

```
codex exec --repo {repo} --task {task_prompt}
codex exec --repo {repo} --task {task_prompt} --skill {candidate_skill}
```

And for Claude Code:

```
claude-code run --cwd {repo} --prompt {task_prompt}
claude-code run --cwd {repo} --prompt {task_prompt} --append-skill {candidate_skill}
```
SkillProof does not ship hard-coded agent integrations in v1. The shell template is the portability layer.
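One way such a portability layer can execute a rendered template is a plain subprocess call bounded by the configured `timeout_seconds`. A sketch under that assumption, not SkillProof's actual runner:

```python
import subprocess

def run_condition(cmd: str, timeout_seconds: int = 600):
    """Run one baseline or treatment command; returns (exit_code, stdout).

    exit_code is None if the command exceeded the timeout.
    """
    try:
        proc = subprocess.run(
            cmd, shell=True, capture_output=True, text=True,
            timeout=timeout_seconds,
        )
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired:
        return None, ""
```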
- Verify the symlink or copied directory ends in `skillproof`.
- Verify the target path contains `SKILL.md`.
- Re-run the installer script if the symlink target moved.
- For Codex, invoke it explicitly with `Use $skillproof ...`.
- For Claude Code, invoke it explicitly with `/skillproof ...`.
- Check that `verification.yaml` points to a real candidate skill path.
- Check that the baseline and treatment command templates actually run in your environment.
- Check that each task has at least one deterministic `validation_command`.
- Run `python "$SKILLPROOF_DIR/scripts/run_skillproof.py" --repo /path/to/target-repo` directly to isolate install issues from runtime issues.
Tell them to do exactly this:
- symlink the repo into the agent's skill folder
- run `bootstrap_skillproof.py`
- fill in `verification.yaml` and `tasks.yaml`
- run `run_skillproof.py`
- open `skill-report.md`
SkillProof makes its judgment along three axes described in the PRD:
- utility
- compatibility
- risk
The recommendation is derived from those axes, but the underlying evidence remains visible.
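A toy decision rule illustrates how the three axes could combine into a recommendation; the thresholds, labels, and function below are illustrative assumptions, not SkillProof's actual policy:

```python
def recommend(utility_delta: float, compatible: bool, risk_level: str) -> str:
    # Illustrative only: SkillProof's real policy lives in its scripts and PRD.
    if not compatible or risk_level == "high":
        return "reject"       # incompatible or risky skills lose outright
    if utility_delta > 0:
        return "adopt"        # the skill measurably helped on the task suite
    return "no-benefit"       # safe, but no measured improvement
```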
The current repository includes:
- a publishable skill definition
- reference docs for the verification method
- bootstrap and run tooling
- tests for bootstrapping, risk analysis, compatibility analysis, and end-to-end reporting