gangj277/SkillProof

SkillProof

SkillProof is an open-source coding-agent skill for verifying whether another coding-agent skill helps in a real repository on a real task slice with acceptable risk.

Instead of trusting README copy or prompt vibes, SkillProof is designed to make an agent:

  • inspect the candidate skill
  • inspect the target repo
  • normalize a task suite and deterministic validators
  • run baseline without the skill
  • run treatment with the skill
  • compare outcomes fairly
  • analyze compatibility and risk
  • emit an auditable report inside .skillproof/

Install In 2 Minutes

A skill is just a directory with a SKILL.md entrypoint. The fastest path is to symlink this repo into your agent's skill folder.

Set this once:

export SKILLPROOF_DIR="/absolute/path/to/SkillProof"

Then either follow the manual symlink steps for your agent below, or skip them and use the installer script:

python "$SKILLPROOF_DIR/scripts/install_skillproof.py" --agent codex

Codex personal install

Codex-style personal skills install under $CODEX_HOME/skills by convention. The default is ~/.codex/skills.

Fastest path:

python "$SKILLPROOF_DIR/scripts/install_skillproof.py" --agent codex

Manual path:

export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
mkdir -p "$CODEX_HOME/skills"
ln -sfn "$SKILLPROOF_DIR" "$CODEX_HOME/skills/skillproof"
test -f "$CODEX_HOME/skills/skillproof/SKILL.md"

Then use it explicitly in a prompt:

Use $skillproof to verify whether this candidate skill helps in this repo on these tasks with acceptable risk.

Claude Code personal install

Claude Code personal skills live at ~/.claude/skills/<skill-name>/SKILL.md.

Fastest path:

python "$SKILLPROOF_DIR/scripts/install_skillproof.py" --agent claude-personal

Manual path:

mkdir -p "$HOME/.claude/skills"
ln -sfn "$SKILLPROOF_DIR" "$HOME/.claude/skills/skillproof"
test -f "$HOME/.claude/skills/skillproof/SKILL.md"

Then invoke it directly:

/skillproof Verify whether this candidate skill helps in this repo on these tasks with acceptable risk.

Or ask naturally:

Verify whether this candidate skill helps in this repo on these tasks with acceptable risk.

Claude Code project-local install

If you want the skill available only in one repository, install it under that repository's .claude/skills/ directory.

Fastest path:

python "$SKILLPROOF_DIR/scripts/install_skillproof.py" --agent claude-project --project-dir /path/to/repo

Manual path (run from the repository root):

mkdir -p .claude/skills
ln -sfn "$SKILLPROOF_DIR" .claude/skills/skillproof
test -f .claude/skills/skillproof/SKILL.md

This is the cleanest option when you want SkillProof committed or shared as a project-specific workflow.
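
The manual install steps in the sections above all follow one pattern: create the skills folder, point a symlink named skillproof at the repo, and confirm SKILL.md is reachable through the link. A minimal Python sketch of that pattern follows. The function name link_skill is hypothetical, and this is not the code of install_skillproof.py, whose behavior is not shown here.

```python
from pathlib import Path


def link_skill(skill_src: Path, skills_dir: Path, name: str = "skillproof") -> Path:
    """Symlink skill_src into skills_dir/name, replacing an existing symlink.

    Hypothetical helper mirroring the mkdir / ln -sfn / test -f steps.
    """
    if not (skill_src / "SKILL.md").is_file():
        raise FileNotFoundError(f"{skill_src} has no SKILL.md entrypoint")

    skills_dir.mkdir(parents=True, exist_ok=True)  # mkdir -p
    link = skills_dir / name

    if link.is_symlink():
        link.unlink()  # replace a stale link, as ln -sfn would
    elif link.exists():
        # Refuse to touch a real file or directory; this is stricter than ln.
        raise FileExistsError(f"{link} exists and is not a symlink")

    link.symlink_to(skill_src.resolve(), target_is_directory=True)

    if not (link / "SKILL.md").is_file():  # test -f
        raise RuntimeError("link created but SKILL.md is not reachable")
    return link
```

For a Codex-style install this would be called as link_skill(Path(skillproof_dir), Path.home() / ".codex" / "skills"), assuming the default CODEX_HOME.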

First Successful Run

Once the skill is installed, the shortest path to a real report is:

  1. Scaffold .skillproof/ into the target repo:

python "$SKILLPROOF_DIR/scripts/bootstrap_skillproof.py" /path/to/target-repo

  2. Edit /path/to/target-repo/.skillproof/verification.yaml.

  3. Edit /path/to/target-repo/.skillproof/tasks.yaml.

  4. Run the verification loop:

python "$SKILLPROOF_DIR/scripts/run_skillproof.py" --repo /path/to/target-repo

  5. Open the report:

open /path/to/target-repo/.skillproof/reports/skill-report.md

If you only want to try SkillProof once, you can skip agent installation entirely and run the two bundled scripts directly.
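
If you prefer to drive that one-off run from Python rather than a shell, the two script invocations can be built as argument lists. This is a small sketch: first_run_commands is a hypothetical helper, and only the script names and the --repo flag come from the steps above.

```python
import sys
from pathlib import Path


def first_run_commands(skillproof_dir: Path, target_repo: Path) -> list[list[str]]:
    """Return the bootstrap and run commands as argv lists (hypothetical helper)."""
    scripts = skillproof_dir / "scripts"
    return [
        # Scaffold .skillproof/ into the target repo.
        [sys.executable, str(scripts / "bootstrap_skillproof.py"), str(target_repo)],
        # Run the verification loop against the same repo.
        [sys.executable, str(scripts / "run_skillproof.py"), "--repo", str(target_repo)],
    ]
```

Each list can be passed to subprocess.run(cmd, check=True). Remember to edit verification.yaml and tasks.yaml between the first and the second command.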

What Users Actually Need To Know

Most users do not need to understand the internals first.

They need to know:

  • where to install the skill
  • how to invoke it
  • how to get the first report
  • where the outputs land

That is the whole adoption path.

What This Is

SkillProof is a skill repository first.

The primary interface is SKILL.md. The bundled Python scripts are reliability tooling that helps the skill scaffold and execute the verification loop.

The recommended user path is:

  1. install or symlink the repo into the agent's skill folder
  2. invoke skillproof
  3. let the bundled scripts create and run .skillproof/

Repository Layout

.
├── SKILL.md
├── PRD.md
├── README.md
├── LICENSE
├── .gitignore
├── agents/
│   └── openai.yaml
├── references/
│   ├── artifact-contract.md
│   ├── compatibility-model.md
│   ├── risk-model.md
│   └── verification-method.md
├── scripts/
│   ├── bootstrap_skillproof.py
│   ├── run_skillproof.py
│   └── skillproof_core.py
└── tests/

Default Invocation

Use $skillproof to verify whether this candidate skill helps in this repo on these tasks with acceptable risk.

For Claude Code, direct invocation via /skillproof should also work because the skill name is skillproof.

Output Files

SkillProof leaves durable evidence inside the target repo:

  • .skillproof/reports/skill-report.md
  • .skillproof/reports/skill-report.json
  • .skillproof/reports/run-manifest.json
  • .skillproof/reports/task-results.json
  • .skillproof/risk/risk-summary.json
  • .skillproof/compatibility/compatibility-summary.json

If the user can find those files, the run succeeded.
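
A quick way to apply that check is to list which of the expected files are missing. The sketch below uses only the paths listed above; the function name missing_outputs is hypothetical, and the default artifact root is assumed to be .skillproof.

```python
from pathlib import Path

# Paths relative to the artifact root, taken from the list above.
EXPECTED_OUTPUTS = [
    "reports/skill-report.md",
    "reports/skill-report.json",
    "reports/run-manifest.json",
    "reports/task-results.json",
    "risk/risk-summary.json",
    "compatibility/compatibility-summary.json",
]


def missing_outputs(repo: Path, artifact_root: str = ".skillproof") -> list[str]:
    """Return the expected output files that do not exist under repo."""
    root = repo / artifact_root
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_file()]
```

An empty result means every listed file is present.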

Config Format

For v1, the .yaml files are written as JSON, which is also valid YAML, so they can be parsed without third-party dependencies.

Example .skillproof/verification.yaml:

{
  "adapter": "shell-command-template",
  "candidate_skill": "../skills/my-skill",
  "baseline_command_template": "codex exec --repo {repo} --task {task_prompt}",
  "treatment_command_template": "codex exec --repo {repo} --task {task_prompt} --skill {candidate_skill}",
  "timeout_seconds": 600,
  "repeats": 1,
  "artifact_root": ".skillproof"
}

Example .skillproof/tasks.yaml:

{
  "tasks": [
    {
      "id": "bugfix-001",
      "prompt": "Fix the bug and preserve the existing contract.",
      "tags": ["backend", "bugfix"],
      "scope": ["src/"],
      "validation_commands": [
        "pytest tests/test_contract.py -q"
      ]
    }
  ]
}
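
Because both files are plain JSON, the standard library is enough to read them. The sketch below loads a config file and flags tasks that have no validation commands. The helper names are hypothetical, and how SkillProof's own scripts parse and validate these files is not shown here.

```python
import json
from pathlib import Path


def load_json_yaml(path: Path) -> dict:
    """Parse a JSON-compatible .yaml file using only the standard library."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def tasks_without_validators(tasks_doc: dict) -> list[str]:
    """Return the ids of tasks whose validation_commands list is missing or empty."""
    return [
        task.get("id", "<missing id>")
        for task in tasks_doc.get("tasks", [])
        if not task.get("validation_commands")
    ]
```

Running tasks_without_validators(load_json_yaml(repo / ".skillproof" / "tasks.yaml")) before a verification run is a cheap preflight check.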

Command Template Notes

SkillProof v1 supports one adapter: shell-command-template.

The command templates can interpolate:

  • {repo}
  • {task_id}
  • {task_prompt}
  • {task_scope}
  • {task_tags}
  • {candidate_skill}
  • {python}
  • {condition}

Codex example

codex exec --repo {repo} --task {task_prompt}
codex exec --repo {repo} --task {task_prompt} --skill {candidate_skill}

Claude Code example

claude-code run --cwd {repo} --prompt {task_prompt}
claude-code run --cwd {repo} --prompt {task_prompt} --append-skill {candidate_skill}

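SkillProof does not ship hard-coded agent integrations in v1. The shell template is the portability layer, so treat the commands above as examples and adapt them to the flags your agent CLI actually accepts.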
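
To see what a template expands to before running it, the placeholders can be filled in with Python's str.format. This is an illustrative sketch only: render_command is a hypothetical helper, and shell-quoting each value with shlex.quote is an assumption about safe usage, not a statement of how SkillProof itself interpolates.

```python
import shlex


def render_command(template: str, **values: str) -> str:
    """Fill {placeholders} in template, shell-quoting every value.

    Raises KeyError if the template uses a placeholder that was not supplied.
    """
    quoted = {key: shlex.quote(str(value)) for key, value in values.items()}
    return template.format(**quoted)


command = render_command(
    "codex exec --repo {repo} --task {task_prompt}",  # template from the README
    repo="/tmp/my repo",
    task_prompt="Fix the bug",
)
print(command)  # codex exec --repo '/tmp/my repo' --task 'Fix the bug'
```

Quoting matters because prompts and paths often contain spaces or shell metacharacters; an unquoted {task_prompt} would be split into several arguments.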

Troubleshooting

The skill does not seem to load

  • Verify the symlink or copied directory ends in skillproof.
  • Verify the target path contains SKILL.md.
  • Re-run the installer script if the symlink target moved.
  • For Codex, invoke it explicitly with "Use $skillproof ...".
  • For Claude Code, invoke it explicitly with "/skillproof ...".

The report never appears

  • Check that verification.yaml points to a real candidate skill path.
  • Check that the baseline and treatment command templates actually run in your environment.
  • Check that each task has at least one deterministic entry in validation_commands.
  • Run python "$SKILLPROOF_DIR/scripts/run_skillproof.py" --repo /path/to/target-repo directly to isolate install issues from runtime issues.

A user wants the lowest-friction path

Tell them to do exactly this:

  1. symlink the repo into the agent's skill folder
  2. run bootstrap_skillproof.py
  3. fill in verification.yaml and tasks.yaml
  4. run run_skillproof.py
  5. open skill-report.md

Verification Model

SkillProof judges a candidate skill along the three axes described in the PRD:

  • utility
  • compatibility
  • risk

The recommendation is derived from those axes, but the underlying evidence remains visible.
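
As a rough illustration of the utility axis, the baseline and treatment runs can be compared by pass rate over the same tasks. Everything in this sketch is hypothetical: the TaskOutcome record, the condition labels, and the pass-rate metric are assumptions for illustration, and the real result schema is defined by the repository's reference docs, not by this example.

```python
from dataclasses import dataclass


@dataclass
class TaskOutcome:
    task_id: str
    condition: str  # assumed labels: "baseline" or "treatment"
    passed: bool    # True if every validation command succeeded


def pass_rate(outcomes: list[TaskOutcome], condition: str) -> float:
    """Fraction of runs under the given condition that passed (0.0 if none ran)."""
    selected = [o for o in outcomes if o.condition == condition]
    if not selected:
        return 0.0
    return sum(o.passed for o in selected) / len(selected)


def utility_delta(outcomes: list[TaskOutcome]) -> float:
    """Treatment pass rate minus baseline pass rate; positive means the skill helped."""
    return pass_rate(outcomes, "treatment") - pass_rate(outcomes, "baseline")
```

A positive delta alone would not justify adopting a skill; compatibility and risk are judged separately, which is why the evidence for each axis stays visible.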

Status

The current repository includes:

  • a publishable skill definition
  • reference docs for the verification method
  • bootstrap and run tooling
  • tests for bootstrapping, risk analysis, compatibility analysis, and end-to-end reporting
