feat: skill pinning v2 — warn-on-drift (opt-in, non-blocking)#90
Open
Xaik89 wants to merge 5 commits intonode9-ai:devfrom
Open
feat: skill pinning v2 — warn-on-drift (opt-in, non-blocking)#90Xaik89 wants to merge 5 commits intonode9-ai:devfrom
Xaik89 wants to merge 5 commits intonode9-ai:devfrom
Conversation
…AST 07) Extends the MCP tool pinning primitive (v1.5.0 / PR node9-ai#81) to agent skill repositories. On the first tool call of a session, SHA-256 hashes are recorded for every skill file in known roots (~/.claude/skills/, ~/.claude/CLAUDE.md, ~/.claude/rules/, project .claude/CLAUDE.md, .claude/CLAUDE.local.md, .claude/rules/, .cursor/rules/, AGENTS.md, CLAUDE.md). On subsequent sessions the hook re-verifies; any drift quarantines the session and blocks every tool call until a human reviews via `node9 skill pin update <rootKey>`. One feature, two threats covered in a single primitive — AST 02 Supply Chain Compromise and AST 07 Update Drift — at the skills layer that no competitor currently defends at runtime. Security properties: - Fail-closed on corrupt pin file - Symlink-safe (never follows symlinks out of the tree) - Size-capped at 5000 files / 50 MB per root - Path-traversal-safe session IDs (/^[A-Za-z0-9_-]{1,128}$/) - Atomic writes, mode 0o600 for ~/.node9/skill-pins.json and flags CLI: - `node9 skill pin list` — show pinned roots, hashes, file counts - `node9 skill pin update <rootKey> [--yes]` — diff + re-pin - `node9 skill pin reset` — clear pins AND wipe session flags Config: - `policy.skillRoots: string[]` extends the default root set (absolute, `~/`-prefixed, or cwd-relative; relative paths require absolute cwd). Tests (TDD, all green): - src/__tests__/skill-pin.unit.test.ts (36 tests) - src/__tests__/skill-pin-cli.integration.test.ts (7 tests, spawnSync) - src/__tests__/check-skill-pin.integration.test.ts (8 tests, spawnSync) - src/__tests__/skill-roots-config.spec.ts (4 tests) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Feedback: ~2k lines was too much. Trim to the security-essential surface while preserving the AST 02 + AST 07 coverage. Dropped (not essential to the rug-pull defense): - computePinDiff + fileManifest tracking in pin entries - @inquirer/prompts dependency (no interactive review prompt) - per-file diff display in `skill pin update` - `update` now mirrors `mcp pin update` exactly: just removes the pin so the next session re-pins with current state - GC loop for session flags older than 7 days (not security-essential) - 4 integration tests covering short-circuit / unchanged-session / quarantine-persists / relative-cwd edge cases — the 4 load-bearing scenarios remain: first-call pins, drift blocks, corrupt fails closed, missing session_id skips Net: -945 / +258 lines. PR now ~1150 lines (down from ~1835). All security properties preserved: fail-closed on corrupt pin file, symlink-safe, size-capped, path-traversal-safe session IDs, atomic writes with mode 0o600, per-session memoisation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Feedback: section was too long. Collapse to one paragraph that cross-references the MCP pinning section above (same pattern, same defense, different target) instead of restating the full rationale. 14 lines → 3. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses PR node9-ai#89 review feedback: - enabled: false by default (opt-in only) - mode 'warn' (default): /dev/tty notification on drift, tool allowed (exit 0) - mode 'block': hard quarantine on drift, tool blocked (exit 2) — for installed/registry skills where changes are genuinely suspicious Config: policy.skillPinning: { enabled, mode: 'warn'|'block', roots } replaces the flat policy.skillRoots field. Session flags now have three states: 'verified', 'warned', 'quarantined'. Warn mode never writes 'quarantined' — warn + allow + memoize so the /dev/tty notification shows once per session, not on every tool call. Core hashing, CLI, and security properties are unchanged from v1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Most skills are installed from registries/shared repos, not authored by the user — making them closer to MCP tools in threat profile. Updated the README paragraph to lead with this framing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
v2 rework addressing feedback from PR #89 (closed). Same core hashing + security properties; different default behavior.
What changed from v1:
enabled: falseby default — opt-in only, per reviewer requestmode: 'warn'(default when enabled) — shows/dev/ttynotification on drift, tool call allowed (exit 0). No JSON block payload. The warning appears once per session (memoised via 'warned' flag).mode: 'block'(opt-in strict) — hard quarantine on drift for installed/registry skills where changes are genuinely suspicious (same as v1 behavior)policy.skillPinning: { enabled, mode: 'warn'|'block', roots }replaces flatskillRoots: string[]dev, notmainWhat stayed from v1 (unchanged):
src/skill-pin.ts— SHA-256 hashing, symlink-safe, size-capped, atomic writes, mode 0o600src/cli/commands/skill-pin.ts—list | update <rootKey> | reset~/.node9/skill-sessions/User insight incorporated: many skills are installed from registries, not authored in-house. For those,
mode: 'block'gives strict enforcement. For user-edited files like CLAUDE.md,mode: 'warn'avoids the false-positive problem the v1 review identified.Test plan
skill-pin.unit.test.ts(16): hash contract, pin ops, fail-closedskill-pin-cli.integration.test.ts(6): list/update/resetcheck-skill-pin.integration.test.ts(6): disabled skips, warn exits 0 + flag 'warned', warn memoised, block exits 2 + quarantine, corrupt fail-closedskill-roots-config.spec.ts(4): enabled/mode/roots defaults, merge, dedup, partial overridenpm test→ 1193 / 1194 pass (one pre-existing hud flake)npm run typecheck,lint,format:check,build— all clean🤖 Generated with Claude Code