Skip to content

feat: skill pinning v2 — warn-on-drift (opt-in, non-blocking)#90

Open
Xaik89 wants to merge 5 commits intonode9-ai:devfrom
Xaik89:feat/skills-pinning-v2
Open

feat: skill pinning v2 — warn-on-drift (opt-in, non-blocking)#90
Xaik89 wants to merge 5 commits intonode9-ai:devfrom
Xaik89:feat/skills-pinning-v2

Conversation

@Xaik89
Copy link
Copy Markdown

@Xaik89 Xaik89 commented Apr 14, 2026

Summary

v2 rework addressing feedback from PR #89 (closed). Same core hashing + security properties; different default behavior.

What changed from v1:

  • enabled: false by default — opt-in only, per reviewer request
  • mode: 'warn' (default when enabled) — shows /dev/tty notification on drift, tool call allowed (exit 0). No JSON block payload. The warning appears once per session (memoised via 'warned' flag).
  • mode: 'block' (opt-in strict) — hard quarantine on drift for installed/registry skills where changes are genuinely suspicious (same as v1 behavior)
  • Config shape: policy.skillPinning: { enabled, mode: 'warn'|'block', roots } replaces flat skillRoots: string[]
  • Targets dev, not main

What stayed from v1 (unchanged):

  • src/skill-pin.ts — SHA-256 hashing, symlink-safe, size-capped, atomic writes, mode 0o600
  • src/cli/commands/skill-pin.tslist | update <rootKey> | reset
  • Session memoisation in ~/.node9/skill-sessions/
  • Path-traversal-safe session IDs

User insight incorporated: many skills are installed from registries, not authored in-house. For those, mode: 'block' gives strict enforcement. For user-edited files like CLAUDE.md, mode: 'warn' avoids the false-positive problem the v1 review identified.

Test plan

  • 32 new tests across 4 files — all pass
    • skill-pin.unit.test.ts (16): hash contract, pin ops, fail-closed
    • skill-pin-cli.integration.test.ts (6): list/update/reset
    • check-skill-pin.integration.test.ts (6): disabled skips, warn exits 0 + flag 'warned', warn memoised, block exits 2 + quarantine, corrupt fail-closed
    • skill-roots-config.spec.ts (4): enabled/mode/roots defaults, merge, dedup, partial override
  • Full suite: npm test1193 / 1194 pass (one pre-existing hud flake)
  • npm run typecheck, lint, format:check, build — all clean

🤖 Generated with Claude Code

Xaik89 and others added 5 commits April 14, 2026 22:10
…AST 07)

Extends the MCP tool pinning primitive (v1.5.0 / PR node9-ai#81) to agent skill
repositories. On the first tool call of a session, SHA-256 hashes are
recorded for every skill file in known roots (~/.claude/skills/,
~/.claude/CLAUDE.md, ~/.claude/rules/, project .claude/CLAUDE.md,
.claude/CLAUDE.local.md, .claude/rules/, .cursor/rules/, AGENTS.md,
CLAUDE.md). On subsequent sessions the hook re-verifies; any drift
quarantines the session and blocks every tool call until a human
reviews via `node9 skill pin update <rootKey>`.

One feature, two threats covered in a single primitive — AST 02
Supply Chain Compromise and AST 07 Update Drift — at the skills
layer that no competitor currently defends at runtime.

Security properties:
- Fail-closed on corrupt pin file
- Symlink-safe (never follows symlinks out of the tree)
- Size-capped at 5000 files / 50 MB per root
- Path-traversal-safe session IDs (/^[A-Za-z0-9_-]{1,128}$/)
- Atomic writes, mode 0o600 for ~/.node9/skill-pins.json and flags

CLI:
- `node9 skill pin list` — show pinned roots, hashes, file counts
- `node9 skill pin update <rootKey> [--yes]` — diff + re-pin
- `node9 skill pin reset` — clear pins AND wipe session flags

Config:
- `policy.skillRoots: string[]` extends the default root set (absolute,
  `~/`-prefixed, or cwd-relative; relative paths require absolute cwd).

Tests (TDD, all green):
- src/__tests__/skill-pin.unit.test.ts         (36 tests)
- src/__tests__/skill-pin-cli.integration.test.ts (7 tests, spawnSync)
- src/__tests__/check-skill-pin.integration.test.ts (8 tests, spawnSync)
- src/__tests__/skill-roots-config.spec.ts     (4 tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Feedback: ~2k lines was too much. Trim to the security-essential
surface while preserving the AST 02 + AST 07 coverage.

Dropped (not essential to the rug-pull defense):
- computePinDiff + fileManifest tracking in pin entries
- @inquirer/prompts dependency (no interactive review prompt)
- per-file diff display in `skill pin update`
- `update` now mirrors `mcp pin update` exactly: just removes the pin
  so the next session re-pins with current state
- GC loop for session flags older than 7 days (not security-essential)
- 4 integration tests covering short-circuit / unchanged-session /
  quarantine-persists / relative-cwd edge cases — the 4 load-bearing
  scenarios remain: first-call pins, drift blocks, corrupt fails
  closed, missing session_id skips

Net: -945 / +258 lines. PR now ~1150 lines (down from ~1835).

All security properties preserved: fail-closed on corrupt pin file,
symlink-safe, size-capped, path-traversal-safe session IDs, atomic
writes with mode 0o600, per-session memoisation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Feedback: section was too long. Collapse to one paragraph that
cross-references the MCP pinning section above (same pattern, same
defense, different target) instead of restating the full rationale.
14 lines → 3.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses PR node9-ai#89 review feedback:
- enabled: false by default (opt-in only)
- mode 'warn' (default): /dev/tty notification on drift, tool allowed (exit 0)
- mode 'block': hard quarantine on drift, tool blocked (exit 2) — for
  installed/registry skills where changes are genuinely suspicious

Config: policy.skillPinning: { enabled, mode: 'warn'|'block', roots }
replaces the flat policy.skillRoots field.

Session flags now have three states: 'verified', 'warned', 'quarantined'.
Warn mode never writes 'quarantined' — warn + allow + memoize so the
/dev/tty notification shows once per session, not on every tool call.

Core hashing, CLI, and security properties are unchanged from v1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Most skills are installed from registries/shared repos, not authored
by the user — making them closer to MCP tools in threat profile.
Updated the README paragraph to lead with this framing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant