Socratic pre-emission gating layer for AI coding agents.
A Claude Code plugin that forces the host agent to demonstrate domain-specific mastery — via a Socratic probe graded by a separate model — before it's permitted to emit code against the gated domain. Three layers working together: a skill for content, an MCP server for the probe, and a PreToolUse hook for deterministic enforcement.
/plugin marketplace add patrickmvla/femto
/plugin install femto@femto-marketplace
Opt-in per repo by committing a .femto/config.yaml. Disabling femto on a
repo requires a committed config change — visible in code review — so a
pressured engineer cannot silently skip the gate.
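The opt-in rule above can be pictured as a lookup for the committed config file. This is a minimal sketch, not femto's actual lookup: the helper names `find_femto_config` and `is_opted_in` are hypothetical, and it assumes the file's presence alone is the opt-in signal and that the nearest config at or above the working directory applies.

```python
from pathlib import Path
from typing import Optional

# Path taken from the text above.
CONFIG_RELATIVE_PATH = Path(".femto") / "config.yaml"


def find_femto_config(start: Path) -> Optional[Path]:
    """Return the nearest .femto/config.yaml at or above `start`, or None.

    Hypothetical helper: femto's real lookup rules are not documented here.
    """
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / CONFIG_RELATIVE_PATH
        if candidate.is_file():
            return candidate
    return None


def is_opted_in(start: Path) -> bool:
    """ASSUMPTION: a committed config file is what opts a repo in."""
    return find_femto_config(start) is not None
```

Because the signal is a tracked file, removing or changing it shows up as a diff in code review, which is the property the paragraph above relies on.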
Work where the stakes justify the friction: multi-tenancy, auth, data handling, safety-adjacent systems, onboarding engineers to domains they haven't seen before.
For compliance-grade work (HIPAA / PCI / SOC2 / NIST SP800 controls /
internal engineering policies) femto composes with packs you author in your
own repo under <repo>/.femto/packs/*/ — same schema, same structural
guarantees, your content, your trust boundary. See "Three content tiers" on
the Concepts page for the design.
Explicitly not for high-time-pressure emergencies: there the incentive gradient points toward disconnecting the gate, and the org will not mandate the tool.
Femto does not claim guaranteed understanding. It commits to:
- No designed-in bypass. No skip-pedagogy toggle, no "I'm an expert" mode.
- Stacked grading. The probe and grader run in separate contexts on different models; the hook reads grader.md from disk rather than trusting the probe's own summary.
- Measured, published per-domain false-pass rate. Every shipped pack has a harness, test cases, and numbers (AUC / FPR / TPR) you can inspect and replicate. Disclosure labels sit next to every number.
- No "guaranteed understanding" language. Bounded probability, not absolute.
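To illustrate the "reads grader.md from disk" commitment, here is a minimal sketch of a gate that derives its verdict only from the grader's file. The `kc_id: score` line format, the function names, and the treatment of a missing file as a block are all assumptions made for illustration; the real grader.md layout and hook logic are not specified in this README.

```python
import re
from pathlib import Path

# ASSUMPTION: one "kc_id: score" line per knowledge component (KC).
# The real grader.md layout is not specified in this README.
MASTERY_LINE = re.compile(r"^\s*([\w-]+)\s*:\s*([01](?:\.\d+)?)\s*$")


def read_mastery(grader_file: Path) -> dict:
    """Parse per-KC mastery scores from the grader's file on disk."""
    scores = {}
    for line in grader_file.read_text(encoding="utf-8").splitlines():
        match = MASTERY_LINE.match(line)
        if match:
            scores[match.group(1)] = float(match.group(2))
    return scores


def gate(session_dir: Path, required_kcs: list, threshold: float) -> tuple:
    """Return (allowed, failing_kcs) using only what the grader wrote to disk."""
    grader_file = session_dir / "grader.md"
    if not grader_file.is_file():
        # No grader output means no evidence of mastery, so block.
        return False, list(required_kcs)
    scores = read_mastery(grader_file)
    # A KC the grader never scored counts as 0.0, which fails any positive threshold.
    failing = [kc for kc in required_kcs if scores.get(kc, 0.0) < threshold]
    return not failing, failing
```

Nothing the probe says about itself enters `gate`; the only input is the file the separate grader context produced.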
- Engineer describes the problem.
- Main agent recognises the domain, calls femto_start_session.
- Reader subagent (Haiku) writes the pack's required reads with inline citations.
- Probe (MCP server) loops over required KCs until every one hits its turn minimum.
- Grader subagent (Opus, separate context) scores the serialized probe log and writes grader.md with per-turn correctness + per-KC mastery.
- PreToolUse hook reads grader.md, checks per-KC mastery against the pack threshold, and either allows Edit/Write/Bash or blocks with a structured message.
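The probe step in the flow above ("loops over required KCs until every one hits its turn minimum") can be sketched as follows. The round-robin ordering, the `ask` callback, and the log entry shape are assumptions chosen for illustration; the actual MCP server may order questions adaptively and serialize its log differently.

```python
from typing import Callable


def run_probe(turn_minimums: dict, ask: Callable[[str, int], str]) -> list:
    """Ask about each KC until every KC has reached its turn minimum.

    `turn_minimums` maps a KC id to its required number of turns.
    `ask(kc, turn)` is a hypothetical callback that poses one question and
    returns the agent's answer. Returns the probe log as a list of dicts.
    """
    turns_taken = {kc: 0 for kc in turn_minimums}
    log = []
    while True:
        # KCs still below their minimum get another turn this round.
        pending = [kc for kc, need in turn_minimums.items() if turns_taken[kc] < need]
        if not pending:
            break
        for kc in pending:
            turns_taken[kc] += 1
            answer = ask(kc, turns_taken[kc])
            log.append({"kc": kc, "turn": turns_taken[kc], "answer": answer})
    return log
```

The loop only decides when probing may stop. Whether the answers were correct is left to the grader, which scores the returned log in a separate context.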
Session state lives at .femto/session-<id>/. Gitignored by template;
reaped after 30 days by default.
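A retention sweep over the session directories might look like the sketch below. It assumes age is measured by directory modification time and that reaping means deleting the whole directory; `reap_sessions` is a hypothetical name, and femto's real reaper may track age differently.

```python
import shutil
import time
from pathlib import Path
from typing import Optional

DEFAULT_RETENTION_DAYS = 30  # default stated in the text above
SECONDS_PER_DAY = 86400


def reap_sessions(
    femto_dir: Path,
    retention_days: int = DEFAULT_RETENTION_DAYS,
    now: Optional[float] = None,
) -> list:
    """Delete session-* directories older than the retention window.

    ASSUMPTION: age is taken from the directory's mtime.
    Returns the list of directories that were removed.
    """
    current = time.time() if now is None else now
    cutoff = current - retention_days * SECONDS_PER_DAY
    removed = []
    for session in sorted(femto_dir.glob("session-*")):
        if session.is_dir() and session.stat().st_mtime < cutoff:
            shutil.rmtree(session)
            removed.append(session)
    return removed
```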
.claude-plugin/
  plugin.json            Plugin manifest (userConfig: solo_dev_port, retention, idle timeout)
  marketplace.json       Marketplace listing for /plugin install
.mcp.json                MCP stdio server declaration with ${user_config.*} substitution
agents/
  femto-grader/          Grader subagent (model: opus, tools: Read Write)
  femto-reader/          Reader subagent (model: haiku, tools: Read Grep Glob Write)
skills/
  femto-status/          /femto:status diagnostic
  femto-retry-reads/     /femto:retry-reads recovery
  femto-retry-probe/     /femto:retry-probe recovery
hooks/
  hooks.json             Hook manifest (SessionStart / PreToolUse / SessionEnd)
  session-start.sh       CC-session-id binding bootstrap
  pre-tool-use.sh        The gate (bash + inline python3, 60s fail-closed)
  session-end.sh         Cleanup
packages/
  pack-schema/           Zod schemas + femto-validate-pack CLI
  mcp-server/            6 femto_* tools, session lifecycle, solo-dev HTTP (127.0.0.1 only)
  harness/               Calibration via `claude -p` grader → AUC/FPR/TPR
  site/                  6-page Astro site (public metrics dashboard)
  hook-tests/            End-to-end hook integration tests
packs/
  multi-tenancy/         Reference pack (alpha, row-level-security + tenant-scoping)
templates/
  femto.gitignore        Ship this in adopter repos to gitignore .femto/
docs/
  calibration.md         `claude setup-token` flow + 3 mandatory disclosure conditions
scripts/
  bump-version.sh        Keep plugin.json / marketplace.json / .mcp.json versions in sync
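The "60s fail-closed" note on pre-tool-use.sh describes a gate that blocks whenever it cannot reach a positive verdict in time. The sketch below shows that semantics only, under the assumption that errors, timeouts, and anything other than an explicit True all count as a block. It is not the hook's implementation (which is bash with inline python3), and `fail_closed` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable

GATE_TIMEOUT_SECONDS = 60.0  # figure quoted in the layout above


def fail_closed(check: Callable[[], bool], timeout: float = GATE_TIMEOUT_SECONDS) -> str:
    """Return "allow" only if `check` returns True within `timeout` seconds.

    Any exception, timeout, or non-True result yields "block".
    """
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(check)
        return "allow" if future.result(timeout=timeout) is True else "block"
    except FutureTimeout:
        return "block"
    except Exception:
        return "block"
    finally:
        # Do not wait for a hung check; the worker thread is left to finish on its own.
        executor.shutdown(wait=False)
```

The design choice is that the default outcome is "block": the only path to "allow" is a check that completes and explicitly succeeds.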
Requires: Bun (latest), Claude Code CLI,
and CLAUDE_CODE_OAUTH_TOKEN set in .env for harness calibration runs.
bun install
bun run --filter '*' test # 206 tests: pack-schema 15, mcp-server 110,
# hook-tests 13, harness 56, site 12
bun run --filter '*' typecheck
Site dev server:
cd packages/site && bun run dev
Harness calibration against the reference pack:
bun run packages/harness/src/cli.ts run packs/multi-tenancy
See docs/calibration.md for the token setup and the three mandatory
disclosure conditions.
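For readers unfamiliar with the published numbers, the sketch below shows one standard way to compute FPR, TPR, and AUC from grader scores and ground-truth labels. The input shape (a score per test case plus a should-pass label) and the function names are assumptions for illustration; the harness's own data model and calculation are not described here.

```python
def rates_at_threshold(scores: list, labels: list, threshold: float) -> tuple:
    """Return (fpr, tpr) when a case passes the gate if score >= threshold.

    `labels[i]` is True when case i genuinely should pass.
    FPR is the false-pass rate: should-fail cases that passed anyway.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one should-pass and one should-fail case")
    true_pass = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    false_pass = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    return false_pass / negatives, true_pass / positives


def auc(scores: list, labels: list) -> float:
    """Probability that a random should-pass case outscores a random
    should-fail case, with ties counted as half (pairwise AUC)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one should-pass and one should-fail case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

FPR and TPR depend on the chosen threshold, while AUC summarizes how well the scores separate the two groups across all thresholds.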
Two paths — pick the one that matches where your pack will live.
Shipped packs — PR to packs/*/ in this repo. Operator-curated,
harness-calibrated, reliability published on the site's /metrics page.
OWASP CheatSheetSeries precedent: editor-moderated, PR-driven,
citation-required. Every claim bullet needs an inline link to an
authoritative source; unsourced assertions fail schema validation.
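The citation rule for shipped packs can be approximated by a check that every claim carries at least one inline link. This sketch assumes claims are strings and citations are markdown-style inline links with an http(s) target. It is an illustration in Python, not the Zod schema that @femto/pack-schema uses, and it cannot judge whether a source is authoritative.

```python
import re

# Matches a markdown inline link with an http(s) target: [text](https://...)
# ASSUMPTION: pack citations use this link style.
INLINE_LINK = re.compile(r"\[[^\]]+\]\(https?://[^\s)]+\)")


def unsourced_claims(claims: list) -> list:
    """Return the claims that contain no inline link."""
    return [claim for claim in claims if not INLINE_LINK.search(claim)]


def validate_claims(claims: list) -> None:
    """Raise ValueError listing every unsourced claim, if any."""
    missing = unsourced_claims(claims)
    if missing:
        raise ValueError("unsourced claims: " + "; ".join(missing))
```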
Org-local packs — author in your own repo under <repo>/.femto/packs/*/.
Same schema (@femto/pack-schema validates both), same structural enforcement,
adopter-owned trust boundary. Compliance-grade rules live here. These packs
do not appear on the hosted /metrics page — reliability is the adopter's
responsibility.
Full authoring guide: the Contributing page on the site.
scripts/bump-version.sh 0.2.0 # updates plugin.json + marketplace.json + .mcp.json
git add .claude-plugin/*.json .mcp.json
git commit -m "chore: bump femto to v0.2.0"
git tag v0.2.0
git push origin main v0.2.0
Adopters pick up the new version via /plugin marketplace update femto-marketplace.
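Before tagging, it can help to confirm the three manifests agree. The sketch below is a hypothetical pre-release check, not part of scripts/bump-version.sh. It assumes each file carries a top-level "version" key, which may not match the real manifest layouts.

```python
import json
from pathlib import Path

# Files named in the release steps above.
VERSIONED_FILES = (
    ".claude-plugin/plugin.json",
    ".claude-plugin/marketplace.json",
    ".mcp.json",
)


def read_versions(repo_root: Path) -> dict:
    """Map each manifest path to its top-level "version" value (None if absent).

    ASSUMPTION: the version lives in a top-level "version" key in every file.
    """
    versions = {}
    for relative in VERSIONED_FILES:
        data = json.loads((repo_root / relative).read_text(encoding="utf-8"))
        versions[relative] = data.get("version") if isinstance(data, dict) else None
    return versions


def in_sync(versions: dict) -> bool:
    """True when every file reports the same, non-missing version."""
    values = set(versions.values())
    return len(values) == 1 and None not in values
```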
MIT.