Claude/setup project structure 3 yei t#4
Merged
The attr_list extension is required by the badge-style buttons on
docs/index.md ({ .md-button .md-button--primary } syntax). That syntax
was a silent no-op before the extension was registered.
Closed Disposition StrEnum (ALLOW / DENY / ESCALATE) for the policy output.

Explicit transition table encoded as a dict[Posture, frozenset[Posture]]: INTERACTIVE goes anywhere; AUTONOMOUS and DRY_RUN go to INTERACTIVE or LOCKED; LOCKED only unlocks to INTERACTIVE. Identity transitions are always allowed.

evaluate() is a pure function applying posture rules:

1. Posture allow-list check (None means no constraint).
2. LOCKED and DRY_RUN: only READ permitted; everything else denies.
3. AUTONOMOUS: require_confirmation tools fail closed (DENY).
4. INTERACTIVE: require_confirmation tools ESCALATE.
5. Otherwise: ALLOW.

Posture, Disposition, transition, evaluate added to __all__. Manifest schema's Posture import remains the only cross-module reference; posture.py uses TYPE_CHECKING for Decision and ToolDefinition to avoid an import cycle.

47 unit tests cover every transition cell, every posture × effect combination for evaluate, allow-list rules, and confirmation handling.
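The transition table and the five evaluate() rules described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in posture.py: `ToolRule`, `can_transition`, the string effect values other than "read", and the choice to DENY when the allow-list check fails are all hypothetical, and `(str, Enum)` stands in for StrEnum so the sketch also runs on Python versions older than 3.11.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Posture(str, Enum):  # the commit uses StrEnum; (str, Enum) is a stand-in
    INTERACTIVE = "interactive"
    AUTONOMOUS = "autonomous"
    DRY_RUN = "dry_run"
    LOCKED = "locked"


class Disposition(str, Enum):
    ALLOW = "allow"
    DENY = "deny"
    ESCALATE = "escalate"


# Allowed non-identity targets per posture, as described in the commit message.
TRANSITIONS: dict[Posture, frozenset[Posture]] = {
    Posture.INTERACTIVE: frozenset(Posture),
    Posture.AUTONOMOUS: frozenset({Posture.INTERACTIVE, Posture.LOCKED}),
    Posture.DRY_RUN: frozenset({Posture.INTERACTIVE, Posture.LOCKED}),
    Posture.LOCKED: frozenset({Posture.INTERACTIVE}),
}


def can_transition(current: Posture, target: Posture) -> bool:
    """Identity transitions are always allowed; others consult the table."""
    return current is target or target in TRANSITIONS[current]


@dataclass(frozen=True)
class ToolRule:
    """Hypothetical stand-in for a tool definition plus its classified effect."""

    effect: str  # only "read" is named in the commit; other values are assumed
    require_confirmation: bool = False
    allowed_postures: frozenset[Posture] | None = None  # None: no constraint


def evaluate(rule: ToolRule, posture: Posture) -> Disposition:
    """Pure function: same inputs always give the same disposition."""
    # 1. Posture allow-list check (assumed to DENY on a miss).
    if rule.allowed_postures is not None and posture not in rule.allowed_postures:
        return Disposition.DENY
    # 2. LOCKED and DRY_RUN permit only read effects.
    if posture in (Posture.LOCKED, Posture.DRY_RUN) and rule.effect != "read":
        return Disposition.DENY
    if rule.require_confirmation:
        # 3. AUTONOMOUS fails closed: nobody is present to confirm.
        if posture is Posture.AUTONOMOUS:
            return Disposition.DENY
        # 4. INTERACTIVE hands the decision to the operator.
        if posture is Posture.INTERACTIVE:
            return Disposition.ESCALATE
    # 5. Otherwise allow.
    return Disposition.ALLOW
```

Keeping the table as data rather than branching logic is what makes "every transition cell" testable by enumeration: a test can loop over all posture pairs and compare against the expected set.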
Frozen + slotted + kw-only @dataclass carrying tool, arguments, effects tuple, dominant effect, rationale, posture, disposition, and the require_confirmation flag echoed from the manifest definition.

Three methods make it content-addressable:

- to_canonical_dict() returns a sorted-keys dict with enum values as strings.
- to_canonical_json() emits a compact, sort-keys-true, ensure-ascii-false JSON encoding. Byte-stable across runs, platforms, and argument insertion order.
- content_hash() returns the SHA-256 hex digest of the canonical JSON. Same receipt → same hash. Different receipt → different hash with near-certainty.

No clocks, no randomness, no I/O; wall-clock metadata that needs to live alongside a receipt belongs in the hook layer.

Receipt added to __all__. 21 unit tests plus three hypothesis property tests at 1,000 examples each cover byte-stability, hash determinism, and the SHA-256 identity sha256(to_canonical_json) == content_hash.
run_hook() is the testable core: takes a manifest, a JSON payload (str or bytes), and a posture, returns (Receipt, exit_code). Validates the payload's shape (top-level object with a non-empty string tool field and an optional object-typed arguments field), runs classify -> evaluate -> Receipt, returns the canonical exit code.

main() is the I/O wrapper: reads stdin, calls run_hook, writes the receipt's canonical JSON to stdout, and returns the exit code. On ManifestError or HookError, writes a structured JSON error payload (so the host always parses a JSON object) and a one-line stderr message.

Exit-code contract:

     0  EXIT_ALLOW           disposition is ALLOW
     1  EXIT_DENY            disposition is DENY
     2  EXIT_ESCALATE        disposition is ESCALATE
    64  EXIT_HOOK_ERROR      payload protocol violation
    65  EXIT_MANIFEST_ERROR  tool not declared in manifest

21 unit tests cover the happy paths (allow/deny/escalate per posture combination), every payload error mode (invalid JSON, non-object, missing tool field, empty/non-string tool, non-object arguments, undeclared tool), and end-to-end byte-stability across two runs with identical input.

Hook stays I/O-thin; the pure modules stay pure.
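The payload validation and exit-code mapping described above might look something like the sketch below. The exit-code constants and HookError come from the commit message; `parse_payload`, `exit_code_for`, the error wording, and the choice to default a missing arguments field to an empty dict are assumptions made for illustration. The classify and evaluate steps are left out.

```python
from __future__ import annotations

import json
from typing import Any

EXIT_ALLOW = 0
EXIT_DENY = 1
EXIT_ESCALATE = 2
EXIT_HOOK_ERROR = 64
EXIT_MANIFEST_ERROR = 65

# Keyed by disposition name; the real module presumably keys by the enum.
_EXIT_BY_DISPOSITION = {
    "ALLOW": EXIT_ALLOW,
    "DENY": EXIT_DENY,
    "ESCALATE": EXIT_ESCALATE,
}


class HookError(Exception):
    """Payload protocol violation (maps to EXIT_HOOK_ERROR)."""


def parse_payload(payload: str | bytes) -> tuple[str, dict[str, Any]]:
    """Hypothetical helper: validate the payload shape, return (tool, arguments)."""
    try:
        data = json.loads(payload)
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        raise HookError(f"invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise HookError("payload must be a JSON object")
    tool = data.get("tool")
    if not isinstance(tool, str) or not tool:
        raise HookError("'tool' must be a non-empty string")
    # Assumption: an absent 'arguments' field is treated as an empty object.
    arguments = data.get("arguments", {})
    if not isinstance(arguments, dict):
        raise HookError("'arguments' must be a JSON object when present")
    return tool, arguments


def exit_code_for(disposition: str) -> int:
    """Hypothetical helper: map a disposition name to its exit code."""
    return _EXIT_BY_DISPOSITION[disposition]
```

For example, `parse_payload('{"tool": "read_file"}')` yields the tool name with empty arguments, while `parse_payload("[]")` raises HookError, which an I/O wrapper would translate into exit code 64.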
Three new subcommands round out spine-lite as a usable runtime:

- validate-manifest <path>: parse + validate, exit 0 / 1
- classify <tool> --manifest <path>: emit a JSON Decision
- hook --manifest <path> [--posture POSTURE]: stdin/stdout adapter delegating to spine_lite.hook.main

Annotated-style typer parameters used throughout so B008 stays clean. Path is kept as a top-level import (TC003 ignored for cli.py only) because typer introspects it at runtime for path validation.

Tests in three layers:

1. CliRunner smoke + integration tests for every subcommand and every posture × disposition combination on the hook subcommand
2. Error-path coverage: invalid JSON, undeclared tool, invalid manifest schema, invalid posture name, nonexistent manifest path
3. End-to-end subprocess tests against `python -m spine_lite.cli`, exercising the same code path Claude Code uses when wiring the PreToolUse hook. Five posture × tool combinations + byte-stability across runs + version smoke. Closest equivalent in this sandbox to the blueprint's "install in fresh venv, wire as Claude Code hook" smoke.

S603 ignored for tests/* (subprocess args constructed in-process from fixtures, not user input). 100% coverage on every runtime module including cli.py.
pyproject.toml and __init__.py bumped to 0.3.0a0; smoke test pinned to the new version.

CHANGELOG [0.3.0a0] section enumerates the posture state machine, Disposition, Receipt, hook adapter, full CLI surface, the integration + E2E test layer, and the attr_list mkdocs hygiene. README status grid marks Phase 3 shipped. New docs/history/phase-3.md narrates the build; mkdocs nav extended. RECEIPTS.md gains the Phase 3 exit receipt with the full gate table.

11 of 11 exit-gate items clear in the sandbox; CI verification on push remains operator-side. PyPI publish is the operator decision the blueprint reserves for this gate.
- Disposition + evaluate
- Receipt with deterministic serialization + content hash
- Hook adapter (run_hook, main)