Rectify: Planner Write Isolation — PART A ONLY by Trecek · Pull Request #2823 · TalonT-Org/AutoSkillit

Trecek · 2026-05-24T04:04:55Z

Summary

During a planner-elaborate-wps run, L0 subagents spawned via the native Agent tool modified a git-tracked source file (src/daemon.rs), both by running sed commands through Bash and by making Edit tool calls. All five defensive layers failed. The architectural weakness is that the write restriction system was designed around a single skill (investigate) and was never generalized to the 13 planner skills or the 15+ audit/analysis skills that also must never modify source code.

Part A addresses the core enforcement gap: making write_guard.py cover Bash tool calls that run file-modifying commands, extending planner_result_naming_guard.py to block writes outside the planner output directory, and adding a post-session check that git status is clean. All changes are in the hook layer and the headless execution layer.

Part B will cover the broader skill contract reclassification (adding read_only to planner and audit skills) and L0 prompt hardening.

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/remediation-20260523-180225-774184/.autoskillit/temp/rectify/rectify_planner_write_isolation_2026-05-23_183500_part_a.md

Changed Files

New (★):

tests/arch/test_write_restriction_coverage.py
tests/execution/test_planner_write_isolation.py
tests/planner/test_elaborate_assignments_contract.py
tests/recipe/test_planner_contracts.py
tests/server/test_tools_execution_write_prefix.py

Modified (●):

pyproject.toml
src/autoskillit/core/__init__.pyi
src/autoskillit/core/io.py
src/autoskillit/execution/clone_guard.py
src/autoskillit/execution/headless/__init__.py
src/autoskillit/hook_registry.py
src/autoskillit/hooks/guards/CLAUDE.md
src/autoskillit/hooks/guards/write_guard.py
src/autoskillit/hooks/registry.sha256
src/autoskillit/recipes/full-audit.json
src/autoskillit/recipes/full-audit.yaml
src/autoskillit/recipes/implement-findings.json
src/autoskillit/recipes/implement-findings.yaml
src/autoskillit/recipes/implementation-groups.json
src/autoskillit/recipes/implementation-groups.yaml
src/autoskillit/recipes/implementation.json
src/autoskillit/recipes/implementation.yaml
src/autoskillit/recipes/merge-prs.json
src/autoskillit/recipes/merge-prs.yaml
src/autoskillit/recipes/promote-to-main-wrapper.json
src/autoskillit/recipes/promote-to-main-wrapper.yaml
src/autoskillit/recipes/remediation.json
src/autoskillit/recipes/remediation.yaml
src/autoskillit/recipes/research-design.json
src/autoskillit/recipes/research-design.yaml
src/autoskillit/recipes/research-implement.json
src/autoskillit/recipes/research-implement.yaml
src/autoskillit/recipes/research-review.json
src/autoskillit/recipes/research-review.yaml
src/autoskillit/recipes/research.json
src/autoskillit/recipes/research.yaml
src/autoskillit/server/tools/tools_execution.py
src/autoskillit/skills_extended/planner-analyze/SKILL.md
src/autoskillit/skills_extended/planner-assess-review-approach/SKILL.md
src/autoskillit/skills_extended/planner-consolidate-wps/SKILL.md
src/autoskillit/skills_extended/planner-elaborate-assignments/SKILL.md
src/autoskillit/skills_extended/planner-elaborate-phase/SKILL.md
src/autoskillit/skills_extended/planner-elaborate-wps/SKILL.md
src/autoskillit/skills_extended/planner-extract-domain/SKILL.md
src/autoskillit/skills_extended/planner-generate-phases/SKILL.md
src/autoskillit/skills_extended/planner-reconcile-deps/SKILL.md
src/autoskillit/skills_extended/planner-refine-assignments/SKILL.md
src/autoskillit/skills_extended/planner-refine-phases/SKILL.md
src/autoskillit/skills_extended/planner-refine-wps/SKILL.md
src/autoskillit/skills_extended/planner-refine/SKILL.md
src/autoskillit/skills_extended/planner-validate-task-alignment/SKILL.md
tests/arch/CLAUDE.md
tests/arch/test_execution_source_split.py
tests/arch/test_layer_enforcement.py
tests/arch/test_subpackage_isolation.py
tests/arch/test_write_restriction_coverage.py
tests/execution/CLAUDE.md
tests/execution/test_clone_guard.py
tests/execution/test_planner_write_isolation.py
tests/execution/test_session_log_fields.py
tests/execution/test_write_evidence.py
tests/fleet/test_dispatch_failure_semantics.py
tests/hooks/test_planner_result_naming_guard.py
tests/hooks/test_write_guard.py
tests/planner/CLAUDE.md
tests/planner/test_elaborate_assignments_contract.py
tests/planner/test_elaborate_wps_contract.py
tests/recipe/CLAUDE.md
tests/recipe/test_planner_contracts.py
tests/server/CLAUDE.md
tests/server/test_tools_execution_write_prefix.py

Closes #2816

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

Step	Model	count	uncached	output	cache_read	peak_ctx	turns	cache_write	time
rectify	claude-sonnet-4-6	1	3.6k	31.8k	4.9M	150.3k	408	234.2k	32m 11s
dry_walkthrough	claude-sonnet-4-6	2	1.7k	32.5k	3.2M	88.9k	339	120.6k	17m 39s
implement	claude-sonnet-4-6	2	6.1k	118.1k	18.3M	175.9k	495	273.1k	42m 10s
prepare_pr*	MiniMax-M2.7	1	100.1k	5.0k	221.0k	29.9k	21	42.8k	3m 1s
compose_pr*	MiniMax-M2.7	1	40.7k	2.8k	278.5k	0	21	0	27s
review_pr	claude-opus-4-6	1	51	20.2k	1.3M	155.2k	96	154.3k	11m 11s
ci_conflict_fix	claude-opus-4-6	1	37	4.8k	801.0k	51.5k	37	36.3k	2m 18s
diagnose_ci*	MiniMax-M2.7	1	53.0k	2.3k	293.5k	0	21	0	35s
resolve_ci	claude-opus-4-6	1	65	9.3k	1.6M	71.5k	53	57.6k	10m 49s
Total			205.5k	226.9k	30.8M	175.9k		918.9k	2h 0m

* Step used a non-Anthropic provider; caching behavior may differ.

Token Efficiency

Step	LoC Changed	cache_read/LoC	cache_write/LoC	output/LoC
rectify	0	—	—	—
dry_walkthrough	0	—	—	—
implement	1202	15230.9	227.2	98.3
prepare_pr	0	—	—	—
compose_pr	0	—	—	—
review_pr	0	—	—	—
ci_conflict_fix	891	899.0	40.8	5.4
diagnose_ci	0	—	—	—
resolve_ci	35	45094.1	1646.4	266.7
Total	2128	14487.6	431.8	106.6
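The per-LoC columns above read as each step's token count divided by its lines of code changed. A minimal Python sketch of that arithmetic follows. The names parse_count and per_loc are hypothetical helpers for illustration, not functions from AutoSkillit, and because the Token Usage Summary shows rounded counts, the results only approximate the ratios in the table.

```python
from typing import Optional

# Hypothetical helpers for illustration; not part of AutoSkillit.
_SUFFIXES = {"k": 1_000, "M": 1_000_000}


def parse_count(text: str) -> float:
    """Convert a display value such as '18.3M' or '51' to a number."""
    if text and text[-1] in _SUFFIXES:
        return float(text[:-1]) * _SUFFIXES[text[-1]]
    return float(text)


def per_loc(count: str, loc_changed: int) -> Optional[float]:
    """Tokens per changed line, or None when no lines changed."""
    if loc_changed == 0:
        return None  # the table renders this case as a dash
    return parse_count(count) / loc_changed


# implement step: output tokens and cache reads over 1202 changed lines
print(round(per_loc("118.1k", 1202), 1))
print(round(per_loc("18.3M", 1202), 1))
```

The second value lands close to, but not exactly on, the listed cache_read/LoC figure, which is consistent with the table having been computed from unrounded counts.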

Model Usage Breakdown

Model	steps	uncached	output	cache_read	cache_write	time
claude-sonnet-4-6	3	11.4k	182.5k	26.4M	627.8k	1h 32m
MiniMax-M2.7	3	193.9k	10.0k	793.0k	42.8k	4m 2s
claude-opus-4-6	3	153	34.3k	3.7M	248.3k	24m 19s

- Add Bash to write_guard.py tool_name check (Write|Edit|Bash)
- Add _extract_bash_write_targets() to parse sed -i, echo >, tee, mv, cp, patch, git checkout --, rm/unlink commands and extract target paths
- Fail-open when write command detected but path cannot be extracted
- Update HOOK_REGISTRY matcher from Write|Edit to Write|Edit|Bash
- Regenerate registry.sha256 via task sync-hooks-hash
- Update guards/CLAUDE.md description for write_guard.py
- Add TestWriteGuardBashBypass with 6 tests covering Bash interception

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
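To illustrate the kind of parsing the commit message above describes, here is a minimal sketch of extracting write targets from a Bash command string. It is not the code in write_guard.py: the function names, the command sets, and the (targets, unparseable) return shape are assumptions made for this example, and it covers only a subset of the listed commands (no patch or git checkout --). It tokenizes with the standard-library shlex module, so a quoted ">" argument would be misread as a redirection.

```python
import shlex

# Hypothetical subset of the commands listed in the commit message.
ALL_ARGS_ARE_TARGETS = {"tee", "rm", "unlink", "mv"}
LAST_ARG_IS_TARGET = {"cp"}
SEPARATORS = {"|", "||", "&&", ";"}


def _segment_targets(words):
    """Return (targets, unparseable) for one simple command."""
    targets, unparseable, plain = [], False, []
    i = 0
    while i < len(words):
        tok = words[i]
        if tok.startswith(">"):
            # Redirection: ">file", ">> file", or "> file".
            path = tok.lstrip(">")
            if not path and i + 1 < len(words):
                i += 1
                path = words[i]
            if not path:
                unparseable = True
            elif not path.startswith("&"):  # ">&2" duplicates a descriptor
                targets.append(path)
        else:
            plain.append(tok)
        i += 1
    if not plain:
        return targets, unparseable
    name = plain[0]
    args = [a for a in plain[1:] if not a.startswith("-")]
    if name == "sed" and any(a.startswith("-i") for a in plain[1:]):
        found = args[1:]  # first positional is the sed script
    elif name in ALL_ARGS_ARE_TARGETS:
        found = args
    elif name in LAST_ARG_IS_TARGET:
        found = args[-1:] if len(args) >= 2 else []
    else:
        return targets, unparseable
    if not found:
        unparseable = True  # write command seen, but no path recovered
    return targets + found, unparseable


def extract_bash_write_targets(command):
    """Return (targets, unparseable) for a whole command line."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return [], True  # e.g. unbalanced quotes
    targets, unparseable, segment = [], False, []
    for tok in tokens + [";"]:  # trailing separator flushes the last segment
        if tok in SEPARATORS:
            found, bad = _segment_targets(segment)
            targets += found
            unparseable = unparseable or bad
            segment = []
        else:
            segment.append(tok)
    return targets, unparseable
```

Under the fail-open rule from the commit message, a caller would block only when an extracted target falls outside the allowed prefix, and would let the command through when the unparseable flag is set but no target was recovered.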

…ement

Tests verify write_guard.py blocks source writes and allows output-dir writes when AUTOSKILLIT_ALLOWED_WRITE_PREFIX is set for a planner session.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…sions

- Add _has_write_scope = bool(write_watch_dirs) to headless/__init__.py
- Snapshot now also taken when write_watch_dirs is non-empty (any session with output_dir gets a pre-session clone snapshot)
- Compute _effective_readonly = _readonly_skill or _has_write_scope to enable contamination detection on successful planner sessions
- Compute _exclude_prefix from write_watch_dirs[0] relative to cwd
- Pass readonly_skill=_effective_readonly and exclude_prefix=_exclude_prefix to check_and_revert_clone_contamination
- Add exclude_prefix parameter to revert_contamination and check_and_revert_clone_contamination (default .autoskillit/)
- Use exclude_prefix in git clean --exclude= for selective revert
- Add tests: write-scoped success contamination, custom exclude_prefix, no-snapshot passthrough

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
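The write-scope computation in the commit message above can be sketched as follows. This is an illustrative reconstruction, not the code in headless/__init__.py: compute_write_scope and git_clean_args are hypothetical names, falling back to the default prefix when the watch dir lies outside cwd is an assumption, and the -fd flags on git clean are assumed (only --exclude= is named in the source).

```python
from pathlib import PurePosixPath

DEFAULT_EXCLUDE_PREFIX = ".autoskillit/"  # default named in the commit message


def compute_write_scope(readonly_skill, write_watch_dirs, cwd):
    """Return (effective_readonly, exclude_prefix) for a session."""
    has_write_scope = bool(write_watch_dirs)
    # A session with a declared output dir is treated like a read-only
    # skill for the purpose of contamination detection.
    effective_readonly = readonly_skill or has_write_scope
    exclude_prefix = DEFAULT_EXCLUDE_PREFIX
    if has_write_scope:
        try:
            rel = PurePosixPath(write_watch_dirs[0]).relative_to(PurePosixPath(cwd))
        except ValueError:
            rel = None  # watch dir is outside cwd: keep the default (assumption)
        if rel is not None and rel.parts:
            exclude_prefix = rel.as_posix() + "/"
    return effective_readonly, exclude_prefix


def git_clean_args(exclude_prefix):
    """Build a selective-revert command that spares the output directory."""
    # -fd is an assumption; the source only names the --exclude= option.
    return ["git", "clean", "-fd", f"--exclude={exclude_prefix}"]
```

For example, compute_write_scope(False, ["/repo/.autoskillit/temp/planner"], "/repo") yields a read-only session whose revert step leaves .autoskillit/temp/planner/ untouched.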

allowed_write_prefix is now set from write_watch_dirs (i.e., output_dir) regardless of is_read_only. This means planner skills that pass output_dir get AUTOSKILLIT_ALLOWED_WRITE_PREFIX injected into the subprocess env, enabling write_guard.py to enforce the prefix boundary during the session. The is_read_only flag still controls the fallback prefix when no output_dir is given (read-only skills without an explicit output_dir still use the skill temp dir as the prefix).

- Add tests/server/test_tools_execution_write_prefix.py with two tests
- Update tests/server/CLAUDE.md table with new test file

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
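A prefix boundary like the one described above could be checked along these lines. This is a sketch under assumptions, not the logic in write_guard.py: is_write_allowed is a hypothetical name, allowing every write when the variable is unset is an assumption, and the check normalizes paths lexically without resolving symlinks. Only the environment variable name AUTOSKILLIT_ALLOWED_WRITE_PREFIX comes from the source.

```python
import os
import posixpath


def is_write_allowed(target, cwd, env=None):
    """Return True when target falls under the allowed write prefix."""
    env = os.environ if env is None else env
    prefix = env.get("AUTOSKILLIT_ALLOWED_WRITE_PREFIX", "")
    if not prefix:
        return True  # assumption: no prefix means the guard does not restrict
    # Normalize both paths so "../" segments cannot escape the prefix.
    resolved = posixpath.normpath(posixpath.join(cwd, target))
    allowed = posixpath.normpath(posixpath.join(cwd, prefix))
    # Compare on a path-component boundary, so "planner-evil" does not
    # match the prefix "planner".
    return resolved == allowed or resolved.startswith(allowed + "/")
```

Comparing against allowed + "/" rather than the bare prefix string is the design choice worth noting: a plain startswith would accept sibling directories that merely share a name prefix.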

tests/execution/test_planner_write_isolation.py validates the full end-to-end flow using a real git repo: write_watch_dirs → _has_write_scope → clone snapshot → source file modified → contamination detected → selective revert → audit log records clone_contamination subtype

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Extend write_guard.py matcher to cover run_cmd MCP tool and read both 'command' and 'cmd' keys (command guard completeness enforcement)
- Bump headless/__init__.py line budget from 1005 to 1015 to accommodate clone guard write-scope extension
- Add test_planner_write_isolation.py to layer allowlist for pipeline import
- Fix write evidence test mock runners to not write during clone guard snapshot calls (git rev-parse), preserving pre/post comparison accuracy

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… tests

- tests/arch/test_write_restriction_coverage.py: architectural invariant that skills with NEVER blocks prohibiting source modification have runtime enforcement (read_only, output_dir in all recipe invocations, or allowlist entry)
- tests/recipe/test_planner_contracts.py: contract coherence tests verifying every planner run_skill step with write_behavior=always declares output_dir, and all output_dirs are rooted under context.planner_dir
- tests/planner/test_elaborate_assignments_contract.py: new test file mirroring test_elaborate_wps_contract.py for planner-elaborate-assignments
- tests/planner/test_elaborate_wps_contract.py: add L0 prompt write prohibition and JSON-only framing tests to TestSkillMdPresence class

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…allback

- core/io.py: add resolve_skill_temp_dir(), which computes the default write-watch dir from the skill name; exported in __all__ and __init__.pyi
- execution/headless/__init__.py: replace local _resolve_skill_temp_dir definition with import alias from core.io; remove now-unused extract_skill_name import
- server/tools/tools_execution.py: add fallback before allowed_write_prefix computation; when output_dir is absent, populate write_watch_dirs from the default skill temp dir so ad-hoc invocations get a restricted prefix
- tests/arch/test_subpackage_isolation.py: update headless/__init__.py exemption rationale to remove _resolve_skill_temp_dir reference
- pyproject.toml: add per-file-ignores for pre-existing E501 violations

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ipe steps

- remediation.yaml: add output_dir to rectify and dry_walkthrough run_skill steps so the session's allowed_write_prefix is scoped to .autoskillit/temp/ for each skill
- full-audit.yaml: add output_dir= to all run_skill() calls in run_audits and validate_audits note blocks; audit skills scope to their own temp dir, validate skills scope to audit_run_dir where they actually write
- regenerate compiled recipe JSON files (task compile-recipes)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…nner skills

Step 4: Add role-framing component 0 (READ-ONLY analysis agent write prohibition) to L0 prompt templates in planner-elaborate-wps and planner-elaborate-assignments SKILL.md files. L0s spawned by these skills now receive an explicit write prohibition as the first prompt component.

Step 5: Add write prohibition bullet to NEVER blocks of all 14 planner SKILL.md files, standardizing the prose restriction across all planner skills and making it explicit about Bash file-modifying commands.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… write prefix tests

Step 2 completion: Add output_dir to all recipe invocations of audit-*, review-*, and research skill steps across implementation, merge-prs, remediation, research, research-implement, research-review, research-design, implement-findings, and promote-to-main-wrapper recipes. Adds build-execution-map to UNRESTRICTED_WRITE_SKILLS allowlist (bem-wrapper.yaml step has no CWD anchor for output_dir).

Update test_write_restriction_coverage.py: add build-execution-map to allowlist, shorten comment for line length compliance.

Update test_tools_execution_write_prefix.py: rename test to reflect that ad-hoc invocations now use the _resolve_skill_temp_dir fallback as their allowed_write_prefix rather than an empty string.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…io submodule

Fixes REQ-IMP-001 and REQ-IMP-002 violations in execution/headless/__init__.py and server/tools/tools_execution.py by importing through the core package re-export surface instead of the core.io submodule directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…e recipe YAML parsing

The test_l0_prompt_template_contains_write_prohibition test expects explicit READ-ONLY agent / "Do NOT use Write" text in the SKILL.md. Added the expected phrases to the agent definition bullet point.

The test_never_modify_source_skills_have_write_prefix test timed out in CI because _recipe_invocations_have_output_dir re-parsed all 14 recipe YAML files (some 70KB+) for each skill. Refactored to parse once via _load_parsed_recipes().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
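The parse-once refactor described above is a standard memoization pattern. The sketch below shows one way to do it with functools.lru_cache. It is not the test's actual code: it reads JSON instead of YAML to stay within the standard library, and the recipe schema (a "steps" list whose entries have "skill" and "output_dir" keys) is a hypothetical stand-in for the real format.

```python
import json
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_parsed_recipes(recipe_dir: str) -> dict:
    """Parse every recipe file once; later calls reuse the cached result."""
    return {
        path.name: json.loads(path.read_text())
        for path in sorted(Path(recipe_dir).glob("*.json"))
    }


def recipe_invocations_have_output_dir(skill: str, recipe_dir: str) -> bool:
    """True when every step that invokes `skill` declares an output_dir."""
    for recipe in load_parsed_recipes(recipe_dir).values():
        for step in recipe.get("steps", []):  # hypothetical schema
            if step.get("skill") == skill and "output_dir" not in step:
                return False
    return True
```

With the cache in place, checking N skills costs one parse of the recipe directory instead of N. The cached dict is shared between callers, so it should be treated as read-only.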

#3045)

Add CLONE_COMMIT_SKILLS set (resolve-failures, resolve-review, resolve-merge-conflicts) and skip clone contamination revert on success for these skills. Fixes phantom fix bug where _effective_readonly=True (from _has_write_scope) caused check_and_revert_clone_contamination to destroy valid commits after successful resolve_ci sessions. Does not change _effective_readonly semantics; planner/audit write isolation from PR #2823 is preserved. The structural decoupling of fire-on-success vs selective-revert in the clone guard remains as follow-up work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
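The skip rule in the commit message above amounts to a small decision function. The sketch below is an assumption about its shape, not the clone guard's real code: should_revert_contamination is a hypothetical name, and treating a session that is not effectively read-only as never reverted is a simplification made for this example. The three skill names are taken from the commit message.

```python
CLONE_COMMIT_SKILLS = frozenset(
    {"resolve-failures", "resolve-review", "resolve-merge-conflicts"}
)


def should_revert_contamination(skill_name, success, effective_readonly):
    """Decide whether detected clone changes should be reverted."""
    if not effective_readonly:
        return False  # simplification: only read-only sessions are reverted
    if success and skill_name in CLONE_COMMIT_SKILLS:
        # These skills are expected to commit in the clone, so a successful
        # run must keep its commits.
        return False
    return True
```

A planner skill with a write scope still returns True here, which matches the statement that planner/audit write isolation is preserved.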

Trecek and others added 12 commits May 23, 2026 21:17

test: add TestPlannerWriteScopeGuard for planner session write enforc…

3157710

…ement Tests verify write_guard.py blocks source writes and allows output-dir writes when AUTOSKILLIT_ALLOWED_WRITE_PREFIX is set for a planner session. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Trecek force-pushed the planner-l0-subagents-write-to-git-tracked-source-files-no-ho/2816 branch from cbc598b to 301f1be Compare May 24, 2026 04:18

Trecek added this pull request to the merge queue May 24, 2026

Merged via the queue into develop with commit 20fa083 May 24, 2026
3 checks passed

Trecek deleted the planner-l0-subagents-write-to-git-tracked-source-files-no-ho/2816 branch May 24, 2026 04:43

Trecek mentioned this pull request May 26, 2026

Clone Guard Reverts resolve-failures Commits in resolve_ci Step #3045

Open

This was referenced May 26, 2026

Rectify: Clone Guard Semantic Conflation — Session Permission Model #3049

Merged

Clone guard fires false-positive clone_contamination on planner sessions due to external HEAD advancement #3067

Open

Trecek merged 13 commits into develop from planner-l0-subagents-write-to-git-tracked-source-files-no-ho/2816
