feat: ZO e2e validated — wrapper fix, README, MNIST project complete#7
Merged
- --cwd is not a valid claude CLI flag → use --add-dir for delivery repo access
- --teammate-mode tmux doesn't exist → removed (teams created internally via TeamCreate)
- Added --dangerously-skip-permissions for non-interactive agent execution
- Updated target file to absolute path for worktree compatibility
- MNIST project initialized with memory scaffold

First successful live run: ZO executed Phase 1 of the MNIST project. The agent team produced data_loader.py, 32 passing tests, an EDA report, and a data quality report in the delivery repo. Zero ZO artifacts leaked.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
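The flag changes above can be sketched as a small command builder. This is a minimal illustration and not the actual ZO wrapper: the function name `build_claude_command`, its parameters, and the use of `-p` to pass the prompt are assumptions. Only `--add-dir`, `--dangerously-skip-permissions`, and the move to absolute paths come from the commit message.

```python
from pathlib import Path


def build_claude_command(prompt: str, delivery_repo: str) -> list[str]:
    """Assemble an argv list for a non-interactive claude CLI run (hypothetical helper)."""
    # Resolve to an absolute path so the command does not depend on the
    # current working directory (for example, when run from a git worktree).
    repo = Path(delivery_repo).resolve()
    return [
        "claude",
        "-p", prompt,                      # assumption: prompt passed in print mode
        "--add-dir", str(repo),            # replaces the invalid --cwd flag
        "--dangerously-skip-permissions",  # no interactive permission prompts
    ]


if __name__ == "__main__":
    print(build_claude_command("Run Phase 1", "mnist-delivery"))
```

Returning a list rather than a shell string avoids quoting problems when the result is handed to `subprocess.run`.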
First successful end-to-end run of Zero Operators on a real project.

MNIST digit classifier built autonomously across 5 phases:
- Phase 1: Data pipeline, EDA, DataLoaders (32 tests)
- Phase 3: CNN architecture (2 conv + BN + 2 FC), training loop (51 tests)
- Phase 4: Training to 99.00% test accuracy (Tier 1 = 95%), oracle passed
- Phase 5: GradCAM, ablation, significance testing, reproducibility
- 98 tests passing in delivery repo, lint clean

Delivery repo (mnist-delivery/) contains:
- src/model.py, train.py, inference.py, data_loader.py
- models/best_model.pt (trained checkpoint)
- oracle/eval.py + confusion matrix + evaluation report
- xai/gradcam.py + error analysis + saliency/GradCAM plots
- experiments/ablation, significance testing, reproducibility
- 8 test files, pyproject.toml, clean git history (4 commits)
- Zero ZO artifacts: clean delivery

Total cost: ~$11 across all sessions. Session logs preserved in logs/comms/ and logs/wrapper/.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
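The Tier 1 result reported above amounts to comparing measured test accuracy against a threshold. A minimal sketch, assuming accuracy is computed from predicted and true labels and that "passed" means meeting or exceeding the threshold. The names `accuracy` and `passes_tier` are hypothetical and are not taken from oracle/eval.py; only the 95% Tier 1 threshold comes from the commit message.

```python
from typing import Sequence

TIER_1_THRESHOLD = 0.95  # from the commit message: Tier 1 = 95%


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label sequences of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def passes_tier(acc: float, threshold: float = TIER_1_THRESHOLD) -> bool:
    """True when accuracy meets or exceeds the tier threshold (assumed rule)."""
    return acc >= threshold
```

Under this rule, the reported 99.00% test accuracy would clear the Tier 1 bar.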
SamPlvs added a commit that referenced this pull request on Apr 10, 2026
Add automated documentation consistency validation with enforcement hooks.

When the 17th agent (Research Scout) was added, 10+ files had stale counts, version numbers, and model tiers. Root cause: the CLAUDE.md cascade protocol existed as text but had zero enforcement.

Layer 1 - scripts/validate-docs.sh:
- 7 programmatic checks (agent count, names, commands, version, tiers, tests, setup.sh literal)
- Runs in <2s
- Exits non-zero on failure

Layer 2 - Claude Code hooks (.claude/settings.json):
- PreToolUse on git commit: blocks if validation fails
- PostToolUse on Write|Edit: cascade reminders for trigger files
- Stop: checks for uncommitted changes in trigger paths

Layer 3 - Documentation:
- CLAUDE.md cascade protocol replaced with file-to-file mappings
- PR-005 added to PRIORS.md (missing_rule → enforcement)
- commit command updated with validation step

Also fixes:
- Agent count 16→17 across 10 files (Research Scout was undocumented)
- Version 1.0.0→1.0.1 in pyproject.toml and __init__.py
- Test badge 295→298, setup checks 11→10
- Command count 23→24 in STATE.md
- Model Builder and Backend Engineer tiers: "Sonnet/Opus"→"Opus"
- Research Scout added to specs/agents.md roster (#7), phase-in renumbered
- PRD.md: 6 launch→7 launch, 10→11 project delivery agents

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
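One of the Layer 1 checks (agent count) can be sketched as follows. The real check lives in scripts/validate-docs.sh, a shell script whose contents are not shown in this PR; the Python version below, the regex, the assumed "N agents" phrasing, and the function names are all illustrative assumptions.

```python
import re
import sys

# Assumed phrasing: docs state the count as e.g. "17 agents".
AGENT_COUNT_PATTERN = re.compile(r"\b(\d+)\s+agents\b")


def find_stale_counts(docs: dict[str, str], expected: int) -> list[str]:
    """Return one message per stated agent count that differs from `expected`."""
    problems = []
    for name, text in sorted(docs.items()):
        for match in AGENT_COUNT_PATTERN.finditer(text):
            stated = int(match.group(1))
            if stated != expected:
                problems.append(f"{name}: says {stated} agents, expected {expected}")
    return problems


def main(docs: dict[str, str], expected: int) -> int:
    """Report problems on stderr and return an exit code (non-zero on failure)."""
    problems = find_stale_counts(docs, expected)
    for line in problems:
        print(line, file=sys.stderr)
    return 1 if problems else 0
```

Returning a non-zero code on any mismatch is what lets a commit-time hook block on the result, as the Layer 2 description requires.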
5 tasks
SamPlvs added a commit that referenced this pull request on Apr 30, 2026
feat: ZO e2e validated — wrapper fix, README, MNIST project complete
SamPlvs added a commit that referenced this pull request on Apr 30, 2026
Summary
Wrapper Fix (critical)
- --cwd → --add-dir (correct claude CLI flag for delivery repo access)
- --teammate-mode tmux removed (doesn't exist; teams created internally via TeamCreate)
- --dangerously-skip-permissions for non-interactive agent execution

README Rewrite
MNIST End-to-End Validation
ZO autonomously built a complete MNIST digit classifier:
Delivery repo (mnist-delivery/):

Test plan
- zo init mnist-digit-classifier scaffolds correctly
- zo build plans/mnist-digit-classifier.md executes successfully

🤖 Generated with Claude Code