feat: ACHIEVE role — autonomy engineer for zero human intervention#111
…cies

5th role for the unified daemon. Measures autonomy score (0-100) across 4 categories (self-healing, self-directing, self-validating, self-improving), identifies the highest-impact human dependency, and fixes it with production-grade changes.

- docs/prompt/achieve.md: 288-line prompt following blueprint-first patterns (identity, context, 7 rules, 10-step process, anti-slop quality gates)
- docs/autonomy/README.md: score framework documentation
- Wired into unified.md scoring (triggers when autonomy < 70 or needs-human issues)
- Added to prompt guard, role extraction, format-stream markers
- evolve-auto.md gets ACHIEVE context
- CLAUDE.md documents the 5th role
- Max once per 5 sessions to prevent over-introspection

Prompt designed using Anthropic's 12-dimension framework: clarity 10, role 10, data separation 9, output format 10, chain of thought 9, examples (via scorecard), hallucination prevention 9, structure 10, polish 9.
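The trigger condition in the list above can be sketched as a small predicate. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the real rule is expressed in unified.md rather than in Python.

```python
AUTONOMY_THRESHOLD = 70  # from the description: triggers when autonomy < 70


def achieve_is_triggered(autonomy_score: int, needs_human_issues: int) -> bool:
    """Return True when ACHIEVE should be considered during role scoring.

    Both parameter names are hypothetical stand-ins for the signals that
    unified.md reads; this sketch ignores the once-per-5-sessions cap.
    """
    if not 0 <= autonomy_score <= 100:
        raise ValueError("autonomy score must be in the range 0-100")
    # Either a low score or any open needs-human issue is enough.
    return autonomy_score < AUTONOMY_THRESHOLD or needs_human_issues > 0
```

For example, a score of 85 with two open needs-human issues would still trigger the role, while a score of exactly 70 with no such issues would not.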
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2a5f67c9a3
**Hard constraints:**
- STRATEGIZE max once per 10 sessions (cap prevents hiding in strategy mode)
- ACHIEVE max once per 5 sessions (autonomy work is high-value but infrequent)
Make ACHIEVE cooldown computable before enforcing it
The new hard constraint "ACHIEVE max once per 5 sessions" is not operationalized by the signal set in PHASE 1, so the agent has no explicit sessions_since_achieve value with which to apply this cap during scoring. In low-score states (e.g., no autonomy report + eval gate active), ACHIEVE can be selected repeatedly across consecutive sessions, starving BUILD/REVIEW despite the intended cooldown. Add a tracked signal (from docs/sessions/index.md) and apply it in the scoring/constraint logic so this limit is enforceable rather than advisory.
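One way the suggested signal could be computed and enforced is sketched below. It assumes the session history has already been parsed out of docs/sessions/index.md into an oldest-first list of role names; that format, the function names, and the reading of "once per 5 sessions" as "at least 4 other sessions between two ACHIEVE runs" are all assumptions, not the repository's actual logic.

```python
from __future__ import annotations

ACHIEVE_COOLDOWN = 5  # from the hard constraint: max once per 5 sessions


def sessions_since(role: str, history: list[str]) -> int | None:
    """Count the sessions that ran after the most recent `role` session.

    `history` is an oldest-first list of role names, one per session
    (hypothetical format). Returns None if `role` never ran.
    """
    for age, past_role in enumerate(reversed(history)):
        if past_role == role:
            return age
    return None


def achieve_allowed(history: list[str], cooldown: int = ACHIEVE_COOLDOWN) -> bool:
    """Return True if picking ACHIEVE next keeps it to once per `cooldown` sessions."""
    since = sessions_since("ACHIEVE", history)
    # Never ran, or enough other sessions have passed since the last run.
    return since is None or since >= cooldown - 1
```

With this check applied as a hard filter before scoring, a history ending in ACHIEVE, BUILD, REVIEW, BUILD would exclude ACHIEVE, and one more non-ACHIEVE session would make it eligible again.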
…one)

Queue before: 72 pending + 9 wontfix-in-active-dir
Queue after: 65 pending + 0 wontfix (all converted to done for archiving)

Merged into primary tasks (5 closures):
- #175 -> #174: both add tests to TestAuthFailureDetection, same PR
- #163 -> #162: both are scoring module tests from PR #158 review, same PR
- #124 -> #122: both validate doc snapshot consistency, same PR scope
- #196 -> #173: both add entries to PROMPT_GUARD_FILES in lib-agent.sh
- #180 -> #179: both touch _is_valid_eval_file() in pick-role.py, same PR

Closed as obsolete (1):
- #78: references non-existent "evolve.md Step 8" and the multi-agent review panel replaced by unified review in PR #107

Closed as low-value (1):
- #230: _DELEGATION_ROLE_MAP covers all 8 current agent types; new agent types require major framework work making the map update obvious

Converted wontfix -> done for archiving (9):
- #77, #80, #107, #111, #115, #119, #127, #129, #134

All had wontfix status with rationale already documented; changed to done so daemon's archive_done_tasks() housekeeping removes them
Summary
5th role for the unified daemon: ACHIEVE — measures autonomy score (0-100), identifies human dependencies, eliminates the highest-impact one each session.
Autonomy Score Framework (20 checks, 5 points each)
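The arithmetic behind the heading (20 checks at 5 points each, for a maximum of 100) can be sketched as follows. The four category names come from the PR description; how the 20 checks are split among them is not stated here, so this sketch only validates the total, and the function name and input shape are hypothetical.

```python
POINTS_PER_CHECK = 5
TOTAL_CHECKS = 20
CATEGORIES = ("self-healing", "self-directing", "self-validating", "self-improving")


def autonomy_score(results: dict[str, list[bool]]) -> int:
    """Sum 5 points for every passing check, giving a score from 0 to 100.

    `results` maps a category name to the pass/fail outcome of each of its
    checks (hypothetical input shape).
    """
    unknown = set(results) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    total = sum(len(checks) for checks in results.values())
    if total != TOTAL_CHECKS:
        raise ValueError(f"expected {TOTAL_CHECKS} checks, got {total}")
    # True counts as 1, so summing the booleans counts the passing checks.
    passed = sum(sum(checks) for checks in results.values())
    return passed * POINTS_PER_CHECK
```

Under these assumptions, 14 passing checks give a score of 70, which sits exactly at the trigger threshold described for unified.md scoring.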
Anti-slop quality gates
Every fix must pass: Linus Test, New Hire Test, 3 AM Test, Pride Test
Files
- docs/prompt/achieve.md: 288-line prompt (blueprint-first pattern)
- docs/autonomy/README.md: score framework docs

Test plan
- make check passes