docs: rewrite ROLE-SCORING.md for v2 brain-delegates-to-sub-agents architecture #197
…ents architecture Remove the v1 PHASE 1/2/3 manual scoring process that was misleading in v2. The document now accurately describes that pick-role.py computes an advisory recommendation automatically, and the brain reads it as one input among several. All 8 roles (build, review, oversee, strategize, achieve, security-check, evolve, audit) are documented with their scoring rules sourced directly from pick-role.py. Examples updated to show the v2 brain checkpoint analysis flow. Removed all references to EXECUTING ROLE: and manual SKILL.md reading instructions. Closes task #203.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8a22476098
```
BUILD: 80 (50 +30 eval healthy)
REVIEW: 60 (10 +40 consecutive >= 5 +10 review >= 5)
OVERSEE: 100 (5 +60 pending >= 80... wait: 62 < 80, so +45 pending >= 50 with so >= 3 = 50)
```
Fix Example 2 advisory math to match pick-role.py
This example claims OVERSEE: 100 and an oversee recommendation for pending_tasks: 62, but compute_scores() in .recursive/engine/pick-role.py gives oversee=50 for that range and build=80, so pick_role() would choose build. Because this file is presented as the scoring reference, this mismatch will mislead anyone validating daemon role selection against the documented examples.
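The arithmetic behind this comment can be checked with a minimal sketch. It models only the rules quoted in the reviewed lines (BUILD base 50 with +30 when eval is healthy; OVERSEE base 5 with +60 at pending >= 80, or +45 at pending >= 50 with so >= 3). The function names, the treatment of the two pending tiers as mutually exclusive, and the value `sessions_since_oversee=3` are assumptions for illustration; the real `compute_scores()` has further terms that are not shown here.

```python
# Partial sketch, NOT the real compute_scores(): only rules quoted in the
# review are modeled, and the helper names below are hypothetical.

def build_score(eval_healthy: bool) -> int:
    """BUILD: base 50, +30 when eval is healthy (as quoted in the example)."""
    return 50 + (30 if eval_healthy else 0)


def oversee_score(pending_tasks: int, sessions_since_oversee: int) -> int:
    """OVERSEE: base 5 plus one pending-task tier (tiers assumed exclusive)."""
    score = 5
    if pending_tasks >= 80:
        score += 60
    elif pending_tasks >= 50 and sessions_since_oversee >= 3:
        score += 45
    return score


# Example 2 as originally written: pending_tasks=62, assuming so >= 3.
scores = {
    "build": build_score(eval_healthy=True),
    "oversee": oversee_score(pending_tasks=62, sessions_since_oversee=3),
}
winner = max(scores, key=lambda role: scores[role])
print(scores, winner)
```

Under these assumptions OVERSEE reaches 50, not 100, so BUILD at 80 has the higher score.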
STRATEGIZE: 5 (5 base, strategy < 10)
ACHIEVE: 5 (5 base)
SECURITY: 5 (5 base)
EVOLVE: 80 (5 +50 friction >=5... wait: 4 < 5 so +0; +30 friction >= 3 AND se >= 5 = 35)
Correct Example 4 impossible evolve tie scenario
The line states EVOLVE: 80 and a tie with BUILD, but with the shown signals (friction_entries=4, sessions_since_evolve=8) the EVOLVE formula in compute_scores() only reaches 35, so this advisory path cannot occur as written. Keeping an impossible selector outcome in the v2 reference can cause false incident/debugging conclusions when operators compare real advisory output to this doc.
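A short sketch of the EVOLVE terms quoted above makes the gap visible. It assumes the two bonuses are additive (the follow-up fix computes 5+50+30) and that no other EVOLVE terms apply; `evolve_score` is a hypothetical helper name, not a function in `pick-role.py`.

```python
# Partial sketch of the EVOLVE rule as quoted in the review; the helper
# name is hypothetical and any other terms in compute_scores() are omitted.

def evolve_score(friction_entries: int, sessions_since_evolve: int) -> int:
    score = 5  # base
    if friction_entries >= 5:
        score += 50
    if friction_entries >= 3 and sessions_since_evolve >= 5:
        score += 30
    return score


# Signals shown in the original Example 4.
as_written = evolve_score(friction_entries=4, sessions_since_evolve=8)
# Same example with friction_entries raised to 5.
with_five = evolve_score(friction_entries=5, sessions_since_evolve=8)
print(as_written, with_five)
```

With `friction_entries=4` the score is 35, well short of the 80 needed to tie BUILD; raising it to 5 gives 85.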
- Add "(elif, mutually exclusive)" to ACHIEVE autonomy_score < 90 bonus to clarify it cannot stack with the < 70 bonus (code uses elif)
- Fix Example 2: update signals (pending=85, stale=6, so=4, tracker_moved=true) and JSON score (100->90) to match breakdown math; OVERSEE=90 correctly beats BUILD=80
- Fix Example 3: ACHIEVE score 20->25 (auto=72 triggers elif auto<90: 5+20=25); add sessions_since_achieve to signals to prevent hard-cap ineligibility
- Fix Example 4: friction_entries 4->5 so EVOLVE=85 (5+50+30) beats BUILD=80; add tracker_moved=true to prevent STRATEGIZE from incorrectly winning at 95; fix JSON score (80->85); fix ACHIEVE to -1 (ineligible per hard cap)
- Fix Example 1: OVERSEE 10->5 (so defaults 0, hard cap applies); add sessions_since_achieve to signals to prevent hard-cap ineligibility
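The `elif` point in the first item can be illustrated with a small sketch. Only the base of 5 and the +20 bonus for autonomy_score < 90 come from the text above; the size of the < 70 bonus is not stated, so it is passed in as a placeholder argument. `achieve_score` and `bonus_below_70` are hypothetical names, and the hard-cap eligibility check is left out.

```python
# Sketch of the mutually exclusive ACHIEVE bonuses. The < 70 bonus amount is
# NOT given in this PR, so the caller supplies a placeholder value.

def achieve_score(autonomy_score: int, *, bonus_below_70: int) -> int:
    score = 5  # base
    if autonomy_score < 70:
        score += bonus_below_70
    elif autonomy_score < 90:  # elif: never stacks with the < 70 bonus
        score += 20
    return score


# Example 3: auto=72 takes only the elif branch, giving 5 + 20.
example_3 = achieve_score(72, bonus_below_70=40)  # 40 is an arbitrary placeholder
print(example_3)
```

Because the second branch is an `elif`, a score below 70 receives only the first bonus and never both.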
Summary
- Rewrite .recursive/ops/ROLE-SCORING.md to accurately describe the v2 architecture, where pick-role.py computes an advisory recommendation automatically -- not a manual scoring process for the brain agent
- Document all 8 roles with scoring rules sourced from pick-role.py's compute_scores() function
- Remove all references to EXECUTING ROLE: and direct SKILL.md reading instructions

Changes
- .recursive/ops/ROLE-SCORING.md -- complete rewrite
- .recursive/tasks/0203.md -- status: pending -> done

Test plan
- make check passes (all 882 tests, ruff, mypy, ASCII-clean, install.sh refs)
- No nightshift/ files modified (framework-zone only)
- Documented rules match the pick-role.py scoring algorithm
- No references to EXECUTING ROLE: or manual SKILL.md reading