feat: add sessions_since_eval signal and brain eval cadence rule #244
Conversation
The previous cleanup_worktrees only removed worktrees marked 'prunable' by git, but agent-* directories left by completed or crashed sessions are never prunable because their directories still exist. Added a Pass 1 that enumerates all .claude/worktrees/agent-* entries via `git worktree list --porcelain` and force-removes each, skipping the currently executing worktree as a safety guard; `git worktree prune` runs last to clear any orphaned git metadata. This runs both in housekeeping (start of cycle, catching crash remnants) and in post-session cleanup (end of cycle).
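The two-pass cleanup described above could be sketched roughly as follows. Function and parameter names here are illustrative assumptions, not the actual implementation:

```python
import subprocess
from pathlib import Path


def parse_worktree_paths(porcelain: str) -> list[str]:
    """Extract worktree paths from `git worktree list --porcelain` output."""
    return [
        line.split(" ", 1)[1]
        for line in porcelain.splitlines()
        if line.startswith("worktree ")
    ]


def cleanup_agent_worktrees(repo_root: Path, current: Path) -> None:
    """Sketch (assumed names): force-remove agent-* worktrees, then prune."""
    out = subprocess.run(
        ["git", "worktree", "list", "--porcelain"],
        cwd=repo_root, capture_output=True, text=True, check=True,
    ).stdout
    for raw in parse_worktree_paths(out):
        wt = Path(raw)
        # Pass 1: only agent-* worktrees, never the one we are running in.
        if wt.name.startswith("agent-") and wt.resolve() != current.resolve():
            subprocess.run(
                ["git", "worktree", "remove", "--force", str(wt)],
                cwd=repo_root, check=False,
            )
    # Pass 2: clear any orphaned git metadata left behind.
    subprocess.run(["git", "worktree", "prune"], cwd=repo_root, check=False)
```

Skipping the current worktree before force-removal matters because `git worktree remove --force` would otherwise delete the directory the session itself is executing from.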
- Add `sessions_since_eval()` to signals.py: reads the latest eval file from .recursive/evaluations/, extracts its date/mtime, then counts session index rows after that timestamp. Returns a scalar int.
- Surface the signal prominently in the dashboard Health section next to Eval score, with STALE annotation when >= 5 sessions.
- Add alert in the Alerts section when sessions_since_eval >= 5, pointing the brain to delegate a Phractal eval run.
- Add eval cadence rule to brain.md Delegation Protocol: when eval_staleness alert fires, brain SHOULD include eval run as a delegation.
- Add 8 unit tests in test_signals.py and 6 in test_dashboard.py covering all branches of the new signal and alert logic.
- All 1156 tests pass (make check green).

Closes task #242.
```python
# sessions_since_eval tests (task #0242)
# ---------------------------------------------------------------------------

_VALID_EVAL_CONTENT = (
```
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 312c978493
```python
# Use file mtime for more precise timestamp (YYYY-MM-DD HH:MM)
from datetime import datetime as _dt

eval_ts = _dt.fromtimestamp(latest.stat().st_mtime).strftime("%Y-%m-%d %H:%M")
```
Base eval staleness on report date, not file mtime
Using latest.stat().st_mtime as the primary comparison timestamp makes sessions_since_eval return 0 after a fresh clone/reset, because Git sets file mtimes to checkout time rather than the report’s actual run date. In that common environment, an old eval report (with many newer session rows) appears “up to date,” so the new eval_staleness alert never fires and the cadence rule is bypassed. This regresses the core purpose of task #242 whenever the working tree is recreated or eval files are touched without a new eval run.
Summary
- `sessions_since_eval()` added to `.recursive/engine/signals.py`: reads the latest eval file, extracts its date/mtime, counts session rows after that timestamp. Returns a scalar int signal.
- Dashboard Health section shows the signal with a `[STALE -- rerun recommended]` annotation when >= 5 sessions stale.
- Alert fires when `sessions_since_eval >= 5`, directing the brain to delegate a Phractal eval run.
- Eval cadence rule added to the `.recursive/agents/brain.md` Delegation Protocol: when the `eval_staleness` alert fires, the brain SHOULD include an eval run as one of its delegations.
- Unit tests in `test_signals.py` and `test_dashboard.py` covering all branches.
- All tests pass (`make check` green).

Addresses
Task #242. Fixes the broken build-measure-build feedback loop: the brain now has a visible signal and alert when evals go stale, and a protocol for triggering a rerun.
Test plan
- `make check` passes (1156 tests, all green)
- `Eval staleness:` line in Health section