…code

In full-stack repos, agents gravitate toward React components. This adds a three-part system to detect and correct frontend-heavy exploration:

1. classify_repo_dirs() classifies top-level dirs as frontend/backend using dir name matching with extension-sampling fallback
2. build_backend_escalation() injects a directive naming specific backend dirs when the last N cycles all touched frontend paths
3. New config option backend_forcing_cycle (default 3) controls threshold

Loop 1 item #4 complete. 20/21 components done (95%).

+19 tests (9 classification, 7 escalation, 3 prompt integration). 189 tests passing, make check clean.
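To make the classification step concrete, here is a minimal sketch of name matching with an extension-sampling fallback. It is not the PR's implementation: the name tables, extension tables, sample limit, and the `_sketch` function name are all assumptions chosen for illustration, and `SKIP_DIRS` only stands in for the PR's CLASSIFY_SKIP_DIRS.

```python
from collections import Counter
from itertools import islice
from pathlib import Path
from typing import Dict, List, Optional

# Hypothetical tables: the PR does not show its actual name or extension lists.
FRONTEND_NAMES = {"frontend", "client", "web", "ui"}
BACKEND_NAMES = {"backend", "server", "api", "services"}
FRONTEND_EXTS = {".tsx", ".jsx", ".vue", ".css"}
BACKEND_EXTS = {".py", ".go", ".rs", ".java"}
SKIP_DIRS = {"node_modules"}  # stand-in for the PR's CLASSIFY_SKIP_DIRS
SAMPLE_LIMIT = 50  # assumed cap on files sampled per directory


def sample_kind(directory: Path) -> Optional[str]:
    """Vote on a directory's kind from the extensions of a bounded file sample."""
    votes: Counter = Counter()
    files = (p for p in directory.rglob("*") if p.is_file())
    for path in islice(files, SAMPLE_LIMIT):
        ext = path.suffix.lower()
        if ext in FRONTEND_EXTS:
            votes["frontend"] += 1
        elif ext in BACKEND_EXTS:
            votes["backend"] += 1
    if votes["frontend"] == votes["backend"]:
        return None  # no evidence or a tie: leave the directory unclassified
    return "frontend" if votes["frontend"] > votes["backend"] else "backend"


def classify_repo_dirs_sketch(repo_dir: Path) -> Dict[str, List[str]]:
    """Classify top-level dirs by name first, then by sampled extensions."""
    result: Dict[str, List[str]] = {"frontend": [], "backend": []}
    for entry in sorted(repo_dir.iterdir()):
        # Skip files, hidden dirs, and explicitly excluded dirs.
        if not entry.is_dir() or entry.name.startswith(".") or entry.name in SKIP_DIRS:
            continue
        name = entry.name.lower()
        if name in FRONTEND_NAMES:
            kind: Optional[str] = "frontend"
        elif name in BACKEND_NAMES:
            kind = "backend"
        else:
            kind = sample_kind(entry)  # ambiguous name such as src/
        if kind is not None:
            result[kind].append(entry.name)
    return result
```

Bounding the sample keeps the fallback cheap on large trees, at the cost of occasionally misjudging a directory whose first sampled files are unrepresentative.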
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 00b4696b59
    threshold = int(config.get("backend_forcing_cycle", 3))
    if cycle < threshold:
        return ""
    recent = state["recent_cycle_paths"]
    if len(recent) < threshold:
Clamp backend forcing threshold to stored history
build_backend_escalation() accepts any backend_forcing_cycle, but it gates on len(state["recent_cycle_paths"]) >= threshold while state tracking keeps only the last 4 paths (append_cycle_state() truncates to [-4:]). That means any configured threshold above 4 can never trigger backend forcing, so the new config appears valid but silently disables the feature for those values.
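One way to address this, sketched under assumptions, is to clamp the configured value to the stored history length before comparing. The helper name `effective_threshold`, the `HISTORY_LIMIT` constant, and the lower bound of 1 are hypothetical; only the config key, its default of 3, and the four-entry truncation come from the review comment.

```python
# Matches the [-4:] truncation that the review attributes to append_cycle_state().
HISTORY_LIMIT = 4


def effective_threshold(config: dict) -> int:
    """Clamp backend_forcing_cycle so it can always be satisfied by stored history."""
    requested = int(config.get("backend_forcing_cycle", 3))
    # The lower bound of 1 is an assumption; the PR may prefer to reject such values.
    return max(1, min(requested, HISTORY_LIMIT))
```

Alternatives would be to reject out-of-range values at config load time, or to size the stored history from the threshold, so that a large value is never silently reduced.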
    backend_esc = build_backend_escalation(
        cycle=cycle_number,
        config=config,
        state=state,
        repo_dir=repo_dir,
Classify backend dirs from the active worktree
The cycle loop computes backend escalation using repo_dir, but all accepted changes happen in worktree_dir; after cycle 1 those directory trees can diverge. In that case the escalation block may suggest backend directories that were renamed/removed in the worktree or miss new backend dirs created earlier in the shift, producing inaccurate guidance for subsequent cycles.
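A small sketch of the suggested direction: resolve the directory to classify from the active worktree and fall back to the base checkout only when no worktree exists. The helper name `classification_root` and the fallback rule are assumptions, not code from the PR.

```python
from pathlib import Path
from typing import Optional


def classification_root(repo_dir: Path, worktree_dir: Optional[Path]) -> Path:
    """Prefer the active worktree so classification sees accepted changes."""
    if worktree_dir is not None and worktree_dir.is_dir():
        return worktree_dir
    # Assumed fallback: before a worktree exists, the base checkout is all there is.
    return repo_dir
```

The call site would then pass `classification_root(repo_dir, worktree_dir)` wherever it currently passes `repo_dir` for classification.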
- Move CLASSIFY_SKIP_DIRS to constants.py (no hardcoded data in logic)
- Filter "(none)" from prior paths in build_backend_escalation
- Sort __all__ exports alphabetically
- Export new constants and functions from __init__.py
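The "(none)" filter in the second item can be pictured as follows. This is a sketch that assumes each history entry is a single path string; the constant and function names are hypothetical.

```python
NO_PATH_SENTINEL = "(none)"  # placeholder recorded when a cycle touched no path


def real_paths(recent_cycle_paths):
    """Drop placeholder entries so they count as neither frontend nor backend."""
    return [p for p in recent_cycle_paths if p != NO_PATH_SENTINEL]
```

Filtering before the history-length check means a cycle that touched nothing does not count toward the threshold.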
5-agent audit identified task selection as mesa-optimization: agent optimizes session success over project progress. 5 more agents audited the fix and found 7 issues, all resolved.

Phase 1 - Prompt fixes:
- Remove "smaller in scope" incentive from evolve-auto.md
- Queue order authoritative, handoff advisory
- "Tasks I Did NOT Pick and Why" in every handoff
- Tracker delta required in session reports
- Staleness multiplier (5+ sessions = 2x priority)
- All-integration-tasks edge case handled

Phase 2 - Task queue:
- environment: internal | integration tags
- blocked_reason subtypes: environment, dependency, design
- .next-id for atomic task ID allocation
- archive/ for done tasks (daemon auto-archives)
- Tagged #12, #28, #29 blocked-environment
- Fixed #43 broken frontmatter (duplicate of #40)

Phase 3 - Overseer avoidance detection:
- 6 new checks: stale tasks, cherry-picking, stuck integration, weak blocks, max attempts, skip accountability

Learnings index:
- INDEX.md with categorized one-line summaries (31/31 matched)
- Agent reads index, opens files only when relevant
- Updated CLAUDE.md, evolve.md Step 1 and Step 6l

Meta layer vision (8 new tasks #46-#53):
- #46 Healer - between-session trend observer
- #47 Multi-agent PR review panel
- #48 Human escalation (gh issue create + webhook)
- #49 Self-evaluation loop against real repos
- #50 Prompt self-refinement via strategist
- #51 Cross-session cost intelligence
- #52 Codebase world model (MODULE_MAP.md)
- #53 Agent generates its own tasks across all dimensions

Also: OPERATIONS.md stale task queue + --squash example fixed
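The staleness multiplier from Phase 1 is simple arithmetic. The commit lists it under prompt fixes, so the rule may live in prompt text rather than code; the sketch below only illustrates the stated numbers (5+ sessions = 2x priority). The function name, the session counter, and the convention that a larger number means higher priority are assumptions.

```python
STALE_AFTER_SESSIONS = 5  # from the commit message: 5+ sessions
STALE_MULTIPLIER = 2.0    # from the commit message: 2x priority


def effective_priority(base_priority: float, sessions_waiting: int) -> float:
    """Double a task's priority once it has waited five or more sessions."""
    if sessions_waiting >= STALE_AFTER_SESSIONS:
        return base_priority * STALE_MULTIPLIER
    return base_priority
```

For example, a task with base priority 3 stays at 3 after four sessions and becomes 6 from the fifth session onward.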
Summary
- classify_repo_dirs() classifies top-level dirs as frontend/backend using dir name matching with extension-sampling fallback for ambiguous dirs like src/
- build_backend_escalation() injects a "Backend exploration directive" naming specific backend dirs when the last N cycles all touched frontend-classified paths
- backend_forcing_cycle (default 3) controls the threshold

Test plan
- classify_repo_dirs (frontend/backend by name, mixed repo, ambiguous dirs by extension, hidden dirs skipped, node_modules skipped, empty repo, unclassifiable dirs)
- build_backend_escalation (before/at/after threshold, backend visited, no backend dirs, not enough history, custom threshold, names specific dirs)
- make check passes (mypy strict, ruff, 189 pytest, dry-runs, ASCII, install.sh refs)
- bash scripts/validate-docs.sh passes (doc consistency)
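The escalation behaviour described in the summary, and the cases named in the test plan, can be sketched as below. This is an illustration under assumptions rather than the PR's code: it takes an already-computed classification instead of `repo_dir`, treats each history entry as one path string, and invents the directive wording beyond its "Backend exploration directive" title.

```python
from typing import Dict, List


def build_backend_escalation_sketch(
    cycle: int,
    config: dict,
    state: dict,
    classified: Dict[str, List[str]],  # assumed shape: {"frontend": [...], "backend": [...]}
) -> str:
    """Return a directive when the last N cycles all touched frontend dirs."""
    threshold = int(config.get("backend_forcing_cycle", 3))
    if cycle < threshold:
        return ""  # too early in the shift to judge
    recent = [p for p in state["recent_cycle_paths"] if p != "(none)"]
    if len(recent) < threshold:
        return ""  # not enough history
    backend_dirs = classified.get("backend", [])
    if not backend_dirs:
        return ""  # nothing to point the agent toward
    frontend_dirs = set(classified.get("frontend", []))
    window = recent[-threshold:]
    # Compare on the top-level directory of each recorded path.
    if not all(path.split("/", 1)[0] in frontend_dirs for path in window):
        return ""  # a backend or unclassified dir was visited recently
    return (
        "Backend exploration directive: the last "
        f"{threshold} cycles touched only frontend paths. "
        "Explore one of: " + ", ".join(backend_dirs)
    )
```

Returning an empty string for every non-triggering case lets the caller append the result to the prompt unconditionally.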