Recusive · fazxes · Apr 5, 2026
diff --git a/docs/changelog/v0.0.7.md b/docs/changelog/v0.0.7.md
@@ -34,6 +34,8 @@ Security hardening for running against untrusted target repositories.
 
 - **[feat]** Human escalation channel (`notify_human`). When the daemon hits a situation requiring human attention (circuit breaker tripped, budget limit reached, healer critical pattern), it creates a GitHub issue with the `needs-human` label and optionally fires a webhook. Configurable via `notification_webhook` in `.nightshift.json`. Wired into all 3 looping daemon circuit breakers, builder budget stop, and healer prompt. Fails silently -- never crashes the daemon. New `needs-human` label on repo. (`scripts/lib-agent.sh`: `notify_human()`; `scripts/daemon.sh`, `scripts/daemon-review.sh`, `scripts/daemon-overseer.sh`; `nightshift/types.py`, `nightshift/constants.py`, `nightshift/config.py`; task #0048)
 
+- **[meta]** Generate Work step (Step 6n) in evolve.md. After each session, the agent now scans the system across 7 dimensions (meta pipeline, code quality, repo health, architecture, agent DX, vision progress, security) and creates 1-5 follow-up tasks. Prevents the agent from being a passive task runner -- it actively identifies work. Enforces dimension diversity, duplicate checking, specific acceptance criteria, and a per-session cap of 5 tasks. Moves Meta-Prompt "Priority engine" from 0% to 50%. (task #0053)
+
 ## Fixed
 
 - **[cost]** Codex/OpenAI sessions now produce non-zero cost estimates. Added pricing for gpt-5.4 ($2.50/$15.00 per MTok), gpt-5.4-mini ($0.75/$4.50), and gpt-5.4-nano ($0.20/$1.25) to `MODEL_PRICING`. `parse_session_tokens()` now handles Codex `turn.completed` events (field mapping: `cached_input_tokens` -> cache_read, input adjusted to exclude cached). `record_session()` uses `AGENT_DEFAULT_MODELS` as model fallback when logs lack model identifiers. (`nightshift/constants.py`, `nightshift/costs.py`; task #0039)

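The Codex token mapping described in the `[cost]` entry above can be sketched roughly as follows. This is an illustration for reviewers, not the code in `nightshift/costs.py`: the helper names, the `input_tokens` and `output_tokens` keys, and the caller-supplied cache-read rate are assumptions. Only the gpt-5.4 input/output prices, the `cached_input_tokens` field, and the "input excludes cached" rule come from the changelog.

```python
from dataclasses import dataclass

# USD per million tokens for gpt-5.4, as listed in the changelog entry.
GPT_5_4_INPUT_PER_MTOK = 2.50
GPT_5_4_OUTPUT_PER_MTOK = 15.00


@dataclass
class TokenCounts:
    fresh_input: int  # input tokens that were NOT served from cache
    cache_read: int   # input tokens served from cache
    output: int


def map_codex_usage(usage: dict) -> TokenCounts:
    """Normalize a Codex-style usage dict (hypothetical helper).

    Assumes `input_tokens` includes the cached tokens, so the cached count
    is subtracted to avoid billing those tokens twice.
    """
    cached = usage.get("cached_input_tokens", 0)
    total_input = usage.get("input_tokens", 0)
    return TokenCounts(
        fresh_input=max(total_input - cached, 0),
        cache_read=cached,
        output=usage.get("output_tokens", 0),
    )


def estimate_cost(tokens: TokenCounts, cache_read_per_mtok: float) -> float:
    """Estimate USD cost; the cache-read rate is an assumption passed by the caller."""
    total = (
        tokens.fresh_input * GPT_5_4_INPUT_PER_MTOK
        + tokens.cache_read * cache_read_per_mtok
        + tokens.output * GPT_5_4_OUTPUT_PER_MTOK
    )
    return total / 1_000_000
```

The cache-read price is left as a parameter because the changelog entry does not state one.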
diff --git a/docs/handoffs/0030.md b/docs/handoffs/0030.md
@@ -0,0 +1,74 @@
+# Handoff #0030
+**Date**: 2026-04-05
+**Version**: v0.0.7 in progress
+
+## What I Built
+- **Task #0053** (Agent generates its own tasks across all dimensions): Added Step 6n "Generate Work (ALWAYS)" to `docs/prompt/evolve.md`. After each session, the agent now scans the system across 7 dimensions (meta pipeline, code quality, repo health, architecture, agent DX, vision progress, security) and creates 1-5 new tasks. Includes constraints: max 5 per session, duplicate checking, dimension diversity, specific acceptance criteria required, honest priority. Also updated the Step 12 report template to include a "Generated tasks" section.
+- Files modified: `docs/prompt/evolve.md`
+- Generated tasks this session: #0057 (task queue summary command), #0058 (task file frontmatter validator)
+
+## Decisions Made
+- Placed the new step as Step 6n (subsection of "Update Every Document") rather than a new top-level step to avoid renumbering all subsequent steps and breaking cross-references in CLAUDE.md and daemon scripts
+- Used a table format for the 7 dimensions to make scanning fast for LLMs
+- Set max 5 tasks per session -- enough to be useful, not enough to flood the queue
+- Required spanning at least 2 different dimensions to prevent tunnel vision
+- Priority engine marked at 50% (not 100%) because the current implementation is prompt-based; a true priority engine would also analyze historical patterns programmatically
+
+## Known Issues
+- Task #0012 (Phractal re-validation) still pending -- needs API access
+- v0.0.6 release not yet tagged
+- Codex `.git/` sandbox issue untested
+- OpenAI pricing should be re-verified periodically; rates change
+- Healer has not been tested in a real daemon run yet
+- `notify_human` has not been tested with a live webhook
+- Tasks #0024 and #0036 have malformed YAML frontmatter (missing/broken status field) -- #0058 would catch this
+
+## Current State
+- Loop 1: 100% (22/22)
+- Loop 2: 63% (7/11) -- unchanged
+- Self-Maintaining: 59% -- unchanged
+- Meta-Prompt: 68% (was 61%) -- Priority engine 0% -> 50%
+- Overall: 78% (was 77%)
+- Version: v0.0.7 in progress
+- Test count: 663
+
+## Tracker delta: 77% -> 78% (Meta-Prompt 61% -> 68%)
+
+## Evaluate
+Run evaluation against Phractal for the changes merged this session.
+
+## Tasks I Did NOT Pick and Why
+- #0012: blocked (environment: integration -- needs API access)
+- #0018: low priority, v0.0.6 target
+- #0028: blocked (environment: integration)
+- #0029: blocked (environment: integration)
+- #0031: normal priority, v0.0.7 -- task #0053 was urgent, took precedence
+- #0032: environment: integration -- skipped per rules
+- #0033: normal priority, v0.0.7 -- task #0053 was urgent, took precedence
+- #0036: appears already done (malformed frontmatter shows `## status: done`)
+- #0038: low priority, v0.0.8
+- #0040: normal priority, v0.0.7 -- task #0053 was urgent, took precedence
+- #0041: low priority, v0.0.8
+- #0042: low priority, v0.0.8
+- #0044: low priority, v0.0.8
+- #0045: low priority, v0.0.8
+- #0047: normal priority, v0.0.8
+- #0049: normal priority, v0.0.8
+- #0050: normal priority, v0.0.8
+- #0051: low priority, v0.0.9
+- #0052: normal priority, v0.0.8
+- #0054: normal priority, v0.0.8
+- #0055: low priority, v0.0.8
+- #0056: low priority, v0.0.8
+
+## Next Session Should
+Tasks: #0031, #0033, #0040
+1. **Task #0031** (normal, v0.0.7) -- Task queue vision-alignment check. Prevents consecutive tasks from all targeting the same vision section.
+2. **Task #0033** (normal, v0.0.7) -- Learnings verification. Require agents to quote specific learnings in status reports.
+3. **Task #0040** (normal, v0.0.7) -- Create CONTRIBUTING.md for agent-to-agent collaboration.
+
+## Where to Look
+- `docs/prompt/evolve.md` lines ~283-330 -- new Step 6n "Generate Work"
+- `docs/prompt/evolve.md` lines ~486-489 -- updated report template with "Generated tasks"
+- `docs/tasks/0057.md` -- generated task: queue summary command
+- `docs/tasks/0058.md` -- generated task: frontmatter validator
diff --git a/docs/handoffs/LATEST.md b/docs/handoffs/LATEST.md
@@ -1,37 +1,38 @@
-# Handoff #0029
+# Handoff #0030
 **Date**: 2026-04-05
 **Version**: v0.0.7 in progress
 
 ## What I Built
-- **Task #0048** (Human escalation channel -- gh issue create + optional webhook): Added `notify_human()` function to `scripts/lib-agent.sh` that creates GitHub issues with the `needs-human` label and optionally fires a webhook. Wired into all 3 looping daemon circuit breakers, builder budget stop, and healer prompt (Step 5 escalation for critical patterns). Added `notification_webhook` to `NightshiftConfig`, `DEFAULT_CONFIG`, `config.py`, and `.nightshift.json.example`. Created `needs-human` label on the GitHub repo. Documented in `docs/ops/DAEMON.md`.
-- Files modified: `scripts/lib-agent.sh`, `scripts/daemon.sh`, `scripts/daemon-review.sh`, `scripts/daemon-overseer.sh`, `nightshift/types.py`, `nightshift/constants.py`, `nightshift/config.py`, `nightshift/profiler.py`, `.nightshift.json.example`, `docs/prompt/healer.md`, `docs/ops/DAEMON.md`, `tests/test_nightshift.py`
-- Tests: +4 new, 663 total passing
+- **Task #0053** (Agent generates its own tasks across all dimensions): Added Step 6n "Generate Work (ALWAYS)" to `docs/prompt/evolve.md`. After each session, the agent now scans the system across 7 dimensions (meta pipeline, code quality, repo health, architecture, agent DX, vision progress, security) and creates 1-5 new tasks. Includes constraints: max 5 per session, duplicate checking, dimension diversity, specific acceptance criteria required, honest priority. Also updated the Step 12 report template to include a "Generated tasks" section.
+- Files modified: `docs/prompt/evolve.md`
+- Generated tasks this session: #0057 (task queue summary command), #0058 (task file frontmatter validator)
 
 ## Decisions Made
-- `notify_human` fails silently (all calls wrapped in `|| true`) -- daemon stability is more important than notification delivery
-- GitHub issue title prefixed with `[Nightshift]` for easy filtering
-- Webhook payload uses `{"text": "..."}` format (compatible with Slack, Discord, and most webhook services)
-- Healer only escalates for "concern" health status -- issues fixable by builder tasks should NOT trigger escalation
-- Wired into reviewer and overseer circuit breakers too, not just builder
+- Placed the new step as Step 6n (subsection of "Update Every Document") rather than a new top-level step to avoid renumbering all subsequent steps and breaking cross-references in CLAUDE.md and daemon scripts
+- Used a table format for the 7 dimensions to make scanning fast for LLMs
+- Set max 5 tasks per session -- enough to be useful, not enough to flood the queue
+- Required spanning at least 2 different dimensions to prevent tunnel vision
+- Priority engine marked at 50% (not 100%) because the current implementation is prompt-based; a true priority engine would also analyze historical patterns programmatically
 
 ## Known Issues
 - Task #0012 (Phractal re-validation) still pending -- needs API access
 - v0.0.6 release not yet tagged
 - Codex `.git/` sandbox issue untested
 - OpenAI pricing should be re-verified periodically; rates change
-- Healer has not been tested in a real daemon run yet (first real test will be next daemon cycle)
-- `notify_human` has not been tested with a live webhook -- works silently without one
+- Healer has not been tested in a real daemon run yet
+- `notify_human` has not been tested with a live webhook
+- Tasks #0024 and #0036 have malformed YAML frontmatter (missing/broken status field) -- #0058 would catch this
 
 ## Current State
 - Loop 1: 100% (22/22)
 - Loop 2: 63% (7/11) -- unchanged
-- Self-Maintaining: 59% (was 57%) -- feedback loop 40% -> 60% (notify_human closes the escalation path)
-- Meta-Prompt: 61% -- unchanged
-- Overall: 77% (weighted, unchanged -- Self-Maintaining delta too small to move the rounded total)
+- Self-Maintaining: 59% -- unchanged
+- Meta-Prompt: 68% (was 61%) -- Priority engine 0% -> 50%
+- Overall: 78% (was 77%)
 - Version: v0.0.7 in progress
 - Test count: 663
 
-## Tracker delta: 77% -> 77% (Self-Maintaining 57% -> 59%)
+## Tracker delta: 77% -> 78% (Meta-Prompt 61% -> 68%)
 
 ## Evaluate
 Run evaluation against Phractal for the changes merged this session.
@@ -41,12 +42,12 @@ Run evaluation against Phractal for the changes merged this session.
 - #0018: low priority, v0.0.6 target
 - #0028: blocked (environment: integration)
 - #0029: blocked (environment: integration)
-- #0031: normal priority, v0.0.7 -- skipped because #0048 is urgent
+- #0031: normal priority, v0.0.7 -- task #0053 was urgent, took precedence
 - #0032: environment: integration -- skipped per rules
-- #0033: normal priority -- skipped because #0048 is urgent
-- #0036: pending -- not reviewed this session
+- #0033: normal priority, v0.0.7 -- task #0053 was urgent, took precedence
+- #0036: appears already done (malformed frontmatter shows `## status: done`)
 - #0038: low priority, v0.0.8
-- #0040: normal priority -- skipped because #0048 is urgent
+- #0040: normal priority, v0.0.7 -- task #0053 was urgent, took precedence
 - #0041: low priority, v0.0.8
 - #0042: low priority, v0.0.8
 - #0044: low priority, v0.0.8
@@ -56,22 +57,18 @@ Run evaluation against Phractal for the changes merged this session.
 - #0050: normal priority, v0.0.8
 - #0051: low priority, v0.0.9
 - #0052: normal priority, v0.0.8
-- #0053: urgent v0.0.8 -- next highest priority after #0048
 - #0054: normal priority, v0.0.8
 - #0055: low priority, v0.0.8
 - #0056: low priority, v0.0.8
 
 ## Next Session Should
-Tasks: #0053, #0031, #0033
-1. **Task #0053** (urgent) -- Agent generates its own tasks across all dimensions. Add a "generate work" step to evolve.md.
-2. **Task #0031** (normal, v0.0.7) -- Task queue vision-alignment check. Prevents all tasks targeting the same section.
-3. **Task #0033** (normal) -- whatever the task description says.
+Tasks: #0031, #0033, #0040
+1. **Task #0031** (normal, v0.0.7) -- Task queue vision-alignment check. Prevents consecutive tasks from all targeting the same vision section.
+2. **Task #0033** (normal, v0.0.7) -- Learnings verification. Require agents to quote specific learnings in status reports.
+3. **Task #0040** (normal, v0.0.7) -- Create CONTRIBUTING.md for agent-to-agent collaboration.
 
 ## Where to Look
-- `scripts/lib-agent.sh` lines 502-530 -- `notify_human()` function
-- `scripts/daemon.sh` -- circuit breaker (line ~280) and budget stop (line ~262) call `notify_human`
-- `scripts/daemon-review.sh`, `scripts/daemon-overseer.sh` -- circuit breaker calls
-- `docs/prompt/healer.md` Step 5 -- healer escalation instructions
-- `docs/ops/DAEMON.md` "Human Escalation" section
-- `nightshift/types.py` line 31 -- `notification_webhook` field
-- `nightshift/config.py` -- validation for notification_webhook
+- `docs/prompt/evolve.md` lines ~283-330 -- new Step 6n "Generate Work"
+- `docs/prompt/evolve.md` lines ~486-489 -- updated report template with "Generated tasks"
+- `docs/tasks/0057.md` -- generated task: queue summary command
+- `docs/tasks/0058.md` -- generated task: frontmatter validator
diff --git a/docs/learnings/2026-04-05-generate-work-placement.md b/docs/learnings/2026-04-05-generate-work-placement.md
@@ -0,0 +1,9 @@
+---
+type: optimization
+date: 2026-04-05
+session: 0030
+---
+
+# New evolve.md steps go INSIDE Step 6 as subsections, not as new top-level steps
+
+When adding a new capability to the evolve prompt (like "generate work"), inserting it as a new top-level step (Step 13, Step 14, etc.) forces renumbering all subsequent steps and breaks cross-references in CLAUDE.md, daemon scripts, and the autonomous override prompt. Instead, add it as a subsection under Step 6 (e.g., 6n, 6o). This keeps the step numbering stable while still making the new capability visible and mandatory. The Step 6 section is "Update Every Document" -- any per-session administrative action fits naturally here.
diff --git a/docs/learnings/INDEX.md b/docs/learnings/INDEX.md
@@ -18,6 +18,7 @@ Read this file FIRST. Only open individual learning files when they are relevant
 - [Task selection is mesa-optimization](2026-04-04-task-selection-mesa-optimization.md) — Agent optimizes session success over project progress; queue order is authoritative, handoff is advisory
 - [Merge strategy: --merge never --squash](2026-04-03-merge-never-squash.md) — Always --merge --admin, preserve all commits on main
 - [Turn budget kills good sessions](2026-04-03-turn-budget-kills-sessions.md) — 500 max turns = silent death mid-work; keep context lean
+- [New evolve steps go inside Step 6](2026-04-05-generate-work-placement.md) — Add as subsection (6n, 6o) to avoid renumbering and breaking cross-references
 - [Open PR recovery](2026-04-03-open-pr-recovery.md) — Daemon detects open PRs from crashed sessions and recovers them
 
 ## Code Patterns

diff --git a/docs/prompt/evolve.md b/docs/prompt/evolve.md
@@ -233,6 +233,7 @@ Write `docs/handoffs/NNNN.md` (increment from the last number). Follow the exact
 
 **Required sections in every handoff:**
 - "Tracker delta: XX% -> XX%" (makes project progress visible)
+- "Generated tasks: [list #NNNN titles, or 'none']" (from Step 6n — what work you identified)
 - "Tasks I did NOT pick and why:" (skip accountability — list every pending task you read and chose not to build, with the reason)
 
 ### 6c. Changelog (ALWAYS except docs-only changes)
@@ -286,6 +287,46 @@ Check `docs/ops/OPERATIONS.md` version milestones:
 - If yes: prepare for release (tag, changelog status, new version file)
 - If no: note in the handoff what's still needed
 
+### 6n. Generate Work (ALWAYS)
+
+You are not a task runner. You are the engineer who owns this system. Before ending the session, step back and look at the system from every angle. Create 1-5 new tasks based on what you observe.
+
+**How to scan:**
+1. Read the vision tracker. What sections are furthest behind? What would move the percentage?
+2. Scan the last 3-5 entries in `docs/sessions/index.md`. Any repeating patterns or stuck areas?
+3. Think about friction you hit THIS session. What slowed you down? What was confusing?
+4. Think about the meta layer. Are prompts bloated? Are handoffs useful? Is the task system working?
+5. Scan for TODOs, hacks, or weak spots in any code you touched.
+
+**Dimensions to consider** (create tasks across different ones, not all the same type):
+
+| Dimension | Example questions |
+|---|---|
+| Meta / autonomous pipeline | Daemon reliability? Prompt staleness? Cost trending? Sessions stuck in patterns? |
+| Code quality | Modules too big? Functions untested? Loose types? Dead code? Cryptic errors? |
+| Repo health | CI speed? Dependency freshness? Test coverage drift? Flaky tests? Doc accuracy? |
+| Architecture | Circular deps? Module tangles? Abstractions earning their keep? Config bloat? |
+| Agent DX | CLAUDE.md accurate? Learnings applied? Handoff format effective? Cold-start speed? |
+| Vision progress | Low-hanging tracker items? Blocked items unblockable? Avoided areas? |
+| Security / robustness | Edge cases that crash? Input validation gaps? Auto-merge exploitable? Secrets exposed? |
+
+**Constraints:**
+- **Max 5 tasks per session.** Quality over quantity. Do not flood the queue.
+- **Check for duplicates first.** Scan all pending tasks in `docs/tasks/`. If an existing task already covers your idea, drop the idea or update the existing task instead.
+- **Span multiple dimensions.** If you create 3 tasks, they should not all be "code quality." Spread across at least 2 different dimensions.
+- **Specific acceptance criteria required.** "Improve error handling" is not a task. "Add structured error types to config.py with specific messages for each validation failure" is.
+- **Honest priority.** Not everything is urgent. Most generated tasks are `normal` or `low`.
+- **Use `.next-id`** for task numbering (same as always -- read, use, increment, commit).
+
+**Output in the session:**
+```
+GENERATED TASKS
+===============
+#NNNN: [title] (dimension: [which], priority: [level])
+#NNNN: [title] (dimension: [which], priority: [level])
+...or "No new tasks -- queue already covers what I observed."
+```
+
 ## STEP 7 — PRE-PUSH CHECKLIST
 
 Before touching git, read `docs/ops/PRE-PUSH-CHECKLIST.md` and run through every item. This is mandatory. Answer each item honestly. If anything fails, fix it before proceeding. Output your checklist results:
@@ -442,6 +483,10 @@ Manual test suggestion:
 
 Tracker delta: [XX% -> XX%] (or "no change" if cleanup only)
 
+Generated tasks:
+  - #NNNN: [title] (dimension: [which])
+  ...or "No new tasks"
+
 Tasks I did NOT pick and why:
   - #NNNN: [reason — blocked-environment, blocked-dependency, or explicit justification]
 

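In this diff the Step 6n constraints (a cap of 5 tasks, spread across at least 2 dimensions) are enforced by the prompt alone. A programmatic check might look like the sketch below. `GeneratedTask`, `check_generated_tasks`, and the short dimension keys are hypothetical names. Applying the diversity rule to any batch of 2 or more tasks is one interpretation of the prompt text, not something the diff specifies.

```python
from dataclasses import dataclass

# Short keys for the 7 dimensions in the Step 6n table (naming is an assumption).
DIMENSIONS = {
    "meta", "code-quality", "repo-health", "architecture",
    "agent-dx", "vision", "security",
}
PRIORITIES = {"urgent", "normal", "low"}
MAX_TASKS_PER_SESSION = 5


@dataclass
class GeneratedTask:
    title: str
    dimension: str
    priority: str


def check_generated_tasks(tasks: list) -> list:
    """Return constraint violations; an empty list means the batch is acceptable."""
    problems = []
    if len(tasks) > MAX_TASKS_PER_SESSION:
        problems.append(f"{len(tasks)} tasks exceeds the cap of {MAX_TASKS_PER_SESSION}")
    for task in tasks:
        if task.dimension not in DIMENSIONS:
            problems.append(f"unknown dimension: {task.dimension!r}")
        if task.priority not in PRIORITIES:
            problems.append(f"unknown priority: {task.priority!r}")
    # Interpretation: the diversity rule only applies to batches of 2+ tasks.
    if len(tasks) >= 2 and len({t.dimension for t in tasks}) < 2:
        problems.append("all tasks share one dimension; span at least 2")
    return problems
```

An empty batch passes, which matches the "No new tasks" output the step allows.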
diff --git a/docs/tasks/.next-id b/docs/tasks/.next-id
@@ -1 +1 @@
-57
+59
diff --git a/docs/tasks/0053.md b/docs/tasks/0053.md
@@ -1,9 +1,9 @@
 ---
-status: pending
+status: done
 priority: urgent
 target: v0.0.8
 created: 2026-04-05
-completed:
+completed: 2026-04-05
 ---
 
 # Agent generates its own tasks across all dimensions

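The frontmatter block changed above is the shape that the generated task #0058 (frontmatter validator) would need to check. A minimal sketch of such a check follows, assuming simple `key: value` lines rather than full YAML. The function names are hypothetical. The required keys are taken from the block above, and the status values from those mentioned in this PR's task descriptions.

```python
from typing import Optional

REQUIRED_KEYS = ("status", "priority", "target", "created", "completed")
VALID_STATUSES = {"pending", "in-progress", "blocked", "done", "archived"}


def parse_frontmatter(text: str) -> Optional[dict]:
    """Parse a leading `---` ... `---` block of `key: value` lines.

    Returns None when the block is missing or never closed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return None  # closing delimiter never found


def validate_task(text: str) -> list:
    """Return a list of problems found in a task file's frontmatter."""
    fields = parse_frontmatter(text)
    if fields is None:
        return ["missing or unterminated frontmatter block"]
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in fields]
    status = fields.get("status")
    if status is not None and status not in VALID_STATUSES:
        problems.append(f"invalid status: {status!r}")
    return problems
```

A line such as `## status: done` parses as the key `## status`, so it is reported as a missing `status` key.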
diff --git a/docs/tasks/0057.md b/docs/tasks/0057.md
@@ -0,0 +1,35 @@
+---
+status: pending
+priority: low
+target: v0.0.8
+created: 2026-04-05
+completed:
+---
+
+# Task queue summary command -- `make tasks` or `scripts/list-tasks.sh`
+
+Every builder session starts by scanning `docs/tasks/` to find pending work. This currently requires a custom bash loop or reading files one by one. A dedicated script would save 2-3 minutes per session and reduce the chance of missing a task.
+
+## What to build
+
+A `scripts/list-tasks.sh` script (and/or `make tasks` target) that outputs a formatted table of all pending/blocked/in-progress tasks, sorted by priority then number. Include: task number, status, priority, target version, environment tag, and title.
+
+Example output:
+```
+TASK QUEUE
+==========
+  0053  [pending]     urgent   v0.0.8            Agent generates its own tasks
+  0031  [pending]     normal   v0.0.7            Task queue vision-alignment check
+  0033  [pending]     normal   v0.0.7            Learnings verification
+  0040  [pending]     normal   v0.0.7            Create CONTRIBUTING.md
+  0012  [blocked]     normal   v0.0.4  integr.   Re-validate against Phractal
+  ...
+```
+
+## Acceptance Criteria
+- [ ] `scripts/list-tasks.sh` exists and runs without errors
+- [ ] Output sorted by priority (urgent > normal > low) then by task number
+- [ ] Shows status, priority, target, environment, and title
+- [ ] Skips done/archived tasks
+- [ ] `make tasks` target added to Makefile
+- [ ] Works with zero tasks (empty queue message)
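The filtering and ordering that the acceptance criteria describe can be sketched in Python as below. This is not the proposed `scripts/list-tasks.sh`. The record shape (dicts with `number`, `status`, `priority`, `target`, `title`, and optional `environment`), the function names, and the column widths are all assumptions made for illustration.

```python
PRIORITY_ORDER = {"urgent": 0, "normal": 1, "low": 2}
HIDDEN_STATUSES = {"done", "archived"}


def queue_rows(tasks: list) -> list:
    """Drop done/archived tasks, then sort by priority and task number."""
    visible = [t for t in tasks if t["status"] not in HIDDEN_STATUSES]
    return sorted(
        visible,
        # Unknown priorities sort after the known ones.
        key=lambda t: (PRIORITY_ORDER.get(t["priority"], len(PRIORITY_ORDER)), t["number"]),
    )


def format_queue(tasks: list) -> str:
    """Render the queue as plain text, with a message for an empty queue."""
    lines = ["TASK QUEUE", "=========="]
    rows = queue_rows(tasks)
    if not rows:
        lines.append("  (no open tasks)")
    for t in rows:
        status = f"[{t['status']}]"
        env = t.get("environment", "")
        lines.append(
            f"  {t['number']:04d}  {status:<14}{t['priority']:<9}"
            f"{t['target']:<8}{env:<10}{t['title']}"
        )
    return "\n".join(lines)
```

The sketch follows the acceptance criteria literally (priority, then number). The example output above lists the blocked #0012 after the pending normal-priority tasks, which would need status as an extra sort key.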