5 changes: 5 additions & 0 deletions README.md
@@ -139,6 +139,11 @@ python3 -m nightshift multi /repo1 /repo2 --agent claude --test --cycles 1
python3 -m nightshift module-map --write
```

`python3 -m nightshift test ...` now keeps its state files, runner logs, and
linked worktree under `$TMPDIR/nightshift-test-runs/...` so evaluation clones
stay clean. Full `run` mode still writes repo-local runtime artifacts under
`docs/Nightshift/`.
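
The isolation described above can be pictured with a minimal sketch. It assumes only what the paragraph states (a `nightshift-test-runs` directory under the system temp root); the helper names `sketch_test_runtime_dir` and `is_inside` and the `run_id` parameter are hypothetical and are not taken from the Nightshift codebase.

```python
import tempfile
from pathlib import Path


def sketch_test_runtime_dir(run_id: str) -> Path:
    """Hypothetical helper: create an isolated runtime directory for one test run."""
    # tempfile.gettempdir() honors $TMPDIR when it is set and usable.
    root = Path(tempfile.gettempdir()) / "nightshift-test-runs"
    run_dir = root / run_id  # run_id is an illustrative per-run label
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


def is_inside(path: Path, parent: Path) -> bool:
    """Return True when `path` is located at or under `parent`."""
    try:
        path.resolve().relative_to(parent.resolve())
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    run_dir = sketch_test_runtime_dir("demo-run")
    # State files, runner logs, and the linked worktree would live under run_dir,
    # so nothing is written into the evaluation clone itself.
    print(run_dir)
```

A check such as `is_inside(artifact_path, clone_dir)` returning `False` is one way a regression test could confirm that a test-mode artifact stayed out of the cloned repo.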

### From the installed skill bundle

Use the bundled wrapper scripts:
77 changes: 40 additions & 37 deletions docs/architecture/MODULE_MAP.md
@@ -1,6 +1,6 @@
# Module Map

Last updated: 2026-04-05 by session #0059
Last updated: 2026-04-06 by session #0062
Generated via: `python3 -m nightshift module-map --write`
Stale after: 5 newer sessions without a refresh

@@ -9,37 +9,39 @@ Read it before opening modules one by one when you need fast orientation.

## Modules (29)

| Module | Lines | Purpose | Key symbols | Last changed |
|---|---:|---|---|---|
| `errors.py` | 7 | Nightshift error types. | `NightshiftError` | 2802c51 |
| `eval_targets.py` | 96 | Known evaluation targets and their repo-specific verification settings. | `infer_target_verify_command`, `_KNOWN_TARGET_VERIFY_COMMANDS` | session #0059 |
| `types.py` | 561 | Strict type definitions for all Nightshift data structures. | `NightshiftConfig`, `DiffScore`, `Counters`, `Baseline` | PR #88 (7e36fa5) |
| `constants.py` | 745 | Module-level constants and tiny utilities used across the package. | `now_local`, `print_status`, `DATA_VERSION`, `SUPPORTED_AGENTS` | PR #88 (7e36fa5) |
| `shell.py` | 161 | Subprocess execution: streaming runner, git helper, shell utilities. | `run_command`, `run_capture`, `git`, `command_exists` | PR #27 (9e953eb) |
| `summary.py` | 141 | Feature summary generation for Loop 2 build output. | `generate_feature_summary`, `_API_DIR_SEGMENTS`, `_CLI_DIR_SEGMENTS`, `_CONFIG_DIR_SEGMENTS` | PR #67 (89f8cd6) |
| `cleanup.py` | 337 | Daemon housekeeping -- log rotation, healer archiving, and branch pruning. | `rotate_healer_log`, `rotate_logs`, `prune_orphan_branches`, `_HEALER_ENTRY_RE` | PR #88 (7e36fa5) |
| `compact.py` | 318 | Handoff compaction -- merges numbered handoff files into weekly summaries. | `compact_handoffs`, `_NUMBERED_RE`, `_SECTION_RE`, `_DATE_RE` | PR #83 (56e0c97) |
| `coordination.py` | 192 | Sub-agent coordination for Loop 2 -- detects file overlaps and generates hints. | `extract_file_references`, `detect_overlaps`, `generate_coordination_hints`, `inject_hints` | PR #72 (a5a3e47) |
| `costs.py` | 672 | Cost tracking for daemon sessions -- parse token usage from logs and maintain a ledger. | `parse_session_tokens`, `calculate_cost`, `read_ledger`, `write_ledger` | PR #89 (7211bd4) |
| `module_map.py` | 298 | Generate a persistent module map for fast cross-session orientation. | `module_map_path`, `generate_module_map`, `render_module_map`, `write_module_map` | PR #86 (77e5c25) |
| `readiness.py` | 211 | Production-readiness checks for Loop 2 feature builds. | `collect_changed_files`, `check_secrets`, `check_debug_prints`, `check_test_coverage` | PR #69 (3877225) |
| `scoring.py` | 113 | Post-cycle diff scoring: evaluates production impact of cycle changes. | `score_diff`, `log_score` | PR #10 (3e5f98f) |
| `state.py` | 187 | Shift state: read, write, mutate counters, JSON I/O. | `load_json`, `write_json`, `read_state`, `top_path` | PR #28 (60e4ed5) |
| `config.py` | 241 | Configuration loading, agent resolution, and environment detection. | `merge_config`, `prompt_for_agent`, `resolve_agent`, `infer_package_manager` | session #0059 |
| `multi.py` | 117 | Multi-repo shift orchestration: run hardening loops across multiple repos. | `validate_repos`, `format_multi_summary`, `run_multi_shift` | PR #22 (12ac402) |
| `e2e.py` | 113 | End-to-end test runner for Loop 2 feature builds. | `infer_test_command`, `detect_smoke_test`, `run_e2e_tests`, `_MAKEFILE_TEST_TARGET` | PR #70 (95ef827) |
| `profiler.py` | 569 | Repo profiling for Loop 2 -- detects language, framework, dependencies, structure. | `profile_repo` | PR #78 (5cc11a3) |
| `worktree.py` | 213 | Git worktree lifecycle: create, shift log, sync, revert, cleanup. | `canonical_repo_relative_path`, `resolve_nightshift_dir`, `validate_worktree`, `validate_repo_checkout` | PR #96 (34244ff) |
| `cycle.py` | 855 | Per-cycle logic: prompt building, agent dispatch, verification, evaluation. | `extract_json`, `read_repo_instructions`, `wrap_repo_instructions`, `command_for_agent` | PR #96 (34244ff) |
| `evaluation.py` | 874 | Self-evaluation loop: score nightshift runs against real repos. | `clone_target_repo`, `run_test_shift`, `parse_shift_artifacts`, `score_startup` | PR #96 (34244ff) |
| `planner.py` | 483 | Feature planner for Loop 2 -- builds structured plans from repo profiles. | `build_plan_prompt`, `validate_plan`, `parse_plan`, `execution_order` | PR #78 (5cc11a3) |
| `subagent.py` | 281 | Sub-agent spawner for Loop 2 -- executes work orders via codex or claude CLI. | `spawn_task`, `spawn_wave`, `format_wave_result`, `_TASK_COMPLETION_REQUIRED_KEYS` | PR #33 (bd23cc4) |
| `decomposer.py` | 175 | Task decomposer for Loop 2 -- converts FeaturePlans into sub-agent work orders. | `build_work_order_prompt`, `decompose_plan`, `format_work_orders` | PR #78 (5cc11a3) |
| `integrator.py` | 325 | Wave integrator for Loop 2 -- merges sub-agent work, runs tests, handles failures. | `collect_wave_files`, `stage_files`, `run_test_suite`, `diagnose_failure` | PR #33 (bd23cc4) |
| `feature.py` | 696 | Loop 2 feature-build orchestration and persisted build state. | `feature_state_path`, `feature_log_dir`, `read_feature_state`, `write_feature_state` | PR #78 (5cc11a3) |
| `cli.py` | 543 | CLI entry points: run, test, summarize, verify-cycle, module-map. | `run_nightshift`, `summarize`, `verify_cycle_cli`, `plan_feature` | PR #96 (34244ff) |
| `__main__.py` | 5 | Entry point for python3 -m nightshift. | `main` | 2802c51 |
| `__init__.py` | 537 | Nightshift -- autonomous overnight codebase improvement agent. | `AGENT_DEFAULT_MODELS`, `BACKEND_DIR_NAMES`, `BACKEND_EXTENSIONS`, `CATEGORY_ORDER` | session #0059 |

| Module | Lines | Purpose | Key symbols | Last changed |
| ----------------- | ----- | --------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------- | ----------------- |
| `errors.py` | 7 | Nightshift error types. | `NightshiftError` | 2802c51 |
| `eval_targets.py` | 96 | Known evaluation targets and their repo-specific verification settings. | `infer_target_verify_command`, `_KNOWN_TARGET_VERIFY_COMMANDS` | PR #106 (e2d235c) |
| `types.py` | 561 | Strict type definitions for all Nightshift data structures. | `NightshiftConfig`, `DiffScore`, `Counters`, `Baseline` | PR #88 (7e36fa5) |
| `constants.py` | 749 | Module-level constants and tiny utilities used across the package. | `now_local`, `print_status`, `DATA_VERSION`, `SUPPORTED_AGENTS` | session #0062 |
| `shell.py` | 161 | Subprocess execution: streaming runner, git helper, shell utilities. | `run_command`, `run_capture`, `git`, `command_exists` | PR #27 (9e953eb) |
| `summary.py` | 141 | Feature summary generation for Loop 2 build output. | `generate_feature_summary`, `_API_DIR_SEGMENTS`, `_CLI_DIR_SEGMENTS`, `_CONFIG_DIR_SEGMENTS` | PR #67 (89f8cd6) |
| `cleanup.py` | 337 | Daemon housekeeping -- log rotation, healer archiving, and branch pruning. | `rotate_healer_log`, `rotate_logs`, `prune_orphan_branches`, `_HEALER_ENTRY_RE` | PR #88 (7e36fa5) |
| `compact.py` | 318 | Handoff compaction -- merges numbered handoff files into weekly summaries. | `compact_handoffs`, `_NUMBERED_RE`, `_SECTION_RE`, `_DATE_RE` | PR #83 (56e0c97) |
| `coordination.py` | 192 | Sub-agent coordination for Loop 2 -- detects file overlaps and generates hints. | `extract_file_references`, `detect_overlaps`, `generate_coordination_hints`, `inject_hints` | PR #72 (a5a3e47) |
| `costs.py` | 672 | Cost tracking for daemon sessions -- parse token usage from logs and maintain a ledger. | `parse_session_tokens`, `calculate_cost`, `read_ledger`, `write_ledger` | PR #89 (7211bd4) |
| `module_map.py` | 298 | Generate a persistent module map for fast cross-session orientation. | `module_map_path`, `generate_module_map`, `render_module_map`, `write_module_map` | PR #86 (77e5c25) |
| `readiness.py` | 211 | Production-readiness checks for Loop 2 feature builds. | `collect_changed_files`, `check_secrets`, `check_debug_prints`, `check_test_coverage` | PR #69 (3877225) |
| `scoring.py` | 113 | Post-cycle diff scoring: evaluates production impact of cycle changes. | `score_diff`, `log_score` | PR #10 (3e5f98f) |
| `state.py` | 187 | Shift state: read, write, mutate counters, JSON I/O. | `load_json`, `write_json`, `read_state`, `top_path` | PR #28 (60e4ed5) |
| `config.py` | 241 | Configuration loading, agent resolution, and environment detection. | `merge_config`, `prompt_for_agent`, `resolve_agent`, `infer_package_manager` | PR #106 (e2d235c) |
| `multi.py` | 117 | Multi-repo shift orchestration: run hardening loops across multiple repos. | `validate_repos`, `format_multi_summary`, `run_multi_shift` | PR #22 (12ac402) |
| `e2e.py` | 113 | End-to-end test runner for Loop 2 feature builds. | `infer_test_command`, `detect_smoke_test`, `run_e2e_tests`, `_MAKEFILE_TEST_TARGET` | PR #70 (95ef827) |
| `profiler.py` | 569 | Repo profiling for Loop 2 -- detects language, framework, dependencies, structure. | `profile_repo` | PR #78 (5cc11a3) |
| `worktree.py` | 232 | Git worktree lifecycle: create, shift log, sync, revert, cleanup. | `canonical_repo_relative_path`, `resolve_nightshift_dir`, `resolve_shift_log_relative_dir`, `resolve_test_runtime_dir` | session #0062 |
| `cycle.py` | 855 | Per-cycle logic: prompt building, agent dispatch, verification, evaluation. | `extract_json`, `read_repo_instructions`, `wrap_repo_instructions`, `command_for_agent` | PR #96 (34244ff) |
| `evaluation.py` | 906 | Self-evaluation loop: score nightshift runs against real repos. | `clone_target_repo`, `run_test_shift`, `parse_shift_artifacts`, `score_startup` | session #0062 |
| `planner.py` | 483 | Feature planner for Loop 2 -- builds structured plans from repo profiles. | `build_plan_prompt`, `validate_plan`, `parse_plan`, `execution_order` | PR #78 (5cc11a3) |
| `subagent.py` | 281 | Sub-agent spawner for Loop 2 -- executes work orders via codex or claude CLI. | `spawn_task`, `spawn_wave`, `format_wave_result`, `_TASK_COMPLETION_REQUIRED_KEYS` | PR #33 (bd23cc4) |
| `decomposer.py` | 175 | Task decomposer for Loop 2 -- converts FeaturePlans into sub-agent work orders. | `build_work_order_prompt`, `decompose_plan`, `format_work_orders` | PR #78 (5cc11a3) |
| `integrator.py` | 325 | Wave integrator for Loop 2 -- merges sub-agent work, runs tests, handles failures. | `collect_wave_files`, `stage_files`, `run_test_suite`, `diagnose_failure` | PR #33 (bd23cc4) |
| `feature.py` | 696 | Loop 2 feature-build orchestration and persisted build state. | `feature_state_path`, `feature_log_dir`, `read_feature_state`, `write_feature_state` | PR #78 (5cc11a3) |
| `cli.py` | 550 | CLI entry points: run, test, summarize, verify-cycle, module-map. | `run_nightshift`, `summarize`, `verify_cycle_cli`, `plan_feature` | session #0062 |
| `__main__.py` | 5 | Entry point for python3 -m nightshift. | `main` | 2802c51 |
| `__init__.py` | 547 | Nightshift -- autonomous overnight codebase improvement agent. | `AGENT_DEFAULT_MODELS`, `BACKEND_DIR_NAMES`, `BACKEND_EXTENSIONS`, `CATEGORY_ORDER` | session #0062 |


## Dependency Order

@@ -50,8 +52,9 @@ Topological order derived from internal `nightshift.*` imports.
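
For orientation, a topological order over an import graph can be sketched with the standard-library `graphlib` module (Python 3.9+). The edges below are invented for illustration and do not reflect the real `nightshift.*` imports; `dependency_order` is a hypothetical name, not the function `module_map.py` uses.

```python
from graphlib import TopologicalSorter

# Hypothetical import graph: each module maps to the internal modules it imports.
# These edges are made up for the example.
imports = {
    "errors": set(),
    "types": set(),
    "constants": {"types"},
    "shell": {"errors", "constants"},
    "state": {"constants", "types"},
    "cli": {"shell", "state"},
}


def dependency_order(graph: dict[str, set[str]]) -> list[str]:
    """Return modules so that every module appears after the modules it imports."""
    # TopologicalSorter treats the dict values as predecessors of each key
    # and raises graphlib.CycleError if the imports are circular.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(dependency_order(imports))
```

Several valid orders usually exist, so only the relative constraints (for example, `types` before `constants`) are guaranteed, not one exact sequence.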

## Recent Shipped Sessions

- PR #105: docs: close stale eval startup task
- PR #104: fix: gate autonomous queue on eval score
- PR #99: test: cover malformed task frontmatter edge case
- PR #98: docs: track task parser review follow-up
- PR #97: feat: add task frontmatter validator
- PR #125: feat: watchdog + natural task creation (no artificial caps)
- PR #124: fix: round 6 audit — 9 remaining issues patched
- PR #123: feat: overseer rewrite — ticket closer, not process auditor
- PR #122: overseer: dedupe auto-release queue
- PR #121: overseer: fix unified-daemon operator docs

2 changes: 2 additions & 0 deletions docs/changelog/v0.0.8.md
@@ -14,6 +14,7 @@ Closing the self-maintaining gap: auto-release, auto-changelog, evaluation CLI,
- **[docs]** Refreshed `README.md` against the live repo so it now documents the real `python3 -m nightshift` entry points, installed wrapper scripts, current tracker snapshot, current config surface, and the current handoff/learnings/task workflow instead of stale marketing-era commands and percentages. (tasks `#0118`, `#0067`)

## Fixed
- **[fix]** `nightshift test` now keeps evaluation state, runner logs, and linked worktrees under an isolated temp-root runtime directory, so rejected Phractal eval runs no longer dirty the cloned target repo while evaluation artifact parsing still finds the shifted state/log files. (task `#0100`)
- **[fix]** Shell scripts in `scripts/` now use ASCII-only section dividers and restart/status text, removing box-drawing and em-dash characters that violated repo conventions and rendered inconsistently across terminals/filesystems. (task #0038)
- **[meta]** Corrected the authoritative Step 0 evaluation command in `docs/prompt/evolve.md` so fresh-clone Phractal evaluations pass `--repo-dir /tmp/nightshift-eval` from the Nightshift repo root instead of accidentally targeting the Nightshift checkout. (task `#0117`)
- **[fix]** Nightshift now resolves the repo's actual `docs/` casing across runtime artifacts, shift-log verification, and evaluation artifact parsing, so repos that use `Docs/Nightshift/` no longer get false rejected cycles or mis-targeted self-evaluation reads, and legitimate final-cycle shift-log summary commits no longer trip the extra-commit guard rail. (tasks `#0098`, `#0121`)
@@ -22,6 +23,7 @@ Closing the self-maintaining gap: auto-release, auto-changelog, evaluation CLI,
## Removed

## Internal
- **[test]** Added regression coverage for isolated test-mode runtime artifacts and for rejected test-mode runs leaving the cloned target repo clean. Test suite is now 992 passing.
- **[test]** Added regression coverage for repo-URL-based evaluation verifier selection, percent-bearing git remote URLs, and the documentation contract for known target metadata. Test suite is now 943 passing.
- **[meta]** Recorded `docs/evaluations/0014.md` from a fresh-clone Phractal run, confirmed the default Claude startup path still launches cleanly without `CLAUDECODE` or effort overrides, and closed stale eval task `#0097` so the eval gate now points at the remaining verification/cleanup gaps instead of obsolete startup drift.
- **[meta]** Added an eval-score gate to `docs/prompt/evolve-auto.md` and mirrored it in the builder operations docs so, after Step 0, any latest real-repo evaluation below `80/100` forces the autonomous builder to prefer eval-related normal-priority tasks over unrelated queue cleanup. Added prompt-contract regression coverage for the new rule and recorded fresh Phractal evaluation `docs/evaluations/0013.md` at `70/100`. (task `#0131`)