refactor(agents): decompose 4 backend hotspot files into packages by ty13r · Pull Request #57 · ty13r/skillforge

ty13r · 2026-04-20T07:05:23Z

Summary

Decomposes the four remaining backend hotspots — each agent/engine file over the 500-LOC Python ceiling in docs/clean-code.md §2. Every split follows the pure-planner / thin-I/O-shell pattern: prompt builders + data helpers in private modules, LLM/SDK/disk I/O in one clearly-labeled module, top-level orchestrators call through.

| File | Before (LOC) | After (largest submodule, LOC) | Package layout |
|---|---|---|---|
| `managed_agents.py` | 620 | 7-submodule package | `_constants`, `environments`, `skills`, `agents`, `sessions`, `output` |
| `variant_evolution.py` | 620 | 345 (dimension.py) | `_helpers`, `dimension`, `assembly`, `main` |
| `breeder.py` | 629 | 213 (_reports.py) | `_ranking`, `_prompts`, `_reports`, `main`, `bible` |
| `spawner.py` | 763 | 411 (main.py) | `_helpers`, `_prompts`, `main` |

Largest submodule anywhere is 416 LOC, under the 500-LOC cap.
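
A minimal sketch of the pure-planner / thin-I/O-shell split described in the summary. Every name here (`PromptPlan`, `build_mutation_plan`, `run_plan`, `fake_complete`) is hypothetical and chosen for illustration; none of them come from the skillforge codebase, and the real modules may divide the work differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PromptPlan:
    """Everything the I/O shell needs; plain data, no side effects."""

    system_prompt: str
    user_prompt: str


def build_mutation_plan(skill_name: str, lessons: list[str]) -> PromptPlan:
    """Pure planner: the same inputs always give the same plan, no I/O."""
    bullet_list = "\n".join(f"- {lesson}" for lesson in lessons)
    return PromptPlan(
        system_prompt="You improve skills based on lessons learned.",
        user_prompt=f"Skill: {skill_name}\nLessons:\n{bullet_list}",
    )


def run_plan(plan: PromptPlan, complete: Callable[[str, str], str]) -> str:
    """Thin I/O shell: the only function that talks to the model client."""
    return complete(plan.system_prompt, plan.user_prompt).strip()


def fake_complete(system: str, user: str) -> str:
    """Stand-in for a real LLM client so the sketch runs offline."""
    return f"  echoed {len(user.splitlines())} lines  "


plan = build_mutation_plan("csv-cleaner", ["quote fields", "handle BOM"])
reply = run_plan(plan, fake_complete)
```

Because the planner is pure, it can be unit-tested without any client at all, and the shell stays small enough that swapping in a fake `complete` callable covers it.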

Test-patch compatibility

Two of the four packages required a lazy-lookup shim for tests that call `patch("skillforge.agents.breeder.breed_next_gen")` (or similar) on the package root. A submodule that binds the reference at import time keeps pointing at the original function, so the patch on the package root never reaches it; looking the name up through the package namespace at call time lets the patch take effect.

This was load-bearing — without the shim the breeder/spawner test suites silently made real LLM calls (11 minutes of API spend before first failure on the spawner).
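
The difference between the two binding styles can be reproduced in isolation. The sketch below assumes a throwaway in-memory package named `demo_pkg`; the functions `breed_early` and `breed_lazy` are hypothetical and only mirror the shape of the shim, not the actual skillforge code.

```python
import sys
import types
from unittest.mock import patch

# Register a throwaway package so patch("demo_pkg....") can resolve it.
pkg = types.ModuleType("demo_pkg")
sys.modules["demo_pkg"] = pkg


def _real_breed_next_gen() -> str:
    return "real LLM call"


# The name that tests patch on the package root.
pkg.breed_next_gen = _real_breed_next_gen

# Equivalent of `from demo_pkg import breed_next_gen` inside a submodule:
# the reference is copied once, at import time.
breed_next_gen = pkg.breed_next_gen


def breed_early() -> str:
    """Calls the import-time binding; a patch on the package cannot reach it."""
    return breed_next_gen()


def breed_lazy() -> str:
    """Looks the name up on the package at call time, so patches apply."""
    return sys.modules["demo_pkg"].breed_next_gen()


with patch("demo_pkg.breed_next_gen", return_value="mocked"):
    early = breed_early()  # still the real function
    lazy = breed_lazy()    # the mock

after_patch = breed_lazy()  # patch undone on exit, real function again
```

Inside the `with` block only the lazy variant sees the mock, which is why an early-bound submodule can keep making real calls even though the test believes it has patched them out.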

Also re-exported a handful of private helpers (_extract_skill_name_from_md, _normalize_output_path, _extract_lessons_and_report, _read_bible_patterns, BIBLE_DIR) through each package's __init__ so existing monkeypatch.setattr(module, "PRIVATE_NAME", ...) calls resolve.

Test plan

uv run ruff check skillforge — clean
uv run mypy skillforge — 83 files pass (up from 65)
uv run pytest tests/ — 411 passed, 2 skipped (unchanged)
cd frontend && npm run build / lint / format:check / test — untouched, still green

Public API unchanged

Every import site keeps its original path: `from skillforge.agents import breeder; breeder.breed(...)`, `from skillforge.agents.managed_agents import upload_skill`, and `from skillforge.engine.variant_evolution import run_variant_evolution` all still work.
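
As a rough illustration of how a barrel `__init__` keeps old import paths alive after a module becomes a package, the sketch below writes a tiny package to a temporary directory and imports it. The package name `demo_managed_agents` and both function bodies are assumptions made for the demo; only the re-export shape reflects what this PR describes.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway package on disk: one submodule plus a barrel __init__.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_managed_agents"
pkg_dir.mkdir()

(pkg_dir / "skills.py").write_text(
    "def upload_skill(name):\n"
    "    return f'uploaded {name}'\n"
    "\n"
    "def _extract_skill_name_from_md(text):\n"
    "    return text.splitlines()[0].lstrip('# ').strip()\n"
)

# The barrel re-exports public and test-facing private names, and lists
# them in __all__ so linters do not flag the imports as unused.
(pkg_dir / "__init__.py").write_text(
    "from .skills import _extract_skill_name_from_md, upload_skill\n"
    "\n"
    "__all__ = ['_extract_skill_name_from_md', 'upload_skill']\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo = importlib.import_module("demo_managed_agents")

# Both access styles used by existing call sites keep working.
from demo_managed_agents import upload_skill

via_module = demo.upload_skill("x")
via_name = upload_skill("y")
skill_name = demo._extract_skill_name_from_md("# My Skill\nbody")
```

Callers never need to know that `upload_skill` now lives in a submodule, and tests that reach for a private helper on the package root still find it there.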

🤖 Generated with Claude Code

Seven-submodule decomposition along SDK-resource-lifecycle seams:

- managed_agents/__init__.py: barrel + full docstring with all the Step-0 smoke test SDK quirks
- managed_agents/_constants.py (35): beta headers + make_client + the $0.08/hr session rate
- managed_agents/environments.py (54): create / archive env
- managed_agents/skills.py (167): upload + 3-step archive dance + archive_skill_safe + name extractor
- managed_agents/agents.py (57): create / archive competitor agent
- managed_agents/sessions.py (124): create / archive session + send_user_message + event polling
- managed_agents/output.py (211): post-run trace introspection (written_files, bash-write parsing, token usage, runtime cost)

Every public name is re-exported from the package __init__ so 38 call sites keep their ``from skillforge.agents import managed_agents`` + ``managed_agents.upload_skill(...)`` usage unchanged.

Tests against two private helpers (_extract_skill_name_from_md, _normalize_output_path) were accessing them on the module directly; those are re-exported through the barrel so test patches continue to resolve.

QA: ruff + mypy + 411 pytest (unchanged) all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Decomposed by orchestration level: the mini-evolution loop, the assembly step, and the top-level run entry each live in their own file:

- variant_evolution/__init__.py: barrel + re-exports run_variant_evolution
- variant_evolution/_helpers.py: constants + _tier_sort_key + _aggregate_fitness
- variant_evolution/dimension.py: _run_dimension_mini_evolution (challenge -> spawn -> compete -> score -> judge -> breed -> pick winner)
- variant_evolution/assembly.py: _real_assembly (Engineer call + integration check)
- variant_evolution/main.py: run_variant_evolution orchestrator

Largest submodule is dimension.py at 345 LOC, under the 500-LOC ceiling in docs/clean-code.md §2. Prior to this split, the monolith held a single 311-LOC function (_run_dimension_mini_evolution) alongside the assembly logic and the main loop; the file was 620 LOC and every refactor touched everything.

Test-access surface preserved: tests/test_variant_evolution.py imports _aggregate_fitness and _tier_sort_key directly from the package, so the __init__ re-exports them.

Also rolled in: _extract_skill_name_from_md and _normalize_output_path added to the managed_agents package __all__ (they were already re-exported for test access, just needed the __all__ entry to satisfy F401).

QA: ruff + mypy + 411 pytest all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Decomposed by responsibility. The six section comments in the monolith ("slot allocation", "ranking", "main breed", "mutation prompts", "lessons + reports", "bible publishing") each correspond to a submodule:

- breeder/__init__.py: barrel that re-exports breed + compute_slots + rank_skills + publish_findings_to_bible, plus legacy aliases (breed_next_gen, spawn_gen0, BIBLE_DIR) that tests patch on the package root
- breeder/_ranking.py: compute_slots + rank_skills + _aggregate_fitness
- breeder/_prompts.py: diagnostic + crossover instruction templates + breeding-context formatter (pure)
- breeder/_reports.py: _extract_lessons_and_report + siblings (LLM calls)
- breeder/main.py: breed() + _carry_elite (orchestrator)
- breeder/bible.py: publish_findings_to_bible (disk I/O)

Largest submodule is _reports.py at 213 LOC, under the 500-LOC Python ceiling in docs/clean-code.md §2.

Test-patch compatibility
------------------------
Tests patch three functions on the package root: ``breeder.breed_next_gen``, ``breeder.spawn_gen0``, ``breeder._extract_lessons_and_report``. Those patches don't propagate to bindings in submodules, so ``main.breed()`` now resolves each through the package namespace at call time (``_pkg().breed_next_gen`` etc.). BIBLE_DIR follows the same pattern in bible.py.

QA: ruff + mypy + 411 pytest (unchanged) all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Decomposed along the pure-planner / thin-I/O-shell seam called out in docs/clean-code.md §7. Four submodules:

- spawner/__init__.py: barrel that re-exports four entry points plus every helper tests patch on the package root (_generate, _read_bible_patterns, BIBLE_DIR, ...)
- spawner/_helpers.py: _generate (LLM streaming) + _parse_genomes + _auto_repair_missing_references + _validate_genomes + _read_bible_patterns
- spawner/_prompts.py: all _build_*_system_prompt string templates + embedded JSON schema constants (pure: no I/O, no LLM calls)
- spawner/main.py: four public entry points: spawn_gen0, breed_next_gen, spawn_from_parent, spawn_variant_gen0

Largest submodule is main.py at 411 LOC, under the 500-LOC ceiling.

Test-patch compatibility
------------------------
Same pattern as the breeder split: tests patch ``spawner._generate``, ``spawner._read_bible_patterns``, and ``spawner.BIBLE_DIR`` on the package root. Those patches do not propagate to direct imports in submodules, so ``main._generate`` / ``main._read_bible_patterns`` and ``_helpers._read_bible_patterns`` now resolve the reference through the package namespace at call time.

Without these shims the test suite made real LLM calls for 11 minutes before first failure; the fix is load-bearing for both test speed and API-cost safety.

QA: ruff + mypy (83 files) + 411 pytest all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Matt (via Claude Code) and others added 4 commits April 20, 2026 01:19

ty13r merged commit fcc007f into main Apr 20, 2026
2 checks passed

ty13r deleted the refactor/agent-decomposition branch April 20, 2026 07:06
