
epic: Agent god-object decomposition and sibling crate extraction (Phase 2) #3498

@bug-ops

Context

Phase 2 of the zeph-core decomposition (#3480). This epic tracks the actual creation of three sibling crates that the original issue proposed. It is blocked until Phase 1 (#3497) completes and the four hard prerequisites below are met.

Architecture plan: .local/plan/decompose-zeph-core.md §5 in branch refactor/3480-decompose-zeph-core-god-crate.

Why Phase 2 cannot happen now

Every targeted module (persistence.rs, tool_execution/native.rs, context/assembly.rs) is implemented as impl<C: Channel> Agent<C> blocks that access between 25 and 98 private sub-state fields; tool_execution/native.rs is the worst case, touching 98 distinct self.X fields. Extracting these modules into sibling crates requires either:

  • making all sub-state fields pub (destroys every pub(super) invariant), or
  • a circular sibling → zeph-core::Agent → sibling dependency.

Neither is acceptable. The prerequisite is the Agent god-object decomposition (see §A1 TODO at crates/zeph-core/src/agent/mod.rs:159-175).
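The coupling described above can be illustrated with a deliberately tiny sketch. Every name here (the fields, persist_turn, RecordingChannel) is hypothetical and not taken from zeph-core; only the impl<C: Channel> Agent<C> shape comes from this issue.

```rust
// Hypothetical miniature of the coupling; names are illustrative only.
mod agent {
    pub trait Channel {
        fn send(&mut self, text: &str);
    }

    // Sub-state fields are private to this module, standing in for the
    // pub(super) invariants mentioned above.
    pub struct Agent<C: Channel> {
        history: Vec<String>,
        pending_writes: usize,
        channel: C,
    }

    impl<C: Channel> Agent<C> {
        pub fn new(channel: C) -> Self {
            Self { history: Vec::new(), pending_writes: 0, channel }
        }

        // Reads and writes three private fields through `self`. Moving this
        // method to a sibling crate would need either `pub` fields or a
        // dependency from that crate back onto the crate defining `Agent`.
        pub fn persist_turn(&mut self, text: &str) -> usize {
            self.history.push(text.to_owned());
            self.pending_writes += 1;
            self.channel.send("persisted");
            self.pending_writes
        }

        pub fn history_len(&self) -> usize {
            self.history.len()
        }
    }
}

// A trivial Channel implementation so the sketch can be exercised.
pub struct RecordingChannel(pub Vec<String>);

impl agent::Channel for RecordingChannel {
    fn send(&mut self, text: &str) {
        self.0.push(text.to_owned());
    }
}
```

With three fields the problem looks harmless; at 98 fields per module, neither of the two escape routes scales.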

Hard prerequisites (ALL must be met before any Phase 2 PR)

  • P2-prereq-1: task_supervisor adoption reaches 100% in crates/zeph-core/src/agent/ (currently 2/15 = ~13%; 13 raw tokio::spawn sites remain)
  • P2-prereq-2: Agent god-object decomposition lands (A1 TODO). Target structure:
    • Conversation core: msg, context_manager, persistence stays on Agent
    • Services aggregator: memory, learning, focus, sidequest, compression, mcp, index, security, orchestration, experiments behind a Services struct that Agent borrows
    • AgentRuntime newtype: runtime, lifecycle, providers, metrics, debug_state, instructions
  • P2-prereq-3: TurnContext value type defined — a value type that survives boundary cuts between loop/compose/persist phases
  • P2-prereq-4: task_supervisor.rs migrates to zeph-common (deferred from Phase 1 row 14)
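One possible shape for the P2-prereq-2 and P2-prereq-3 targets is sketched below. The struct names Services, AgentRuntime and TurnContext come from this issue; every field, the persist_turn function and the run_turn method are placeholders, and Agent owns Services here for brevity even though the plan says Agent borrows it.

```rust
// Placeholder fields throughout; only the three-way grouping follows the issue.
#[derive(Default)]
pub struct Services {
    pub memory: Vec<String>,   // stands in for the memory sub-state
    pub compression_runs: u32, // stands in for the compression sub-state
}

#[derive(Default)]
pub struct AgentRuntime {
    pub turns_completed: u64, // stands in for metrics
}

// Value type carried across loop/compose/persist phases:
// owned data only, no borrow of Agent.
#[derive(Debug, Clone, PartialEq)]
pub struct TurnContext {
    pub turn_id: u64,
    pub user_input: String,
}

#[derive(Default)]
pub struct Agent {
    msg: Vec<String>, // conversation core stays on Agent
    services: Services,
    runtime: AgentRuntime,
}

// A function of this shape could live in a sibling crate, because it needs
// only Services and TurnContext and never names Agent.
pub fn persist_turn(services: &mut Services, turn: &TurnContext) {
    services
        .memory
        .push(format!("{}:{}", turn.turn_id, turn.user_input));
}

impl Agent {
    pub fn run_turn(&mut self, input: &str) -> TurnContext {
        let turn = TurnContext {
            turn_id: self.runtime.turns_completed + 1,
            user_input: input.to_owned(),
        };
        self.msg.push(input.to_owned());
        persist_turn(&mut self.services, &turn);
        self.runtime.turns_completed += 1;
        turn
    }

    pub fn message_count(&self) -> usize {
        self.msg.len()
    }

    pub fn persisted(&self) -> &[String] {
        &self.services.memory
    }
}
```

The point of the split is that persist_turn takes disjoint borrows of named sub-structs instead of &mut self, which is what removes the sibling → Agent dependency.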

Target crate map (after Phase 2)

zeph-core (~22K LoC, orchestrator only)
├── zeph-context           (expanded in Phase 1)
├── zeph-agent-feedback    (extracted in Phase 1, PR #3494)
├── zeph-agent-persistence (NEW)
├── zeph-agent-tools       (NEW — or fold into zeph-tools)
└── zeph-agent-context     (NEW)

| New crate | Owns | Key deps |
|---|---|---|
| zeph-agent-persistence | PersistenceService, sanitize helpers, graph/persona/trajectory task enqueueing as fns taking &MemoryState | zeph-memory, zeph-db, zeph-llm, zeph-context |
| zeph-agent-tools | ToolDispatcher, native dispatch, retry, confirmation, batch result processing | zeph-tools, zeph-llm, zeph-mcp, zeph-skills, zeph-context |
| zeph-agent-context | Remaining Agent-coupled assembly code after pure helpers moved to zeph-context in Phase 1 | zeph-context, zeph-memory, zeph-llm, zeph-skills, zeph-index |
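The "task enqueueing as fns taking &MemoryState" entry for zeph-agent-persistence might look roughly like the sketch below. The real MemoryState is not shown in this issue, so the version here is a hypothetical stand-in built on a std::sync::mpsc channel; BackgroundTask, submit and the enqueue_* function names are likewise assumptions.

```rust
use std::sync::mpsc::{channel, Receiver, SendError, Sender};

// Hypothetical task kinds, named after the graph/persona/trajectory split.
#[derive(Debug, PartialEq, Eq)]
pub enum BackgroundTask {
    Graph(String),
    Persona(String),
    Trajectory(String),
}

// Hypothetical stand-in for MemoryState: it only holds a queue handle.
pub struct MemoryState {
    tasks: Sender<BackgroundTask>,
}

impl MemoryState {
    pub fn new() -> (Self, Receiver<BackgroundTask>) {
        let (tx, rx) = channel();
        (Self { tasks: tx }, rx)
    }

    // The one public entry point a sibling crate would need;
    // Sender::send works through a shared reference.
    pub fn submit(&self, task: BackgroundTask) -> Result<(), SendError<BackgroundTask>> {
        self.tasks.send(task)
    }
}

// Free functions: they take &MemoryState and never mention Agent.
pub fn enqueue_graph_update(
    memory: &MemoryState,
    message: &str,
) -> Result<(), SendError<BackgroundTask>> {
    memory.submit(BackgroundTask::Graph(message.to_owned()))
}

pub fn enqueue_persona_update(
    memory: &MemoryState,
    message: &str,
) -> Result<(), SendError<BackgroundTask>> {
    memory.submit(BackgroundTask::Persona(message.to_owned()))
}

pub fn enqueue_trajectory_update(
    memory: &MemoryState,
    message: &str,
) -> Result<(), SendError<BackgroundTask>> {
    memory.submit(BackgroundTask::Trajectory(message.to_owned()))
}
```

Because the functions depend only on MemoryState's public surface, the crate that hosts them needs no knowledge of Agent or its Channel parameter.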

Acceptance Criteria

  • zeph-core LoC drops below 30K (from ~79K post-Phase-1)
  • Three new sibling crates compile independently (cargo build -p zeph-agent-persistence etc. without zeph-core)
  • cargo build --timings shows a measurable incremental rebuild improvement: an edit to persistence.rs does not recompile tool_execution/ or context/assembly.rs
  • All existing tests pass
  • CI LoC gate ratcheted from 80K → 30K for zeph-core
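For illustration, a LoC gate of the kind referenced in the last criterion could be as small as the sketch below. The issue does not say how the existing gate counts lines or what tool implements it, so this version is an assumption throughout: it counts every line of every .rs file under a directory (blank lines and comments included) and treats the limit as inclusive.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Recursively counts lines in all .rs files under `dir`.
// Assumption: every line counts, including blanks and comments.
pub fn count_rust_lines(dir: &Path) -> io::Result<usize> {
    let mut total = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            total += count_rust_lines(&path)?;
        } else if path.extension().and_then(|ext| ext.to_str()) == Some("rs") {
            total += fs::read_to_string(&path)?.lines().count();
        }
    }
    Ok(total)
}

// Returns true when the crate is at or under the limit.
// Ratcheting the gate means lowering `max_lines` (for example 80_000 to 30_000).
pub fn within_gate(dir: &Path, max_lines: usize) -> io::Result<bool> {
    Ok(count_rust_lines(dir)? <= max_lines)
}
```

A CI step would call within_gate on crates/zeph-core/src and fail the build when it returns false.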

Hand-off contract from Phase 1

Phase 1 (#3497) leaves the codebase in a state that makes Phase 2 mechanical:

  1. Pure helpers already in zeph-context — Phase 2 doesn't re-touch them
  2. zeph-agent-feedback already extracted — serves as the template for Phase 2 crate structure
  3. Sub-state structs each in their own file (state/{message,memory,…}.rs) — easier to make pub with a clear per-struct boundary
  4. tool_execution/ and persistence/ grouped into per-phase files — easier to move whole files when Agent decomposition lands
  5. CI LoC gate enforced at 80K — no regression possible during Phase 2 prep

SDD requirement

Before any Phase 2 implementation starts, create a spec entry under /specs/ for the Agent decomposition using /sdd. The spec must define:

  • The Services aggregator contract (what implementors must guarantee)
  • The TurnContext value type boundary
  • The AgentRuntime newtype interface
  • Acceptance tests for each new crate boundary

Blocked by

  • #3497 — Phase 1 file splits (must complete first)
  • P2-prereq-1 through P2-prereq-4 above

Labels: P2 (High value, medium complexity), refactor (Code refactoring without functional changes)
