feat(daemon): wire Session Manager + agent shim + RepoResolver + inbox messenger #70
…x messenger

PR #62 ("simplify session lifecycle and zellij runtime") deleted the Session Manager wiring from the daemon: every call to session.New() was removed and only the integration test still constructed one. This brings SM back, end to end: calling sm.Spawn() now launches a real Claude Code agent in a real git worktree in a real zellij session, and lifecycle nudges reach the agent via an inbox file.

Four new pieces:

- adapters/agent/portshim: bridges the richer adapters/agent.Agent interface (PR #65, @yyovil) onto the narrower ports.Agent the SM consumes. POSIX shell-quoting joins argv into the single string the zellij `sh -lc` wrapper expects.
- adapters/workspace/gitworktree/projectresolver: gitworktree.RepoResolver backed by project.Manager. Lives in its own subpackage so gitworktree stays free of the project import (and the cycle that would create).
- adapters/messenger/inbox: ports.AgentMessenger writing each message as <session-workspace>/.ao/inbox/<nano>_<hash>.md. Symlink-safe via os.Lstat on the .ao/inbox segments.
- daemon/session_wiring.go: assembles claudecode → portshim, gitworktree over projectresolver, inbox messenger over the sqlite store, and the SM itself. Reuses the existing zellij runtime / project manager / lcm singletons rather than constructing parallel copies.

Daemon-wide singleton sharing (the change of behavior under #62 / #65 + this PR):

- One zellij.Runtime instance services both the terminal mux and SM.Spawn. Two adapters would race on the same socket.
- One lifecycle.Manager instance services both the reaper (runtime liveness observations) and the SM (spawn/restore/kill writes). Two LCMs would split agent-nudge state.
- One project.Manager instance services both httpd (/api/v1/projects) and the gitworktree RepoResolver. Two stores would diverge on cached reads.
- One ports.AgentMessenger services both the LCM (PR-driven reactions: CI fail, review feedback, merge conflict) and the SM (Send).
- One *sqlite.Store services CDC, lifecycle, SM, and the inbox workspace lookup. Already the case; preserved.

Also promotes the duplicated "agentSessionId" metadata key literal in the claudecode and codex adapters to a single agent.MetadataKeyAgentSessionID constant in adapters/agent, which the portshim now uses to populate Session.Metadata for the underlying adapter's GetRestoreCommand.

What this PR does NOT do (covered by follow-ups γ/δ):

- No HTTP routes for SM (POST /sessions, etc.): γ
- No `ao session new` CLI: γ
- No SCM poller: δ
- No codex agent wiring (claude-code only): later
- No zellij send-keys pane-ping; the agent reads its inbox on demand

Tests:

- portshim: 15 table-driven cases (shell quoting, env, restore propagation, error fall-through, safe-string short-circuit).
- projectresolver: 4 cases (interface satisfaction, happy path, unknown project, degraded project).
- inbox: 9 cases (interface satisfaction, write, dir create, two-distinct, unknown session, empty workspace path, symlinked inbox refused, empty message, filename shape).
- daemon/wiring_test: SM stack constructed + sharing singletons + messenger reaches the same store via SessionMetadata.WorkspacePath end to end.

go test -race / go vet / gofmt clean. The pre-existing TestSessionStreamsRealZellijPane integration test fails on this host because $TMPDIR > 103 chars (zellij IPC socket limit); it also fails on origin/staging without these changes.

Branched from origin/staging (= main + #65) so claudecode is available; PR base must be staging.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
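
The argv-joining step attributed to the portshim above can be illustrated with a minimal sketch. This is not the shim's actual code: the function names (`isSafe`, `quoteArg`, `joinArgv`) and the exact set of "safe" characters are assumptions chosen for illustration. It shows the standard POSIX approach of wrapping each argument in single quotes, escaping embedded single quotes as `'\''`, and passing already-safe strings through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// isSafe reports whether s can be handed to a POSIX shell without quoting.
// The allowed character set is an assumption for this sketch.
func isSafe(s string) bool {
	if s == "" {
		return false // an empty argument must be written as ''
	}
	for _, r := range s {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
		case strings.ContainsRune("@%+=:,./_-", r):
		default:
			return false
		}
	}
	return true
}

// quoteArg single-quotes s, rewriting each embedded ' as '\''.
// Safe strings short-circuit and are returned unchanged.
func quoteArg(s string) string {
	if isSafe(s) {
		return s
	}
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// joinArgv (hypothetical name) joins argv into one string suitable
// for passing as the command argument of `sh -lc`.
func joinArgv(argv []string) string {
	quoted := make([]string, len(argv))
	for i, a := range argv {
		quoted[i] = quoteArg(a)
	}
	return strings.Join(quoted, " ")
}

func main() {
	fmt.Println(joinArgv([]string{"echo", "it's", "a b", ""}))
}
```

Single quotes are used because nothing inside them is interpreted by the shell, so only the quote character itself needs special handling.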
Greptile Summary
This PR re-wires the Session Manager end-to-end after it was removed in #62, adding four new components: a
Confidence Score: 3/5
The wiring is structurally sound and the singleton contract is correctly enforced, but the portshim silently returns an empty launch command on adapter error, which can cause the Session Manager to record a successful spawn while no agent process is actually running.

The portshim swallows GetLaunchCommand errors and returns an empty string. The SM passes that string directly to runtime.Create without checking it, so if the claude binary is missing, the daemon creates a zellij session running sh -lc with an empty command string, marks it spawned, and the caller gets back a valid domain.Session with no agent behind it. The reaper will eventually clean up the zombie, but no error is surfaced at spawn time. The permission-mode zeroing and context.Background() propagation are known port-interface limitations that do not introduce incorrect state.

backend/internal/adapters/agent/portshim/shim.go warrants the closest review, specifically how GetLaunchCommand and GetRestoreCommand handle errors and the absence of permission-mode propagation.
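
One way to surface the failure described in this review at spawn time is a caller-side guard that rejects an empty launch command before any runtime session is created. The sketch below is only an illustration under assumptions: the `launcher` interface is a simplified stand-in for the narrow agent port (the real method takes a config argument), and `checkedLaunchCommand`, `errEmptyLaunchCommand`, and `fakeAgent` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errEmptyLaunchCommand is a hypothetical sentinel for this sketch.
var errEmptyLaunchCommand = errors.New("agent returned an empty launch command")

// launcher is a simplified stand-in for the narrow agent port, which
// can only report failure by returning an empty string.
type launcher interface {
	GetLaunchCommand() string
}

// checkedLaunchCommand turns the "empty string on error" convention into
// an explicit error, so the caller can abort before creating a session.
func checkedLaunchCommand(a launcher) (string, error) {
	cmd := a.GetLaunchCommand()
	if strings.TrimSpace(cmd) == "" {
		return "", errEmptyLaunchCommand
	}
	return cmd, nil
}

// fakeAgent is a test double that returns a fixed command.
type fakeAgent struct{ cmd string }

func (f fakeAgent) GetLaunchCommand() string { return f.cmd }

func main() {
	for _, a := range []fakeAgent{{cmd: "claude"}, {cmd: ""}} {
		cmd, err := checkedLaunchCommand(a)
		fmt.Printf("cmd=%q err=%v\n", cmd, err)
	}
}
```

Widening the port so it returns an error directly would be the more thorough fix; the guard shown here only works within the existing string-only contract.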
Sequence Diagram
sequenceDiagram
participant D as daemon.Run
participant LC as startLifecycle
participant SS as buildSessionStack
participant PM as portshim.Shim
participant CC as claudecode.Plugin
participant SM as session.Manager
participant WS as gitworktree.Workspace
participant RT as zellij.Runtime
participant IB as inbox.Messenger
participant ST as sqlite.Store
D->>LC: startLifecycle(ctx, store, runtime, messenger)
LC-->>D: "lifecycleStack{lcm, reaperDone}"
D->>SS: buildSessionStack(cfg, store, runtime, projects, lcm, messenger)
SS->>PM: portshim.New(claudecode.New())
SS->>WS: gitworktree.New(projectresolver.New(projects))
SS->>SM: "session.New(Deps{...})"
SS-->>D: "sessionStack{sm, workspace, messenger}"
Note over D: _ = ss (routes in follow-up PR)
D->>RT: srv.Run(ctx)
Note over SM,RT: On Spawn()
SM->>PM: GetLaunchCommand(AgentConfig)
PM->>CC: GetLaunchCommand(context.Background, LaunchConfig)
CC-->>PM: argv or err
PM-->>SM: shell-quoted string or empty string on err
SM->>WS: Create(WorkspaceConfig)
WS-->>SM: "WorkspaceInfo{Path, Branch}"
SM->>RT: "Create(RuntimeConfig{LaunchCommand})"
RT-->>SM: RuntimeHandle
Note over SM,IB: On Send()
SM->>IB: Send(ctx, sessionID, message)
IB->>ST: WorkspacePath(ctx, id)
ST-->>IB: workspace path
IB->>IB: ensureRealDir(.ao, .ao/inbox)
IB->>IB: WriteFile(nano_hash.md)
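
The Send() path at the end of the diagram (check `.ao` and `.ao/inbox`, then write one file per message) can be sketched roughly as follows. This is an illustration under assumptions, not the inbox adapter itself: the hash is taken to be a short SHA-256 prefix of the message (the source only says `<hash>`), `writeInbox` and `run` are hypothetical names, and the 0750/0600 permissions follow the values mentioned in the later lint fix.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// ensureRealDir creates dir if it is missing and refuses it when the
// existing entry is a symlink or anything other than a directory.
// os.Lstat is used so that a symlink is inspected, not followed.
func ensureRealDir(dir string) error {
	info, err := os.Lstat(dir)
	if errors.Is(err, os.ErrNotExist) {
		return os.Mkdir(dir, 0o750)
	}
	if err != nil {
		return err
	}
	if info.Mode()&os.ModeSymlink != 0 || !info.IsDir() {
		return fmt.Errorf("%s is not a real directory", dir)
	}
	return nil
}

// writeInbox (hypothetical) writes message to
// <workspace>/.ao/inbox/<nano>_<hash>.md and returns the file path.
func writeInbox(workspace, message string) (string, error) {
	aoDir := filepath.Join(workspace, ".ao")
	inboxDir := filepath.Join(aoDir, "inbox")
	for _, dir := range []string{aoDir, inboxDir} {
		if err := ensureRealDir(dir); err != nil {
			return "", err
		}
	}
	sum := sha256.Sum256([]byte(message)) // assumed hash choice
	name := fmt.Sprintf("%d_%s.md", time.Now().UnixNano(), hex.EncodeToString(sum[:4]))
	path := filepath.Join(inboxDir, name)
	return path, os.WriteFile(path, []byte(message), 0o600)
}

// run exercises writeInbox against a throwaway workspace directory.
func run() error {
	ws, err := os.MkdirTemp("", "workspace")
	if err != nil {
		return err
	}
	defer os.RemoveAll(ws)
	path, err := writeInbox(ws, "CI failed: please check the latest run")
	if err != nil {
		return err
	}
	fmt.Println("wrote", filepath.Base(path))
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Checking each path segment separately matters because a symlinked `.ao` would otherwise redirect the write outside the worktree even if `inbox` itself is a plain directory.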
* feat(ao): `ao spawn` CLI + POST /api/v1/sessions route

* fix(ao): address PR γ review + clear inherited lint debt

  Review fixes (PR #71):
  - spawn CLI now uses a dedicated 90 s timeout (90 s > server's 60 s DefaultRequestTimeout) via context.WithTimeout, and stops sharing deps.HTTPClient: that client is sized for fast /healthz/shutdown probes (2 s) and was preempting the synchronous Spawn long before the daemon could finish provisioning a worktree + zellij pane + agent.
  - Harden writeSpawnError so a *project.Error with a non-client Kind ("internal", "not_implemented", or anything unknown) falls through to the generic 500 SPAWN_FAILED envelope instead of passing the project error's Code/Message verbatim to the client. Adds three subtests that pin down the opacity contract.

  Lint debt cleared (inherited from PRs #65/#70):
  - Add doc comments on every exported symbol in the agent / claudecode / codex / adapters-registry packages (revive: exported)
  - gosec G306/G301: inbox file/dir perms 0644→0600 and 0755→0750
  - gosec G703 (path traversal via taint): excluded globally with the same rationale as G304: adapter paths are daemon-config/worktree-derived, not user input
  - gocritic emptyStringTest: len(strings.TrimSpace(...)) > 0 → != ""
  - gocritic paramTypeCombine: combine adjacent same-type params
  - errcheck: wrap deferred os.Remove(tmpName) in a closure
  - prealloc: preallocate cmd slices on the resume paths

* fix(lint): lift loop condition in scm poller test (staticcheck QF1006)

  Inherited from PR #72 merging into staging after this branch opened. golangci-lint v2.12.2 → 0 issues.
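
The timeout fix in the first review item can be sketched as below. This is a hedged illustration rather than the CLI's code: `newSpawnRequest`, `spawnClient`, `run`, and the base URL are hypothetical, while the 90 s and 60 s values come from the commit message. The point is that the spawn call gets its own deadline via `context.WithTimeout` instead of inheriting a short client-level timeout meant for probes.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"os"
	"time"
)

const (
	serverTimeout = 60 * time.Second // server-side request timeout per the commit message
	spawnTimeout  = 90 * time.Second // client budget, deliberately longer than the server's
)

// spawnClient has no client-level Timeout; the per-request context
// deadline is what bounds the call. (Hypothetical variable.)
var spawnClient = &http.Client{}

// newSpawnRequest builds the POST with its own deadline. The caller must
// invoke the returned cancel func once the response has been handled.
func newSpawnRequest(ctx context.Context, baseURL string) (*http.Request, context.CancelFunc, error) {
	ctx, cancel := context.WithTimeout(ctx, spawnTimeout)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/api/v1/sessions", nil)
	if err != nil {
		cancel()
		return nil, nil, err
	}
	return req, cancel, nil
}

// run builds a request (without sending it) and checks its deadline.
func run() error {
	// The address is a placeholder; no request is actually sent here.
	req, cancel, err := newSpawnRequest(context.Background(), "http://127.0.0.1:8080")
	if err != nil {
		return err
	}
	defer cancel()
	deadline, ok := req.Context().Deadline()
	if !ok {
		return errors.New("spawn request has no deadline")
	}
	if time.Until(deadline) <= serverTimeout {
		return errors.New("client deadline would fire before the server's")
	}
	fmt.Println("spawn request deadline outlasts the server timeout")
	return nil
}

func main() {
	_ = spawnClient // would be used as spawnClient.Do(req) against a live daemon
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Keeping the client budget above the server's means a slow spawn ends with the server's own timeout response rather than a client-side cancellation.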
…s wiring

Main moved structurally via PR #67 (session→session_manager rename + service layer for read-model assembly). Staging had the live SM daemon wiring (PR #70) but for the old session package shape. This merge adopts main's shape and retargets staging's wiring on top.

Structural decisions:
- Keep main's httpd/api.go shape with controllers.SessionService (drop staging's session.Spawner).
- Keep main's rich httpd/controllers/sessions.go (list/get/spawn/restore/kill/send/rename/spawnOrchestrator) and its tests.
- Keep main's SCM provider/client/test code: the pagination guard, the merge_state_status REST field, and the application/vnd.github+json Accept header are intentional safety + correctness fixes. Staging's text/plain Accept tweak would 406 the /actions/jobs/{id}/logs endpoint.
- Delete session.Spawner: service.Session is the new controller contract and duck-types into controllers.SessionService.

Wiring retarget (daemon/session_wiring.go, daemon/daemon.go):
- buildSessionStack now constructs *session_manager.Manager and wraps it in service.NewSession(sm, store); the stack returns *service.Session instead of a bare *Manager.
- daemon.Run passes ss.svc into httpd.APIDeps{Sessions: ...}.

Preserved from staging:
- aa-46 default-branch fix in session_manager/manager.go::Spawn (branch defaults to "ao/<sessionID>" when SpawnConfig.Branch is empty) plus its two pinning tests (TestSpawn_DefaultsBranchPerSession_WhenUnset, TestSpawn_HonorsExplicitBranch).
- ao spawn CLI + POST /api/v1/sessions route, ao-here.sh helper, agent adapters (claudecode + portshim), inbox messenger, projectresolver, gitworktree daemon wiring, SCM poller, find_branch_pr, ao spawn route fix.

Import migration: internal/session → internal/session_manager updated in daemon/session_wiring.go and adapters/agent/portshim/shim_test.go.

Verification: go build, go vet, gofmt all clean. go test -race ./... reports 406 pass / 1 fail (TestSessionStreamsRealZellijPane: pre-existing macOS $TMPDIR > 103-char zellij IPC socket length issue, not introduced by merge).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
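
The default-branch rule preserved from staging is small enough to show directly. The sketch below assumes a trimmed-down `spawnConfig` type and a hypothetical helper `branchFor`; only the `Branch` field and the "ao/<sessionID>" default are taken from the commit message.

```go
package main

import "fmt"

// spawnConfig is a trimmed stand-in for SpawnConfig; only Branch is
// taken from the commit message, the rest of the real type is omitted.
type spawnConfig struct {
	Branch string
}

// branchFor (hypothetical) returns the explicit branch when one is set
// and otherwise derives a per-session default of the form ao/<sessionID>.
func branchFor(cfg spawnConfig, sessionID string) string {
	if cfg.Branch != "" {
		return cfg.Branch
	}
	return "ao/" + sessionID
}

func main() {
	fmt.Println(branchFor(spawnConfig{}, "s-123"))
	fmt.Println(branchFor(spawnConfig{Branch: "feature/x"}, "s-123"))
}
```

Deriving the default from the session ID gives each session a distinct branch name when none is supplied, which is the behaviour the two pinning tests are named for.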
Summary
PR β in our SM-revival planning. PR #62 (merged 2026-06-01) deleted the Session Manager wiring from the daemon: every call to `session.New()` was removed; only the integration test still constructs one. This brings SM back end-to-end: calling `sm.Spawn()` now launches a real Claude Code agent in a real git worktree in a real zellij session, and lifecycle nudges actually reach the agent via an inbox file.

Branches from `origin/staging` (= main + #65) so @yyovil's `claudecode` adapter is available. PR base is `staging`.

What's new
Four new pieces:
- `adapters/agent/portshim`: bridges the richer `adapters/agent.Agent` interface (PR #65) onto the narrower `ports.Agent` the SM consumes. POSIX shell-quoting joins argv into the single string the zellij `sh -lc` wrapper expects.
- `adapters/workspace/gitworktree/projectresolver`: `gitworktree.RepoResolver` backed by `project.Manager`. Lives in its own subpackage so `gitworktree` stays free of the `project` import.
- `adapters/messenger/inbox`: `ports.AgentMessenger` writing each message as `<session-workspace>/.ao/inbox/<nano>_<hash>.md`. Symlink-safe via `os.Lstat` on the `.ao` / `.ao/inbox` segments.
- `daemon/session_wiring.go`: assembles claudecode → portshim, gitworktree over projectresolver, inbox messenger over the sqlite store, and the SM itself. Reuses the existing zellij runtime / project manager / lcm singletons.

Also promotes the duplicated `"agentSessionId"` metadata key in `claudecode` and `codex` to a single `agent.MetadataKeyAgentSessionID` constant so the portshim has a stable place to populate `Session.Metadata` for the underlying adapter.

Singleton sharing (the contract this PR enforces)
- `zellij.Runtime` services both terminal mux and SM.Spawn (two adapters would race on the same socket).
- `*lifecycle.Manager` services both the reaper and the SM (two LCMs would split agent-nudge state).
- `project.Manager` services both httpd and the gitworktree RepoResolver.
- `ports.AgentMessenger` services both the LCM (PR-driven reactions) and the SM (Send).

Out of scope (covered by follow-ups)
- `ao session new` CLI (γ)

References
- `claudecode` adapter added by #65 (@yyovil).

Test plan
- `go test -race ./...`: 323/324 pass (the 1 failure is `TestSessionStreamsRealZellijPane`, a pre-existing host-env issue: `$TMPDIR` > 103 chars exceeds zellij's IPC socket limit; also fails on `origin/staging` without these changes)
- `go vet ./...` clean
- `gofmt -l backend/` clean
- `goimports -l` clean

🤖 Generated with Claude Code