feat(backend): Lifecycle Manager + Session Manager lane#2
Merged
Conversation
Contract-first boundary for the Lifecycle Manager + Session Manager lane.
Pure shapes only — types, interfaces, and the display-status derivation —
so adil (SCM poller), Tom (persistence), and aditi (API) can review and
build against a stable boundary before any behaviour lands.
domain/
- CanonicalSessionLifecycle: the only persisted state (session/pr/runtime
sub-states), with Activity + Detecting sub-states added as decider inputs
that must survive between observations.
- DeriveLegacyStatus: the sole producer of the derived display status
(never persisted), with 11 table tests.
ports/
- inbound: LifecycleManager (Apply* pipeline, per-session serialised) and
SessionManager.
- outbound: LifecycleStore (Tom), Notifier, AgentMessenger, and the
Runtime/Agent/Workspace plugin ports (co-owned with the agents lane).
- facts: SCMFacts / RuntimeFacts / ActivitySignal DTOs.
decide/ pure-core signatures + I/O types; bodies stubbed for the next PR.
Folds in four design-review fixes (documented in-code, pending team confirm):
1. Activity + Detecting persisted so the pure decider has memory across calls.
2. Per-session serialisation documented; LifecyclePatch.ExpectedVersion
offers optimistic-locking as an alternative.
3. LifecyclePatch is a sparse pointer-field merge-patch (+ ClearDetecting).
4. SCMFacts gains Fetched (failed fetch != "PR closed") and per-comment
IsBot (bot vs human route to different reactions).
go build / go vet / go test all pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR introduces the initial contract-first Go packages for the backend “Lifecycle Manager (LCM) + Session Manager (SM)” lane, defining the canonical lifecycle domain model, inbound/outbound ports, and a single pure status-derivation function (with tests) to stabilize boundaries for parallel implementation work.
Changes:
- Added
domaincanonical lifecycle/state enums,Sessionread-model, andDeriveLegacyStatus(plus table-style tests). - Added
portspackage with inbound interfaces (LCM/SM), outbound interfaces (store/notifier/runtime/agent/workspace), and fact DTOs (SCM/runtime/activity/spawn/kill). - Added
domain/decidepackage with pure-core I/O shapes and stubbed function bodies for a follow-up PR.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| backend/internal/ports/outbound.go | Outbound ports for persistence, notifications, and runtime/agent/workspace plugins |
| backend/internal/ports/inbound.go | Inbound contracts for LifecycleManager and SessionManager |
| backend/internal/ports/facts.go | DTOs (“facts”) crossing the LCM/SM boundary (SCM, runtime, activity, spawn/kill) |
| backend/internal/domain/status.go | Derived display status enum + DeriveLegacyStatus |
| backend/internal/domain/status_test.go | Tests for DeriveLegacyStatus |
| backend/internal/domain/session.go | API read-model Session definition |
| backend/internal/domain/lifecycle.go | Canonical persisted lifecycle model + enums/substates |
| backend/internal/domain/decide/decide.go | Decide-core input/output shapes and tuning constants (stubbed bodies) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Review feedback on PR #2: - Add CanonicalSessionLifecycle.Revision (monotonic write counter) distinct from the schema Version; LifecyclePatch.ExpectedVersion -> ExpectedRevision now compares it, so optimistic locking actually works. - LifecycleStore.List returns []domain.SessionRecord (persistence shape, no derived status); add SessionRecord and make Session embed it. Keeps the Session Manager the single producer of the derived display status. - SpawnOutcome.RuntimeHandle is now the structured ports.RuntimeHandle, not a string, so Destroy/SendMessage get the handle without ad-hoc encoding. - Agent.IsProcessRunning -> ProbeProcess returning ProcessProbe (liveness), not domain.ActivityState; the name no longer implies a boolean. - Document LifecyclePatch Detecting vs ClearDetecting precedence (clear wins). - Correct the DeriveLegacyStatus doc: hard session states outrank PR facts; "PR dominates" applies only to the soft idle/working states. Implementation was already correct (matches canonical AO); only the comment overstated it. - Replace personal-name attributions in package/interface comments with role-based terms (SCM poller / persistence adapter / API layer). go build / go vet / go test all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- go.yml: gofmt check, build, vet, and race-enabled tests for backend/, triggered on backend changes and pushes to main. - gitleaks.yml: secret scanning on PRs and main using gitleaks-action v1 (license-free; v2 requires GITLEAKS_LICENSE for org repos). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8 tasks
…tests Replace the stubbed deciders in domain/decide with real, total, side-effect-free implementations: - ResolveProbeDecision: kill-intent short-circuits to terminal; failed probes and probe disagreement route to detecting; only runtime-dead + process-dead + no-recent-activity concludes killed. - ResolveOpenPRDecision: the PR pipeline ladder (ci_failing > changes_requested > approved+mergeable > approved > review_pending > idle-beyond > open). - ResolveTerminalPRStateDecision: merged -> idle/merged_waiting_decision, closed -> idle. - CreateDetectingDecision: anti-flap quarantine — unchanged-evidence counter with StartedAt preserved across the episode so the duration cap is a real wall-clock safety net; escalates to stuck at 3 ticks or 5m. - HashEvidence: strips timestamps/epochs and collapses whitespace before hashing so restamped-but-unchanged signals compare equal. Table tests cover every branch (100% statement coverage), including a consistency check that the open-PR ladder's display Status matches DeriveLegacyStatus over the canonical state it emits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6 tasks
…ency test Address PR review (aa-1): - ResolveOpenPRDecision now keys MERGEABLE on Mergeable alone (checked before Approved). Mergeability is the authoritative merge gate, so a PR on a no-required-review repo no longer falls through to PR_OPEN. The approved+mergeable and approved-only cases are unchanged. - Broaden the derive-consistency test to cover the probe and terminal deciders too, not just the open-PR ladder. - Document the HashEvidence epoch-stripping regex's breadth. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ting helper Address PR review (aa-1): - Move the decider input/output type definitions (LifecycleDecision, ProbeInput, ProcessLiveness, OpenPRInput, DetectingInput) out of the execution file into a dedicated types.go, leaving decide.go for behaviour. - Expand the detecting() helper doc to spell out that it adapts a probe verdict into the shared CreateDetectingDecision anti-flap path so each probe branch doesn't re-implement the quarantine counter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Clarify the timestamp-stripping block flagged in review: spell out what each of the three regexes matches (full ISO/RFC3339 datetime, bare time-of-day, bare unix epoch) and why order matters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… contract Address Copilot review: - ResolveOpenPRDecision now sets a stable, timestamp-free Evidence summary "<condition> #<num> <url>" for every ladder outcome, consuming the previously-unused OpenPRInput.Number/URL identity inputs and making PR decisions traceable in logs. Covered by TestResolveOpenPRDecisionEvidence. - Document LifecycleDecision's zero-value contract: an empty PRState/PRReason means "this decider does not address PR — leave unchanged", not PRNone. The LCM must map empty PR fields to a nil LifecyclePatch.PR; writing PRNone on a probe tick would clobber a live PR. (Pointers were considered but the empty sentinel is distinguishable from every valid state and consistent with the codebase's value-enum style; LifecyclePatch already owns nil-means-unchanged.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(decide): pure DECIDE core + exhaustive truth-table tests
Implements ports.LifecycleManager as a synchronous observe->decide->persist reducer. Every entrypoint runs the shared pipeline under a per-session lock: load canonical -> run the matching pure decider -> diff into a sparse merge-patch -> persist. Never polls, never writes the display status. - ApplyRuntimeObservation -> probe decider; always writes the runtime axis. - ApplySCMObservation -> open-PR / terminal-PR deciders (failed fetch is a no-op: failed probe != "no PR"). Open PRs write only the PR axis. - ApplyActivitySignal -> updates the activity axis + maps onto the session axis; only valid-confidence signals are authoritative. - OnSpawnCompleted -> runtime alive + handles to metadata; session stays not_started (display: spawning). - OnKillRequested -> SM's explicit terminal-write authority. - TickEscalations -> no-op stub (reaction/escalation engine is split B). Composition rule (#1): liveness owns the runtime + death axis; activity owns the working/idle/waiting axis. A healthy probe verdict writes the session axis only to recover a liveness-owned state, so it never clobbers an activity-owned needs_input/blocked. Activity is the mirror: it stays off the death axis. Detecting clear (#3): a non-detecting probe verdict clears stale detecting memory so the next probe reads no phantom Prior. Built/tested against in-memory fakes (LifecycleStore with full merge-patch + ExpectedRevision, recording Notifier/AgentMessenger). Per-session serialisation verified under -race. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- shouldWriteSessionRuntime: never resurrect a terminal session; an observation may refresh the runtime axis but must touch neither the session axis nor the detecting memory (gated in ApplyRuntimeObservation). - OnSpawnCompleted: error on an unseeded session instead of fabricating a partial record (SM must seed first — a missing seed is a contract violation). - OnKillRequested: no-op on an unknown/already-gone session (benign race) instead of fabricating a terminal record. - keyedMutex: reference-count entries and evict on last release so the lock map stays bounded in a long-running daemon. - runtimeSubstateFromFacts: map RuntimeProbeIndeterminate to RuntimeUnknown with a neutral reason, distinct from the probe_error of a failed probe. Adds tests for terminal non-resurrection, unseeded spawn-completed error, and unknown-session kill no-op. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 tasks
Address Harshit's PR #5 review (approve w/ design confirms + polish): - #1 (design decision): a valid activity signal is proof of life, so it now resolves a detecting session — writes the activity-mapped session state and clears the quarantine memory. Scoped to detecting only; a liveness-escalated stuck stays the probe pipeline's to resolve. Terminal still never reopens. - #2: document why a merged/closed PR parks the session axis even over an activity-owned needs_input/blocked (a merge is a milestone), unlike the open-PR path that defers to activity. - #3: map plain idle activity to a neutral session reason instead of the misleading research_complete (kept for ready, which implies completion). - #6: cover all three kill kinds (manual/cleanup/error), the open-PR review branches (changes_requested/mergeable/review_pending), and the neutral idle reason. Coverage 86.5% -> 88.6%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(lifecycle): LCM Apply* pipeline (split A)
…t B) Add the ACT half of the LCM: map persisted status transitions to reactions (send-to-agent / notify / auto-merge) and drive escalation. - reactions.go: the §4.2 default reaction table, reactionEventFor (mirrors DeriveLegacyStatus for the ACT layer), in-memory per-(session,reaction) escalation trackers, the react() dispatch chokepoint, and a real TickEscalations for duration-based escalations the synchronous LCM can't wake itself for. auto-merge action exists but is off by default; bugbot-comments/merge-conflicts are configured but dormant (no decide-core producer yet). - manager.go: mutate now returns a transition; each Apply* path fires the mapped reaction after persist via the single synchronous react() seam. OnKillRequested intentionally does not react (explicit kill != inferred event). Split-A load->decide->diff->persist behavior is unchanged. ci-failed budget is persistent across fail->pending->fail oscillation; non-persistent trackers reset when the status leaves the triggering state. Escalation silences further auto-dispatch until the condition clears. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
review) Address review finding #1: the persistent ci-failed tracker leaked and could stale-silence a future regression. It was only cleared when leaving the ci-failed reaction AND incidentOver held at that moment — so a recovery to another open-PR state (ci-failed -> approved -> merged) never cleared it. - react() now clears ALL of a session's trackers when the state REACHED is incident-over (PR resolved / session terminal) OR a genuine recovery (approved/mergeable, which the open-PR ladder guarantees means CI is no longer failing). Keyed on the state reached, not the one left, since the recovery transition is typically review_pending->approved (empty beforeKey). - Persistent ci-failed still survives the ambiguous review_pending limbo, so fail->pending->fail keeps one shared budget (§4.2). - Document the out-of-lock react() dispatch caveat for the daemon integration step (review #2) and the intentionally-skipped agent-stuck 10m threshold. Tests: re-arm after a genuine recovery (regression re-nudges, not silenced); all session trackers cleared once the incident is over. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…escalation (Copilot review) - OnKillRequested now clears the session's escalation trackers after a successful kill, so a later duration-based TickEscalations can't emit reaction.escalated for a dead session (dispatch is still skipped). - sendToAgent rolls back the attempt (and firstAttemptAt when it set it) on a messenger.Send error, so undelivered messages don't march a reaction toward escalation — honoring "send failures retry next tick" (§4.3). - Duration escalation now uses an inclusive boundary (>=) in both shouldEscalate and TickEscalations, so a 30m reaction escalates at exactly 30m instead of waiting for the next tick. Tests: kill clears trackers + no post-kill escalation; repeated failed delivery never escalates; duration escalation fires at exactly escalateAfter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat(lifecycle): ACT layer — reaction table + escalation engine (split B)
…store/cleanup) Implements ports.SessionManager against fakes for the outbound ports. The SM is the explicit-mutation half of the lane: it drives Runtime/Agent/Workspace, seeds the initial lifecycle, and routes outcomes to the LCM (OnSpawnCompleted / OnKillRequested). It never derives observed state and is the single producer of the derived display status (attached on read, never persisted). - Spawn: Workspace.Create -> Runtime.Create (AO_* identity env) -> Seed -> OnSpawnCompleted, with eager rollback of completed steps on failure. - Kill: OnKillRequested first -> Runtime.Destroy -> Workspace.Destroy, honoring the worktree-remove safety (refusal surfaced, never forced). - List/Get: derive status via DeriveLegacyStatus. Send: via AgentMessenger. Restore: re-seed (reopen) + relaunch via GetRestoreCommand. Cleanup: reclaim terminal sessions, skip worktrees holding uncommitted work. Store-contract additions (co-owned with Tom's persistence layer, flagged for review): LifecycleStore.Seed (explicit create-with-identity; OnSpawnCompleted requires a seeded record) and LifecycleStore.Get (single record-with-identity read; Load is lifecycle-only). Lifecycle test fake updated to satisfy both. Tests route through the real LCM Manager (wrapped to record call order). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ntime on post-create failure (PR #7 review) - Restore now fails early with a clear error if MetaAgentSessionID is missing, rather than emitting an ambiguous "resume nothing" launch command (no stored prompt means a fresh-launch fallback isn't possible). - On a post-runtime-create failure (reopen patch or OnSpawnCompleted), best-effort destroy the newly created runtime (never the workspace, which holds prior work) so we don't strand a live process while parking the session terminal. - Added a test for Restore with a missing agent session id: errors early, touches no workspace/runtime, leaves the session terminal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e rollback (review follow-ups) Adds an injectable OnSpawnCompleted failure to the recording LCM and two tests: - Spawn: when OnSpawnCompleted fails, the seeded record is parked terminal/errored (via OnKillRequested(KillError)) and runtime+workspace are torn down. - Restore: when OnSpawnCompleted fails post-create, the new runtime is destroyed while the workspace is preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8 tasks
feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup
Add docs/ for newcomers: an index, an architecture deep-dive (the OBSERVE→DECIDE→ACT loop, the canonical state model, the package layout, every component, and the load-bearing invariants), and a status/roadmap (what's done PR-by-PR, what's left, the integration to-dos + carried-forward items, the open cross-lane contract questions, and where to plug in). Link them from the README. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address PR #2 Copilot review comments on the merged LCM+SM lane: - session: validate runtime handle + workspace path before Kill/Cleanup teardown; refuse (ErrIncompleteTeardownMetadata) or skip rather than hand empty args to a real adapter's Destroy (unsafe delete). - session: reject Restore unless the session is terminal (ErrNotRestorable) so a live session can't spawn a duplicate runtime/workspace. - ports: document SpawnConfig.OpenTerminal as reserved/not yet honored. - lifecycle: remove the unread reactionConfig.auto field; note approved-and-green is notify-only (human decides to merge). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(session): harden teardown/restore safety + drop dead reaction flag
Collaborator
Author
|
All 4 review comments addressed in #8 (f03c7c8), now merged into this branch — resolving the threads. Verified on the current tip (CI build-test + scan green; local gofmt/build/vet/
|
This was referenced May 27, 2026
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This branch is the complete Lifecycle Manager (LCM) + Session Manager (SM) lane of the backend rewrite — the deterministic core that supervises agent sessions and keeps one true status per session. It was built contract-first on a single integration branch; the sub-PRs below merged into this branch, and it now lands on
mainas one cohesive unit.Full documentation lives in
docs/: start withdocs/architecture.md, and seedocs/status.mdfor what's done vs. what's left.What it delivers
A per-session OBSERVE → DECIDE → ACT loop where OBSERVE lives outside the LCM (observers call in) and the LCM is a synchronous reducer:
domain/— the canonical state model (CanonicalSessionLifecycle, persisted) andDeriveLegacyStatus, the sole producer of the display status (derived on read, never stored).domain/decide/— the pure, total, deterministic decision core (liveness reconciliation, the PR ladder, the anti-flapdetectingquarantine), table-tested to 100%.ports/— the stabilized boundary: inboundLifecycleManager/SessionManager, outboundLifecycleStore/Notifier/AgentMessenger/Runtime/Agent/Workspace, and theSCMFacts/RuntimeFacts/ActivitySignalDTOs.lifecycle/— the LCM: theApply*pipeline (load → decide → diff → persist), per-session serialization, the liveness-vs-activity composition rules, and the reaction table + escalation engine (TickEscalations).session/— the SM: spawn / kill / restore / cleanup / list, with eager rollback and the worktree-remove safety.Folded-in PRs (merged into this branch)
Key invariants
detecting) writes inferred terminal states; the SM's explicit-kill path goes throughOnKillRequested.Status
Implemented and tested entirely behind in-memory fakes —
gofmt/go build/go vet/go test -racegreen acrossdomain,decide,lifecycle, andsession. The SM tests drive the real LCM for spawn/kill round-trips.Not in this lane (integration phase, tracked in #3): swapping fakes for real adapters (persistence/CDC, SCM poller, runtime/agent/workspace plugins, notifier, API), plus three carried-forward items — the
react()out-of-lock dispatch ordering, the unusedExpectedRevisionCAS for multi-writer/CDC, and a real implementation of the store's newSeed/Get. Seedocs/status.mdfor the full hand-off.Test plan
cd backend && gofmt -l . && go build ./... && go vet ./... && go test -race ./...all greenmainas one lane (sub-PRs were reviewed against the integration branch)🤖 Generated with Claude Code