fix(cloud-agent): use execution heartbeat for idle cleanup instead of kiloServerLastActivity #3176
… kiloServerLastActivity

The kiloServerLastActivity field was only set at session prepare and wrapper start time and was never updated during an active execution. After 15 minutes, cleanupIdleKiloServer would see a stale timestamp and SIGTERM the container, even though the execution was actively running and heartbeating every 30s.

Replace kiloServerLastActivity with execution-level data: if there's an active execution, skip cleanup immediately; otherwise derive the last activity timestamp from the latest execution's lastHeartbeat, completedAt, or startedAt (in that priority order).

Remove the now-unnecessary recordKiloServerActivity() RPC method and all its call sites.
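The decision described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only the field names `lastHeartbeat`, `completedAt`, `startedAt` and the 15-minute timeout come from the description. The `ExecutionRecord` type, the `status` values, the function names, the decision labels, and the `>=` comparison are all assumptions made for the example.

```typescript
// Hypothetical execution shape. Only lastHeartbeat, completedAt and startedAt
// are named in the PR; the rest is assumed for illustration.
interface ExecutionRecord {
  status: "running" | "completed" | "interrupted";
  startedAt?: number; // epoch milliseconds
  lastHeartbeat?: number; // epoch milliseconds
  completedAt?: number; // epoch milliseconds
}

// 15 minutes, matching idleTimeoutMs=900000 from the logs.
const IDLE_TIMEOUT_MS = 15 * 60 * 1000;

type CleanupDecision =
  | "skip-active"
  | "skip-no-executions"
  | "skip-not-idle"
  | "cleanup";

// Priority order from the PR: lastHeartbeat, then completedAt, then startedAt.
function lastActivityOf(execution: ExecutionRecord): number | undefined {
  return execution.lastHeartbeat ?? execution.completedAt ?? execution.startedAt;
}

// Decide whether the idle alarm should stop the server (hypothetical helper).
function decideIdleCleanup(
  active: ExecutionRecord | undefined,
  latest: ExecutionRecord | undefined,
  now: number,
): CleanupDecision {
  // An active execution always wins: never clean up underneath it.
  if (active) return "skip-active";

  // No execution data means there is no timestamp to compare against.
  const last = latest ? lastActivityOf(latest) : undefined;
  if (last === undefined) return "skip-no-executions";

  // Assumed comparison: idle once the elapsed time reaches the timeout.
  return now - last >= IDLE_TIMEOUT_MS ? "cleanup" : "skip-not-idle";
}
```

With this shape, a running execution short-circuits the check regardless of any timestamp, which is the property the stale `kiloServerLastActivity` value could not provide.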
Code Review Summary
Status: No Issues Found | Recommendation: Merge
Overview
This is a clean, well-scoped fix. The root cause (stale …
Notes
One behavioral change worth being aware of (not a bug): Sessions that have been prepared (kilo server started) but have zero executions will now return early from …
Logic correctness of the new approach: …
All removed symbols ( …
Files Reviewed (5 files)
Reviewed by claude-sonnet-4.6 · 646,262 tokens
Summary
The `kiloServerLastActivity` field in session metadata was only set at session prepare and wrapper start time and was never updated during an active execution. After 15 minutes (the idle timeout), `cleanupIdleKiloServer` would see a stale timestamp and SIGTERM the container, even though the execution was actively running and heartbeating every 30s. This caused false "Container shutdown: SIGTERM" interruptions for any session running longer than 15 minutes.

Replace `kiloServerLastActivity` with execution-level data:
- If there is an active execution, skip cleanup immediately
- Otherwise, derive the last activity timestamp from the latest execution's `lastHeartbeat`, `completedAt`, or `startedAt` (in that priority order)

This removes the need for the separate `kiloServerLastActivity` field and the `recordKiloServerActivity()` RPC method entirely, along with all its call sites in the router and orchestrator.

Verification
Triggered a Cloud Agent session on `Kilo-Org/kilocode` and confirmed via Axiom logs that:
- … `ok` with zero exceptions
- … `cleanupIdleKiloServer` at the 15-minute mark (`idleMs=959890`, `idleTimeoutMs=900000`)
- `No wrapper found to stop` log confirmed the container was already gone by the time the alarm tried to stop it
- … `interrupted` and reason `Container shutdown: SIGTERM`

Visual Changes
N/A
Reviewer Notes
The `kiloServerLastActivity` field is being removed from the session metadata schema. Existing sessions in DO storage that have this field set will simply have it ignored: the zod schema uses `.optional()` and no code path reads it anymore. No migration is needed.