admin-brave-f7b5e0f5 stuck in "terminating" when agent replaced
admin-brave-fa96fa51 stuck in "pending" waiting for specific agent
All sessions require manual intervention when agent changes
Proposed Architecture: Agent Pool Model
Design Principles
Logical Agent Groups: Sessions bind to agent pools, not specific instances
Platform-Based Routing: Route by platform/region, not agent ID
Dynamic Assignment: Agents claim sessions at runtime
Graceful Failover: Sessions automatically reassign on agent failure
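The principles above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `Registry`, `Agent`, and `Session` types, the `Healthy` flag, and the "first healthy agent wins" policy are all assumptions made for the example. For brevity the registry assigns sessions centrally, whereas the proposal has agents claim sessions themselves.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Agent is a hypothetical view of one registered agent instance.
type Agent struct {
	ID      string
	Healthy bool
}

// Session carries only the routing fields relevant to this sketch.
type Session struct {
	ID              string
	AgentPool       string
	AssignedAgentID string // empty models NULL (no current handler)
}

// ErrNoHealthyAgent is returned when a pool has no usable agent.
var ErrNoHealthyAgent = errors.New("no healthy agent in pool")

// Registry maps a pool ID to its member agents (hypothetical type).
type Registry struct {
	mu    sync.Mutex
	pools map[string][]*Agent
}

func NewRegistry() *Registry {
	return &Registry{pools: make(map[string][]*Agent)}
}

// Register adds a healthy agent to the named pool.
func (r *Registry) Register(pool, agentID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.pools[pool] = append(r.pools[pool], &Agent{ID: agentID, Healthy: true})
}

// MarkUnhealthy flags an agent as failed or replaced.
func (r *Registry) MarkUnhealthy(pool, agentID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for _, a := range r.pools[pool] {
		if a.ID == agentID {
			a.Healthy = false
		}
	}
}

// Assign keeps the current handler if it is still healthy; otherwise it
// rebinds the session to the first healthy agent in the session's pool.
func (r *Registry) Assign(s *Session) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	var fallback *Agent
	for _, a := range r.pools[s.AgentPool] {
		if !a.Healthy {
			continue
		}
		if a.ID == s.AssignedAgentID {
			return nil // current handler is still fine
		}
		if fallback == nil {
			fallback = a
		}
	}
	if fallback == nil {
		s.AssignedAgentID = ""
		return ErrNoHealthyAgent
	}
	s.AssignedAgentID = fallback.ID
	return nil
}

func main() {
	reg := NewRegistry()
	reg.Register("k8s-prod", "agent-a")
	reg.Register("k8s-prod", "agent-b")

	s := &Session{ID: "admin-brave-f7b5e0f5", AgentPool: "k8s-prod"}
	if err := reg.Assign(s); err != nil {
		fmt.Println("assign failed:", err)
		return
	}
	fmt.Println("assigned to", s.AssignedAgentID)

	// Simulate the agent being replaced, then fail over within the pool.
	reg.MarkUnhealthy("k8s-prod", "agent-a")
	if err := reg.Assign(s); err != nil {
		fmt.Println("reassign failed:", err)
		return
	}
	fmt.Println("reassigned to", s.AssignedAgentID)
}
```

Because the session stores only the pool name as its durable binding, replacing an agent changes `AssignedAgentID` but never leaves the session waiting on an instance that no longer exists.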
Database Schema Changes
Before (Current):
sessions:
  agent_id: "k8s-prod-cluster-abc123"  -- Specific instance
After (Proposed):
sessions:
  platform: "kubernetes"  -- Platform type
  region: "us-east-1"  -- Deployment region
  agent_pool: "k8s-prod"  -- Logical pool
  assigned_agent_id: "..."  -- Current handler (nullable, dynamic)
  platform_resource_id: "..."  -- K8s pod name, Docker container ID, etc.
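One way the proposed row might be modeled in application code is shown below. This is a sketch under assumptions: the `SessionRow` type, its Go field types, and the `NeedsAssignment` and `RoutingKey` helpers are hypothetical and are not taken from the codebase. A nil pointer stands in for a NULL `assigned_agent_id`.

```go
package main

import "fmt"

// SessionRow mirrors the proposed sessions columns.
// Field types are assumptions made for this example.
type SessionRow struct {
	Platform           string  // e.g. "kubernetes"
	Region             string  // e.g. "us-east-1"
	AgentPool          string  // e.g. "k8s-prod"
	AssignedAgentID    *string // nil models NULL: no current handler
	PlatformResourceID string  // K8s pod name, Docker container ID, etc.
}

// NeedsAssignment reports whether no agent currently handles the session.
func (s SessionRow) NeedsAssignment() bool {
	return s.AssignedAgentID == nil
}

// RoutingKey is a hypothetical lookup key built from the routing columns,
// so that dispatch never depends on a specific agent ID.
func (s SessionRow) RoutingKey() string {
	return s.Platform + "/" + s.Region + "/" + s.AgentPool
}

func main() {
	row := SessionRow{
		Platform:  "kubernetes",
		Region:    "us-east-1",
		AgentPool: "k8s-prod",
	}
	fmt.Println(row.RoutingKey(), row.NeedsAssignment())

	// An agent takes the session: the handler is recorded, routing is unchanged.
	agent := "k8s-prod-cluster-abc123"
	row.AssignedAgentID = &agent
	fmt.Println(row.RoutingKey(), row.NeedsAssignment())
}
```

The routing key stays the same before and after assignment, which is the point of moving the durable binding from `agent_id` to the platform, region, and pool columns.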
Implementation Plan
Phase 1: Database Schema (v2.1.0)
Add platform, region, agent_pool columns to sessions
Problem
Sessions are currently tightly coupled to the specific agent that created them. This creates operational fragility and prevents key features:
Current Architecture:
Impact of Tight Coupling:
Agent Failure = Session Loss
No Session Migration
Blocks Auto-Scaling (Issue #234: Agent auto-registration for Kubernetes auto-scaling)
Resource Waste
Observed Issues:
admin-brave-f7b5e0f5 stuck in "terminating" when agent replaced
admin-brave-fa96fa51 stuck in "pending" waiting for specific agent
Proposed Architecture: Agent Pool Model
Implementation Plan
Phase 1: Database Schema (v2.1.0)
platform, region, agent_pool columns to sessions
assigned_agent_id (nullable, replaces agent_id)
platform_resource_id for K8s pod name / Docker container ID
agent_id
Phase 2: Agent Pool Registry (v2.1.0)
map[poolID][]Agent
Phase 3: Dynamic Session Assignment (v2.1.0)
assigned_agent_id on successful dispatch
Phase 4: Platform Resource Tracking (v2.1.0)
Benefits
Related Issues
admin-brave-fa96fa51 stuck in pending
Priority: P1 | Timeline: v2.1.0