Persist agent service state instead of keeping agents only in memory #5267

@bobbai00

Task Summary

Each agent in the agent service is a stateful in-memory object with no durable backing store.

In agent-service/src/server.ts the entire fleet lives in a process-local map:

const agentStore = new Map<string, TexeraAgent>();
let agentCounter = 0;

All agent state (the ReAct step tree, the current HEAD/checkout pointer, per-agent settings, delegate config, and the cached operator-result state) is held in TexeraAgent instance fields. The only thing persisted today is the workflow content, which is pushed back to the dashboard backend (persistWorkflow, debounced by 500 ms). The agent itself and its conversation/step history are not persisted.
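
To make the list of in-memory state above concrete, here is a minimal sketch of what a serializable snapshot of one agent could look like. Every type and field name below (AgentSnapshot, StepNode, headStepId, and so on) is a hypothetical placeholder, not the actual TexeraAgent field layout, and the flattened parent-pointer encoding of the step tree is only one possible choice.

```typescript
// Hypothetical snapshot shape; field names are illustrative, not taken from TexeraAgent.
interface StepNode {
  id: string;
  parentId: string | null; // null for the root of the step tree
  content: string;
}

interface AgentSnapshot {
  agentId: string;
  steps: StepNode[]; // ReAct step tree, flattened to a parent-pointer list
  headStepId: string | null; // current HEAD/checkout pointer
  settings: Record<string, unknown>; // per-agent settings
  delegateConfig: Record<string, unknown> | null;
  operatorResults: Record<string, unknown>; // cached operator-result state
}

// Serialize to a JSON string suitable for a database column or a Redis value.
function serializeSnapshot(snapshot: AgentSnapshot): string {
  return JSON.stringify(snapshot);
}

// Parse a stored snapshot and run a few basic consistency checks.
function deserializeSnapshot(raw: string): AgentSnapshot {
  const parsed = JSON.parse(raw) as AgentSnapshot | null;
  if (parsed === null || typeof parsed.agentId !== "string" || !Array.isArray(parsed.steps)) {
    throw new Error("Malformed agent snapshot");
  }
  // HEAD must point at a step that exists in the tree.
  const head = parsed.headStepId;
  if (head !== null && !parsed.steps.some((s) => s.id === head)) {
    throw new Error("HEAD points at an unknown step");
  }
  return parsed;
}
```

Keeping the snapshot plain JSON would let the same record be stored in either a relational column or a key-value store; whether the real agent state is fully JSON-serializable is an open question for this task.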

Consequences today (Before) and the intended behavior (After):

Before:  restart / crash / redeploy  ->  agentStore is empty  ->  all agents + history lost
         second replica              ->  separate agentStore   ->  agent not found
After:   restart / crash / redeploy  ->  agents rehydrated from store
         any replica                 ->  shared store          ->  agent reachable

This also blocks horizontal scaling: a request routed to a replica that does not hold the agent fails with "Agent not found".

Introduce a persistence layer (e.g. a database or Redis) for agent records and their step history, with load-on-demand / rehydration on startup, so that agents survive restarts and can be served by more than one replica.
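
One possible shape for the proposal above, sketched under assumptions: the process-local map becomes a cache in front of a shared store, and a lookup that misses the cache falls back to the store (load on demand). AgentRecord, AgentRecordStore, InMemoryRecordStore, and AgentRegistry are all hypothetical names introduced for illustration; the in-memory store stands in for a real database or Redis client.

```typescript
// Hypothetical persisted record; in practice it would carry the full agent state.
interface AgentRecord {
  agentId: string;
  stepHistory: string[];
}

// Hypothetical storage interface; a database or Redis client would sit behind it.
interface AgentRecordStore {
  load(agentId: string): Promise<AgentRecord | undefined>;
  save(record: AgentRecord): Promise<void>;
}

// In-memory stand-in for the shared store, for illustration only.
class InMemoryRecordStore implements AgentRecordStore {
  private readonly rows = new Map<string, string>();

  async load(agentId: string): Promise<AgentRecord | undefined> {
    const raw = this.rows.get(agentId);
    return raw === undefined ? undefined : (JSON.parse(raw) as AgentRecord);
  }

  async save(record: AgentRecord): Promise<void> {
    this.rows.set(record.agentId, JSON.stringify(record));
  }
}

// Per-replica registry: the local map is a cache in front of the shared store.
class AgentRegistry {
  private readonly cache = new Map<string, AgentRecord>();
  private readonly store: AgentRecordStore;

  constructor(store: AgentRecordStore) {
    this.store = store;
  }

  async create(agentId: string): Promise<AgentRecord> {
    const record: AgentRecord = { agentId, stepHistory: [] };
    await this.store.save(record);
    this.cache.set(agentId, record);
    return record;
  }

  // Load on demand: fall back to the store when this replica has not seen the agent.
  async get(agentId: string): Promise<AgentRecord | undefined> {
    const cached = this.cache.get(agentId);
    if (cached !== undefined) return cached;
    const loaded = await this.store.load(agentId);
    if (loaded !== undefined) this.cache.set(agentId, loaded);
    return loaded;
  }

  async appendStep(agentId: string, step: string): Promise<void> {
    const record = await this.get(agentId);
    if (record === undefined) throw new Error(`Agent not found: ${agentId}`);
    record.stepHistory.push(step);
    await this.store.save(record); // write-through so other replicas can rehydrate
  }
}
```

A second AgentRegistry built over the same store plays the role of a restarted process or another replica: get() on it returns the agent even though its local cache starts empty. The sketch does not address cache staleness when two replicas modify the same agent; that would need invalidation, per-agent ownership, or skipping the local cache.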

Task Type

  • Refactor / Cleanup
  • DevOps / Deployment / CI
  • Testing / QA
  • Documentation
  • Performance
  • Other
