Implement LLM request batching for concurrent agent decisions #22

@justinmadison

Description

Problem

Currently, each agent makes its own LLM call during AgentRuntime.process_tick(). Although agents are processed concurrently via ThreadPoolExecutor, their LLM requests are issued individually rather than submitted as one batch, which underutilizes vLLM's continuous batching.

Current Flow

# python/agent_runtime/runtime.py:87
for agent_id, agent in self.agents.items():
    task = asyncio.create_task(self._agent_decide(agent))
    # Each agent makes individual LLM call

Proposed Solution

Implement batch LLM generation in AgentRuntime.process_tick():

  1. Collect all agent contexts first:

    contexts = {}
    for agent_id, agent in self.agents.items():
        contexts[agent_id] = agent._build_context()
  2. Send all prompts to vLLM together:

    if isinstance(self.backend, VLLMBackend):
        # Pass the contexts collected in step 1 (keyed by agent_id)
        results = await self.backend.generate_batch(contexts)
    else:
        # Fallback to concurrent individual calls
        results = await self._concurrent_llm_calls(contexts)
  3. Parse results into actions:

    actions = {}
    for agent_id, result in results.items():
        # Look up the agent that owns this result before parsing
        actions[agent_id] = self.agents[agent_id]._parse_action(result)
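
The three steps above can be put together in a minimal, self-contained sketch. Everything here is a stand-in: FakeAgent, FakeBatchBackend, FakeSingleBackend, and the free function process_tick() are hypothetical names used only for illustration, and generate_batch() is assumed to accept and return dicts keyed by agent ID. The real AgentRuntime and VLLMBackend may differ.

```python
import asyncio


class FakeAgent:
    """Hypothetical agent exposing the two helpers used in the steps above."""

    def __init__(self, name: str) -> None:
        self.name = name

    def _build_context(self) -> str:
        return f"{self.name}: what do you do next?"

    def _parse_action(self, result: str) -> dict:
        return {"agent": self.name, "action": result.strip()}


class FakeBatchBackend:
    """Stand-in for a backend with the proposed generate_batch() method."""

    def __init__(self) -> None:
        self.calls = 0

    async def generate_batch(self, prompts: dict) -> dict:
        self.calls += 1  # one backend call covers every agent in the tick
        return {agent_id: " wait " for agent_id in prompts}


class FakeSingleBackend:
    """Stand-in for a backend that only supports one prompt per call."""

    def __init__(self) -> None:
        self.calls = 0

    async def generate(self, prompt: str) -> str:
        self.calls += 1
        return " wait "


async def process_tick(agents: dict, backend) -> dict:
    # 1. Collect every agent's context before touching the backend.
    contexts = {aid: agent._build_context() for aid, agent in agents.items()}

    # 2. One batched call if supported, otherwise concurrent single calls.
    if isinstance(backend, FakeBatchBackend):
        results = await backend.generate_batch(contexts)
    else:
        outputs = await asyncio.gather(
            *(backend.generate(prompt) for prompt in contexts.values())
        )
        results = dict(zip(contexts, outputs))

    # 3. Parse each result with the agent that owns it.
    return {aid: agents[aid]._parse_action(text) for aid, text in results.items()}
```

Keying contexts, results, and actions by agent ID keeps each output tied to its agent regardless of the order in which the backend returns completions.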

Expected Impact

  • 50-70% faster LLM inference with 4+ agents
  • Better utilization of vLLM's continuous batching and PagedAttention
  • With 4 agents: roughly 1.5x the latency of a single agent, instead of 4x
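
As a quick consistency check on these estimates, the "1.5x instead of 4x" figure can be converted into a percentage saving. This takes the issue's own numbers at face value and is not a measurement; relative_saving() is a hypothetical helper written for this example.

```python
def relative_saving(batched_cost: float, unbatched_cost: float) -> float:
    """Fraction of time saved when a tick costs batched_cost instead of unbatched_cost."""
    return 1.0 - batched_cost / unbatched_cost


# Costs are in units of "one single-agent LLM call", as estimated in the issue.
saving = relative_saving(batched_cost=1.5, unbatched_cost=4.0)
print(f"{saving:.1%}")  # prints 62.5%
```

The result, 62.5%, falls inside the 50-70% range quoted in the first bullet.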

Files to Modify

  • python/agent_runtime/runtime.py - Implement batch processing in process_tick()
  • python/backends/vllm_backend.py - Add generate_batch() method
  • python/backends/base.py - Add abstract generate_batch() interface
  • python/backends/llama_cpp_backend.py - Implement fallback (no native batching)
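
One possible shape for the interface and the fallback is sketched below. The class names BaseBackend and SequentialFallbackBackend, the generate() method, and the dict-in/dict-out signature are all assumptions made for illustration; the existing classes in python/backends/ may look different. The fallback is assumed to process one prompt at a time, since it has no native batching.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseBackend(ABC):
    """Hypothetical base class standing in for python/backends/base.py."""

    @abstractmethod
    async def generate(self, prompt: str) -> str:
        """Generate a completion for a single prompt."""

    @abstractmethod
    async def generate_batch(self, prompts: dict) -> dict:
        """Generate completions for many prompts, keyed by agent ID."""


class SequentialFallbackBackend(BaseBackend):
    """Stand-in for a backend without native batching."""

    async def generate(self, prompt: str) -> str:
        # Placeholder for a real model call.
        return prompt.upper()

    async def generate_batch(self, prompts: dict) -> dict:
        # No native batching: run the prompts one after another.
        return {key: await self.generate(text) for key, text in prompts.items()}
```

Making generate_batch() abstract forces every backend to state how it handles batches, so the runtime can call it unconditionally instead of checking the backend type.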

Priority

HIGH - This is the primary bottleneck in multi-agent scenarios

Status

Done
