A Godot-native framework for LLM-driven NPCs with tools, memory, and goal-oriented behavior in small simulation scenes. Focused on local models, reproducible evaluations, and pluggable inference backends.
Maintained by JustInternetAI
Founded by Andrew Madison and Justin Madison
Agent Arena combines a high-performance Godot 4 C++ module with a Python-based training and evaluation harness to create a testbed for multi-agent AI research. Agents interact in deterministic sandbox environments using function-calling tool APIs, episodic memory, and retrieval-augmented generation (RAG).
- Godot C++ Module: Deterministic tick loop, event bus, navigation, sensors, stable replay logs
- Agent Runtime: Adapters for llama.cpp, TensorRT-LLM, vLLM with function-calling tool API
- Tool System: World querying (vision rays, inventories), pathfinding, crafting actions via JSON schemas
- Memory & RAG: Short-term scratchpad + long-term vector store with episode summaries
- Benchmark Scenes: 3 sandbox environments (foraging, crafting chain, team capture) with metrics
- Eval Harness: Seedable scenarios, scorecards, replays, unit tests for agent APIs
- Curriculum learning with increasing scene complexity
- Self-play RL fine-tuning (PPO on discrete action schemas)
- Multi-modal support with small vision encoders (CLIP-like) for visual observations
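To make the tool system more concrete, here is a minimal sketch of how a JSON-schema-described tool might be registered and how a function call emitted by a model might be dispatched. Everything in it is an assumption for illustration: the `register_tool` decorator, the `move_to` tool, and the `{"name": ..., "arguments": {...}}` call format are hypothetical and are not taken from the Agent Arena codebase.

```python
import json
from typing import Any, Callable

# Hypothetical registry: tool names, schemas, and handlers are illustrative only.
TOOLS: dict[str, dict[str, Any]] = {}


def register_tool(name: str, description: str, parameters: dict[str, Any]):
    """Decorator that stores a handler together with its JSON-schema-style spec."""
    def wrap(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = {
            "schema": {"name": name, "description": description, "parameters": parameters},
            "handler": fn,
        }
        return fn
    return wrap


@register_tool(
    name="move_to",
    description="Walk the agent to a target position.",
    parameters={
        "type": "object",
        "properties": {"x": {"type": "number"}, "y": {"type": "number"}},
        "required": ["x", "y"],
    },
)
def move_to(x: float, y: float) -> dict[str, Any]:
    # A real handler would forward the request to the simulation; this one echoes it.
    return {"ok": True, "target": [x, y]}


def dispatch(tool_call_json: str) -> dict[str, Any]:
    """Check and run one tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call.get("name"))
    if tool is None:
        return {"ok": False, "error": f"unknown tool: {call.get('name')}"}
    args = call.get("arguments", {})
    required = tool["schema"]["parameters"].get("required", [])
    missing = [key for key in required if key not in args]
    if missing:
        return {"ok": False, "error": f"missing arguments: {missing}"}
    return tool["handler"](**args)
```

Calling `dispatch('{"name": "move_to", "arguments": {"x": 1, "y": 2}}')` runs the handler, while an unknown tool name or a missing required argument yields an error dictionary instead of an exception, which is one way to feed mistakes back to the model.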
┌─────────────────────────────────────────────────────────┐
│                     Godot 4 Engine                      │
│  ┌──────────────────────────────────────────────────┐   │
│  │       Agent Arena C++ Module (GDExtension)       │   │
│  │  • Deterministic Simulation Loop                 │   │
│  │  • Event Bus & Sensors                           │   │
│  │  • Navigation & Pathfinding                      │   │
│  │  • Action Execution & World State                │   │
│  └──────────────────┬───────────────────────────────┘   │
└─────────────────────┼───────────────────────────────────┘
                      │ IPC / gRPC / HTTP
         ┌────────────┴────────────┐
         │  Python Agent Runtime   │
         │  • LLM Inference        │
         │  • Tool Dispatching     │
         │  • Memory Management    │
         │  • RAG Retrieval        │
         └─────────┬───────────────┘
                   │
      ┌────────────┼────────────┐
      │            │            │
  llama.cpp   TensorRT-LLM      vLLM
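The observe, decide, act cycle implied by the diagram, together with the determinism requirement, can be sketched in a few lines. This is a toy stand-in under stated assumptions: `run_episode` and `replay_digest` are hypothetical names, a seeded random policy replaces the LLM, the world is a single integer position, and JSON is used for the log in place of the msgpack replay format so the sketch needs only the standard library.

```python
import hashlib
import json
import random


def run_episode(seed: int, ticks: int = 5) -> list[dict]:
    """Run a toy fixed-step episode and return its replay log."""
    rng = random.Random(seed)  # private RNG so no global random state leaks in
    position = 0
    log = []
    for tick in range(ticks):
        # 1. The simulation side produces an observation for this tick.
        observation = {"tick": tick, "position": position}
        # 2. Stand-in for the agent runtime: a seeded random policy, not an LLM.
        action = rng.choice(["left", "right", "wait"])
        # 3. The simulation applies the action to the world state.
        position += {"left": -1, "right": 1, "wait": 0}[action]
        log.append({"observation": observation, "action": action})
    return log


def replay_digest(log: list[dict]) -> str:
    """Hash a canonical JSON encoding so two replays can be compared cheaply."""
    encoded = json.dumps(log, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

Running `run_episode(42)` twice yields identical logs, so comparing `replay_digest` values is a cheap regression check that a change has not broken reproducibility.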
- Game Engine: Godot 4 with GDExtension (C++)
- Languages: C++ (module), Python 3.11 (runtime/evals)
- LLM Backends: llama.cpp, TensorRT-LLM, vLLM
- ML Framework: PyTorch (optional for training)
- Vector Store: Milvus/FAISS for memory
- Serialization: msgpack for replay logs
- Config Management: Hydra
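As an illustration of the memory design (a short-term scratchpad plus a long-term store of episode summaries queried by similarity), the following sketch uses only the standard library. It is not the project's implementation: `AgentMemory`, `embed`, and `cosine` are hypothetical names, the bag-of-words vectors stand in for a learned embedding model, and the linear scan stands in for a Milvus or FAISS index.

```python
import math
from collections import Counter, deque


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a learned encoder."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class AgentMemory:
    def __init__(self, scratchpad_size: int = 4):
        # Short-term memory: a bounded buffer that drops the oldest event first.
        self.scratchpad: deque = deque(maxlen=scratchpad_size)
        # Long-term memory: (vector, summary) pairs, scanned linearly here.
        self.long_term: list[tuple[Counter, str]] = []

    def observe(self, event: str) -> None:
        self.scratchpad.append(event)

    def end_episode(self, summary: str) -> None:
        """Store an episode summary in long-term memory and reset the scratchpad."""
        self.long_term.append((embed(summary), summary))
        self.scratchpad.clear()

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        """Return the k stored summaries most similar to the query."""
        q = embed(query)
        ranked = sorted(self.long_term, key=lambda item: cosine(q, item[0]), reverse=True)
        return [summary for _, summary in ranked[:k]]
```

Retrieved summaries would typically be prepended to the prompt for the next decision, which is the retrieval step of RAG; swapping the linear scan for a vector index changes only `end_episode` and `retrieve`.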
agent-arena/
├── godot/ # Godot 4 C++ module
│ ├── src/ # C++ source files
│ ├── include/ # Header files
│ ├── bindings/ # GDExtension bindings
│ └── CMakeLists.txt
├── python/ # Python runtime and tools
│ ├── agent_runtime/ # Agent inference runtime
│ ├── memory/ # Memory and RAG systems
│ ├── tools/ # Tool implementations
│ ├── evals/ # Evaluation harness
│ └── backends/ # LLM backend adapters
├── scenes/ # Benchmark Godot scenes
│ ├── foraging/
│ ├── crafting_chain/
│ └── team_capture/
├── configs/ # Hydra configuration files
├── tests/ # Unit and integration tests
├── docs/ # Documentation
└── scripts/ # Build and utility scripts
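One plausible shape for the adapters under `python/backends/` is a small shared interface that each backend implements, so the rest of the runtime does not care which engine produces the text. The sketch below is an assumption, not the repository's actual API: `InferenceBackend`, `EchoBackend`, `BACKENDS`, and `make_backend` are hypothetical names, and the echo backend exists only so agent logic can be exercised without a model.

```python
from dataclasses import dataclass
from typing import Protocol


class InferenceBackend(Protocol):
    """Structural interface: any object with a matching generate() qualifies."""

    def generate(self, prompt: str, max_tokens: int = 128) -> str: ...


@dataclass
class EchoBackend:
    """Stand-in backend for tests; returns a canned reply instead of calling a model."""

    reply: str = '{"name": "wait", "arguments": {}}'

    def generate(self, prompt: str, max_tokens: int = 128) -> str:
        # max_tokens is accepted for interface compatibility but ignored here.
        return self.reply


# Hypothetical registry; real adapters for llama.cpp, TensorRT-LLM, or vLLM
# would be added here under their own keys.
BACKENDS: dict[str, type] = {"echo": EchoBackend}


def make_backend(name: str, **kwargs) -> InferenceBackend:
    """Look up a backend class by name and construct it with the given options."""
    if name not in BACKENDS:
        raise ValueError(f"unknown backend: {name!r}")
    return BACKENDS[name](**kwargs)
```

With this layout, a configuration value such as a backend name could select the adapter at startup, for example `make_backend("echo")` in unit tests.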
- Godot 4.2+ (with GDExtension support)
- CMake 3.20+
- C++17 compatible compiler (GCC 9+, Clang 10+, MSVC 2019+)
- Python 3.11+
- CUDA Toolkit 12+ (optional, for TensorRT-LLM)
- Clone the repository
    git clone https://github.com/JustInternetAI/AgentArena.git
    cd AgentArena
- Build the Godot module
    cd godot
    mkdir build && cd build
    cmake ..
    cmake --build .
- Set up Python environment
    cd ../../python
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install -r requirements.txt
- Run tests
    cd ..  # back to the repository root, where tests/ lives
    pytest tests/
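Before building, it can help to confirm that the basic prerequisites are present. The helper below is a hypothetical convenience script, not part of the repository: it checks only the Python version and that `git` and `cmake` are on `PATH`, and it does not verify the CMake, compiler, Godot, or CUDA versions.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)
REQUIRED_TOOLS = ("git", "cmake")


def check_environment(python_version=None, which=shutil.which) -> list[str]:
    """Return human-readable problems; an empty list means all checks passed.

    python_version and which are injectable so the function is easy to test.
    """
    version = tuple(python_version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems
```

Printing the result of `check_environment()` before running the build steps gives a quick list of anything missing.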
See docs/quickstart.md for a tutorial on creating your first agent-driven scene.
- Phase 1: Core infrastructure (deterministic sim, event bus, basic tools)
- Phase 2: Agent runtime with llama.cpp integration
- Phase 3: Memory system (scratchpad + vector store)
- Phase 4: First benchmark scene (foraging)
- Phase 5: Eval harness and metrics
- Phase 6: Additional backends (TensorRT-LLM, vLLM)
- Phase 7: Advanced features (curriculum learning, RL fine-tuning)
Contributions are welcome! This project bridges gamedev and AI research, making it accessible to both communities. Please read CONTRIBUTING.md for guidelines.
Apache License 2.0 - see LICENSE for details.
If you use Agent Arena in your research, please cite:
@software{agent_arena_2025,
title={Agent Arena: A Godot Framework for LLM-Driven Multi-Agent Simulation},
author={Madison, Andrew and Madison, Justin},
year={2025},
url={https://github.com/JustInternetAI/AgentArena}
}