
ThunderAgent

Fast, simple and program-aware agentic inference system.

| Wiki | Documentation | Blog | Paper |


About

ThunderAgent is a fast and easy-to-use library for agentic inference and rollout.

ThunderAgent is fast with:

  • A program-aware scheduler that increases the KV-cache hit rate and reduces memory imbalance across nodes, improving agentic inference throughput by 1.5–3.6× across multiple agentic workflows (see the conceptual sketch after this list).
  • Tool-call lifecycle management with automatic resource reclamation, for more stable and reliable long-running rollouts.
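
To make the scheduling idea concrete, here is a minimal conceptual sketch of program-aware routing (an illustration only, not ThunderAgent's actual scheduler): requests carrying the same program_id are pinned to one backend, so successive turns of an agent reuse that backend's KV cache, while new programs land on the least-loaded backend to keep memory balanced.

# Conceptual sketch only, not ThunderAgent's real implementation:
# pin each program_id to a backend so successive turns of the same
# agent hit a warm prefix/KV cache; place new programs on the
# least-loaded backend to reduce memory imbalance across nodes.
class ProgramAwareRouter:
    def __init__(self, backends):
        self.backends = backends              # e.g. ["http://node1:8000", ...]
        self.assignment = {}                  # program_id -> backend URL
        self.load = {b: 0 for b in backends}  # in-flight requests per backend

    def route(self, program_id):
        if program_id not in self.assignment:
            self.assignment[program_id] = min(self.backends, key=self.load.get)
        backend = self.assignment[program_id]
        self.load[backend] += 1
        return backend

    def finish(self, program_id):
        # Called when a request completes; dropping a program's entry
        # once it ends lets the backend's cache memory be reclaimed.
        self.load[self.assignment[program_id]] -= 1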

ThunderAgent is flexible and easy to use with:

  • OpenAI-compatible API passthrough that requires only one change: adding a program_id field to each outgoing request.

  • Support for multiple inference backends: vLLM and SGLang.

  • Multiple agentic RL training examples, such as a Search-R1 agent with slime and mini-swe-agent with SkyRL.

  • Real-time visualization of agentic trajectory metrics including total tokens, tool-use time, and per-program profiling.

Overview

ThunderAgent sits between agent clients and the infrastructure layer as an agentic workflow scheduler. On one hand, it improves inference throughput of vLLM/SGLang across multiple GPU nodes through program-aware scheduling. On the other hand, it provides a unified tool management interface for resources like Docker containers and remote APIs.
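
For example, a multi-node deployment would run one inference server per GPU node and point ThunderAgent at all of them. The sketch below assumes the --backends flag accepts several URLs (suggested by its plural name; check the documentation for the exact syntax), and the node1/node2 hostnames are placeholders:

# one vLLM server per GPU node
vllm serve Qwen/Qwen3-32B --port 8000   # on node1
vllm serve Qwen/Qwen3-32B --port 8000   # on node2

# ThunderAgent schedules programs across both backends (assumed multi-URL syntax)
thunderagent --backend-type vllm --backends http://node1:8000 http://node2:8000 --port 9000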

ThunderAgent Architecture

Inference & Evaluation Results

ThunderAgent improves vLLM throughput by 1.5–3.6× across diverse agentic workloads including SWE-Agent, OpenHands, and ToolOrchestra.

Inference Pipeline Results

Getting Started

Install ThunderAgent from source:

git clone git@github.com:HaoKang-Timmy/ThunderAgent.git
cd ThunderAgent
pip install -e .

How do you use it? Choose a backend, for example vLLM.

uv pip install vllm --torch-backend=auto # install vllm

vllm serve Qwen/Qwen3-32B --port 8000 # serve a model

thunderagent --backend-type vllm --backends http://localhost:8000 --port 9000 --metrics --profile # launch ThunderAgent; send requests through port 9000
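
Because ThunderAgent is an OpenAI-compatible passthrough, you can smoke-test the deployment with a plain chat-completions request. This sketch assumes the standard OpenAI endpoint path that vLLM also serves, with program_id passed as a top-level body field (the same field that extra_body injects in the Python client below):

curl http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen3-32B",
        "messages": [{"role": "user", "content": "Hello"}],
        "program_id": "smoke-test"
      }'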

How do you embed ThunderAgent in your own agentic workflow?

from openai import OpenAI

# Point the client at the ThunderAgent endpoint launched above (port 9000)
client = OpenAI(base_url="http://localhost:9000/v1", api_key="EMPTY")

# Original OpenAI sender
client.chat.completions.create(
    model=self.config.model_name,
    messages=messages,
)

# ThunderAgent OpenAI sender: the only change is passing program_id via extra_body
extra_body = {"program_id": "unique_id"}
# If your agentic workflow uses Docker containers:
# extra_body["docker_ids"] = ["docker_id1", "docker_id2", ...]
client.chat.completions.create(
    model=self.config.model_name,
    messages=messages,
    extra_body=extra_body,
)
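
The program_id groups every request from one agent trajectory, which is what lets the scheduler keep a program's turns on the same node with a warm KV cache; the optional docker_ids tie a program's containers to its lifecycle, so their resources can be reclaimed automatically when the program finishes.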

Contributing

We welcome and value contributions and collaborations. To contribute, please open a pull request.

Citation

If you use ThunderAgent for your research, please cite our paper:

Contact Us

For enterprises interested in adopting or deploying ThunderAgent at scale, including technical consulting, sponsorship opportunities, or partnership inquiries, please contact us at hkang342@gatech.edu or Simran@together.ai.
