Concurrent completion benchmarks for the autoagents framework alongside GraphBit, LangChain, LangGraph, CrewAI, PydanticAI, Rig (Rust) and LlamaIndex agents.
All runners read their workload settings from benchmark.yaml (or a path provided via BENCH_CONFIG). Update that file once to change the request count, concurrency, model, or prompt template; the settings are shared across all languages.
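As an illustration, a minimal benchmark.yaml might look like the sketch below. The field names are assumptions for illustration only; consult the shipped benchmark.yaml for the actual schema.

```yaml
# Hypothetical field names -- check the real benchmark.yaml for the schema.
requests: 50          # total completion requests per run
concurrency: 50       # how many requests are issued in parallel
model: gpt-4o-mini    # any model the OPENAI_API_KEY can call
prompt_template: "What is the average trip duration?"
```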
All runners require an OPENAI_API_KEY that can call the configured models.
The benchmark suite runs in two modes:
- Tool mode: agents call the trip-data tool.
- LLM-only mode: the average is precomputed and the model only formats the answer.
Results are written to:
- benchmark_results_tool.json (tool mode)
- benchmark_results_llm.json (LLM-only mode)
The benchmarks are written in Rust and Python. The Rust benchmarks use the autoagents and Rig frameworks, while the Python benchmarks use GraphBit, LangChain, LangGraph, CrewAI, PydanticAI, and LlamaIndex. The benchmarks measure execution speed, CPU usage, memory footprint, and determinism. If you feel the benchmarks are inaccurate or have any suggestions, please open an issue or submit a pull request.
The benchmark below runs 50 concurrent requests through a ReAct-style agent that processes a Parquet file to calculate the average trip duration.
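The tool's core computation can be sketched in Python with pandas. The file name and the "duration" column are assumptions for illustration; the benchmark's actual trip-data schema may differ.

```python
import pandas as pd

def average_trip_duration(df: pd.DataFrame, column: str = "duration") -> float:
    """Return the mean of the duration column, as the trip-data tool would."""
    # The column name "duration" is an assumption about the trip-data schema.
    return float(df[column].mean())

# In tool mode the agent would first load the Parquet file, e.g.:
#   df = pd.read_parquet("trips.parquet")  # requires pyarrow or fastparquet
```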
export OPENAI_API_KEY=sk-your-key
cargo run --release -- autoagents
cargo run --release -- rig

export OPENAI_API_KEY=sk-your-key
# Using uv (recommended) or your preferred Python runner
uv run main.py pydantic --model tool
cargo run --release -- all

uv run python plot_benchmarks.py --input benchmark_results_tool.json
uv run python plot_benchmarks.py --input benchmark_results_llm.json

For a web-based UI, check out benchmark-dashboard.
Python files are in the _src folder and Rust files are in src.
