ARES (Agentic Research and Evaluation Suite) is an RL-first framework for training and evaluating agents.
Prerequisites:
- Python 3.12 or higher
- uv - Fast Python package installer and resolver
To install uv, follow the instructions at https://docs.astral.sh/uv/getting-started/installation/
For now, we recommend running ARES locally from this directory:
uv sync --all-groups
and you're ready to get started.
Alternatively, include it as a dependency in your own project's pyproject.toml using a relative path. PyPI installation will be coming soon.
ARES requires API keys for various services. To get started:
- Copy the example environment file:
  cp .env.example .env
- Edit .env and fill in your API keys (see .env.example for required and optional variables)
ARES environments use an async version of the dm_env spec. Below is an example snippet of what this might look like in your code.
By default, containers are run in Daytona, so you will need to:
- Create a Daytona account at https://www.daytona.io
- Create a .env with DAYTONA_API_KEY=... and DAYTONA_API_URL=... set, using an API key generated from your account.
This example also uses Martian for API inference. Similarly, you will need to:
- Create an account at https://app.withmartian.com
- Add CHAT_COMPLETION_API_KEY=... to your .env with a Martian API key.
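Before launching anything, it can help to confirm that the three settings named above are actually present. The following is a minimal sketch using only the standard library. The helper names `read_env_file` and `missing_keys` are hypothetical and not part of ARES, and the parser handles only plain KEY=VALUE lines. Whether ARES itself reads .env or the process environment is not stated here, so the check accepts a value from either place.

```python
import os
from pathlib import Path

# Variable names taken from the setup steps above.
REQUIRED_KEYS = ("DAYTONA_API_KEY", "DAYTONA_API_URL", "CHAT_COMPLETION_API_KEY")


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env_values: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are empty or absent in both the file and os.environ."""
    return [k for k in required if not (env_values.get(k) or os.environ.get(k))]


if __name__ == "__main__":
    missing = missing_keys(read_env_file())
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

Running this from the directory that contains your .env prints the names of any settings that still need a value.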
Then, you can run the following example:
import asyncio

from ares.code_agents import llms
from ares.environments import swebench_env


async def main():
    agent = llms.ChatCompletionCompatibleLLMClient(model="openai/gpt-4.1-mini")
    all_tasks = swebench_env.swebench_verified_tasks()
    tasks = [all_tasks[0]]  # Run on only one task for now.

    async with swebench_env.SweBenchEnv(tasks=tasks) as env:
        ts = await env.reset()
        while not ts.last():
            # The agent takes the observation (LLM Request)
            # and returns an action (LLM Response).
            print(f"Observation: {ts.observation}")
            action = await agent(ts.observation)
            # The environment takes the action (LLM Response)
            # and returns the next LLM request, reward, and discount.
            print(f"Action: {action}")
            ts = await env.step(action)


if __name__ == "__main__":
    asyncio.run(main())