Test your AI agents. Catch failures before your users do.
Quick Links: Quickstart · Assertions · Failure Types · Contributing
```shell
pip install "playagent[all]"
```

```python
from playagent import record
from playagent.adapters.openai import OpenAI

client = OpenAI()

@record
def run_agent(user_input: str):
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_input}],
    )
```

Recording a run produces a session trace like this:

```
session   sess_a1b2c3d4
agent     run_agent
started   2026-04-05 09:14:32
duration  3.24s
status    passed
──────────────────────── turn 1 ─────────────────────────
model     gpt-4o
latency   812ms
▸ user
  What's the weather in Lagos today?
▸ assistant
  I'll look that up for you.
⬡ tool call  get_weather
  location  "Lagos, NG"
  units     "celsius"
```
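Conceptually, a recording decorator like `@record` wraps the agent function and captures its inputs, output, and timing. The sketch below is a minimal plain-Python illustration of that idea, not PlayAgent's actual implementation (which persists to SQLite rather than an in-memory list):

```python
import functools
import time

# In-memory trace store for illustration only; PlayAgent itself
# writes sessions to SQLite on disk.
TRACES = []

def record(fn):
    """Toy recording decorator: captures args, result, and duration."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        TRACES.append({
            "agent": fn.__name__,
            "args": args,
            "kwargs": kwargs,
            "duration_s": time.perf_counter() - start,
            "result": result,
        })
        return result
    return wrapper

@record
def run_agent(user_input: str):
    # Stand-in for a real model call.
    return f"echo: {user_input}"

run_agent("What's the weather in Lagos today?")
```

The key design point is that recording is transparent: the wrapped function's return value is unchanged, so callers don't need to know tracing is happening.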
| Command | What it does |
|---|---|
| `playagent trace list` | Lists recent trace sessions. |
| `playagent trace view <trace_id>` | Shows turn-by-turn trace details. |
| `playagent report` | Shows aggregate pass/fail counts and failure breakdowns. |
| `playagent report --format json` | Emits report stats as JSON for CI pipelines. |
| `playagent --version` | Prints the installed PlayAgent version. |
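The JSON report is meant for CI consumption. The schema below is a hypothetical example (the real field names may differ; check the output of your installed version), but a CI gate script would follow this shape:

```python
import json

# Hypothetical output of `playagent report --format json`; the real
# field names may differ -- adjust to what your version actually emits.
raw = '{"sessions": 20, "passed": 19, "failed": 1}'

report = json.loads(raw)
pass_rate = report["passed"] / report["sessions"]
print(f"pass rate: {pass_rate:.0%}")

# Fail the CI job if the pass rate drops below a threshold.
if pass_rate < 0.95:
    raise SystemExit(f"pass rate {pass_rate:.0%} below 95% threshold")
```

In a pipeline you would pipe the command's stdout into a script like this and let the nonzero exit code fail the build.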
- You stay local-first. PlayAgent writes to SQLite on your machine; nothing is sent to a hosted dashboard by default.
- You can test behavior, not only outputs. Assertions check tool-call order, parameters, and call counts directly.
- If you already use LangSmith, PlayAgent is a lighter-weight option for local, SDK-level checks; if you need hosted traces, team collaboration, and observability dashboards, LangSmith is the better fit.
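Behavioral checks of the kind described above can be expressed as plain assertions over a recorded trace. This sketch uses a hand-built trace dict and a hypothetical second tool (`format_reply`) rather than PlayAgent's actual assertion API or storage format:

```python
# A simplified trace: the ordered tool calls recorded during one session.
# The shape and the `format_reply` tool are illustrative assumptions.
trace = {
    "tool_calls": [
        {"name": "get_weather", "args": {"location": "Lagos, NG", "units": "celsius"}},
        {"name": "format_reply", "args": {"style": "short"}},
    ]
}

call_names = [c["name"] for c in trace["tool_calls"]]

# Order: the weather lookup must happen before the reply is formatted.
assert call_names.index("get_weather") < call_names.index("format_reply")

# Parameters: the lookup used the expected location.
assert trace["tool_calls"][0]["args"]["location"] == "Lagos, NG"

# Call count: exactly one weather lookup per run.
assert call_names.count("get_weather") == 1
```

This is what "testing behavior, not only outputs" means in practice: the assertions target the sequence and arguments of tool calls, not just the final text the model returns.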
MIT