Tip
ACE is the open-source engine behind Kayba. If you'd rather have the whole loop managed for you, from failure investigation to fixes shipped as PRs, get a demo.
AI agents don't learn from experience. They repeat the same mistakes every session, forget what worked, and ignore what failed. ACE is the open-source engine that adds a persistent learning loop. It also powers Kayba, the managed service that does this for your production agents automatically.
The agent claims a seahorse emoji exists. ACE reflects on the error, and on the next attempt the agent responds correctly, without human intervention.
| Metric | Result | Context |
|---|---|---|
| 2x consistency | Doubles pass^4 on Tau2 airline benchmark | 15 learned strategies, no reward signals |
| 49% token reduction | Browser automation costs cut nearly in half | 10-run learning curve |
| $1.50 learning cost | Claude Code translated 14k lines to TypeScript | Zero build errors, all tests passing |
```bash
uv add ace-framework
```

Option A, interactive setup (recommended):

```bash
ace setup  # Walks you through model selection, API keys, and connection validation
```

Option B, manual configuration:

```bash
export OPENAI_API_KEY="your-key"  # or ANTHROPIC_API_KEY, or any of 100+ supported providers
```

Then use it:

```python
from ace import ACELiteLLM

agent = ACELiteLLM(model="gpt-4o-mini")

# First attempt: the agent may hallucinate
answer = agent.ask("Is there a seahorse emoji?")

# Feed a correction: ACE extracts a strategy and updates the Skillbook
agent.learn_from_feedback("There is no seahorse emoji in Unicode.")

# Subsequent calls benefit from the learned strategy
answer = agent.ask("Is there a seahorse emoji?")

# Inspect what the agent has learned
print(agent.get_strategies())
```

No fine-tuning, no training data, no vector database.
-> Quick Start Guide | -> Setup Guide | -> Hosted API: Where Do Traces Come From?
ACE maintains a Skillbook, a persistent collection of strategies that evolves with every task. Three specialized roles manage the learning loop:
| Role | Responsibility |
|---|---|
| Agent | Executes tasks, enhanced with Skillbook strategies |
| Reflector | Analyzes execution traces to extract what worked and what failed |
| SkillManager | Curates the Skillbook: adds, refines, and removes strategies |
The Recursive Reflector is the key innovation: instead of summarizing traces in a single pass, it writes and executes Python code in a sandboxed environment to programmatically search for patterns, isolate errors, and iterate until it finds actionable insights.
```mermaid
flowchart LR
    Skillbook[(Skillbook)]
    Start([Task]) --> Agent[Agent]
    Agent <--> Environment[Environment]
    Environment -- Trace --> Reflector[Reflector]
    Reflector --> SkillManager[SkillManager]
    SkillManager -- Updates --> Skillbook
    Skillbook -. Strategies .-> Agent
```
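The loop in the diagram can be illustrated with a toy sketch that uses no LLM at all. Everything here is a hypothetical stand-in rather than ACE's API: `Skillbook`, `Trace`, `agent`, `environment`, `reflector`, `skill_manager`, and `run_task` are illustrative names, and the reflector simply compares the answer against a known ground truth, which is a simplification of what a real Reflector does with execution traces.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Skillbook:
    """Toy stand-in for the persistent strategy store (question -> answer)."""
    strategies: dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Trace:
    """What the environment records about one attempt."""
    question: str
    answer: str
    expected: str


def agent(question: str, skillbook: Skillbook) -> str:
    # Use a learned strategy when one matches; otherwise guess "yes".
    return skillbook.strategies.get(question, "yes")


def environment(question: str, answer: str, truth: dict[str, str]) -> Trace:
    # The environment only records the outcome; it does not correct the agent.
    return Trace(question, answer, truth[question])


def reflector(trace: Trace) -> str | None:
    # Extract a lesson only when the attempt failed.
    return None if trace.answer == trace.expected else trace.expected


def skill_manager(skillbook: Skillbook, trace: Trace, lesson: str | None) -> None:
    # Curate the Skillbook: add or overwrite the strategy for this question.
    if lesson is not None:
        skillbook.strategies[trace.question] = lesson


def run_task(question: str, skillbook: Skillbook, truth: dict[str, str]) -> str:
    answer = agent(question, skillbook)
    trace = environment(question, answer, truth)
    skill_manager(skillbook, trace, reflector(trace))
    return answer


truth = {"Is there a seahorse emoji?": "no"}
book = Skillbook()
first = run_task("Is there a seahorse emoji?", book, truth)
second = run_task("Is there a seahorse emoji?", book, truth)
print(first, second)
```

The first attempt fails and leaves a strategy behind; the second attempt reads that strategy from the Skillbook and answers correctly, which mirrors the dotted "Strategies" edge in the diagram.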
All roles are backed by PydanticAI agents with structured output validation. PydanticAI routes to 100+ LLM providers through its LiteLLM integration, with native support for OpenAI, Anthropic, Google, Bedrock, Groq, and more.
Based on the ACE paper (Stanford & SambaNova) and Dynamic Cheatsheet.
| Runner | Class | Description |
|---|---|---|
| LiteLLM | `ACELiteLLM` | Batteries-included agent with `.ask()`, `.learn()`, `.save()`; accepts any LiteLLM model string |
| Core | `ACE` | Full learning loop with batch epochs and evaluation |
| Trace Analyser | `TraceAnalyser` | Learn from pre-recorded traces without re-running tasks |
| browser-use | `BrowserUse` | Browser automation that improves with each run |
| LangChain | `LangChain` | Wrap any LangChain chain or agent with learning |
| Claude Code | `ClaudeCode` | Claude Code CLI tasks with learning |
```bash
uv add 'ace-framework[browser-use]'    # Browser automation
uv add 'ace-framework[langchain]'      # LangChain
uv add 'ace-framework[logfire]'        # Observability (auto-instruments PydanticAI)
uv add 'ace-framework[mcp]'            # MCP server for IDE integration
uv add 'ace-framework[deduplication]'  # Embedding-based skill deduplication
```

Have existing agent logs? Extract strategies from them directly:

```python
from ace import ACELiteLLM

agent = ACELiteLLM(model="gpt-4o-mini")
agent.learn_from_traces(your_existing_traces)
print(agent.get_strategies())
```

tau2-bench by Sierra Research: airline domain tasks requiring tool use and policy adherence. Claude Haiku 4.5 agent, strategies learned on the train split with no reward signals, evaluated on the held-out test split.
pass^k is the probability that all k independent attempts at a task succeed. ACE doubles consistency at pass^4 with 15 learned strategies.
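To make the metric concrete, here is a minimal sketch of how pass^k can be estimated from repeated runs. It assumes the combinatorial estimator commonly used for this metric: with `c` successes out of `n` attempts on a task, the chance that `k` attempts drawn without replacement all succeed is C(c, k) / C(n, k), averaged over tasks. The function names `pass_hat_k` and `mean_pass_hat_k` and the sample counts are hypothetical and are not taken from the benchmark results above.

```python
from math import comb


def pass_hat_k(successes: int, trials: int, k: int) -> float:
    """Estimate pass^k for one task from `successes` out of `trials` attempts."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    if not 1 <= k <= trials:
        raise ValueError("k must be between 1 and trials")
    # comb(successes, k) is 0 when successes < k, so the estimate is 0 there.
    return comb(successes, k) / comb(trials, k)


def mean_pass_hat_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-task estimate over (successes, trials) pairs."""
    return sum(pass_hat_k(c, n, k) for c, n in results) / len(results)


# Illustrative data only: three tasks, four attempts each.
results = [(4, 4), (3, 4), (2, 4)]
for k in (1, 2, 4):
    print(k, round(mean_pass_hat_k(results, k), 4))
```

With this made-up data the average success rate (k = 1) is 0.75, yet only one task in three passes all four attempts. That gap is why pass^k is a stricter measure of consistency than a single-attempt pass rate.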
ACE + Claude Code translated this library from Python to TypeScript with zero supervision:
| Metric | Result |
|---|---|
| Duration | ~4 hours |
| Commits | 119 |
| Lines written | ~14,000 |
| Build errors | 0 |
| Tests | All passing |
| Learning cost | ~$1.50 |
ACE is built on a composable pipeline engine. Each step declares what it requires and what it produces:
AgentStep -> EvaluateStep -> ReflectStep -> UpdateStep -> DeduplicateStep
Use learning_tail() for the standard learning sequence, or compose custom pipelines:
```python
from ace import Pipeline, AgentStep, EvaluateStep, learning_tail

steps = [AgentStep(agent, skillbook), EvaluateStep(env)] + learning_tail(reflector, skill_manager, skillbook)
pipeline = Pipeline(steps)
```

The pipeline engine (`pipeline/`) is framework-agnostic with requires/provides contracts, immutable context, and error isolation. See Pipeline Design and Architecture.
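The idea of requires/provides contracts over an immutable context can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the engine in `pipeline/`: the `Step` dataclass, `validate`, `run_pipeline`, and the two example steps are hypothetical names, and error isolation is left out.

```python
from __future__ import annotations

from dataclasses import dataclass
from types import MappingProxyType
from typing import Callable, Mapping


@dataclass(frozen=True)
class Step:
    """A unit of work that declares the context keys it reads and writes."""
    name: str
    requires: frozenset[str]
    provides: frozenset[str]
    run: Callable[[Mapping[str, object]], dict[str, object]]


def validate(steps: list[Step], initial: set[str]) -> None:
    """Fail fast if a step requires a key that nothing before it provides."""
    available = set(initial)
    for step in steps:
        missing = step.requires - available
        if missing:
            raise ValueError(f"{step.name} is missing {sorted(missing)}")
        available |= step.provides


def run_pipeline(steps: list[Step], context: Mapping[str, object]) -> Mapping[str, object]:
    validate(steps, set(context))
    for step in steps:
        # Each step gets a read-only view and returns new keys; the previous
        # context is never mutated, a fresh one is built instead.
        produced = step.run(MappingProxyType(dict(context)))
        context = {**context, **produced}
    return MappingProxyType(dict(context))


answer_step = Step(
    "answer", frozenset({"question"}), frozenset({"answer"}),
    lambda ctx: {"answer": str(ctx["question"]).upper()},
)
score_step = Step(
    "score", frozenset({"answer"}), frozenset({"score"}),
    lambda ctx: {"score": len(str(ctx["answer"]))},
)

result = run_pipeline([answer_step, score_step], {"question": "hi"})
print(result["answer"], result["score"])
```

Checking the contracts before anything runs means a mis-ordered pipeline, for example `score_step` placed before `answer_step`, is rejected up front instead of failing halfway through a task.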
| Command | Description |
|---|---|
| `ace setup` | Interactive setup: model selection, API keys, connection validation |
| `ace models <query>` | Search available models with pricing |
| `ace validate <model>` | Test a model connection |
| `ace config` | Show current configuration |
| `kayba` | Cloud CLI: upload traces, fetch insights, manage prompts |
| `ace-mcp` | MCP server for IDE integration |
- Full Documentation: Guides, API reference, examples
- Quick Start: 5-minute setup
- Setup Guide: Configuration and providers
- Hosted API Guide: Hosted CLI, trace upload, prompt install
- Architecture: Core concepts and system design
- Code Reference: Implementations, API, usage examples
- Design Decisions: Rejected alternatives and rationale
- Pipeline Engine: Step composition and context flow
- Examples: Runnable demos
- Changelog: Version history
Contributions are welcome. See Contributing Guidelines.
Built by Kayba and the open-source community.


