Adding intelligent learning capabilities to OpenAI Codex CLI, enabling AI to learn from conversations and continuously improve
中文文档 / Chinese Documentation
This project is a fork and enhancement of OpenAI Codex CLI, adding the ACE (Agentic Context Engineering) intelligent learning framework on top of the original foundation.
- Retains all original Codex CLI functionality
- Adds intelligent learning and context memory capabilities
- This documentation covers only the ACE extension features
- Does not include basic Codex CLI usage instructions
Need Codex CLI documentation? Visit OpenAI Codex CLI Official Repository
ACE (Agentic Context Engineering) is an intelligent context engineering framework that enables AI assistants to learn from conversation history, build an evolving knowledge base (Playbook), and provide relevant experience in subsequent conversations.
Based on the paper "Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models", ACE achieves intelligent learning through the following mechanisms:
- Context Adaptation: Improves performance by modifying input context rather than model weights
- Avoiding Brevity Bias: Retains detailed domain-specific knowledge instead of compressing into brief summaries
- Preventing Context Collapse: Uses incremental updates rather than complete rewrites to avoid information loss
- Playbook Evolution: Treats context as an evolving knowledge base that continuously accumulates and organizes strategies
- Automatic Learning (Reflector) - Extracts tool usage, error handling, and development patterns from conversations
- Knowledge Accumulation (Playbook) - Builds an evolving structured knowledge base
- Incremental Updates (Delta Updates) - Local updates instead of complete rewrites, preventing information loss
- Grow-and-Refine - Balances knowledge expansion with redundancy control
- Intelligent Retrieval - Context matching based on keywords and semantics
- High Performance - Extremely fast learning and retrieval (< 100ms)
- Minimal Intrusion - Integrated via a Hook mechanism without polluting the original codebase
- Ready to Use - Automatically creates configuration, works out of the box
git clone https://github.com/UU114/codeACE.git
cd codeACE
Windows users: Git Bash is recommended over PowerShell for building, to avoid path-handling and command-compatibility issues.
cd codex-rs
# Build release version
cargo build --release
# Or build debug version for development
cargo build
Starting from v1.0, ACE features are enabled by default at compile time; no additional feature flags are needed.
To disable ACE features, you can use:
cargo build --release --no-default-features
# Method 1: Using cargo install (recommended)
cargo install --path cli
# Method 2: Manually copy binary
cp target/release/codex ~/.local/bin/
# Or any other directory in your PATH
# Usage is identical to Codex CLI
codex tui # Launch TUI interface
codex exec "your question" # Command line mode
# ACE works automatically in the background:
# - Before conversation (pre_execute Hook): Load relevant historical context
# - After conversation (post_execute Hook): Asynchronously learn and extract knowledge
# Check ACE status
codex ace status
# You should see output similar to:
# ACE (Agentic Coding Environment) Status
#
# Configuration:
# Enabled: ✅ Yes
# Storage: ~/.codeACE/ace
# Max entries: 500
According to the paper, ACE introduces three key innovations to address the limitations of existing methods:
Innovation 1: Dedicated Reflector
- Problem: Previous methods had a single model handle all responsibilities, which degraded quality
- Solution: Separate evaluation and insight extraction into an independent Reflector role
- Effect: Significantly improves context quality and downstream performance (shown in the §4.5 ablation study)
Innovation 2: Incremental delta updates
- Problem: Monolithic rewrites are expensive and prone to causing context collapse
- Solution: Use local, incremental delta updates that modify only the relevant parts
- Effect: Reduces adaptation latency and computational cost by 82-92% (§4.6)
Innovation 3: Grow-and-refine
- Problem: Brevity bias leads to loss of domain-specific knowledge
- Solution: Balance steady context expansion with redundancy control
- Effect: Maintains detailed, task-specific knowledge and prevents over-compression
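The contrast between delta updates and monolithic rewrites can be illustrated with a minimal sketch. The `Bullet` and `Delta` types and the `apply_deltas` function below are hypothetical names chosen for illustration, not the project's actual types; the point is only that each delta touches the items it names and leaves the rest of the playbook untouched.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified playbook item (not the project's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct Bullet {
    pub id: String,
    pub text: String,
}

/// A delta names only the items it changes; everything else is left as-is.
pub enum Delta {
    Add(Bullet),
    Update { id: String, text: String },
    Remove { id: String },
}

/// Apply deltas in place instead of regenerating the whole playbook.
pub fn apply_deltas(playbook: &mut HashMap<String, Bullet>, deltas: Vec<Delta>) {
    for delta in deltas {
        match delta {
            Delta::Add(bullet) => {
                playbook.insert(bullet.id.clone(), bullet);
            }
            Delta::Update { id, text } => {
                // Unknown ids are ignored rather than created implicitly.
                if let Some(existing) = playbook.get_mut(&id) {
                    existing.text = text;
                }
            }
            Delta::Remove { id } => {
                playbook.remove(&id);
            }
        }
    }
}
```

Because untouched entries are never regenerated, their wording cannot drift or be summarized away, which is the failure mode the paper calls context collapse.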
User Query
    ↓
[pre_execute Hook] Load relevant historical context
    ↓
Generator: Generate reasoning trace and execution
    ↓
[post_execute Hook] Asynchronous learning process:
    ├─ Reflector: Analyze trace, extract insights (can iterate multiple times)
    ├─ Curator: Generate delta context items
    └─ Storage: Incrementally merge into Playbook
    ↓
Complete (transparent to user)
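The flow above can be sketched as a pair of hooks wrapped around a generation step. This is a simplified illustration: the `ExecutorHook` trait, `MemoryHook`, and `run_turn` are assumed names, the learning step runs synchronously here (the real post_execute hook is described as asynchronous), and the word-overlap matching is a stand-in for real retrieval.

```rust
/// Hypothetical hook interface; the method names mirror the flow above,
/// not the project's actual trait.
pub trait ExecutorHook {
    /// Called before the model runs; returns context to prepend, if any.
    fn pre_execute(&self, query: &str) -> Option<String>;
    /// Called after the model runs, with the finished exchange.
    fn post_execute(&mut self, query: &str, response: &str);
}

/// Toy hook that remembers past exchanges and matches on shared words.
#[derive(Default)]
pub struct MemoryHook {
    learned: Vec<(String, String)>, // (query, response)
}

impl ExecutorHook for MemoryHook {
    fn pre_execute(&self, query: &str) -> Option<String> {
        let query = query.to_lowercase();
        let words: Vec<&str> = query.split_whitespace().collect();
        self.learned
            .iter()
            .find(|(past, _)| {
                let past = past.to_lowercase();
                // Very short words are skipped to avoid trivial matches.
                words.iter().any(|w| w.len() > 3 && past.contains(*w))
            })
            .map(|(_, response)| response.clone())
    }

    fn post_execute(&mut self, query: &str, response: &str) {
        self.learned.push((query.to_string(), response.to_string()));
    }
}

/// One turn: load context, generate (stubbed by a closure), then learn.
pub fn run_turn<H: ExecutorHook>(
    hook: &mut H,
    query: &str,
    generate: impl Fn(&str, Option<&str>) -> String,
) -> String {
    let context = hook.pre_execute(query);
    let response = generate(query, context.as_deref());
    hook.post_execute(query, &response);
    response
}
```

Keeping the hooks behind a trait is one way to get the "transparent to user" property: the caller of `run_turn` never needs to know whether any context was loaded or learned.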
# First query
$ codex "How do I run tests?"
> You can run tests using: cargo test
# ACE automatically learns:
✓ Extracted: Tool usage "cargo test"
✓ Tags: testing, tools
✓ Saved to playbook
# Second similar query
$ codex "Run unit tests"
> Based on previous experience, use: cargo test
> (Context automatically loaded ✨)
ACE uses a separate configuration file (isolated from the main Codex CLI configuration):
~/.codeACE/codeACE-config.toml
On first run, ACE automatically creates the configuration file, no manual configuration needed.
[ace]
enabled = true # Enable/disable ACE
storage_path = "~/.codeACE/ace" # Knowledge base storage path
max_entries = 500 # Maximum number of entries
[ace.reflector]
extract_patterns = true # Extract code patterns
extract_tools = true # Extract tool usage
extract_errors = true # Extract error information
[ace.context]
max_recent_entries = 10 # Maximum context entries per load
include_all_successes = true # Include all successful cases
max_context_chars = 4000 # Maximum context characters
Method 1: Temporarily disable via config (keeps ACE code)
[ace]
enabled = false
Method 2: Completely remove ACE features at compile time (reduces binary size)
cd codex-rs
cargo build --release --no-default-features
Playbook is the core knowledge base of the ACE system. It stores actionable knowledge extracted from conversations. Unlike traditional conversation history, Playbook is a structured, deduplicated, evolving long-term memory system.
| Feature | Playbook (Long-term Memory) | History Message (Short-term Memory) |
|---|---|---|
| Purpose | Store reusable knowledge and patterns | Maintain current conversation context continuity |
| Lifecycle | Persists across sessions | Limited to current session |
| Content | Refined insights, patterns, best practices | Complete user-AI conversation sequences |
| Information Density | High (compressed essence) | Low (includes all details) |
| Storage Efficiency | Saves 76% space vs raw conversations | Full conversation storage |
| Retrieval Method | Semantic + keyword intelligent matching | Sequential loading |
| Update Mechanism | Incremental Delta updates | Append new messages |
Key Conclusion: The two work together; neither can replace the other
- History Message provides fluency and context of current conversation
- Playbook provides accumulated knowledge and experience from the past
Each Playbook entry includes:
PlaybookEntry {
    id: String,              // Unique identifier (UUID v4)
    timestamp: DateTime,     // Creation time
    context: String,         // Execution context (user question, task description)
    insights: Vec<String>,   // List of extracted insights
    tags: Vec<String>,       // Classification tags (tools, testing, error_handling, etc.)
    metadata: {
        session_id: String,      // Session identifier
        success: bool,           // Whether execution was successful
        relevance_score: f32     // Relevance score (for retrieval)
    }
}
Storage Format: JSONL (JSON Lines)
- One complete JSON object per line
- Append-only writes, excellent performance (< 1ms)
- Easy for streaming and incremental parsing
Incremental delta updates:
- Only the relevant parts are updated; the Playbook is never rewritten in full
- Reduces update cost by 82-92% (compared to complete rewrites)
- Prevents "Context Collapse"
Deduplication and refinement:
- Automatically detects similar entries (based on semantics and keywords)
- Merges redundant information, keeping the knowledge base compact
- Retains detailed domain-specific knowledge (avoids brevity bias)
Retrieval:
- Keyword Matching: Fast filtering based on tags and context
- Semantic Search (planned): Relevance ranking based on embeddings
- Hybrid Strategy: Combines temporal recency and relevance scoring
Auto-archiving:
- Triggered when the entry count exceeds the configured limit (default 500)
- Old data is automatically moved to the archive/ directory
- Archive files are named by timestamp for easy tracing
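The archiving behaviour described above could look roughly like the following sketch. It makes several assumptions that the text does not confirm: that only the overflow (the oldest lines) is archived, that the archive lives next to the playbook file, and that the file name uses a Unix timestamp. `archive_if_needed` is a hypothetical helper name.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Move the oldest entries to `archive/` once the playbook exceeds `max_entries`.
/// Returns the archive path if anything was archived.
pub fn archive_if_needed(playbook: &Path, max_entries: usize) -> io::Result<Option<PathBuf>> {
    let content = fs::read_to_string(playbook)?;
    let lines: Vec<&str> = content.lines().filter(|l| !l.trim().is_empty()).collect();
    if lines.len() <= max_entries {
        return Ok(None);
    }

    // The file is append-only, so the oldest entries come first.
    let overflow = lines.len() - max_entries;
    let (old, recent) = lines.split_at(overflow);

    let dir = playbook.parent().unwrap_or(Path::new(".")).join("archive");
    fs::create_dir_all(&dir)?;
    // Assumed naming scheme: seconds since the Unix epoch.
    let stamp = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    let archive = dir.join(format!("playbook-{stamp}.jsonl"));

    fs::write(&archive, old.join("\n") + "\n")?;
    fs::write(playbook, recent.join("\n") + "\n")?;
    Ok(Some(archive))
}
```

A production version would likely write the trimmed playbook to a temporary file and rename it, so that a crash between the two writes cannot lose entries.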
ACE provides a suite of management tools to view and manage learning content:
codex ace status # View learning status and statistics
codex ace show # Display learning content (default 10 items)
codex ace search # Search knowledge base
codex ace config # View configuration
codex ace clear # Clear knowledge base (auto-archive)
In the Codex TUI interactive interface, you can use the following slash commands to quickly access the playbook:
/playbook # Display playbook status (alias: /pb)
/playbook-show # Show recent learning entries (alias: /pbs)
/playbook-clear # Clear playbook (alias: /pbc)
/playbook-search # Search playbook (alias: /pbsearch, /pbq)
For faster access, the following short aliases are supported:
| Full Command | Alias | Description |
|---|---|---|
| /playbook | /pb | View status |
| /playbook-show | /pbs | Display entries |
| /playbook-clear | /pbc | Clear data |
| /playbook-search | /pbsearch, /pbq | Search content |
# CLI commands
codex ace show --limit 5
codex ace search "rust async"
codex ace status
# TUI slash commands (in Codex conversation)
/pb # Quick playbook status view
/pbs # Show recent learning entries
/pbq error # Search entries containing "error"
LAPS (Lightweight Adaptive Playbook System) is CodeACE's implementation approach for Playbook management. It is optimized and simplified for real-world engineering use while keeping the core principles of the ACE paper.
Problem: Traditional knowledge base management systems typically require complex databases, indexing systems, and query engines.
LAPS Solution:
- Zero Database Dependencies: Uses the JSONL plain text format
- Minimal Storage: Single file playbook.jsonl, human-readable
- Fast Startup: No database initialization needed, auto-creates on first run
- Easy Backup: Simple file copy for backup
- Portability: Seamless migration across platforms and systems
Performance Metrics:
Write Latency: < 1ms (append-only writes)
Read Performance: < 10ms (100 entries full load)
Retrieval Speed: < 50ms (keyword filter + relevance sort)
Storage Overhead: ~500KB (500 typical entries)
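As an illustration of why append-only JSONL writes are cheap, here is a std-only sketch. The two-field record, the hand-rolled `json_escape`, and the function names are simplifications assumed for this example; a real implementation would more likely serialize the full entry with a JSON library.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Escape a string for use inside a JSON string literal.
pub fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Append one entry as a single JSON line; existing lines are never rewritten.
pub fn append_entry(path: &Path, id: &str, context: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(
        file,
        "{{\"id\":\"{}\",\"context\":\"{}\"}}",
        json_escape(id),
        json_escape(context)
    )
}

/// Stream the file line by line; each non-empty line is one JSON object.
pub fn read_lines(path: &Path) -> io::Result<Vec<String>> {
    let reader = BufReader::new(File::open(path)?);
    let mut entries = Vec::new();
    for line in reader.lines() {
        let line = line?;
        if !line.trim().is_empty() {
            entries.push(line);
        }
    }
    Ok(entries)
}
```

Escaping newlines matters here: a raw line break inside a field would split one entry across two lines and break the one-object-per-line invariant that makes streaming reads possible.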
Problem: Fixed knowledge extraction strategies cannot adapt to different scenario requirements.
LAPS Solution:
Traditional method problems:
- Record all details → rapid context bloat
- Over-compression → loss of critical information (brevity bias)
- Undifferentiated recording → noise drowns out valuable information
LAPS adaptive strategy:
A. Compressed Essence Principle
One conversation → typically 1 refined insight (200-800 characters)
Complex tasks → can generate 2-3 insights (different aspects)
Simple operations → may generate none (trivial operations filtered)
B. 7 Core Information Dimensions
Each insight should include:
- User Requirements - Clear task objectives
- What Was Done - Specific operations executed
- Why - Rationale for choosing this approach
- Outcomes - Final achieved results
- Problems Solved - Obstacles encountered and resolved
- Unresolved Issues - Remaining problems or limitations
- Future Plans - Suggested improvement directions
C. Intelligent Filtering Rules
// Content NOT recorded
- Trivial operations: ls, cat, pwd and other read-only commands
- Temporary attempts: unsuccessful intermediate steps
- Repeated operations: already recorded patterns
// Content MUST be recorded
- Successful solutions and final code
- Error handling and debugging experience
- Tool usage best practices
- Unresolved issues and failed attempts (with reasons)
Effect: Context bloat rate reduced by 80% (from 2000 to 400 chars per conversation)
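One possible reading of the filtering rules above, expressed as a predicate. The `Step` struct and `should_record` are hypothetical, and the way the rules are combined (final successes are kept, failures are kept only when final and explained) is an interpretation rather than the project's documented logic.

```rust
/// Read-only commands treated as trivial. The rules name ls, cat and pwd
/// "and other read-only commands", so this list is only a starting point.
const TRIVIAL_COMMANDS: [&str; 3] = ["ls", "cat", "pwd"];

/// Hypothetical summary of one step in a conversation.
pub struct Step<'a> {
    pub command: &'a str,
    pub succeeded: bool,
    /// True for the last attempt at a task, false for intermediate tries.
    pub is_final: bool,
    pub failure_reason: Option<&'a str>,
}

/// Decide whether a step is worth recording in the playbook.
pub fn should_record(step: &Step, already_recorded: &[&str]) -> bool {
    let program = step.command.split_whitespace().next().unwrap_or("");
    if TRIVIAL_COMMANDS.contains(&program) {
        return false; // trivial read-only operation
    }
    if already_recorded.contains(&step.command) {
        return false; // repeated pattern
    }
    if step.succeeded {
        return step.is_final; // keep the final solution, skip intermediate tries
    }
    // Failures are kept only when they are the final state and carry a reason.
    step.is_final && step.failure_reason.is_some()
}
```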
Automatically adjust entry weights based on usage feedback:
Successfully applied → relevance_score += 0.1
Marked misleading → relevance_score -= 0.2
Long-term unused → relevance_score *= 0.9 (decay)
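The three adjustment rules translate directly into a small function. The `Feedback` enum and `adjust_score` are illustrative names, and clamping the result to the range 0.0 to 1.0 is an added assumption; the rules above do not state any bounds.

```rust
/// Usage feedback for a playbook entry (hypothetical names).
pub enum Feedback {
    Applied,    // entry was applied successfully
    Misleading, // entry was marked misleading
    Unused,     // entry went unused for a long period
}

/// Apply the adjustment rules, then clamp to [0.0, 1.0] (the clamp is an assumption).
pub fn adjust_score(score: f32, feedback: Feedback) -> f32 {
    let updated = match feedback {
        Feedback::Applied => score + 0.1,
        Feedback::Misleading => score - 0.2,
        Feedback::Unused => score * 0.9,
    };
    updated.clamp(0.0, 1.0)
}
```

Note the asymmetry in the rules: a misleading entry loses twice what a useful one gains, so bad advice sinks faster than good advice rises.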
Dynamically adjust loaded context amount based on query complexity:
Simple query → Load Top 5 relevant entries
Medium query → Load Top 10 relevant entries
Complex task → Load Top 20 + all successful cases
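A sketch of the loading rule above, assuming the caller has already classified the query. How complexity is determined is not described here, so `Complexity` is simply passed in; `Entry` and `select_context` are hypothetical names.

```rust
#[derive(Clone, Copy)]
pub enum Complexity {
    Simple,
    Medium,
    Complex,
}

/// Simplified entry carrying only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub text: String,
    pub relevance: f32,
    pub success: bool,
}

/// Pick entries for the prompt: Top-N by relevance, where N depends on
/// query complexity; complex tasks also pull in every successful case.
pub fn select_context(entries: &[Entry], complexity: Complexity) -> Vec<Entry> {
    let top_n = match complexity {
        Complexity::Simple => 5,
        Complexity::Medium => 10,
        Complexity::Complex => 20,
    };
    let mut ranked: Vec<&Entry> = entries.iter().collect();
    // Highest relevance first; total_cmp gives a total order over f32.
    ranked.sort_by(|a, b| b.relevance.total_cmp(&a.relevance));

    let mut picked: Vec<Entry> = ranked.iter().take(top_n).map(|&e| e.clone()).collect();
    if matches!(complexity, Complexity::Complex) {
        // Successful cases that did not make the Top-N cut are appended.
        for &e in ranked.iter().skip(top_n).filter(|e| e.success) {
            picked.push(e.clone());
        }
    }
    picked
}
```

In practice the result would still need to respect a size budget such as the max_context_chars setting, which this sketch leaves out.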
Core Innovation: Organize knowledge as "executable playbooks" rather than passive documents
| Traditional Knowledge Base | LAPS Playbook |
|---|---|
| Static document collection | Dynamically evolving action guide |
| Describes "what" | Captures "how" and "why" |
| Requires manual interpretation | AI can directly apply |
| Fragmented information | Structured + contextual association |
| Passive query | Proactive recommendation |
Playbook Entry Example:
{
"id": "pb-2024-001",
"timestamp": "2024-11-19T10:30:00Z",
"context": "User requests to optimize Rust project compilation performance",
"insights": [
"Using cargo build --timings visualizes compilation bottlenecks, found codex-core compilation takes 45% of total time",
"By adding incremental = true and parallel = true to Cargo.toml, compilation time reduced by 30%",
"Key optimization: Split large mod.rs into multiple small files to improve incremental compilation efficiency"
],
"tags": ["rust", "performance", "compilation", "cargo"],
"metadata": {
"session_id": "session-123",
"success": true,
"relevance_score": 0.95
}
}
| Metric | Full Conversation History | LAPS Playbook | Advantage |
|---|---|---|---|
| Space Efficiency | Baseline (100%) | 24% | Saves 76% |
| Information Density | Baseline (1x) | 4.18x | 318% increase |
| Retrieval Speed | Traverse all messages | Keyword+relevance | 10-50x faster |
| Cross-session | Not supported | Supported | Long-term memory |
| Deduplication | None | Automatic | Avoids redundancy |
| Feature | Vector DB (Pinecone/Weaviate) | LAPS | LAPS Advantage |
|---|---|---|---|
| Dependencies | Requires external service/process | Zero dependencies | Simple |
| Startup Time | Seconds to minutes | < 10ms | Fast |
| Storage Cost | Cloud service fees or local resources | Local files | Free |
| Readability | Binary/proprietary format | Plain text JSON | Transparent |
| Semantic Search | Native support | Planned | |
| Exact Match | | Keyword precise | Reliable |
- Zero-configuration startup: Auto-creates required files on first run
- No external dependencies: No database, vector engine, etc. needed
- Low resource usage: Memory < 10MB, Storage < 1MB
- Cross-platform compatible: Windows/macOS/Linux identical
- Non-blocking writes: Asynchronous append-only writes
- Efficient reads: Incremental JSONL parsing
- Fast retrieval: Two-tier indexing (tags + relevance)
- Scalable: Supports 10,000+ entries (tested)
- Auto-deduplication: Prevents knowledge base bloat
- Relevance learning: Adjusts based on usage feedback
- Context adaptation: Dynamically adjusts load amount
- Essence extraction: 80% compression while retaining key information
- Human-readable: Standard JSON format
- Easy debugging: Directly view/edit JSONL files
- Version control: Can be managed with Git
- Auto-archiving: Prevents infinite growth
Storage Layer: JSONL (plain text)
Index Layer: HashMap (tags) + BTreeMap (time)
Retrieval Layer: Keyword matching + TF-IDF relevance
Learning Layer: Incremental Delta updates + weight adjustment
Interface Layer: CLI commands + TUI slash commands + Hook integration
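The index layer named above (a HashMap keyed by tag plus a BTreeMap keyed by time) could be sketched as follows. The struct layout, the use of entry positions as values, and the method names are assumptions for illustration; the TF-IDF scoring of the retrieval layer is not covered.

```rust
use std::collections::{BTreeMap, HashMap};

/// In-memory index over playbook entries (hypothetical layout).
#[derive(Default)]
pub struct PlaybookIndex {
    /// Lowercased tag -> positions of entries carrying it.
    by_tag: HashMap<String, Vec<usize>>,
    /// Unix timestamp -> positions of entries created at that time.
    by_time: BTreeMap<u64, Vec<usize>>,
}

impl PlaybookIndex {
    pub fn insert(&mut self, position: usize, timestamp: u64, tags: &[&str]) {
        for tag in tags {
            self.by_tag.entry(tag.to_lowercase()).or_default().push(position);
        }
        self.by_time.entry(timestamp).or_default().push(position);
    }

    /// Positions of entries carrying the tag (case-insensitive).
    pub fn with_tag(&self, tag: &str) -> &[usize] {
        self.by_tag
            .get(&tag.to_lowercase())
            .map(Vec::as_slice)
            .unwrap_or(&[])
    }

    /// Positions of the `n` most recent entries, newest first.
    pub fn most_recent(&self, n: usize) -> Vec<usize> {
        self.by_time
            .values()
            .rev()
            .flat_map(|positions| positions.iter().rev().copied())
            .take(n)
            .collect()
    }
}
```

The BTreeMap keeps timestamps sorted, so "most recent N" is a reverse walk from the end rather than a sort over all entries.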
LAPS evolution roadmap:
Phase 1 (Completed)
- Basic JSONL storage
- Keyword retrieval
- CLI/TUI management commands
Phase 2 (In Progress)
- Complete incremental Delta update implementation
- Reflector insight extraction optimization
- Relevance scoring algorithm improvements
Phase 3 (Planned)
- Hybrid retrieval: Keyword + semantic vectors
- Multi-project knowledge isolation
- Knowledge graph associations
- Visual management interface
codeACE/
├── codex-rs/                          # Rust implementation (main code)
│   ├── core/
│   │   └── src/ace/                   # ACE core modules ⭐
│   │       ├── mod.rs                 # Main plugin
│   │       ├── config_loader.rs       # Configuration loading
│   │       ├── storage.rs             # Storage system
│   │       ├── reflector.rs           # Knowledge extraction
│   │       ├── curator.rs             # Bullet generation
│   │       ├── cli.rs                 # CLI commands
│   │       └── types.rs               # Data types
│   ├── cli/                           # CLI entry point
│   └── tui/                           # TUI interface
├── docs/
│   ├── readme-zh.md                   # Chinese documentation
│   └── ACE_Configuration_Guide.md     # Detailed configuration guide
└── README.md                          # This file
⭐ = ACE core files
ACE adopts a modular agentic architecture, decomposing tasks into three specialized roles:
Generator: generates reasoning traces and executes tasks:
- Receives user queries and relevant Playbook context
- Executes multi-turn reasoning and tool calls
- Marks which bullets are useful or misleading
- Provides feedback to Reflector
Reflector (core innovation): an independent evaluation and insight extraction module
- Analyzes execution traces, identifies successful strategies and failure patterns
- Extracts actionable insights
- Supports Iterative Refinement
- Avoids brevity bias, retains detailed domain knowledge
Essence Extraction Strategy ✨ (new in v1.0)
- Compressed Essence: One conversation typically generates 1 refined insight (200-800 characters)
- Final Results Only: For code modified multiple times, only the final successful version is recorded
- Intelligent Filtering: Trivial operations (ls, cat) are not recorded; unresolved issues must be recorded
- 80% Context Bloat Reduction: From an average of 2000 chars per conversation down to 400 chars
- 7 Core Information Points: User requirements, what was done, why, outcomes, problems solved, unresolved issues, future plans
Curator: integrates insights into structured delta updates:
- Generates compact delta context items (candidate bullets)
- Uses lightweight non-LLM logic to merge them into the existing Playbook
- Manages bullet metadata (ID, counters, etc.)
- Deduplication and redundancy control
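As an example of what "lightweight non-LLM logic" for merging and deduplication might look like, the sketch below uses Jaccard similarity over keyword sets. This is a swapped-in technique chosen for illustration: the source does not say which similarity measure the Curator uses, and `keywords`, `similarity`, and `merge_insight` are hypothetical names.

```rust
use std::collections::HashSet;

/// Lowercased word set of an insight; a stand-in for real keyword extraction.
fn keywords(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| w.len() > 2)
        .map(|w| w.to_lowercase())
        .collect()
}

/// Jaccard similarity: size of the intersection divided by size of the union.
pub fn similarity(a: &str, b: &str) -> f32 {
    let (ka, kb) = (keywords(a), keywords(b));
    let union = ka.union(&kb).count();
    if union == 0 {
        return 0.0;
    }
    ka.intersection(&kb).count() as f32 / union as f32
}

/// Merge a candidate into the playbook. A near-duplicate is not appended;
/// the longer (more detailed) of the two texts is kept instead.
pub fn merge_insight(playbook: &mut Vec<String>, candidate: String, threshold: f32) {
    for existing in playbook.iter_mut() {
        if similarity(existing, &candidate) >= threshold {
            if candidate.len() > existing.len() {
                *existing = candidate;
            }
            return;
        }
    }
    playbook.push(candidate);
}
```

Keeping the longer text on a near-duplicate is one simple way to respect the "avoid brevity bias" goal: merging never makes an entry less detailed.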
Storage: efficient JSONL-format storage:
- Append-only writes (< 1ms)
- Fast reads (100 entries < 10ms)
- Embedding-based semantic search (planned)
- Auto-archiving (when the limit is exceeded)
Storage Location: ~/.codeACE/ace/playbook.jsonl
Hooks: minimally intrusive integration into Codex CLI:
- pre_execute: Load relevant context before execution
- post_execute: Asynchronously learn after execution (non-blocking for the user)
# Run all ACE tests (ACE enabled by default)
cargo test
# Run specific tests
cargo test ace_e2e
cargo test ace_learning_test
# Run core package tests
cargo test -p codex-core
- E2E integration tests: 10/10 passed
- Runtime integration tests: 1/1 passed
- Configuration system: 100%
- Hook system: 100%
- CLI commands: 100%
- Playbook context tests: 5/5 passed
Test Date: 2025-11-19
Core Question: Can Playbook replace History Message?
Test Results: All tests passed (5/5)
# Run Playbook context tests
cd codex-rs
cargo test --test playbook_context_test --features ace -- --nocapture
Key Findings:
| Metric | Result |
|---|---|
| Information Density | Playbook 4.18x higher than full conversation |
| Space Savings | 76.1% |
| Retrieval Accuracy | Successfully retrieves relevant domain knowledge |
| Long-term Memory | Achieves cross-session knowledge reuse |
Core Conclusion: Playbook cannot and should not completely replace History Message
- History Message: Provides current conversation context and continuity (short-term memory)
- Playbook: Provides past learned knowledge and best practices (long-term memory)
- Correct Approach: Both work together, complementing each other
Detailed test report: codex-rs/test20251119/测试结果.md
- ✅ Configuration system (auto-creation)
- ✅ Hook system (pre/post execute)
- ✅ Storage system (JSONL + Playbook)
- ✅ CLI commands (5 commands)
- ✅ Test coverage (11/11 passed)
- ✅ ACE module compiled by default (simplified build process)
- ⏳ Reflector implementation (pattern extraction)
- ⏳ Curator implementation (Bullet generation)
- ⏳ Relevance retrieval optimization
- 📋 Semantic vector retrieval
- 📋 Multi-project knowledge isolation
- 📋 Knowledge export/import
- 📋 Visualization interface
Contributions are welcome, whether bug reports, feature suggestions, or code submissions.
- Fork this repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
If you encounter issues, please report them on the Issues page.
- ACE Configuration Guide
- ACE Paper - Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
- Paper Authors: Qizheng Zhang et al. (Stanford University, SambaNova Systems, UC Berkeley)
- Paper Link: arXiv:2510.04618
This project is based on OpenAI Codex CLI and follows the original project's license.
The ACE framework extension is independently developed and uses the MIT License.
- OpenAI - Providing Codex CLI foundation
- ACE Paper Authors - Providing Agentic Context Engineering theoretical foundation
- Qizheng Zhang, Changran Hu, Shubhangi Upasani, Boyuan Ma, Fenglu Hong, et al.
- Stanford University, SambaNova Systems, UC Berkeley
- All contributors and users
- Project Homepage: https://github.com/UU114/codeACE
- Bug Reports: https://github.com/UU114/codeACE/issues
Let AI learn from conversations, make programming more intelligent!
Made with ❤️ by the CodeACE Community