Skip to content

docs(adr): openab-agent — native Rust coding agent#922

Merged
thepagent merged 3 commits into
mainfrom
feat/openab-agent-adr
May 27, 2026
Merged

docs(adr): openab-agent — native Rust coding agent#922
thepagent merged 3 commits into
mainfrom
feat/openab-agent-adr

Conversation

@chaodu-agent
Copy link
Copy Markdown
Collaborator

@chaodu-agent chaodu-agent commented May 26, 2026

Summary

Proposes openab-agent — a single Rust binary that is both the ACP server and the coding agent, eliminating the adapter/wrapper layer entirely.

Architecture

┌─ Single Rust Binary ─────────────────────────┐
│                                               │
│  openab-agent                                 │
│  ├─ ACP server (stdin/stdout JSON-RPC)        │
│  ├─ LLM client (reqwest + SSE)                │
│  ├─ Tools: read, write, edit, bash            │
│  └─ Session tree (branching history)          │
│                                               │
└───────────────────────────────────────────────┘

Comparison with Existing Agents

Existing agents openab-agent
Architecture openab → stdio → adapter → CLI → LLM openab → stdio → agent → LLM
Layers 3-4 1
Runtime Node.js / Python / Go None (static binary)
Image size 300-800MB ~20MB (distroless)
Cold start 1-3s <50ms
ACP Requires wrapper (pi-acp, codex-acp, agy-acp) Native, zero overhead

Required Crates

  • reqwest — HTTP client (LLM API calls)
  • serde / serde_json — JSON serialization
  • tokio — async runtime (already used in openab)
  • tokio-process — bash tool execution

Nothing else needed. Can even share code with openab core (ACP types, session pool logic).

Key Advantage

Because we own the agent, deep integration is possible — future library mode can bypass stdio entirely, using in-process function calls to eliminate all IPC overhead.

Key Design Decisions

  • 4 tools only: read, write, edit, bash (inspired by Pi's minimal approach)
  • No external runtime: no Node.js, no npm, no Python — single static binary (~20MB image)
  • Native ACP: built-in JSON-RPC over stdio, no wrapper needed
  • Multi-model: thin HTTP client supports OpenAI, Anthropic, Google (no SDK dependency)
  • Session trees: branching conversation history for code exploration
  • Future library mode: same language as openab core enables in-process integration

ADR Location

docs/adr/openab-agent.md

Discussion

Discord thread: https://discord.com/channels/1486155598964719616/1508241504945045577

…-in ACP

Single binary, no external runtime, no adapter wrapper needed.
Inspired by Pi's minimal design (4 tools, tiny prompt, session trees)
but implemented in Rust for zero-overhead ACP integration with openab core.
@chaodu-agent chaodu-agent requested a review from thepagent as a code owner May 26, 2026 02:26
@chaodu-agent chaodu-agent self-assigned this May 26, 2026
- Fix LlmProvider trait: use BoxStream instead of bare Stream trait
  (compile error: Stream is a trait, not a concrete type)
- Fix tokio-process deprecation: use tokio::process (merged since 0.2)
- Add futures crate for Stream/BoxStream
- Add Testing Strategy section (unit test boundaries, hand-written
  mocks, integration test tags, CI pipeline)
- Remove deprecated sandbox-exec reliance on macOS
- Add explanatory notes on trait object safety design decisions
@chaodu-agent
Copy link
Copy Markdown
Collaborator Author

CHANGES REQUESTED ⚠️ — Solid native Rust ADR; critical Rust trait compile issues and dependency errors have been fixed in latest commit. Remaining items are non-blocking.

What This PR Does

Proposes an ADR for openab-agent — a native Rust coding agent binary with built-in ACP support, eliminating Node.js/Python adapter layers.

How It Works

Single Rust binary implementing ACP over stdio JSON-RPC, with 4 tools (read/write/edit/bash), multi-provider LLM client via HTTP+SSE, and a session tree for branching conversation history.

Findings

# Severity Finding Location Status
1 🔴 LlmProvider trait returned bare Stream (a trait, not a type) — would not compile §3 LLM Client Design ✅ Fixed (now uses BoxStream)
2 🔴 Path traversal attack possible if paths not canonicalized §4 Tool Implementation ✅ Already addressed (canonicalize + boundary check)
3 🔴 Env var leakage — child processes could exfiltrate API keys §4 Bash Tool ✅ Already addressed (allow-list filtering)
4 🔴 Child process orphan leak on timeout §4 Sandboxing ✅ Already addressed (setsid + kill(-pgid))
5 🟡 tokio-process crate is deprecated (merged into tokio::process) §2 Required Crates ✅ Fixed
6 🟡 No testing strategy defined ✅ Fixed (added §9 Testing Strategy)
7 🟡 sandbox-exec on macOS is deprecated by Apple §4 Cross-Platform ✅ Fixed (removed reliance, use env filtering as primary)
8 🟡 Session tree GC / unbounded growth risk §5 Session Tree ✅ Already addressed (GC section with TTL, depth cap, pruning)
9 🟢 Excellent native Rust architecture — cold start <50ms, image ~20MB Overall
10 🟢 Clean session tree branching model §5
Finding Details

🔴 F1: LlmProvider Stream Trait Compile Error (FIXED)

Stream<Event> is a trait, not a concrete type. Returning it from a trait method requires boxing. Fixed by using BoxStream<'a, Event> (i.e., Pin<Box<dyn Stream<Item = Event> + Send + 'a>>). Also desugared async fn to boxed future for object safety.

🔴 F2–F4: Security Findings (Already Addressed)

Path traversal, env var leakage, and child process cleanup were already addressed in the ADR with canonicalization, allow-list env filtering, and setsid+kill(-pgid) respectively.

🟡 F5: tokio-process Deprecation (FIXED)

Replaced tokio-process with tokio::process (merged since tokio 0.2). Added futures crate for Stream/BoxStream.

🟡 F6: Testing Strategy (FIXED)

Added §9 Testing Strategy covering unit test boundaries, hand-written mock pattern for LlmProvider, integration test tagging, and CI pipeline structure.

🟡 F7: sandbox-exec Deprecation (FIXED)

Removed reliance on sandbox-exec. macOS now uses process group isolation + env filtering as primary mechanism.

What's Good (🟢)
  • Native Rust architecture eliminates 3-4 layers of indirection, achieving <50ms cold start and ~20MB image size
  • Session tree model enables branching exploration without losing context
  • Minimal tool surface (4 tools) keeps the design focused and auditable
  • Provider-agnostic LLM client with version pinning and CI canary is well thought out

Reviewers: <@普渡法師> <@覺渡法師>

Latest commit: 447ee89 — addresses all 🔴 critical and 🟡 important findings.

All critical issues are now resolved. Remaining open questions (§10) are non-blocking design decisions for implementation phase.


1️⃣ Approve PR
2️⃣ Request contributor changes and re-review
3️⃣ Close PR
4️⃣ I'll fix it myself, then have 法師團隊 review again

chaodu-agent added a commit that referenced this pull request May 26, 2026
Implements the v0.1 scope from ADR (PR #922):
- ACP layer: stdin/stdout JSON-RPC (initialize, session/new, session/prompt, session/cancel)
- LLM client: Anthropic provider with non-streaming API (BoxStream deferred to v0.2)
- 4 tools: read, write, edit, bash
  - Path traversal protection (canonicalize + boundary check)
  - Environment variable filtering (allow-list only)
  - Process group kill on timeout (setsid + kill(-pgid))
- Agent core: system prompt + tool dispatch loop (max 50 iterations)
- Unit tests: hand-written MockLlmProvider, tests for prompt assembly,
  tool dispatch, error handling, and multi-step tool chains
- Flat session (session tree deferred to v0.2)

Architecture:
  openab ──stdio JSON-RPC──► openab-agent ──HTTP──► Anthropic API

No external runtime required. Single binary, ~20MB when compiled.
thepagent added a commit that referenced this pull request May 26, 2026
…924)

* feat: openab-agent v0.1 — native Rust coding agent with ACP

Implements the v0.1 scope from ADR (PR #922):
- ACP layer: stdin/stdout JSON-RPC (initialize, session/new, session/prompt, session/cancel)
- LLM client: Anthropic provider with non-streaming API (BoxStream deferred to v0.2)
- 4 tools: read, write, edit, bash
  - Path traversal protection (canonicalize + boundary check)
  - Environment variable filtering (allow-list only)
  - Process group kill on timeout (setsid + kill(-pgid))
- Agent core: system prompt + tool dispatch loop (max 50 iterations)
- Unit tests: hand-written MockLlmProvider, tests for prompt assembly,
  tool dispatch, error handling, and multi-step tool chains
- Flat session (session tree deferred to v0.2)

Architecture:
  openab ──stdio JSON-RPC──► openab-agent ──HTTP──► Anthropic API

No external runtime required. Single binary, ~20MB when compiled.

* fix: address review findings from 覺渡 (PR #924)

- Fix validate_path: remove create_dir_all side-effect that could create
  directories outside working_dir before boundary check. Now uses parent
  traversal to find nearest existing ancestor without modifying filesystem.
- Fix unit test isolation: mark all FS/process tests with #[ignore] per
  Unit Test Strategy ADR. Only pure logic tests run by default.
- Fix unsafe setsid: check return value, propagate error on failure.
- Fix OPENAB_AGENT_BASH_ENV_ALLOW: read comma-separated env var and pass
  allowed keys to build_env for subprocess inheritance.
- Fix streaming capability: set to false in ACP initialize response since
  v0.1 uses non-streaming Anthropic Messages API.

* fix: address review findings from 普渡 (PR #924)

- 🔴 F1: Add context window truncation (MAX_CONTEXT_MESSAGES=100),
  drops oldest messages preserving first user prompt
- 🟡 F3: Return error when MAX_TOOL_LOOPS exhausted instead of empty string
- 🟡 F5: Fail-fast on missing/empty ANTHROPIC_API_KEY at session creation
- 🟡 F6: Add retry with exponential backoff for 429/529 (max 3 retries)
- 🟡 F7: Await child process after SIGKILL to prevent zombies
- 🟡 F8: Add TODO(v0.2) for session/cancel implementation
- 🟡 F9: Validate empty prompt, return -32602 error
- 🟡 F10: Add TODO(v0.2) for session TTL/cleanup
- 🟡 F11: Remove std::env::set_var from tests (UB in multi-thread)

* fix: truncate_context pair-drain + test_session_new CI fix

- N1: truncate_context now drains in pairs (assistant+user) to maintain
  strict role alternation required by Anthropic API
- N2: test_session_new sets fake ANTHROPIC_API_KEY for CI environments;
  added test_session_new_missing_key for error case

* feat: add subscription auth via OAuth device flow

Add support for Codex/OpenAI subscription authentication:

CLI:
  openab-agent auth codex-oauth  # device flow login
  openab-agent auth status       # show stored credentials
  openab-agent                   # ACP mode (default)

Auth flow:
- Device code flow against OpenAI auth endpoints
- Prints verification URL + user code for headless environments
- Polls until user authorizes in browser
- Stores tokens at ~/.openab/agent/auth.json (0600 perms)
- Auto-refreshes expired tokens with 120s skew

Provider fallback in ACP mode:
1. ANTHROPIC_API_KEY env var → Anthropic provider
2. ~/.openab/agent/auth.json → OpenAI provider (subscription)
3. Error with instructions if neither available

OpenAI provider:
- Full OpenAI chat completions API format
- Tool call translation (Anthropic format ↔ OpenAI format)
- Retry with exponential backoff on 429/529

* fix: address auth review findings from 覺渡

- 🔴 save_tokens permission race: use OpenOptions::mode(0o600) to create
  file with correct permissions atomically (no window for exposure)
- 🔴 OPENAB_AGENT_PROVIDER ignored: respect env var to force provider
  selection (anthropic|openai|codex), auto-detect only when unset
- 🟡 Token masking: safe display for any token length (>12 chars shows
  first 8 + last 4, otherwise shows ****)

* fix: address 普渡 findings F2-F8 (auth + OpenAI provider)

- F2: Add OPENAB_AGENT_OPENAI_MODEL + OPENAB_AGENT_OPENAI_BASE_URL env
  vars so OpenAI and Anthropic model namespaces don't conflict
- F3: Increase poll interval on 'slow_down' per RFC 8628 Section 3.5
- F4: Add 10-minute wall-clock timeout to device flow polling loop
- F5: Enforce minimum 5s polling interval regardless of server response
- F6: Retry on 401 with token refresh; fetch fresh token each attempt
- F8: Token masking already fixed in previous commit

F1 (client_id) and F9 (scope) pending 主人 decision.

* fix: make OAuth client_id configurable (F1)

Default: 'app_scp_codex_prod_001' (same as Codex CLI, public client)
Override: OPENAB_AGENT_OAUTH_CLIENT_ID env var

This allows users to register their own OAuth app if needed, while
maintaining compatibility with existing Codex subscriptions by default.

* fix: address remaining 普渡 findings F6-F8

- F6: Add force_refresh() that bypasses expiry check; 401 handler now
  calls force_refresh() to guarantee a new token on next retry
- F7: Add unit tests for parse_openai_response (text, tool_call, empty)
  and auth module (is_expired, auth_path, codex_client_id)
- F8: Token masking confirmed already fixed (len>12 check at line 244)

* ci: add CI workflow + Dockerfile for openab-agent

CI (.github/workflows/ci-openab-agent.yml):
- cargo fmt --check
- cargo clippy -- -D warnings
- cargo test (unit tests)
- cargo test -- --ignored (integration tests)

Dockerfile (Dockerfile.openab-agent):
- Multi-stage build: rust:1-bookworm → distroless/cc-debian12
- Final image ~20MB, runs as nonroot
- Entrypoint: openab-agent (ACP mode by default)

* fix: CI compilation errors

- Fix borrow-after-move: capture child pid before wait_with_output
- Fix unused import: restructure cfg(unix) block for pre_exec
- Fix unused variable: remove dead content Vec in OpenAiProvider
- Fix extra closing brace

* fix: resolve all clippy warnings

- Remove unused imports (anyhow::Result, serde_json::Value, Path)
- Mark Agent::new as #[cfg(test)] (only used in tests)
- Allow dead_code on LlmEvent::Error variant (reserved for future use)
- Move CommandExt import to top-level #[cfg(unix)]
- Remove unnecessary mut on child variable
- Collapse if-in-match per clippy suggestion

* fix: CI errors — cannot move in pattern guard + unused import

- Revert collapsed match (cannot move l in guard), add clippy allow
- Move CommandExt import inside unsafe block with allow(unused_imports)

* fix(openab-agent): correct device code auth flow for OpenAI

- Use https://auth.openai.com/api/accounts/deviceauth/usercode endpoint
- Use server-provided code_verifier (not client PKCE)
- Token exchange at /oauth/token with redirect_uri=https://auth.openai.com/deviceauth/callback
- Handle OpenAI's nested error format (error.code)
- Correct client_id: app_EMoamEEZ73f0CkXaXp7hrann

* refactor(openab-agent): remove unused PKCE deps (base64, sha2, getrandom)

* refactor(openab-agent): remove unused PKCE deps from Cargo.toml

* ci: add ACP smoke test — verify binary starts and speaks ACP

Sends an 'initialize' JSON-RPC request to the built binary and verifies
the response contains valid ACP agentInfo. This catches protocol-level
regressions without needing LLM credentials.

* style: cargo fmt

* fix: remove unused constants (CODEX_SCOPES, CODEX_AUDIENCE)

* feat(openab-agent): browser PKCE + device code auth flows

* feat(openab-agent): switch to Responses API (chatgpt.com/backend-api/codex/responses)

* feat(openab-agent): add codex-device subcommand + --no-browser flag

* fix(openab-agent): test cleanup for OAuth auth

* feat(openab-agent): add deps for browser PKCE flow (base64, sha2, getrandom, urlencoding, open, url)

* docs(openab-agent): guide headless users through copy-paste callback flow

* feat(openab-agent): paste-based callback for headless auth + simplified flow params

* fix(openab-agent): stop tool loop when text response received

* feat(openab-agent): Responses API with SSE stream parsing

* fix(openab-agent): correct Responses API input format for tool results (function_call_output)

* feat(openab-agent): read AGENTS.md from cwd as custom system prompt

* feat: add Dockerfile.openab-agent — native Rust agent (~20MB image)

* rename: Dockerfile.openab-agent → Dockerfile.native

* rename: remove old Dockerfile.openab-agent

* docs: add native-agent.md

* docs(README): add Native Agent to Other Agents table

* docs(native-agent): clarify AGENTS.md support, note Skills not yet supported

* docs(native-agent): note MCP not yet supported

* docs(native-agent): note MCP not yet supported

* style: cargo fmt

* style: cargo fmt

* style: cargo fmt

* style: cargo fmt

* fix: resolve clippy warnings (dead_code, is_multiple_of)

---------

Co-authored-by: chaodu-agent <chaodu-agent@users.noreply.github.com>
Co-authored-by: thepagent <hehsieh1010@gmail.com>
@thepagent thepagent merged commit 41a065c into main May 27, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants