[RFC] Horizontal Scaling Support: stdio_bus Protocol Integration #14918
morozow
started this conversation in
Show and tell
Summary
This proposal introduces stdio_bus protocol support for Codex, enabling horizontal scaling and multi-user support through a standardized process orchestration layer.
The Problem
Codex is designed as a single-user CLI tool: each instance handles one user session. This works well for individual developers, but it makes running Codex as a service for multiple users difficult.
Current Configuration Options
Option 1: One Process Per User
Problems:
Option 2: WebSocket Server Mode
Codex supports --listen ws://0.0.0.0:PORT, but:
Option 3: Build Custom Orchestration
You could build your own process manager, but you'd need to implement:
This is significant engineering effort, and every team running Codex at scale would need to solve the same problems independently.
The Solution: stdio_bus Protocol
stdio_bus is a lightweight process orchestration protocol that solves exactly this problem. It manages a pool of worker processes, routes requests based on session affinity, and handles all the operational concerns automatically.
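The session-affinity routing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the stdio_bus implementation: the class name SessionRouter, the use of SHA-256, and the modulo mapping are all hypothetical choices made for the example.

```python
import hashlib


class SessionRouter:
    """Illustrative sticky routing: one session ID always maps to one worker.

    Hypothetical sketch; the real stdio_bus routing algorithm may differ.
    """

    def __init__(self, worker_count: int) -> None:
        if worker_count < 1:
            raise ValueError("worker_count must be at least 1")
        self.worker_count = worker_count

    def route(self, session_id: str) -> int:
        # Use a stable digest rather than Python's built-in hash(), which is
        # randomized per process for strings and would break affinity
        # across restarts of the router.
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.worker_count


router = SessionRouter(worker_count=3)
first = router.route("user-A")
second = router.route("user-A")
print(first == second)  # the same session always lands on the same worker
```

A pure hash mapping like this needs no shared state, but it reshuffles sessions whenever the worker count changes. A router that records assignments in a table, or uses consistent hashing, would avoid that at the cost of extra bookkeeping.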
How It Works
sequenceDiagram
    participant App as Your Application
    participant Bus as stdio_bus daemon
    participant W0 as Worker 0
    participant W1 as Worker 1
    participant W2 as Worker 2
    Note over Bus: Session Router<br/>user-A → W0<br/>user-B → W1<br/>user-C → W2<br/>user-D → W0
    App->>Bus: request {sessionId: "user-A"}
    Bus->>W0: route to worker 0
    W0-->>Bus: response
    Bus-->>App: response
    App->>Bus: request {sessionId: "user-B"}
    Bus->>W1: route to worker 1
    W1-->>Bus: response
    Bus-->>App: response
    App->>Bus: request {sessionId: "user-D"}
    Note over Bus: Same hash as user-A<br/>→ load balanced to W0
    Bus->>W0: route to worker 0
    W0-->>Bus: response
    Bus-->>App: response

Key Features
Messages with the same sessionId always route to the same worker
Protocol
Communication uses NDJSON (newline-delimited JSON) over stdio:
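A minimal sketch of this framing in Python, for illustration only. The proposal names sessionId; the other fields used here (id, method, result) and the helper names encode_message, decode_stream, and handle are hypothetical and not taken from the stdio_bus specification.

```python
import json
from typing import Iterable, Iterator


def encode_message(message: dict) -> str:
    """Serialize one message as a single NDJSON line.

    json.dumps escapes newlines inside string values, so the only raw
    newline in the output is the trailing record separator.
    """
    return json.dumps(message, separators=(",", ":")) + "\n"


def decode_stream(lines: Iterable[str]) -> Iterator[dict]:
    """Parse an iterable of text lines (for example sys.stdin) as NDJSON."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


def handle(request: dict) -> dict:
    """Hypothetical worker handler that echoes sessionId in its response."""
    return {
        "id": request.get("id"),
        "sessionId": request["sessionId"],
        "result": "ok",
    }


# Round trip: encode a request, decode it, and build a response line.
wire = encode_message({"id": 1, "sessionId": "user-A", "method": "ping"})
for request in decode_stream([wire]):
    print(encode_message(handle(request)), end="")
```

In a real worker the input would be sys.stdin and each response would be written to sys.stdout and flushed immediately, since buffered output can stall a line-oriented peer.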
The sessionId field is the routing key. All messages with the same sessionId go to the same worker, preserving conversation state.
Implementation
Changes to Codex
New --worker flag for codex-app-server:
sessionId in all responses
New codex-stdio-bus crate:
Configuration Example
{
  "pools": [
    {
      "id": "codex",
      "command": "/usr/local/bin/codex-app-server",
      "args": ["--worker"],
      "instances": 10,
      "env": {"RUST_LOG": "info"}
    }
  ],
  "limits": {
    "max_restarts": 5,
    "restart_window_sec": 60,
    "drain_timeout_sec": 30
  },
  "routing": {
    "session_id_field": "sessionId",
    "default_pool": "codex"
  }
}

Real-World Test Results
Tested with 5 concurrent users and 3 workers:
Session distribution:
All subsequent requests from each user routed to their assigned worker (session affinity confirmed).
Use Cases
1. Multi-Tenant SaaS
Operate Codex as a service for multiple customers:
2. IDE Backend
Power multiple IDE instances from a shared Codex pool:
3. CI/CD Integration
Run Codex tasks in parallel across a worker pool:
Comparison
Questions for Discussion
Default worker count: Should there be a recommended ratio of workers to expected concurrent users?
Session timeout: Should stdio_bus support automatic session cleanup after inactivity?
Metrics endpoint: Would a /metrics endpoint for Prometheus integration be valuable?
Alternative routing: Beyond session affinity, are there use cases for content-based routing?
Resources
I'd love to hear feedback from the community on this approach. Is horizontal scaling a pain point you've encountered? Are there other scalability scenarios we should consider?