
OpenCode Bridge

Use OpenCode Go OSS models (DeepSeek V4 Pro, Kimi K2.6, DeepSeek V4 Flash) as native Codex subagents — with full tool-loop support, multi-turn conversation, reasoning preservation, and orchestration routing.

Codex speaks the OpenAI Responses API. OpenCode Go exposes Chat Completions. This bridge sits in the middle, translating between them so Codex can spawn DeepSeek and Kimi workers the same way it spawns GPT workers.

Critical: Architecture warning

Do NOT set model_provider = "opencode_bridge" as your top-level Codex provider. Codex will route GPT-5.5 orchestrator requests through the bridge, which cannot serve GPT models (OpenCode Go rejects them). This causes timeouts on any operation requiring orchestration (reads, writes, multi-turn tool loops).

Correct architecture:

Parent session: GPT-5.5 (native openai provider)
OSS subagents only: opencode_bridge provider

The bridge is a subagent-only provider. Use model_provider = "opencode_bridge" in agent TOMLs only. Your .codex/config.toml should NOT set a top-level model_provider to opencode_bridge.
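
For reference, the split looks roughly like this (a minimal sketch; codex-oss install generates the real files, and the exact provider keys may differ):

# .codex/config.toml: declare the provider, but set NO top-level model_provider
[model_providers.opencode_bridge]
name = "OpenCode Bridge"
base_url = "http://localhost:4000/v1"
env_key = "PROXY_API_KEY"
wire_api = "responses"

# .codex/agents/oss-kimi-rapid.toml: only agents opt into the bridge
model_provider = "opencode_bridge"
model = "ocg-kimi-k2.6"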

For direct codex exec testing without subagents, use v6 compatibility mode: start the bridge with GPT_MODEL_STRATEGY=oss to alias GPT requests to OSS models. This is for bridge testing only — not the recommended production setup.
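
A hedged example (whether --mode compat-test already sets GPT_MODEL_STRATEGY=oss is not documented here, so it is passed explicitly; the codex exec prompt is illustrative):

# Bridge testing only: GPT requests are aliased to OSS models
OPENCODE_GO_API_KEY=sk-... GPT_MODEL_STRATEGY=oss \
  python3 ~/bridge/bin/codex-oss start --mode compat-test
codex exec "Summarize this repository"   # GPT-5.5 request, served by an OSS model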

Quick start

# 1. Clone the bridge
git clone https://github.com/goldtetsola/opencode-bridge.git ~/bridge

# 2. Set your OpenCode Go key
echo 'OPENCODE_GO_API_KEY=sk-...' > ~/bridge/.codex-oss/env/opencode-go.env

# 3. cd into your Codex project and install
cd ~/your-codex-project
python3 ~/bridge/bin/codex-oss install

# 4. Start the bridge
OPENCODE_GO_API_KEY=sk-... python3 ~/bridge/bin/codex-oss start --mode production

# 5. Verify everything
python3 ~/bridge/bin/codex-oss doctor
# Expected: 22 passed, 0 warnings, 0 failed

That's it. codex-oss install generates all config — .codex/config.toml, agent TOMLs, AGENTS.md routing rules, and recursive-codex-exec blocking. codex-oss doctor checks 22 invariants and tells you exactly what to fix.

What codex-oss does

| Command | What it does |
| --- | --- |
| install | Generates provider config, 3 agent TOMLs, AGENTS.md delegation contract, recursive-codex blocking rules, runtime directories, and gitignore entries. |
| doctor | Checks 22 invariants: config correctness, agent configuration, AGENTS.md compliance, recursive-codex exec blocking, bridge health, GPT leakage, OSS inference, state DB persistence. PASS/WARN/FAIL with fix instructions. Supports --json for automation. |
| start | Launches the bridge with mode selection (production/compat-test/openai). Refuses to start in production mode without a valid key. Sets project-local state paths. |
| stop | Graceful shutdown via PID file or port. |
| status | Queries the bridge health endpoint — shows version, mode, model health, and concurrency config. |
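
A typical day-to-day loop, using the paths from the quick start:

OPENCODE_GO_API_KEY=sk-... python3 ~/bridge/bin/codex-oss start --mode production
python3 ~/bridge/bin/codex-oss status    # version, mode, model health, concurrency
python3 ~/bridge/bin/codex-oss doctor    # 22 invariant checks, PASS/WARN/FAIL
python3 ~/bridge/bin/codex-oss stop      # graceful shutdown via PID file or port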

Architecture

Codex Desktop / CLI
    │
    │  Parent session: GPT-5.5 (native openai provider)
    │  OSS subagents only: opencode_bridge provider
    │
    ├─ GPT-5.5 orchestrator (native)
    │     │
    │     ├─ GPT-5.4 worker (native)
    │     └─ OSS subagent spawn
    │           │
    │           │  Responses API (SSE streaming, live upstream)
    │           ▼
    │     bridge.py   ← this repo
    │           │
    │           │  Chat Completions API (stream=true)
    │           ▼
    │     api.opencode.ai/zen/go/v1
    │           │
    │           ▼
    │     DeepSeek V4 Pro / Kimi K2.6 / Flash

The bridge is a subagent-only provider. Do NOT set model_provider = "opencode_bridge" as your session-wide provider; the bridge rejects GPT-5.5 requests (or, in compatibility mode, aliases them to OSS, which is for testing only).

The bridge handles:

  • Protocol translation: Responses API ↔ Chat Completions (request format, tool definitions, output items)
  • SSE streaming with heartbeat: Sends response.created immediately, then heartbeat comments during upstream processing. Prevents Codex timeouts on complex queries with long reasoning.
  • Tool type filtering: Strips hosted tools (image_generation, web_search, code_interpreter), MCP namespaces, and app/connector tools that OSS providers reject
  • Tool format conversion: Responses flat format → Chat Completions nested function wrapper, with name sanitization for strict providers (see the sketch after this list)
  • Reasoning preservation: DeepSeek V4 Pro requires reasoning_content to be replayed across multi-turn tool calls. The proxy stores and injects it correctly
  • Conversation state: Tracks response history in SQLite so tool round-trips survive proxy restarts. Matches orphan function_call_output items to cached function_call items by call_id
  • Context preservation: Repairs conversation history so earlier completed assistant→tool exchanges are preserved (not truncated), while incomplete tails are dropped
  • Retry + fallback: Retries transient upstream errors with exponential backoff. Falls back to alternate models on capacity errors
  • Developer role mapping: Maps Codex's developer role to system for providers that reject it (DeepSeek, Kimi)
  • GPT model handling (v6): Detects and rejects GPT-5.5/5.4 requests hitting the bridge by mistake. Configurable via GPT_MODEL_STRATEGY: error (immediate rejection, default), oss (alias to OSS for compatibility testing), or openai (API passthrough)
  • Live upstream streaming (v5+): Uses stream=true against OpenCode Go and translates Chat Completions chunks to Responses SSE deltas in real time
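
To make the tool format conversion concrete, here is a minimal Python sketch of the idea (the function and names are hypothetical, not the bridge's actual code):

import re

def to_chat_completions_tool(responses_tool: dict) -> dict:
    # Responses API declares function tools flat:
    #   {"type": "function", "name": ..., "description": ..., "parameters": {...}}
    # Chat Completions expects the same fields nested under a "function" key.
    # Sanitize the name for strict providers (alphanumerics, _, -, . only).
    safe_name = re.sub(r"[^a-zA-Z0-9_.-]", "_", responses_tool["name"])[:64]
    return {
        "type": "function",
        "function": {
            "name": safe_name,
            "description": responses_tool.get("description", ""),
            "parameters": responses_tool.get("parameters")
                          or {"type": "object", "properties": {}},
        },
    }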

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| OPENCODE_GO_API_KEY | (required) | Your OpenCode Go API key |
| PROXY_API_KEY | LITELLM_MASTER_KEY value | Key Codex sends to authenticate with the proxy |
| LITELLM_MASTER_KEY | sk-local-codex-bridge | Auth key (shared name for Codex config compatibility). Leave empty for no auth on localhost. |
| PROXY_PORT | 4000 | Port the proxy listens on |
| PROXY_STATE_DB | /tmp/opencode_responses_proxy_state.sqlite3 | SQLite file for conversation state |
| FORCE_SINGLE_TOOL_INSTRUCTIONS | 0 | Set to 1 to inject a guard discouraging parallel tool calls |
| FALLBACK_MODEL_MAP_JSON | deepseek→kimi/flash fallback | JSON map of model→fallback chain |
| UPSTREAM_TIMEOUT_SECONDS | 240 | Timeout for upstream API calls |
| UPSTREAM_RETRIES | 2 | Number of retries on transient errors |
| MODEL_MAP_JSON | (built-in) | Override model name mapping |
| PROXY_LOG_PATH | (stderr) | Path for structured JSON log output |
| SSE_CHUNK_SIZE | 256 | Characters per SSE text delta chunk |
| SSE_UPSTREAM_HEARTBEAT_SECONDS | 5 | Seconds between heartbeat comments while waiting for upstream |
| UPSTREAM_STREAM | 1 | Use stream=true for upstream Chat Completions (live streaming) |
| GPT_MODEL_STRATEGY | error | How to handle GPT-model requests: error (reject immediately), oss (alias to OSS model for testing), openai (passthrough to OpenAI API — requires OPENAI_API_KEY) |
| GPT_MODEL_OSS_FALLBACK | deepseek-v4-pro | OSS model to use when GPT_MODEL_STRATEGY=oss |
| OPENAI_API_KEY | (not set) | Required only for GPT_MODEL_STRATEGY=openai |
| MAX_GLOBAL_UPSTREAM_CONCURRENCY | 2 | Cap on concurrent upstream requests globally |
| MODEL_CONCURRENCY_JSON | deepseek/kimi 1, flash 2 | Per-model concurrency caps |
| CIRCUIT_BREAKER_ERRORS | 2 | Errors before marking a model degraded |
| CIRCUIT_BREAKER_COOLDOWN | 300 | Seconds before auto-recovering a degraded model |
| OSS_MAX_TOOL_TURNS | 6 | Max tool turns before an OSS agent is stopped |
| ALLOW_MISSING_OPENCODE_KEY | 0 | Set to 1 to bypass the fatal key check |
| OSS_NATIVE_MAX_TOOL_EXCHANGES | 1 | Max tool calls per OSS subagent turn |
| CONTINUATION_TOOLS | none | Tools for continuation turns (none = force finalization) |
| CONTINUATION_MODEL | kimi-k2.6 | Model for the read-result finalizer |
| CONTINUATION_DEADLINE_SECONDS | 45 | Deadline for finalizer model calls |
| WRITE_RESULT_MODE | deterministic | Write results: deterministic = no model call |
| MAX_TOOL_OUTPUT_CHARS | 20000 | Compact tool outputs larger than this |
| UPSTREAM_FIRST_BYTE_TIMEOUT_SECONDS | 30 | Timeout for the first byte from upstream |
| UPSTREAM_IDLE_TIMEOUT_SECONDS | 30 | Timeout for upstream idle during processing |
| DEGRADED_COMPLETION_ON_TIMEOUT | 1 | Return a degraded report on timeout |
| EXPOSE_EMPTY_REASONING_ITEM | 1 | Include an empty reasoning item in output for Codex compatibility |
| STRIP_TOOLS | 0 | Set to 1 to strip ALL tools (force text-only responses) |
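
For example, a startup line that tightens timeouts and overrides the fallback chain (the JSON shapes here are assumptions based on the descriptions above; check bridge.py for the exact schema):

UPSTREAM_TIMEOUT_SECONDS=120 \
UPSTREAM_RETRIES=3 \
FALLBACK_MODEL_MAP_JSON='{"deepseek-v4-pro": ["kimi-k2.6", "deepseek-v4-flash"]}' \
OPENCODE_GO_API_KEY=sk-... \
  python3 ~/bridge/bin/codex-oss start --mode production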

Supported models

| Codex model ID | Upstream model | Best for |
| --- | --- | --- |
| ocg-deepseek-v4-pro | deepseek-v4-pro | Bounded implementation, debugging, reasoning-heavy analysis |
| ocg-kimi-k2.6 | kimi-k2.6 | Fast repo navigation, scouting, review |
| ocg-deepseek-v4-flash | deepseek-v4-flash | Docs, summaries, mechanical low-risk tasks |
| ocg-kimi-k2.5 | kimi-k2.5 | (untested) |
| ocg-qwen3.6-plus | qwen3.6-plus | (untested) |
| ocg-glm-5.1 | glm-5.1 | (untested) |
| ocg-minimax-m2.7 | minimax-m2.7 | (untested) |

Also accepts OpenCode-style opencode-go/<model> model IDs.
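
For a quick smoke test against the bridge itself (the /v1/responses path mirrors the Responses API and is an assumption here; adjust if your install differs):

curl http://localhost:4000/v1/responses \
  -H "Authorization: Bearer sk-local-codex-bridge" \
  -H "Content-Type: application/json" \
  -d '{"model": "ocg-kimi-k2.6", "input": "Reply with OK"}'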

Bridge: OSS subagent runtime

OSS agents are bounded transactions with deterministic finalization, managed autonomy, and context-pack mode for prep tasks. The bridge auto-detects subagent forks (GPT-5.5 with continuation context) and aliases to OSS — no fork_turns configuration needed.

Execution modes

The bridge selects the right mode based on the task handoff (illustrative handoffs follow the list):

  • no_tool_exact — exact output tasks (guardrail tests, control probes). No model decisions needed.
  • context_pack_report — read tasks with explicit READ-ONLY PATHS + DELIVERABLE. The bridge gathers all files internally and sends one no-tools synthesis call. Command-aware: "grep X in Y" steps get grep output, not full files.
  • managed_autonomy — discovery tasks where the model chooses what to inspect. Budget-capped, duplicate-suppressed, evidence-ledger-injected.
  • bounded_write_exact — writes to explicit OWNED PATHS with exact content. Bridge writes the file directly, reads it back, returns deterministic PASS/FAIL. No model call needed.
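
Two illustrative handoffs (wording hypothetical; the trigger markers come from the mode descriptions above):

# Routes to context_pack_report
READ-ONLY PATHS: src/auth/token.ts, src/auth/session.ts
DELIVERABLE: summary of how token refresh interacts with session expiry

# Routes to bounded_write_exact
OWNED PATHS: docs/CHANGELOG.md
TASK: write the exact changelog entry below, verbatim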

Guarantees

  • Writes: Deterministic — bridge writes file, reads back, reports result. No model self-report dependency.
  • Reads: Context-pack or managed autonomy with deadline control. When grep is detected, only grep output is included (not full files).
  • Intent rejection: "I will", "Running...", "Starting..." are rejected as non-terminal. One internal retry, then a deterministic report from the gathered evidence.
  • Timeout recovery: Request-level deadline prevents serial timeout stacking. Deterministic PARTIAL report within deadline instead of client disconnect.
  • Terminal guarantee: Every path emits response.completed or response.failed before closing the SSE stream.

Runtime environment variables

| Variable | Default | Description |
| --- | --- | --- |
| OSS_NATIVE_MAX_TOOL_EXCHANGES | 1 | Max tool calls per OSS subagent turn |
| CONTINUATION_TOOLS | none | Tools for continuation turns (none = no tools) |
| CONTINUATION_MODEL | kimi-k2.6 | Model for the read finalizer |
| CONTINUATION_FALLBACK_MODELS | deepseek-v4-flash | Fallback finalizer models |
| CONTINUATION_DEADLINE_SECONDS | 45 | Deadline for finalizer calls |
| WRITE_RESULT_MODE | deterministic | Write handling: deterministic = no model call |
| MAX_TOOL_OUTPUT_CHARS | 20000 | Compact outputs larger than this |
| FORCE_SINGLE_TOOL_INSTRUCTIONS | 1 | Enforce single-tool-per-turn (prevents parallel-call repair failures) |
| DEGRADED_COMPLETION_ON_TIMEOUT | 1 | Return a degraded report on timeout |
| REQUEST_DEADLINE_SECONDS | 90 | Hard deadline for the entire request |
| CONTEXT_PACK_MAX_CHARS | 24000 | Max chars in the context-pack source bundle |
| GPT_MODEL_STRATEGY | error | GPT handling: error (reject top-level misuse, auto-alias subagent forks), oss (alias all), openai (passthrough) |

Model-task matrix

| Model | Best for | Real example | Rate limit |
| --- | --- | --- | --- |
| DeepSeek V4 Flash | Docs, summaries, mechanical edits, test inventories | "Write a changelog entry for the last 3 commits" | 31K req/5hr |
| DeepSeek V4 Pro | Bounded implementation, debugging, feature work | "Add a test for the validateToken function following existing patterns" | 3.4K req/5hr |
| Kimi K2.6 | Repo exploration, scouting, code review, fast navigation | "Find every place that calls formatName and summarize the call patterns" | 1.1K req/5hr |
| GPT-5.4 | Implementation where blast radius matters, cross-module changes | "Refactor the publish-bundle hydration to use the new artifact reader" | Usage-based |
| GPT-5.5 | Architecture, final review, critical paths | "Review this recovery path change for safety" | Usage-based |

Orchestration

How routing works

Codex's orchestrator (GPT-5.5) reads AGENTS.md and agent description fields from .codex/agents/ to decide which worker handles each task:

You: "Find all callers of formatName"
         │
         ▼
   GPT-5.5 reads AGENTS.md routing rules
         │
         │  Safety check: auth? No
         │  Task type: exploration → Kimi
         │
         ▼
   GPT-5.5 spawns oss_kimi_rapid (fork_turns: "none")
         │  Handoff: task scope, allowed paths, output format
         ▼
   Kimi returns file paths + line numbers + confidence
         │
         ▼
   GPT-5.5 synthesizes result. Done.

Safety boundaries

OSS agents have explicit "DO NOT USE FOR" descriptions and developer instructions that prevent them from touching critical paths. If the orchestrator routes an auth/schema/recovery task to an OSS agent, the agent should refuse.

Critical paths that must stay on GPT-5.5/5.4:

  • Authentication, authorization, session management
  • Recovery paths, error recovery, state repair
  • Schema authority, database migrations
  • CI gates, build pipelines, deployment
  • Cross-module invariants (>2 modules affected)
  • Any path where failure = data loss or security breach

Fork mode

OSS subagents must be spawned with fork_turns: "none". Full-history forks inherit the parent GPT-5.5 model and reasoning effort, which conflicts with the model/provider overrides OSS agents need. This is a known Codex limitation (issue #20077).

The AGENTS.md handoff template includes this requirement. See orchestration/ROUTING.md for details.
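
In that spirit, a delegation rule in AGENTS.md might read (hypothetical excerpt; merge the real rules from orchestration/AGENTS.md instead):

## OSS delegation
- Exploration, call-site mapping, first-pass review → spawn oss_kimi_rapid
- Docs, changelogs, summaries → spawn oss_flash_support
- Bounded single-file implementation → spawn oss_deepseek_pro
- Always spawn OSS agents with fork_turns: "none"
- Never delegate auth, schema, recovery, CI, or cross-module work to OSS agents
- Every handoff states: task scope, allowed paths, expected output format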

Orchestration files

| File | Purpose |
| --- | --- |
| orchestration/AGENTS.md | Routing rules for GPT-5.5. Merge into your project's AGENTS.md. |
| orchestration/ROUTING.md | Reference: decision flowchart, capability matrix, handoff examples, troubleshooting. |
| orchestration/agents/*.toml | Recommended agent TOMLs with explicit routing descriptions and safety boundaries. |

Agent TOMLs

Three pre-built agent files are provided in orchestration/agents/ (recommended) and agents/ (minimal):

| Agent TOML | Model | Reasoning | Sandbox | Use case |
| --- | --- | --- | --- | --- |
| oss-deepseek-pro.toml | deepseek-v4-pro | high | workspace-write | Bounded impl, debugging, analysis |
| oss-kimi-rapid.toml | kimi-k2.6 | medium | read-only | Repo navigation, scouting, review |
| oss-flash-support.toml | deepseek-v4-flash | medium | read-only | Docs, summaries, changelog, mechanical |

Creating your own agent

You can create agents for any model OpenCode Go supports:

  1. Pick a model ID. Run curl https://opencode.ai/zen/go/v1/models -H "Authorization: Bearer $OPENCODE_GO_API_KEY" to see the full catalog. Use the model name with an ocg- prefix (e.g. qwen3.6-plus → ocg-qwen3.6-plus).

  2. Create a .toml file in your project's .codex/agents/:

name = "oss_my_worker"
description = "What this agent does. USE ME WHEN: <criteria>. DO NOT USE FOR: <boundaries>."

model_provider = "opencode_bridge"       # always this
model = "ocg-<model-id>"                 # e.g. ocg-qwen3.6-plus
model_reasoning_effort = "high"          # high / medium / low
sandbox_mode = "workspace-write"         # or "read-only"

developer_instructions = """
Your instructions. Rules, scope, output format, escalation criteria.
Include: confidence marker (HIGH/MEDIUM/LOW), files inspected, caveats.
"""

  3. Set the right reasoning effort:

     | Effort | When to use | Example models |
     | --- | --- | --- |
     | high | Implementation, debugging, analysis | deepseek-v4-pro |
     | medium | Navigation, docs, summaries, mechanical | kimi-k2.6, deepseek-v4-flash |
     | low | Trivial text generation | Any fast model |

  4. Choose the right sandbox mode:

     | Mode | Permissions | Best for |
     | --- | --- | --- |
     | workspace-write | Read and edit project files | Implementation, debugging, refactoring |
     | read-only | Read files, run safe commands | Exploration, review, docs, analysis |

  5. Write good descriptions. The description field is Codex's routing signal. Include both "use me when" AND "do NOT use for" criteria. Example:

     "Bounded implementation worker. USE ME WHEN: single file change, tests exist, requirements clear. DO NOT USE FOR: auth, schema, recovery, cross-module changes."

  6. Register in AGENTS.md. Add your agent to the routing rules so the orchestrator knows when to delegate to it.

  7. Use fork_turns: "none". OSS agents use different models/providers than GPT-5.5, so they must not inherit the parent session via full-history fork.

Tested models

| Agent | Model | Reasoning | Sandbox | Status |
| --- | --- | --- | --- | --- |
| oss-deepseek-pro.toml | deepseek-v4-pro | high | workspace-write | Working |
| oss-kimi-rapid.toml | kimi-k2.6 | medium | read-only | Working |
| oss-flash-support.toml | deepseek-v4-flash | medium | read-only | Working |
| oss-qwen3.6-plus (custom) | qwen3.6-plus | n/a | n/a | Untested |
| oss-glm-5.1 (custom) | glm-5.1 | n/a | n/a | Untested |
| oss-minimax-m2.7 (custom) | minimax-m2.7 | n/a | n/a | Untested |

Untested models may need adjustments — some providers are stricter about tool schemas (shape failures) or message format requirements (relational failures). The bridge strips unsupported tool types and maps developer → system, but provider-specific quirks may still surface. If you test an untested model, open an issue with your findings.

External OSS workers (fallback)

The bin/ directory includes four wrapper scripts for running OSS models as external workers (via opencode run directly, without the proxy):

| Script | Default model | Purpose |
| --- | --- | --- |
| oss-scout | kimi-k2.6 | Read-only repo exploration, file mapping, summaries |
| oss-review | kimi-k2.6 | First-pass review, missing-test detection |
| oss-docs | deepseek-v4-flash | Docs, changelog, low-stakes text |
| oss-patch | deepseek-v4-pro | Isolated patch drafts in a separate worktree |

Use these when the proxy is down or rate-limited, or when you need an isolated worktree for write tasks.

Model routing lanes

Lane A — GPT-5.5
  Orchestration, architecture, final acceptance, critical review

Lane B — GPT-5.4
  Trusted bounded implementation and review

Lane C — GPT-5.4-mini
  Cheap read-heavy exploration and support

Lane D — OSS native subagents (through this bridge)
  oss-deepseek-pro: bounded implementation, debugging, analysis
  oss-kimi-rapid: repo navigation, scouting, review
  oss-flash-support: docs, summaries, mechanical

Lane E — OSS external workers (fallback)
  Direct opencode run when proxy is unavailable

Limitations

  • Not production-grade: This is a local development tool. It uses Python's stdlib HTTP server (ThreadingHTTPServer, so requests are handled concurrently, but without production hardening) and no authentication beyond a shared key (configurable; can be disabled entirely for localhost).
  • Single machine only: Bind to localhost. Do not expose publicly.
  • Subagent-only provider: The bridge is NOT a session-wide Codex provider. GPT-5.5 must remain native. Only OSS agents in .codex/agents/ should route through the bridge.
  • DeepSeek thinking mode costs tokens: DeepSeek V4 Pro's reasoning_content is preserved internally but counts against your OpenCode Go usage. Expect ~300-400K tokens for multi-turn coding tasks.
  • True upstream streaming (v5+): The bridge uses stream=true against OpenCode Go and translates chunks live. This reduces latency compared to v3's fake SSE but depends on OpenCode Go's streaming behavior.
  • Subagent spawning: OSS agents must use fork_turns: "none" (full-history forks conflict with model/provider overrides). The orchestrator needs to include explicit task context in handoffs since the child doesn't inherit parent conversation history.

Self-test

python3 bridge.py --self-test

Expected output:

self-test passed

Troubleshooting

First step for any issue: run codex-oss doctor. It checks 22 invariants and tells you exactly what to fix.

Common issues the doctor catches:

| Problem | Doctor check | What it means |
| --- | --- | --- |
| GPT requests timing out | bridge.gpt_rejection FAIL | Bridge set as session-wide provider. Remove model_provider = "opencode_bridge" from config. |
| OSS agents can't spawn | agents.*.provider FAIL | Agent TOML missing model_provider = "opencode_bridge". Run codex-oss install --force. |
| Bridge won't start | Fatal at launch | OPENCODE_GO_API_KEY not set in production mode. Set the key or ALLOW_MISSING_OPENCODE_KEY=1. |
| State lost after reboot | bridge.state_db WARN | State DB in /tmp. codex-oss start now uses .codex-oss/state/ by default. |
| Recursive codex exec | rules.recursive_block WARN | No blocking rule installed. Run codex-oss install. |
| Full-history fork error | agreements.fork_turns WARN | AGENTS.md doesn't specify fork_turns: none. Run codex-oss install. |

For advanced debugging, use the JSON output:

codex-oss doctor --json
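
e.g. to gate a script or CI step on the result (the jq filter assumes a top-level failure count in the JSON; the exact schema is not shown here):

codex-oss doctor --json | jq -e '.failed == 0'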

License

Apache 2.0 — see LICENSE file.
