A local HTTP service that exposes OpenAI-compatible API endpoints (/v1/chat/completions, /v1/models) backed by the Claude Code CLI. This lets OpenAI API clients (OpenCode, OpenClaw, the openai Python SDK, etc.) use Claude Code under the hood — leveraging your Claude Max Plan subscription instead of paying per-token API credits.
```
OpenAI Client  →  FastAPI Server  →  claude -p "prompt"  →  Anthropic API
 (OpenCode)       (this project)     (Claude Code CLI)       (Max Plan)
```
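The core mechanic behind that pipeline is small enough to sketch: the server shells out to the CLI in print mode (`claude -p`) and returns its stdout as the completion text. A minimal, hypothetical sketch — the real logic lives in `app/claude.py`, and the `binary` parameter exists only so the function can be exercised without Claude Code installed:

```python
import asyncio

async def run_claude(prompt: str, model: str = "sonnet", binary: str = "claude") -> str:
    """Run the Claude Code CLI in print mode and return its stdout.

    `binary` defaults to the real CLI; it is parameterized here purely so
    this sketch can be tested with a stand-in command.
    """
    proc = await asyncio.create_subprocess_exec(
        binary, "-p", prompt, "--model", model,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, err = await proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(err.decode() or f"{binary} exited with {proc.returncode}")
    return out.decode().strip()
```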
- Python 3.10+
- Claude Code CLI installed and authenticated (`claude` must be in your PATH)
```bash
git clone https://github.com/your-user/claude-code-api-wrapper.git
cd claude-code-api-wrapper
pip install -e .
```

Copy the example env file and set your API key:

```bash
cp .env.example .env
```

Edit `.env` and set `API_KEY` to whatever Bearer token you want clients to use for authentication.
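A minimal `.env` might look like this (the values below are illustrative placeholders, not shipped defaults):

```env
API_KEY=your-secret-key
DEFAULT_MODEL=sonnet
PORT=8000
```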
| Variable | Type | Default | Description |
|---|---|---|---|
| `API_KEY` | str | required | Bearer token for authenticating requests |
| `DEFAULT_MODEL` | str | `sonnet` | Default Claude model |
| `FALLBACK_MODEL` | str | — | Fallback model when the primary is overloaded |
| `MODEL_MAP` | JSON dict | `{"gpt-4":"opus","gpt-4o":"sonnet",...}` | OpenAI → Claude model name mapping |
| `MAX_CONCURRENT` | int | `10` | Max concurrent CLI processes |
| `SESSION_TTL_SECONDS` | int | `1800` | Session expiry in seconds (30 min) |
| `MAX_BUDGET_USD` | float | — | Per-request budget cap in USD |
| `WORKING_DIRECTORY` | str | — | Working directory for CLI processes |
| `HOST` | str | `0.0.0.0` | Server bind address |
| `PORT` | int | `8000` | Server bind port |
Start the server:
```bash
API_KEY=your-secret-key uvicorn app.main:app
```

Or with auto-reload for development:

```bash
API_KEY=your-secret-key uvicorn app.main:app --reload
```

| Method | Path | Auth | Description |
|---|---|---|---|
| GET | `/health` | No | Health check |
| GET | `/v1/models` | Yes | List available models |
| POST | `/v1/chat/completions` | Yes | Chat completion (streaming & non-streaming) |
Non-streaming:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer your-secret-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"sonnet","messages":[{"role":"user","content":"Say hello"}]}'
```

Streaming:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer your-secret-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"sonnet","messages":[{"role":"user","content":"Say hello"}],"stream":true}'
```

Multi-turn conversation:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer your-secret-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonnet",
    "messages": [
      {"role": "user", "content": "My name is Alice"},
      {"role": "assistant", "content": "Hello Alice!"},
      {"role": "user", "content": "What is my name?"}
    ]
  }'
```

OpenAI Python SDK:
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="your-secret-key")
response = client.chat.completions.create(
    model="sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Streaming with the OpenAI Python SDK:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="your-secret-key")
stream = client.chat.completions.create(
    model="sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

OpenAI model names are automatically mapped to Claude equivalents:
| OpenAI Name | Claude Name |
|---|---|
| `gpt-4` | `opus` |
| `gpt-4o` | `sonnet` |
| `gpt-4o-mini` | `haiku` |
| `gpt-4-turbo` | `sonnet` |
| `gpt-3.5-turbo` | `haiku` |
You can also use Claude model names directly (`opus`, `sonnet`, `haiku`). Customize the mapping via the `MODEL_MAP` env var.
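For example, to route `gpt-4o` to `opus` instead of `sonnet` (an illustrative mapping, not the default):

```bash
export MODEL_MAP='{"gpt-4":"opus","gpt-4o":"opus","gpt-4o-mini":"haiku"}'
API_KEY=your-secret-key uvicorn app.main:app
```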
The wrapper automatically tracks conversation sessions using Claude Code's --session-id / --resume flags. When you send a multi-turn request, it matches the message history against known sessions and resumes the matching one — so Claude retains full context without resending the entire history each time.
Sessions expire after SESSION_TTL_SECONDS (default: 30 minutes). You can also pass an explicit X-Session-Id header for manual session control.
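One way such history matching can work — a hypothetical sketch of the idea, not the actual code in `app/claude.py` — is to key each session by a fingerprint of the conversation so far, so a follow-up request whose history prefix matches a stored key resumes that session:

```python
import hashlib
import json

def history_key(messages: list[dict]) -> str:
    """Stable fingerprint of a message history (role/content pairs only)."""
    blob = json.dumps([(m["role"], m["content"]) for m in messages], sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

sessions: dict[str, str] = {}

# Turn 1: the client sends one user message; the server creates a session,
# then records it under the history *including* the reply it produced.
turn1 = [{"role": "user", "content": "My name is Alice"}]
reply = {"role": "assistant", "content": "Hello Alice!"}
sessions[history_key(turn1 + [reply])] = "session-abc"

# Turn 2: the new request's history minus its newest user message matches
# the stored key, so the server resumes that session (via --resume) instead
# of replaying the whole transcript.
turn2 = turn1 + [reply, {"role": "user", "content": "What is my name?"}]
resumed = sessions.get(history_key(turn2[:-1]))
```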
```
app/
├── __init__.py
├── config.py   # Settings via pydantic-settings + env vars
├── models.py   # Pydantic models for OpenAI request/response schemas
├── claude.py   # CLI invocation, session store, message formatting
├── stream.py   # Claude stream-json → OpenAI SSE conversion
└── main.py     # FastAPI app, routes, auth, error handling
```
MIT