sinistercodes/claude-proxy

Perception Claude Proxy

A small OpenAI-compatible HTTP proxy that forwards /v1/chat/completions traffic to the Claude Agent SDK, so IDEs and tools that only speak the OpenAI format can use a Claude subscription.

Built primarily for Perception.cx but works with any OpenAI-compatible client (Continue, Cursor, etc.).

Install

npm install
cp .env.example .env   # optional; defaults are sane
node server.js

The proxy listens on :4001 by default. Point your IDE at http://localhost:4001/v1 and pick sonnet, opus, or haiku as the model.
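
As a quick way to check the setup without an IDE, a client call could look like the sketch below. It assumes Node 18+ (for the global fetch), the default port, and the standard OpenAI chat-completions request and response shapes; `buildChatRequest` and `ask` are hypothetical helper names, not part of the proxy.

```javascript
// Hypothetical client helper; not part of the proxy itself.
const BASE_URL = "http://localhost:4001/v1"; // default port from the README

// Build the fetch() arguments for a non-streaming chat completion.
function buildChatRequest(model, prompt) {
  return {
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model, // "sonnet", "opus", or "haiku"
        stream: false,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Send the request and return the assistant text.
// Assumes the standard OpenAI response shape (choices[0].message.content).
async function ask(model, prompt) {
  const { url, init } = buildChatRequest(model, prompt);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`proxy returned HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

With the proxy running, `ask("sonnet", "Hello").then(console.log)` should print the model's reply.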

What it does

  • OpenAI-compatible: /v1/chat/completions (streaming + non-streaming), /v1/models. Tool calls are translated between the OpenAI JSON format and Claude's native tool format in both directions.
  • Persistent SDK sessions: an LRU pool, keyed on conversation prefix and tool list, reuses warm Claude sessions across turns. A request with the same prefix and tool list is a cache hit, which cuts warm-turn latency by 50–70%. Pool size and idle TTL are tunable.
  • Per-project context: intercepts the IDE's update_notes tool and routes its content to <workspace>/.proxy/context.md. Context is workspace-scoped, so notes for one game don't bleed into a different project. Workspace root is inferred from absolute paths in tool calls and tool results.
  • Auto-fallback: on 529/overloaded_error/rate_limit_error, idle timeout, or the SDK's ~120K-token "Usage Policy" refusal, the proxy walks opus → sonnet → haiku and retries each step with a fresh AbortController. The IDE sees a status note in the stream.
  • Thinking visibility: surfaces extended-thinking deltas as <think>...</think> blocks (default), reasoning_content deltas (DeepSeek/Cursor convention), or both.
  • Image input: data URLs and image_url fields on user messages are decoded and passed through as native Claude image blocks.
  • Built-in FS tools across every drive: Read / Glob / Grep are executed inside the proxy via an internal text-tool protocol, so the model can search the whole filesystem (every drive root on Windows, / on Unix) instead of being capped at the IDE workspace. Optional Bash, Write/Edit, and WebSearch tools are registered as native SDK tools when enabled.
  • 1M-context beta on by default: the context-1m-2025-08-07 beta is enabled out of the box so long conversations don't get clipped on plans where Opus isn't auto-upgraded. Toggle with CLAUDE_1M_CONTEXT=0 or override the full list with CLAUDE_BETAS.
  • Input-size guards: per-tool-result, per-message, and total-prompt char caps clip oversized history (e.g. a 5MB tool dump) before it reaches the SDK, preserving the tail of the conversation so the active query stays intact.
  • Two-tier timeouts: a hard wall-clock cap plus an idle-token cap. The idle cap resets on every streaming delta and catches silently stuck upstreams that would otherwise hang for minutes.
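
The session pool described in the list above can be pictured roughly as follows. This is a minimal sketch, not the code in server.js: it assumes the key is a SHA-256 hash over the JSON-serialized conversation prefix and tool list, and `sessionKey` and `SessionPool` are hypothetical names. The defaults mirror CLAUDE_SESSION_MAX and CLAUDE_SESSION_TTL_MS.

```javascript
const { createHash } = require("node:crypto");

// Derive a pool key from the conversation prefix and the tool list (assumed scheme).
function sessionKey(prefixMessages, tools) {
  const payload = JSON.stringify({ prefixMessages, tools });
  return createHash("sha256").update(payload).digest("hex");
}

class SessionPool {
  constructor({ max = 20, ttlMs = 1800000, now = Date.now } = {}) {
    this.max = max;
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map(); // Map keeps insertion order: oldest entry first
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.lastUsed > this.ttlMs) {
      this.entries.delete(key); // idle for too long: treat as a miss
      return undefined;
    }
    // Re-insert so this entry becomes the most recently used one.
    this.entries.delete(key);
    entry.lastUsed = this.now();
    this.entries.set(key, entry);
    return entry.session;
  }

  set(key, session) {
    this.entries.delete(key);
    this.entries.set(key, { session, lastUsed: this.now() });
    while (this.entries.size > this.max) {
      // Evict the least recently used entry (the first key in the Map).
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}
```

Two requests that share the same prefix and tool list hash to the same key, so the second one finds the warm session; any change to either input produces a different key and a cold start.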

Configuration

All configuration is via environment variables. See .env.example for the full annotated list with defaults.

Core

| Variable | Default | Notes |
| --- | --- | --- |
| PORT | 4001 | HTTP listen port |
| CLAUDE_MODEL | sonnet | default alias when the request doesn't specify |
| CLAUDE_THINKING | high | off / low / medium / high / max |
| STREAM_THINKING | tags | off / tags / reasoning_content / both |
| CLAUDE_FALLBACK | 1 | 0 disables the opus→sonnet→haiku fallback |
| INCLUDE_GLOBAL_NOTES | 1 | 0 strips the IDE's global notes block entirely |
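
To illustrate what CLAUDE_FALLBACK controls, the fallback walk could be sketched like this. It is an illustration under assumptions, not the proxy's code: the error shape `{ status, type }`, the helper names, and the `callModel(model, signal)` callback are all hypothetical, and only the overload and rate-limit triggers are modelled (the idle-timeout and usage-policy triggers are left out).

```javascript
const FALLBACK_ORDER = ["opus", "sonnet", "haiku"];

// Models to try, starting from the requested one and walking down the chain.
function fallbackChain(model, enabled = true) {
  const start = FALLBACK_ORDER.indexOf(model);
  if (!enabled || start === -1) return [model];
  return FALLBACK_ORDER.slice(start);
}

// Assumed error shape: { status, type }. Only overload and rate-limit errors retry here.
function isRetryable(err) {
  return (
    err.status === 529 ||
    err.type === "overloaded_error" ||
    err.type === "rate_limit_error"
  );
}

// callModel(model, signal) is a hypothetical function that performs one upstream attempt.
async function withFallback(model, callModel, enabled = true) {
  let lastError;
  for (const candidate of fallbackChain(model, enabled)) {
    const controller = new AbortController(); // fresh controller per attempt
    try {
      return await callModel(candidate, controller.signal);
    } catch (err) {
      if (!isRetryable(err)) throw err; // unrelated errors surface immediately
      lastError = err;
    }
  }
  throw lastError; // every model in the chain was overloaded
}
```

Passing `enabled = false` reduces the chain to the requested model only, which corresponds to CLAUDE_FALLBACK=0.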

Sessions & timeouts

| Variable | Default | Notes |
| --- | --- | --- |
| CLAUDE_SESSIONS | 1 | 0 disables the LRU session pool (one-shot) |
| CLAUDE_SESSION_MAX | 20 | warm sessions held in the pool |
| CLAUDE_SESSION_TTL_MS | 1800000 | per-session idle TTL |
| CLAUDE_REQUEST_TIMEOUT_MS | 600000 | hard wall-clock cap per request |
| CLAUDE_IDLE_TIMEOUT_MS | 120000 | abort if no streaming token within this window |
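
The interplay of the two timeout settings might be sketched as a small watchdog. This is an assumed design rather than the proxy's implementation; `createWatchdog` and its `touch`/`done` methods are hypothetical names, and the defaults mirror CLAUDE_REQUEST_TIMEOUT_MS and CLAUDE_IDLE_TIMEOUT_MS.

```javascript
// Create a watchdog that aborts on either a hard wall-clock cap or an idle gap.
function createWatchdog({ requestTimeoutMs = 600000, idleTimeoutMs = 120000 } = {}) {
  const controller = new AbortController();
  let idleTimer = null;

  const abort = (reason) => {
    clearTimeout(hardTimer);
    clearTimeout(idleTimer);
    if (!controller.signal.aborted) controller.abort(new Error(reason));
  };

  // Hard cap: started once and never reset.
  const hardTimer = setTimeout(() => abort("request timeout"), requestTimeoutMs);

  // Idle cap: restarted every time a streaming delta arrives.
  const touch = () => {
    if (controller.signal.aborted) return;
    clearTimeout(idleTimer);
    idleTimer = setTimeout(() => abort("idle timeout"), idleTimeoutMs);
  };

  // Call when the request finishes normally so no timer keeps the process alive.
  const done = () => {
    clearTimeout(hardTimer);
    clearTimeout(idleTimer);
  };

  touch(); // the idle window starts immediately
  return { signal: controller.signal, touch, done };
}
```

A caller would pass `signal` to the upstream request, call `touch()` for each streamed delta, and call `done()` on completion. A stream that keeps producing tokens only ever hits the hard cap, while a silent one is cut off after the idle window.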

Built-in SDK tools

Read / Glob / Grep are executed inside the proxy (this is what gives the model access to drives outside the IDE workspace); the rest are registered with the SDK as native tools when enabled.

| Variable | Default | Notes |
| --- | --- | --- |
| ENABLE_FS_TOOLS | 1 | 0 drops Read/Glob/Grep |
| ENABLE_BASH_TOOL | 0 | 1 adds Bash (shell exec) |
| ENABLE_WRITE_TOOLS | 0 | 1 adds Write + Edit |
| ENABLE_WEB_SEARCH | 0 | 1 adds the SDK WebSearch tool |
| EXTRA_SDK_TOOLS | (empty) | comma-separated extra SDK tool names |
| FS_ADDITIONAL_DIRS | (all drives) | comma-separated roots; defaults to every drive on Win, / elsewhere |

Internal FS tool resource caps

| Variable | Default | Notes |
| --- | --- | --- |
| INTERNAL_READ_MAX_BYTES | 2097152 | max bytes returned by one Read |
| INTERNAL_GLOB_MAX | 1000 | max paths returned by Glob |
| INTERNAL_GREP_MAX | 200 | max matches returned by Grep |
| INTERNAL_RESULT_MAX_CHARS | 100000 | char cap on any single tool result |
| INTERNAL_WALK_DEPTH | 12 | max recursion depth |
| MAX_INTERNAL_TURNS | 20 | sequential internal-tool calls per request |
| INTERNAL_TOOL_TIMEOUT_MS | 60000 | per-call timeout |

Input-size guards

| Variable | Default | Notes |
| --- | --- | --- |
| MAX_TOOL_RESULT_CHARS | 100000 | per-tool-result body cap (≈ 25K tokens) |
| MAX_HISTORY_MESSAGE_CHARS | 200000 | per non-tool history message cap |
| MAX_TOTAL_PROMPT_CHARS | 3000000 | total chars across all messages (≈ 750K tokens) |
| PROTECT_LAST_N_MESSAGES | 2 | never clip the tail (active query stays intact) |
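
One way to picture how these four settings combine is the sketch below. The README does not say how the proxy truncates, so several choices here are assumptions: string-only message content, keeping the head of a clipped message with a short marker, and dropping the oldest non-system messages when the total cap is exceeded. `clip` and `guardMessages` are hypothetical names.

```javascript
const LIMITS = {
  maxToolResultChars: 100000,
  maxHistoryMessageChars: 200000,
  maxTotalPromptChars: 3000000,
  protectLastN: 2,
};

// Keep the head of an oversized string and note how much was dropped (assumed policy).
function clip(text, max) {
  if (text.length <= max) return text;
  return text.slice(0, max) + `\n[clipped ${text.length - max} chars]`;
}

// Total characters across all string contents.
function totalChars(list) {
  return list.reduce(
    (n, m) => n + (typeof m.content === "string" ? m.content.length : 0),
    0
  );
}

function guardMessages(messages, limits = LIMITS) {
  const protectedFrom = Math.max(0, messages.length - limits.protectLastN);

  // Pass 1: per-message caps, skipping the protected tail.
  const out = messages.map((msg, i) => {
    if (i >= protectedFrom || typeof msg.content !== "string") return msg;
    const max =
      msg.role === "tool" ? limits.maxToolResultChars : limits.maxHistoryMessageChars;
    return { ...msg, content: clip(msg.content, max) };
  });

  // Pass 2: total cap. Drop the oldest non-system message outside the tail.
  while (totalChars(out) > limits.maxTotalPromptChars) {
    const tailStart = out.length - limits.protectLastN;
    const victim = out.findIndex((m, i) => i < tailStart && m.role !== "system");
    if (victim === -1) break; // nothing left that may be dropped
    out.splice(victim, 1);
  }
  return out;
}
```

The input array is never mutated, and the last `protectLastN` messages pass through untouched even when they alone exceed the total cap.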

SDK betas

| Variable | Default | Notes |
| --- | --- | --- |
| CLAUDE_1M_CONTEXT | 1 | 0 disables the 1M-context beta |
| CLAUDE_BETAS | context-1m-2025-08-07 | comma-separated; overrides CLAUDE_1M_CONTEXT |
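
The precedence between the two variables could be resolved along these lines. This is a sketch only: `resolveBetas` is a hypothetical name, and treating an empty CLAUDE_BETAS as unset is an assumption the README does not confirm.

```javascript
const DEFAULT_1M_BETA = "context-1m-2025-08-07";

// Resolve the beta list from an env-like object such as process.env.
function resolveBetas(env) {
  // An explicit CLAUDE_BETAS list wins over the CLAUDE_1M_CONTEXT toggle.
  // Assumption: an empty or whitespace-only value counts as "not set".
  if (env.CLAUDE_BETAS !== undefined && env.CLAUDE_BETAS.trim() !== "") {
    return env.CLAUDE_BETAS.split(",")
      .map((name) => name.trim())
      .filter(Boolean);
  }
  return env.CLAUDE_1M_CONTEXT === "0" ? [] : [DEFAULT_1M_BETA];
}
```

For example, `resolveBetas(process.env)` with neither variable set yields the 1M-context beta alone, and CLAUDE_1M_CONTEXT=0 yields an empty list.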

Observability

| Variable | Default | Notes |
| --- | --- | --- |
| VERBOSE | 0 | 1 enables [proxy] logs (also --verbose) |
| LIVE | 0 | 1 enables the colored live request feed (--live) |
| LOG_REQUESTS | 0 | 1 dumps every request body to logs/requests/ |
| PROXY_TEST_HOOKS | 0 | 1 exposes internals on module.exports for tests |

Endpoints

POST /v1/chat/completions     # OpenAI-compatible, streaming or one-shot
GET  /v1/models               # lists sonnet, opus, haiku, plus a few aliases
GET  /                        # health
GET  /debug/status            # loopback-only; runtime config + session count
GET  /debug/last-exchange     # loopback-only; the last full request/response
GET  /debug/last-request      # loopback-only; the last logged request body
GET  /debug/exchanges         # loopback-only; recent exchange ring
GET  /debug/tools             # loopback-only; tool catalog from the last request
GET  /debug/workspace         # loopback-only; resolved per-project workspace cache
POST /debug/workspace/reset   # loopback-only; clear the workspace cache

Debug endpoints are bound to loopback only and reject non-127.0.0.1 callers.
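
A loopback check of this kind might look like the following sketch for a plain Node http handler. The function names and the 403 response are assumptions. The check accepts 127.0.0.1 and its IPv4-mapped IPv6 form, which Node reports for IPv4 clients on dual-stack sockets; whether the proxy also accepts ::1 is not stated, so it is left out here.

```javascript
// True for IPv4 loopback, including the IPv4-mapped IPv6 spelling.
function isLoopback(remoteAddress) {
  return remoteAddress === "127.0.0.1" || remoteAddress === "::ffff:127.0.0.1";
}

// Returns true if the request may proceed; otherwise answers 403 and returns false.
// req and res are the usual http.IncomingMessage and http.ServerResponse objects.
function guardDebug(req, res) {
  if (!req.url.startsWith("/debug/")) return true; // only debug routes are restricted
  if (isLoopback(req.socket.remoteAddress)) return true;
  res.writeHead(403, { "Content-Type": "application/json" });
  res.end(JSON.stringify({ error: "debug endpoints are loopback-only" }));
  return false;
}
```

In a request handler this would run first, for example `if (!guardDebug(req, res)) return;`, before any routing.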

Per-project notes (.proxy/context.md)

Whenever the model invokes the IDE's update_notes tool, the proxy intercepts the content and writes it to <workspace_root>/.proxy/context.md instead of the IDE's single global notes pool. On the next request, the proxy strips the IDE's === YOUR WORKING NOTES === block from the system prompt and replaces it with the project's own context.

Workspace root is inferred from any absolute file path the model or IDE has mentioned. Detection is sticky across turns, since Perception elides tool_call/tool_result from history each turn. If the wrong root is picked up, POST /debug/workspace/reset clears the cache.
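
The README does not spell out the inference heuristic, so the sketch below is only one plausible reading: take the longest common directory of the absolute paths seen so far and keep it until it is reset. `commonDir` and `WorkspaceCache` are hypothetical names, and a single global root is a simplification.

```javascript
// Longest common directory of a set of absolute file paths ("/" or "\" separators).
function commonDir(paths) {
  if (paths.length === 0) return null;
  // Split into segments and drop the trailing file name.
  const split = paths.map((p) => p.split(/[\\/]+/).slice(0, -1));
  let common = split[0];
  for (const parts of split.slice(1)) {
    let i = 0;
    while (i < common.length && i < parts.length && common[i] === parts[i]) i++;
    common = common.slice(0, i);
  }
  if (common.length === 0) return null; // e.g. paths on different Windows drives
  return common.join("/") || "/"; // [""] means the POSIX root
}

class WorkspaceCache {
  constructor() {
    this.root = null;
  }

  // Sticky: the first detected root is reused on later turns, even when a turn
  // carries no paths at all.
  observe(paths) {
    if (this.root === null) this.root = commonDir(paths);
    return this.root;
  }

  // In this sketch, the effect POST /debug/workspace/reset would have.
  reset() {
    this.root = null;
  }
}
```

Stickiness matters because later turns may contain no paths; without the cache the root would be lost as soon as the tool history is elided.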

The INCLUDE_GLOBAL_NOTES=0 env var drops the IDE's globals entirely; otherwise they're re-emitted as a clearly-labelled read-only "GLOBAL NOTES" reference block alongside per-project notes.

Running tests

npm test

Boots the proxy on :4099, runs 28 unit + integration cases (message conversion, tool-call round-trip, image pass-through, usage mapping, hashing, session reuse, streaming, fallback). Requires a working Claude subscription since some tests hit the live API.

Repository layout

server.js              the whole proxy (one file, ~2100 lines)
.env.example           every configurable env var
package.json           start | dev | test scripts
tests/                 integration test harness
logs/requests/         request dumps when LOG_REQUESTS=1 (rotated, last 50)

About

Proxy service that maps OpenAI-format API calls to Claude by way of the Claude Agent SDK.
