Perception Claude Proxy

A small OpenAI-compatible HTTP proxy that forwards /v1/chat/completions traffic to the Claude Agent SDK, so IDEs and tools that only speak the OpenAI format can use a Claude subscription.

Built primarily for Perception.cx but works with any OpenAI-compatible client (Continue, Cursor, etc.).

Install

npm install
cp .env.example .env   # optional; defaults are sane
node server.js

The proxy listens on :4001 by default. Point your IDE at http://localhost:4001/v1 and pick sonnet, opus, or haiku as the model.
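
To check the setup from a script rather than an IDE, a request can be built by hand. This is a minimal sketch, assuming Node 18+ (for the global `fetch`) and the default port. `buildChatRequest` and `ask` are hypothetical helper names and are not part of the proxy.

```javascript
// Hypothetical helper: builds an OpenAI-style chat request aimed at the proxy.
// The base URL and model aliases come from this README; the rest is illustrative.
function buildChatRequest(prompt, model = "sonnet", baseUrl = "http://localhost:4001/v1") {
  return {
    url: `${baseUrl}/chat/completions`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model,
        stream: false, // one-shot response; streaming would need SSE parsing
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Sends the request and returns the assistant text (the proxy must be running).
async function ask(prompt, model) {
  const { url, options } = buildChatRequest(prompt, model);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`proxy returned HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

With the proxy up, `ask("Say hello", "haiku").then(console.log)` should print the reply text.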

What it does

OpenAI-compatible: /v1/chat/completions (streaming + non-streaming), /v1/models. Tool calls are translated between the OpenAI JSON format and Claude's native tool format in both directions.
Persistent SDK sessions: an LRU pool reuses warm Claude sessions across turns, keyed by conversation prefix and tool list. A request whose prefix and tool list match a pooled session is a cache hit, which cuts warm-turn latency by 50–70%. Pool size and idle TTL are tunable.
Per-project context: intercepts the IDE's update_notes tool and routes its content to <workspace>/.proxy/context.md. Context is workspace-scoped, so notes for one game don't bleed into a different project. Workspace root is inferred from absolute paths in tool calls and tool results.
Auto-fallback: on 529/overloaded_error/rate_limit_error, idle timeout, or the SDK's ~120K-token "Usage Policy" refusal, the proxy walks opus → sonnet → haiku and retries on a fresh AbortController. The IDE sees a status note in the stream.
Thinking visibility: surfaces extended-thinking deltas as <think>...</think> blocks (default), reasoning_content deltas (DeepSeek/Cursor convention), or both.
Image input: data URLs and image_url fields on user messages are decoded and passed through as native Claude image blocks.
Built-in FS tools across every drive: Read / Glob / Grep are executed inside the proxy via an internal text-tool protocol, so the model can search the whole filesystem (every drive root on Windows, / on Unix) instead of being capped at the IDE workspace. Optional Bash, Write/Edit, and WebSearch tools are registered as native SDK tools when enabled.
1M-context beta on by default: the context-1m-2025-08-07 beta is enabled out of the box so long conversations don't get clipped on plans where Opus isn't auto-upgraded. Toggle with CLAUDE_1M_CONTEXT=0 or override the full list with CLAUDE_BETAS.
Input-size guards: per-tool-result, per-message, and total-prompt char caps clip oversized history (e.g. a 5MB tool dump) before it reaches the SDK, preserving the tail of the conversation so the active query stays intact.
Two-tier timeouts: a hard wall-clock cap plus an idle-token cap. The idle cap resets on every streaming delta and catches silently stuck upstreams that would otherwise hang for minutes.
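
The two-tier timeout in the last item can be pictured as two deadlines, only one of which moves. The sketch below is an illustration under assumptions, not the code in server.js: `TwoTierDeadline` and `watch` are hypothetical names, and polling with `setInterval` is just one way to drive the check.

```javascript
// Two deadlines: a fixed wall-clock cap and an idle cap that moves on every delta.
class TwoTierDeadline {
  constructor({ hardMs, idleMs, now = Date.now() }) {
    this.hardDeadline = now + hardMs; // never moves
    this.idleMs = idleMs;
    this.idleDeadline = now + idleMs; // pushed forward by touch()
  }

  // Call on every streaming delta to reset the idle window.
  touch(now = Date.now()) {
    this.idleDeadline = now + this.idleMs;
  }

  // Returns "hard", "idle", or null when neither cap has been hit.
  expired(now = Date.now()) {
    if (now >= this.hardDeadline) return "hard";
    if (now >= this.idleDeadline) return "idle";
    return null;
  }
}

// Polls the deadline and aborts the in-flight request when either cap is hit.
// Returns a function that stops the polling.
function watch(deadline, controller, pollMs = 1000) {
  const timer = setInterval(() => {
    const reason = deadline.expired();
    if (reason) {
      clearInterval(timer);
      controller.abort(new Error(`${reason} timeout`));
    }
  }, pollMs);
  return () => clearInterval(timer);
}
```

A caller would create one deadline per request (for example `hardMs: 600000, idleMs: 120000`, matching the documented defaults), call `touch()` from the stream handler, and stop the watcher when the response completes.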

Configuration

All configuration is via environment variables. See .env.example for the full annotated list with defaults.

Core

Variable	Default	Notes
`PORT`	`4001`	HTTP listen port
`CLAUDE_MODEL`	`sonnet`	default alias when the request doesn't specify
`CLAUDE_THINKING`	`high`	`off` / `low` / `medium` / `high` / `max`
`STREAM_THINKING`	`tags`	`off` / `tags` / `reasoning_content` / `both`
`CLAUDE_FALLBACK`	`1`	`0` disables the opus→sonnet→haiku fallback
`INCLUDE_GLOBAL_NOTES`	`1`	`0` strips the IDE's global notes block entirely
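
The opus → sonnet → haiku walk that `CLAUDE_FALLBACK` controls could look roughly like the sketch below. The model order comes from this README. The error shape, the `idle_timeout` type string, and the `callModel` callback (a stand-in for the real SDK call) are assumptions made for illustration, and the "Usage Policy" refusal trigger is left out.

```javascript
// Order taken from the README.
const FALLBACK_ORDER = ["opus", "sonnet", "haiku"];

// Models to try, starting at the requested one. Unknown aliases are not expanded.
function fallbackChain(startModel, enabled = true) {
  const idx = FALLBACK_ORDER.indexOf(startModel);
  return enabled && idx !== -1 ? FALLBACK_ORDER.slice(idx) : [startModel];
}

// Hypothetical classifier: overload, rate-limit and idle-timeout errors are retryable.
function isRetryable(err) {
  if (!err) return false;
  return err.status === 529 ||
    ["overloaded_error", "rate_limit_error", "idle_timeout"].includes(err.type);
}

// Tries each model in the chain; `callModel(model, signal)` is supplied by the caller.
async function withFallback(startModel, callModel, enabled = true) {
  let lastError;
  for (const model of fallbackChain(startModel, enabled)) {
    const controller = new AbortController(); // fresh controller per attempt
    try {
      const result = await callModel(model, controller.signal);
      return { model, result };
    } catch (err) {
      lastError = err;
      if (!isRetryable(err)) throw err; // non-retryable errors surface immediately
    }
  }
  throw lastError; // every model in the chain failed
}
```

Setting `enabled` to `false` corresponds to `CLAUDE_FALLBACK=0`: the chain collapses to the requested model and the first error is returned to the client.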

Sessions & timeouts

Variable	Default	Notes
`CLAUDE_SESSIONS`	`1`	`0` disables the LRU session pool (one-shot)
`CLAUDE_SESSION_MAX`	`20`	warm sessions held in the pool
`CLAUDE_SESSION_TTL_MS`	`1800000`	per-session idle TTL
`CLAUDE_REQUEST_TIMEOUT_MS`	`600000`	hard wall-clock cap per request
`CLAUDE_IDLE_TIMEOUT_MS`	`120000`	abort if no streaming token within this window
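
To make the pool settings concrete, here is a minimal sketch of an LRU pool with an idle TTL, keyed by a hash of the conversation prefix and tool list. It is not the proxy's actual implementation: `sessionKey`, `SessionPool`, and the SHA-256 over JSON keying scheme are assumptions.

```javascript
const crypto = require("node:crypto");

// Assumed keying scheme: hash of the conversation prefix plus the tool list.
function sessionKey(prefixMessages, tools) {
  return crypto
    .createHash("sha256")
    .update(JSON.stringify({ prefixMessages, tools }))
    .digest("hex");
}

class SessionPool {
  constructor({ max = 20, ttlMs = 1800000 } = {}) {
    this.max = max;
    this.ttlMs = ttlMs;
    this.entries = new Map(); // Map insertion order doubles as LRU order
  }

  // Returns the warm session for `key`, or undefined on a miss or expired entry.
  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    this.entries.delete(key);
    if (now - entry.lastUsed > this.ttlMs) return undefined; // idle too long
    entry.lastUsed = now;
    this.entries.set(key, entry); // re-insert as most recently used
    return entry.session;
  }

  // Stores a session and evicts the least recently used entries beyond `max`.
  set(key, session, now = Date.now()) {
    this.entries.delete(key);
    this.entries.set(key, { session, lastUsed: now });
    while (this.entries.size > this.max) {
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}
```

The defaults mirror `CLAUDE_SESSION_MAX` and `CLAUDE_SESSION_TTL_MS`. A real pool would also need to close evicted sessions, which this sketch omits.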

Built-in SDK tools

Read / Glob / Grep are executed inside the proxy (this is what gives the model access to drives outside the IDE workspace); the rest are registered with the SDK as native tools when enabled.

Variable	Default	Notes
`ENABLE_FS_TOOLS`	`1`	`0` drops `Read`/`Glob`/`Grep`
`ENABLE_BASH_TOOL`	`0`	`1` adds `Bash` (shell exec)
`ENABLE_WRITE_TOOLS`	`0`	`1` adds `Write` + `Edit`
`ENABLE_WEB_SEARCH`	`0`	`1` adds the SDK `WebSearch` tool
`EXTRA_SDK_TOOLS`	(empty)	comma-separated extra SDK tool names
`FS_ADDITIONAL_DIRS`	(all drives)	comma-separated roots; defaults to every drive root on Windows, `/` elsewhere

Internal FS tool resource caps

Variable	Default	Notes
`INTERNAL_READ_MAX_BYTES`	`2097152`	max bytes returned by one `Read`
`INTERNAL_GLOB_MAX`	`1000`	max paths returned by `Glob`
`INTERNAL_GREP_MAX`	`200`	max matches returned by `Grep`
`INTERNAL_RESULT_MAX_CHARS`	`100000`	char cap on any single tool result
`INTERNAL_WALK_DEPTH`	`12`	max recursion depth
`MAX_INTERNAL_TURNS`	`20`	sequential internal-tool calls per request
`INTERNAL_TOOL_TIMEOUT_MS`	`60000`	per-call timeout

Input-size guards

Variable	Default	Notes
`MAX_TOOL_RESULT_CHARS`	`100000`	per-tool-result body cap (≈ 25K tokens)
`MAX_HISTORY_MESSAGE_CHARS`	`200000`	per non-tool history message cap
`MAX_TOTAL_PROMPT_CHARS`	`3000000`	total chars across all messages (≈ 750K tokens)
`PROTECT_LAST_N_MESSAGES`	`2`	never clip the tail (active query stays intact)
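
One way the three caps and the protected tail could interact is sketched below. How the proxy actually clips is not specified here, so the strategy is an assumption: clip oversized messages first, then drop the oldest unprotected, non-system messages until the total fits. `clip`, `guardMessages`, and `LIMITS` are hypothetical names.

```javascript
// Defaults copied from the table above.
const LIMITS = { toolResult: 100000, historyMessage: 200000, total: 3000000, protectLastN: 2 };

// Truncates to `max` chars and appends a short marker noting how much was dropped.
function clip(text, max) {
  if (text.length <= max) return text;
  return text.slice(0, max) + `\n[clipped ${text.length - max} chars]`;
}

function guardMessages(messages, limits = LIMITS) {
  const protectedFrom = Math.max(0, messages.length - limits.protectLastN);
  const size = (m) => (typeof m.content === "string" ? m.content.length : 0);

  // Pass 1: per-message caps, skipping the protected tail and non-string content.
  const capped = messages.map((m, i) => {
    if (i >= protectedFrom || typeof m.content !== "string") return m;
    const max = m.role === "tool" ? limits.toolResult : limits.historyMessage;
    return { ...m, content: clip(m.content, max) };
  });

  // Pass 2: total cap. Drop the oldest unprotected messages, keeping system messages.
  let total = capped.reduce((n, m) => n + size(m), 0);
  const keep = capped.map(() => true);
  for (let i = 0; i < protectedFrom && total > limits.total; i++) {
    if (capped[i].role === "system") continue;
    keep[i] = false;
    total -= size(capped[i]);
  }
  return capped.filter((_, i) => keep[i]);
}
```

This simplification ignores tool_call/tool_result pairing, so a production version would need to drop or keep those messages together.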

SDK betas

Variable	Default	Notes
`CLAUDE_1M_CONTEXT`	`1`	`0` disables the 1M-context beta
`CLAUDE_BETAS`	`context-1m-2025-08-07`	comma-separated; overrides `CLAUDE_1M_CONTEXT`
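
The precedence between the two variables can be expressed in a few lines. This is a sketch of the documented behavior, with one assumption flagged in the code: an explicitly empty `CLAUDE_BETAS` is treated as "no betas". `resolveBetas` is a hypothetical name.

```javascript
// Resolves the beta list: CLAUDE_BETAS, when set, overrides CLAUDE_1M_CONTEXT.
function resolveBetas(env = process.env) {
  if (env.CLAUDE_BETAS !== undefined) {
    // Assumption: an empty value means "send no betas" rather than "use the default".
    return env.CLAUDE_BETAS.split(",").map((s) => s.trim()).filter(Boolean);
  }
  return env.CLAUDE_1M_CONTEXT === "0" ? [] : ["context-1m-2025-08-07"];
}
```

For example, `resolveBetas({ CLAUDE_1M_CONTEXT: "0" })` yields an empty list, while setting `CLAUDE_BETAS` makes the `CLAUDE_1M_CONTEXT` value irrelevant.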

Observability

Variable	Default	Notes
`VERBOSE`	`0`	`1` enables `[proxy]` logs (also `--verbose`)
`LIVE`	`0`	`1` enables the colored live request feed (`--live`)
`LOG_REQUESTS`	`0`	`1` dumps every request body to `logs/requests/`
`PROXY_TEST_HOOKS`	`0`	`1` exposes internals on `module.exports` for tests

Endpoints

POST /v1/chat/completions     # OpenAI-compatible, streaming or one-shot
GET  /v1/models               # lists sonnet, opus, haiku, plus a few aliases
GET  /                        # health
GET  /debug/status            # loopback-only; runtime config + session count
GET  /debug/last-exchange     # loopback-only; the last full request/response
GET  /debug/last-request      # loopback-only; the last logged request body
GET  /debug/exchanges         # loopback-only; recent exchange ring
GET  /debug/tools             # loopback-only; tool catalog from the last request
GET  /debug/workspace         # loopback-only; resolved per-project workspace cache
POST /debug/workspace/reset   # loopback-only; clear the workspace cache

Debug endpoints are bound to loopback only and reject non-127.0.0.1 callers.
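
A loopback guard of this kind is typically a check on the socket's remote address. The sketch below assumes a plain Node `http` request/response pair. The README only mentions 127.0.0.1, so accepting `::1` and the IPv4-mapped form is an assumption, and `isLoopback` and `guardDebug` are hypothetical names.

```javascript
// True for loopback peers. Only 127.0.0.1 is documented; the IPv6 forms are assumed.
function isLoopback(remoteAddress) {
  return ["127.0.0.1", "::1", "::ffff:127.0.0.1"].includes(remoteAddress);
}

// Returns true if the caller may proceed; otherwise answers 403 and returns false.
function guardDebug(req, res) {
  if (isLoopback(req.socket.remoteAddress)) return true;
  res.writeHead(403, { "Content-Type": "application/json" });
  res.end(JSON.stringify({ error: "debug endpoints are loopback-only" }));
  return false;
}
```

A route handler would call `if (!guardDebug(req, res)) return;` before serving any `/debug/*` path.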

Per-project notes (`.proxy/context.md`)

Whenever the model invokes the IDE's update_notes tool, the proxy intercepts the content and writes it to <workspace_root>/.proxy/context.md instead of the IDE's single global notes pool. On the next request, the proxy strips the IDE's === YOUR WORKING NOTES === block from the system prompt and replaces it with the project's own context.

Workspace root is inferred from any absolute file path the model or IDE has mentioned. Detection is sticky across turns, since Perception elides tool_call/tool_result from history each turn. If the wrong root is picked up, POST /debug/workspace/reset clears the cache.

The INCLUDE_GLOBAL_NOTES=0 env var drops the IDE's globals entirely; otherwise they're re-emitted as a clearly-labelled read-only "GLOBAL NOTES" reference block alongside per-project notes.
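
The file side of this flow is small. Below is a minimal sketch, assuming the workspace root has already been resolved. The helper names are hypothetical, and the system-prompt rewriting is not shown.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Location of the per-project notes file, as described above.
function notesPath(workspaceRoot) {
  return path.join(workspaceRoot, ".proxy", "context.md");
}

// Persists intercepted update_notes content, creating .proxy/ if needed.
function writeProjectNotes(workspaceRoot, content) {
  const file = notesPath(workspaceRoot);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, content, "utf8");
  return file;
}

// Loads the notes for the next request; a missing file means no notes yet.
function readProjectNotes(workspaceRoot) {
  try {
    return fs.readFileSync(notesPath(workspaceRoot), "utf8");
  } catch (err) {
    if (err.code === "ENOENT") return "";
    throw err;
  }
}
```

Because the path is derived from the workspace root, two projects never share a notes file, which is the isolation property described above.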

Running tests

npm test

Boots the proxy on :4099 and runs 28 unit and integration cases (message conversion, tool-call round-trip, image pass-through, usage mapping, hashing, session reuse, streaming, fallback). A working Claude subscription is required because some tests hit the live API.

Repository layout

server.js              the whole proxy (one file, ~2100 lines)
.env.example           every configurable env var
package.json           start | dev | test scripts
tests/                 integration test harness
logs/requests/         request dumps when LOG_REQUESTS=1 (rotated, last 50)
