LLM Stream Optimizer is a lightweight Cloudflare Workers proxy for OpenAI-compatible clients. It can route requests to OpenAI-compatible, Anthropic, and Google Gemini upstream APIs, normalize selected responses into OpenAI-style output, and smooth streaming responses so large chunks are emitted more naturally.
Warning
This project is currently in paused / limited maintenance mode. The repository is kept available for existing users and self-hosting, but new features and security fixes are not guaranteed.
- OpenAI-compatible proxy endpoint for chat completions and model listing.
- Multiple OpenAI-compatible upstream endpoints with optional model routing.
- Anthropic and Gemini upstream support with OpenAI-style response conversion.
- Streaming response optimization with configurable delays and model exclusions.
- Web administration dashboard at
/admin. - Optional Cloudflare KV storage for runtime configuration.
- ShadowFetch / native fetch switching for upstream requests that need fewer Cloudflare-added headers.
- A Cloudflare account with Workers enabled.
- A Cloudflare Workers KV namespace if you want persistent dashboard configuration.
- Node.js 18 or newer for local Wrangler workflows.
- One or more upstream LLM API keys.
Install dependencies:
npm installCopy the local development environment example:
cp .dev.vars.example .dev.varsEdit .dev.vars and set at least:
PROXY_API_KEY="replace-with-your-password-and-proxy-key"Start local development:
npm run devOpen the local Worker URL, then go to /admin to configure upstream APIs.
Before deploying, create a KV namespace and fill the CONFIG_KV binding in wrangler.toml:
[[kv_namespaces]]
binding = "CONFIG_KV"
id = "your-production-kv-namespace-id"
preview_id = "your-preview-kv-namespace-id"Then deploy:
npm run deployIf you prefer the original copy-and-paste deployment flow:
- Create a new Cloudflare Worker.
- Copy all content from
worker.jsinto the Workers editor and deploy it. - In Workers settings, add a secret named
PROXY_API_KEY. This value is both the proxy API key and the/adminlogin password. - Create a Workers KV namespace.
- Add a KV binding named
CONFIG_KVand point it to the namespace you created. - Open your Worker domain and visit
/admin.
PROXY_API_KEY: Proxy API key and web dashboard login password. Use a strong value for any shared or production deployment.CONFIG_KV: KV namespace binding used to store API endpoint and stream optimization configuration. Without this binding, the Worker can still run from environment variables, but dashboard changes cannot be persisted.
OPENAI_API_KEY: Default OpenAI-compatible upstream API key.UPSTREAM_URL: Default OpenAI-compatible upstream base URL. Defaults tohttps://api.openai.com/v1.OPENAI_ENDPOINTS: JSON array for multiple OpenAI-compatible endpoints.GEMINI_API_KEY: Google Gemini API key.GEMINI_URL: Gemini API base URL. Defaults tohttps://generativelanguage.googleapis.com.GEMINI_USE_NATIVE_FETCH: Set tofalseto disable native fetch for Gemini.ANTHROPIC_API_KEY: Anthropic API key.ANTHROPIC_URL: Anthropic API base URL. Defaults tohttps://api.anthropic.com.ANTHROPIC_USE_NATIVE_FETCH: Set tofalseto disable native fetch for Anthropic.
Most runtime settings can also be configured from the /admin dashboard when CONFIG_KV is bound.
Use the deployed Worker URL as an OpenAI-compatible base URL:
curl https://your-worker.example.workers.dev/v1/models \
-H "Authorization: Bearer $PROXY_API_KEY"Model listing requests are intentionally permissive in the current Worker implementation so clients can discover configured models. Chat completion requests require the configured proxy API key when PROXY_API_KEY is set.
- Do not commit
.dev.vars, real API keys, KV namespace ids that you consider private, or dashboard credentials. - Use a strong
PROXY_API_KEY; it protects both proxy requests and the administration dashboard. - Treat the dashboard as an administrative surface. Avoid exposing it through shared credentials.
- This project is in limited maintenance mode, so review the code and Cloudflare settings before production use.
See SECURITY.md for more details.
npm run dev
npm run check
npm run deploynpm run check performs a Wrangler dry-run deployment validation.
Licensed under the Apache License 2.0.
CDN acceleration and security protection for this project are sponsored by Tencent EdgeOne.
Best Asian CDN, Edge, and Secure Solutions - Tencent EdgeOne
