LLM Stream Optimizer

LLM Stream Optimizer is a lightweight Cloudflare Workers proxy for OpenAI-compatible clients. It can route requests to OpenAI-compatible, Anthropic, and Google Gemini upstream APIs, normalize selected responses into OpenAI-style output, and smooth streaming responses so that text arriving in large chunks is emitted at a more natural pace.

Warning

This project is currently in paused / limited maintenance mode. The repository is kept available for existing users and self-hosting, but new features and security fixes are not guaranteed.

Features

OpenAI-compatible proxy endpoint for chat completions and model listing.
Multiple OpenAI-compatible upstream endpoints with optional model routing.
Anthropic and Gemini upstream support with OpenAI-style response conversion.
Streaming response optimization with configurable delays and model exclusions.
Web administration dashboard at /admin.
Optional Cloudflare KV storage for runtime configuration.
ShadowFetch / native fetch switching for upstream requests that need fewer Cloudflare-added headers.
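
As a rough illustration of the streaming optimization listed above, the sketch below splits each large chunk into small pieces and pauses briefly between them. It is not taken from worker.js: the names splitChunk and smoothStream, and the size and delayMs options, are hypothetical and only show the general idea.

```javascript
// Hypothetical helper: split one text chunk into pieces of at most `size` characters.
function splitChunk(text, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  // Array.from iterates by code point, so surrogate pairs (e.g. emoji) stay intact.
  const chars = Array.from(text);
  const pieces = [];
  for (let i = 0; i < chars.length; i += size) {
    pieces.push(chars.slice(i, i + size).join(""));
  }
  return pieces;
}

// Hypothetical smoother: re-emit every incoming chunk as small pieces,
// waiting `delayMs` milliseconds after each piece.
async function* smoothStream(chunks, { size = 4, delayMs = 10 } = {}) {
  for await (const chunk of chunks) {
    for (const piece of splitChunk(chunk, size)) {
      yield piece;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A consumer would iterate with `for await (const piece of smoothStream(source)) { ... }`, where `source` is any iterable or async iterable of text chunks. The real Worker may choose piece sizes and delays differently, for example per model.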

Requirements

A Cloudflare account with Workers enabled.
A Cloudflare Workers KV namespace if you want persistent dashboard configuration.
Node.js 18 or newer for local Wrangler workflows.
One or more upstream LLM API keys.

Quick Start With Wrangler

Install dependencies:

npm install

Copy the local development environment example:

cp .dev.vars.example .dev.vars

Edit .dev.vars and set at least:

PROXY_API_KEY="replace-with-your-password-and-proxy-key"

Start local development:

npm run dev

Open the local Worker URL, then go to /admin to configure upstream APIs.

Before deploying, create a KV namespace and fill the CONFIG_KV binding in wrangler.toml:

[[kv_namespaces]]
binding = "CONFIG_KV"
id = "your-production-kv-namespace-id"
preview_id = "your-preview-kv-namespace-id"

Then deploy:

npm run deploy

Manual Cloudflare Dashboard Deployment

If you prefer the original copy-and-paste deployment flow:

Create a new Cloudflare Worker.
Copy all content from worker.js into the Workers editor and deploy it.
In Workers settings, add a secret named PROXY_API_KEY. This value is both the proxy API key and the /admin login password.
Create a Workers KV namespace.
Add a KV binding named CONFIG_KV and point it to the namespace you created.
Open your Worker domain and visit /admin.

Configuration

Production Secrets And Bindings

PROXY_API_KEY: Proxy API key and web dashboard login password. Use a strong value for any shared or production deployment.
CONFIG_KV: KV namespace binding used to store API endpoint and stream optimization configuration. Without this binding, the Worker can still run from environment variables, but dashboard changes cannot be persisted.

Optional Environment Variables

OPENAI_API_KEY: Default OpenAI-compatible upstream API key.
UPSTREAM_URL: Default OpenAI-compatible upstream base URL. Defaults to https://api.openai.com/v1.
OPENAI_ENDPOINTS: JSON array for multiple OpenAI-compatible endpoints.
GEMINI_API_KEY: Google Gemini API key.
GEMINI_URL: Gemini API base URL. Defaults to https://generativelanguage.googleapis.com.
GEMINI_USE_NATIVE_FETCH: Set to false to disable native fetch for Gemini.
ANTHROPIC_API_KEY: Anthropic API key.
ANTHROPIC_URL: Anthropic API base URL. Defaults to https://api.anthropic.com.
ANTHROPIC_USE_NATIVE_FETCH: Set to false to disable native fetch for Anthropic.

Most runtime settings can also be configured from the /admin dashboard when CONFIG_KV is bound.
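
The defaults listed above could be applied along the lines of the following sketch. It is an illustration only, not the Worker's actual code: resolveConfig and parseEndpoints are hypothetical names, the shape of each OPENAI_ENDPOINTS entry is left unspecified, and treating native fetch as enabled unless the variable is the string false is an assumption drawn from the descriptions above.

```javascript
// Hypothetical helper: parse OPENAI_ENDPOINTS, which is documented as a JSON array.
// The shape of each entry is not assumed here; invalid input yields an empty list.
function parseEndpoints(raw) {
  if (!raw) return [];
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}

// Hypothetical helper: resolve upstream settings from Worker env bindings,
// falling back to the documented default base URLs.
function resolveConfig(env) {
  // Assumption: native fetch stays enabled unless the value is the string "false".
  const nativeFetch = (value) => String(value ?? "true").toLowerCase() !== "false";
  return {
    openai: {
      apiKey: env.OPENAI_API_KEY,
      baseUrl: env.UPSTREAM_URL || "https://api.openai.com/v1",
      endpoints: parseEndpoints(env.OPENAI_ENDPOINTS),
    },
    gemini: {
      apiKey: env.GEMINI_API_KEY,
      baseUrl: env.GEMINI_URL || "https://generativelanguage.googleapis.com",
      useNativeFetch: nativeFetch(env.GEMINI_USE_NATIVE_FETCH),
    },
    anthropic: {
      apiKey: env.ANTHROPIC_API_KEY,
      baseUrl: env.ANTHROPIC_URL || "https://api.anthropic.com",
      useNativeFetch: nativeFetch(env.ANTHROPIC_USE_NATIVE_FETCH),
    },
  };
}
```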

API Usage

Use the deployed Worker URL as an OpenAI-compatible base URL:

curl https://your-worker.example.workers.dev/v1/models \
  -H "Authorization: Bearer $PROXY_API_KEY"

Model listing requests are intentionally permissive in the current Worker implementation so clients can discover configured models. Chat completion requests require the configured proxy API key when PROXY_API_KEY is set.
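
For chat completions, a client request might be assembled as in the sketch below. The helper name buildChatRequest is hypothetical, and the /v1/chat/completions path is assumed from the usual OpenAI convention rather than confirmed by this README; check worker.js if your deployment behaves differently.

```javascript
// Hypothetical helper: build the URL and fetch() options for a chat completion
// request sent through the proxy.
function buildChatRequest(baseUrl, proxyKey, model, messages, stream = false) {
  return {
    // Assumption: the proxy exposes the conventional OpenAI path.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // The proxy key goes here, not the upstream provider key.
        Authorization: `Bearer ${proxyKey}`,
      },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}
```

Usage would look like `const { url, init } = buildChatRequest("https://your-worker.example.workers.dev", key, "your-model-id", [{ role: "user", content: "Hello" }]);` followed by `await fetch(url, init)`. The model ID is a placeholder; use one returned by the model listing request.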

Security Notes

Do not commit .dev.vars, real API keys, KV namespace IDs that you consider private, or dashboard credentials.
Use a strong PROXY_API_KEY; it protects both proxy requests and the administration dashboard.
Treat the dashboard as an administrative surface and avoid sharing its credentials.
This project is in limited maintenance mode, so review the code and Cloudflare settings before production use.

See SECURITY.md for more details.

Development Commands

npm run dev
npm run check
npm run deploy

npm run check performs a Wrangler dry-run deployment validation.

License

Licensed under the Apache License 2.0.

Sponsor

CDN acceleration and security protection for this project are sponsored by Tencent EdgeOne.

Best Asian CDN, Edge, and Secure Solutions - Tencent EdgeOne
