Releases · crossps/llm-bridge-cache · GitHub

31 May 18:35

crossps

v1.2.0 - GLM (Z.ai) provider (Latest)

Adds GLM (Zhipu / Z.ai) as a built-in provider.

GLM models (glm-4.6, glm-4.5, glm-4.5-flash, …) route to https://api.z.ai/api/paas/v4 — Zhipu's OpenAI-compatible endpoint — via the existing passthrough path.
Set GLM_API_KEY and use a glm-* model. Override baseUrl for the coding-plan (/api/coding/paas/v4) or mainland (open.bigmodel.cn) endpoints.
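Works with all three inbound formats (the OpenAI-compatible endpoint, /v1/messages, and /v1/responses); /v1/messages requests are cross-converted to the OpenAI format for GLM.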
18 tests, all passing.
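
The GLM routing and baseUrl override described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names GlmConfig, isGlmModel, and glmChatUrl are hypothetical, the glm- prefix rule is inferred from the notes, and the /chat/completions suffix is assumed from the endpoint being OpenAI-compatible.

```typescript
// Hypothetical sketch: the names and config shape are assumptions,
// not llm-bridge-cache's real API.
const GLM_DEFAULT_BASE_URL = "https://api.z.ai/api/paas/v4";

interface GlmConfig {
  apiKey: string;   // value of GLM_API_KEY
  baseUrl?: string; // optional override (coding-plan or mainland endpoint)
}

// Assumed routing rule: any model whose name starts with "glm-" goes to GLM.
function isGlmModel(model: string): boolean {
  return /^glm-/i.test(model);
}

// Build the upstream URL, honoring the override and trimming trailing slashes.
// The "/chat/completions" suffix is assumed from OpenAI compatibility.
function glmChatUrl(config: GlmConfig): string {
  const base = (config.baseUrl ?? GLM_DEFAULT_BASE_URL).replace(/\/+$/, "");
  return `${base}/chat/completions`;
}

// OpenAI-compatible endpoints conventionally take a bearer token.
function glmHeaders(config: GlmConfig): Record<string, string> {
  return {
    "Authorization": `Bearer ${config.apiKey}`,
    "Content-Type": "application/json",
  };
}
```

With no override, glmChatUrl resolves against the default Z.ai base URL; passing a baseUrl such as one ending in /api/coding/paas/v4 would target the coding-plan endpoint instead.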

31 May 17:06

crossps

v1.1.0 - /v1/messages and /v1/responses

Adds two more inbound API formats so almost any client works as-is.

New

POST /v1/messages (Anthropic Messages in) — passthrough to Anthropic with prompt-cache breakpoints + keepalive; cross-converts to OpenAI for gpt-* models, including streaming. Lets Claude Code / the Anthropic SDK finally get the cache keepalive.
POST /v1/responses (OpenAI Responses in) — passthrough to OpenAI. Lets Codex / the Agents SDK route through the bridge.

Notes

Each request replies in the same format it was sent in.
Responses -> Anthropic translation is deferred; such requests return a clear 400 error. See issue #1.
17 tests, all passing.
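
The behavior described in this release can be summarized as a small decision table: passthrough when the inbound format matches the upstream provider, cross-conversion for /v1/messages with gpt-* models, and a 400 for Responses requests aimed at Anthropic. The sketch below is only an illustration under assumptions: planRequest and its types are hypothetical names, the claude- prefix rule is assumed (the notes only mention gpt-*), and rejecting unknown models with a 400 is a guess rather than documented behavior.

```typescript
// Hypothetical sketch of the v1.1.0 dispatch rules; not the project's real code.
type InboundFormat = "messages" | "responses";
type Upstream = "anthropic" | "openai";

type Plan =
  | { action: "passthrough"; upstream: Upstream }
  | { action: "convert"; upstream: Upstream }
  | { action: "reject"; status: 400; reason: string };

// Assumed prefix-based routing: gpt-* is from the notes, claude-* is an assumption.
function upstreamFor(model: string): Upstream | undefined {
  if (/^claude-/.test(model)) return "anthropic";
  if (/^gpt-/.test(model)) return "openai";
  return undefined;
}

function planRequest(inbound: InboundFormat, model: string): Plan {
  const upstream = upstreamFor(model);
  if (upstream === undefined) {
    // Assumption: the release notes do not say what happens to unknown models.
    return { action: "reject", status: 400, reason: `unknown model: ${model}` };
  }
  if (inbound === "messages") {
    // Anthropic Messages in: passthrough to Anthropic, convert for OpenAI.
    return upstream === "anthropic"
      ? { action: "passthrough", upstream }
      : { action: "convert", upstream };
  }
  // OpenAI Responses in: passthrough to OpenAI; Anthropic is deferred (issue #1).
  return upstream === "openai"
    ? { action: "passthrough", upstream }
    : { action: "reject", status: 400, reason: "Responses -> Anthropic translation is not supported yet" };
}
```

Whichever branch is taken, the reply would still be rendered in the inbound format, per the note above.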

31 May 16:43

crossps

v1.0.0 - Initial release

First public release of LLM Bridge & Cache — a zero-dependency, bring-your-own-key local API bridge.

Features

Multi-provider routing — one OpenAI-compatible endpoint in front of OpenAI & Anthropic; route by model name or explicit provider/model.
Prompt-cache keepalive — keeps Anthropic's ~5-min prompt cache warm with minimal max_tokens=1 pings; multi-chat, auto-evicting.
Prompt injection — system prepend/append/replace + depth injection.
Streaming, tools, and vision translated between OpenAI and Anthropic formats.
Multi-key round-robin, CORS, and a /status endpoint that never exposes keys.
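
To make the prompt-cache keepalive feature more concrete, here is one way the bookkeeping could look. It is a sketch under stated assumptions, not the project's implementation: KeepaliveTracker, toPing, the 4-minute ping interval, and the 30-minute idle eviction window are all hypothetical. Only the ~5-minute cache lifetime and the max_tokens=1 ping come from the notes above.

```typescript
// Hypothetical sketch: intervals and names are assumptions, not the real code.
const PING_INTERVAL_MS = 4 * 60 * 1000; // assumed: ping shortly before the ~5-min cache expires
const IDLE_EVICT_MS = 30 * 60 * 1000;   // assumed: stop pinging chats idle for 30 minutes

interface ChatEntry {
  lastRealRequestAt: number; // last genuine client request (ms timestamp)
  lastCacheTouchAt: number;  // last real request or keepalive ping
}

class KeepaliveTracker {
  private chats: Record<string, ChatEntry> = {};

  // Call on every real request; it also refreshes the cache timer.
  recordRequest(chatId: string, now: number): void {
    this.chats[chatId] = { lastRealRequestAt: now, lastCacheTouchAt: now };
  }

  // Returns the chats that are due for a ping and evicts idle ones.
  tick(now: number): string[] {
    const due: string[] = [];
    for (const id of Object.keys(this.chats)) {
      const entry = this.chats[id];
      if (now - entry.lastRealRequestAt >= IDLE_EVICT_MS) {
        delete this.chats[id]; // auto-evict: no real traffic for too long
        continue;
      }
      if (now - entry.lastCacheTouchAt >= PING_INTERVAL_MS) {
        entry.lastCacheTouchAt = now;
        due.push(id);
      }
    }
    return due;
  }

  size(): number {
    return Object.keys(this.chats).length;
  }
}

// Assumed ping shape: replay the cached request, but ask for a single token.
function toPing(body: Record<string, unknown>): Record<string, unknown> {
  return { ...body, max_tokens: 1, stream: false };
}
```

Passing the clock in as a parameter keeps the tracker easy to test; a real bridge would presumably drive tick from a timer and send each due chat's toPing body upstream.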

Requires Node >= 18.17. Run it with npx llm-bridge-cache.
