Skip to content

Releases: crossps/llm-bridge-cache

v1.2.0 - GLM (Z.ai) provider

31 May 18:35

Choose a tag to compare

Adds GLM (Zhipu / Z.ai) as a built-in provider.

  • GLM models (glm-4.6, glm-4.5, glm-4.5-flash, …) route to https://api.z.ai/api/paas/v4 — Zhipu's OpenAI-compatible endpoint — via the existing passthrough path.
  • Set GLM_API_KEY and use a glm-* model. Override baseUrl for the coding-plan (/api/coding/paas/v4) or mainland (open.bigmodel.cn) endpoints.
  • Works across all three inbound formats; cross-converts for /v1/messages.
  • 18 tests, all passing.

v1.1.0 - /v1/messages and /v1/responses

31 May 17:06

Choose a tag to compare

Adds two more inbound API formats so almost any client works as-is.

New

  • POST /v1/messages (Anthropic Messages in) — passthrough to Anthropic with prompt-cache breakpoints + keepalive; cross-converts to OpenAI for gpt-* models, including streaming. Lets Claude Code / the Anthropic SDK finally get the cache keepalive.
  • POST /v1/responses (OpenAI Responses in) — passthrough to OpenAI. Lets Codex / the Agents SDK route through the bridge.

Notes

  • Each request replies in the same format it was sent in.
  • Responses->Anthropic translation is deferred (returns a clear 400) — see issue #1.
  • 17 tests, all passing.

v1.0.0 - Initial release

31 May 16:43

Choose a tag to compare

First public release of LLM Bridge & Cache — a zero-dependency, bring-your-own-key local API bridge.

Features

  • Multi-provider routing — one OpenAI-compatible endpoint in front of OpenAI & Anthropic; route by model name or explicit provider/model.
  • Prompt-cache keepalive — keeps Anthropic's ~5-min prompt cache warm with minimal max_tokens=1 pings; multi-chat, auto-evicting.
  • Prompt injection — system prepend/append/replace + depth injection.
  • Streaming, tools, and vision translated between OpenAI and Anthropic formats.
  • Multi-key round-robin, CORS, and a /status endpoint that never exposes keys.

Requires Node >= 18.17. npx llm-bridge-cache to run.