LLMix 2.0.0 is the production-ready rewrite of the LLM orchestration layer for AI agents, AI tools, and config-driven model workflows.
Highlights
- Config-driven model swaps with MDA presets, so provider, model, and runtime parameters can move out of application code.
- Keep the SDK you already use: OpenAI, Anthropic, Gemini, AI SDK v6, LiteLLM, OpenRouter, DeepInfra, Novita, Together, Sno GPU, or any async callable.
- Full call pipeline with cache lookup, circuit breaker, singleflight deduplication, adaptive concurrency, retries, key-pool rotation, thinking-token stripping, and telemetry.
- Two-tier response cache with L1 memory and optional Redis L2.
- Cross-runtime parity for Python, TypeScript, and Rust, including shared cache-key behavior and aligned retry semantics.
- Config Registry support for immutable, content-addressed preset snapshots and atomic runtime switches.
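
As a rough illustration of how a content-addressed cache key, an L1 memory lookup, and singleflight deduplication can fit together around "any async callable", here is a minimal sketch. It is not LLMix's API: `cache_key`, `SingleflightCache`, `fake_model`, and the preset fields are hypothetical names chosen for this example, and canonical JSON hashed with SHA-256 is only an assumed way to get a key that different runtimes could reproduce.

```python
import asyncio
import hashlib
import json
from typing import Any, Awaitable, Callable

Preset = dict[str, Any]
ModelCall = Callable[[Preset, str], Awaitable[str]]


def cache_key(preset: Preset, prompt: str) -> str:
    """Hypothetical content-addressed key: SHA-256 over canonical JSON."""
    payload = json.dumps(
        {"preset": preset, "prompt": prompt},
        sort_keys=True,           # key order must not change the hash
        separators=(",", ":"),    # fixed separators keep the encoding stable
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SingleflightCache:
    """Illustrative L1 memory cache with in-flight request deduplication."""

    def __init__(self, call: ModelCall) -> None:
        self._call = call
        self._l1: dict[str, str] = {}
        self._inflight: dict[str, asyncio.Future[str]] = {}
        self.upstream_calls = 0

    async def complete(self, preset: Preset, prompt: str) -> str:
        key = cache_key(preset, prompt)
        if key in self._l1:               # 1. memory cache hit
            return self._l1[key]
        pending = self._inflight.get(key)
        if pending is not None:           # 2. identical call already running
            return await pending
        future: asyncio.Future[str] = asyncio.get_running_loop().create_future()
        self._inflight[key] = future      # 3. this caller becomes the leader
        try:
            self.upstream_calls += 1
            result = await self._call(preset, prompt)
        except Exception as exc:
            future.set_exception(exc)
            future.exception()  # mark as retrieved so asyncio does not warn
            raise
        else:
            self._l1[key] = result
            future.set_result(result)
            return result
        finally:
            self._inflight.pop(key, None)
            if not future.done():         # leader was cancelled
                future.cancel()


async def fake_model(preset: Preset, prompt: str) -> str:
    """Stand-in for a provider SDK call (assumption, not a real client)."""
    await asyncio.sleep(0.01)
    return f"{preset['model']}: {prompt}"


async def demo() -> tuple[list[str], int]:
    client = SingleflightCache(fake_model)
    preset = {"provider": "example", "model": "demo-model", "temperature": 0.0}
    # Five concurrent identical requests share one upstream call.
    results = await asyncio.gather(
        *(client.complete(preset, "hello") for _ in range(5))
    )
    results.append(await client.complete(preset, "hello"))  # served from L1
    return results, client.upstream_calls
```

Running `asyncio.run(demo())` exercises both paths: the concurrent callers wait on the leader's future, and the final call is answered from memory. A real pipeline would add the remaining stages named above (circuit breaker, retries, key-pool rotation, an optional Redis tier) around the same key.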
Runtime Support
- Python 3.14+
- TypeScript / Node 20+
- Rust 1.83+ via llmix-rs (beta)
Packages
Package publishing is handled separately from GitHub Releases:
- npm: @snoai/llmix
- PyPI: llmix
- crates.io: llmix-rs