v1.0.4 - 2026-Q2 model pricing updates
Added — pricing for 2026-Q2 model releases
genai_otel/llm_pricing.json now covers models released since Feb 2026 across API, Hugging Face, and Ollama tag formats (following the existing alias convention for each family):
- Anthropic: Claude Opus 4.7 (`claude-opus-4-7`, `claude-opus-4.7`)
- OpenAI: GPT-5.3 family (`gpt-5.3`, `-chat-latest`, `-codex`) and GPT-5.4 family (base / mini / nano / pro)
- Google: Gemini 3.1 Flash Live Preview, Flash-Lite Preview; Gemma 4 series (`google/gemma-4-31B`, `-26B-A4B`, `-E4B`, `-E2B`) with HF + short + Ollama aliases
- xAI: Grok 4.20 dated snapshots (`-0309-reasoning`, `-non-reasoning`, `-multi-agent-0309`) and Grok 4.1 Fast variants
- MiniMax: M2.7 (+ highspeed) and M2.5 highspeed tier
- Zhipu / Z.ai: GLM-5.1 and GLM-5-Turbo (with `zai/` and `THUDM/` aliases)
- Moonshot: Kimi K2.6 covering both Moonshot first-party and OpenRouter aggregate prices
- Alibaba Qwen: Qwen 3.5 Plus, Qwen 3.6 Plus, Qwen 3.6 35B MoE (`Qwen/Qwen3.6-35B-A3B`)
- Sarvam AI: Sarvam-30B and Sarvam-105B (free tier, 22 Indic + English)
- Liquid AI: LFM2-24B-A2B MoE (OpenRouter-verified) and LFM2.5-350M edge model
Fixed
- Grok 4.20 pricing correction: `grok-4.20` prompt/completion prices corrected from the prior estimate ($3/$15 per 1M) to the xAI-documented $2/$6 per 1M tokens
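
The correction above is quoted per 1M tokens, while the JSON stores prices per 1K tokens (see Notes), so each figure is divided by 1,000 before it is written to the file. A minimal sketch of that arithmetic follows; the function names and the example token counts are hypothetical and are not part of `genai_otel`.

```python
def per_million_to_per_thousand(price_per_1m: float) -> float:
    """Convert a USD-per-1M-token price to USD per 1K tokens."""
    return price_per_1m / 1000.0


def request_cost(prompt_tokens: int, completion_tokens: int,
                 prompt_per_1k: float, completion_per_1k: float) -> float:
    """Cost in USD of one request, given per-1K-token prices."""
    return (prompt_tokens / 1000.0) * prompt_per_1k + \
           (completion_tokens / 1000.0) * completion_per_1k


# Prior estimate ($3/$15 per 1M) and corrected prices ($2/$6 per 1M).
old_prompt = per_million_to_per_thousand(3.0)
old_completion = per_million_to_per_thousand(15.0)
new_prompt = per_million_to_per_thousand(2.0)
new_completion = per_million_to_per_thousand(6.0)

# Hypothetical request: 10,000 prompt tokens and 2,000 completion tokens.
old_cost = request_cost(10_000, 2_000, old_prompt, old_completion)
new_cost = request_cost(10_000, 2_000, new_prompt, new_completion)

print(f"old estimate: ${old_cost:.4f}")  # old estimate: $0.0600
print(f"corrected:    ${new_cost:.4f}")  # corrected:    $0.0320
```

For this example request the reported cost drops from about $0.06 to about $0.032, so dashboards built on the earlier estimate will have overstated Grok 4.20 spend.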
Notes
- All pricing is USD per 1K tokens in the JSON (consistent with the existing schema).
- Where hosted pricing isn't yet published (e.g. Qwen3.6-35B-A3B, Gemma 4 26B-A4B / E4B / E2B, LFM2.5-350M), entries use size/tier estimates and are flagged with `estimated` in the `note` field.
- Sarvam-30B / 105B priced at `0.0` to reflect Sarvam's free first-party API tier as of this release; revisit when a paid tier is published.
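
Both notes above describe entries that may need revisiting later. One way to find them is to scan the pricing data for the `estimated` flag and for zero prices. The sketch below is illustrative only: the flat model-name-to-entry layout, the `promptPrice` / `completionPrice` key names, the model names, and the sample values are all assumptions; only the `note` field and the `estimated` flag come from these release notes.

```python
import json

# Hypothetical sample data; the real llm_pricing.json layout may differ.
SAMPLE = json.loads("""
{
  "example/published-model": {"promptPrice": 0.002, "completionPrice": 0.006},
  "example/estimated-model": {"promptPrice": 0.0001, "completionPrice": 0.0002,
                              "note": "estimated from size/tier"},
  "example/free-model": {"promptPrice": 0.0, "completionPrice": 0.0,
                         "note": "free first-party tier"}
}
""")


def entries_to_review(pricing: dict) -> tuple[list[str], list[str]]:
    """Return (estimated, zero_priced) model names, each sorted."""
    estimated = sorted(
        name for name, entry in pricing.items()
        if "estimated" in str(entry.get("note", "")).lower()
    )
    zero_priced = sorted(
        name for name, entry in pricing.items()
        if entry.get("promptPrice") == 0.0
        and entry.get("completionPrice") == 0.0
    )
    return estimated, zero_priced


estimated, zero_priced = entries_to_review(SAMPLE)
print("estimated:", estimated)
print("zero-priced:", zero_priced)
```

Running a check like this against the real file before each release would surface entries whose estimates can be replaced once hosted pricing is published.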