v1.0.4 - 2026-Q2 model pricing updates

@Mandark-droid Mandark-droid released this 21 Apr 14:29
· 22 commits to main since this release

Added — pricing for 2026-Q2 model releases

genai_otel/llm_pricing.json now covers models released since Feb 2026 across API, Hugging Face, and Ollama tag formats (following the existing alias convention for each family):

  • Anthropic — Claude Opus 4.7 (claude-opus-4-7, claude-opus-4.7)
  • OpenAI — GPT-5.3 family (gpt-5.3, -chat-latest, -codex) and GPT-5.4 family (base / mini / nano / pro)
  • Google — Gemini 3.1 Flash Live Preview, Flash-Lite Preview; Gemma 4 series (google/gemma-4-31B, -26B-A4B, -E4B, -E2B) with HF + short + Ollama aliases
  • xAI — Grok 4.20 dated snapshots (-0309-reasoning, -non-reasoning, -multi-agent-0309) and Grok 4.1 Fast variants
  • MiniMax — M2.7 (+ highspeed) and M2.5 highspeed tier
  • Zhipu / Z.ai — GLM-5.1 and GLM-5-Turbo (with zai/ and THUDM/ aliases)
  • Moonshot — Kimi K2.6 covering both Moonshot first-party and OpenRouter aggregate prices
  • Alibaba Qwen — Qwen 3.5 Plus, Qwen 3.6 Plus, Qwen 3.6 35B MoE (Qwen/Qwen3.6-35B-A3B)
  • Sarvam AI — Sarvam-30B and Sarvam-105B (free tier, 22 Indic + English)
  • Liquid AI — LFM2-24B-A2B MoE (OpenRouter-verified) and LFM2.5-350M edge model

Fixed

  • Grok 4.20 pricing correction: grok-4.20 prompt/completion prices corrected from the prior estimate ($3/$15 per 1M tokens) to the xAI-documented $2/$6 per 1M tokens.
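
Because the fix above is quoted per 1M tokens while the Notes say the JSON stores USD per 1K tokens, the conversion is worth spelling out. The following is a minimal sketch of that arithmetic only; the function names and the 10,000 / 2,000 token request are hypothetical and are not taken from genai_otel.

```python
def per_million_to_per_thousand(usd_per_million: float) -> float:
    """Convert a vendor-quoted USD-per-1M-token price to USD per 1K tokens."""
    return usd_per_million / 1_000


def request_cost(prompt_tokens: int, completion_tokens: int,
                 prompt_per_1k: float, completion_per_1k: float) -> float:
    """Cost in USD for one request, given per-1K-token prices."""
    return ((prompt_tokens / 1_000) * prompt_per_1k
            + (completion_tokens / 1_000) * completion_per_1k)


# Corrected grok-4.20 prices from the release notes: $2 / $6 per 1M tokens.
corrected_prompt = per_million_to_per_thousand(2.0)
corrected_completion = per_million_to_per_thousand(6.0)

# Prior estimate from the release notes: $3 / $15 per 1M tokens.
previous_prompt = per_million_to_per_thousand(3.0)
previous_completion = per_million_to_per_thousand(15.0)

# Hypothetical request size, chosen only to illustrate the difference.
corrected = request_cost(10_000, 2_000, corrected_prompt, corrected_completion)
previous = request_cost(10_000, 2_000, previous_prompt, previous_completion)

print(f"corrected: ${corrected:.4f}  previous estimate: ${previous:.4f}")
```

Under these assumptions the per-1K values are 0.002 (prompt) and 0.006 (completion), so cost reports computed with the old estimate overstated grok-4.20 spend.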

Notes

  • All pricing in the JSON is in USD per 1K tokens (consistent with the existing schema), so vendor prices quoted per 1M tokens are divided by 1,000 when stored.
  • Where hosted pricing isn't yet published (e.g. Qwen3.6-35B-A3B, Gemma 4 26B-A4B / E4B / E2B, LFM2.5-350M), entries use size/tier estimates and are flagged with "estimated" in the note field.
  • Sarvam-30B / 105B priced at 0.0 to reflect Sarvam's free first-party API tier as of this release; revisit when a paid tier is published.
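
The notes above suggest two kinds of entries worth auditing after upgrading: those flagged as estimated and those priced at 0.0. The sketch below shows one way to list them. It runs against a small in-memory dictionary, not the real file: the key names "prompt" and "completion", the model keys, and the placeholder estimated entry with its price are all assumptions made for illustration, and the actual layout of genai_otel/llm_pricing.json may differ. Only the "note" field and the 0.0 Sarvam price come from the release notes.

```python
from typing import Dict, List

# Hypothetical stand-in for a slice of the pricing data (USD per 1K tokens).
# Field names and the "example/estimated-model" entry are illustrative only.
EXAMPLE_PRICING: Dict[str, dict] = {
    "grok-4.20": {"prompt": 0.002, "completion": 0.006, "note": "xAI-documented"},
    "Sarvam-30B": {"prompt": 0.0, "completion": 0.0, "note": "free first-party tier"},
    "example/estimated-model": {
        "prompt": 0.0001,       # placeholder value, not a real price
        "completion": 0.0002,   # placeholder value, not a real price
        "note": "estimated from size/tier",
    },
}


def estimated_models(pricing: Dict[str, dict]) -> List[str]:
    """Return model keys whose note field mentions 'estimated'."""
    return [
        name for name, entry in pricing.items()
        if "estimated" in entry.get("note", "").lower()
    ]


def free_models(pricing: Dict[str, dict]) -> List[str]:
    """Return model keys whose prompt and completion prices are both 0.0."""
    return [
        name for name, entry in pricing.items()
        if entry.get("prompt") == 0.0 and entry.get("completion") == 0.0
    ]


print("estimated:", estimated_models(EXAMPLE_PRICING))
print("free:", free_models(EXAMPLE_PRICING))
```

A check like this could be rerun when hosted prices are published or when a paid Sarvam tier appears, so that placeholder entries are not mistaken for confirmed prices.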