refactor(presets): add Z.ai GLM + MiniMax + Mercury + DeepSeek direct#301

Merged
Destynova2 merged 1 commit into main from refactor/ultra-cheap-add-glm-minimax, Apr 27, 2026

Conversation

@Destynova2 (Contributor)

Summary

Adds 4 more free-tier sources to ultra-cheap, verified against April 2026 source docs.

New providers

| Provider | Type | What you get | Source |
|---|---|---|---|
| Z.ai (api.z.ai) ⭐ | ONGOING free | GLM-4.7-Flash + GLM-4.5-Flash, recurring free for life (rate-limited, ~1M tok/day reported). GLM-4.7-Flash is a 30B-A3B MoE coder released January 2026. | docs.z.ai |
| DeepSeek direct (api.deepseek.com) | 5M signup tokens (30-day expiry), then pay-as-you-go | $0.14/$0.28 V4 Flash, $0.55/$2.19 R1. Cheaper direct than via OpenRouter pass-through (no OR markup). | api-docs.deepseek.com |
| Inception Mercury (api.inceptionlabs.ai) | 10M signup tokens (30-day expiry) | Mercury 2 is a diffusion LLM; vendor claims 1109 t/s (~155 t/s effective). Best for trivial edits at extreme speed. | docs.inceptionlabs.ai |
| MiniMax (api.minimaxi.com) | Trial, free through 2026-11-07 | MiniMax-M2.5: 80.2% SWE-V, Apache 2.0 base, strong agentic reasoning. | platform.minimax.io |

Routing changes

| Slot | Was | Now (priority order) |
|---|---|---|
| default | Groq Llama 70B → OR :free → V4 Flash → Grok | Z.ai GLM-4.7-Flash → Groq Llama 70B → MiniMax M2.5 → OR :free → DeepSeek direct → OR paid → Grok |
| think | OR R1 :free → R1 paid → Groq | OR R1 :free → Z.ai GLM-4.7-Flash → DeepSeek R1 direct → OR R1 paid → Groq → Grok |
| background | Cerebras → Groq → OR :free | Cerebras → Z.ai GLM-4.5-Flash → Groq gpt-oss-20b → Llama 8B → OR :free |
| trivial | Cerebras → Groq → Llama | Mercury 2 (1000 t/s diffusion) → Cerebras → Groq → Llama |
| search | Grok → Groq → OR | Grok → Z.ai GLM-4.7-Flash → Groq → DeepSeek direct |

The default-model chain now has 7 mappings with 4 free options before any paid call.
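The priority order can be sketched as a first-match walk over the chain. This is an illustrative sketch only, not the actual preset implementation: the provider and model identifiers follow the table above, but `pick_provider` and the tuple layout are assumptions.

```python
# Hypothetical sketch of the "default" slot's priority-ordered fallback.
# Each entry: (provider, model, is_free). Names follow the PR tables;
# the dispatch logic itself is illustrative, not grob's actual code.
DEFAULT_CHAIN = [
    ("zai", "glm-4.7-flash", True),                       # free, ongoing
    ("groq", "llama-3.3-70b", True),                      # free, ongoing
    ("minimax", "minimax-m2.5", True),                    # free trial
    ("openrouter", "deepseek/deepseek-chat:free", True),  # free tier
    ("deepseek", "deepseek-chat", False),                 # paid, direct
    ("openrouter", "deepseek/deepseek-chat", False),      # paid
    ("xai", "grok-4.1-fast", False),                      # paid, last resort
]

def pick_provider(available):
    """Return the first chain entry whose provider has a configured key."""
    for provider, model, is_free in DEFAULT_CHAIN:
        if provider in available:
            return provider, model, is_free
    raise RuntimeError("no provider available for default slot")

# 4 free options sit ahead of any paid call, matching the PR's claim.
free_before_paid = sum(1 for _, _, f in DEFAULT_CHAIN if f)
```

With only `minimax` and `xai` keys configured, the walk skips the first two free entries and lands on MiniMax before ever reaching a paid provider.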

Tier guards updated

  • trivial (`max_tokens_below=500`): added mercury (best diffusion speed for tiny edits)
  • complex (>100K input): added deepseek + minimax (both 131K+ ctx native)
  • medium (>7K input): added zai (GLM-4.7-Flash native 131K ctx)
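The guard thresholds above amount to a simple classification by request size. A minimal sketch, assuming the thresholds stated in the bullets; the function name and tier labels are hypothetical, not the preset's real API:

```python
# Illustrative tier classification using the thresholds from the PR:
# trivial when the caller caps output below 500 tokens, complex above
# 100K input tokens, medium above 7K. Names are assumptions.
def classify_tier(input_tokens, max_tokens=None):
    if max_tokens is not None and max_tokens < 500:
        return "trivial"   # routed toward mercury (fast tiny edits)
    if input_tokens > 100_000:
        return "complex"   # deepseek + minimax (131K+ native ctx)
    if input_tokens > 7_000:
        return "medium"    # zai (GLM-4.7-Flash, 131K ctx)
    return "default"
```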

Account requirements (documented at top of file)

| Account | Free tier? | Cost after |
|---|---|---|
| GROQ_API_KEY | ✅ Ongoing | |
| CEREBRAS_API_KEY | ✅ Ongoing, 1M tok/day | |
| ZAI_API_KEY | ✅ Ongoing | |
| OPENROUTER_API_KEY | 50 RPD base / 1000 RPD with $10 lifetime | $10 one-shot |
| DEEPSEEK_API_KEY | 5M signup, 30d | $0.14/$0.28 V4 Flash |
| MERCURY_API_KEY | 10M signup, 30d | varies |
| MINIMAX_API_KEY | Trial until 2026-11-07 | varies post-trial |
| XAI_API_KEY | ❌ paid only | $0.20/$0.50 Grok 4.1 Fast |
| GEMINI_API_KEY (opt-in) | 250+100 RPD | trains on free data ⚠️ |
| NEBIUS_API_KEY (opt-in) | | $0.02/$0.06 floor |
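A quick way to see which of these keys are configured before running the preset. The key names come from the table above; the helper itself is a hypothetical sketch, not part of the preset:

```python
# Report which provider API keys from the table are missing. Pass a
# mapping such as os.environ; the split into required vs optional
# mirrors the table's always-on vs opt-in / paid providers.
REQUIRED = ["GROQ_API_KEY", "CEREBRAS_API_KEY", "ZAI_API_KEY",
            "OPENROUTER_API_KEY"]
OPTIONAL = ["DEEPSEEK_API_KEY", "MERCURY_API_KEY", "MINIMAX_API_KEY",
            "XAI_API_KEY", "GEMINI_API_KEY", "NEBIUS_API_KEY"]

def missing_keys(env, required=REQUIRED):
    """Return the required key names that are unset or empty in env."""
    return [k for k in required if not env.get(k)]
```

For example, `missing_keys(os.environ)` on a fresh machine would list all four required keys.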

Test plan

  • `grob preset info ultra-cheap` — parses; 10 providers, 5 models with rich fallback chains
  • No hardcoded version strings (CI guard)
  • All endpoints/model IDs from official source docs
  • Account requirements documented with rationale per provider
  • Docs lint passes in CI
  • Smoke after merge

🤖 Generated with Claude Code

Destynova2 enabled auto-merge April 27, 2026 20:26
Adds 4 more free-tier sources to ultra-cheap, verified against the
April 2026 source docs of each provider.

New providers:

- **Z.ai** (api.z.ai) — ONGOING free GLM-4.7-Flash and GLM-4.5-Flash.
  Not a trial, not signup credits — recurring free for life,
  rate-limited (~1M tok/day reported). GLM-4.7-Flash is a 30B-A3B
  MoE coder released January 2026.

- **DeepSeek direct** (api.deepseek.com) — 5M tokens free at signup
  (30-day expiry), then pay-as-you-go at $0.14/$0.28 V4 Flash and
  $0.55/$2.19 R1. Cheaper direct than via OpenRouter pass-through
  (no OR markup).

- **Inception Mercury** (api.inceptionlabs.ai) — 10M tokens free at
  signup (30-day expiry). Mercury 2 is a diffusion LLM, vendor
  claim 1109 t/s (~155 t/s effective due to reasoning overhead).
  Best for trivial edits at high speed.

- **MiniMax** (api.minimaxi.com) — TRIAL free through 2026-11-07.
  MiniMax-M2.5 = 80.2% SWE-V, Apache 2.0 base, strong agentic
  reasoning. Free until trial expires.

Routing changes:

- default-model p1 = Z.ai GLM-4.7-Flash (was Groq Llama 70B). The
  30B-A3B coding-specific MoE outperforms generalist Llama on dev
  tasks at the same free cost. Groq Llama 70B drops to p2.
- default chain expanded to 7 mappings (zai → groq → minimax →
  OR :free → deepseek direct → OR paid → grok). DeepSeek direct
  comes before OR paid because direct is ~10% cheaper post-bonus.
- think p2 added Z.ai GLM-4.7-Flash (free reasoning fallback).
- think p3 added DeepSeek direct (cheaper than OR for R1 paid).
- background p2 added Z.ai GLM-4.5-Flash (second free Chinese coder).
- trivial p1 = Mercury 2 (was Cerebras). 1000 t/s diffusion edits
  beat Cerebras at the trivial slot when the 10M signup pool is
  available; Cerebras drops to p2 (1M tok/day ongoing).
- search p2 added Z.ai GLM-4.7-Flash for free fallback.

Tier guards updated:

- trivial : added mercury (best for tiny edits at extreme speed)
- complex >100K input : added deepseek + minimax (both 131K+ ctx)
- medium >7K input : added zai (131K ctx on GLM-4.7-Flash)

Documented account requirements at the top with rationale for each
of the 8 enabled providers (groq, cerebras, zai, openrouter, deepseek,
mercury, minimax, xai). Gemini and Nebius stay disabled by default.

Sources verified 2026-04-27:
  https://docs.z.ai/devpack/overview
  https://api-docs.deepseek.com/quick_start/pricing
  https://docs.inceptionlabs.ai/get-started/get-started
  https://platform.minimax.io/docs/guides/quickstart-preparation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Destynova2 force-pushed the refactor/ultra-cheap-add-glm-minimax branch from fcfcbff to ae03eb0 on April 27, 2026 20:27
Destynova2 merged commit 6a7dcf6 into main, Apr 27, 2026
29 of 30 checks passed
Destynova2 deleted the refactor/ultra-cheap-add-glm-minimax branch April 27, 2026 20:28