refactor(presets): add Z.ai GLM + MiniMax + Mercury + DeepSeek direct#301

Merged
Destynova2 merged 1 commit into main from refactor/ultra-cheap-add-glm-minimax, Apr 27, 2026

Conversation

@Destynova2 (Contributor)

Summary

Adds 4 more free-tier sources to ultra-cheap, verified against April 2026 source docs.

New providers

| Provider | Type | What you get | Source |
|---|---|---|---|
| Z.ai (api.z.ai) ⭐ | ONGOING free | GLM-4.7-Flash + GLM-4.5-Flash, recurring free for life (rate-limited, ~1M tok/day reported). GLM-4.7-Flash is a 30B-A3B MoE coder released January 2026. | docs.z.ai |
| DeepSeek direct (api.deepseek.com) | 5M signup tokens (30-day expiry), then pay-as-you-go | $0.14/$0.28 V4 Flash, $0.55/$2.19 R1. Cheaper direct than via OpenRouter pass-through (no OR markup). | api-docs.deepseek.com |
| Inception Mercury (api.inceptionlabs.ai) | 10M signup tokens (30-day expiry) | Mercury 2 is a diffusion LLM; vendor claims 1109 t/s (~155 t/s effective). Best for trivial edits at extreme speed. | docs.inceptionlabs.ai |
| MiniMax (api.minimaxi.com) | Trial, free through 2026-11-07 | MiniMax-M2.5: 80.2% SWE-V, Apache 2.0 base, strong agentic reasoning. | platform.minimax.io |

Routing changes

| Slot | Was | Now (priority order) |
|---|---|---|
| default | Groq Llama 70B → OR :free → V4 Flash → Grok | Z.ai GLM-4.7-Flash → Groq Llama 70B → MiniMax M2.5 → OR :free → DeepSeek direct → OR paid → Grok |
| think | OR R1 :free → R1 paid → Groq | OR R1 :free → Z.ai GLM-4.7-Flash → DeepSeek R1 direct → OR R1 paid → Groq → Grok |
| background | Cerebras → Groq → OR :free | Cerebras → Z.ai GLM-4.5-Flash → Groq gpt-oss-20b → Llama 8B → OR :free |
| trivial | Cerebras → Groq → Llama | Mercury 2 (1000 t/s diffusion) → Cerebras → Groq → Llama |
| search | Grok → Groq → OR | Grok → Z.ai GLM-4.7-Flash → Groq → DeepSeek direct |

The default-model chain now has 7 mappings with 4 free options before any paid call.
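The priority order can be sketched as a first-match walk over the chain. This is an illustrative sketch only, not the actual preset implementation: the provider and model identifiers follow the table above, but `pick_provider` and the tuple layout are assumptions.

```python
# Hypothetical sketch of the "default" slot's priority-ordered fallback.
# Each entry: (provider, model, is_free). Names follow the PR tables;
# the dispatch logic itself is illustrative, not grob's actual code.
DEFAULT_CHAIN = [
    ("zai", "glm-4.7-flash", True),                       # free, ongoing
    ("groq", "llama-3.3-70b", True),                      # free, ongoing
    ("minimax", "minimax-m2.5", True),                    # free trial
    ("openrouter", "deepseek/deepseek-chat:free", True),  # free tier
    ("deepseek", "deepseek-chat", False),                 # paid, direct
    ("openrouter", "deepseek/deepseek-chat", False),      # paid
    ("xai", "grok-4.1-fast", False),                      # paid, last resort
]

def pick_provider(available):
    """Return the first chain entry whose provider has a configured key."""
    for provider, model, is_free in DEFAULT_CHAIN:
        if provider in available:
            return provider, model, is_free
    raise RuntimeError("no provider available for default slot")

# 4 free options sit ahead of any paid call, matching the PR's claim.
free_before_paid = sum(1 for _, _, f in DEFAULT_CHAIN if f)
```

With only `minimax` and `xai` keys configured, the walk skips the first two free entries and lands on MiniMax before ever reaching a paid provider.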

Tier guards updated

  • trivial (`max_tokens_below=500`): added mercury (best diffusion speed for tiny edits)
  • complex (>100K input): added deepseek + minimax (both 131K+ ctx native)
  • medium (>7K input): added zai (GLM-4.7-Flash native 131K ctx)
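The guard thresholds above amount to a simple classification by request size. A minimal sketch, assuming the thresholds stated in the bullets; the function name and tier labels are hypothetical, not the preset's real API:

```python
# Illustrative tier classification using the thresholds from the PR:
# trivial when the caller caps output below 500 tokens, complex above
# 100K input tokens, medium above 7K. Names are assumptions.
def classify_tier(input_tokens, max_tokens=None):
    if max_tokens is not None and max_tokens < 500:
        return "trivial"   # routed toward mercury (fast tiny edits)
    if input_tokens > 100_000:
        return "complex"   # deepseek + minimax (131K+ native ctx)
    if input_tokens > 7_000:
        return "medium"    # zai (GLM-4.7-Flash, 131K ctx)
    return "default"
```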

Account requirements (documented at top of file)

| Account | Free tier? | Cost after |
|---|---|---|
| GROQ_API_KEY | ✅ Ongoing | |
| CEREBRAS_API_KEY | ✅ Ongoing, 1M tok/day | |
| ZAI_API_KEY | ✅ Ongoing | |
| OPENROUTER_API_KEY | 50 RPD base / 1000 RPD with $10 lifetime | $10 one-shot |
| DEEPSEEK_API_KEY | 5M signup, 30d | $0.14/$0.28 V4 Flash |
| MERCURY_API_KEY | 10M signup, 30d | varies |
| MINIMAX_API_KEY | Trial until 2026-11-07 | varies post-trial |
| XAI_API_KEY | ❌ paid only | $0.20/$0.50 Grok 4.1 Fast |
| GEMINI_API_KEY (opt-in) | 250+100 RPD | trains on free data ⚠️ |
| NEBIUS_API_KEY (opt-in) | | $0.02/$0.06 floor |
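A quick way to see which of these keys are configured before running the preset. The key names come from the table above; the helper itself is a hypothetical sketch, not part of the preset:

```python
# Report which provider API keys from the table are missing. Pass a
# mapping such as os.environ; the split into required vs optional
# mirrors the table's always-on vs opt-in / paid providers.
REQUIRED = ["GROQ_API_KEY", "CEREBRAS_API_KEY", "ZAI_API_KEY",
            "OPENROUTER_API_KEY"]
OPTIONAL = ["DEEPSEEK_API_KEY", "MERCURY_API_KEY", "MINIMAX_API_KEY",
            "XAI_API_KEY", "GEMINI_API_KEY", "NEBIUS_API_KEY"]

def missing_keys(env, required=REQUIRED):
    """Return the required key names that are unset or empty in env."""
    return [k for k in required if not env.get(k)]
```

For example, `missing_keys(os.environ)` on a fresh machine would list all four required keys.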

Test plan

  • `grob preset info ultra-cheap` — parses; 10 providers, 5 models with rich fallback chains
  • No hardcoded version strings (CI guard)
  • All endpoints/model IDs from official source docs
  • Account requirements documented with rationale per provider
  • Docs lint passes in CI
  • Smoke after merge

🤖 Generated with Claude Code

Destynova2 enabled auto-merge April 27, 2026 20:26
Adds 4 more free-tier sources to ultra-cheap, verified against the
April 2026 source docs of each provider.

New providers:

- **Z.ai** (api.z.ai) — ONGOING free GLM-4.7-Flash and GLM-4.5-Flash.
  Not a trial, not signup credits — recurring free for life,
  rate-limited (~1M tok/day reported). GLM-4.7-Flash is a 30B-A3B
  MoE coder released January 2026.

- **DeepSeek direct** (api.deepseek.com) — 5M tokens free at signup
  (30-day expiry), then pay-as-you-go at $0.14/$0.28 V4 Flash and
  $0.55/$2.19 R1. Cheaper direct than via OpenRouter pass-through
  (no OR markup).

- **Inception Mercury** (api.inceptionlabs.ai) — 10M tokens free at
  signup (30-day expiry). Mercury 2 is a diffusion LLM, vendor
  claim 1109 t/s (~155 t/s effective due to reasoning overhead).
  Best for trivial edits at high speed.

- **MiniMax** (api.minimaxi.com) — TRIAL free through 2026-11-07.
  MiniMax-M2.5 = 80.2% SWE-V, Apache 2.0 base, strong agentic
  reasoning. Free until trial expires.

Routing changes:

- default-model p1 = Z.ai GLM-4.7-Flash (was Groq Llama 70B). The
  30B-A3B coding-specific MoE outperforms generalist Llama on dev
  tasks at the same free cost. Groq Llama 70B drops to p2.
- default chain expanded to 7 mappings (zai → groq → minimax →
  OR :free → deepseek direct → OR paid → grok). DeepSeek direct
  comes before OR paid because direct is ~10% cheaper post-bonus.
- think p2 added Z.ai GLM-4.7-Flash (free reasoning fallback).
- think p3 added DeepSeek direct (cheaper than OR for R1 paid).
- background p2 added Z.ai GLM-4.5-Flash (second free Chinese coder).
- trivial p1 = Mercury 2 (was Cerebras). 1000 t/s diffusion edits
  beat Cerebras at the trivial slot when the 10M signup pool is
  available; Cerebras drops to p2 (1M tok/day ongoing).
- search p2 added Z.ai GLM-4.7-Flash for free fallback.

Tier guards updated:

- trivial : added mercury (best for tiny edits at extreme speed)
- complex >100K input : added deepseek + minimax (both 131K+ ctx)
- medium >7K input : added zai (131K ctx on GLM-4.7-Flash)

Documented account requirements at the top with rationale for each
of the 8 enabled providers (groq, cerebras, zai, openrouter, deepseek,
mercury, minimax, xai). Gemini and Nebius stay disabled by default.

Sources verified 2026-04-27:
  https://docs.z.ai/devpack/overview
  https://api-docs.deepseek.com/quick_start/pricing
  https://docs.inceptionlabs.ai/get-started/get-started
  https://platform.minimax.io/docs/guides/quickstart-preparation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Destynova2 force-pushed the refactor/ultra-cheap-add-glm-minimax branch from fcfcbff to ae03eb0 on April 27, 2026 20:27
Destynova2 merged commit 6a7dcf6 into main, Apr 27, 2026
29 of 30 checks passed
Destynova2 deleted the refactor/ultra-cheap-add-glm-minimax branch April 27, 2026 20:28