refactor(presets): add Z.ai GLM + MiniMax + Mercury + DeepSeek direct #301
Merged
Destynova2 merged 1 commit into main on Apr 27, 2026
Adds 4 more free-tier sources to ultra-cheap, verified against the April 2026 source docs of each provider.

New providers:
- **Z.ai** (api.z.ai): ONGOING free GLM-4.7-Flash and GLM-4.5-Flash. Not a trial and not signup credits: recurring free for life, rate-limited (~1M tok/day reported). GLM-4.7-Flash is a 30B-A3B MoE coder released January 2026.
- **DeepSeek direct** (api.deepseek.com): 5M tokens free at signup (30-day expiry), then pay-as-you-go at $0.14/$0.28 for V4 Flash and $0.55/$2.19 for R1. Cheaper direct than via OpenRouter pass-through (no OR markup).
- **Inception Mercury** (api.inceptionlabs.ai): 10M tokens free at signup (30-day expiry). Mercury 2 is a diffusion LLM; vendor claim is 1109 t/s (~155 t/s effective due to reasoning overhead). Best for trivial requests and edits at high speed.
- **MiniMax** (api.minimaxi.com): TRIAL free through 2026-11-07. MiniMax-M2.5 = 80.2% SWE-V, Apache 2.0 base, strong agentic reasoning. Free until the trial expires.

Routing changes:
- default-model p1 = Z.ai GLM-4.7-Flash (was Groq Llama 70B). The 30B-A3B coding-specific MoE outperforms generalist Llama on dev tasks at the same free cost. Groq Llama 70B drops to p2.
- default chain expanded to 7 mappings (zai → groq → minimax → OR :free → deepseek direct → OR paid → grok). DeepSeek direct comes before OR paid because direct is ~10% cheaper post-bonus.
- think p2 added Z.ai GLM-4.7-Flash (free reasoning fallback).
- think p3 added DeepSeek direct (cheaper than OR for R1 paid).
- background p2 added Z.ai GLM-4.5-Flash (second free Chinese coder).
- trivial p1 = Mercury 2 (was Cerebras). 1000 t/s diffusion edits beat Cerebras at the trivial slot when the 10M signup pool is available; Cerebras drops to p2 (1M tok/day ongoing).
- search p2 added Z.ai GLM-4.7-Flash for free fallback.

Tier guards updated:
- trivial : added mercury (best for tiny edits at extreme speed)
- complex >100K input : added deepseek + minimax (both 131K+ ctx)
- medium >7K input : added zai (131K ctx on GLM-4.7-Flash)

Documented account requirements at the top with rationale for each of the 8 enabled providers (groq, cerebras, zai, openrouter, deepseek, mercury, minimax, xai). Gemini and Nebius stay disabled by default.

Sources verified 2026-04-27:
https://docs.z.ai/devpack/overview
https://api-docs.deepseek.com/quick_start/pricing
https://docs.inceptionlabs.ai/get-started/get-started
https://platform.minimax.io/docs/guides/quickstart-preparation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
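The tier guards in the commit message can be pictured as a small classifier that maps a request to a tier, plus a per-tier list of providers. The sketch below is illustrative only and is not the preset's actual format or evaluation logic: `Request`, `classify`, and `TIER_ADDED_PROVIDERS` are hypothetical names, the 500-token trivial threshold is taken from the `max_tokens_below=500` guard in the PR summary, the order in which guards are checked is an assumption, and each list contains only the providers this PR adds, not the full guard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    """Hypothetical request shape: prompt size and requested output cap."""
    input_tokens: int
    max_tokens: int


# Only the providers this PR adds to each guard; the real guards may
# list more providers that the PR text does not enumerate.
TIER_ADDED_PROVIDERS = {
    "trivial": ["mercury"],
    "medium": ["zai"],
    "complex": ["deepseek", "minimax"],
}


def classify(req: Request) -> Optional[str]:
    """Return the tier whose guard matches, or None if no guard applies.

    Assumption: large inputs are checked first, so a request with a
    huge prompt is never treated as trivial even if max_tokens is small.
    """
    if req.input_tokens > 100_000:   # complex: >100K input
        return "complex"
    if req.input_tokens > 7_000:     # medium: >7K input
        return "medium"
    if req.max_tokens < 500:         # trivial: max_tokens_below=500
        return "trivial"
    return None


if __name__ == "__main__":
    for req in (Request(150_000, 4096), Request(8_000, 4096), Request(200, 100)):
        tier = classify(req)
        print(req, "->", tier, TIER_ADDED_PROVIDERS.get(tier, []))
```

Checking input size before the output cap keeps long-context requests on the 131K-context providers; a real router may resolve overlapping guards differently.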
fcfcbff to ae03eb0
Summary
Adds 4 more free-tier sources to ultra-cheap, verified against April 2026 source docs.

New providers
- Z.ai (api.z.ai)
- DeepSeek direct (api.deepseek.com)
- Inception Mercury (api.inceptionlabs.ai)
- MiniMax (api.minimaxi.com)

Routing changes
The default-model chain now has 7 mappings with 4 free options before any paid call.
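As a rough illustration of how a priority-ordered chain like this behaves, the sketch below tries each mapping in order and falls through to the next one on failure. It is a minimal model under stated assumptions, not grob's implementation: `Mapping`, `ProviderError`, `route`, and `leading_free` are hypothetical names, the provider labels are shorthand for the seven entries named in the commit message, and the free/paid flags reflect the PR's statement that four free options precede any paid call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Mapping:
    priority: int   # lower number is tried first (p1, p2, ...)
    provider: str   # shorthand label, not a real config key
    free: bool      # assumption based on the PR description


class ProviderError(Exception):
    """Raised by the caller-supplied function when a provider fails."""


DEFAULT_CHAIN: List[Mapping] = [
    Mapping(1, "zai", True),
    Mapping(2, "groq", True),
    Mapping(3, "minimax", True),
    Mapping(4, "openrouter-free", True),
    Mapping(5, "deepseek-direct", False),
    Mapping(6, "openrouter-paid", False),
    Mapping(7, "grok", False),
]


def route(chain: List[Mapping], call: Callable[[str], str]) -> Tuple[str, str]:
    """Try providers in priority order; return (provider, response)."""
    failures = []
    for mapping in sorted(chain, key=lambda m: m.priority):
        try:
            return mapping.provider, call(mapping.provider)
        except ProviderError as exc:
            failures.append((mapping.provider, str(exc)))
    raise RuntimeError(f"all providers failed: {failures}")


def leading_free(chain: List[Mapping]) -> int:
    """Count free mappings that come before the first paid one."""
    count = 0
    for mapping in sorted(chain, key=lambda m: m.priority):
        if not mapping.free:
            break
        count += 1
    return count


if __name__ == "__main__":
    def flaky(provider: str) -> str:
        # Simulate the first two free providers being rate-limited.
        if provider in {"zai", "groq"}:
            raise ProviderError("rate limited")
        return f"ok from {provider}"

    print(route(DEFAULT_CHAIN, flaky))
    print(leading_free(DEFAULT_CHAIN))
```

With the first two providers failing, the request lands on the third free option, so no paid provider is contacted until all four free mappings have been exhausted.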
Tier guards updated
- trivial (max_tokens_below=500): added mercury (best diffusion speed for tiny edits)
- complex (>100K input): added deepseek + minimax (both 131K+ ctx native)
- medium (>7K input): added zai (GLM-4.7-Flash native 131K ctx)

Account requirements (documented at top of file)
Test plan
- grob preset info ultra-cheap: parses, 10 providers, 5 models with rich fallback chains

🤖 Generated with Claude Code