An MCP (Model Context Protocol) server that connects Claude Code to GLM/Z.AI models. It is designed as a sub-agent that Claude delegates to, which reduces Claude token usage while Claude stays in control as the orchestrator.
Claude is the conductor. GLM is a specialist musician. This MCP lets Claude:
- Delegate entire coding tasks to GLM (fire-and-forget) instead of doing everything itself
- Get second opinions from a different model cheaply
- Offload mechanical work (tests, boilerplate, refactoring) to save Claude tokens
- Review code through a separate model's lens
- Delegate — Fire-and-forget autonomous task execution. Claude sends a task, GLM does all the work, returns a summary. Biggest token saver.
- Review — Structured code review with categorized issues (bugs, security, performance, style)
- Generate — Code generation from specs, matching existing project style
- Chat — Quick questions or second opinions from another LLM
- Vision — Analyze images using GLM-4V-Plus multimodal capabilities
- Embeddings — Generate text embeddings for semantic search and similarity
- Agent (step-by-step) — Fine-grained control when Claude needs to monitor between steps
- Node.js (v18 or higher)
- npm
- Z.AI API key (get it from z.ai)
- Z.AI Coding Plan (Pro or Max required for GLM-5)
claude mcp add mcp-glm -e GLM_API_KEY=your-api-key-here -- npx -y github:mathi5/mcp-glm

For global installation (across all projects):
claude mcp add mcp-glm -s user -e GLM_API_KEY=your-api-key-here -- npx -y github:mathi5/mcp-glm

| Variable | Required | Default | Description |
|---|---|---|---|
| GLM_API_KEY | Yes | - | Your Z.AI API key |
| GLM_MODEL | No | glm-5 | Default model for chat/delegate operations |
| GLM_API_BASE | No | https://api.z.ai/api/coding/paas/v4 | API base URL |
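
The defaults in the table above can be read as a simple fallback rule. The following is a minimal sketch of that rule, not the server's actual implementation: the `GlmConfig` type and the `resolveGlmConfig` helper are hypothetical names introduced only for illustration, and only the variable names and default values come from the table.

```typescript
// Hypothetical shape for the resolved settings (not part of mcp-glm's API).
interface GlmConfig {
  apiKey: string;
  model: string;
  apiBase: string;
}

// Hypothetical helper: applies the documented defaults to an env-like map.
function resolveGlmConfig(env: Record<string, string | undefined>): GlmConfig {
  const apiKey = env.GLM_API_KEY;
  if (!apiKey) {
    // GLM_API_KEY is the only variable marked as required.
    throw new Error("GLM_API_KEY is required");
  }
  return {
    apiKey,
    // Optional variables fall back to the defaults listed in the table.
    model: env.GLM_MODEL || "glm-5",
    apiBase: env.GLM_API_BASE || "https://api.z.ai/api/coding/paas/v4",
  };
}

const exampleConfig = resolveGlmConfig({ GLM_API_KEY: "your-api-key-here" });
console.log(exampleConfig.model);
```

In a Node.js process the same helper could be called with `process.env` instead of a literal object.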
| Tool | Description |
|---|---|
| glm_delegate | Fire-and-forget: Delegate an entire task to GLM. It reads files, edits code, runs commands autonomously and returns a summary. Best for self-contained tasks. |
| glm_review | Code review: Send code for structured review. Returns JSON with categorized issues, severity, and suggestions. |
| glm_generate | Code generation: Generate code from a spec. Pass existing code as context for style matching. |
| glm_chat | Quick questions: Ask GLM anything. Second opinions, API lookups, brainstorming. |
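
Because glm_review is described as returning JSON with categorized issues, severity, and suggestions, a caller might want to group the issues before acting on them. The sketch below assumes a result shape: the field names (`category`, `severity`, `message`, `suggestion`), the severity values, and the `groupByCategory` helper are all hypothetical, and the real response may be structured differently. Only the four category names come from the feature list.

```typescript
// Categories named in the feature list; the field layout below is an assumption.
type Category = "bugs" | "security" | "performance" | "style";

// Hypothetical issue shape, for illustration only.
interface ReviewIssue {
  category: Category;
  severity: string;
  message: string;
  suggestion: string;
}

// Groups issues by category, preserving their original order within each group.
function groupByCategory(issues: ReviewIssue[]): Map<Category, ReviewIssue[]> {
  const groups = new Map<Category, ReviewIssue[]>();
  for (const issue of issues) {
    const bucket = groups.get(issue.category) ?? [];
    bucket.push(issue);
    groups.set(issue.category, bucket);
  }
  return groups;
}

// Made-up sample payload standing in for a glm_review response.
const sampleIssues: ReviewIssue[] = JSON.parse(`[
  {"category": "security", "severity": "high", "message": "Token logged in plain text", "suggestion": "Redact the token before logging"},
  {"category": "style", "severity": "low", "message": "Inconsistent naming", "suggestion": "Use camelCase"},
  {"category": "security", "severity": "low", "message": "Missing rate limit", "suggestion": "Add a limiter"}
]`);

const grouped = groupByCategory(sampleIssues);
console.log(grouped.get("security")?.length);
```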
| Tool | Description |
|---|---|
| glm_vision | Analyze images with GLM-4V-Plus multimodal model. |
| glm_embeddings | Generate text embeddings for semantic search/similarity. |
| Tool | Description |
|---|---|
| glm_agent_start | Start a step-by-step agent session. Use when you need to inspect/intervene between steps. |
| glm_agent_step | Execute the next step of an agent session. |
| glm_agent_stop | Stop a session and get the full summary. |
Use glm_delegate with task: "Add unit tests for the calculator module in src/calc.ts"
and context: <paste the file content>
Claude calls once, GLM does everything, Claude reviews the result.

Use glm_review with the code from src/auth.ts, focus on ["security", "bugs"]

Use glm_generate to create a TypeScript interface for the User API response,
with context from the existing types in src/types.ts

Use glm_chat to ask: "What's the difference between Promise.all and Promise.allSettled?"

Use glm_agent_start with task: "Refactor the database module"
Then call glm_agent_step and inspect each result
Call glm_agent_stop when done
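
The start, step, stop sequence above amounts to a loop in which the caller inspects each result and decides whether to continue. The sketch below models that control flow with an in-memory stand-in, under stated assumptions: `AgentSession`, `StepResult`, `startFakeSession`, and `runWithInspection` are hypothetical names, the `done` flag is assumed, and the calls are synchronous for brevity, whereas real MCP tool calls would be asynchronous. It does not call the actual glm_agent_* tools.

```typescript
// Hypothetical result of one step; the real tool output may differ.
interface StepResult {
  output: string;
  done: boolean;
}

// Hypothetical session handle mirroring glm_agent_step and glm_agent_stop.
interface AgentSession {
  step(): StepResult;
  stop(): string;
}

// In-memory stand-in for glm_agent_start: walks through a fixed list of steps.
function startFakeSession(task: string, plannedSteps: string[]): AgentSession {
  const completed: string[] = [];
  return {
    step(): StepResult {
      const next = plannedSteps[completed.length];
      if (next === undefined) {
        return { output: "nothing left to do", done: true };
      }
      completed.push(next);
      return { output: next, done: completed.length === plannedSteps.length };
    },
    stop(): string {
      return `${task}: ${completed.length} step(s) completed`;
    },
  };
}

// Steps until the session reports done, the caller rejects a result,
// or the step budget runs out; always stops the session to get the summary.
function runWithInspection(
  session: AgentSession,
  approve: (result: StepResult) => boolean,
  maxSteps = 20,
): string {
  for (let i = 0; i < maxSteps; i++) {
    const result = session.step();
    if (result.done || !approve(result)) break;
  }
  return session.stop();
}

// The caller intervenes as soon as a step mentions editing.
const summary = runWithInspection(
  startFakeSession("Refactor the database module", ["read files", "edit code", "run tests"]),
  (result) => !result.output.includes("edit"),
);
console.log(summary);
```

The `maxSteps` budget is a design choice of this sketch: it keeps a session that never reports completion from looping indefinitely.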
MIT