A growing collection of scripts, configs, and notes for running Claude Code (and other coding agents) against local LLMs on Apple Silicon and Linux.
Companion to the gist: *Claude Code with local LLMs and ANTHROPIC_BASE_URL: Ollama, LM Studio, llama.cpp, vLLM*.
```
scripts/
└── start-claude-code-local.sh — drop-in starter for Ollama + Claude Code with sane defaults (32K context, Gemma 4 26B-A4B)
```
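For orientation, here is a minimal sketch of the pattern the starter encodes. It is not a copy of the script: the endpoint URL, fallback defaults, and launch step below are assumptions; the real logic lives in `scripts/start-claude-code-local.sh`.

```sh
#!/usr/bin/env bash
# Hypothetical sketch only -- see scripts/start-claude-code-local.sh for the real thing.
# Assumes Ollama is serving on its default port and that Claude Code honors
# ANTHROPIC_BASE_URL as described in the companion gist.
export OLLAMA_CONTEXT_LENGTH="${OLLAMA_CONTEXT_LENGTH:-32768}"              # 32K default
export ANTHROPIC_BASE_URL="${ANTHROPIC_BASE_URL:-http://localhost:11434}"   # local backend
export ANTHROPIC_DEFAULT_OPUS_MODEL="${ANTHROPIC_DEFAULT_OPUS_MODEL:-gemma4:26b-a4b}"
exec claude "$@"   # launch Claude Code against the local server
```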
More to come: LM Studio launchers, llama.cpp build presets, LiteLLM router configs, and VS Code workspace settings.
Quick start:

```sh
git clone https://github.com/renezander030/local-ai-coding-stack
cd local-ai-coding-stack/scripts
chmod +x start-claude-code-local.sh
./start-claude-code-local.sh
```

Override defaults via env:

```sh
OLLAMA_CONTEXT_LENGTH=65536 \
ANTHROPIC_DEFAULT_OPUS_MODEL=qwen3-coder \
./start-claude-code-local.sh
```

Requirements:

- Backend: Ollama (≥ v0.14.0) or LM Studio (≥ 0.4.1)
- Model: `gemma4:26b-a4b` (Gemma 4 26B-A4B-it, Q4)
- Context: 32K on a MacBook Air, 64K on a MacBook Pro M5 Pro/Max
- Machine: 32 GB+ RAM strongly preferred
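Before launching, it's worth a ten-second sanity check that the backend meets these requirements. The commands below are illustrative, not part of the repo, and assume the Ollama backend on its default port:

```sh
ollama --version                          # should report >= 0.14.0 per the list above
ollama pull gemma4:26b-a4b                # fetch the default model tag if it's missing
curl -s http://localhost:11434/api/tags   # confirm the server answers and lists the model
```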
The gist has the full hardware × model × context × backend matrix, plus the debugging flow to follow when something breaks.
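As a first line of debugging before working through the gist's flow, these checks usually narrow things down (again assuming Ollama; the log path is the usual macOS location):

```sh
ollama ps                                  # is the model actually loaded, and on GPU?
curl -s http://localhost:11434/v1/models   # does the OpenAI-compatible endpoint respond?
tail -f ~/.ollama/logs/server.log          # watch for out-of-memory or context-length errors
```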
PRs welcome for additional scripts, alternate backends, or hardware presets. Comment on the gist if you've verified a config on a different Mac/RAM/model combo.
MIT — see LICENSE.