pron. "CLAW-dlee" — Claude, locally.
Launch Claude Code against a
local LLM (LM Studio, Ollama, llama.cpp, or any Anthropic-compatible
endpoint), without affecting the regular claude command — that one keeps
talking to the official Anthropic API.
Disclaimer. This is an unofficial, community-maintained helper. It is not affiliated with, endorsed by, or sponsored by Anthropic. Claude and Claude Code are trademarks of Anthropic, used here only descriptively to identify the upstream tool this CLI wraps.
`claudely` does not modify the `claude` binary; it only sets documented environment variables and spawns `claude` unchanged.
Plenty of CLI coding agents will happily talk to a local LLM. The catch is the ecosystem: skills, slash commands, MCP servers, plugins, hooks — the interesting tooling has been built specifically for Claude Code, and the parity story on every other agent is patchy at best. Trying to reuse a Claude-shaped workflow on a different agent quickly turns into "rewrite all the plugins" or "do without."
claudely skips that fight. Keep Claude Code as the client (and its entire
plugin / skill / MCP ecosystem with it), and just point it at a model
running on your own hardware. The official claude command still talks to
Anthropic; claudely is a separate entrypoint that sets the right
environment variables for the local path and spawns claude unchanged.
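Under the hood that amounts to little more than a one-off environment override. As a rough sketch (assuming the LM Studio default port from the provider table below; the token value is a placeholder, not claudely's actual default), the manual equivalent would look something like:

```sh
# Illustrative only: the same Anthropic override variables claudely sets, done by
# hand for a single run. Your interactive shell keeps its normal Anthropic setup.
ANTHROPIC_BASE_URL=http://localhost:1234 \
ANTHROPIC_AUTH_TOKEN=local-placeholder \
ANTHROPIC_API_KEY="" \
claude
```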
Published on npm as claudely.
# global (recommended)
npm i -g claudely
# or one-shot, no install
npx claudely

Requires Node.js ≥ 20 and the `claude` CLI on your PATH. (npm 7+ will install Claude Code automatically as a peer dependency; users who got `claude` via the native installer or Homebrew are unaffected.)
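A quick sanity check before the first launch (this assumes `claude --version` prints the installed Claude Code version, which current releases do):

```sh
node --version    # should report v20 or newer
claude --version  # confirms the claude CLI resolves on your PATH
```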
# LM Studio (default), interactive picker over your downloaded models
claudely
# Ollama
claudely -p ollama
# llama.cpp (whichever GGUF llama-server is currently serving)
claudely -p llamacpp
# Skip the picker by naming a model
claudely -p ollama -m gpt-oss:20b
claudely -p lmstudio -m openai/gpt-oss-20b
# Just print what's available, don't launch claude
claudely -p ollama --list
# Custom Anthropic-compatible endpoint (e.g. a litellm proxy)
claudely -p custom -u http://localhost:4000 -t sk-anything -m my-model
# Any flag claudely doesn't recognize is forwarded verbatim to claude
claudely -p ollama -m gpt-oss:20b --print "explain this repo"
# `--` is an escape hatch to force a token through, e.g. if claude grows
# a flag whose name collides with one of claudely's own
claudely -p ollama -- --provider force-this-to-claude

| Provider | Default base URL | Native? | Docs |
|---|---|---|---|
| `lmstudio` (default) | http://localhost:1234 | yes | https://lmstudio.ai/blog/claudecode |
| `ollama` | http://localhost:11434 | yes | https://docs.ollama.com/integrations/claude-code |
| `llamacpp` | http://localhost:8080 | yes | https://unsloth.ai/docs/basics/claude-code |
| `custom` | (you supply it) | depends | point at any Anthropic-compatible endpoint or proxy |
For backends that only speak the OpenAI protocol (vLLM, text-generation-webui,
TabbyAPI, …), front them with a translation proxy such as
litellm or
claude-code-router and
point claudely at the proxy via -p custom.
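A minimal sketch of that setup, assuming vLLM is already serving an OpenAI-compatible API on port 8000 (the model name, ports, and key below are placeholders; check litellm's own docs for the exact proxy invocation):

```sh
# litellm translates Claude Code's Anthropic-style requests into the OpenAI
# protocol the local vLLM server speaks.
pip install 'litellm[proxy]'
litellm --model openai/my-model --api_base http://localhost:8000/v1 --port 4000
# (some backends also want a dummy OPENAI_API_KEY exported before starting the proxy)

# Then point claudely at the proxy as a custom provider.
claudely -p custom -u http://localhost:4000 -t sk-anything -m my-model
```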
- Node.js ≥ 20 and the `claude` CLI on your PATH
- A running local server for the provider you want:
  - LM Studio — `lms server start --port 1234` plus at least one downloaded model (`lms ls --llm`)
  - Ollama — `ollama serve` plus at least one pulled model (`ollama list`)
  - llama.cpp — `llama-server --port 8080 -m /path/to/model.gguf` (single model per server instance)
| Setting | Sources, first match wins |
|---|---|
| Provider | -p flag → $CLAUDELY_PROVIDER → lmstudio |
| Model | -m flag → $CLAUDELY_MODEL → $LMSTUDIO_MODEL / $OLLAMA_MODEL / $LLAMACPP_MODEL → interactive picker |
| Base URL | -u flag → $CLAUDELY_BASE_URL → provider default |
| Token | -t flag → $CLAUDELY_TOKEN → provider default |
| Port | $LMSTUDIO_PORT / $OLLAMA_PORT / $LLAMACPP_PORT (only affect provider defaults) |
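For example, to make Ollama the default without typing flags every time (variable names taken straight from the table above):

```sh
# Your own defaults, set once in the shell; flags still win per the precedence above.
export CLAUDELY_PROVIDER=ollama
export CLAUDELY_MODEL=gpt-oss:20b

claudely               # launches against Ollama with gpt-oss:20b
claudely -p lmstudio   # -p beats $CLAUDELY_PROVIDER for this run only
```

These `CLAUDELY_*` variables are inputs you export yourself; the `ANTHROPIC_*` variables listed next are what claudely injects into the spawned process.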
Every variable is set in the spawned process only — your shell (and the
regular claude command) is untouched.
ANTHROPIC_BASE_URL=<provider base URL>
# auth_token style (lmstudio, ollama, custom):
ANTHROPIC_AUTH_TOKEN=<provider token>
ANTHROPIC_API_KEY="" # blanks any inherited real Anthropic key
# api_key style (llamacpp, per unsloth's docs):
ANTHROPIC_API_KEY=<provider token>
# ANTHROPIC_AUTH_TOKEN unset
# KV-cache fix (only set if not already in your env):
CLAUDE_CODE_ATTRIBUTION_HEADER=0

Claude Code prepends an attribution string to the system prompt that contains
a per-request hash (x-anthropic-billing-header: cc_version=…; cch=…;). On
a local server every turn hashes differently, so the prompt cache misses
every single time — unsloth measured ~90% slowdown.
The fix is a single env var: CLAUDE_CODE_ATTRIBUTION_HEADER=0. claudely
sets it for you in the spawned process, so the regular claude command is
unaffected. Override per-invocation by exporting your own value first.
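For example, to stop claudely from touching the header for one run, export your own value first; claudely only sets the variable when it is absent:

```sh
# An explicit value passes through to the spawned claude untouched.
CLAUDE_CODE_ATTRIBUTION_HEADER=1 claudely -p ollama -m gpt-oss:20b
```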
References: official env-vars docs, claude-code#50085.
# Skip Claude Code's telemetry / feedback traffic. Useful when the model is
# local, but left to your judgment — claudely never disables the analytics
# Anthropic uses to improve Claude Code unless you opt in by exporting this yourself.
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1

- LM Studio and Ollama JIT-load models on first request. llama.cpp serves whichever GGUF was passed at startup; switch models by restarting `llama-server` with a different `-m` path.
- Claude Code's in-session `/model` command does not auto-discover backend models; it accepts an arbitrary id string. To switch mid-session, type `/model <id>` with one of the ids shown by `claudely --list`.
- Effort levels: `effortLevel: "xhigh"` (Anthropic Opus 4.7 only) gets rejected by local Anthropic-compatible servers (LM Studio, Ollama, etc.) with HTTP 400. When claudely detects `xhigh` in `~/.claude/settings.json` and the target is not `api.anthropic.com`, it prints a one-line stderr warning and injects `--effort high` into the spawned `claude` argv for that session — your settings file is left untouched. To make it permanent, run `/effort high` inside Claude Code or edit `settings.json`. An explicit `--effort` you pass after `--` always wins over the override (example below).