Localpi is a local Pi launcher for open-weight models.
The intended default runtime is a managed llama-server process. Localpi should be able to start or reuse that server, point Pi at it, install a small set of default tools, require approval before tool calls, and show local generation speed while the session runs.
LM Studio remains supported as an alternate runtime for people who already have an OpenAI-compatible LM Studio server running.
Localpi is intentionally generic. It should not contain classifier prompts, dataset workflows, GitHub routing logic, or final-schema output machinery. Structured classifier runs belong in caller tools such as localpager-agent.
Implementation status: this repository has been renamed, but the TypeScript CLI has not caught up yet. The rename/runtime implementation plan must still be applied before every localpi command below exists. This README documents the intended localpi surface.
npm install
npm run build
During development:
npm run localpi -- --status
After build:
node dist/src/cli/main.js --status
Target default:
localpi --model gemma-12b
This should use llama-server by default.
LM Studio should be explicit:
localpi --runtime lmstudio --model gemma-4-e4b-it
Custom OpenAI-compatible endpoints should also remain possible:
localpi --runtime openai-compatible --base-url http://127.0.0.1:8000/v1 --model my-model
Localpi should avoid loading multiple heavyweight local runtimes at the same time. When using the managed llama-server runtime, it should either stop its previous managed server or clearly report what is already running before starting another model.
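The "stop the previous server or report it" rule can be sketched as a small pure decision function. This is only an illustration under assumptions: `ManagedServer`, `StartPlan`, `planServerStart`, and the `replaceExisting` switch are hypothetical names, not part of the current CLI.

```typescript
// Hypothetical record of the llama-server process that localpi manages.
interface ManagedServer {
  pid: number;
  model: string;
}

// Hypothetical outcomes of a start request.
type StartPlan =
  | { action: "start" }
  | { action: "reuse"; pid: number }
  | { action: "stop-then-start"; stopPid: number }
  | { action: "report"; message: string };

// Decide what to do before loading `requestedModel`.
// `replaceExisting` is an assumed switch between the two behaviors the text
// allows: stop the previous managed server, or report what is running.
function planServerStart(
  current: ManagedServer | null,
  requestedModel: string,
  replaceExisting: boolean,
): StartPlan {
  if (current === null) {
    return { action: "start" };
  }
  if (current.model === requestedModel) {
    // Same model is already loaded, so the running server can be reused.
    return { action: "reuse", pid: current.pid };
  }
  if (replaceExisting) {
    return { action: "stop-then-start", stopPid: current.pid };
  }
  return {
    action: "report",
    message:
      `llama-server (pid ${current.pid}) is already serving ${current.model}; ` +
      `stop it before loading ${requestedModel}`,
  };
}
```

Keeping the decision separate from process management means the rule can be checked without starting a real server.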
Localpi should launch Pi with:
- default tools: read,bash,edit,write,grep,find,ls
- a system prompt that explains local tool approval and local-model limits
- an approval gate before every tool call
- token speed and token count status while responses stream
- local state under ~/.local/state/localpi
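The token speed and token count status in the list above comes down to simple arithmetic over a running sample. A minimal sketch, assuming a hypothetical `SpeedSample` shape and an illustrative status format (neither is defined by localpi today):

```typescript
// Hypothetical running totals for one streamed response.
interface SpeedSample {
  tokens: number; // tokens generated so far
  elapsedMs: number; // wall-clock time since the first token, in milliseconds
}

// Tokens per second; returns 0 until some time has elapsed to avoid dividing by zero.
function tokensPerSecond(sample: SpeedSample): number {
  if (sample.elapsedMs <= 0) {
    return 0;
  }
  return (sample.tokens * 1000) / sample.elapsedMs;
}

// One-line status string; the exact layout is an assumption for illustration.
function formatStatus(sample: SpeedSample): string {
  return `${sample.tokens} tok | ${tokensPerSecond(sample).toFixed(1)} tok/s`;
}

// 120 tokens in 4 seconds is 30 tokens per second.
console.log(formatStatus({ tokens: 120, elapsedMs: 4000 })); // prints "120 tok | 30.0 tok/s"
```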
The approval gate should make failed or denied tool calls explicit to the model so the model does not claim that a blocked command ran.
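One way to make denied and failed calls explicit is to always hand the model a result that states what happened. The sketch below assumes hypothetical `ToolCall`, `ToolResult`, and `gateToolCall` names and illustrative message wording; Pi's actual tool interface may look different.

```typescript
type Decision = "approved" | "denied";

// Hypothetical minimal shapes for a tool call and the result sent back to the model.
interface ToolCall {
  tool: string;
  input: string;
}

interface ToolResult {
  ok: boolean;
  content: string;
}

// Run the tool only when approved. Denials and failures produce an explicit
// result so the model cannot assume the command ran.
function gateToolCall(
  call: ToolCall,
  decision: Decision,
  run: (call: ToolCall) => string,
): ToolResult {
  if (decision === "denied") {
    return {
      ok: false,
      content: `Tool call "${call.tool}" was denied by the user and did NOT run.`,
    };
  }
  try {
    return { ok: true, content: run(call) };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return {
      ok: false,
      content: `Tool call "${call.tool}" failed and produced no result: ${reason}`,
    };
  }
}
```

Returning a structured result in every branch, rather than silently skipping the call, is what keeps a blocked command from being reported as successful.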
LM Studio exposes an OpenAI-compatible endpoint, usually:
http://127.0.0.1:1234/v1
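OpenAI-compatible servers list their models at `<base-url>/models`, with the ids under a `data` array. A small sketch of building that URL and checking for a model id; `modelsUrl` and `hasModel` are hypothetical helpers, and the HTTP request itself is left out.

```typescript
// Shape of the model list returned by GET <base-url>/models (only the field used here).
interface ModelList {
  data: { id: string }[];
}

// Build the models URL, tolerating a trailing slash on the base URL.
function modelsUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "") + "/models";
}

// True when the server reports the given model id.
function hasModel(list: ModelList, modelId: string): boolean {
  return list.data.some((m) => m.id === modelId);
}

console.log(modelsUrl("http://127.0.0.1:1234/v1")); // prints "http://127.0.0.1:1234/v1/models"
```

A launcher could fetch that URL before starting Pi and stop early with a clear message when the requested model is not listed.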
Load Gemma in LM Studio:
~/.lmstudio/bin/lms server start
~/.lmstudio/bin/lms load gemma-4-e4b-it -y
Then run localpi against LM Studio explicitly:
localpi --runtime lmstudio --model gemma-4-e4b-it
Run Pi interactively on the default local model:
localpi
Run a non-interactive Pi prompt:
localpi -p "summarize this repo"
Pin a model alias:
localpi --model gemma-e4b -p "write a detailed implementation plan"
Point at a different OpenAI-compatible local server:
localpi --runtime openai-compatible --base-url http://127.0.0.1:8000/v1 -p "review the src directory"
Pass a Pi flag that localpi also owns after --:
localpi --model gemma-e4b -- --model some-pi-level-value
Stop the managed llama-server runtime:
localpi --stop
These are the target options for the renamed tool:
- --runtime <llama-server|lmstudio|openai-compatible>: runtime backend. Default: llama-server
- --model <alias|id|path|auto>: model alias, model id, or GGUF path
- --ctx <n> / --context-window <n>: model context window
- --max-tokens <n>: generated model max output tokens
- --base-url <url>: OpenAI-compatible endpoint for LM Studio or custom endpoints
- --server-command <path>: llama-server executable path
- --chat-template <path>: optional llama.cpp chat template file
- --state-dir <path>: runtime state directory. Default: ~/.local/state/localpi
- --session-dir <path>: Pi session directory. Default: <state-dir>/sessions
- --pi-command <command>: Pi launch command
- --tools <list>: Pi tools allow list. Default: read,bash,edit,write,grep,find,ls
- --no-approval: disable the tool approval gate
- --status: print runtime, model, and Pi config status
- --stop: stop the managed llama-server process
- --list: list configured model aliases
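Passing a Pi flag that localpi also owns relies on splitting the argument list at the first `--`. A minimal sketch of that split, with `SplitArgs` and `splitAtDoubleDash` as hypothetical names rather than functions from the existing CLI:

```typescript
interface SplitArgs {
  localpiArgs: string[]; // arguments localpi parses itself
  piArgs: string[]; // arguments forwarded to Pi untouched
}

// Everything before the first "--" belongs to localpi; everything after it goes to Pi.
function splitAtDoubleDash(argv: string[]): SplitArgs {
  const index = argv.indexOf("--");
  if (index === -1) {
    return { localpiArgs: argv.slice(), piArgs: [] };
  }
  return {
    localpiArgs: argv.slice(0, index),
    piArgs: argv.slice(index + 1),
  };
}

const split = splitAtDoubleDash([
  "--model", "gemma-e4b", "--", "--model", "some-pi-level-value",
]);
console.log(split.localpiArgs.join(" ")); // prints "--model gemma-e4b"
console.log(split.piArgs.join(" ")); // prints "--model some-pi-level-value"
```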
Environment variables:
- LOCALPI_RUNTIME
- LOCALPI_MODEL
- LOCALPI_BASE_URL
- LOCALPI_STATE_DIR
- LOCALPI_SESSION_DIR
- LOCALPI_PI_CMD
- LOCALPI_CONTEXT_WINDOW
- LOCALPI_MAX_TOKENS
- LOCALPI_LLAMA_SERVER
- LOCALPI_CHAT_TEMPLATE
- LOCALPI_TOOLS
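A sketch of how a few of these variables could combine with the command-line options. Two things here are assumptions, not documented behavior: the precedence order (flag, then environment variable, then default) and the pairing of each variable with the similarly named option. `Flags`, `Resolved`, and `resolveConfig` are hypothetical names.

```typescript
type Env = { [name: string]: string | undefined };

// Subset of parsed command-line flags used in this sketch.
interface Flags {
  runtime?: string;
  stateDir?: string;
  sessionDir?: string;
}

interface Resolved {
  runtime: string;
  stateDir: string;
  sessionDir: string;
}

// Assumed precedence: explicit flag, then LOCALPI_* variable, then the documented default.
function resolveConfig(flags: Flags, env: Env): Resolved {
  const runtime = flags.runtime ?? env["LOCALPI_RUNTIME"] ?? "llama-server";
  const stateDir =
    flags.stateDir ?? env["LOCALPI_STATE_DIR"] ?? "~/.local/state/localpi";
  // The session directory default is derived from the resolved state directory.
  const sessionDir =
    flags.sessionDir ?? env["LOCALPI_SESSION_DIR"] ?? `${stateDir}/sessions`;
  return { runtime, stateDir, sessionDir };
}

const resolved = resolveConfig({ runtime: "lmstudio" }, { LOCALPI_STATE_DIR: "/tmp/lp" });
console.log(resolved.runtime); // prints "lmstudio"
console.log(resolved.sessionDir); // prints "/tmp/lp/sessions"
```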
npm run format
npm run lint
npm run typecheck
npm test
npm run build
npm run check