v2.4.0
New models (Claude Opus 4.8, Gemini 3.5 Flash) plus a better local-model experience: browse a curated catalog of coding models, remove models, and see on-disk sizes, all from `/model`.
Added
- Claude Opus 4.8: added to the Anthropic provider and set as the new default model. Pricing: $5 / $25 per 1M tokens (input / output), 1M context.
- Gemini 3.5 Flash (`gemini-3.5-flash`): replaces the preview Flash in the Google provider list. Pricing: $1.50 / $9.00 per 1M tokens, 1M context.
- `/model browse` (Ollama): a curated catalog of recommended local coding models (Qwen2.5 Coder, DeepSeek Coder V2, Llama 3.1, DeepSeek R1, …) with parameter sizes, rough VRAM, and an agent-mode suitability hint. Pick one to pull it. Mirrors the MCP/skills catalog pattern.
- `/model rm <name>` (Ollama): remove a locally-installed model to reclaim disk, without leaving Codeep. Remote-server guard like `/model pull`.
- On-disk size in `/model` picker: the Ollama model list now shows each model's size on disk alongside the agent-mode hint.
- Native Ollama API (beta, opt-in): set Ollama Native API (beta) → On in `/settings` to route Ollama through its native `/api/chat` endpoint instead of the OpenAI-compatible `/v1` shim. Honors `num_ctx` (the model uses its full context window instead of Ollama's small default) and `keep_alive` (keeps the model resident, avoiding reload latency every turn). Tunable via `ollamaKeepAlive` (default `30m`) and `ollamaNumCtx` (`0` = auto-detect via `/api/show`). Off by default: existing transport unchanged unless you opt in. Verified against Ollama 0.24 (chat, streaming, usage, native tool calls); marked beta while it gets coverage across more models and longer sessions. Please report issues at https://github.com/VladoIvankovic/Codeep/issues; feedback decides when it becomes the default.
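
To make the native-API entry above more concrete, here is a rough sketch of what a request body for Ollama's `/api/chat` endpoint might look like once `keep_alive` and `num_ctx` are applied. This is not Codeep's actual implementation: the function names `pick_num_ctx` and `build_chat_request` are hypothetical, and the auto-detect step assumes that `/api/show` reports the context window under a `model_info` key ending in `.context_length`.

```python
from typing import Optional


def pick_num_ctx(configured: int, show_response: dict) -> Optional[int]:
    """Choose the context size to request (hypothetical helper).

    configured mirrors the ollamaNumCtx setting: a positive value wins,
    0 means "auto-detect" from an /api/show response.
    """
    if configured > 0:
        return configured
    # Assumption: /api/show exposes architecture-prefixed keys such as
    # "llama.context_length" inside "model_info".
    model_info = show_response.get("model_info", {})
    for key, value in model_info.items():
        if key.endswith(".context_length") and isinstance(value, int):
            return value
    return None  # nothing detected: leave Ollama's default in place


def build_chat_request(model: str, messages: list, keep_alive: str = "30m",
                       num_ctx: Optional[int] = None, stream: bool = True) -> dict:
    """Build a JSON-serializable body for POST /api/chat (hypothetical helper)."""
    body = {
        "model": model,
        "messages": messages,
        "stream": stream,
        "keep_alive": keep_alive,  # how long the model stays loaded after the call
    }
    if num_ctx:
        # Model parameters such as num_ctx travel under "options".
        body["options"] = {"num_ctx": num_ctx}
    return body
```

With `ollamaNumCtx` left at `0`, a caller would first fetch `/api/show` for the model, pass the parsed response to `pick_num_ctx`, and then send the resulting body to `/api/chat`. The OpenAI-compatible `/v1` shim has no equivalent fields, which is presumably why the native endpoint is needed for these two settings.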
Notes
- Pricing tables (`/cost`, dashboard) updated so the new models bill at the right rates. Previous models (Opus 4.7 / 4.6, Flash preview) stay listed for back-compat.
- VS Code/Zed inherit the new catalog automatically over ACP; the native macOS / iOS apps get the same update via the shared CodeepCore catalog.
- `/model browse` and `/model rm` shell out to the local `ollama` binary, so they only run when Ollama is local (remote servers get an SSH hint instead).
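
As a worked example of the per-1M-token rates listed under Added, the sketch below estimates the cost of a single request. The rates are copied from these notes; the `RATES` table, the `estimate_cost` function, and the `claude-opus-4.8` key are illustrative names only and are not taken from Codeep's pricing code.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the release notes.
RATES = {
    "claude-opus-4.8": (Decimal("5"), Decimal("25")),        # key is hypothetical
    "gemini-3.5-flash": (Decimal("1.50"), Decimal("9.00")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    input_rate, output_rate = RATES[model]
    total = input_rate * input_tokens + output_rate * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 200k input tokens and 10k output tokens on each new model.
    for name in RATES:
        print(name, estimate_cost(name, 200_000, 10_000))
```

For 200,000 input tokens and 10,000 output tokens this works out to $1.25 on Claude Opus 4.8 and $0.39 on Gemini 3.5 Flash. `Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.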