Run chat, code, and local model workflows on-device — without shipping your work to the cloud.
MLX-native. Ollama/OpenAI API compatible. Zero API keys.
playovo_small.mp4
Document parsing & RAG — PPTX, HWP, HWPX parsing via kordoc. OCR for scanned PDFs. Knowledge Base auto-injects relevant docs into chat context.
LoRA fine-tuning — Dataset creation, training, adapter merge. Full mlx-lm pipeline with configurable rank/layers/epochs.
Model blending — Merge two same-architecture models via SLERP, Linear, TIES, or DARE. Progress tracking GUI included.
Ping Pong upgrades — File/code/URL attachments, editor-style code blocks (dark bg + line numbers), auto-split long responses into multiple bubbles, typing indicator, thinking content stripped from output.
Other changes — Web search only fires on explicit request. Settings page grouped into card sections. File extraction limit raised to 200 KB. Mac chip name shown in hardware cards.
Monaco editor + file explorer + Git panel + PTY terminal + AI inline completion. The Agent Chat on the right gets file read/write/search/exec tools and MCP server integration — it can actually do the work, not just describe it. Like Claude Code, but running on your Mac with any open LLM.
Native Ollama/OpenAI API compatibility, streaming responses, persona switching, file attachments (PDF / Excel / Word / images), voice input + TTS.
Local text-to-image via diffusers. Sampler / steps / CFG / LoRA controls.
Curated notes + auto-captured session logs with BM25 + semantic search. Context that survives restarts.
Auto-detects ~/.cache/huggingface/hub/ + LM Studio cache — models you already have just show up. Download from URL or search HuggingFace directly.
OVO is model-agnostic. Recommended by origin:
- US — Llama (Meta), Gemma (Google), Phi (Microsoft)
- EU — Mistral / Mixtral (Mistral AI, France)
- Asia — Qwen (Alibaba), DeepSeek, GLM
All models run fully local — no data leaves your machine.
Scores every model against your RAM / GPU / context headroom. Know before you download.
An SVG owl that reacts to your coding state. Double-click to summon the main window.
- Download the latest
OVO_x.y.z_aarch64.dmgfrom Releases. - Open the DMG and drag OVO.app onto the Applications shortcut.
- Back in the DMG window, double-click
Install OVO.command. It shows you exactly the one command it will run, you click Run, done.
That's it — no Terminal required.
Why the third step? (click to expand)
OVO's build is not yet signed with an Apple Developer ID (the $99/yr
membership is on the roadmap — see the milestone in Issues).
Without a signature, macOS flags the app with com.apple.quarantine and
refuses to launch it with the classic "OVO is damaged and can't be opened"
dialog.
Install OVO.command runs a single command to clear that flag:
xattr -rd com.apple.quarantine /Applications/OVO.appNo sudo, no network, no background processes. The script is short and
auditable — read it here before running:
scripts/dmg-templates/Install OVO.command
Prefer to do it by hand?
xattr -rd com.apple.quarantine /Applications/OVO.app
open /Applications/OVO.appIf your /Applications/OVO.app happens to be owned by root (rare on
recent macOS), prefix with sudo.
First launch bootstraps a Python runtime into ~/Library/Application Support/com.ovoment.ovo/runtime/ (≈1.5 GB, ~3 min, one-time). Subsequent launches are instant.
- macOS 13+ on Apple Silicon (M1 / M2 / M3 / M4). Intel Macs are not supported.
- 16 GB RAM minimum (7B Q4 models only); 32 GB+ recommended for 14B and above.
- 10 GB free disk for runtime + a couple of models.
16 GB users: Only 7B quantized (Q4) models run comfortably. Disable extra features in Settings → Feature Flags (Wiki, Skills, MCP) to keep the system prompt lean and maximize response speed. The Hardware Fit tab shows which models actually fit your machine.
- Launch OVO.
- Go to Models, pick a model (Qwen3, Llama 3.3, Gemma, Mistral, DeepSeek, …), click download.
- Open Chat and send a message — the local model answers, no network calls.
- Open a project folder in Code to use the IDE + Agent Chat.
| Flavor | Port | Use case |
|---|---|---|
| Ollama | 11435 |
Drop-in replacement for Ollama clients (Open WebUI, Page Assist, …) |
| OpenAI | 11436 |
Point any OpenAI SDK at http://localhost:11436/v1 |
| Native | 11437 |
OVO-specific endpoints — model management, Wiki, streaming, voice |
OVO can read your local Claude Code config so the same context reaches your local model:
CLAUDE.md— injected as system context.claude/settings.json— preferences honoured.claude/plugins/**— behaviour hints
Disabled by default. Flip it on in Settings → Claude Integration. OVO never touches claude.ai, API keys, session tokens, or anything that could affect your Claude account.
git clone https://github.com/ovoment/ovo-local-llm.git
cd ovo-local-llm
# frontend + Rust deps
npm install
# Python sidecar venv (dev uses $HOME cache, avoids SMB locks)
cd sidecar && uv sync && cd ..
# run the full stack in dev mode
npm run tauri devRelease build: npm run tauri build — produces .app + .dmg under your Cargo target dir.
Deeper docs: docs/ARCHITECTURE.md · docs/ARCHITECTURE.en.md · docs/release/BUILD.md · docs/release/SECURITY.md · docs/release/PRIVACY.md
- Shell — Tauri 2 (Rust)
- Frontend — React 18 + TypeScript + Tailwind + shadcn/ui + Monaco
- Sidecar — Python 3.12 FastAPI, spawned by Rust, user-cached venv bootstrapped with a bundled
uv - Runtimes —
mlx-lm,mlx-vlm,mlx-whisper,transformers,diffusers - Storage — SQLite (chats + Wiki), local filesystem (attachments, models)
OVO is a solo-developer project. Every coffee funds one more model architecture I can patch and support.
MIT — use it, fork it, ship it.
Made with 🦉 by ben @ ovoment









