Pipe AI Models to your Terminal. Give Your Agents Hands and Eyes.
OpenCLI is the native Swift/MLX capability engine for the command line. Convert local models into modular Agent Skills. High performance, zero Python, 100% private. Optimized for OpenClaw and MCP.
An agent without sensors is just a chatbox. OpenCLI provides the physical layer for local AI. Built natively with Swift for Apple Silicon, it delivers the cold-start speed and modality support that server-side LLM runners lack.
- Native OpenClaw & MCP Support
- Unified Memory Hardware Sensing
- Zero Python dependencies at runtime
brew tap openclirun/opencli
brew install opencli
Or build from source using Swift Package Manager.
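For a source build, the standard Swift Package Manager workflow applies. A minimal sketch, assuming the repository lives under the same GitHub organization as the Homebrew tap (`openclirun`):

```shell
# Clone and build in release mode with Swift Package Manager
# (repository URL inferred from the tap name; adjust if it differs)
git clone https://github.com/openclirun/opencli.git
cd opencli
swift build -c release

# SPM places the release binary under .build/release/
.build/release/opencli fit
```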
OpenCLI features a built-in fit command to instantly evaluate your hardware (RAM/Unified Memory) and score models based on fit, speed, and context limits.
$ opencli fit
Device: Apple M2 | total 16.0 GB | available 4.6 GB | model budget 3.9 GB
GPU: Apple M2 | backend: metal | unified_memory: true
Recommendations by task:
- [asr] Qwen3-ASR 1.7B 4bit | 🟡 Good | score 86.5 | GPU
- [chat] Qwen3 Chat 1.7B 4bit | 🟠 Marginal | score 82.5 | GPU
- [embedding] Qwen3 Embedding 0.6B 4bit DWQ | 🟢 Perfect | score 74.3 | GPU
- [i2i] Qwen Image Edit 2511 | 🔴 TooTight | score 58.9 | CPU+GPU
- [i2t] Qwen3 VL 4B Instruct 3bit | 🟠 Marginal | score 83.9 | GPU
- [i2v] LTX-2 Distilled (I2V) | 🔴 TooTight | score 59.7 | CPU+GPU
- [ocr] DeepSeek OCR | 🟠 Marginal | score 76.0 | GPU
- [rerank] Qwen3 Reranker 0.6B 4bit | 🟢 Perfect | score 71.9 | GPU
- [sr] SeedVR2 3B | 🟠 Marginal | score 77.9 | GPU
- [sts] LFM2.5 Audio 1.5B 6bit | 🟡 Good | score 84.7 | GPU
- [t2i] Qwen Image 2512 | 🔴 TooTight | score 58.7 | CPU+GPU
- [t2m] ACE-Step 1.5 | 🔴 TooTight | score 57.0 | CPU+GPU
- [t2v] LTX-2 Distilled (T2V) | 🔴 TooTight | score 60.3 | CPU+GPU
- [tts] Orpheus 3B 0.1 FT bf16 | 🟠 Marginal | score 85.3 | GPU
- [vad] Sortformer 4SPK v2.1 fp16 | 🟢 Perfect | score 68.3 | GPU
OpenCLI focuses on running right-sized, hardware-optimized models that fit perfectly in your Mac's unified memory, bringing true multimodal capabilities directly to your terminal.
See everything locally. From structured documents to real-time screen analysis for autonomous agents.
- Qwen3-VL 4B (Instruct 3bit): A fast, highly capable small multimodal vision model.
- DeepSeek OCR / GLM-OCR: Lightning-fast, accurate local text extraction.
- Qwen3 Embedding & Reranker (0.6B 4bit): Ultra-efficient models, a perfect fit for local semantic search.
- SeedVR2 3B: Spatial understanding and super-resolution models.
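Vision commands compose with the rest of the toolkit over standard pipes. A hypothetical sketch, assuming subcommands mirror the task tags shown by `opencli fit` and accept a file path (the file argument and prompt handling are assumptions, not documented flags):

```shell
# Sketch: extract text from a screenshot locally, then summarize it
# ("ocr" and "chat" subcommand names come from the fit task tags;
# the image argument and inline prompt are illustrative assumptions)
opencli ocr screenshot.png | opencli chat "Summarize this document"
```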
Hear and speak natively. Ultra-low latency voice perception and multi-speaker cloned synthesis.
- Qwen3-ASR (1.7B 4bit) / Parakeet: Native speech-to-text with exceptional speed.
- Orpheus (3B bf16) / Qwen3-TTS / Pocket TTS: Lightweight, low-latency text-to-speech perfect for instant agent responses.
- LFM2.5 Audio (1.5B 6bit): Direct Speech-to-Speech (STS) handling.
- Sortformer (4SPK v2.1 fp16): Perfect-fit Voice Activity Detection (VAD) and speaker diarization.
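The perception models can be chained before transcription. A hypothetical sketch, where the `vad` subcommand name is taken from the fit task tags and the file argument and pipe behavior are assumptions:

```shell
# Sketch: detect speech segments first, then transcribe only speech
# ("vad" and "asr" subcommand names come from the fit task tags;
# the audio-file argument is an illustrative assumption)
opencli vad meeting.wav | opencli asr
```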
Create across dimensions. High-performance local generation for visual assets and 3D meshes.
- Flux.2 (Klein 4B): Pure Swift implementation of Flux.2 image generation. On-the-fly quantization (qint8/int4) ensures it runs efficiently on standard M-series Macs.
- Qwen Image 2512 & Image Edit: Advanced Image-to-Image (I2I) and Text-to-Image (T2I) generation.
- LTX-2 Distilled: Video generation bridging Text-to-Video (T2V) and Image-to-Video (I2V).
- ACE-Step 1.5: Advanced Text-to-Music/Audio generation.
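Generation tasks can feed one another the same way. A hypothetical sketch, assuming the `t2i` and `sr` subcommand names from the fit task tags and stdout/file-based output (both assumptions):

```shell
# Sketch: generate an image locally, then upscale it with SeedVR2
# ("t2i" and "sr" subcommand names come from the fit task tags;
# prompt arguments and stdout redirection are illustrative assumptions)
opencli t2i "a lighthouse at dusk" > lighthouse.png
opencli sr lighthouse.png > lighthouse_4k.png
```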
Think and build locally. Private reasoning, instruction following, and coding capabilities optimized for MLX.
- Qwen3-Instruct (1.7B 4bit): Highly capable reasoning and coding models optimized for Apple Silicon.
- Llama-Series: Built-in support for standard instruct and chat architectures.
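Because everything stays on-device, the chat model can safely read private files from stdin. A sketch, assuming `opencli chat` reads piped input as in this README's voice pipeline and that an inline prompt argument exists (an assumption):

```shell
# Sketch: pipe a local source file into the chat model for a
# fully private code review (the inline prompt argument is an
# illustrative assumption; stdin piping matches the README's examples)
cat main.swift | opencli chat "Review this Swift code for bugs"
```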
Combine OpenCLI commands to build instant multimodal workflows using standard Unix pipes:
# A complete Voice-to-Voice pipeline in one line
opencli asr | opencli chat | opencli tts
- Website: opencli.run
- Documentation: See the docs/ folder for specific model usage (e.g., asr-qwen3.md, t2i-flux2.md).
This project is licensed under the MIT License.