OpenCLI


Pipe AI Models to your Terminal. Give Your Agents Hands and Eyes.

OpenCLI is the native Swift/MLX capability engine for the command line. Convert local models into modular Agent Skills. High performance, zero Python, 100% private. Optimized for OpenClaw and MCP.

An agent without sensors is just a chatbox. OpenCLI provides the physical layer for local AI. Built natively with Swift for Apple Silicon, it delivers the cold-start speed and modality support that server-side LLM runners lack.

  • Native OpenClaw & MCP Support
  • Unified Memory Hardware Sensing
  • Zero Python dependencies at runtime

Quick Install (macOS)

brew tap openclirun/opencli
brew install opencli

(Or build from source using Swift Package Manager)


Know Your Hardware, Run Right-Sized Models

OpenCLI features a built-in fit command to instantly evaluate your hardware (RAM/Unified Memory) and score models based on fit, speed, and context limits.

$ opencli fit

Device: Apple M2 | total 16.0 GB | available 4.6 GB | model budget 3.9 GB
GPU: Apple M2 | backend: metal | unified_memory: true

Recommendations by task:
- [asr] Qwen3-ASR 1.7B 4bit | 🟡 Good | score 86.5 | GPU
- [chat] Qwen3 Chat 1.7B 4bit | 🟠 Marginal | score 82.5 | GPU
- [embedding] Qwen3 Embedding 0.6B 4bit DWQ | 🟢 Perfect | score 74.3 | GPU
- [i2i] Qwen Image Edit 2511 | 🔴 TooTight | score 58.9 | CPU+GPU
- [i2t] Qwen3 VL 4B Instruct 3bit | 🟠 Marginal | score 83.9 | GPU
- [i2v] LTX-2 Distilled (I2V) | 🔴 TooTight | score 59.7 | CPU+GPU
- [ocr] DeepSeek OCR | 🟠 Marginal | score 76.0 | GPU
- [rerank] Qwen3 Reranker 0.6B 4bit | 🟢 Perfect | score 71.9 | GPU
- [sr] SeedVR2 3B | 🟠 Marginal | score 77.9 | GPU
- [sts] LFM2.5 Audio 1.5B 6bit | 🟡 Good | score 84.7 | GPU
- [t2i] Qwen Image 2512 | 🔴 TooTight | score 58.7 | CPU+GPU
- [t2m] ACE-Step 1.5 | 🔴 TooTight | score 57.0 | CPU+GPU
- [t2v] LTX-2 Distilled (T2V) | 🔴 TooTight | score 60.3 | CPU+GPU
- [tts] Orpheus 3B 0.1 FT bf16 | 🟠 Marginal | score 85.3 | GPU
- [vad] Sortformer 4SPK v2.1 fp16 | 🟢 Perfect | score 68.3 | GPU

Capabilities & Local Models

OpenCLI focuses on running right-sized, hardware-optimized models that fit within your Mac's unified memory budget, bringing true multimodal capabilities directly to your terminal.

👁️ Vision (OCR, VLM, Embeddings)

See everything locally. From structured documents to real-time screen analysis for autonomous agents.

  • Qwen3-VL 4B (Instruct 3bit): A fast, highly capable small vision-language model.
  • DeepSeek OCR / GLM-OCR: Lightning-fast, accurate local text extraction.
  • Qwen3 Embedding & Reranker (0.6B 4bit): Ultra-efficient perfect fit for local semantic search.
  • SeedVR2 3B: Spatial understanding and super-resolution.
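As a sketch of how local OCR might look from the shell: the `ocr` subcommand name below is an assumption mirroring the task label in `opencli fit` output, and the exact flags may differ (see docs/ for the real interface).

```shell
# Extract text from a screenshot locally, then save it.
# NOTE: `opencli ocr` is a hypothetical invocation inferred from the
# [ocr] task label in `opencli fit`; check docs/ for the actual command.
opencli ocr screenshot.png > extracted.txt
```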

🎙️ Audio (ASR, TTS, VAD, STS)

Hear and speak natively. Ultra-low latency voice perception and multi-speaker cloned synthesis.

  • Qwen3-ASR (1.7B 4bit) / Parakeet: Native speech-to-text with exceptional speed.
  • Orpheus (3B bf16) / Qwen3-TTS / Pocket TTS: Lightweight, low-latency text-to-speech perfect for instant agent responses.
  • LFM2.5 Audio (1.5B 6bit): Direct Speech-to-Speech (STS) handling.
  • Sortformer (4SPK v2.1 fp16): Perfect-fit Voice Activity Detection (VAD) and speaker diarization.

🪄 Generator (Image, Video, Audio)

Create across dimensions. High-performance local generation for visual assets and 3D meshes.

  • Flux.2 (Klein 4B): Pure Swift implementation of Flux.2 image generation. On-the-fly quantization (qint8/int4) ensures it runs efficiently on standard M-series Macs.
  • Qwen Image 2512 & Image Edit: Advanced Image-to-Image (I2I) and Text-to-Image (T2I) generation.
  • LTX-2 Distilled: Video generation bridging Text-to-Video (T2V) and Image-to-Video (I2V).
  • ACE-Step 1.5: Advanced Text-to-Music/Audio generation.
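A hypothetical sketch of text-to-image generation from the terminal: the `t2i` subcommand name is an assumption taken from the task labels in `opencli fit` output, and writing the image via stdout redirection is illustrative only.

```shell
# Generate an image from a text prompt and redirect it to a file.
# NOTE: `opencli t2i` and the output convention are assumptions based on
# the [t2i] task label in `opencli fit`; see docs/t2i-flux2.md for details.
opencli t2i "a lighthouse at dusk, watercolor" > lighthouse.png
```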

🧠 LLM (Chat & Coding)

Think and build locally. Private reasoning, instruction following, and coding capabilities optimized for MLX.

  • Qwen3-Instruct (1.7B 4bit): Highly capable reasoning and coding models optimized for Apple Silicon.
  • Llama-Series: Built-in support for standard instruct and chat architectures.
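The `opencli chat` command shown in the Workflow Examples section can also be used on its own; reading the prompt from stdin, as below, is an assumption in keeping with the tool's Unix-pipe design.

```shell
# One-shot local reasoning with the default chat model.
# NOTE: piping a prompt via stdin is assumed behavior, not confirmed syntax.
echo "Summarize the tradeoffs of 4-bit quantization" | opencli chat
```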

Workflow Examples

Combine OpenCLI commands to build instant multimodal workflows using standard Unix pipes:

# A complete Voice-to-Voice pipeline in one line
opencli asr | opencli chat | opencli tts
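The same pattern extends to other modalities. In this sketch, the `ocr` subcommand name is an assumption based on the task labels in `opencli fit` output, and writing the synthesized speech to a WAV file is illustrative.

```shell
# Hypothetical screenshot-to-speech pipeline: read text from an image,
# summarize it, and speak the result.
# NOTE: `opencli ocr` and the output redirection are assumptions; only
# asr, chat, and tts are shown elsewhere in this README.
opencli ocr screenshot.png | opencli chat | opencli tts > summary.wav
```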

Community & Docs

  • Website: opencli.run
  • Documentation: See the docs/ folder for specific model usage (e.g., asr-qwen3.md, t2i-flux2.md).

License

This project is licensed under the MIT License.

About

OpenCLI bridges the gap between raw MLX models and AI Agents. Convert local Vision, Audio, and 3D models into Standardized Agent Skills via MCP.
