Turn your streams and video into viral clips with AI — 100% locally
Powered by Ollama, Whisper, and FFmpeg. Your videos never leave your machine.
Screenshots: home screen · detected clips with scores · extracted clip · CLI performance analysis
| | MXCLip | Opus Clip | Captions |
|---|---|---|---|
| Price | Free | ~$30/mo | ~$20/mo |
| Runs locally | ✅ | ❌ | ❌ |
| Privacy | 100% offline | Cloud upload | Cloud upload |
| Custom AI model | Yes (any Ollama model) | ❌ | ❌ |
| Open source | ✅ | ❌ | ❌ |
- Automatic clip detection — Analyzes your video to find the most engaging moments
- Live stream monitoring — Works with Twitch streams in real time
- Face detection — Tracks speaker presence throughout the video
- Audio transcription — Whisper-powered speech-to-text, no API needed
- Vision analysis — Understands what's happening on screen via local LLM
- Q&A mode — Ask questions about any moment in your video
- Electron GUI + CLI — Pick whichever interface you prefer
- Vector search — Semantic search over embedded video events
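The idea behind the vector-search feature can be illustrated with a minimal sketch: each video event is embedded as a vector (MXCLip uses nomic-embed-text via Ollama), and a query embedding is matched by cosine similarity. The toy 3-dimensional vectors and helper names below are illustrative only, not the project's actual API.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, events):
    """Rank (timestamp, vector) events by similarity to the query vector."""
    return sorted(events, key=lambda e: cosine_similarity(query_vec, e[1]), reverse=True)

# Toy embeddings standing in for real model output
events = [
    (12.0, [0.9, 0.1, 0.0]),   # e.g. "streamer laughs"
    (95.5, [0.1, 0.9, 0.2]),   # e.g. "boss fight begins"
]
query = [0.8, 0.2, 0.1]        # e.g. embedding of "funny moment"
best_timestamp = search(query, events)[0][0]
```

In the real pipeline the vectors come from the embedding model and have hundreds of dimensions, but the ranking step works the same way.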
- Node.js 18+
- Python 3.10+
- Ollama running locally
- ffmpeg (`brew install ffmpeg` on macOS, `apt install ffmpeg` on Linux)
- A Twitch account (only for stream mode)
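Before installing, you can sanity-check that the required tools are on your PATH with a short script like this (the script is illustrative and not shipped with the project):

```python
import shutil

def missing_tools(tools):
    """Return the subset of command names not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

required = ["node", "python3", "ollama", "ffmpeg"]
print("missing:", missing_tools(required) or "none")
```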
```bash
git clone https://github.com/YOUR_USERNAME/mxclip.git
cd mxclip

# Install Node dependencies
npm install

# Install Python dependencies
cd scripts && pip install -r requirements.txt && cd ..

# Set up environment
cp .env.example .env
# Edit .env and add your Twitch credentials (only needed for stream mode)

# Build
npm run build
```

That's it — no need to install models manually. MXCLip downloads and configures everything automatically on first launch.
MXCLip downloads the following models automatically the first time you run it:
| Model | Role | Source |
|---|---|---|
| Whisper (ggml-small) | Audio transcription | whisper.cpp / HuggingFace |
| nomic-embed-text | Vector embeddings | Ollama |
| gemma3n:e4b | Semantic analysis & Q&A | Ollama |
| FastVLM (Apple, 4-bit) | Vision understanding | Apple CDN + MLX |
Progress is shown in real time — a dedicated screen in the Electron GUI, and progress bars in the CLI terminal. Once all models are ready the app starts normally. Subsequent launches skip the download entirely.
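The "subsequent launches skip the download" behavior amounts to a cache-existence check before fetching. A minimal sketch of the idea (the `models` directory, file name, and download callback are hypothetical, not the project's actual layout):

```python
from pathlib import Path

MODELS_DIR = Path("models")  # hypothetical cache location

def ensure_model(filename, download):
    """Download a model file only if it is not already cached."""
    MODELS_DIR.mkdir(exist_ok=True)
    target = MODELS_DIR / filename
    if target.exists():
        return target, False          # cached: skip download
    download(target)                  # first launch: fetch the file
    return target, True
```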
```bash
# Electron GUI
npm run electron

# CLI
npm start
```

You will be prompted to choose between:

- Video mode — Analyze a local video file (`video.mp4` in the project root)
- Stream mode — Connect to a live Twitch stream
Copy .env.example to .env and fill in:
```
TWITCH_CLIENT_ID=your_client_id_here
TWITCH_CLIENT_SECRET=your_client_secret_here
```

Get your Twitch credentials at dev.twitch.tv/console/apps.
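For reference, the `.env` format is plain `KEY=VALUE` lines. A minimal parser sketch (the app itself presumably uses a dotenv-style library, not this code):

```python
def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

env = parse_env("TWITCH_CLIENT_ID=abc123\nTWITCH_CLIENT_SECRET=s3cret\n")
```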
```
Video / Stream
      │
      ▼
Frame extraction (FFmpeg)
      │
      ├── OCR (Tesseract)
      ├── Audio transcription (Whisper)
      ├── Face detection (InsightFace + YOLOv8)
      └── Vision analysis (FastVLM / Ollama)
      │
      ▼
Event embedding + Vector search
      │
      ▼
Clip scoring & detection
      │
      ▼
Output clips (MP4)
```
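The final scoring stage can be pictured as a sliding window over per-second event scores, keeping the window with the highest total. The window size and scores below are made up for illustration; the real detector combines multiple signals and is more involved.

```python
def best_clip(scores, window=3):
    """Return (start_index, total) of the highest-scoring window."""
    best_start, best_total = 0, sum(scores[:window])
    total = best_total
    for i in range(1, len(scores) - window + 1):
        # Slide the window: add the entering score, drop the leaving one
        total += scores[i + window - 1] - scores[i - 1]
        if total > best_total:
            best_start, best_total = i, total
    return best_start, best_total

# Per-second "engagement" scores from transcription/vision/face signals
scores = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.3]
start, total = best_clip(scores, window=3)   # best window starts at index 2
```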
Clips are saved to `output/clips/`. Each video gets its own cache directory based on a hash of the file path, so re-runs are fast.
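The per-video cache key described above could be derived roughly like this (the hash scheme and directory names are a guess at the idea, not the exact implementation):

```python
import hashlib
from pathlib import Path

def cache_dir_for(video_path, root="cache"):
    """Derive a stable cache directory name from the video's path."""
    digest = hashlib.sha256(str(video_path).encode("utf-8")).hexdigest()[:16]
    return Path(root) / digest

d = cache_dir_for("/videos/stream-2024-01-01.mp4")
```

Because the name depends only on the path, re-running on the same file lands in the same directory and can reuse earlier transcription and frame-analysis results.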
See CONTRIBUTING.md.




