
MXClip

Turn your streams and videos into viral clips with AI — 100% locally

Powered by Ollama, Whisper, and FFmpeg. Your videos never leave your machine.


MXClip demo


Screenshots

  • Home screen
  • Detected clips with scores
  • Extracted clip
  • CLI performance analysis


Why MXClip?

                 MXClip                  Opus Clip     Captions
Price            Free                    ~$30/mo       ~$20/mo
Runs locally     Yes                     No            No
Privacy          100% offline            Cloud upload  Cloud upload
Custom AI model  Yes (any Ollama model)  No            No
Open source      Yes                     No            No

Features

  • Automatic clip detection — Analyzes your video to find the most engaging moments
  • Live stream monitoring — Works with Twitch streams in real time
  • Face detection — Tracks speaker presence throughout the video
  • Audio transcription — Whisper-powered speech-to-text, no API needed
  • Vision analysis — Understands what's happening on screen via local LLM
  • Q&A mode — Ask questions about any moment in your video
  • Electron GUI + CLI — Pick whichever interface you prefer
  • Vector search — Semantic search over embedded video events

Requirements

  • Node.js 18+
  • Python 3.10+
  • Ollama running locally
  • ffmpeg (brew install ffmpeg on macOS, apt install ffmpeg on Linux)
  • A Twitch account (only for stream mode)
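Before installing, you can verify the command-line prerequisites are on your PATH. A minimal sketch (this helper script is not part of MXClip; tool names follow the list above):

```python
# Hypothetical preflight check: report which required tools are missing.
import shutil

def missing_tools(tools=("node", "python3", "ollama", "ffmpeg")):
    """Return the subset of required command-line tools not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```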

Installation

git clone https://github.com/YOUR_USERNAME/mxclip.git
cd mxclip

# Install Node dependencies
npm install

# Install Python dependencies
cd scripts && pip install -r requirements.txt && cd ..

# Set up environment
cp .env.example .env
# Edit .env and add your Twitch credentials (only needed for stream mode)

# Build
npm run build

That's it — no need to install models manually. MXClip downloads and configures everything automatically on first launch.

Models

MXClip downloads the following models automatically the first time you run it:

Model                   Role                     Source
Whisper (ggml-small)    Audio transcription      whisper.cpp / HuggingFace
nomic-embed-text        Vector embeddings        Ollama
gemma4:e4b              Semantic analysis & Q&A  Ollama
FastVLM (Apple, 4-bit)  Vision understanding     Apple CDN + MLX

Progress is shown in real time — a dedicated screen in the Electron GUI, and progress bars in the CLI terminal. Once all models are ready the app starts normally. Subsequent launches skip the download entirely.
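You can check which Ollama-hosted models are already pulled by querying Ollama's local HTTP API (GET http://localhost:11434/api/tags). A hedged sketch — the helper below is illustrative, not MXClip's actual check; model names match the table above:

```python
# Given the parsed JSON from Ollama's /api/tags endpoint, report which of the
# required models still need to be pulled.
REQUIRED = {"nomic-embed-text", "gemma4:e4b"}

def missing_models(tags, required=REQUIRED):
    """Return required model names absent from the /api/tags response."""
    have = {m["name"] for m in tags.get("models", [])}
    # Ollama appends ":latest" when no tag is specified, so match base names too.
    have |= {name.split(":")[0] for name in have}
    return sorted(r for r in required if r not in have)

# Usage (requires Ollama running locally):
#   import json, urllib.request
#   tags = json.load(urllib.request.urlopen("http://localhost:11434/api/tags"))
#   print(missing_models(tags))
```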

Usage

Electron GUI (recommended)

npm run electron

CLI

npm start

You will be prompted to choose between:

  • Video mode — Analyze a local video file (video.mp4 in project root)
  • Stream mode — Connect to a live Twitch stream

Configuration

Copy .env.example to .env and fill in:

TWITCH_CLIENT_ID=your_client_id_here
TWITCH_CLIENT_SECRET=your_client_secret_here

Get your Twitch credentials at dev.twitch.tv/console/apps.
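For context, Twitch credentials like these are typically exchanged for an app access token via the client-credentials flow (POST to id.twitch.tv/oauth2/token, per Twitch's public OAuth documentation). A sketch of building that request — MXClip's stream-mode code may do this differently:

```python
# Build (but do not send) the token request for Twitch's client-credentials flow.
import urllib.parse
import urllib.request

TOKEN_URL = "https://id.twitch.tv/oauth2/token"

def build_token_request(client_id, client_secret):
    """Return a POST request that exchanges app credentials for an access token."""
    body = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
    }).encode()
    return urllib.request.Request(TOKEN_URL, data=body, method="POST")

# Usage (values come from .env):
#   req = build_token_request(client_id, client_secret)
#   token = json.load(urllib.request.urlopen(req))["access_token"]
```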

How it works

Video / Stream
     │
     ▼
Frame extraction (FFmpeg)
     │
     ├── OCR (Tesseract)
     ├── Audio transcription (Whisper)
     ├── Face detection (InsightFace + YOLOv8)
     └── Vision analysis (FastVLM / Ollama)
     │
     ▼
Event embedding + Vector search
     │
     ▼
Clip scoring & detection
     │
     ▼
Output clips (MP4)
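The final scoring stage can be sketched as follows. All names and weights here are hypothetical, chosen only to illustrate the idea of combining per-modality signals into one clip score — MXClip's actual scoring model differs:

```python
# Illustrative clip scoring: each analyzed window carries per-modality signals;
# a weighted sum ranks them and the top windows become output clips.
from dataclasses import dataclass

@dataclass
class Event:
    start: float          # seconds into the video
    end: float
    speech_score: float   # e.g. derived from the Whisper transcript
    visual_score: float   # e.g. derived from vision-model analysis
    face_present: bool    # from face detection

def score(event, weights=(0.6, 0.4), face_bonus=0.1):
    """Combine modality signals into a single engagement score."""
    w_speech, w_visual = weights
    s = w_speech * event.speech_score + w_visual * event.visual_score
    return s + (face_bonus if event.face_present else 0.0)

def top_clips(events, k=3):
    """Return the k highest-scoring events, best first."""
    return sorted(events, key=score, reverse=True)[:k]
```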

Output

Clips are saved to output/clips/. Each video gets its own cache directory based on a hash of the file path, so re-runs are fast.
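The caching scheme described above can be sketched like this; the exact hash function and directory layout MXClip uses may differ:

```python
# Derive a stable per-video cache directory from a hash of the file's
# absolute path, so the same video always maps to the same cache.
import hashlib
from pathlib import Path

def cache_dir_for(video_path, root="cache"):
    """Return a deterministic cache directory for the given video file."""
    digest = hashlib.sha256(str(Path(video_path).resolve()).encode()).hexdigest()
    return Path(root) / digest[:16]
```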

Contributing

See CONTRIBUTING.md.

License

MIT
