WowClip is a free, local-first agent skill for turning long videos into short vertical clips.
It is designed to be installed into Codex, OpenClaw, Claude Code, or another SKILL.md-compatible coding agent. WowClip is not packaged as a standalone desktop app or consumer video editor. The skill gives your agent a repeatable local workflow for analyzing media, creating subtitles, selecting highlights, planning portrait crops, building an editable edl.json, and exporting MP4 files with local tools.
- A reusable Agent Skill anchored by
SKILL.md. - A local video clipping workflow your coding agent can follow.
- A set of Python and Node.js helper scripts the agent can run on your machine.
- A fully local pipeline for subtitles, highlight selection, face-centered portrait reframing, EDL assembly, previews, and FFmpeg export.
- Not a SaaS product.
- Not a hosted editing API.
- Not a one-click desktop video editor.
- Not a GUI timeline editor.
- Not a model-weight bundle.
The sample assets in assets/demo/ show a landscape source, the planned portrait crop, and the final exported clip.
Source video frames
9:16 portrait crop result
If your agent system supports installing skills from a Git URL, give it this repository URL:
https://github.com/partyfly/wowclip.git
You can also clone this repository into your agent's skills directory manually.
For Codex:
mkdir -p ~/.codex/skills
git clone https://github.com/partyfly/wowclip.git ~/.codex/skills/wowclipFor OpenClaw or Claude Code, install the repository into the skills/plugins directory that your agent runtime scans for SKILL.md files.
Then start a new agent session and ask for WowClip explicitly:
Use the wowclip skill to analyze this local video and create 3 vertical short clips with burned subtitles.
or:
Use the wowclip skill on this YouTube URL. Keep everything local after download, generate subtitles, build edl.json, preview frames, and export MP4.
When the skill is active, the agent should:
- Probe the source media.
- Generate full-source local subtitles before selecting clips.
- Select candidate highlight ranges from transcript timing and text.
- Extract representative frames and detect shot boundaries.
- Detect and track faces with local OpenCV models.
- Build deterministic 9:16 face-centered portrait crop plans.
- Compile selected ranges, crop placement, audio, video, and subtitle clips into
edl.json. - Validate the timeline and portrait plan.
- Generate preview frames or montages.
- Export MP4 locally with FFmpeg.
The editable source of truth is edl.json, not an ad hoc FFmpeg command.
.
├── SKILL.md # Agent workflow contract
├── agents/openai.yaml # Optional agent metadata
├── assets/demo/ # README-visible sample frames and final clip
├── assets/editor/EDITOR_CONTRACT.md # Optional editor integration contract
├── references/ # Pipeline design notes
├── schemas/ # Timeline, highlight, and portrait JSON schemas
├── scripts/ # Local helper scripts used by the skill
└── tests/ # Unit tests for planning and overlay utilities
Install system tools:
- Python 3.10+
- Node.js 18+
- FFmpeg and FFprobe
yt-dlpfor YouTube ingest
Install Python packages for the local helper scripts:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txtfaster-whisper, OpenCV, NumPy, Pillow, and yt-dlp are the default Python dependencies. Some features are optional:
faster-whisperis needed for local ASR unless transcript segments are provided.- OpenCV YuNet is needed for face detection.
- OpenCV SFace is recommended for same-person tracking.
- ProPainter is optional for hard subtitle or watermark removal.
Prepare the expected local model directory structure:
python3 scripts/wowclip-bootstrap-models.py '{"modelsDir":"./assets/models"}'This command creates directories and reports missing files. It does not download models automatically.
Expected local model locations include:
assets/models/
├── opencv/
│ ├── face_detection_yunet.onnx
│ └── face_recognition_sface.onnx
├── silero/
│ └── silero_vad.onnx
├── whisper/
└── propainter/
See references/local-models.md for details.
The recommended entry point is through an agent using SKILL.md. Advanced users can also run the helper scripts directly.
Probe a source file:
node scripts/wowclip-probe.mjs '{"path":"/abs/source.mp4"}'Generate local subtitles:
python3 scripts/wowclip-transcribe-local.py '{"sourcePath":"/abs/source.mp4","projectRootDir":"/abs/wowclip-project","language":"en"}'Run the YouTube-to-vertical workflow:
python3 scripts/wowclip-auto-youtube.py '{"url":"https://www.youtube.com/watch?v=...","projectRootDir":"/abs/wowclip-project","language":"en","maxClips":3,"portraitMode":"auto","export":true}'The command writes intermediate artifacts under projectRootDir, including transcripts, engineered subtitles, highlight plans, portrait crop plans, preview caches, edl.json, and exported MP4 files.
Run the current WowClip tests:
python3 -m unittest discover -s testsWowClip is designed so runtime media processing can happen locally. The project scripts should not upload source media, audio, transcripts, face detections, or timelines. You are still responsible for reviewing any third-party tools or models you install separately.
Follow WowClip at x.com/waoclip.
This repository is licensed under MIT for the WowClip code in this project. Third-party tools, models, and datasets are governed by their own licenses and terms, including FFmpeg, yt-dlp, faster-whisper, Whisper model weights, OpenCV models, Silero VAD, and ProPainter.
Do not redistribute model weights or third-party binaries unless their licenses allow it.
MIT. See LICENSE.

