shorts

Modular shorts production. Every step of turning a long horizontal video into a vertical short is an atomic Claude Code skill — each callable on its own, chainable into a full pipeline.

Stack

whisper.cpp (local, Metal) — transcription with word timestamps, using a local GGML model
Claude (via Claude Code session + optional /crew tmux members) — segment ranking, active-speaker picking, semantic judgment. No API key required.
ffmpeg — all media operations (cut, crop, loudnorm, burn-in)
MediaPipe — face detection (boxes only; Claude picks the active speaker)
Rails + Sidekiq — optional thin wrapper for upload UI + job queue (not required to use the skills)

Goal

Upload a horizontal video → identify the speaker → reframe to keep them in shot → burn subtitles → cut N shorts → drop them in ./output/<source-video-name>/ for easy browsing.

Skills

Skills live in .claude/skills/<name>/SKILL.md. Invoke any one directly via Claude Code, or chain them.

Skill	Purpose
`ingest`	YouTube URL → `work/<id>/source.mp4` + `ingest.json` (yt-dlp)
`transcribe`	Video → JSON transcript with word timestamps (whisper.cpp local)
`fit-vertical`	9:16 reframe with blurred bars top/bottom (no speaker tracking)
`pick-segments`	Claude ranks transcript spans (+ RMS energy) → N clip-worthy spans
`burn-subtitles`	Word-timed subtitles burned in as a PNG overlay sequence
`loudnorm`	Two-pass ffmpeg loudnorm to broadcast levels
`cut-clip`	ffmpeg trim to span
`qc-clip`	ffprobe sanity (duration, size)
`save-local`	Drop renders into `./output/<source-name>/`

Setup

cp .env.example .env  # paths to whisper-cli + your local GGML model
brew install ffmpeg whisper-cpp yt-dlp
pip install mediapipe opencv-python numpy

Run the whole pipeline

./shorts.sh <youtube-url> [n=5] [dmin=20] [dmax=60]

Drives the full chain — ingest → transcribe → segment-topics → pick-segments → verify-coherence → bookend-trim → per span (cut → trim-filler → tighten-pace → fit-vertical → subtitles → title → loudnorm → CTA → bg-music → qc → save). Intermediate JSON/video lands in work/<id>/; finished shorts in ./output/<source-name>/. Per-clip transcript is sliced to clip-local time by rebase.py. Re-runs are cheap — every skill caches on mtime.

Status

All skills implemented; ./shorts.sh drives them end to end.

Pre-pivot archive

The previous Python pipeline (whisperX, mlx-whisper, ranker/grader/subtitler in pipeline_v2.py) is preserved on the archive/pre-pivot branch.

Name		Name	Last commit message	Last commit date
Latest commit History 68 Commits
.beads		.beads
.claude		.claude
ralph		ralph
reference_shorts		reference_shorts
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
May26-spec.md		May26-spec.md
README.md		README.md
SPEC-broll.md		SPEC-broll.md
SPEC-fill-vertical.md		SPEC-fill-vertical.md
SPEC-pick-segments.md		SPEC-pick-segments.md
SPEC.md		SPEC.md
gif.gif		gif.gif
ralph.sh		ralph.sh
rebase.py		rebase.py
shorts.sh		shorts.sh
start.sh		start.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

shorts

Stack

Goal

Skills

Setup

Run the whole pipeline

Status

Pre-pivot archive

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

shorts

Stack

Goal

Skills

Setup

Run the whole pipeline

Status

Pre-pivot archive

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages