ClipRank is an AI-assisted video analysis system that automatically identifies, ranks, and exports high-quality short-form clips (≈50–60 seconds) from longer video content.
It is designed to simulate how a human editor finds compelling moments — using transcription, structured segmentation, heuristic scoring, and diversity filtering to surface the most engaging segments of a video.
Modern content workflows rely heavily on repurposing long-form content into short-form clips for platforms like:
- YouTube Shorts
- TikTok
- Instagram Reels
- X (Twitter) video
ClipRank automates this process.
Instead of manually scrubbing through video timelines, ClipRank:
- analyzes spoken language
- identifies high-value segments
- ranks them using multiple signals
- outputs ready-to-use clips
The system operates as a multi-stage processing pipeline:
Video Input
↓
Transcription (faster-whisper)
↓
Timestamped Segments
↓
Candidate Window Generation
↓
Multi-Factor Scoring Engine
↓
Diversity Filtering (timeline-aware)
↓
Top Clip Selection
↓
FFmpeg Clip Export
↓
Report Generation
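The stage ordering above can be sketched as a simple orchestration function. Every function name below is an illustrative stub, not ClipRank's actual internals:

```python
# Illustrative sketch of the pipeline stage ordering; each function is a
# hypothetical stub standing in for a real ClipRank module.
def transcribe(video_path):          # faster-whisper would run here
    return [(0.0, 7.6, "Opening statement...")]

def generate_candidates(segments):   # windowing over transcript segments
    return [(0.0, 55.0)]

def score(candidates):               # multi-factor heuristic scoring
    return [(1.0, c) for c in candidates]

def diversify(scored):               # timeline-aware diversity filtering
    return scored

def export(clips, video_path):       # FFmpeg cutting would run here
    return [f"clip_{i:03d}.mp4" for i, _ in enumerate(clips, 1)]

def run_pipeline(video_path):
    segments = transcribe(video_path)
    candidates = generate_candidates(segments)
    scored = score(candidates)
    selected = diversify(scored)
    return export(selected, video_path)

print(run_pipeline("workspace/input/your_video.mp4"))  # ['clip_001.mp4']
```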
- Uses faster-whisper for speech-to-text
- Produces timestamped segments
- Detects language automatically
- Saves a human-readable transcript for review
- Saves structured transcript JSON for downstream use
- Exposes Whisper settings through `config.py`
Example:
[0.0s → 7.6s] Opening statement...
[7.6s → 13.2s] Follow-up commentary...
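Lines in that style are straightforward to render from timestamped segments. A minimal sketch, where the `TranscriptSegment` shape is an assumption rather than ClipRank's actual data model:

```python
from dataclasses import dataclass

@dataclass
class TranscriptSegment:
    # Assumed shape; ClipRank's real models live in models.py
    start: float
    end: float
    text: str

def format_segment(seg: TranscriptSegment) -> str:
    """Render one transcript line in the [start → end] style shown above."""
    return f"[{seg.start:.1f}s → {seg.end:.1f}s] {seg.text}"

print(format_segment(TranscriptSegment(0.0, 7.6, "Opening statement...")))
# [0.0s → 7.6s] Opening statement...
```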
- Builds candidate windows from adjacent transcript segments
- Prefers cleaner starts after pauses, sentence breaks, or stronger openers
- Scores likely end points before keeping windows
- Targets ~50–60 second clips while allowing a wider generation range
- Avoids flooding the scorer with near-duplicate candidates
Example run:
135 transcript segments
→ 46 candidate clip windows
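One way to grow windows from adjacent segments while capping duration might look like this. The parameter values are assumptions, not ClipRank's defaults, and this sketch omits the start-quality and near-duplicate checks described above:

```python
def build_windows(segments, min_s=40.0, max_s=70.0):
    """Grow candidate windows from each segment start until the max duration.

    segments: list of (start, end, text) tuples in timeline order.
    Returns (start, end) windows whose duration falls in [min_s, max_s].
    """
    windows = []
    for i in range(len(segments)):
        for j in range(i, len(segments)):
            duration = segments[j][1] - segments[i][0]
            if duration > max_s:
                break  # extending further only makes the window longer
            if duration >= min_s:
                windows.append((segments[i][0], segments[j][1]))
    return windows

# 20 synthetic 5-second segments
segs = [(float(t), t + 5.0, "...") for t in range(0, 100, 5)]
wins = build_windows(segs)
print(len(wins), "candidate windows")
```

A real implementation would then prune near-identical windows before scoring, which is how 135 segments reduce to a few dozen candidates rather than hundreds.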
Each clip is evaluated using multiple heuristics:
- Hook Strength: detects attention-grabbing language (questions, tension, contrast, strong openings)
- Emotional Intensity: emphatic, reactive, or emotionally charged wording
- Value Density: explanatory or insight-heavy language
- Pacing: words per second relative to clip duration
- Duration Fit: alignment with the ideal short-form length
total_score = hook + opening_hook + emotional + value + pacing + duration
Each clip also includes:
- scoring breakdown
- human-readable notes explaining why each clip ranked where it did
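Two of these signals are easy to illustrate. The cue list and shaping below are assumptions for demonstration, not ClipRank's actual heuristics:

```python
import re

# Hypothetical cue list; ClipRank's real patterns live in the scoring/ package
HOOK_CUES = [r"\bwhy\b", r"\bhow\b", r"\bnever\b", r"\bbut\b", r"\?"]

def hook_strength(text: str) -> float:
    """Fraction of hook cues present in the text, capped at 1.0."""
    t = text.lower()
    hits = sum(1 for cue in HOOK_CUES if re.search(cue, t))
    return min(1.0, hits / 3)

def duration_fit(seconds: float, target: float = 55.0, tolerance: float = 15.0) -> float:
    """1.0 at the target length, falling linearly to 0.0 at the tolerance edge."""
    return max(0.0, 1.0 - abs(seconds - target) / tolerance)

print(hook_strength("Why would an American president do that?"))  # 0.6666666666666666
print(duration_fit(55.0))  # 1.0
```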
Prevents redundant or overlapping clips.
- Limits timeline overlap ratio
- Enforces minimum start-time gap
- Checks lightweight transcript similarity
- Uses a stricter first pass with a fallback pass if too few clips survive
- Ensures clips are spread across the video timeline
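The overlap and spacing checks can be sketched as follows; the thresholds are assumptions, and ClipRank additionally applies transcript-similarity checks and a fallback pass that this sketch omits:

```python
def overlap_ratio(a, b):
    """Timeline overlap between two (start, end) spans, relative to the shorter one."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    shorter = min(a[1] - a[0], b[1] - b[0])
    return inter / shorter if shorter else 0.0

def diverse_select(clips, max_overlap=0.2, min_gap=30.0, top_n=5):
    """Greedily keep the best clips that don't crowd already-kept ones.

    clips: list of (score, start, end) tuples.
    """
    kept = []
    for score, s, e in sorted(clips, reverse=True):  # best score first
        ok = all(
            overlap_ratio((s, e), (ks, ke)) <= max_overlap and abs(s - ks) >= min_gap
            for _, ks, ke in kept
        )
        if ok:
            kept.append((score, s, e))
        if len(kept) == top_n:
            break
    return kept

clips = [(9.0, 0.0, 55.0), (8.0, 10.0, 65.0), (7.0, 120.0, 175.0)]
print(diverse_select(clips))  # the 8.0 clip is dropped for crowding the 9.0 clip
```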
This transforms raw ranking into usable output.
- Uses FFmpeg
- Extracts clips using timestamps
- Builds safer export filenames from source title + clip timing + clip id
- Outputs `.mp4` files to `workspace/runs/<source>_<timestamp>/clips/`
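Each export reduces to a single FFmpeg invocation per clip. A sketch of building that command (the helper name is hypothetical, though the flags are standard FFmpeg options):

```python
import subprocess

def build_ffmpeg_cmd(src: str, start: float, end: float, out_path: str) -> list[str]:
    """Assemble a stream-copy cut. -c copy avoids re-encoding but snaps to
    keyframes, which is one reason re-encoding is a roadmap item for cleaner cuts."""
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.2f}",        # fast input-side seek to the clip start
        "-i", src,
        "-t", f"{end - start:.2f}",   # clip duration
        "-c", "copy",                 # copy streams without re-encoding
        out_path,
    ]

def export_clip(src, start, end, out_path):
    subprocess.run(build_ffmpeg_cmd(src, start, end, out_path), check=True)

print(" ".join(build_ffmpeg_cmd("in.mp4", 43.0, 98.0, "clip_008.mp4")))
```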
Each run produces a detailed report:
workspace/runs/<source>_<timestamp>/reports/
Includes:
- timestamps
- full score breakdown
- transcript preview
- reasoning notes
- exported clip file path
The project is currently a working end-to-end local pipeline.
What is already implemented:
- local video file validation
- real Whisper-based transcription
- timestamped transcript segments
- saved transcript text and transcript JSON
- smarter candidate generation
- multi-factor heuristic scoring
- timeline-aware diversity filtering
- transcript-aware diversity filtering
- FFmpeg clip export
- text report generation with transcript previews
Recent validated run:
- Input: `workspace/input/MTG on Trump’s Iran war ‘Why would an American president do that’.mp4`
- Result: 135 transcript segments, 46 candidate clips, 5 selected clips, 5 exported files
Additional real-world validation:
- Input: `workspace/input/Weve won - Trump speaks on Iran, Straight of Hormuz, NATO, executions, Israel.mp4`
- Result: 112 transcript segments, 23 candidate clips, 5 selected clips, 5 exported files
Current conclusion:
- The system is robust enough to move on from heavy tuning
- The biggest remaining quality limiter is messy-source transcription accuracy
- A final stronger-model transcription pass was tested and rejected for this phase because the latency cost was too high relative to the gain
cliprank/
├── main.py # Pipeline entry point
├── streamlit_app.py # Streamlit demo interface
├── pipeline.py # Reusable engine runner for CLI/UI
├── profiles.py # Demo content profiles
├── run_demo.command # macOS demo launcher
├── run_demo.bat # Windows demo launcher
├── config.py # Global configuration
├── models.py # Data models
│
├── transcription/ # Speech-to-text logic
├── segmentation/ # Clip window generation
├── scoring/ # Heuristics + ranking + diversity
├── export/ # FFmpeg + reporting
├── ingest/ # Input validation
├── utils/ # Helpers
│
├── workspace/
│ ├── input/ # Source videos
│ ├── transcript/ # Generated transcripts
│ ├── reports/ # Analysis reports
│ ├── clips/ # Exported clips
│
└── docs/ # Documentation
Before running this project, ensure you have:
- Python 3.10+
- FFmpeg installed and available on your system `PATH`
Mac (Homebrew):

```
brew install ffmpeg
```

Ubuntu/Debian:

```
sudo apt update
sudo apt install ffmpeg
```

Windows: download from https://ffmpeg.org/download.html and add it to your PATH.
Clone the repository and install dependencies:
```
git clone https://github.com/Squawk7200/ClipRank.git
cd cliprank
python3 -m venv venv
./venv/bin/pip install -r requirements.txt
```

Runtime notes:
- `faster-whisper` model files are downloaded on first use
- the repo-local `venv` is the expected Python environment for running ClipRank
Run the CLI tool on a video file:
```
./venv/bin/python main.py "workspace/input/your_video.mp4"
```

Default validation run:

```
./venv/bin/python main.py "workspace/input/MTG on Trump’s Iran war ‘Why would an American president do that’.mp4"
```

Example with creator controls:

```
./venv/bin/python main.py "workspace/input/your_video.mp4" --top-clips 6 --min-seconds 40 --max-seconds 70 --target-seconds 55 --profile news --keywords "iran, nato, executions"
```

Testing notes:
- use the MTG file above as the default validation input going forward
- `workspace/input/test.mp4` is not a valid media file and should not be used for pipeline validation
ClipRank now also includes a Streamlit demo intended for easy portfolio use from a GitHub download.
What the demo supports:
- upload common video or audio formats
- choose how many clips to generate
- adjust minimum, maximum, and target clip length
- choose a content profile
- add custom keywords
- download clips, transcript files, and the report from the browser
Main demo file: `streamlit_app.py`

GitHub-friendly launcher files:
- `run_demo.command` for macOS
- `run_demo.bat` for Windows
Typical demo flow:
- Download the repository ZIP from GitHub and unzip it.
- Ensure Python and FFmpeg are installed.
- Double-click the launcher for your platform.
- Wait for the first-run dependency install and model initialization.
- The Streamlit app opens in your browser locally.
Manual launch commands:
Mac/Linux:

```
python3 -m venv venv
./venv/bin/pip install -r requirements.txt
./venv/bin/streamlit run streamlit_app.py
```

Windows:

```
py -3 -m venv venv
venv\Scripts\python -m pip install -r requirements.txt
venv\Scripts\streamlit run streamlit_app.py
```

Important notes:
- this is a good demo experience for GitHub and portfolio sharing
- it is not the same as a fully packaged desktop app yet
- average creators may still need Python installed for this demo version
- 📁 Per-run folder → `workspace/runs/<source>_<timestamp>/`
- 📄 Transcript text + JSON → `workspace/runs/<source>_<timestamp>/transcript/`
- 📊 Report → `workspace/runs/<source>_<timestamp>/reports/`
- 🎬 Clips → `workspace/runs/<source>_<timestamp>/clips/`
Transcript outputs now include:
- `*_transcript.txt` for human-readable review
- `*_transcript.json` for structured downstream use
Report output includes:
- selected clips
- score breakdown
- transcript preview
- exported file path
Exported clip filenames now look like:

```
mtg-on-trump-s-iran-war-why-would-an-american-president-do-that_0043s_0098s_clip_008.mp4
```
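A filename in that pattern can be produced with a small slug helper. This sketch reproduces the observed format, though ClipRank's actual implementation may differ:

```python
import re

def clip_filename(title: str, start: float, end: float, clip_id: int) -> str:
    """Slugify the source title, then append zero-padded clip timing and id."""
    # collapse every run of characters outside [a-z0-9] into a single hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{slug}_{int(start):04d}s_{int(end):04d}s_clip_{clip_id:03d}.mp4"

print(clip_filename(
    "MTG on Trump’s Iran war ‘Why would an American president do that’",
    43, 98, 8,
))
# mtg-on-trump-s-iran-war-why-would-an-american-president-do-that_0043s_0098s_clip_008.mp4
```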
Edit `config.py`:

```
TOP_CLIP_COUNT = 5
```
Other tunable parameters:
- clip duration targets
- transcript paragraph formatting
- Whisper model settings
- diversity spacing
Current transcription-related settings include:
- `WHISPER_MODEL_SIZE`
- `WHISPER_COMPUTE_TYPE`
- `WHISPER_BEAM_SIZE`
- `WHISPER_VAD_FILTER`
- `WHISPER_WORD_TIMESTAMPS`
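In `config.py` these might look as follows; the values shown are placeholders, not ClipRank's shipped defaults, though the concepts (model size, compute type, beam size, VAD filtering, word timestamps) are real faster-whisper options:

```python
# Hypothetical config.py excerpt; values are illustrative, not the shipped defaults.
WHISPER_MODEL_SIZE = "base"       # faster-whisper model size, e.g. "base", "small"
WHISPER_COMPUTE_TYPE = "int8"     # quantization/compute type passed to faster-whisper
WHISPER_BEAM_SIZE = 5             # beam search width during decoding
WHISPER_VAD_FILTER = True         # skip non-speech via voice activity detection
WHISPER_WORD_TIMESTAMPS = False   # per-word timing (slower when enabled)
```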
CLI runtime options now include:
- `--top-clips`
- `--min-seconds`
- `--max-seconds`
- `--target-seconds`
- `--profile`
- `--keywords`
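ClipRank's CLI is built with typer; as a dependency-free sketch of the same flag surface, here is the equivalent argparse definition (defaults are assumptions):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the runtime options listed above; defaults here are assumptions.
    p = argparse.ArgumentParser(prog="cliprank")
    p.add_argument("video", help="path to the source video file")
    p.add_argument("--top-clips", type=int, default=5)
    p.add_argument("--min-seconds", type=float, default=40.0)
    p.add_argument("--max-seconds", type=float, default=70.0)
    p.add_argument("--target-seconds", type=float, default=55.0)
    p.add_argument("--profile", default="default")
    p.add_argument("--keywords", default="", help="comma-separated boost terms")
    return p

args = build_parser().parse_args(
    ["workspace/input/your_video.mp4", "--top-clips", "6", "--keywords", "iran, nato"]
)
print(args.top_clips, args.keywords)  # 6 iran, nato
```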
Sample pipeline output:

```
Captured 135 transcript segment(s)
Created 46 candidate clip(s)
Scored 46 clip(s)
Kept 5 diverse clip(s)
Exported 5 clip files
```
- Python
- faster-whisper (speech-to-text transcription)
- pydantic (data modeling and validation)
- streamlit (demo UI)
- typer (CLI interface)
- FFmpeg (video/audio processing)
- End-to-end automated pipeline
- Timestamp-aware transcription
- Human-readable transcript output
- Structured transcript JSON output
- Smarter transcript-aware candidate generation
- Multi-factor scoring system with opening-hook emphasis
- Timeline and transcript-aware diversity filtering
- Automated clip export with safer filenames
- Structured reporting with transcript previews and export paths
- Better FFmpeg error handling and possibly re-encoding for cleaner cuts
- Cleaner Streamlit UX and stronger progress/error messaging
- Save or reopen prior runs more easily from the demo
- Batch processing
- Packaged desktop releases for macOS and Windows
- Optional future experimentation with stronger transcription models if runtime budget allows
ClipRank was built to:
- demonstrate real-world system design
- model production-grade content workflows
- explore AI-assisted media tooling
- serve as a portfolio-ready project
Developed as part of an evolving portfolio in:
- Software Development
- AI-assisted systems
- Media automation pipelines
ClipRank is not just a script — it is a modular, extensible system that bridges:
- AI (speech + heuristics)
- backend engineering
- media processing
It reflects real-world thinking around:
- pipelines
- ranking systems
- content automation
- more to follow