Tools for splitting MKV files containing multiple TV episodes into individual episode files.
split_chapters splits an MKV file using embedded chapter markers. It is simple and fast, but only works if the source file has proper chapter metadata.
split_chapters input.mkv

Output files are named input_ep01.mkv, input_ep02.mkv, etc., based on chapter titles.
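
For readers curious how chapter-based splitting can work, here is a minimal Python sketch. It is not the split_chapters script itself: it assumes chapter data obtained from `ffprobe -v error -print_format json -show_chapters input.mkv`, and the function name `chapter_split_commands` and its `stem` parameter are hypothetical. It only builds the ffmpeg commands, so nothing is written to disk.

```python
import json


def chapter_split_commands(ffprobe_json, source, stem):
    """Build one ffmpeg stream-copy command per chapter.

    ffprobe_json: JSON text from `ffprobe -print_format json -show_chapters`.
    stem: output name prefix (hypothetical parameter for this sketch).
    """
    chapters = json.loads(ffprobe_json).get("chapters", [])
    commands = []
    for index, chapter in enumerate(chapters, start=1):
        output = f"{stem}_ep{index:02d}.mkv"
        commands.append([
            "ffmpeg", "-i", source,
            "-ss", chapter["start_time"],  # chapter start, in seconds
            "-to", chapter["end_time"],    # chapter end, in seconds
            "-map", "0",                   # keep all streams
            "-c", "copy",                  # stream copy, no re-encoding
            output,
        ])
    return commands


# Example with two made-up chapters
sample = json.dumps({"chapters": [
    {"start_time": "0.000000", "end_time": "1320.500000"},
    {"start_time": "1320.500000", "end_time": "2640.000000"},
]})
for command in chapter_split_commands(sample, "input.mkv", "input"):
    print(" ".join(command))
```

Each command could then be run with `subprocess.run(command, check=True)`. With stream copy, cuts land on the nearest keyframes rather than on exact frames.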
split_episodes automatically detects and splits episodes by finding repeating opening-credits patterns. It uses ffmpeg's MPEG-7 video signature filter to identify where the same visual sequence (the opening credits) appears multiple times in the file.
split_episodes [--dry-run] input.mkv [n_samples] [duration] [spacing]

Parameters:
- --dry-run: Analyse and report without splitting
- n_samples: Number of samples to extract from opening (default: 10)
- duration: Length of each sample in seconds (default: 10)
- spacing: Gap between sample start times in seconds (default: 2)
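
To see how the three sampling parameters interact, the following Python sketch lists the time window each sample would cover. It assumes the first sample starts at 0 seconds, which the description above does not state explicitly; the function name `sample_windows` is hypothetical.

```python
def sample_windows(n_samples=10, duration=10.0, spacing=2.0):
    """Return (start, end) times in seconds for each sample.

    Assumption: sample i starts at i * spacing, counting from 0.
    """
    return [(i * spacing, i * spacing + duration) for i in range(n_samples)]


for start, end in sample_windows():
    print(f"sample covers {start:.0f}s to {end:.0f}s")
```

Under that assumption the defaults give overlapping windows that together cover the first 28 seconds of the file, so raising n_samples or spacing pushes the sampling further in.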
How it works:
- Extracts multiple short video samples from the beginning of the file (where opening credits appear)
- Generates MPEG-7 video signatures for each sample and the full file
- Compares samples against the full file to find repeating patterns
- Identifies which sample produces matches at the most regular intervals
- Uses scene detection to refine boundaries to exact frame cuts
- Splits the file at detected episode boundaries using stream copy (no re-encoding)
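
The "most regular intervals" step can be illustrated with a coefficient of variation (standard deviation divided by mean) over the gaps between match times. The Python sketch below is an illustration only, not the script's code: the function names, the input dictionary layout, and the rule that a sample needs at least three matches are all assumptions. It uses the standard library instead of numpy so it runs anywhere.

```python
from statistics import mean, pstdev


def interval_cv(match_times):
    """Coefficient of variation of the gaps between consecutive matches."""
    times = sorted(match_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # Assumption: fewer than two gaps is too little evidence of a pattern.
    if len(gaps) < 2 or mean(gaps) == 0:
        return float("inf")
    return pstdev(gaps) / mean(gaps)


def most_regular_sample(matches_by_sample):
    """Return the sample id whose matches are most evenly spaced."""
    return min(matches_by_sample, key=lambda s: interval_cv(matches_by_sample[s]))


# Made-up match times (seconds) for three samples
matches = {
    "sample_0": [0, 1300, 2700, 3900],  # uneven gaps
    "sample_1": [4, 1324, 2644, 3964],  # evenly spaced
    "sample_2": [8, 500],               # too few matches
}
print(most_regular_sample(matches))
```

A lower value means more evenly spaced matches, which is why moderately variable episode lengths can still be handled while very irregular ones may not.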

Requirements:
- ffmpeg (with signature filter support)
- Python 3 with numpy
- Bash
Place the scripts somewhere in your PATH, or symlink them:
ln -s /path/to/split-episodes/split_episodes ~/.local/bin/
ln -s /path/to/split-episodes/split_chapters ~/.local/bin/

Known limitations:

| Issue | Description | Workaround |
|---|---|---|
| Short opening credits | If credits are shorter than the sample duration, samples will include non-credits content | Reduce the duration parameter |
| Cold opens | If episodes have a cold open before credits, samples may miss the actual credits | Increase n_samples or spacing to sample further into the file |
| Soft transitions | Fades or dissolves between episodes may not trigger scene detection | Scene detection uses a fixed threshold of 0.3; a lower threshold may be needed |
| Variable episode lengths | Very irregular episode lengths may confuse the pattern matching | Script uses coefficient of variation to handle moderate variation |
| Varying credits | If opening credits differ significantly between episodes, matching will fail | Not solvable with current approach |
| MPEG-2 in MKV | ffmpeg's signature filter doesn't work directly with MPEG-2 in MKV containers | Script works around this by piping through rawvideo |
The scene detection uses a threshold of 0.3 (30% frame difference). This works well for hard cuts but may need adjustment for:
- Noisy or low-quality video (may need higher threshold)
- Subtle transitions (may need lower threshold)
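
As a rough illustration of threshold-based boundary refinement, the Python sketch below builds an ffmpeg filter string, parses cut timestamps from showinfo output, and snaps a rough boundary to the nearest cut. It assumes the scene detection is done with ffmpeg's `select` filter and its `scene` score, which this README does not confirm; the function names and the `max_shift` parameter are hypothetical.

```python
import re

# showinfo prints one line per selected frame, including "pts_time:<seconds>"
PTS_TIME = re.compile(r"pts_time:\s*([0-9.]+)")


def scene_filter(threshold=0.3):
    """Value for ffmpeg's -vf option: keep frames whose scene score exceeds threshold."""
    return f"select='gt(scene,{threshold})',showinfo"


def parse_cut_times(showinfo_log):
    """Extract cut timestamps (seconds) from ffmpeg's stderr text."""
    return [float(m.group(1)) for m in PTS_TIME.finditer(showinfo_log)]


def snap_to_cut(rough_time, cut_times, max_shift=5.0):
    """Move a rough boundary to the nearest cut, if one lies within max_shift seconds."""
    if not cut_times:
        return rough_time
    nearest = min(cut_times, key=lambda t: abs(t - rough_time))
    return nearest if abs(nearest - rough_time) <= max_shift else rough_time


# Made-up log excerpt in showinfo's format
log = (
    "[Parsed_showinfo_1 @ 0x55] n:   0 pts: 1320520 pts_time:1320.52 pos: 123\n"
    "[Parsed_showinfo_1 @ 0x55] n:   1 pts: 2641000 pts_time:2641 pos: 456\n"
)
cuts = parse_cut_times(log)
print(snap_to_cut(1319.0, cuts))
```

The log text would come from a command along the lines of `ffmpeg -i input.mkv -vf "select='gt(scene,0.3)',showinfo" -f null -`. Lowering the threshold reports more candidate cuts, and raising it reports fewer.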
Potential improvements:
- Tuneable scene detection threshold via command-line parameter
- Black frame detection as alternative/supplement to scene detection
- Audio fingerprinting for cases where visual matching is unreliable
- Expected episode count parameter to constrain detection
- Verification step to check detected episodes have similar durations
- Unified command that tries chapters first, falls back to signature detection
- Support for detecting closing credits as well as opening credits
License: MIT