Skip to content

lorenzowood/split-episodes

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 

Repository files navigation

split-episodes

Tools for splitting MKV files containing multiple TV episodes into individual episode files.

Scripts

split_chapters

Splits an MKV file using embedded chapter markers. Simple and fast, but only works if the source file has proper chapter metadata.

split_chapters input.mkv

Output files are named input_ep01.mkv, input_ep02.mkv, etc., based on chapter titles.

split_episodes

Automatically detects and splits episodes by finding repeating opening credits patterns. Uses ffmpeg's MPEG-7 video signature filter to identify where the same visual sequence (opening credits) appears multiple times in the file.

split_episodes [--dry-run] input.mkv [n_samples] [duration] [spacing]

Parameters:

  • --dry-run: Analyse and report without splitting
  • n_samples: Number of samples to extract from opening (default: 10)
  • duration: Length of each sample in seconds (default: 10)
  • spacing: Gap between sample start times in seconds (default: 2)

How it works:

  1. Extracts multiple short video samples from the beginning of the file (where opening credits appear)
  2. Generates MPEG-7 video signatures for each sample and the full file
  3. Compares samples against the full file to find repeating patterns
  4. Identifies which sample produces matches at the most regular intervals
  5. Uses scene detection to refine boundaries to exact frame cuts
  6. Splits the file at detected episode boundaries using stream copy (no re-encoding)

Requirements

  • ffmpeg (with signature filter support)
  • Python 3 with numpy
  • Bash

Installation

Place the scripts somewhere in your PATH, or symlink them:

ln -s /path/to/split-episodes/split_episodes ~/.local/bin/
ln -s /path/to/split-episodes/split_chapters ~/.local/bin/

Known Limitations and Potential Issues

split_episodes

Issue Description Workaround
Short opening credits If credits are shorter than the sample duration, samples will include non-credits content Reduce the duration parameter
Cold opens If episodes have a cold open before credits, samples may miss the actual credits Increase n_samples or spacing to sample further into the file
Soft transitions Fades or dissolves between episodes may not trigger scene detection Currently uses threshold 0.3; may need adjustment
Variable episode lengths Very irregular episode lengths may confuse the pattern matching Script uses coefficient of variation to handle moderate variation
Varying credits If opening credits differ significantly between episodes, matching will fail Not solvable with current approach
MPEG-2 in MKV ffmpeg's signature filter doesn't work directly with MPEG-2 in MKV containers Script works around this by piping through rawvideo

Scene detection threshold

The scene detection uses a threshold of 0.3 (30% frame difference). This works well for hard cuts but may need adjustment for:

  • Noisy or low-quality video (may need higher threshold)
  • Subtle transitions (may need lower threshold)

Future Development

Potential improvements:

  • Tuneable scene detection threshold via command-line parameter
  • Black frame detection as alternative/supplement to scene detection
  • Audio fingerprinting for cases where visual matching is unreliable
  • Expected episode count parameter to constrain detection
  • Verification step to check detected episodes have similar durations
  • Unified command that tries chapters first, falls back to signature detection
  • Support for detecting closing credits as well as opening credits

Licence

MIT

About

Tools for splitting MKV files containing multiple TV episodes

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages