Note
This tool was at least 99% written by AI. Altough I have tested it and it seems to work as it should, use with care and expect things to go really wrong.
Extract recurring chapter segments (intros, outros, etc.) from large MKV collections by analyzing chapter marker patterns.
Point it at a directory of MKV files and it will find chapters that repeat across episodes with similar durations -- typically intros and outros. It extracts the video segment from the first occurrence and names the output by the episode range it covers.
- Python 3.12+
- mkvtoolnix (
mkvmergeandmkvextractmust be on PATH)
pip install -e .chapter-extractor <input-dir> <output-dir> [options]
Find recurring 90-180s chapters across at least 5 episodes (dry run):
chapter-extractor /media/anime/show ./extracted --duration-range 90-180 --dry-runExtract intros/outros by chapter name:
chapter-extractor /media/anime/show ./extracted --chapter-namesCombine both filters (AND logic):
chapter-extractor /media/anime/show ./extracted --duration-range 80-100 --chapter-namesExtract all chapters of a specific length without grouping:
chapter-extractor /media/anime/show ./extracted --duration-range 85-95 --min-occurrences 0Scan subdirectories and use percentage-based tolerance:
chapter-extractor /media/library ./extracted -r --duration-range 60-120 --tolerance-percent 5Process non-episode files (no S##E## filename requirement):
chapter-extractor /media/movies ./extracted --no-episode-parsing --min-occurrences 0 --duration-range 60-300| Option | Default | Description |
|---|---|---|
--duration-range MIN-MAX |
None | Filter chapters by duration in seconds |
--chapter-names |
Off | Filter by common names (Opening, Intro, OP, ED, Ending, Outro, Credits, Preview, Recap, Prologue, Epilogue) |
--min-occurrences N |
5 | Minimum times a pattern must appear. 0 = extract all matches without grouping |
--tolerance-seconds N |
2 | How close durations must be to count as "the same" |
--tolerance-percent N |
None | Percentage-based tolerance (mutually exclusive with --tolerance-seconds) |
--no-episode-parsing |
Off | Skip S##E## filename parsing |
--dry-run |
Off | Preview detected patterns without extracting |
--recursive, -r |
Off | Scan subdirectories |
If no filters are specified, all chapters are considered.
Dry run prints a summary like:
Scanned 847 files (12 skipped: no chapters, 3 skipped: no episode tag)
Detected patterns:
[1] Opening (91s avg) -- S01E01-S03E24 (72 episodes)
First occurrence: S01E01 @ 00:01:32.000 - 00:03:03.000
Output: S01E01-S03E24_Opening.mkv
[2] Ending (89s avg) -- S01E01-S02E12 (36 episodes)
First occurrence: S01E01 @ 00:22:05.000 - 00:23:34.000
Output: S01E01-S02E12_Ending.mkv
Extracted files are named <episode-range>_<chapter-name>.mkv. If the chapter has no name (or name is 1 character), duration is used instead (e.g., S01E01-S01E12_90s.mkv). Duplicate names get _1, _2 suffixes.
- Scans the input directory for
.mkvfiles - Reads chapter metadata using
mkvmerge -J(file info) andmkvextract --simple(chapter timestamps) - Parses episode identifiers from filenames (
S01E05,S100E001, etc.) - Filters chapters by duration range and/or chapter name
- Clusters chapters with similar durations (within tolerance)
- Splits clusters by episode contiguity to reduce false positives (e.g., a season 1 intro won't be grouped with a coincidentally same-length season 5 outro)
- Extracts the segment from the first occurrence using
mkvmerge --split parts: