Skip to content

moretothat/yt-transcript

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 

Repository files navigation

yt-transcript

A Claude Code skill that turns YouTube videos into cleanly-edited transcripts and structured summaries.

Given one or more YouTube URLs, it:

  1. Pulls the best available subtitle track via yt-dlp (manual captions preferred, auto-generated as fallback).
  2. Mechanically cleans the captions — fillers stripped, paragraphs grouped, chapters detected.
  3. Has Claude rewrite the transcript at paragraph granularity: speaker labels, sectioned by topic shift, voice preserved, no timestamps.
  4. Generates a separate summary file with the key ideas, notable quotes, and source metadata.

Output

By default, files land in ~/yt-transcripts/:

~/yt-transcripts/
├── transcripts/
│   └── 20260101-some-video-title.md
└── summaries/
    └── 20260101-some-video-title-summary.md

Override the location by setting YT_TRANSCRIPT_DIR:

export YT_TRANSCRIPT_DIR=~/Documents/yt-archive

Install

  1. Clone this repo into your Claude Code skills directory:

    git clone https://github.com/<your-fork>/yt-transcript ~/.claude/skills/yt-transcript
  2. Install yt-dlp:

    brew install yt-dlp
    # or: pip install yt-dlp
  3. (Optional) Set a custom output directory:

    export YT_TRANSCRIPT_DIR=~/Documents/yt-archive

That's it. Restart Claude Code and /yt-transcript will be available.

Usage

In Claude Code:

/yt-transcript https://www.youtube.com/watch?v=...
/yt-transcript <url1> <url2> <url3>
/yt-transcript <url> --summary-only
  • One URL or many — multiple URLs are processed sequentially.
  • --summary-only skips the slow paragraph-rewrite step and only produces the summary. The raw transcript is deleted afterward.
  • Shortened URLs (youtu.be/...), /shorts/, and timestamped URLs all work.

Requirements

  • Claude Code
  • Python 3.9+
  • yt-dlp
  • (Optional) pandoc if you want to convert outputs to .docx

How it works

The skill is two pieces:

  • transcribe.py — fetches the subtitle track, parses YouTube's json3 caption format, deduplicates overlapping events, strips fillers, and groups tokens into paragraphs based on pause length and sentence boundaries. Outputs a "raw mechanical cleanup" markdown file.
  • SKILL.md — instructs Claude to read the raw output and rewrite it semantically: identify speakers, group by topic, smooth surface stutters, preserve voice. Then write a separate summary.

The mechanical pass handles the tedious parts (dedupe, filler removal, paragraph chunking). The Claude pass handles the parts that need judgment (who's speaking, where do topics actually shift, what counts as a "key idea").

License

MIT — see LICENSE.

About

A Claude Code skill that turns YouTube videos into cleanly-edited transcripts and structured summaries.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages