Vidscribe

A CLI tool for transcribing YouTube videos, playlists, channels, and local video files using OpenAI's Whisper model, with MLX acceleration for Apple Silicon.

Features

Multiple Input Sources: YouTube URLs, local video files, playlists, and channels
High-Quality Transcription: Powered by OpenAI Whisper
Multiple Output Formats: Text, JSON, CSV, SRT subtitles, and WebVTT
Apple Silicon Acceleration: MLX Whisper support for ~50% faster transcription on M1/M2/M3
Date Filtering: --since flag to process only videos uploaded on or after a given date
Resume Support: Re-running a playlist/channel skips already-processed videos
Cookie Auth: Bypass bot detection via browser cookies or a Netscape cookies file
Batch Scripts: Pre-built scripts for transcribing curated channel lists

Installation

git clone https://github.com/lmiadowicz/vidscribe.git
cd vidscribe
pip install -e .

Apple Silicon (optional — ~50% faster)

pip install -r requirements-mac.txt

MLX models require a Hugging Face token. Create .env from the example and add yours:

cp .env.example .env
# Edit .env: HF_TOKEN=your_token_here

Get a token at https://huggingface.co/settings/tokens.

Prerequisites

Python 3.8+

FFmpeg:

brew install ffmpeg          # macOS
sudo apt install ffmpeg      # Ubuntu/Debian
choco install ffmpeg         # Windows

Usage

Single video

# Transcribe a YouTube video
vidscribe transcribe "https://www.youtube.com/watch?v=VIDEO_ID"

# Save to file with specific format
vidscribe transcribe "VIDEO_URL" -o transcript.txt -f text

# Generate SRT subtitles
vidscribe transcribe "VIDEO_URL" -o subs.srt -f srt

# Use a larger model for better accuracy
vidscribe transcribe "VIDEO_URL" -m large

# Enable MLX acceleration (Apple Silicon)
vidscribe transcribe "VIDEO_URL" --use-mlx

# Transcribe in a specific language
vidscribe transcribe "VIDEO_URL" --language es

# Translate any language to English
vidscribe transcribe "VIDEO_URL" --task translate

# Keep downloaded audio file
vidscribe transcribe "VIDEO_URL" --keep-audio

# Bypass bot detection using browser cookies
vidscribe transcribe "VIDEO_URL" --cookies-from-browser chrome
vidscribe transcribe "VIDEO_URL" --cookies-file /path/to/cookies.txt
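
For readers unfamiliar with the SRT format produced by `-f srt`, the sketch below shows how timed segments map to subtitle blocks. It is an illustration only, not Vidscribe's own formatter: it assumes segments shaped like Whisper's output (dicts with `start`, `end`, and `text`), and the function names `srt_timestamp` and `segments_to_srt` are hypothetical.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render Whisper-style segments as numbered SRT blocks.

    Each segment is assumed to be a dict with 'start' and 'end' in seconds
    and a 'text' string; this shape is an assumption of the sketch.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Blocks are separated by a blank line, as SRT requires.
    return "\n".join(blocks)
```

SRT uses a comma before the milliseconds, whereas WebVTT uses a period, which is the main difference to watch for when converting between the two.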

Playlist / channel

# Process entire playlist
vidscribe playlist "https://youtube.com/playlist?list=PLxxxxxx"

# Process a channel, save to CSV
vidscribe playlist "https://youtube.com/@channel" -o results.csv

# Only process videos uploaded on or after a date
vidscribe playlist "https://youtube.com/@channel" --since 2024-01-01

# Limit number of videos processed
vidscribe playlist "https://youtube.com/@channel" --limit 10

# Use MLX acceleration and medium model
vidscribe playlist "PLAYLIST_URL" --use-mlx -m medium

# Bypass bot detection
vidscribe playlist "PLAYLIST_URL" --cookies-from-browser chrome

Re-running any playlist command automatically skips videos already present in the output CSV.
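
The resume and `--since` behavior described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual implementation: the CSV column name `video_id` and both function names are hypothetical, and each video is assumed to be a dict carrying `id` and an `upload_date` string in the `YYYYMMDD` form that yt-dlp reports.

```python
import csv
from datetime import date, datetime
from pathlib import Path
from typing import Optional


def load_processed_ids(csv_path, id_column="video_id"):
    """Return the set of video IDs already present in the output CSV.

    The column name is an assumption; a missing file means nothing is done yet.
    """
    path = Path(csv_path)
    if not path.exists():
        return set()
    with path.open(newline="", encoding="utf-8") as handle:
        return {row[id_column] for row in csv.DictReader(handle) if row.get(id_column)}


def select_pending(videos, processed_ids, since: Optional[date] = None):
    """Keep videos uploaded on or after `since` that are not yet in the CSV."""
    pending = []
    for video in videos:
        if video["id"] in processed_ids:
            continue  # already transcribed on an earlier run
        uploaded = datetime.strptime(video["upload_date"], "%Y%m%d").date()
        if since is not None and uploaded < since:
            continue  # older than the cutoff
        pending.append(video)
    return pending
```

Checking IDs before downloading anything is what would make a re-run cheap: only the metadata listing is repeated, not the audio download or transcription.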

Other commands

# Get video metadata without transcribing
vidscribe info "https://www.youtube.com/watch?v=VIDEO_ID"

# List available Whisper models
vidscribe models

Batch scripts

scripts/transcribe_growth_channels.sh transcribes a curated list of mobile growth / app founder YouTube channels:

# Run with defaults (base model, videos since 2024-04-30)
bash scripts/transcribe_growth_channels.sh

# Override model and date
MODEL=small SINCE=2023-01-01 bash scripts/transcribe_growth_channels.sh

# Enable MLX acceleration
USE_MLX=1 bash scripts/transcribe_growth_channels.sh

Output CSVs are written to scripts/channel_transcriptions/. Already-covered channels are skipped automatically on re-runs.
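
To adapt the batch workflow to a different channel list, something like the following Python sketch could build the equivalent `vidscribe playlist` commands. It is not a port of the shell script, whose contents are not shown here: the `build_commands` name, the channel dictionary, and the one-CSV-per-channel file naming are all assumptions. Only the flags (`-o`, `-m`, `--since`, `--use-mlx`), the `MODEL`, `SINCE`, and `USE_MLX` variables, and the defaults come from this README.

```python
import os


def build_commands(channels, env=None, output_dir="scripts/channel_transcriptions"):
    """Build one `vidscribe playlist` command per channel.

    `channels` maps a short name to a channel URL (hypothetical structure).
    MODEL, SINCE and USE_MLX mirror the script's documented overrides.
    """
    env = os.environ if env is None else env
    model = env.get("MODEL", "base")
    since = env.get("SINCE", "2024-04-30")
    use_mlx = env.get("USE_MLX") == "1"

    commands = []
    for name, url in channels.items():
        cmd = [
            "vidscribe", "playlist", url,
            "-o", f"{output_dir}/{name}.csv",  # file naming is an assumption
            "-m", model,
            "--since", since,
        ]
        if use_mlx:
            cmd.append("--use-mlx")
        commands.append(cmd)
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`; keeping the arguments as a list avoids shell quoting problems with URLs.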

Configuration

Environment variables (`.env`)

HF_TOKEN=your_huggingface_token_here   # required for MLX
VIDSCRIBE_MODEL_SIZE=base
VIDSCRIBE_OUTPUT_FORMAT=text
VIDSCRIBE_USE_MLX=true

YAML config (`~/.vidscribe/config.yaml`)

model:
  size: base

download:
  output_dir: ~/Downloads/vidscribe
  keep_files: false

output:
  format: text
  language: auto
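
This README does not say how the two configuration sources interact. One common arrangement, sketched below purely as an assumption, is built-in defaults first, then the YAML file, then environment variables on top. The `resolve_settings` function, the `DEFAULTS` values, and the precedence order are all hypothetical; only the variable names and YAML keys are taken from the examples above.

```python
import os

# Hypothetical built-in defaults; the real ones may differ.
DEFAULTS = {"model_size": "base", "output_format": "text", "use_mlx": False}


def _as_bool(value):
    """Interpret common truthy strings such as 'true' or '1'."""
    return str(value).strip().lower() in {"1", "true", "yes", "on"}


def resolve_settings(file_config=None, env=None):
    """Merge defaults, a parsed YAML dict, and environment variables.

    Assumed precedence (lowest to highest): defaults, YAML file, environment.
    `file_config` is the already-parsed content of config.yaml, if any.
    """
    env = os.environ if env is None else env
    file_config = file_config or {}
    settings = dict(DEFAULTS)

    # Layer 2: values from the YAML file.
    model_size = file_config.get("model", {}).get("size")
    if model_size:
        settings["model_size"] = model_size
    output_format = file_config.get("output", {}).get("format")
    if output_format:
        settings["output_format"] = output_format

    # Layer 3: environment variables override everything else.
    if "VIDSCRIBE_MODEL_SIZE" in env:
        settings["model_size"] = env["VIDSCRIBE_MODEL_SIZE"]
    if "VIDSCRIBE_OUTPUT_FORMAT" in env:
        settings["output_format"] = env["VIDSCRIBE_OUTPUT_FORMAT"]
    if "VIDSCRIBE_USE_MLX" in env:
        settings["use_mlx"] = _as_bool(env["VIDSCRIBE_USE_MLX"])
    return settings
```

If the tool behaves differently, for example with command-line flags taking priority over both sources, the layering idea stays the same and only the order changes.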

Project structure

vidscribe/
├── src/vidscribe/
│   ├── core/engine.py          # Whisper / MLX transcription engine
│   ├── downloaders/youtube.py  # yt-dlp based downloader
│   ├── processors/playlist.py  # Batch playlist / channel processor
│   ├── utils/                  # Config, formatters, validators
│   └── cli.py                  # Click CLI entry point
├── scripts/
│   ├── install.sh
│   └── transcribe_growth_channels.sh
└── tests/

Models

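| Model | Parameters | Multilingual | Required VRAM | Relative speed |
| --- | --- | --- | --- | --- |
| tiny | 39M | ✓ | ~1 GB | ~32x |
| base | 74M | ✓ | ~1 GB | ~16x |
| small | 244M | ✓ | ~2 GB | ~6x |
| medium | 769M | ✓ | ~5 GB | ~2x |
| large | 1550M | ✓ | ~10 GB | 1x |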
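
The table can be used to pick a model for a given memory budget and to estimate how run time scales. The sketch below simply encodes the approximate figures above; the function names are hypothetical, and the speed ratios are rough guides rather than measurements on any particular machine.

```python
# (name, approx. required VRAM in GB, speed relative to `large`), from the table.
MODELS = [
    ("tiny", 1, 32),
    ("base", 1, 16),
    ("small", 2, 6),
    ("medium", 5, 2),
    ("large", 10, 1),
]


def largest_model_that_fits(vram_gb):
    """Return the largest model whose approximate VRAM need fits the budget."""
    fitting = [name for name, needed, _ in MODELS if needed <= vram_gb]
    return fitting[-1] if fitting else None


def estimated_minutes(model, large_minutes):
    """Scale a known `large` run time by the table's relative speed."""
    speed = {name: rel for name, _, rel in MODELS}[model]
    return large_minutes / speed
```

For example, with about 4 GB available, `largest_model_that_fits(4)` returns `"small"`, and a job that takes 32 minutes with `large` would take roughly 2 minutes with `base` by these ratios.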

Development

# Install with dev dependencies
pip install -e ".[dev]"

make test        # run unit tests
make test-cov    # tests with coverage
make lint        # flake8 + mypy
make format      # black + isort

Troubleshooting

FFmpeg not found

ffmpeg -version       # verify install
brew install ffmpeg   # macOS; see Prerequisites for Ubuntu/Debian and Windows

YouTube download errors

Check your internet connection
Try passing cookies: --cookies-from-browser chrome
Some videos may be age-restricted or private
Update yt-dlp: pip install --upgrade yt-dlp

MLX authentication errors (Apple Silicon)

# Error: 401 Client Error / Repository Not Found
cp .env.example .env
# Edit .env and add: HF_TOKEN=your_token_here

Out of memory

Use a smaller model (-m tiny or -m base)
Process videos individually instead of in batches

License

MIT License — see LICENSE for details.

Acknowledgments

OpenAI Whisper — speech recognition model
yt-dlp — YouTube downloading
MLX Whisper — Apple Silicon acceleration
Click — CLI framework
Rich — terminal output
