Podscript

Podcast → Transcript!

Transcribe any podcast episode or YouTube video from the command line. Podscript generates clean markdown with speaker labels and timestamps. It works with the ElevenLabs API, or fully locally using Whisper (no API key required).

Installation

# Local transcription (free, no API key needed)
pip install podscript[local]

# Or use ElevenLabs API
pip install podscript
podscript --setup  # paste your ElevenLabs API key

For local mode, just add --local to any command. For ElevenLabs, you'll need an API key.

For YouTube support, also install yt-dlp and ffmpeg.
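
If YouTube downloads fail, a quick first check is whether both tools are actually on your PATH. The snippet below is a minimal standalone sketch of that check using only the Python standard library. It is not part of podscript, and the helper name `missing_tools` is made up for illustration.

```python
import shutil


def missing_tools(tools=("yt-dlp", "ffmpeg")):
    """Return the names of the given executables that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing for YouTube support: " + ", ".join(missing))
    else:
        print("yt-dlp and ffmpeg are both available")
```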

Usage

# Transcribe a podcast from an Apple Podcasts link
podscript "https://podcasts.apple.com/us/podcast/huberman-lab/id1545953110?i=1000690"

# Transcribe a YouTube video
podscript "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

# Use an RSS feed directly
podscript https://feeds.simplecast.com/JGE3yC0V

# Browse episodes first
podscript https://feeds.simplecast.com/JGE3yC0V --list

# Search for a specific episode
podscript https://feeds.simplecast.com/JGE3yC0V --search "AI"

# Pick episode #3 from the list
podscript https://feeds.simplecast.com/JGE3yC0V --episode 3

# Custom output filename
podscript https://feeds.simplecast.com/JGE3yC0V --latest --output transcript.md

With no flags, podscript transcribes the most recent episode.
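
To make the episode options concrete, here is a rough sketch of how listing, searching, and picking an episode from an RSS feed could work. It is an illustration only, not podscript's actual code. The function names, the inline sample feed, the 1-based numbering, the case-insensitive title match, and the assumption that feeds list the newest episode first are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written feed used only for this example.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example Show</title>
  <item><title>AI and the Future</title>
    <enclosure url="https://example.com/ep3.mp3" type="audio/mpeg"/></item>
  <item><title>Carbon Removal</title>
    <enclosure url="https://example.com/ep2.mp3" type="audio/mpeg"/></item>
  <item><title>Pilot</title>
    <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg"/></item>
</channel></rss>"""


def list_episodes(feed_xml):
    """Return episodes in feed order as dicts with a title and an audio URL."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # skip items that carry no audio file
        episodes.append({
            "title": item.findtext("title", default="").strip(),
            "audio_url": enclosure.get("url"),
        })
    return episodes


def select_episode(episodes, episode=None, search=None):
    """Pick one episode: by search term, by 1-based number, or the first entry."""
    if search is not None:
        matches = [e for e in episodes if search.lower() in e["title"].lower()]
        if not matches:
            raise LookupError(f"no episode title contains {search!r}")
        return matches[0]
    if episode is not None:
        return episodes[episode - 1]  # assumed to be 1-based, as in "episode #3"
    return episodes[0]  # assumes the feed lists the newest episode first


if __name__ == "__main__":
    for number, ep in enumerate(list_episodes(SAMPLE_FEED), start=1):
        print(number, ep["title"])
```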

Output

Generates a markdown file with speaker labels and timestamps:

# The Economics of Carbon Removal

**Podcast:** a16z Podcast
**Date:** 2/10/2026
**Duration:** 1:04:23

---

## Speaker 1
[0:00] Welcome back to the show. Today we're talking about...

## Speaker 2
[0:15] Thanks for having me. So the key challenge with carbon removal is...

## Speaker 1
[2:41] That's fascinating. How does the economics actually work at scale?
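
If you want to post-process or reproduce this layout, the sketch below shows one way the format above could be rendered from a list of speaker turns. It is an illustration based only on the sample output, not podscript's own renderer. The function names and the `(speaker, start_seconds, text)` tuple shape are assumptions.

```python
def format_timestamp(seconds):
    """Format seconds as M:SS, or H:MM:SS once the value reaches an hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def render_transcript(title, podcast, date, duration_seconds, segments):
    """Build the markdown document; segments are (speaker, start_seconds, text)."""
    lines = [
        f"# {title}",
        "",
        f"**Podcast:** {podcast}",
        f"**Date:** {date}",
        f"**Duration:** {format_timestamp(duration_seconds)}",
        "",
        "---",
        "",
    ]
    for speaker, start, text in segments:
        # One heading per speaker turn, followed by the timestamped text.
        lines += [f"## {speaker}", f"[{format_timestamp(start)}] {text}", ""]
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_transcript(
        "The Economics of Carbon Removal", "a16z Podcast", "2/10/2026", 3863,
        [("Speaker 1", 0, "Welcome back to the show."),
         ("Speaker 2", 15, "Thanks for having me.")],
    ))
```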

Local Transcription

You can transcribe entirely offline using a local Whisper model — no API key required:

pip install podscript[local]

This installs faster-whisper, pyannote.audio, and torch.

Usage

# Basic local transcription (uses "base" model, no speaker diarization)
podscript "https://www.youtube.com/watch?v=..." --local

# Use a larger model for better accuracy
podscript "https://www.youtube.com/watch?v=..." --local --model medium

# Enable speaker diarization with a HuggingFace token
podscript "https://feeds.example.com/rss" --local --hf-token hf_xxxxx

# Or set the token as an environment variable once
export HF_TOKEN=hf_xxxxx
podscript "https://feeds.example.com/rss" --local

Model Sizes

| Model | Speed | Quality | VRAM |
|---|---|---|---|
| `tiny` | Fastest | Lower | ~1 GB |
| `base` | Fast | Good (default) | ~1 GB |
| `small` | Moderate | Better | ~2 GB |
| `medium` | Slower | Great | ~5 GB |
| `large-v2` | Slowest | Best | ~10 GB |
| `large-v3` | Slowest | Best | ~10 GB |

CPU mode uses int8 quantization automatically. GPU (CUDA) uses float16.
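
For readers using faster-whisper directly, the rule above corresponds roughly to the sketch below. This is an assumption about how such a choice can be expressed, not podscript's actual code. The helper names are hypothetical, while `WhisperModel` and its `device` and `compute_type` arguments come from faster-whisper.

```python
def pick_compute_settings(cuda_available):
    """Map hardware availability to faster-whisper device and compute type."""
    if cuda_available:
        return {"device": "cuda", "compute_type": "float16"}
    return {"device": "cpu", "compute_type": "int8"}


def load_model(model_size="base", cuda_available=False):
    """Load a Whisper model with the settings chosen above."""
    # Imported lazily so the selection logic works without faster-whisper installed.
    from faster_whisper import WhisperModel

    return WhisperModel(model_size, **pick_compute_settings(cuda_available))


if __name__ == "__main__":
    # With torch installed, torch.cuda.is_available() is one way to detect a GPU.
    print(pick_compute_settings(cuda_available=False))
```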

Speaker Diarization

Speaker diarization (identifying who said what) requires a free HuggingFace token:

1. Create an account at huggingface.co
2. Accept the terms for pyannote/speaker-diarization-3.1
3. Create a token at huggingface.co/settings/tokens
4. Pass it via --hf-token or set HF_TOKEN in your environment

Without a token, all speech is attributed to "Speaker 1" — still useful for single-speaker content.
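
The token lookup and the single-speaker fallback can be pictured with the sketch below. It is illustrative only, not podscript's code. The function names are hypothetical, and giving the --hf-token value precedence over HF_TOKEN is an assumption.

```python
import os


def resolve_hf_token(cli_token=None, env=None):
    """Return the token from the flag if given, else from HF_TOKEN, else None."""
    env = os.environ if env is None else env
    return cli_token or env.get("HF_TOKEN") or None


def attribute_speakers(texts, speakers=None):
    """Pair each text with its speaker, or with "Speaker 1" if there are no labels."""
    if speakers is None:
        # No diarization available: attribute everything to a single speaker.
        return [("Speaker 1", text) for text in texts]
    return list(zip(speakers, texts))


if __name__ == "__main__":
    token = resolve_hf_token()
    print("diarization enabled" if token else "no token: single-speaker output")
```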

License

MIT
