A local, open-source alternative to Auphonic for podcast and audio post-production. CastPolish runs entirely on your Mac — no cloud, no subscription, no audio ever leaves your machine.
- Audio processing — highpass filter, compression, limiting, and two-pass EBU R128 loudness normalization (targets −16 LUFS, podcast standard)
- Noise reduction — optional ffmpeg afftdn FFT noise reduction (checkbox in UI)
- Transcription — faster-whisper with word-level timestamps, exported as HTML, WebVTT captions, and Auphonic-compatible JSON
- AI shownotes — chapter titles, long summary, brief summary, and suggested tags via Ollama (local LLM, no API key needed)
- Speaker diarization — optional, via pyannote.audio (requires HuggingFace token)
- macOS app — one-click .app launcher with Dock icon
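The two-pass loudness normalization mentioned in the feature list can be sketched with ffmpeg's loudnorm filter: a first pass measures the file and prints JSON statistics, and a second pass feeds those measurements back in. This is a minimal illustration, not CastPolish's actual code. The helper names, the true-peak ceiling, the loudness-range target, and the sample log are all assumptions; only the −16 LUFS target comes from the list above.

```python
import json
import re

TARGET_LUFS = -16.0     # target stated in the feature list
TRUE_PEAK_DB = -1.5     # assumed ceiling, not taken from CastPolish
LOUDNESS_RANGE = 11.0   # assumed LRA target, not taken from CastPolish


def first_pass_filter(target=TARGET_LUFS):
    """Filter string for the measurement pass; ffmpeg prints JSON stats to stderr."""
    return (
        f"loudnorm=I={target}:TP={TRUE_PEAK_DB}:LRA={LOUDNESS_RANGE}"
        ":print_format=json"
    )


def parse_measurements(stderr_text):
    """Pull the JSON object that loudnorm prints at the end of the ffmpeg log."""
    match = re.search(r'\{[^{}]*"input_i"[^{}]*\}', stderr_text)
    if match is None:
        raise ValueError("no loudnorm JSON found in ffmpeg output")
    return json.loads(match.group(0))


def second_pass_filter(stats, target=TARGET_LUFS):
    """Filter string for the correction pass, using first-pass measurements."""
    return (
        f"loudnorm=I={target}:TP={TRUE_PEAK_DB}:LRA={LOUDNESS_RANGE}"
        f":measured_I={stats['input_i']}:measured_TP={stats['input_tp']}"
        f":measured_LRA={stats['input_lra']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}:linear=true"
    )


# Made-up example of what the first pass might print to stderr.
sample_log = """
[Parsed_loudnorm_0 @ 0x0]
{
    "input_i" : "-23.10",
    "input_tp" : "-4.20",
    "input_lra" : "6.30",
    "input_thresh" : "-33.50",
    "output_i" : "-16.02",
    "output_tp" : "-1.50",
    "output_lra" : "6.00",
    "output_thresh" : "-26.40",
    "normalization_type" : "dynamic",
    "target_offset" : "0.02"
}
"""

stats = parse_measurements(sample_log)
print(first_pass_filter())
print(second_pass_filter(stats))
```

Each string would be passed to ffmpeg with -af: the first pass typically writes to a null output (-f null -) because only the measurements matter, and the second pass writes the normalized file.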
| Dependency | Install |
|---|---|
| Python 3.10+ | python.org |
| ffmpeg | brew install ffmpeg |
| Ollama (optional, for shownotes) | ollama.com |
# 1. Clone the repo
git clone https://github.com/abc3-Mac/castpolish.git
cd castpolish
# 2. Install Python dependencies
python3 castpolish.py setup
# 3. Start the server
python3 castpolish.py serve
# 4. Open http://localhost:8765 in your browser
Build a native .app with a Dock icon that launches the server with a double-click:
python3 create_macos_app.py
Then drag CastPolish.app to your Applications folder or Dock.
The app bundles castpolish.py internally and works from any location. It does not require the source folder to remain in place after the app is built.
python3 castpolish.py serve
python3 castpolish.py serve --port 9000
python3 castpolish.py serve --output-dir ~/Desktop/podcast-output
python3 castpolish.py process "episode.mp3"
# Option values:
#   --model         tiny | base | small | medium | large-v2
#   --format        mp3 or mp4 (aac)
#   --task          transcribe | translate (translate → English)
#   --lufs          target loudness in LUFS
#   --denoise       enable ffmpeg afftdn noise reduction
#   --no-normalize  skip loudness normalization
python3 castpolish.py process "episode.mp3" \
  --model small \
  --format mp3 \
  --language en \
  --task transcribe \
  --lufs -16 \
  --title "Episode Title" \
  --output-dir ~/my-output \
  --denoise \
  --no-normalize
for f in ~/Podcasts/*.mp3; do
python3 /path/to/castpolish.py process "$f" --model small
done
Each processed file gets its own folder named after the audio file:
~/CastPolish-output/
my-episode/
my-episode.mp3 # processed audio
my-episode.html # transcript with chapters, summaries, audio player
my-episode.vtt # WebVTT captions (YouTube-compatible)
my-episode.json # Auphonic-compatible transcript JSON
If you process the same file twice, folders are named my-episode-01, my-episode-02, etc.
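The numbered-folder behaviour described above could be implemented along these lines. This is a sketch under assumptions: the function name next_output_dir is hypothetical, and CastPolish's real logic may differ in details such as how it handles gaps in the numbering.

```python
import tempfile
from pathlib import Path


def next_output_dir(root, stem):
    """Return root/stem, or root/stem-01, root/stem-02, ... if earlier ones exist."""
    candidate = Path(root) / stem
    counter = 0
    while candidate.exists():
        counter += 1
        candidate = Path(root) / f"{stem}-{counter:02d}"
    return candidate


# Simulate processing the same file three times in a throwaway directory.
names = []
with tempfile.TemporaryDirectory() as root:
    for _ in range(3):
        out = next_output_dir(root, "my-episode")
        out.mkdir()
        names.append(out.name)

print(names)  # ['my-episode', 'my-episode-01', 'my-episode-02']
```

Picking the first free name, rather than overwriting, keeps earlier results intact when you re-run with different settings.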
For AI shownotes, install Ollama and pull a model:
brew install ollama
ollama pull llama3.2
Then enable shownotes in the web UI Settings tab, or pass --title "Episode Title" on the CLI. CastPolish will generate:
- Chapter titles with timestamps
- Long summary (~700 tokens)
- Brief summary (~300 tokens)
- Suggested tags
Ollama runs locally — no data leaves your machine.
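To illustrate what a local shownotes request might look like, here is a minimal sketch that talks to Ollama's HTTP API on its default port using only the standard library. The prompt wording and the function names are hypothetical, and the 300-token cap simply mirrors the brief-summary size listed above; the prompts CastPolish actually sends are not shown in this README.

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # Ollama's default local address


def build_summary_request(transcript, model="llama3.2", max_tokens=300):
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = {
        "model": model,
        # Hypothetical prompt, for illustration only.
        "prompt": "Summarize this podcast transcript briefly:\n\n" + transcript,
        "stream": False,                         # one JSON reply instead of chunks
        "options": {"num_predict": max_tokens},  # cap on generated tokens
    }
    return urllib.request.Request(
        OLLAMA_HOST + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(transcript, **kwargs):
    """Send the request to the local Ollama server and return the generated text."""
    request = build_summary_request(transcript, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling summarize("...") requires the Ollama server to be running and the model to have been pulled. The request goes to localhost only.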
Speaker diarization requires a free HuggingFace account and acceptance of the pyannote model license:
pip install pyannote.audio torch
Enter your HuggingFace token in Settings and enable diarization in the UI.
| Hardware | tiny | small | medium |
|---|---|---|---|
| Apple Silicon (M1/M2/M3) | ~0.1× realtime | ~0.3× realtime | ~0.8× realtime |
| Intel Mac (last gen) | ~0.5× realtime | ~1.5× realtime | ~4× realtime |
"0.3× realtime" means a 60-minute file takes ~18 minutes. Apple Silicon uses Core ML acceleration.
For batch jobs on Intel, --model small is the best balance of speed and accuracy.
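The realtime factors translate into wall-clock estimates by simple multiplication. The sketch below copies the approximate figures from the table; the dictionary keys and function name are made up for illustration, and real timings will vary with the machine and the audio.

```python
# Approximate realtime factors from the table above (processing time / audio time).
FACTORS = {
    ("apple_silicon", "tiny"): 0.1,
    ("apple_silicon", "small"): 0.3,
    ("apple_silicon", "medium"): 0.8,
    ("intel", "tiny"): 0.5,
    ("intel", "small"): 1.5,
    ("intel", "medium"): 4.0,
}


def estimated_minutes(audio_minutes, hardware, model):
    """Rough processing time in minutes for a file of the given length."""
    return audio_minutes * FACTORS[(hardware, model)]


# A 60-minute episode with the small model:
print(round(estimated_minutes(60, "apple_silicon", "small"), 1))  # 18.0
print(round(estimated_minutes(60, "intel", "small"), 1))          # 90.0
```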
Settings are saved to ~/.castpolish/config.json. You can also configure via the Settings tab in the web UI:
- Output directory
- Default Whisper model
- Default audio format
- Target loudness (LUFS)
- Ollama host and model
- HuggingFace token
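If you want to read these settings from your own scripts, something like the following would work. Note that the key names and default values here are assumptions for illustration, since this README does not document the schema of config.json; check the file on your machine for the real keys.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".castpolish" / "config.json"

# Hypothetical key names and defaults; the real schema may differ.
DEFAULTS = {
    "output_dir": str(Path.home() / "CastPolish-output"),
    "model": "small",
    "format": "mp3",
    "lufs": -16,
    "ollama_host": "http://localhost:11434",
    "ollama_model": "llama3.2",
    "hf_token": "",
}


def load_config(path=CONFIG_PATH):
    """Return the defaults overlaid with whatever the config file contains."""
    config = dict(DEFAULTS)
    path = Path(path)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config
```

Overlaying the file on top of defaults means a partial or missing config file still yields a complete settings dictionary.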
MIT — see LICENSE.
- faster-whisper — OpenAI Whisper via CTranslate2
- ffmpeg — audio processing and noise reduction
- Ollama — local LLM inference
- Flask — local web server
- Inspired by Auphonic

