# Caption Podcast Episodes

After downloading the podcast episodes into an organized directory in [lesson 1](./1-download-podcasts.ipynb), I use OpenAI's Whisper model to transcribe each episode and save the text next to its audio file.

> 🌧 Quick note! I use the "base" Whisper model because it runs on my local machine. Larger Whisper models can produce higher-quality transcripts, at the cost of more memory and slower transcription. See [this list of Whisper models](https://github.com/openai/whisper?tab=readme-ov-file#available-models-and-languages) and pick what works best for you.

In [None]:
# Install the latest version from GitHub
!pip install git+https://github.com/openai/whisper.git

In [2]:
import whisper
from pathlib import Path

model = whisper.load_model("base")
media_dir = Path("media")
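
The transcription loop in the next cell is safe to re-run: it skips any episode folder that already contains a `transcript.txt`. As a minimal sketch of just that skip logic, the snippet below uses only the standard library so it can be tried without loading a model. The helper name `pending_episodes` and the demo folder names are hypothetical and are not part of this notebook or of Whisper.

```python
import tempfile
from pathlib import Path
from typing import Iterator


def pending_episodes(media_dir: Path, transcript_name: str = "transcript.txt") -> Iterator[Path]:
    """Yield each .mp3 under media_dir whose folder has no transcript yet.

    Hypothetical helper for illustration only.
    """
    for mp3_path in sorted(media_dir.rglob("*.mp3")):
        if not (mp3_path.parent / transcript_name).exists():
            yield mp3_path


# Demo on a throwaway directory with made-up episode names.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    for name in ("episode-a", "episode-b"):
        (demo_dir / name).mkdir()
        (demo_dir / name / f"{name}.mp3").touch()
    # Pretend episode-a was already transcribed on an earlier run.
    (demo_dir / "episode-a" / "transcript.txt").write_text("done", encoding="utf-8")

    pending_names = [p.name for p in pending_episodes(demo_dir)]
    print(pending_names)  # ['episode-b.mp3']
```

Checking for the transcript file before doing any work means an interrupted run can be restarted without repeating transcriptions that already finished.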

In [33]:
# rglob() already searches recursively, so no "**/" prefix is needed.
for mp3_path in media_dir.rglob("*.mp3"):
    episode_dir = mp3_path.parent
    transcript_path = episode_dir / "transcript.txt"
    # Skip episodes transcribed on a previous run.
    if transcript_path.exists():
        continue
    print(f":: {mp3_path}")
    print("   - Begin transcription")
    transcript = model.transcribe(mp3_path.as_posix())
    print("   - End transcription")
    # Save only the plain text of the transcription result.
    transcript_path.write_text(transcript["text"], encoding="utf-8")
    print(f"   - Transcription written to {transcript_path}")

:: media\2019-01-13 Blessed In Poverty and Mourning\2019-01-13 Blessed In Poverty and Mourning.mp3
   - Begin transcription
   - End transcription
   - Transcription written to media\2019-01-13 Blessed In Poverty and Mourning\transcript.txt
:: media\2019-01-20 Blessed The Meek\2019-01-20 Blessed The Meek.mp3
   - Begin transcription
   - End transcription
   - Transcription written to media\2019-01-20 Blessed The Meek\transcript.txt
:: media\2019-01-27 Blessed In Hunger and Thirst\2019-01-27 Blessed In Hunger and Thirst.mp3
   - Begin transcription
   - End transcription
   - Transcription written to media\2019-01-27 Blessed In Hunger and Thirst\transcript.txt
:: media\2019-02-10 Blessed The Merciful\2019-02-10 Blessed The Merciful.mp3
   - Begin transcription
   - End transcription
   - Transcription written to media\2019-02-10 Blessed The Merciful\transcript.txt
:: media\2019-02-24 Blessed The Pure in Heart\2019-02-24 Blessed The Pure in Heart.mp3
   - Begin transcription
   - End tr