# Podcast transcription with Whisper

This notebook demonstrates how to transcribe a podcast or other audio file to text using OpenAI’s Whisper model.  
It also shows how to export the transcription to a `.txt` file for further use (e.g. Readlang, subtitles, text analysis).


In [9]:
# Step 1: Set up FFmpeg path
import os

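# Add FFmpeg's bin directory to the system PATH (adjust the path to your own installation).
# os.pathsep is ";" on Windows and ":" elsewhere, so the separator is not hard-coded.
os.environ["PATH"] = r"D:\FileHistory\ffmpeg-master-latest-win64-gpl-shared\bin" + os.pathsep + os.environ["PATH"]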
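
Re-running the cell above prepends the same directory to `PATH` each time. A minimal sketch of an idempotent variant follows; `prepend_to_path` is a hypothetical helper name (not part of Whisper or FFmpeg), and the directory names in the demo are placeholders. The final line uses `shutil.which` to report whether an `ffmpeg` executable can be found.

```python
import os
import shutil


def prepend_to_path(directory, env=None):
    """Put `directory` first on PATH unless it is already listed.

    `env` is any dict-like mapping; it defaults to os.environ.
    (Hypothetical helper, for illustration only. The comparison is an
    exact string match, so differently cased paths count as different.)
    """
    if env is None:
        env = os.environ
    entries = [e for e in env.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        entries.insert(0, directory)
    env["PATH"] = os.pathsep.join(entries)
    return env["PATH"]


# Demonstrate on a throwaway mapping so the real environment is untouched.
demo_env = {"PATH": os.pathsep.join(["first", "second"])}
prepend_to_path("ffmpeg_bin", demo_env)
prepend_to_path("ffmpeg_bin", demo_env)  # second call adds nothing
print(demo_env["PATH"].split(os.pathsep))

# shutil.which returns the executable's full path, or None if it is not on PATH.
print("ffmpeg found at:", shutil.which("ffmpeg"))
```

In the notebook itself, the call would be `prepend_to_path(...)` with your own FFmpeg `bin` directory and no second argument, so that `os.environ` is updated.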

## Load the Whisper Model

Whisper is a general-purpose speech recognition model.  
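We use the "base" model here, one of the smaller checkpoints: it loads and runs quickly, at the cost of some accuracy compared with the larger "small", "medium", and "large" models.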


In [7]:
# Step 2: Import Whisper and load the model
import whisper

print("Loading model...")
model = whisper.load_model("base")
print("Model loaded successfully.")


Loading model...
Model loaded successfully.


## Transcribe the Audio File

When no language is specified, Whisper detects the spoken language automatically and transcribes the speech to text.
Whisper decodes audio through FFmpeg, so any format FFmpeg can read (such as `.mp3` or `.wav`) should work.
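
Checking the path before transcribing can make a mistyped file name easier to diagnose than an error raised during decoding. The sketch below is one way to do that; `check_audio_path` and its list of expected extensions are assumptions made for illustration, not part of Whisper.

```python
from pathlib import Path


def check_audio_path(path, expected_suffixes=(".mp3", ".wav", ".m4a", ".flac")):
    """Return a Path for an existing file, or raise FileNotFoundError.

    Hypothetical helper: the suffix list is only a hint, since FFmpeg
    can read many more formats than the ones named here.
    """
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Audio file not found: {p}")
    if p.suffix.lower() not in expected_suffixes:
        print(f"Note: unexpected extension {p.suffix!r}; FFmpeg may still read it.")
    return p


# Intended use in the transcription cell (requires a real file):
# audio_path = check_audio_path(audio_path)
# result = model.transcribe(str(audio_path))
```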


In [None]:
# Step 3: Transcribe the audio file (adjust the path to your file)
audio_path = r"D:\FileHistory\Trump vs. China.mp3"
print(f"Transcribing {audio_path}...")

result = model.transcribe(audio_path)

# Display the transcription in the notebook
print("\nTranscription:\n")
print(result["text"])


In [11]:
# Step 4: Export transcript to a text file
with open("transcript.txt", "w", encoding="utf-8") as f:
    f.write(result["text"])

print("Transcript saved as transcript.txt")


Transcript saved as transcript.txt
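
Since subtitles are one of the intended uses, here is a hedged sketch of turning timestamped segments into SubRip (`.srt`) text. It assumes `result["segments"]` is a list of dicts with `start` and `end` times in seconds and a `text` string, as returned by Whisper's `transcribe`. `format_timestamp` and `segments_to_srt` are hypothetical helper names, and the sample segments are made up so the block runs without a model or an audio file.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from dicts that have 'start', 'end', and 'text' keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each cue: counter, time range, text, then a blank separator line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Made-up segments standing in for result["segments"].
sample_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
    {"start": 2.5, "end": 3661.042, "text": " Thanks for listening."},
]
print(segments_to_srt(sample_segments))
```

With a real transcription, `segments_to_srt(result["segments"])` could be written to a file such as `transcript.srt` in the same way as the plain-text export above.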
