# Easy Whisper Transcriber

A notebook pipeline for downloading audio and transcribing it with OpenAI Whisper. It fetches audio from YouTube or Google Drive, runs the transcription, and writes subtitles in SRT and VTT formats.

⚠️ Performance Recommendation ⚠️

Whisper runs far faster on a GPU than on a CPU. Before running the cells below, make sure the runtime has a GPU attached (for example, a T4).
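
A quick way to confirm which GPU, if any, the runtime exposes is to query `nvidia-smi`. The sketch below is only illustrative: `detect_gpu_name` is a hypothetical helper, and it assumes an NVIDIA driver that ships the `nvidia-smi` tool. It returns `None` when the tool is missing or fails.

```python
import shutil
import subprocess


def detect_gpu_name():
    """Return the first GPU name reported by nvidia-smi, or None if unavailable."""
    # Hypothetical helper: assumes NVIDIA tooling; other GPU vendors are not covered.
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    names = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return names[0] if names else None


gpu_name = detect_gpu_name()
print(gpu_name or "No NVIDIA GPU detected; Whisper will run on the CPU.")
```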

### Install OpenAI Whisper

Installs the core Whisper library from OpenAI. This package provides the neural network models for high-accuracy speech-to-text conversion.

In [None]:
!pip install -qU openai-whisper

### Install setuptools-rust

Only needed when pip cannot find a pre-built wheel for `tiktoken`, Whisper's tokenizer dependency, and has to build its Rust extension from source. On platforms where a wheel is available, this install has no effect on transcription.

In [None]:
!pip install -qU setuptools-rust

### Install gdown

A utility for downloading files and folders from public Google Drive shares, which is useful for fetching external audio datasets.

In [None]:
!pip install -qU gdown

### Install yt-dlp

A command-line downloader that fetches media from YouTube and many other video platforms and can extract the audio track.

In [None]:
!pip install -qU yt-dlp

### Set up Audio and Subtitle Directories

Creates the local folders `audio` and `subtitles` (relative to the working directory) to keep project files organized throughout the session.

In [None]:
from pathlib import Path
audio_dir = Path("audio")
subtitle_dir = Path("subtitles")

audio_dir.mkdir(parents=True, exist_ok=True)
subtitle_dir.mkdir(parents=True, exist_ok=True)

### Download Audio from YouTube

Prompts for a YouTube URL and uses `yt-dlp` to extract the audio as a 192 kbps MP3, saved to the `audio` directory. Audio extraction relies on `ffmpeg` being available on the system.

In [None]:
import subprocess
from pathlib import Path

url = input("Enter YouTube video URL: ").strip()

if not url:
    raise ValueError("You must provide a valid YouTube URL.")

command = [
    "yt-dlp",
    "-x",
    "--audio-format", "mp3",
    "--audio-quality", "192K",
    "-o", str(audio_dir / "%(title)s.%(ext)s"),
    url
]

try:
    process = subprocess.run(
        command,
        capture_output=True,
        text=True,
        check=True
    )
    print("Download completed successfully.")
    print(process.stdout)

except subprocess.CalledProcessError as e:
    print("An error occurred while downloading the video:")
    print(e.stderr)
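
The cell above only rejects an empty string, so a mistyped URL is reported by `yt-dlp` itself. As a rough sketch, the input could be checked up front. `looks_like_youtube_url` and the host list are assumptions made for illustration, and the check is deliberately stricter than `yt-dlp`, which also accepts many non-YouTube sites.

```python
from urllib.parse import urlparse

# Assumed set of YouTube hostnames; extend it if other domains are needed.
YOUTUBE_HOSTS = {
    "youtube.com",
    "www.youtube.com",
    "m.youtube.com",
    "music.youtube.com",
    "youtu.be",
}


def looks_like_youtube_url(text: str) -> bool:
    """Return True if text is an http(s) URL whose host is a known YouTube domain."""
    parsed = urlparse(text.strip())
    if parsed.scheme not in ("http", "https"):
        return False
    # urlparse lowercases the hostname; it is None when no host is present.
    return (parsed.hostname or "") in YOUTUBE_HOSTS


print(looks_like_youtube_url("https://youtu.be/abc123"))
print(looks_like_youtube_url("not a url"))
```

Calling this before building the command would let the cell raise its `ValueError` for malformed input rather than waiting for the downloader to fail.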

### Download Files/Folders from Google Drive

Downloads audio from publicly shared Google Drive items by ID, either a single file or an entire folder. If both IDs are entered, the folder ID takes precedence.

In [None]:
import os
import subprocess
from pathlib import Path

audio_dir = Path("audio")

file_id = input("File ID (press Enter if not applicable): ").strip() or None
folder_id = input("Folder ID (press Enter if not applicable): ").strip() or None

if not file_id and not folder_id:
    raise ValueError("You must provide either a file_id or a folder_id.")

# gdown treats an --output value that ends with a path separator as a directory,
# so single files keep their original names inside audio/.
command = ["gdown", "--output", os.path.join(str(audio_dir), "")]

if folder_id:
    command.extend(["--folder", folder_id])
else:
    command.append(file_id)

try:
    process = subprocess.run(
        command,
        capture_output=True,
        text=True,
        check=True
    )
    print(process.stdout)

except subprocess.CalledProcessError as e:
    print("An error occurred while executing the command:")
    print(e.stderr)
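
Drive usually hands out share links rather than bare IDs. The following sketch pulls the ID out of the common link shapes so it can be pasted into the prompts above. `extract_drive_id` is a hypothetical helper, and the URL patterns are assumptions based on typical share links, not an exhaustive list.

```python
import re

# Assumed link shapes: .../file/d/<id>/..., .../folders/<id>, and ...?id=<id>
_PATH_ID = re.compile(r"/(file/d|folders)/([A-Za-z0-9_-]+)")
_QUERY_ID = re.compile(r"[?&]id=([A-Za-z0-9_-]+)")


def extract_drive_id(text: str):
    """Return (kind, id), where kind is 'file', 'folder', or 'unknown'."""
    text = text.strip()
    match = _PATH_ID.search(text)
    if match:
        kind = "folder" if match.group(1) == "folders" else "file"
        return kind, match.group(2)
    match = _QUERY_ID.search(text)
    if match:
        return "file", match.group(1)
    # No recognizable pattern: assume the input is already a bare ID.
    return "unknown", text


print(extract_drive_id("https://drive.google.com/file/d/1AbC_dEf-123/view?usp=sharing"))
print(extract_drive_id("https://drive.google.com/drive/folders/XYZ789?usp=sharing"))
```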

### Select Whisper Model

Choose a model size that fits your hardware: larger models are more accurate but slower and need more GPU memory. Pressing Enter selects `turbo`, which offers a good balance of speed and accuracy.

In [None]:
import whisper

models = [
    "tiny",
    "base",
    "small",
    "medium",
    "large",
    "turbo",
]

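model_name = input(f"Select Whisper model ({', '.join(models)}) [turbo]: ").strip() or "turbo"

# Fail early with a readable message instead of an error from inside whisper.
if model_name not in models:
    raise ValueError(f"Unknown model '{model_name}'. Choose one of: {', '.join(models)}")

model = whisper.load_model(model_name)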
print(f"Loaded Whisper model: {model_name}")
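
To make "based on your resources" more concrete, here is a small sketch that suggests a model from the amount of GPU memory available. The memory figures are approximate values taken from the Whisper README and should be treated as assumptions that may change between releases. `suggest_model` is a hypothetical helper, not part of Whisper.

```python
# Approximate VRAM needs in GB, ordered from smallest to largest (assumed values).
MODELS_BY_SIZE = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("turbo", 6),
    ("large", 10),
]


def suggest_model(available_vram_gb: float, prefer: str = "turbo") -> str:
    """Return `prefer` if it fits, otherwise the largest model that does."""
    needs = dict(MODELS_BY_SIZE)
    if needs[prefer] <= available_vram_gb:
        return prefer
    fitting = [name for name, need in MODELS_BY_SIZE if need <= available_vram_gb]
    # Fall back to the smallest model when nothing fits the stated budget.
    return fitting[-1] if fitting else "tiny"


print(suggest_model(15))  # a T4-class GPU has roughly this much memory
print(suggest_model(4))
```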

### Transcribe MP3 Files

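Transcribes every MP3 found under `audio` (including subfolders) with the model selected in the previous step. For each recording it writes an `.srt` and a `.vtt` subtitle file to `subtitles`, so the output works with most video players and web players.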

In [None]:
import whisper
from whisper.utils import get_writer
from pathlib import Path

# Reuse the `model` loaded in the "Select Whisper Model" cell; reloading "base" here would discard that choice.

files_mp3 = audio_dir.glob("**/*.mp3")

srt_writer = get_writer("srt", str(subtitle_dir))
vtt_writer = get_writer("vtt", str(subtitle_dir))

for mp3 in sorted(files_mp3):
    print(f"Transcribing: {mp3.name}...")

    result = model.transcribe(str(mp3), verbose=False)

    srt_writer(result, str(mp3))
    vtt_writer(result, str(mp3))

    print(f"Finished: {mp3.stem}.srt and {mp3.stem}.vtt")
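
For readers curious about what the two writers produce, the sketch below renders Whisper-style segments as SRT text by hand. It is an illustration, not the library's writer: `format_timestamp` and `segments_to_srt` are hypothetical helpers, and the demo segments are made up. The only structural assumption is that each segment is a dict with `start`, `end` (in seconds), and `text`, as in `result["segments"]`. SRT uses a comma before the milliseconds, while VTT uses a period.

```python
def format_timestamp(seconds: float, decimal_marker: str) -> str:
    """Format seconds as HH:MM:SS<marker>mmm (',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render a list of segment dicts as numbered SRT cues."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Made-up segments for demonstration only.
demo_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 3661.5, "text": " Goodbye."},
]
srt_text = segments_to_srt(demo_segments)
print(srt_text)
```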

### Compress and Export Results

Packages all generated subtitle files from the `subtitles` directory into a single `subtitles.zip` archive in the working directory, ready to download.

In [None]:
import zipfile
import os
from pathlib import Path

folder_path = subtitle_dir
output_zip_path = Path(f"{subtitle_dir.name}.zip")

if not folder_path.is_dir():
    print(f"Error: The path '{folder_path}' is not a valid directory.")
else:
    if output_zip_path.suffix.lower() != ".zip":
        output_zip_path = output_zip_path.with_suffix(".zip")

    try:
        with zipfile.ZipFile(output_zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
            for root, dirs, files in os.walk(folder_path):
                for file in files:
                    file_path = Path(root) / file
                    zipf.write(
                        file_path,
                        arcname=file_path.relative_to(folder_path)
                    )

        print(f"Folder '{folder_path}' successfully compressed to '{output_zip_path}'.")

    except Exception as e:
        print(f"An error occurred during zipping: {e}")
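
As a more compact alternative, the standard library's `shutil.make_archive` can build the same kind of archive in one call, and `zipfile` can list its contents as a sanity check. This is a sketch run against a throwaway temporary folder: `zip_folder` and `archived_files` are hypothetical helper names, and the demo files are placeholders.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_folder(folder: Path, archive_stem: Path) -> Path:
    """Zip the contents of `folder` into `<archive_stem>.zip` and return its path."""
    return Path(shutil.make_archive(str(archive_stem), "zip", root_dir=str(folder)))


def archived_files(zip_path: Path):
    """Return the sorted file entries of the archive, skipping directory entries."""
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(name for name in zf.namelist() if not name.endswith("/"))


# Demo on a temporary folder so the real `subtitles` directory is untouched.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "subtitles"
    (demo_dir / "nested").mkdir(parents=True)
    (demo_dir / "a.srt").write_text("demo")
    (demo_dir / "nested" / "b.vtt").write_text("demo")
    archive = zip_folder(demo_dir, Path(tmp) / "subtitles")
    names = archived_files(archive)

print(names)
```

Keeping the archive outside the folder being zipped avoids the archive trying to include itself.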