# GPT-4o Audio Transcription Test

This notebook tests OpenAI's GPT-4o audio transcription (the `gpt-4o-transcribe` model) on a small set of local audio files.


In [None]:
import os
from pathlib import Path
from openai import OpenAI

# Initialize OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))


## Setup Audio Files

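Place your audio files in a `data/audio` folder relative to the project root. The code below reads that folder as `../data/audio`, so it assumes the notebook runs from a directory one level below the root.

Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm.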
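
As an alternative to maintaining the file list by hand, the supported formats above can be used to discover files automatically. The following is a minimal sketch, not part of the original notebook: `SUPPORTED_EXTENSIONS` and `find_audio_files` are hypothetical names, and the extension set simply mirrors the list above.

```python
from pathlib import Path
from typing import List

# Assumption: mirrors the formats listed in this notebook.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def find_audio_files(directory: Path) -> List[Path]:
    """Return the files in `directory` that have a supported extension, sorted by path."""
    if not directory.is_dir():
        return []
    return sorted(
        p
        for p in directory.iterdir()
        # Compare case-insensitively so "CLIP.M4A" is accepted too.
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


# Same relative path that the next cell uses; yields an empty list if the folder is missing.
discovered = [p.name for p in find_audio_files(Path("../data/audio"))]
print(discovered)
```

The resulting names could be assigned to `audio_files` in place of the hard-coded list.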


In [3]:
# Define the audio files directory
audio_dir = Path("../data/audio")

# List of audio files to transcribe (update with your actual files)
audio_files = ["sample1.m4a", "sample2.m4a"]

# Check if files exist
print("Checking for audio files...")
for audio_file in audio_files:
    file_path = audio_dir / audio_file
    if file_path.exists():
        print(f"✓ Found: {audio_file} ({file_path.stat().st_size / 1024:.2f} KB)")
    else:
        print(f"✗ Missing: {audio_file}")


Checking for audio files...
✓ Found: sample1.m4a (23.76 KB)
✓ Found: sample2.m4a (45.70 KB)


## Transcription Function

Define a function that transcribes a single audio file with the `gpt-4o-transcribe` model and returns a result dictionary.


In [None]:
from typing import Optional


def transcribe_audio(file_path: Path, language: Optional[str] = None) -> dict:
    """
    Transcribe an audio file using GPT-4o's transcription model.

    Args:
        file_path: Path to the audio file
        language: Optional ISO-639-1 language code (e.g., 'en', 'fr', 'es');
            if None, no language hint is given

    Returns:
        Dictionary containing transcription results
    """
    try:
        with open(file_path, "rb") as audio_file:
            # Use GPT-4o transcription model
            transcription = client.audio.transcriptions.create(
                model="gpt-4o-transcribe",
                file=audio_file,
                language=language,
                response_format="json",
            )

        return {
            "success": True,
            "file": file_path.name,
            "text": transcription.text,
        }
    except Exception as e:
        return {"success": False, "file": file_path.name, "error": str(e)}
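
Because `transcribe_audio` relies on a global `client`, it cannot be exercised without an API key. The sketch below shows one possible variant that takes the client as an argument, so the success and error paths can be checked offline. It is an illustration under assumptions, not the notebook's method: `transcribe_with`, `fake_create`, and `fake_client` are hypothetical names, and the fake client only imitates the `audio.transcriptions.create` call shape used above.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Optional


def transcribe_with(client: Any, file_path: Path, language: Optional[str] = None) -> dict:
    """Variant of transcribe_audio that receives the client instead of using a global."""
    request = {"model": "gpt-4o-transcribe", "response_format": "json"}
    # Only send a language hint when the caller provided one.
    if language is not None:
        request["language"] = language
    try:
        with open(file_path, "rb") as audio_file:
            transcription = client.audio.transcriptions.create(file=audio_file, **request)
        return {"success": True, "file": file_path.name, "text": transcription.text}
    except Exception as e:
        return {"success": False, "file": file_path.name, "error": str(e)}


# Hypothetical stand-in for the OpenAI client: records the request, returns canned text.
calls = []


def fake_create(**kwargs):
    calls.append({k: v for k, v in kwargs.items() if k != "file"})
    return SimpleNamespace(text="bonjour")


fake_client = SimpleNamespace(
    audio=SimpleNamespace(transcriptions=SimpleNamespace(create=fake_create))
)

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.m4a"
    sample.write_bytes(b"placeholder bytes, not real audio")
    ok = transcribe_with(fake_client, sample, language="fr")
    missing = transcribe_with(fake_client, Path(tmp) / "nope.m4a")

print(ok)
print(missing["success"])
```

With the real client, the same call would be `transcribe_with(client, audio_dir / "sample1.m4a", language="fr")`.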


## Transcribe All Audio Files

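Process each audio file in turn and collect the result dictionaries in `results`. Missing files are skipped, and failed transcriptions are recorded with their error message.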


In [10]:
results = []

for audio_file in audio_files:
    file_path = audio_dir / audio_file

    if not file_path.exists():
        print(f"⚠️  Skipping {audio_file} (file not found)")
        continue

    print(f"🎤 Transcribing: {audio_file}...")
    result = transcribe_audio(file_path)
    results.append(result)

    if result["success"]:
        # Show at most the first 200 characters so long transcripts stay readable
        print(f"  Preview: {result['text'][:200]}")
    else:
        print(f"✗ Error: {result['error']}")
    print("-" * 80)


🎤 Transcribing: sample1.m4a...
  Preview: Je voudrais partir en Slovénie en hiver avec ma famille.
--------------------------------------------------------------------------------
🎤 Transcribing: sample2.m4a...
  Preview: J'aimerais vivre une expérience incroyable avec des châteaux et des chutes d'eau énormes.
--------------------------------------------------------------------------------
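
Once `results` is populated, it can be summarized and written to disk. The following is a minimal sketch, assuming the result dictionaries have the shape returned by `transcribe_audio`; `summarize_results` and `save_results` are hypothetical helper names, and the failure entry in the example data is invented purely to show the error path.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def summarize_results(results: List[dict]) -> dict:
    """Count successes and list the files that failed."""
    failed = [r["file"] for r in results if not r.get("success")]
    return {
        "total": len(results),
        "succeeded": len(results) - len(failed),
        "failed": failed,
    }


def save_results(results: List[dict], output_path: Path) -> None:
    """Write results as UTF-8 JSON, keeping accented characters readable."""
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(
        json.dumps(results, ensure_ascii=False, indent=2), encoding="utf-8"
    )


example_results = [
    {
        "success": True,
        "file": "sample1.m4a",
        "text": "Je voudrais partir en Slovénie en hiver avec ma famille.",
    },
    # Hypothetical failure entry, included only to show the error path.
    {"success": False, "file": "missing.m4a", "error": "file not found"},
]

print(summarize_results(example_results))

# Written to a temporary folder here; in the notebook any output path would do.
with tempfile.TemporaryDirectory() as tmp:
    save_results(example_results, Path(tmp) / "transcriptions.json")
```

Passing `ensure_ascii=False` keeps characters such as "é" as-is in the file instead of escaping them.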
