# Whisper Audio Transcription Demo

This notebook demonstrates how to transcribe audio files using OpenAI's Whisper model.


## 1. Install Dependencies


In [1]:
# Install required packages.
# Whisper also needs the ffmpeg command-line tool available on your PATH to decode audio.
%pip install -q openai-whisper librosa soundfile


Note: you may need to restart the kernel to use updated packages.


## 2. Import Libraries


In [2]:
import whisper
import os
from pathlib import Path



## 3. Load Whisper Model


In [None]:
# Load Whisper model (options: tiny, base, small, medium, large)
print("Loading Whisper model...")
model = whisper.load_model("base")
print("Model loaded successfully!")


## 4. Define Audio File Paths

**Replace these placeholder paths with your actual audio file paths**


In [None]:
# Define paths to your audio files.
# The entries below are examples; replace each one with a real path on your machine.
audio_files = [
    "/Users/kunal/Code/BiometricProject/sample-070322_original (1).wav",
    "/Users/kunal/Code/BiometricProject/sample-070322_perturbed_opus_final (1).wav",
    "/Users/kunal/Code/BiometricProject/sample-070322_perturbed_amr-wb_final (1).wav",
]

# Optional: Use files from your existing dataset
# Uncomment and modify if you want to use files from your project
# AUDIO_DIR = "/content/drive/MyDrive/adversarial-audio/Normal-Examples/long-signals"
# audio_files = [
#     os.path.join(AUDIO_DIR, "sample-070236.wav"),
#     os.path.join(AUDIO_DIR, "sample-070322.wav"),
#     os.path.join(AUDIO_DIR, "sample-070529.wav")
# ]


## 5. Transcribe Audio Files


In [None]:
print("="*80)
print("Starting transcription of audio files...")
print("="*80)
print()

for i, audio_path in enumerate(audio_files, 1):
    print(f"[{i}/{len(audio_files)}] Processing: {Path(audio_path).name}")
    print("-" * 80)
    
    # Skip paths that do not point to an existing regular file (a directory would also fail here)
    if not os.path.isfile(audio_path):
        print(f"✗ Error: File not found - {audio_path}")
        print()
        continue
    
    try:
        # Transcribe the audio
        result = model.transcribe(audio_path)
        
        # Print the detected language and the transcribed text
        print("✓ Success!")
        print(f"Language: {result.get('language', 'unknown')}")
        print(f"Transcription: \"{result['text'].strip()}\"")
        
    except Exception as e:
        # Report the failure and move on to the next file
        print(f"✗ Error: {e}")
    
    print()

print("="*80)
print("Transcription complete!")
print("="*80)
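Besides `text` and `language`, the dictionary returned by `model.transcribe` also carries a `segments` list whose entries include `start`, `end`, and `text` keys. The sketch below shows one way to turn those segments into timestamped lines. `format_timestamp` and `format_segments` are hypothetical helper names, not part of Whisper, and `sample_result` is a made-up stand-in for a real transcription result.

```python
def format_timestamp(seconds):
    """Format a time offset in seconds as MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    minutes, remainder_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(remainder_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segments(result):
    """Return one '[start -> end] text' line per segment in a transcribe() result."""
    lines = []
    for seg in result.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} -> {end}] {seg['text'].strip()}")
    return lines


# Made-up result used only for illustration; in the notebook, pass the
# dictionary returned by model.transcribe(audio_path) instead.
sample_result = {
    "segments": [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 62.25, "end": 65.0, "text": " Goodbye."},
    ]
}

for line in format_segments(sample_result):
    print(line)
```

Inside the transcription loop, calling `format_segments(result)` after a successful transcription would print the same breakdown for each file.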


---

## Usage Instructions

1. **Install dependencies** by running the first code cell
2. **Import libraries** by running the second code cell
3. **Load Whisper model** by running the third code cell (the first run downloads the model weights, so it may take a minute)
4. **Update audio file paths** in the fourth code cell with your actual audio files
5. **Run transcription** by executing the final code cell

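### Supported Audio Formats

Whisper decodes audio through `ffmpeg`, so any format your `ffmpeg` build can read should work. Common choices:
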
- WAV
- MP3
- M4A
- FLAC
- OGG
- OPUS
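
Instead of listing every path by hand, the `audio_files` list can be built from a folder by filtering on these extensions. The sketch below uses only the standard library. `find_audio_files` and `AUDIO_EXTENSIONS` are hypothetical names introduced here, and the extension set simply mirrors the list above.

```python
from pathlib import Path

# Extensions mirroring the list above; extend this set if your ffmpeg build reads more.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg", ".opus"}


def find_audio_files(directory):
    """Return sorted paths (as strings) of audio files directly inside `directory`."""
    root = Path(directory)
    if not root.is_dir():
        # A missing or non-directory path yields an empty list rather than an error.
        return []
    return sorted(
        str(path)
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )
```

For example, `audio_files = find_audio_files(AUDIO_DIR)` would replace the hand-written list in the fourth code cell, assuming `AUDIO_DIR` points at a folder of recordings. The search is not recursive, so files in subfolders are ignored.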

### Whisper Model Sizes
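- `tiny`: Fastest, least accurate (about 39M parameters)
- `base`: Good balance of speed and accuracy (about 74M parameters; used in this demo)
- `small`: Better accuracy (about 244M parameters)
- `medium`: High accuracy (about 769M parameters)
- `large`: Best accuracy, slowest (about 1550M parameters)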
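
When running on a GPU, the choice of size is often limited by memory. As a rough sketch, the helper below picks the largest model whose approximate VRAM requirement fits a given budget. The VRAM figures are approximations taken from the Whisper README and may differ on your setup. `pick_model` and `MODEL_VRAM_GB` are hypothetical names, not part of the Whisper API.

```python
# Approximate VRAM requirements in GB, ordered from smallest to largest model.
# These are rough figures (assumed from the Whisper README), not guarantees.
MODEL_VRAM_GB = [
    ("tiny", 1),
    ("base", 1),
    ("small", 2),
    ("medium", 5),
    ("large", 10),
]


def pick_model(available_vram_gb):
    """Return the largest model name whose approximate VRAM need fits the budget."""
    choice = "tiny"  # fall back to the smallest model if nothing fits
    for name, required_gb in MODEL_VRAM_GB:
        if required_gb <= available_vram_gb:
            choice = name
    return choice
```

The returned name can be passed straight to `whisper.load_model`, for example `whisper.load_model(pick_model(4))`.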
