# Audio Transcription Tool - Google Colab (GPU Accelerated)

This notebook demonstrates how to use the GPU-accelerated audio transcription tool in Google Colab.

**Important:** Make sure you're using a GPU runtime!
- Go to Runtime → Change runtime type → Hardware accelerator → GPU (T4 or A100)


## Check GPU Availability

First, let's verify we have GPU access:

In [None]:
# Check GPU availability
!nvidia-smi
print("\nIf you see GPU information above, you're ready for fast transcription!")
print("If not, go to Runtime → Change runtime type → Hardware accelerator → GPU")
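If you would rather check from Python than read the `nvidia-smi` table by eye, the following is a minimal sketch. The helper name `list_gpus` is hypothetical and is not part of the transcription tool; it only relies on the `nvidia-smi` binary being on the path when a GPU runtime is active.

```python
import shutil
import subprocess

def list_gpus():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        # nvidia-smi is not installed: most likely a CPU-only runtime
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # The driver is present but could not talk to a GPU
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]

gpus = list_gpus()
print("GPU(s):", gpus if gpus else "none found; switch to a GPU runtime")
```

An empty list means the later cells will probably still run, just without GPU acceleration.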

## Setup

Clone the repository and run the GPU-optimized setup script:

In [None]:
# Clone the repository
!git clone https://github.com/your-username/your-repo.git
%cd your-repo/repo-2

In [None]:
# Run the GPU-optimized Colab setup script
!chmod +x colab_setup.sh
!./colab_setup.sh

## Test the Setup

Verify everything is working:

In [None]:
# Test the setup
!python test_setup.py

## Upload Audio Files

Use the file browser on the left to upload your audio files, or use the code below:

In [None]:
from google.colab import files

# Upload audio files
uploaded = files.upload()

# List uploaded files
for filename in uploaded.keys():
    print(f"Uploaded: {filename} ({len(uploaded[filename])} bytes)")
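It can help to confirm that what you uploaded looks like audio before transcribing. This is a rough sketch: the extension set `AUDIO_EXTENSIONS` and the helper `split_by_audio_extension` are assumptions made for illustration, so adjust the set to whatever formats `transcribe.py` actually accepts.

```python
from pathlib import Path

# Assumed list of audio extensions; not taken from the transcription tool itself.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}

def split_by_audio_extension(filenames):
    """Split filenames into (audio, other) based on a case-insensitive extension check."""
    audio, other = [], []
    for name in filenames:
        if Path(name).suffix.lower() in AUDIO_EXTENSIONS:
            audio.append(name)
        else:
            other.append(name)
    return audio, other

# In the notebook you could pass uploaded.keys() instead of this sample list.
audio, other = split_by_audio_extension(["talk.MP3", "notes.txt", "interview.wav"])
print("Audio files:", audio)
print("Skipped:", other)
```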

## Basic Transcription

Transcribe an audio file:

In [None]:
# Replace 'audio.mp3' with your actual filename
!./transcribe.py audio.mp3

## Transcription with Timestamps

In [None]:
# Transcribe with timestamps
!./transcribe.py -t audio.mp3

## GPU-Accelerated Models

With GPU acceleration, the larger models run fast enough to be practical, so you can trade some speed for better accuracy:

In [None]:
# Use the large model - much faster with GPU!
!./transcribe.py -m large -v audio.mp3

In [None]:
# Compare speeds: small vs large model
# Note: the first run of a model may include a one-time download, which inflates its time
import time

print("Testing small model...")
start = time.time()
!./transcribe.py -m small audio.mp3
small_time = time.time() - start

print(f"\nSmall model took: {small_time:.1f} seconds")

print("\nTesting large model...")
start = time.time()
!./transcribe.py -m large audio.mp3
large_time = time.time() - start

print(f"\nLarge model took: {large_time:.1f} seconds")
print(f"Large model took {large_time/small_time:.1f}x as long as the small model")
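The timing pattern above can also be wrapped in a small helper that reports the exit code, so a failed run is not mistaken for a fast one. This is a sketch only: `time_command` is a hypothetical name, and the demo times a trivial Python command rather than the transcription script.

```python
import subprocess
import sys
import time

def time_command(args):
    """Run a command and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    completed = subprocess.run(args)
    return time.perf_counter() - start, completed.returncode

# Demo with a trivial command. In the notebook you would pass something like
# ["./transcribe.py", "-m", "small", "audio.mp3"] instead.
elapsed, code = time_command([sys.executable, "-c", "pass"])
print(f"Finished in {elapsed:.2f}s with exit code {code}")
```

`time.perf_counter()` is used here because it is a monotonic clock intended for measuring intervals.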

## Batch Processing

In [None]:
# Process multiple files and concatenate into one output
!./transcribe.py -b -o combined_transcript.txt *.mp3
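The shell expands `*.mp3` in lexicographic order, so `part10.mp3` sorts before `part2.mp3`. If the order of the combined transcript matters, and assuming the tool processes files in the order they are given (not verified here), a numeric-aware sort is one way to build the file list. The helper `natural_key` is hypothetical.

```python
import re

def natural_key(name):
    """Sort key that compares digit runs as numbers and the rest case-insensitively."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]

names = ["part10.mp3", "part2.mp3", "part1.mp3"]
ordered = sorted(names, key=natural_key)
print("Lexicographic:", sorted(names))
print("Numeric-aware:", ordered)
```

You could then join `ordered` into the command line in place of the `*.mp3` glob.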

## View Results

Display the transcription results:

In [None]:
# List generated transcript files
!ls -la *.txt

In [None]:
# Display a transcript file (replace with your actual filename)
with open('audio.txt', 'r', encoding='utf-8') as f:
    print(f.read())
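For long recordings it may be more convenient to preview the start of a transcript and to get a clear message when the file is missing. The sketch below assumes the transcript is named after the audio file with a `.txt` extension (as `audio.txt` above suggests) and is UTF-8 encoded; both helper names are hypothetical.

```python
from pathlib import Path

def transcript_path_for(audio_file):
    """Guess the transcript filename by swapping the extension for .txt (an assumption)."""
    return Path(audio_file).with_suffix(".txt")

def preview(path, max_chars=500):
    """Return the first max_chars characters of a text file, or None if it is missing."""
    path = Path(path)
    if not path.exists():
        return None
    with path.open("r", encoding="utf-8") as f:
        return f.read(max_chars)

path = transcript_path_for("audio.mp3")
text = preview(path)
print(text if text is not None else f"{path} not found; run a transcription first.")
```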

## Performance Benchmark

Benchmark transcription time across the available model sizes:

In [None]:
# Performance benchmark
import time
import os

def benchmark_model(model_name, audio_file):
    """Benchmark transcription speed for a given model."""
    if not os.path.exists(audio_file):
        print(f"Audio file {audio_file} not found. Please upload an audio file first.")
        return None
    
    print(f"Benchmarking {model_name} model...")
    start_time = time.time()
    
    # Run transcription (quote the filename so names with spaces still work)
    !./transcribe.py -m {model_name} -v "{audio_file}"
    
    end_time = time.time()
    duration = end_time - start_time
    
    print(f"{model_name} model completed in {duration:.1f} seconds")
    return duration

# Benchmark different models (replace 'audio.mp3' with your file)
audio_file = 'audio.mp3'  # Change this to your uploaded file

if os.path.exists(audio_file):
    models = ['tiny', 'base', 'small', 'medium', 'large']
    times = {}
    
    for model in models:
        times[model] = benchmark_model(model, audio_file)
        print()
    
    # Show results
    print("=== BENCHMARK RESULTS ===")
    for model, duration in times.items():
        if duration is not None:
            print(f"{model:8s}: {duration:6.1f}s")
else:
    print("Please upload an audio file first to run the benchmark.")
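Raw seconds are easier to compare when expressed relative to the fastest run. The following sketch takes a dictionary shaped like `times` above and adds a slowdown ratio; `summarize` is a hypothetical helper, and the numbers in the demo are made up purely to show the output format.

```python
def summarize(times):
    """Return rows of (model, seconds, slowdown relative to the fastest finished run)."""
    finished = {model: t for model, t in times.items() if t is not None}
    if not finished:
        return []
    fastest = min(finished.values())
    return [(model, t, t / fastest) for model, t in finished.items()]

# Placeholder values for illustration only; these are not real measurements.
example_times = {"tiny": 4.0, "small": 8.0, "medium": None, "large": 20.0}
for model, seconds, ratio in summarize(example_times):
    print(f"{model:8s}: {seconds:6.1f}s  ({ratio:.1f}x the fastest)")
```

Models whose benchmark returned `None` are left out of the table instead of being shown as zero.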

## GPU Memory Usage

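Check GPU memory before and after a transcription run. Because `transcribe.py` runs as a separate process, its GPU memory is released when it exits, so the second reading shows what is still allocated afterwards, not the peak during transcription: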

In [None]:
# GPU memory before transcription (used, total in MiB)
!nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits

print("\nRunning transcription with large model...")
!./transcribe.py -m large audio.mp3

print("\nGPU memory after transcription:")
!nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
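With `--format=csv,noheader,nounits`, each GPU is reported as one line of the form `used, total`, in MiB. If you capture that output, a line can be turned into numbers as sketched below. `parse_memory_line` is a hypothetical helper and the sample line is illustrative, not a real reading.

```python
def parse_memory_line(line):
    """Parse one 'used, total' line from the nvidia-smi query into two integers (MiB)."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total

sample = "1024, 15360"  # illustrative values, not a measurement
used, total = parse_memory_line(sample)
print(f"{used} MiB of {total} MiB in use ({100 * used / total:.0f}%)")
```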

## Download Results

Download the transcript files to your computer:

In [None]:
from google.colab import files
import glob

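# Download every .txt file in the current directory
# (this also picks up any non-transcript .txt files that happen to be there)
for txt_file in glob.glob('*.txt'):
    files.download(txt_file)
    print(f"Download started: {txt_file}")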
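When there are many transcripts, bundling them into one archive avoids a separate browser download per file. This is a minimal sketch using the standard library; `zip_transcripts` and the archive name `transcripts.zip` are hypothetical choices.

```python
import glob
import os
import zipfile

def zip_transcripts(pattern="*.txt", archive_name="transcripts.zip"):
    """Write every file matching pattern into a zip archive and return the paths added."""
    paths = sorted(glob.glob(pattern))
    with zipfile.ZipFile(archive_name, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            # Store only the filename so the archive has no directory prefix
            archive.write(path, arcname=os.path.basename(path))
    return paths

# In Colab you might then run:
#   zip_transcripts()
#   files.download("transcripts.zip")
```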