<a href="https://colab.research.google.com/github/LaansDole/whisperX-FastAPI/blob/main/notebooks/whisperx_fastapi_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# WhisperX FastAPI on Google Colab

This notebook sets up and runs the WhisperX FastAPI project on Google Colab, utilizing its GPU for speech-to-text processing. The API service is exposed through a Cloudflare tunnel to allow external access.

## Features

- Speech-to-text transcription
- Audio alignment
- Speaker diarization
- Combined services

## Requirements

- Google Colab with GPU runtime
- Hugging Face token for model access
- Cloudflare account (free tier works fine)

## Setup Instructions

1. Make sure you're running this notebook with GPU runtime
2. Execute each cell in order
3. Use the Cloudflare tunnel URL to access the API

Let's start by checking if we have GPU access and setting up the environment.

In [None]:
# Keep this tab alive to prevent Colab from disconnecting you { display-mode: "form" }

#@markdown Press play on the music player that will appear below:
%%html
<audio src="https://oobabooga.github.io/silence.m4a" controls>

## 1. Install System Dependencies

First, we need to install the required system packages and utilities.

In [1]:
# Install ffmpeg for audio/video processing
!apt-get update && apt-get install -y ffmpeg

# Install git and other utilities
!apt-get install -y git curl wget

0% [Working]            Hit:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Hit:3 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:4 https://r2u.stat.illinois.edu/ubuntu jammy InRelease
Hit:5 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Hit:6 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Hit:7 https://ppa.launchpadcontent.net/ubuntugis/ppa/ubuntu jammy InRelease
Hit:8 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:9 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:10 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Reading package lists... Done
W: Skipping acquire of configured file 'main/source/Sources' as repository 'https://r2u.stat.illinois.edu/ubuntu jammy InRelease' does not seem to provide it (sources.list entry misspelt?)
Reading package lists... Done
Building depe

## 2. Clone the WhisperX FastAPI Repository

In [2]:
# Clone the repository
!git clone https://github.com/pavelzbornik/whisperX-FastAPI.git
!cd whisperX-FastAPI && ls -la

fatal: destination path 'whisperX-FastAPI' already exists and is not an empty directory.
total 40312
drwxr-xr-x 10 root root     4096 Jun 23 15:33 .
drwxr-xr-x  1 root root     4096 Jun 23 15:28 ..
drwxr-xr-x  5 root root     4096 Jun 23 15:29 app
-rwxr-xr-x  1 root root 41164185 Jun 17 12:46 cloudflared
drwxr-xr-x  2 root root     4096 Jun 23 15:25 .devcontainer
-rw-r--r--  1 root root      531 Jun 23 15:25 docker-compose.yml
-rw-r--r--  1 root root     1627 Jun 23 15:25 dockerfile
-rw-r--r--  1 root root      331 Jun 23 15:25 .dockerignore
-rw-r--r--  1 root root      143 Jun 23 15:28 .env
drwxr-xr-x  8 root root     4096 Jun 23 15:25 .git
drwxr-xr-x  3 root root     4096 Jun 23 15:25 .github
-rw-r--r--  1 root root       52 Jun 23 15:25 .gitignore
-rw-r--r--  1 root root      207 Jun 23 15:25 .gitleaks.toml
-rw-r--r--  1 root root     1070 Jun 23 15:25 LICENSE
-rw-r--r--  1 root root       39 Jun 23 15:25 .markdownlint.json
-rw-r--r--  1 root root     1963 Jun 23 15:25 .pre-commit-c

## 3. Create Python Virtual Environment and Install Dependencies

We'll install PyTorch with CUDA support and all required dependencies.

In [None]:
# Install PyTorch with CUDA support
!cd whisperX-FastAPI && pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118

# Install project requirements
!cd whisperX-FastAPI && pip install -r requirements/dev.txt

# Install additional packages for Colab environment
!cd whisperX-FastAPI && pip install colorlog pyngrok python-dotenv

## 4. Set Up Environment Variables

Configure the required environment variables for WhisperX. You'll need to enter your Hugging Face API token to access the models.

In [None]:
import os

# Enter your Hugging Face token here
HF_TOKEN = input("Enter your Hugging Face token: ")

# Choose Whisper model size
WHISPER_MODEL = input("Enter Whisper model size (default: tiny): ") or "tiny"

# Set log level
LOG_LEVEL = "INFO"

# Create .env file
env_content = f"""HF_TOKEN={HF_TOKEN}
WHISPER_MODEL={WHISPER_MODEL}
LOG_LEVEL={LOG_LEVEL}
DEVICE=cuda
COMPUTE_TYPE=float16
DB_URL=sqlite:///records.db
"""

with open("whisperX-FastAPI/.env", "w") as f:
    f.write(env_content)

print("Environment configuration completed.")

## 5. Install and Configure Cloudflare Tunnel

We'll use Cloudflare tunnels to expose the FastAPI service to the internet, allowing you to access it from your browser or other clients.

In [5]:
# Download and install cloudflared
!wget https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -O cloudflared
!chmod +x cloudflared

print("Cloudflare tunnel client installed.")

cloudflared: Text file busy
Cloudflare tunnel client installed.


## 6. Start the FastAPI Service

Now we'll run the FastAPI application in the background and expose it through the Cloudflare tunnel.

In [1]:
import subprocess
import os
import time
import requests
import signal

# Function to kill any process using a specific port
def kill_port(port):
    """Kill any process using the specified port"""
    try:
        print(f"Checking for processes using port {port}...")
        # Find processes using the port
        result = subprocess.run(
            ["lsof", "-ti", f":{port}"],
            capture_output=True,
            text=True
        )

        if result.returncode == 0 and result.stdout.strip():
            pids = result.stdout.strip().split('\n')
            for pid in pids:
                if pid:
                    print(f"Killing process {pid} using port {port}")
                    try:
                        os.kill(int(pid), signal.SIGTERM)
                        time.sleep(1)  # Give it a moment to terminate gracefully
                        # If still running, force kill
                        try:
                            os.kill(int(pid), signal.SIGKILL)
                        except ProcessLookupError:
                            pass  # Process already terminated
                    except (ProcessLookupError, ValueError):
                        print(f"Process {pid} not found or invalid PID")
            print(f"All processes on port {port} have been terminated")
        else:
            print(f"No processes found using port {port}")

    except Exception as e:
        print(f"Error checking port {port}: {e}")

# Function to start FastAPI server in a background process
def start_fastapi_background():
    print("Attempting to start FastAPI server in background using subprocess...")
    try:
        # Kill any existing processes on port 8000
        kill_port(8000)

        # Check if we're already in the whisperX-FastAPI directory
        current_dir = os.path.basename(os.getcwd())
        if current_dir != "whisperX-FastAPI":
            os.chdir("whisperX-FastAPI")
            print(f"Changed directory to whisperX-FastAPI")
        else:
            print("Already in whisperX-FastAPI directory")

        # Define the command to run (using port 8000 instead of 8001)
        # Use 'exec' to replace the current shell process with uvicorn,
        # and run in the background with '&'
        # Ensure the virtual environment is sourced
        command = "uvicorn app.main:app --host 0.0.0.0 --port 8000 --log-config app/uvicorn_log_conf.yaml --log-level info &"

        # Start the subprocess
        # We don't need to capture stdout/stderr if we want it to run truly in background
        # If we need to debug, we might temporarily remove '&' and check output
        process = subprocess.Popen(command, shell=True, preexec_fn=os.setsid)

        print(f"FastAPI background process started with PID: {process.pid}")
        return process

    except Exception as e:
        print(f"Error starting FastAPI background process: {e}")
        return None

# Start the background process
fastapi_process = start_fastapi_background()

# Give the server a moment to start
time.sleep(30)

# Now check if the server is reachable
global server_url
server_url = "http://0.0.0.0:8000"
print(f"Attempting to connect to FastAPI server at {server_url}...")

try:
    response = requests.get(server_url, timeout=30)
    print(f"Successfully connected to {server_url}. Status code: {response.status_code}")
    if response.status_code in [200, 404, 422]:
        print("FastAPI server appears to be running.")
    else:
        print("Received unexpected status code. Server might be running but not as expected.")

except requests.exceptions.ConnectionError:
    print(f"Error: Could not connect to {server_url}. The server might not be running or is not accessible.")
except requests.exceptions.Timeout:
    print(f"Error: Request to {server_url} timed out.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")

# Store the process globally if needed for later shutdown
globals()['fastapi_background_process'] = fastapi_process

Attempting to start FastAPI server in background using subprocess...
Checking for processes using port 8000...
No processes found using port 8000
Changed directory to whisperX-FastAPI
FastAPI background process started with PID: 15032
Attempting to connect to FastAPI server at http://0.0.0.0:8000...
Successfully connected to http://0.0.0.0:8000. Status code: 200
FastAPI server appears to be running.


In [4]:
import re
import subprocess
import time
import threading
import IPython.display
from queue import Queue, Empty

def stream_output(process, output_queue):
    """Read output from process and put it in the queue"""
    for line in iter(process.stdout.readline, ''):
        if not line:
            break
        output_queue.put(line)
    process.stdout.close()

# Start cloudflared in a more controlled way
print("Starting Cloudflare tunnel...")
output_queue = Queue()

# Force kill any existing cloudflared processes that might be running
try:
    subprocess.run(['pkill', '-f', 'cloudflared'], check=False)
    time.sleep(1)
except Exception:
    pass

# Run cloudflared with proper error handling
try:
    cloudflared_proc = subprocess.Popen(
        ['./cloudflared', 'tunnel', '--url', server_url, '--no-autoupdate'],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1
    )

    # Store the process for later cleanup
    globals()['cloudflared_proc'] = cloudflared_proc

    # Start thread to read output without blocking
    output_thread = threading.Thread(
        target=stream_output,
        args=(cloudflared_proc, output_queue)
    )
    output_thread.daemon = True
    output_thread.start()

    # Wait for the tunnel URL to appear
    tunnel_url = None
    start_time = time.time()
    timeout = 30  # 30 seconds timeout

    print("Waiting for tunnel URL (this may take a few seconds)...")

    while time.time() - start_time < timeout:
        try:
            line = output_queue.get(timeout=0.5)
            print(line.strip())

            match = re.search(r'(https://.*\.trycloudflare\.com)', line)
            if match:
                tunnel_url = match.group(1)
                break
        except Empty:
            pass

    if tunnel_url:
        print(f"\n✅ Public URL for WhisperX-FastAPI:\n{tunnel_url}")

        # Display a clickable link
        display(IPython.display.HTML(f'<a href="{tunnel_url}/docs" target="_blank">Open API Documentation</a>'))

        time.sleep(15)
        # Check if the tunnel is working
        import requests
        try:
            health_check = requests.get(f"{tunnel_url}/health", timeout=15)
            if health_check.status_code == 200:
                print("✅ API is accessible through the tunnel")
            else:
                print(f"⚠️ API responded with status code: {health_check.status_code}")
        except Exception as e:
            print(f"⚠️ Could not connect to API through tunnel: {str(e)}")
    else:
        print("❌ Could not find public Cloudflare URL within timeout period.")
        print("You may need to restart this cell.")

except Exception as e:
    print(f"Error starting Cloudflare tunnel: {str(e)}")
    if 'cloudflared_proc' in globals() and globals()['cloudflared_proc']:
        globals()['cloudflared_proc'].terminate()

Starting Cloudflare tunnel...
Waiting for tunnel URL (this may take a few seconds)...
2025-06-23T16:16:30Z INF Thank you for trying Cloudflare Tunnel. Doing so, without a Cloudflare account, is a quick way to experiment and try it out. However, be aware that these account-less Tunnels have no uptime guarantee, are subject to the Cloudflare Online Services Terms of Use (https://www.cloudflare.com/website-terms/), and Cloudflare reserves the right to investigate your use of Tunnels for violations of such terms. If you intend to use Tunnels in production you should use a pre-created named tunnel by following: https://developers.cloudflare.com/cloudflare-one/connections/connect-apps
2025-06-23T16:16:30Z INF Requesting new quick Tunnel on trycloudflare.com...
2025-06-23T16:16:34Z INF +--------------------------------------------------------------------------------------------+
2025-06-23T16:16:34Z INF |  Your quick Tunnel has been created! Visit it at (it may take some time to be reachable)

✅ API is accessible through the tunnel


## 10. Shutdown Services

When you're done, use this cell to properly shut down the services and free up resources.

In [4]:
# Function to shut down services
def shutdown_services():
    global fastapi_process, cloudflared_proc

    print("Shutting down services...")

    # Terminate Cloudflare tunnel
    if 'cloudflared_proc' in globals() and cloudflared_proc:
        print("Stopping Cloudflare tunnel...")
        try:
            # Send SIGTERM to the process
            cloudflared_proc.terminate()

            # Wait for up to 5 seconds for graceful termination
            for _ in range(5):
                if cloudflared_proc.poll() is not None:
                    break
                time.sleep(1)

            # If still running, force kill
            if cloudflared_proc.poll() is None:
                cloudflared_proc.kill()
                cloudflared_proc.wait()

            print("Cloudflare tunnel stopped.")
        except Exception as e:
            print(f"Error stopping Cloudflare tunnel: {e}")

    # Terminate FastAPI server
    if 'fastapi_process' in globals() and fastapi_process:
        print("Stopping FastAPI server...")
        try:
            import os
            import signal

            # Use process group ID to kill all child processes
            os.killpg(os.getpgid(fastapi_process.pid), signal.SIGTERM)

            # Wait for up to 5 seconds for graceful termination
            for _ in range(5):
                if fastapi_process.poll() is not None:
                    break
                time.sleep(1)

            # If still running, force kill
            if fastapi_process.poll() is None:
                os.killpg(os.getpgid(fastapi_process.pid), signal.SIGKILL)

            print("FastAPI server stopped.")
        except Exception as e:
            print(f"Error stopping FastAPI server: {e}")

    print("All services have been shut down.")

# For manual shutdown, uncomment and run the following line:
shutdown_services()

Shutting down services...
Stopping Cloudflare tunnel...
Cloudflare tunnel stopped.
Stopping FastAPI server...
Error stopping FastAPI server: [Errno 3] No such process
All services have been shut down.


## 11. Cleanup

Finally, clean up any temporary files and free up GPU memory.

In [None]:
# Clean up temporary files
!rm -f cloudflared-linux-amd64.deb

# Free up GPU memory (if any is still in use)
import torch
if torch.cuda.is_available():
    torch.cuda.empty_cache()
    print("GPU memory cleared.")

print("Cleanup completed. You can now close this notebook or run it again if needed.")