# AI Transcription FastAPI - V3 Demo

This notebook demonstrates how to run and interact with V3 of the AI Transcription API directly in a Google Colab environment.

## Understanding the V3 Architecture

The V3 architecture is designed for scalability and separates the application into two main services:

1.  **API Service (`api`)**: A lightweight FastAPI server that accepts requests, manages jobs, and handles caching.
2.  **Worker Service (`worker`)**: A dedicated process that consumes jobs from a queue, performs the heavy lifting of AI transcription, and reports results back.

This separation allows each component to be scaled independently in a production environment. To manage communication, V3 uses **Redis** as a central message broker, state database, and cache.
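
The division of labor between the two services can be sketched with an in-process queue. This is only an illustration under stated assumptions: `queue.Queue` stands in for the Redis Stream, a plain dict stands in for the Redis state store, and `submit_job`, `worker_loop`, and `fake_transcribe` are hypothetical names, not the application's actual code.

```python
import queue
import threading
import uuid

job_queue = queue.Queue()  # stand-in for the Redis Stream (assumption)
job_state = {}             # stand-in for the Redis state store (assumption)


def fake_transcribe(audio_name):
    # Placeholder for the real AI transcription step (hypothetical).
    return f"<transcript of {audio_name}>"


def submit_job(audio_name):
    """API side: record the job, enqueue it, and return immediately."""
    job_id = str(uuid.uuid4())
    job_state[job_id] = {"status": "queued", "result": None}
    job_queue.put((job_id, audio_name))
    return job_id


def worker_loop():
    """Worker side: consume jobs until a None sentinel arrives."""
    while True:
        item = job_queue.get()
        if item is None:
            break
        job_id, audio_name = item
        job_state[job_id]["status"] = "processing"
        result = fake_transcribe(audio_name)
        job_state[job_id] = {"status": "done", "result": result}


worker = threading.Thread(target=worker_loop)
worker.start()
job_id = submit_job("test_audio.wav")
job_queue.put(None)  # tell the worker to stop after the queued job
worker.join()
print(job_state[job_id])
```

The API side never waits for the transcription itself; it only records and enqueues the job, which is what lets the two sides scale independently.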

### Execution Modes: `distributed` vs. `local`

The V3 application supports two execution backends, configured via the `EXECUTION_BACKEND` environment variable:

-   **`distributed` (Default for Production)**: The API and Worker services run as separate processes. The API publishes jobs to a Redis Stream, and one or more independent workers consume jobs from that stream. This is the recommended mode for Docker and production deployments.

-   **`local` (For this Demo)**: In this mode, the API service directly starts a transcription job in a new local process using Python's `multiprocessing`. This avoids the need for a separate Redis server and is suitable for local development or single-machine environments like this Colab notebook.

**For this demonstration, we will use the `local` backend**, because it lets the entire application run inside this single notebook.
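
How an application might select between the two backends can be sketched as follows. This is a minimal sketch, not the project's actual code: `resolve_backend`, `dispatch`, and `run_transcription` are hypothetical names, and the fallback to `distributed` when the variable is unset is an assumption.

```python
import multiprocessing
import os

VALID_BACKENDS = ("distributed", "local")


def resolve_backend(env=None):
    """Read EXECUTION_BACKEND and validate it.

    Falling back to 'distributed' when the variable is unset is an assumption.
    """
    env = os.environ if env is None else env
    backend = env.get("EXECUTION_BACKEND", "distributed").strip().lower()
    if backend not in VALID_BACKENDS:
        raise ValueError(f"Unknown EXECUTION_BACKEND: {backend!r}")
    return backend


def run_transcription(job):
    # Placeholder for the real transcription work (hypothetical).
    print(f"[LocalWorker] processing {job['job_id']}")


def dispatch(job, backend):
    """Route a job according to the selected backend."""
    if backend == "local":
        # Fire-and-forget: run the job in a new local process.
        process = multiprocessing.Process(target=run_transcription, args=(job,))
        process.start()
        return process
    # In 'distributed' mode the job would be published to the Redis Stream
    # instead; that step is omitted because it needs a running Redis server.
    raise NotImplementedError("distributed dispatch is not sketched here")
```

With `EXECUTION_BACKEND` set to `local`, `dispatch(job, resolve_backend())` would start the job in a child process and return without waiting for it to finish.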

## Step 1: Setup and Dependencies

In [None]:
import os

# --- Environment Variables ---
# We must set these before importing the application code

# Use the 'local' backend for this demo
os.environ['EXECUTION_BACKEND'] = 'local'

# Set a dummy API Key
os.environ['API_KEY'] = 'colab-secret-key'

# Set a log level
os.environ['LOG_LEVEL'] = 'INFO'

print('Environment variables set for LOCAL mode.')

In [None]:
# Install all required packages
# Note: This will install PyTorch, which can take a few minutes.
!pip install -r requirements.txt
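
After installation, a quick check can confirm that the modules imported later in this notebook are available. The helper name `missing_packages` is hypothetical, and the list below only covers modules this notebook imports directly, not everything in `requirements.txt`.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules imported in the later steps of this notebook.
required = ["uvicorn", "requests", "numpy", "scipy"]
missing = missing_packages(required)
print("Missing packages:", missing or "none")
```

If the list is not empty, re-running the install cell or restarting the runtime is usually the first thing to try.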

## Step 2: Run the API Server

Next, we run the FastAPI application in the background. We use `uvicorn` to serve the app and a daemon thread from `threading` so the server does not block the notebook.

In [None]:
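import threading

import uvicorn

from main import app


def run_server():
    # uvicorn.run() blocks until the server stops, so it runs off the main thread.
    uvicorn.run(app, host='0.0.0.0', port=8000, log_level='info')


# daemon=True lets the notebook kernel shut down without waiting for the server.
server_thread = threading.Thread(target=run_server, daemon=True)
server_thread.start()

# The thread has only just started; the server may need a moment before it accepts requests.
print('API server is starting in the background on port 8000.')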
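
Because the server starts asynchronously, a request sent immediately afterwards can fail with a connection error. One way to guard against this is to poll the port until it accepts connections. The helper below is a sketch; `wait_for_port` is a hypothetical name, and a successful TCP connection only shows that the port is listening, not that every route is healthy.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not ready yet (for example, connection refused); retry shortly.
            time.sleep(interval)
    return False
```

Calling `wait_for_port('localhost', 8000)` before sending the first request gives the server up to 30 seconds to come up and returns `False` if it never does.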

## Step 3: Send a Transcription Job

With the server running, we can now act as a client and send a request to the `/jobs` endpoint. We'll create a dummy audio file for this purpose.

Since we are in `local` mode, the API will receive the request and start a new process to handle the transcription. You can see the logs from both the API and the `[LocalWorker]` in the notebook's output.

In [None]:
import numpy as np
import requests
import scipy.io.wavfile

# Create a dummy WAV file for testing: a 2-second, 440 Hz sine tone (not silence)
samplerate = 16000  # samples per second
duration = 2  # seconds
frequency = 440.0  # Hz (A4)
# endpoint=False keeps the sample spacing at exactly 1 / samplerate
t = np.linspace(0., duration, int(samplerate * duration), endpoint=False)
amplitude = np.iinfo(np.int16).max * 0.5  # half of full scale, to leave headroom
data = amplitude * np.sin(2. * np.pi * frequency * t)
dummy_audio_path = 'test_audio.wav'
scipy.io.wavfile.write(dummy_audio_path, samplerate, data.astype(np.int16))

print(f'Created dummy audio file: {dummy_audio_path}')

# --- Send Request ---
url = 'http://localhost:8000/jobs'

api_key = os.environ['API_KEY']
headers = {'X-API-Key': api_key}

payload = {
    'model_id': 'distil_large_v3_ptbr',
    'session_id': 'colab-session-123',
    'language': 'pt'
}

try:
    # Open the file in a context manager so the handle is closed after the upload
    with open(dummy_audio_path, 'rb') as audio_file:
        files = {'files': audio_file}
        response = requests.post(url, headers=headers, data=payload, files=files)
    response.raise_for_status()  # Raise an exception for 4xx/5xx status codes

    print(f'API Response Status: {response.status_code}')
    print('API Response Body:')
    print(response.json())

    print('\n---')
    print('Job was dispatched! Check the logs above to see the [LocalWorker] process the job.')
    print('Since local mode is fire-and-forget, we cannot check job status via the API.')

except requests.exceptions.RequestException as e:
    print(f'An error occurred: {e}')
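
If the request is rejected or the transcript looks wrong, it can help to confirm that the generated audio file is what you expect. The sketch below uses only the standard-library `wave` module; `describe_wav` is a hypothetical helper, not part of the application.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        duration = wav_file.getnframes() / rate
    return rate, channels, duration
```

For the file created above, `describe_wav('test_audio.wav')` should report a 16000 Hz mono file of about 2 seconds.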