# Whisper Transcription (HTTP-only) in OpenShift

This notebook wraps a **minimal HTTP flow** for sending local audio files to your **OpenAI-compatible Whisper** service in OpenShift.

**What you get**
- Clear configuration section
- Small helper to list audio files
- A `requests`-based function that posts to `/v1/audio/transcriptions`
- Simple ipywidgets UI to pick, play, and transcribe files

> **Scope**: This version intentionally uses only the raw HTTP approach (no OpenAI SDK).

## Prerequisites
- Run this notebook somewhere the service's DNS name (e.g., `*.svc.cluster.local`) resolves and is reachable, typically a workbench or pod inside the cluster.
- Your Whisper service should implement `POST /v1/audio/transcriptions` (OpenAI-compatible).
- Put a few test audio files in the configured folder (default: `/opt/app-root/src/audio_data/`).
- If your endpoint requires auth, set an environment variable `OPENAI_API_KEY` before running.
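
A quick way to check the first prerequisite is a plain TCP connection test before any audio is sent. The sketch below is an illustration only: `can_reach` is a hypothetical helper name, not part of the notebook, and a successful connection shows that the port is open, not that the Whisper API behind it is healthy.

```python
import socket
from urllib.parse import urlparse

def can_reach(url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    if not parsed.hostname:
        return False  # not a usable URL
    # Fall back to the scheme's default port when the URL does not name one
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts
        return False
```

For example, `can_reach(WHISPER_HOST)` can be run right after the config cell; `False` usually points at DNS, the port, or a network policy rather than at the model.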

In [None]:
# Optional: install dependencies if needed
# %pip install --quiet --upgrade requests ipywidgets
# JupyterLab also needs the jupyterlab-widgets package (a browser refresh or kernel restart may be required):
# %pip install --quiet jupyterlab-widgets ipywidgets
# Classic Notebook only (this is a shell command, so it takes "!" rather than "%"):
# !jupyter nbextension enable --py widgetsnbextension

## 1) Configuration
Update these values to match your environment and directory layout.
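
If the host changes between environments, one option is to read it from the environment and join it to the path defensively. This is a sketch under assumptions: the `WHISPER_HOST` environment variable and the `build_transcription_url` helper are hypothetical and do not appear in the config cell, which hard-codes the value.

```python
import os

def build_transcription_url(host: str) -> str:
    """Join a base URL and the transcription path without doubling the slash."""
    return host.rstrip("/") + "/v1/audio/transcriptions"

# Assumption: WHISPER_HOST may be set as an environment variable;
# otherwise fall back to the notebook's default service URL.
default_host = "http://whisper-large-v3-predictor.whisper-proj.svc.cluster.local:8080"
host = os.environ.get("WHISPER_HOST", default_host)
api_url = build_transcription_url(host)
print(api_url)
```

Stripping the trailing slash keeps a host written as `http://host:8080/` from producing `//v1/...`, which some gateways reject.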

In [None]:
# === CONFIG ===
from pathlib import Path
import os

# Whisper endpoint (adjust port/scheme to match your service)
WHISPER_HOST = "http://whisper-large-v3-predictor.whisper-proj.svc.cluster.local:8080"
WHISPER_API = f"{WHISPER_HOST}/v1/audio/transcriptions"

# Model + auth (if any)
WHISPER_MODEL = "whisper-large-v3"
# Optional auth: set via env var OPENAI_API_KEY (or assign a token string here)
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "")

# Local audio directory (where your samples are)
LOCAL_AUDIO_DIR = Path("/opt/app-root/src/audio_data/")  # change to your path

# Audio extensions to show in UI
AUDIO_EXTS = (".wav", ".mp3", ".flac", ".ogg", ".m4a", ".aac")

## 2) Imports & Utilities
We keep helpers small and focused.

In [None]:
# === IMPORTS & UTILITIES ===
import requests
from typing import List
from IPython.display import Audio

def list_local_audio_files(directory: Path, exts=AUDIO_EXTS) -> List[Path]:
    """Return the audio files directly inside `directory` (non-recursive), sorted by path."""
    directory = Path(directory)
    return sorted(p for p in directory.glob("*") if p.is_file() and p.suffix.lower() in exts)

## 3) Transcription function (HTTP via `requests`)
Sends a **multipart/form-data** POST to the Whisper endpoint. Returns the transcript text when available.
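
To make the request format concrete, the sketch below hand-assembles a body of the same general shape using only the standard library. It is illustrative rather than what the notebook runs: `requests` builds the real body itself, and `build_multipart`, the filename `sample.wav`, and the placeholder bytes are all made up for the example.

```python
import uuid

def build_multipart(fields: dict, filename: str, file_bytes: bytes):
    """Hand-assemble a multipart/form-data body; returns (content_type, body)."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields, e.g. the model name
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n")
    # The file part carries a filename and its own Content-Type
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        'Content-Type: application/octet-stream\r\n\r\n'
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    # Closing delimiter ends the body
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)

# Placeholder bytes stand in for real audio data
content_type, body = build_multipart({"model": "whisper-large-v3"}, "sample.wav", b"RIFF....")
print(content_type)
print(body.decode("latin-1"))
```

The boundary in the `Content-Type` header is why that header should not be set by hand when passing `files=` to `requests`: the library generates the boundary and the header together.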

In [None]:
def transcribe_with_whisper(
    local_audio_path: Path,
    model_name: str = WHISPER_MODEL,
    api_url: str = WHISPER_API,
    api_key: "str | None" = OPENAI_API_KEY,  # quoted so the cell also runs on Python < 3.10
    timeout: int = 180,
):
    """Send a local file to an OpenAI-compatible Whisper endpoint and return transcript text (or raw JSON)."""
    headers = {}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    
    # Use a context manager to avoid leaving the file handle open
    with open(local_audio_path, "rb") as f:
        files = {"file": (local_audio_path.name, f, "application/octet-stream")}
        data = {"model": model_name}
        resp = requests.post(api_url, headers=headers, files=files, data=data, timeout=timeout)

    resp.raise_for_status()
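    js = resp.json()
    # Check the type first: a list or string payload would break the key lookups below
    if isinstance(js, dict):
        if "text" in js:
            return js["text"]
        choices = js.get("choices")
        if isinstance(choices, list) and choices and isinstance(choices[0], dict) and "text" in choices[0]:
            return choices[0]["text"]
    return js  # fallback if the server returns a different shape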

## 4) UI — Pick, play, and transcribe
Use **Refresh** to scan the folder, **Play** to preview, and **Transcribe** to send the file to Whisper.

In [None]:
# === UI WIDGETS ===
from IPython.display import display, clear_output
import ipywidgets as widgets

# Inputs
dir_text = widgets.Text(value=str(LOCAL_AUDIO_DIR), description="Folder:", layout=widgets.Layout(width="60%"))
refresh_btn = widgets.Button(description="Refresh", icon="refresh")
file_dd = widgets.Dropdown(options=[], description="File:", layout=widgets.Layout(width="70%"))

# Actions
play_btn = widgets.Button(description="Play", icon="play")
transcribe_btn = widgets.Button(description="Transcribe", icon="microphone")

# Outputs
status_out = widgets.Output()
audio_out = widgets.Output()
text_out = widgets.Output()

def refresh_files(_=None):
    folder = Path(dir_text.value).expanduser()
    files = list_local_audio_files(folder)
    file_dd.options = files
    with status_out:
        clear_output(wait=True)
        if files:
            print(f"Found {len(files)} audio file(s) in {folder}")
        else:
            print(f"No audio files found in {folder}")

def play_audio(_=None):
    sel = file_dd.value
    if not sel:
        return
    with audio_out:
        clear_output(wait=True)
        display(Audio(filename=str(sel), autoplay=False))

def run_transcription(_=None):
    sel = file_dd.value
    if not sel:
        return
    with status_out:
        clear_output(wait=True)
        print(f"Transcribing: {Path(sel).name}")
    try:
        txt = transcribe_with_whisper(Path(sel))
        with text_out:
            clear_output(wait=True)
            print("=== Transcript ===")
            print(txt if isinstance(txt, str) else str(txt))
        with status_out:
            clear_output(wait=True)
            print("Done.")
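    except Exception as e:
        # Update the status line too, so it does not stay on "Transcribing: ..."
        with status_out:
            clear_output(wait=True)
            print("Failed.")
        with text_out:
            clear_output(wait=True)
            print("Transcription failed:", e)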

refresh_btn.on_click(refresh_files)
play_btn.on_click(play_audio)
transcribe_btn.on_click(run_transcription)

# Render the UI
display(widgets.HBox([dir_text, refresh_btn]))
display(file_dd)
display(widgets.HBox([play_btn, transcribe_btn]))
display(status_out, audio_out, text_out)

# initial file load
refresh_files()

## 5) (Optional) Quick smoke test
Runs transcription on the currently selected file (or the first file in the folder if nothing is selected).

In [None]:
# Requires the UI cell above to have run (it defines file_dd)
test_file = file_dd.value or next(iter(list_local_audio_files(LOCAL_AUDIO_DIR)), None)
if test_file:
    print(f"Transcribing (smoke test): {Path(test_file).name}")
    try:
        txt = transcribe_with_whisper(Path(test_file))
        print("=== Transcript ===")
        print(txt if isinstance(txt, str) else str(txt))
    except Exception as e:
        print("Transcription failed:", e)
else:
    print("No audio file found for smoke test.")

## 6) Troubleshooting
- **Port & reachability**: Confirm that the service listens on the port set in `WHISPER_HOST` (`:8080` by default) and that this notebook can reach `whisper-...svc.cluster.local`.
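- **Auth**: If your gateway requires a token, set `OPENAI_API_KEY` before starting the kernel (or assign it in the config cell), then re-run the config cell and the cell that defines `transcribe_with_whisper`. The function captures its default arguments when it is defined, so re-running the config cell alone does not update them.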
- **Server errors**: Consider printing `resp.text` on non-2xx responses inside `transcribe_with_whisper` for more detail.
- **Large files**: Increase `timeout` in `transcribe_with_whisper` as needed.
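
The **Server errors** tip can be sketched as a small helper that summarizes a failed response. `describe_failure` is a hypothetical name, and the stand-in object below, including its made-up error body, only mimics the `status_code` and `text` attributes of a `requests.Response` so the example runs without a server.

```python
from types import SimpleNamespace

def describe_failure(resp, max_chars: int = 500):
    """Return a short description of a non-2xx response, or None on success."""
    if 200 <= resp.status_code < 300:
        return None
    body = (resp.text or "").strip()
    if len(body) > max_chars:
        body = body[:max_chars] + "..."  # keep long error pages readable
    return f"HTTP {resp.status_code}: {body or '<empty body>'}"

# Stand-in for a requests.Response; the error body is invented for illustration
fake = SimpleNamespace(status_code=422, text='{"detail": "model not found"}')
print(describe_failure(fake))
```

Inside `transcribe_with_whisper`, a call like this could go just before `resp.raise_for_status()`, printing the result when it is not `None`.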