# Transcription with the OpenAI Whisper Model

## Prompt

The user's description of the desired program: Write a Python script: install and import necessary modules for choosing audio files and playing them, then transcribe the audio file with OpenAI's Whisper model using a large multilingual, and last download the transcript.


## Installing the libraries

In [1]:
!pip install -q openai-whisper
!pip install -q gradio
!pip install -q pydub


## Importing the modules

In [2]:
import whisper
import gradio as gr
from pydub import AudioSegment
import os
import tempfile


## Loading the transcription model

In [3]:
import torch

# Load the Whisper model on the GPU when one is available, otherwise on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("large-v3", device=device)




In [8]:
import torch
print("GPU available:", torch.cuda.is_available())
print("Device name:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU")


GPU available: True
Device name: Tesla T4
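
The GPU check above matters because the larger Whisper checkpoints need a fair amount of video memory. As a rough sketch of how a notebook could fall back to a smaller multilingual model on a weaker GPU, the helper below maps free VRAM to a model name. The function name `pick_model_name` and the table `APPROX_VRAM_GB` are hypothetical, and the thresholds are the approximate figures listed in the Whisper README, not values measured in this notebook.

```python
# Approximate VRAM needs in GB, taken from the Whisper README.
# These are assumptions for illustration, not measurements from this notebook.
APPROX_VRAM_GB = [
    ("large-v3", 10),
    ("medium", 5),
    ("small", 2),
    ("base", 1),
]


def pick_model_name(free_vram_gb: float) -> str:
    """Return the largest multilingual model whose approximate need fits."""
    for name, needed_gb in APPROX_VRAM_GB:
        if free_vram_gb >= needed_gb:
            return name
    # Nothing fits the listed thresholds: fall back to the smallest model.
    return "tiny"


print(pick_model_name(15.0))
print(pick_model_name(4.0))
```

On a machine with a GPU, the argument could be derived from `torch.cuda.get_device_properties(0).total_memory / 1024**3`, and the returned name passed to `whisper.load_model`.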


## Transcription function

In [15]:
def transcribe_audio(audio_file):
    """Transcribe an audio file with Whisper; return the text and the path to a .txt copy."""

    temp_audio_path = None
    temp_transcript_path = None
    try:
        # Convert the audio to 16 kHz mono WAV
        audio = AudioSegment.from_file(audio_file)
        audio = audio.set_frame_rate(16000)
        audio = audio.set_channels(1)
        with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as temp_audio_file:
            temp_audio_path = temp_audio_file.name
        audio.export(temp_audio_path, format="wav")

        # Transcribe the audio; use half precision only when running on the GPU
        result = model.transcribe(temp_audio_path, fp16=(device == "cuda"))
        transcript = result["text"]

        # Write the transcript to a temporary file for download
        with tempfile.NamedTemporaryFile(
            suffix=".txt", delete=False, mode="w", encoding="utf-8"
        ) as temp_transcript_file:
            temp_transcript_file.write(transcript)
            temp_transcript_path = temp_transcript_file.name

        return transcript, temp_transcript_path  # Transcript text and file path
    except Exception as e:
        # Remove a partially written transcript file if something failed
        if temp_transcript_path and os.path.exists(temp_transcript_path):
            os.remove(temp_transcript_path)
        return f"Error: {e}", None
    finally:
        # Always remove the temporary audio file
        if temp_audio_path and os.path.exists(temp_audio_path):
            os.remove(temp_audio_path)
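
The function above keeps only `result["text"]`. The dictionary returned by `model.transcribe` also contains a `segments` list, where each entry carries `start` and `end` times in seconds and the segment `text`. A minimal sketch of turning that list into a timestamped transcript follows; the helper names `format_timestamp` and `segments_to_text` are hypothetical, and the demo segments are made-up placeholders rather than real Whisper output.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_text(segments) -> str:
    """Build one '[start --> end] text' line per segment."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} --> {end}] {seg['text'].strip()}")
    return "\n".join(lines)


# Placeholder segments shaped like entries of result["segments"]
demo_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello."},
    {"start": 2.5, "end": 5.0, "text": " Second sentence."},
]
print(segments_to_text(demo_segments))
```

Inside `transcribe_audio`, the string returned by `segments_to_text(result["segments"])` could be written to the download file in place of the plain text.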


## User interface

In [None]:
# Create Gradio interface
iface = gr.Interface(
    fn=transcribe_audio,
    inputs=gr.Audio(type="filepath"),
    outputs=[
        gr.Textbox(label="Transcript"),
        gr.File(label="Download Transcript"),
    ],
    title="Audio Transcription",
    description="Transcribe audio using OpenAI's Whisper model.",
)

iface.launch(debug=True)


Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
* Running on public URL: https://8002757e7fd990d39e.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
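
The transcript is written with `tempfile.NamedTemporaryFile`, so the downloadable file gets a random name. As a sketch of one alternative, the hypothetical helper `write_named_transcript` below writes the text into a fresh temporary directory under a name derived from the audio file, on the assumption that the download keeps the file's base name.

```python
import os
import tempfile
from pathlib import Path


def write_named_transcript(audio_path: str, transcript: str) -> str:
    """Write the transcript to '<audio stem>.txt' in a new temporary directory."""
    out_dir = tempfile.mkdtemp(prefix="transcript_")
    out_path = os.path.join(out_dir, Path(audio_path).stem + ".txt")
    # UTF-8 keeps non-ASCII characters from multilingual transcripts intact
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(transcript)
    return out_path


path = write_named_transcript("/tmp/interview.mp3", "Dober dan")
print(os.path.basename(path))
```

The returned path could be handed to the `gr.File` output as the second return value of `transcribe_audio`.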

