<a href="https://colab.research.google.com/github/venu72561-chinnam/AI-ML-project/blob/main/(WHISPER)YouTube_Subtitle_Generator.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!apt-get install -y ffmpeg
!pip install yt_dlp openai-whisper gradio torch

import subprocess

import gradio as gr
import whisper

model = whisper.load_model("small")  # Use "base" or "tiny" if GPU memory is limited

def format_timestamp(seconds: float) -> str:
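    # Work in integer milliseconds so float truncation cannot drop a millisecond
    # (e.g. 1.001 s would otherwise be written as 00:00:01,000).
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, milliseconds = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{milliseconds:03}"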

def write_srt(result, filename="subtitles.srt"):
    with open(filename, "w", encoding="utf-8") as f:
        for i, seg in enumerate(result["segments"], start=1):
            start = format_timestamp(seg["start"])
            end = format_timestamp(seg["end"])
            text = seg["text"].strip()
            f.write(f"{i}\n{start} --> {end}\n{text}\n\n")
    return filename

def generate_subtitles(youtube_url):
    try:
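        # Download the audio track with yt-dlp. Passing an argument list to
        # subprocess avoids shell quoting problems with "&" and "?" in URLs,
        # and check=True raises if the download fails.
        # --force-overwrites keeps a previous audio.mp3 from being reused.
        audio_file = "audio.mp3"
        subprocess.run(
            ["yt-dlp", "-x", "--audio-format", "mp3", "--force-overwrites",
             "-o", audio_file, youtube_url],
            check=True,
        )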

        # Transcribe using Whisper
        result = model.transcribe(audio_file)
        srt_path = write_srt(result, "subtitles.srt")

        return result["text"], srt_path

    except Exception as e:
        return f"⚠️ Error: {str(e)}", None

gr.Interface(
    fn=generate_subtitles,
    inputs=gr.Textbox(label="Enter YouTube URL"),
    outputs=[gr.Textbox(label="Transcript"), gr.File(label="Download Subtitles (.srt)")],
    title="YouTube Subtitle Generator",
    description="Paste a YouTube video URL to get a transcript and a downloadable .srt subtitle file."
).launch()

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
ffmpeg is already the newest version (7:4.4.2-0ubuntu0.22.04.1).
0 upgraded, 0 newly installed, 0 to remove and 38 not upgraded.
Collecting yt_dlp
  Downloading yt_dlp-2025.10.22-py3-none-any.whl.metadata (176 kB)
Collecting openai-whisper
  Downloading openai_whisper-20250625.tar.gz (803 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Downloading yt_dlp-2025.10.22-py3-none-any.whl (3.2 MB)
Buil

100%|███████████████████████████████████████| 461M/461M [00:16<00:00, 29.6MiB/s]


It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://bee41503b8aa748237.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
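
The two side-effect-free steps of the cell above, building the yt-dlp command line and rendering Whisper-style segments as SRT text, can be checked offline without downloading a video or loading a model. The following is a minimal sketch, not the notebook's own code: `build_ytdlp_command`, `srt_timestamp`, and `segments_to_srt` are hypothetical helper names, the segments are hand-written dicts assumed to have the same `start`/`end`/`text` keys as `result["segments"]`, and the `--` end-of-options marker and the `audio.%(ext)s` output template are assumptions about how one might want to call yt-dlp.

```python
def build_ytdlp_command(url: str, out_template: str = "audio.%(ext)s") -> list:
    """Return an argv list that asks yt-dlp to extract audio as MP3 (hypothetical helper)."""
    return [
        "yt-dlp",
        "-x",                     # extract the audio track only
        "--audio-format", "mp3",  # convert the audio to MP3 (needs ffmpeg)
        "-o", out_template,       # output template; yt-dlp substitutes the extension
        "--",                     # assumption: end of options, so the URL is never parsed as a flag
        url,
    ]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, rounded to the nearest millisecond."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"


def segments_to_srt(segments) -> str:
    """Render a list of {'start', 'end', 'text'} dicts as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Blocks are separated by one blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hand-written stand-ins for Whisper segments (illustrative data only).
    demo_segments = [
        {"start": 0.0, "end": 1.5, "text": " Hello there."},
        {"start": 1.5, "end": 3661.5, "text": " A very long pause."},
    ]
    print(build_ytdlp_command("https://example.com/watch?v=abc&t=10s"))
    print(segments_to_srt(demo_segments))
```

Because the command is a list rather than a shell string, it can be handed straight to `subprocess.run(..., check=True)`, and characters such as `&` in the URL need no quoting.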


