<a href="https://colab.research.google.com/github/Bhoomika2224/Video_Summarizer-using-Transformers/blob/main/video_summary.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

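# **General Overview**
This notebook:
1. Downloads the audio track of a YouTube video with `yt-dlp`.
2. Transcribes the audio into text with OpenAI Whisper (`base` model).
3. Summarizes the transcript with `facebook/bart-large-cnn`, one 1000-character chunk at a time.
4. Displays statistics (character and word counts) for both the transcript and the summary.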


In [None]:
!pip install yt-dlp openai-whisper transformers torch

Collecting yt-dlp
  Downloading yt_dlp-2025.6.9-py3-none-any.whl.metadata (174 kB)
Collecting openai-whisper
  Downloading openai-whisper-20240930.tar.gz (800 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_cupti_cu12-12.4

In [None]:
import yt_dlp
import whisper
from transformers import pipeline

In [None]:
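# 1. Audio
def download_audio(url):
    """Download the best audio stream for `url` and convert it to audio.wav."""
    ydl_opts = {
        "format": "bestaudio/best",  # prefer an audio-only stream
        "outtmpl": "audio",  # base filename; the postprocessor adds .wav
        # Converting to WAV requires ffmpeg to be installed.
        "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "wav"}],
    }
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        ydl.download([url])
    return "audio.wav"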
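
Because the output name is fixed, every call writes to the same `audio.wav`, so a second download overwrites the first. A minimal sketch of one way around this, assuming a per-video filename stem is acceptable: derive both the output template and the returned path from one stem so the two cannot drift apart. The helper `build_ydl_opts` and its `stem` and `codec` parameters are hypothetical and not part of the notebook; `%(ext)s` is yt-dlp's output-template placeholder for the file extension.

```python
def build_ydl_opts(stem, codec="wav"):
    """Return (yt-dlp options, expected output path) for a filename stem.

    Hypothetical helper: `stem` and `codec` are illustrative parameters.
    """
    ydl_opts = {
        "format": "bestaudio/best",
        # %(ext)s is filled in by yt-dlp; after extraction it becomes `codec`.
        "outtmpl": f"{stem}.%(ext)s",
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": codec}
        ],
    }
    return ydl_opts, f"{stem}.{codec}"


opts, path = build_ydl_opts("example1")
print(path)  # example1.wav
```

The returned `opts` can be passed to `yt_dlp.YoutubeDL(opts)` in the same way as in `download_audio`.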


In [None]:
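# 2. Transcript
def transcribe(audio_path):
    """Transcribe an audio file to text with Whisper's "base" model."""
    model = whisper.load_model("base")  # weights are downloaded on first use
    result = model.transcribe(audio_path)
    return result["text"]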

In [None]:
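# 3. Summarization
def summarize(text):
    """Summarize `text` chunk by chunk and join the partial summaries."""
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    # 1000-character slices keep each input well under BART's 1024-token limit.
    # The slices ignore word and sentence boundaries.
    chunks = [text[i:i+1000] for i in range(0, len(text), 1000)]

    summaries = []
    for chunk in chunks:
        # max_length and min_length are measured in tokens, not characters.
        summary = summarizer(chunk, max_length=150, min_length=30)[0]["summary_text"]
        summaries.append(summary)

    return " ".join(summaries)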
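
Fixed 1000-character slices can cut a word or sentence in half, which leaves the summarizer with a fragment at each chunk edge. As an alternative, and not what the notebook does, here is a minimal sketch that groups whole sentences into chunks. The helper `chunk_by_sentence` is hypothetical, and the regex split on `.`, `!` or `?` is a rough heuristic that will misjudge abbreviations. Swapping it in would change the summaries and counts shown in the examples.

```python
import re


def chunk_by_sentence(text, max_chars=1000):
    """Group whole sentences into chunks of at most max_chars characters.

    Hypothetical helper. A single sentence longer than max_chars is kept
    whole and becomes its own oversized chunk.
    """
    # Split after ., ! or ? when followed by whitespace (rough heuristic).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate  # still fits, or first sentence of a chunk
        else:
            chunks.append(current)  # close the full chunk
            current = sentence  # start a new one
    if current:
        chunks.append(current)
    return chunks


print(chunk_by_sentence("One two. Three four! Five six? Seven.", max_chars=20))
# ['One two. Three four!', 'Five six? Seven.']
```

In `summarize`, the list comprehension that builds `chunks` could be replaced by `chunk_by_sentence(text)`.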

# Example 1

In [None]:
url = "https://www.youtube.com/watch?v=X7j8F16eSqs"
audio_path = download_audio(url)
transcript = transcribe(audio_path)
summary = summarize(transcript)

[youtube] Extracting URL: https://www.youtube.com/watch?v=X7j8F16eSqs
[youtube] X7j8F16eSqs: Downloading webpage
[youtube] X7j8F16eSqs: Downloading tv client config
[youtube] X7j8F16eSqs: Downloading player fc2a56a5-main
[youtube] X7j8F16eSqs: Downloading tv player API JSON
[youtube] X7j8F16eSqs: Downloading ios player API JSON
[youtube] X7j8F16eSqs: Downloading m3u8 information
[info] X7j8F16eSqs: Downloading 1 format(s): 251
[download] Destination: audio
[download] 100% of    5.34MiB in 00:00:00 at 9.01MiB/s   
[ExtractAudio] Destination: audio.wav
Deleting original file audio (pass -k to keep)


100%|███████████████████████████████████████| 139M/139M [00:02<00:00, 66.0MiB/s]
Device set to use cpu
Your max_length is set to 150, but your input_length is only 107. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=53)
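
The warning above comes from the last chunk, which is shorter than the fixed `max_length=150`. One way to avoid it is to derive the length bounds from the chunk's token count. The sketch below is an assumption rather than the notebook's approach: the helper `length_bounds` is hypothetical, and its "at most half the input" rule is an illustrative choice that happens to match the value the warning suggests.

```python
def length_bounds(input_tokens, max_cap=150, min_floor=30):
    """Pick (min_length, max_length) so the summary stays shorter than the input.

    Hypothetical helper; the halving rule is an assumption, not a library default.
    """
    # Never ask for more than half the input, and never more than the cap.
    max_length = max(1, min(max_cap, input_tokens // 2))
    # min_length must not exceed max_length.
    min_length = min(min_floor, max_length)
    return min_length, max_length


print(length_bounds(107))  # (30, 53)
print(length_bounds(400))  # (30, 150)
```

Inside `summarize`, the token count could be taken from `len(summarizer.tokenizer(chunk)["input_ids"])` and the resulting pair passed as `min_length` and `max_length`.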


In [None]:
print("\nTranscript:", transcript[:500] + "...")
print("\nSummary:", summary)



Transcript:  In a 2011 study, researchers followed a group of judges deciding whether or not to offer imprisoned individuals a chance at parole. Logically, one might expect things like unimpresened persons' crime, existing sentence and current behavior to be the primary considerations. But while those details were duly examined, one variable had a remarkably large impact. The time of day. In prisoned people who met with the board in the morning were far more likely to receive parole than those whose cases w...

Summary: In a 2011 study, researchers followed a group of judges deciding whether or not to offer imprisoned individuals a chance at parole. In prisoned people who met with the board in the morning were far more likely to receive parole than those whose cases were reviewed in the afternoon. Many people seem to have a daily threshold for making decisions. Once that threshold is met, most people make the conscious choice to take it easy. How quickly you reach this threshold depen

In [None]:
def count_words(text):
    """Return the number of whitespace-separated words in `text`."""
    return len(text.split())

In [None]:
print(f"""
Transcript:
- Characters: {len(transcript)}
- Words: {count_words(transcript)}

Summary:
- Characters: {len(summary)}
- Words: {count_words(summary)}
""")



Transcript:
- Characters: 4514
- Words: 734

Summary:
- Characters: 1164
- Words: 191
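
The character and word counts above can be reduced to a single compression figure. A minimal sketch, using the numbers printed for Example 1; the helper name `compression_ratio` is hypothetical.

```python
def compression_ratio(original_size, summary_size):
    """Return summary_size / original_size, or 0.0 for an empty original."""
    if original_size == 0:
        return 0.0
    return summary_size / original_size


# Figures taken from the Example 1 output above.
print(f"characters: {compression_ratio(4514, 1164):.1%}")  # characters: 25.8%
print(f"words: {compression_ratio(734, 191):.1%}")  # words: 26.0%
```

By either measure the summary is roughly a quarter of the transcript.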



# Example 2

In [None]:
url = "https://www.youtube.com/watch?v=dItUGF8GdTw"
audio_path = download_audio(url)
transcript = transcribe(audio_path)
summary = summarize(transcript)

[youtube] Extracting URL: https://www.youtube.com/watch?v=dItUGF8GdTw
[youtube] dItUGF8GdTw: Downloading webpage
[youtube] dItUGF8GdTw: Downloading tv client config
[youtube] dItUGF8GdTw: Downloading tv player API JSON
[youtube] dItUGF8GdTw: Downloading ios player API JSON
[youtube] dItUGF8GdTw: Downloading m3u8 information
[info] dItUGF8GdTw: Downloading 1 format(s): 251
[download] Destination: audio
[download] 100% of    4.14MiB in 00:00:00 at 11.27MiB/s  
[ExtractAudio] Destination: audio.wav
Deleting original file audio (pass -k to keep)


Device set to use cpu


In [None]:
print("\nTranscript:", transcript[:500] + "...")
print("\nSummary:", summary)


Transcript:  Every day, a sea of decisions stretches before us. Some are small and unimportant, but others have a larger impact on our lives. For example, which politician should I vote for? Should I try the latest diet craze? Or will email make me a millionaire? Or bombarded with so many decisions that it's impossible to make a perfect choice every time? But there are many ways to improve our chances, and one particularly effective technique is critical thinking. This is a way of approaching a question tha...

Summary: Every day, a sea of decisions stretches before us. One particularly effective technique is critical thinking. This is a way of approaching a question that reveals hidden issues. Having a clear idea of your question will help you determine what's relevant. If you're trying to decide on a diet to improve your nutrition, you may ask an expert. Ask yourself, what concepts are at work? What assumptions exist? Is my interpretation of the information logically sound? For exam

In [None]:
print(f"""
Transcript:
- Characters: {len(transcript)}
- Words: {count_words(transcript)}

Summary:
- Characters: {len(summary)}
- Words: {count_words(summary)}
""")



Transcript:
- Characters: 3804
- Words: 638

Summary:
- Characters: 788
- Words: 128

