# **General Purpose of the Code**
This code downloads an audio file from a YouTube video, transcribes it into text using speech recognition, and then summarizes the text. It also measures the time taken for each step to provide performance insights and displays the results (transcript and summary) along with word and character counts.
# **Line-by-Line Explanation of the Code**

# **1. Library Installations**
```python
!pip install yt-dlp openai-whisper transformers torch
```
- `!pip install`: In a notebook, the leading `!` runs the rest of the line as a shell command; `pip install` then installs the listed Python packages.
- `yt-dlp`: A tool for downloading audio or video from YouTube.
- `openai-whisper`: OpenAI’s Whisper model for transcribing audio into text.
- `transformers`: Hugging Face’s library for natural language processing tasks like summarization.
- `torch`: PyTorch, a framework required to run AI models.
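The pip package names above differ from the names used in `import` statements (`yt-dlp` is imported as `yt_dlp`, `openai-whisper` as `whisper`). A minimal sketch for checking that the installation worked; the helper name `missing_modules` is hypothetical and not part of the original notebook:

```python
import importlib.util

def missing_modules(names):
    """Return the import names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names (not pip package names) used by this notebook.
print(missing_modules(["yt_dlp", "whisper", "transformers", "torch"]))
```

An empty list means all four modules are importable in the current environment.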

In [1]:
!pip install yt-dlp openai-whisper transformers torch

Collecting yt-dlp
  Downloading yt_dlp-2025.2.19-py3-none-any.whl.metadata (171 kB)
Collecting openai-whisper
  Downloading openai-whisper-20240930.tar.gz (800 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting tiktoken (from openai-whisper)
  Downloading tiktoken-0.9.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.4.12

# **2. Library Imports**
```python
import yt_dlp
import whisper
from transformers import pipeline
import time  # Adding the time module
```
- `yt_dlp`: For downloading video/audio.
- `whisper`: For audio transcription.
- `pipeline`: A tool from Transformers to easily run tasks like summarization.
- `time`: For measuring the duration of processes.

In [2]:
import yt_dlp
import whisper
from transformers import pipeline
import time  # Adding the time module

# **3. Time Measurement Variables**
```python
download_time = 0
transcribe_time = 0
summarize_time = 0
```
- Global variables to store the duration of each step (downloading, transcribing, summarizing).
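As an alternative to one global variable per step, the timings could be collected in a dictionary by a small context manager. This is only a sketch of that idea, not what the notebook does; the names `timings` and `timed` are hypothetical:

```python
import time
from contextlib import contextmanager

timings = {}  # hypothetical name: step label -> elapsed seconds

@contextmanager
def timed(label):
    """Record how long the enclosed block takes under timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start

with timed("example_step"):
    sum(range(100_000))  # stand-in for a real pipeline step

print(f"example_step: {timings['example_step']:.4f} seconds")
```

With this pattern each step would be wrapped in `with timed("download"):` and so on, and no `global` statements would be needed.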

In [3]:
# Global variables for the timing measurements
download_time = 0
transcribe_time = 0
summarize_time = 0

# **4. Audio Download Function**
```python
def download_audio(url):
    global download_time
    start_time = time.time()
```
- `download_audio(url)`: A function to download audio from a YouTube URL.
- `global download_time`: Declares that assignments inside the function update the module-level `download_time` rather than creating a local variable.
- `start_time`: Records the start time of the process.

```python
    ydl_opts = {
        "format": "bestaudio/best",
        "outtmpl": "audio",
        "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "wav"}],
    }
```
- `ydl_opts`: Configuration options for `yt-dlp`:
  - `"format": "bestaudio/best"`: Selects the best available audio quality.
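  - `"outtmpl": "audio"`: Sets the output file name template to `audio`, with no extension; the post-processor then writes `audio.wav`.
  - `"postprocessors"`: Runs the `FFmpegExtractAudio` post-processor to convert the download to WAV, which requires FFmpeg to be installed on the system.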
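Because the WAV conversion depends on an external FFmpeg binary, it can help to check for it before starting a download. A minimal sketch using only the standard library; the helper name `ffmpeg_available` is hypothetical:

```python
import shutil

def ffmpeg_available():
    """Return True if an `ffmpeg` executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None

if not ffmpeg_available():
    print("FFmpeg not found: the WAV conversion step is likely to fail.")
```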

```python
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        ydl.download([url])
```
- `YoutubeDL`: Executes the download using `yt-dlp`.
- `[url]`: Downloads the provided YouTube URL.

```python
    download_time = time.time() - start_time
    return "audio.wav"
```
- `download_time`: Calculates the time taken for downloading.
- `"audio.wav"`: Returns the name of the downloaded audio file.


In [4]:
# 1. Downloading audio from the video
def download_audio(url):
    global download_time
    start_time = time.time()

    ydl_opts = {
        "format": "bestaudio/best",
        "outtmpl": "audio",
        "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "wav"}],
    }
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        ydl.download([url])

    download_time = time.time() - start_time
    return "audio.wav"

# **5. Transcription Function**
```python
def transcribe(audio_path):
    global transcribe_time
    start_time = time.time()
```
- `transcribe(audio_path)`: A function to convert audio into text.
- `global transcribe_time`: Declares that the function assigns to the module-level `transcribe_time`.
- `start_time`: Records the start time.

```python
    model = whisper.load_model("base")
    result = model.transcribe(audio_path)
```
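- `whisper.load_model("base")`: Loads Whisper’s “base” model (a small, fast model). Because this runs inside the function, the model is loaded again on every call, and the load time counts toward `transcribe_time`.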
- `model.transcribe(audio_path)`: Transcribes the audio file into text.
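When several videos are processed in one session, the model only needs to be loaded once. One possible way to do that is to wrap the loading function in a cache. The wrapper name `make_cached_loader` is hypothetical, and the sketch takes the loader as an argument so it does not depend on Whisper being installed:

```python
from functools import lru_cache

def make_cached_loader(load_fn):
    """Wrap a model-loading function so each model name is loaded only once."""
    return lru_cache(maxsize=None)(load_fn)

# In the notebook this might be used as (assumption, not in the original):
#   get_model = make_cached_loader(whisper.load_model)
#   model = get_model("base")   # loaded on first call, reused afterwards
```

Note that with a cached model, `transcribe_time` would include the load time only for the first video.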

```python
    transcribe_time = time.time() - start_time
    return result["text"]
```
- `transcribe_time`: Calculates the transcription duration.
- `result["text"]`: Returns the transcribed text.

In [5]:
# 2. Transcription
def transcribe(audio_path):
    global transcribe_time
    start_time = time.time()

    model = whisper.load_model("base")
    result = model.transcribe(audio_path)

    transcribe_time = time.time() - start_time
    return result["text"]

# **6. Summarization Function**
```python
def summarize(text):
    global summarize_time
    start_time = time.time()
```
- `summarize(text)`: A function to summarize the text.
- `global summarize_time`: Declares that the function assigns to the module-level `summarize_time`.
- `start_time`: Records the start time.

```python
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
```
- `pipeline("summarization")`: Sets up a summarization task.
- `model="facebook/bart-large-cnn"`: Uses Facebook’s BART model (a powerful summarization model).

```python
    chunks = [text[i:i+1000] for i in range(0, len(text), 1000)]
```
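- `chunks`: Splits the text into 1000-character slices so that each piece stays within the model's input limit. The cut is purely positional, so a chunk can start or end in the middle of a word or sentence.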
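If mid-sentence cuts turn out to hurt summary quality, the text could instead be split at sentence boundaries. The following is a rough sketch of that alternative, not the notebook's method; `chunk_by_sentence` is a hypothetical helper, and the regular expression is a simple heuristic that treats `.`, `!` and `?` followed by whitespace as a sentence end:

```python
import re

def chunk_by_sentence(text, max_chars=1000):
    """Pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own oversized chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks

print(chunk_by_sentence("One two. Three four! Five six? Seven.", max_chars=20))
# → ['One two. Three four!', 'Five six? Seven.']
```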

```python
    summaries = []
    for chunk in chunks:
        summary = summarizer(chunk, max_length=150, min_length=30)[0]["summary_text"]
        summaries.append(summary)
```
- Summarizes each chunk:
  - `max_length=150`: Caps each summary at 150 tokens (model tokens, not words).
  - `min_length=30`: Requires at least 30 tokens per summary.
- Stores each summary in a list.
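The last chunk of a transcript is often much shorter than 1000 characters, in which case fixed limits of 150 and 30 tokens exceed the input itself. One way to handle this is to scale the limits to the input length. The helper below is a sketch under the assumption that halving the input token count is a reasonable target; `summary_lengths` is a hypothetical name:

```python
def summary_lengths(n_input_tokens, max_cap=150, min_floor=30):
    """Pick (max_length, min_length) so the summary is shorter than the input."""
    max_length = max(2, min(max_cap, n_input_tokens // 2))
    min_length = min(min_floor, max_length - 1)
    return max_length, min_length

print(summary_lengths(13))   # → (6, 5)
print(summary_lengths(400))  # → (150, 30)

# Possible use inside the loop (assumption, not in the original):
#   n = len(summarizer.tokenizer(chunk)["input_ids"])
#   max_len, min_len = summary_lengths(n)
#   summarizer(chunk, max_length=max_len, min_length=min_len)
```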

```python
    summarize_time = time.time() - start_time
    return " ".join(summaries)
```
- `summarize_time`: Calculates the summarization duration.
- `" ".join(summaries)`: Combines the chunk summaries into a single text.

In [6]:
# 3. Summarization
def summarize(text):
    global summarize_time
    start_time = time.time()

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    chunks = [text[i:i+1000] for i in range(0, len(text), 1000)]

    summaries = []
    for chunk in chunks:
        summary = summarizer(chunk, max_length=150, min_length=30)[0]["summary_text"]
        summaries.append(summary)

    summarize_time = time.time() - start_time
    return " ".join(summaries)

# **7. Word Counting Function**
```python
def count_words(text):
    return len(text.split())
```
- `count_words(text)`: Counts the number of words in the text.
- `text.split()`: Splits the text on any run of whitespace (spaces, tabs, newlines), so consecutive spaces do not produce empty words.
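A short illustration of why `split()` is called without an argument here; the sample string is made up for the example:

```python
def count_words(text):
    return len(text.split())

sample = "one  two\nthree\tfour"
print(count_words(sample))     # → 4 (any whitespace run separates words)
print(len(sample.split(" ")))  # → 3 (splitting on a single space miscounts)
```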

In [7]:
def count_words(text):
    return len(text.split())

# **8. Main Process Function**
```python
def run_process(url):
    total_start = time.time()
```
- `run_process(url)`: The main function that runs all steps (download, transcribe, summarize).
- `total_start`: Records the start time for the entire process.

```python
    audio_path = download_audio(url)
    transcript = transcribe(audio_path)
    summary = summarize(transcript)
```
- Executes the three main steps in sequence:
  1. Downloads audio (`download_audio`).
  2. Transcribes audio to text (`transcribe`).
  3. Summarizes the text (`summarize`).

```python
    total_time = time.time() - total_start
```
- `total_time`: Calculates the total duration of the process.

```python
    print(f"""
Audio Download Time: {download_time:.2f} seconds
Transcription Time: {transcribe_time:.2f} seconds
Summarization Time: {summarize_time:.2f} seconds
Total Time: {total_time:.2f} seconds
""")
```
- Prints the duration of each step and the total time in seconds (with 2 decimal places).

```python
    print("\nTranscript:", transcript[:500] + "...")
    print("\nSummary:", summary)
```
- `transcript[:500] + "..."`: Shows the first 500 characters of the transcript. The `"..."` is appended unconditionally, even when the transcript is shorter than 500 characters.
- `summary`: Displays the full summary.
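If the trailing dots should appear only when text was actually cut off, the preview could be built by a small helper. This is an optional variation, and `preview` is a hypothetical name:

```python
def preview(text, limit=500):
    """Return text unchanged if it fits, else its first `limit` characters plus '...'."""
    return text if len(text) <= limit else text[:limit] + "..."

print(preview("short text"))      # → short text
print(preview("abcdefgh", 5))     # → abcde...
```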

```python
    print(f"""
Transcript:
- Characters: {len(transcript)}
- Words: {count_words(transcript)}

Summary:
- Characters: {len(summary)}
- Words: {count_words(summary)}
""")
```
- Prints character and word counts for both the transcript and the summary.


In [8]:
# Run the whole process and show the timings
def run_process(url):
    total_start = time.time()

    audio_path = download_audio(url)
    transcript = transcribe(audio_path)
    summary = summarize(transcript)

    total_time = time.time() - total_start

    print(f"""
Audio Download Time: {download_time:.2f} seconds
Transcription Time: {transcribe_time:.2f} seconds
Summarization Time: {summarize_time:.2f} seconds
Total Time: {total_time:.2f} seconds
""")

    print("\nTranscript:", transcript[:500] + "...")
    print("\nSummary:", summary)

    print(f"""
Transcript:
- Characters: {len(transcript)}
- Words: {count_words(transcript)}

Summary:
- Characters: {len(summary)}
- Words: {count_words(summary)}
""")

# **9. Test URLs and Execution**
```python
urls = [
    "https://www.youtube.com/watch?v=KKNCiRWd_j0",
    "https://www.youtube.com/watch?v=aZ5EsdnpLMI"
]
```
- A list of YouTube URLs to process.

```python
for index, url in enumerate(urls):
    print(f"\n{'#'*30} PROCESS STARTING ({index+1}/{len(urls)}) {'#'*30}")
    run_process(url)
```
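- `enumerate(urls)`: Loops through the URLs, yielding each one together with its zero-based index.
- Prints a header with 30 `#` characters on each side before each process (abbreviated here: `##### PROCESS STARTING (1/2) #####`).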
- `run_process(url)`: Runs the full process for each URL.
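As written, an exception in one video (for example a failed download) stops the whole loop. A possible variation that keeps going and reports failures at the end is sketched below; `process_all` is a hypothetical helper, and the processing function is passed in as a parameter:

```python
def process_all(urls, process):
    """Call process(url) for each URL; return the URLs that raised an error."""
    failed = []
    for index, url in enumerate(urls, start=1):
        print(f"\n{'#' * 30} PROCESS STARTING ({index}/{len(urls)}) {'#' * 30}")
        try:
            process(url)
        except Exception as exc:  # broad on purpose: keep the batch going
            print(f"Skipping {url}: {exc}")
            failed.append(url)
    return failed
```

In the notebook this would be called as `failed = process_all(urls, run_process)`.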


In [9]:
urls = [
    "https://www.youtube.com/watch?v=KKNCiRWd_j0",
    "https://www.youtube.com/watch?v=aZ5EsdnpLMI"
]

for index, url in enumerate(urls):
    print(f"\n{'#'*30} PROCESS STARTING ({index+1}/{len(urls)}) {'#'*30}")
    run_process(url)


############################## PROCESS STARTING (1/2) ##############################
[youtube] Extracting URL: https://www.youtube.com/watch?v=KKNCiRWd_j0
[youtube] KKNCiRWd_j0: Downloading webpage
[youtube] KKNCiRWd_j0: Downloading tv client config
[youtube] KKNCiRWd_j0: Downloading player b191cf34
[youtube] KKNCiRWd_j0: Downloading tv player API JSON
[youtube] KKNCiRWd_j0: Downloading ios player API JSON
[youtube] KKNCiRWd_j0: Downloading m3u8 information
[info] KKNCiRWd_j0: Downloading 1 format(s): 251
[download] Destination: audio
[download] 100% of   17.17MiB in 00:00:00 at 39.87MiB/s  
[ExtractAudio] Destination: audio.wav
Deleting original file audio (pass -k to keep)


100%|████████████████████████████████████████| 139M/139M [00:01<00:00, 135MiB/s]
  checkpoint = torch.load(fp, map_location=device)
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Device set to use cpu
Your max_length is set to 150, but your input_length is only 13. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=6)



Audio Download Time: 23.27 seconds
Transcription Time: 339.57 seconds
Summarization Time: 267.00 seconds
Total Time: 629.90 seconds


Transcript:  I want to tell you what I see coming. I've been lucky enough to be working on AI for almost 15 years now. Back when I started to describe it as fringe would be an understatement. Researchers would say, no, no, we're only working on machine learning. Because working on AI was seen as way too out there. In 2010, just a very mention of the phrase AGI, artificial general intelligence, would get you some seriously strange looks. And even a cold shoulder, you're actually building AGI, people would sa...

Summary: In 2010, just a very mention of the phrase AGI, artificial general intelligence, would get you some seriously strange looks. People thought it was 50 years away or 100 years away if it was even possible at all. It wasn't long though, before AI started beating humans at a whole range of tasks. People started waking up to the fact that AI was goi

  checkpoint = torch.load(fp, map_location=device)
Device set to use cpu



Audio Download Time: 23.41 seconds
Transcription Time: 762.27 seconds
Summarization Time: 514.31 seconds
Total Time: 1300.00 seconds


Transcript:  Despite what you hear about artificial intelligence, machines still can't think like a human. But in the last few years, they have become capable of learning. And suddenly, our devices have opened their eyes and ears, and cars have taken the wheel. Today, artificial intelligence is not as good as you hope, and not as bad as you fear. But humanity is accelerating into a future that few can predict. That's why so many people are desperate to meet Kai Fouli, the Oracle of AI. Kai Fouli is in there...

Summary: Kai Fouli is the 'Oracle of AI' and has 50 million social media followers. He believes artificial intelligence will change the world more than anything in the history of. Beijing Venture Capital firm manufactures billionaires. Lee believes the best place to be an AI capitalist is Communist China. China attracted half of all AI capital in the w

# **Short Summary of the Code**
This code:
1. Downloads audio from a YouTube video (using `yt-dlp`).
2. Transcribes the audio into text (using `Whisper`).
3. Summarizes the text (using the `BART` model).
4. Measures the time taken for each step and presents the results (transcript, summary, word/character counts).

It processes two test YouTube URLs sequentially. The purpose is to automatically analyze video content and generate a concise text summary.
