<a href="https://colab.research.google.com/github/Oleksy1121/video-summarizer/blob/main/notebooks/colab_video_summarizer_llama.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install -q yt-dlp requests torch bitsandbytes transformers sentencepiece accelerate openai huggingface_hub


In [None]:
# Imports

import subprocess

import torch
from google.colab import userdata
from huggingface_hub import login
from IPython.display import Markdown, display
from openai import OpenAI
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

In [None]:
# Constants

AUDIO_MODEL = 'whisper-1'
LLAMA = "meta-llama/Meta-Llama-3.1-8B-Instruct"

AUDIO_FILENAME = "audio.mp3"
URL = "https://www.youtube.com/watch?v=TMkoX1kfyDs&t=8s"

In [None]:
# Download audio

# check=True raises CalledProcessError if yt-dlp exits with a non-zero status
subprocess.run(["yt-dlp", "-x", "--audio-format", "mp3", URL, "-o", AUDIO_FILENAME], check=True)

CompletedProcess(args=['yt-dlp', '-x', '--audio-format', 'mp3', 'https://www.youtube.com/watch?v=TMkoX1kfyDs&t=8s', '-o', 'audio.mp3'], returncode=0)
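
Before the audio is sent for transcription, it can help to confirm that the download actually produced a file and that the file is small enough to upload. The sketch below is not part of the original notebook: `fits_upload_limit` is a hypothetical helper, and the 25 MB ceiling is an assumption based on the upload limit documented for the OpenAI transcription endpoint.

```python
import os

# Assumption: the transcription endpoint rejects uploads larger than about 25 MB.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def fits_upload_limit(path, limit=MAX_UPLOAD_BYTES):
    """Return True if `path` is an existing file no larger than `limit` bytes."""
    return os.path.isfile(path) and os.path.getsize(path) <= limit
```

In the notebook this could be called as `fits_upload_limit(AUDIO_FILENAME)`; a `False` result suggests re-encoding at a lower bitrate or splitting the audio before transcribing.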

In [None]:
# API keys

hf_token = userdata.get('HF_TOKEN')
login(hf_token, add_to_git_credential=True)

openai_api_key = userdata.get('OPENAI_API_KEY')
openai = OpenAI(api_key=openai_api_key)
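
The cell above reads both tokens from Colab's secret store, which only exists inside Colab. As a rough sketch for running the same code elsewhere, the hypothetical helper `get_secret` below tries `google.colab.userdata` first and falls back to an environment variable of the same name. The fallback is an assumption for illustration, not something the original notebook does.

```python
import os


def get_secret(name):
    """Read a secret from Colab if available, otherwise from the environment."""
    try:
        from google.colab import userdata  # importable only inside Colab
    except ImportError:
        # Outside Colab: expect the secret in an environment variable.
        return os.environ.get(name)
    return userdata.get(name)
```

With this helper, `userdata.get('HF_TOKEN')` could be swapped for `get_secret('HF_TOKEN')` without changing the rest of the cell.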

In [None]:
# Generate the transcription with OpenAI's "whisper-1" model

# The context manager closes the audio file once the request completes
with open(AUDIO_FILENAME, "rb") as audio_file:
    transcription = openai.audio.transcriptions.create(model=AUDIO_MODEL, file=audio_file, response_format="text")
print(transcription)

The west bank of the River Nile, home to the world's most iconic monuments, the mighty Pyramids of Giza. The pyramids once housed the bodies of the pharaohs. But though ancient Egyptian civilization lasted for nearly 3,000 years, its kings only built huge tombs like these for a few centuries. Egyptologists are still trying to piece together why the pharaohs stopped constructing giant pyramids. For Egyptologist Chris Naunton, the majesty of the ancient structures makes the fact that Egyptians gave up building them all the more incredible. Ten miles south of the legendary Pyramids of Giza is Saqqara. When we think about pyramids, we tend to think of Giza, I think, and the Great Pyramid of Khufu in particular, but actually this is where it all began. Chris has come to the birthplace of pyramid building to search for clues to why Egyptians built giant pyramids for less than 500 years. Constructed a century before the iconic pyramids at Giza, Egypt's first pyramid is a 200-foot-tall mausole

In [None]:
# Create prompts

system_message = "You are an assistant that summarizes YouTube videos by highlighting the most important points and events, in clear Markdown format."
user_prompt = f"Below is the transcript of a YouTube video. Please write a summary describing the key moments, main ideas, and highlights in a structured way (using bullet points or short sections).\n\n{transcription}"

messages = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_prompt}
]
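
To reuse the same prompts for other transcripts, the message list can be wrapped in a small function. This is a minimal sketch, assuming the same wording as the cell above; `build_messages` and `SYSTEM_MESSAGE` are hypothetical names introduced here for illustration.

```python
SYSTEM_MESSAGE = (
    "You are an assistant that summarizes YouTube videos by highlighting "
    "the most important points and events, in clear Markdown format."
)


def build_messages(transcript, system_message=SYSTEM_MESSAGE):
    """Return a chat-style message list asking for a summary of `transcript`."""
    user_prompt = (
        "Below is the transcript of a YouTube video. Please write a summary "
        "describing the key moments, main ideas, and highlights in a structured "
        "way (using bullet points or short sections).\n\n" + transcript
    )
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_prompt},
    ]
```

Calling `build_messages(transcription)` would yield the same two-message structure that the chat template expects.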

In [None]:
# Quantization

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4"
)
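
The point of 4-bit loading is memory: the weights alone shrink to roughly a quarter of their 16-bit size. The back-of-the-envelope sketch below assumes about 8 billion parameters (a round figure taken from the model name) and ignores quantization constants, layers that are kept in higher precision, activations, and the KV cache. Treat the 4-bit number as a lower bound, not a measurement.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


N_PARAMS = 8e9  # assumption: roughly 8 billion parameters

bf16_gb = weight_memory_gb(N_PARAMS, 16)  # 16-bit weights
nf4_gb = weight_memory_gb(N_PARAMS, 4)    # 4-bit weights, overhead ignored

print(f"bf16: ~{bf16_gb:.0f} GB, 4-bit: ~{nf4_gb:.0f} GB")
```

Under these assumptions the estimate is about 16 GB in bf16 versus about 4 GB in 4-bit, which is the difference between overflowing and fitting on a typical 16 GB Colab GPU.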

In [None]:
# Generate the summary with the Llama model

tokenizer = AutoTokenizer.from_pretrained(LLAMA)
tokenizer.pad_token = tokenizer.eos_token
# add_generation_prompt=True appends the assistant header so the model starts its answer directly
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to("cuda")
model = AutoModelForCausalLM.from_pretrained(LLAMA,
                                             device_map="auto",
                                             quantization_config=quant_config)
# The prompt is a single unpadded sequence, so every token should be attended to
attention_mask = torch.ones_like(inputs)
outputs = model.generate(inputs, attention_mask=attention_mask, pad_token_id=tokenizer.eos_token_id, max_new_tokens=2000)


In [None]:
# Keep only the newly generated tokens and decode them

generated_ids = outputs[0][inputs.shape[1]:]
prefix_tokens = tokenizer("<|start_header_id|>assistant<|end_header_id|>", add_special_tokens=False).input_ids

if generated_ids[:len(prefix_tokens)].tolist() == prefix_tokens:
    generated_ids = generated_ids[len(prefix_tokens):]


response = tokenizer.decode(generated_ids, skip_special_tokens=True)
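
The slicing in this cell does two things: it drops the prompt tokens, then removes a leading assistant header if the model emitted one. The same logic can be shown on plain Python lists. In this sketch `strip_prefix` is a hypothetical helper and the token IDs are arbitrary placeholders, not real Llama vocabulary IDs.

```python
def strip_prefix(ids, prefix):
    """Return `ids` without a leading `prefix`; return `ids` unchanged if absent."""
    if prefix and ids[:len(prefix)] == prefix:
        return ids[len(prefix):]
    return ids


# Placeholder IDs for illustration only.
prompt_ids = [10, 11, 12]
header_ids = [90, 91, 92]
full_output = prompt_ids + header_ids + [40, 41]

generated = full_output[len(prompt_ids):]    # drop the prompt tokens
answer = strip_prefix(generated, header_ids)  # drop the header if present
```

Because the helper returns the sequence unchanged when the prefix is missing, it is safe to apply whether or not the model repeats the header.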

In [None]:
# Display in Markdown

display(Markdown(response))



**Summary of the YouTube Video: "The Birthplace of Pyramid Building"**

**Introduction**
The video explores the mysterious reason behind the ancient Egyptians' decision to stop building giant pyramids, which were once the tombs of pharaohs. The pyramids, including the iconic ones at Giza, were constructed for only a few centuries despite the long-lasting Egyptian civilization.

**Key Moments**

* **Saqqara: The Birthplace of Pyramid Building**: The video takes us to Saqqara, ten miles south of the Pyramids of Giza, where Egypt's first pyramid was constructed a century before the iconic pyramids at Giza.
* **The First Pyramid: Djoser's Mausoleum**: The pyramid is a 200-foot-tall mausoleum of six huge limestone platforms, carefully engineered to spread the weight of rock and prevent collapse. Inside, a giant shaft leads to the intended final resting place of the pharaoh Djoser.
* **The Architectural Revolution**: The pyramid sparked an architectural revolution, and over the next century, Egypt's kings developed the concept, building monumental tombs along the Nile's west bank.

**Main Ideas**

* **Why Egyptians Stopped Building Giant Pyramids**: Egyptologists are still trying to piece together why the pharaohs stopped constructing giant pyramids after a few centuries.
* **The Importance of Pyramids**: The pyramids were not just tombs designed to secure the pharaoh's physical body for eternity but also ensured the king was remembered by the living for success in the afterlife.

**Highlights**

* **The Red Pyramid and the Bent Pyramid**: The video mentions the first geometrically true pyramid, the Red Pyramid, and a misshapen experiment, the Bent Pyramid.
* **The Great Pyramid of Khufu**: The video highlights the most iconic monuments in Egypt, the Pyramids of Giza, including the Great Pyramid of Khufu.
* **The Legacy of Pyramid Building**: The video concludes that just a few short centuries after the Great Pyramid of Khufu rose from the desert, a new era was on the horizon, marking the end of the pyramid-building era.