In [5]:
import google.generativeai as genai
import os
from dotenv import load_dotenv

load_dotenv()

genai.configure(api_key=os.getenv("GEMINI_API_KEY"))

In [61]:
from IPython.display import IFrame

IFrame('https://www.youtube.com/embed/T-D1OfcDW1M', width=560, height=315)

In [7]:
from youtube_transcript_api import YouTubeTranscriptApi

youtube_video_url = "https://youtu.be/T-D1OfcDW1M?si=g-jIb0p49KZkSVPb"


# Assumes a youtu.be short link: take what follows the host and drop the query string.
video_id = youtube_video_url.split("youtu.be/")[1].split("?")[0]

In [8]:
print(video_id)

T-D1OfcDW1M
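
The split above only works for `youtu.be` short links. As a hedged sketch, the helper below (the name `extract_video_id` is hypothetical, not part of any library) uses the standard library to handle a few common URL shapes. The set of hosts and path prefixes it recognizes is an assumption and is not exhaustive.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a few common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "youtu.be":
        # Short links keep the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None

    if host in ("youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard watch links keep the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            return parts[1]

    # Unrecognized host or path shape.
    return None


print(extract_video_id("https://youtu.be/T-D1OfcDW1M?si=g-jIb0p49KZkSVPb"))  # T-D1OfcDW1M
```

Returning `None` for unrecognized URLs lets the caller decide how to report the problem instead of failing with an `IndexError`.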


In [9]:
ytt_api = YouTubeTranscriptApi()

transcript = ytt_api.fetch(video_id)

In [10]:
# Join the segment texts into one string, separated by single spaces.
full_transcript = " ".join(segment.text for segment in transcript)

print(full_transcript)

Large language models. They are everywhere. They get some things amazingly right and other things very interestingly wrong. My name is Marina Danilevsky. I am a Senior Research Scientist here at IBM Research. And I want to tell you about a framework to help large language models be more accurate and more up to date: Retrieval-Augmented Generation, or RAG. Let's just talk about the "Generation" part for a minute. So forget the "Retrieval-Augmented". So the generation, this refers to large language models, or LLMs, that generate text in response to a user query, referred to as a prompt. These models can have some undesirable behavior. I want to tell you an anecdote to illustrate this. So my kids, they recently asked me this question: "In our solar system, what planet has the most moons?" And my response was, “Oh, that's really great that you're asking this question. I loved space when I was your age.” Of course, that was like 30 years ago. But I know this! I read an article and the artic

## Prompt

In [11]:
from langchain_core.prompts import PromptTemplate

template = """
# Role: YouTube Video Summarizer

## Task
Provide a concise, informative summary of the YouTube video transcript below.

## Guidelines
- Focus on the main ideas, key points, and essential takeaways
- Organize information in a structured, easy-to-read format with bullet points
- Include any significant conclusions or insights presented
- Keep the summary under 250 words
- Exclude unnecessary details or tangential information

## Transcript

{full_transcript}

## Output Format
- Title: [Inferred title based on content]
- Main Topic: [1-2 sentence description of what the video is about]
- Key Points:
  • [Point 1]
  • [Point 2]
  • [Point 3]
  • [Additional points as needed]
- Conclusion: [Main takeaway or call to action]
"""

prompt_template = PromptTemplate.from_template(template)
prompt = prompt_template.format_prompt(full_transcript=full_transcript)

In [12]:
import os
from dotenv import load_dotenv

# Load .env before reading the key, so the lookup also works in a fresh kernel.
load_dotenv()
api_key = os.getenv("GEMINI_API_KEY")

from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(api_key=api_key, model="gemini-2.0-flash")

In [13]:
summary = llm.invoke(prompt)

In [14]:
print(summary.content)

- Title: Retrieval-Augmented Generation (RAG) for More Accurate LLMs
- Main Topic: The video explains Retrieval-Augmented Generation (RAG) as a framework to improve the accuracy and up-to-date nature of large language models (LLMs). It addresses the issues of LLMs providing outdated or unsourced information.
- Key Points:
  • LLMs can generate text in response to user queries (prompts) but may provide inaccurate or outdated information without proper sourcing.
  • RAG augments LLMs with a content store (e.g., the internet or a collection of documents) that the LLM consults before generating a response.
  • The LLM first retrieves relevant information from the content store based on the user's query.
  • The retrieved content is combined with the user's question, and the LLM generates an answer based on this combined input, providing evidence for its response.
  • RAG helps address the challenges of LLMs providing outdated information by allowing for easy updating of the content store.
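
The prompt asks for a fixed output format and a 250-word limit, but nothing in the notebook checks that the model followed either. The sketch below is one possible way to check both with plain string handling. `parse_summary` and `word_count` are hypothetical helper names, and the parser assumes the model echoes the `- Field:` and `•` markers exactly as the prompt lays them out, which is not guaranteed.

```python
def parse_summary(text):
    """Split a summary in the prompt's output format into a dict (illustrative)."""
    fields = {"Title": "", "Main Topic": "", "Key Points": [], "Conclusion": ""}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("•"):
            # Bullet lines under "Key Points".
            fields["Key Points"].append(line.lstrip("• ").strip())
        elif line.startswith("- "):
            # "- Name: value" lines; split on the first colon only.
            name, _, value = line[2:].partition(":")
            name = name.strip()
            if name in fields and name != "Key Points":
                fields[name] = value.strip()
    return fields


def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


sample = """- Title: Demo
- Main Topic: A short test.
- Key Points:
  • First point
  • Second point
- Conclusion: Done"""

parsed = parse_summary(sample)
print(parsed["Title"], len(parsed["Key Points"]), word_count(sample) < 250)
```

Fields the model omits stay empty, so a missing `Conclusion` shows up as an empty string rather than an exception.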


# Create functions

In [50]:
from youtube_transcript_api import YouTubeTranscriptApi
from langchain_core.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI


def get_video_transcript(video_id):
    """Fetch the transcript for video_id and return it as a single string."""
    ytt_api = YouTubeTranscriptApi()
    transcript = ytt_api.fetch(video_id)

    # Each segment exposes its text via .text; join them with spaces.
    return " ".join(segment.text for segment in transcript)

def generate_summary(video_id, api_key):
    """Summarize the video's transcript with Gemini and return the summary text."""
    full_transcript = get_video_transcript(video_id)

    template = """
    # Role: YouTube Video Summarizer

    ## Task
    Provide a concise, informative summary of the YouTube video transcript below.

    ## Guidelines
    - Focus on the main ideas, key points, and essential takeaways
    - Organize information in a structured, easy-to-read format with bullet points
    - Include any significant conclusions or insights presented
    - Keep the summary under 250 words
    - Exclude unnecessary details or tangential information

    ## Transcript

    {full_transcript}

    ## Output Format
    - Title: [Inferred title based on content]
    - Main Topic: [1-2 sentence description of what the video is about]
    - Key Points:
      • [Point 1]
      • [Point 2]
      • [Point 3]
      • [Additional points as needed]
    - Conclusion: [Main takeaway or call to action]
    """

    prompt_template = PromptTemplate.from_template(template)
    prompt = prompt_template.format(full_transcript=full_transcript)

    llm = ChatGoogleGenerativeAI(api_key=api_key, model="gemini-2.0-flash")
    summary = llm.invoke(prompt)
    
    return summary.content
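
Because the triple-quoted template in `generate_summary` is indented along with the function body, every line of the prompt carries four leading spaces that the earlier top-level version did not have. Whether this affects the model's output is not established here, but if identical prompts are wanted, one option is to dedent the template before filling it in. The sketch below uses a shortened template and plain `str.format` as a stand-in for `PromptTemplate`; `build_template` is a hypothetical helper name.

```python
import textwrap


def build_template():
    """Return a shortened version of the template with the indentation removed."""
    template = """
    # Role: YouTube Video Summarizer

    ## Transcript

    {full_transcript}
    """
    # Dedent before substituting, so a multi-line transcript cannot change
    # the common leading whitespace that dedent removes.
    return textwrap.dedent(template).strip()


rendered = build_template().format(full_transcript="hello world")
print(rendered)
```

The dedented string could then be passed to `PromptTemplate.from_template` in place of the indented one.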



In [52]:
print(generate_summary(video_id, api_key))

- Title: Retrieval-Augmented Generation (RAG) for More Accurate LLMs
- Main Topic: This video explains Retrieval-Augmented Generation (RAG), a framework designed to improve the accuracy and up-to-date nature of large language models (LLMs) by incorporating external knowledge retrieval.
- Key Points:
    • LLMs can provide inaccurate or outdated information due to their reliance on training data.
    • RAG addresses these issues by having the LLM first retrieve relevant information from a content store (e.g., the internet or a document collection) before generating a response.
    • The process involves the user prompting the LLM, the LLM retrieving relevant content, and then generating an answer based on both the user's question and the retrieved information.
    • RAG helps LLMs stay up-to-date by allowing the content store to be updated without retraining the entire model.
    • It also improves sourcing by instructing the LLM to rely on primary source data, reducing the likelihood o