# Analyze a YouTube video by asking the LLM
By [Lior Gazit](https://www.linkedin.com/in/liorgazit/)  

<a target="_blank" href="https://colab.research.google.com/github/LiorGazit/LLM_search_inside_youtube_videos/blob/main/Analyze_a_Youtube_video_by_asking_the_LLM.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

**Description of the notebook:**  
Pick a YouTube video and find out what value it offers you without spending the time to watch all of it.  
For instance, given an hour-long lecture on a topic you want to learn about, your goal is to know whether it touches on all the key points before you dedicate time to watching it.  
The intuition is that if it were a PDF instead of a video, you'd be able to search through it.  

**Requirements:**  
* Open this notebook in a free [Google Colab instance](https://colab.research.google.com/github/LiorGazit/LLM_search_inside_youtube_videos/blob/main/Analyze_a_Youtube_video_by_asking_the_LLM.ipynb).  
* This code uses OpenAI's API as its LLM, so a paid **API key** is required.   

Install:

In [None]:
%pip -q install youtube-transcript-api
%pip -q install openai
%pip -q install numpy
%pip -q install pytube
%pip -q install faiss-cpu
%pip -q install tiktoken

Imports:

In [2]:
import os
from youtube_transcript_api import YouTubeTranscriptApi
import faiss
import numpy as np
import openai
import tiktoken
from urllib.parse import urlparse, parse_qs

#### Insert API Key

In [None]:
my_api_key = "..."

#### Pick the YouTube Video and Insert its URL

In [None]:
video_url = "https://www.youtube.com/watch?v=ySEx_Bqxvvo&ab_channel=AlexanderAmini"
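
The URL above carries the video ID in its `v` query parameter, next to an extra `ab_channel` parameter that can be ignored. As a minimal sketch, assuming you may also want to paste `youtu.be` short links or `/embed/` and `/shorts/` URLs, the helper below returns the ID or `None`. The name `extract_video_id_any` is hypothetical and is not part of this notebook.

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs


def extract_video_id_any(url: str) -> Optional[str]:
    """Best-effort video ID extraction (illustrative helper, not from the notebook)."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links look like https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None

    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Standard watch URLs keep the ID in the `v` query parameter
        ids = parse_qs(parsed.query).get("v")
        if ids:
            return ids[0]
        # /embed/<id> and /shorts/<id> keep the ID in the path
        parts = [p for p in parsed.path.split("/") if p]
        if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            return parts[1]

    # Unknown host or no ID found
    return None


print(extract_video_id_any("https://www.youtube.com/watch?v=ySEx_Bqxvvo&ab_channel=AlexanderAmini"))
```

Returning `None` instead of raising lets the caller decide how to report a malformed URL.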

#### Save API Key to Environment Variable

In [5]:
os.environ["OPENAI_API_KEY"] = my_api_key

#### Define functions:

In [6]:
# Extract video ID from URL
def extract_video_id(url):
    query = urlparse(url).query
    params = parse_qs(query)
    return params['v'][0]

# Fetch transcript using youtube-transcript-api
def get_transcript(video_url):
    video_id = extract_video_id(video_url)
    transcript = YouTubeTranscriptApi.get_transcript(video_id, languages=['en'])
    text = ' '.join([t['text'] for t in transcript])
    return text

# Split transcript into chunks
def split_chunks(transcript, max_tokens=500):
    encoding = tiktoken.get_encoding("cl100k_base")
    words = transcript.split()
    chunks, current_chunk = [], []

    for word in words:
        current_chunk.append(word)
        if len(encoding.encode(' '.join(current_chunk))) > max_tokens:
            current_chunk.pop()
            chunks.append(' '.join(current_chunk))
            current_chunk = [word]
    if current_chunk:
        chunks.append(' '.join(current_chunk))
    return chunks

# Get embeddings using updated OpenAI embeddings model
def get_embeddings(chunks, model="text-embedding-3-small"):
    embeddings = openai.embeddings.create(
        input=chunks,
        model=model
    )
    embeddings_list = [e.embedding for e in embeddings.data]
    return np.array(embeddings_list, dtype='float32')

# Build FAISS index
def build_index(embeddings):
    dim = embeddings.shape[1]
    index = faiss.IndexFlatL2(dim)
    index.add(embeddings)
    return index

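# Similarity search
def search_chunks(question, chunks, index, top_k=3, model="text-embedding-3-small"):
    # Embed the question with the same model that embedded the chunks
    query_embedding = openai.embeddings.create(
        input=[question],
        model=model
    ).data[0].embedding
    query_embedding = np.array([query_embedding], dtype='float32')

    # FAISS pads missing results with -1, so never ask for more chunks than exist
    top_k = min(top_k, len(chunks))
    _, indices = index.search(query_embedding, top_k)
    return [chunks[i] for i in indices[0]]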

# Query the LLM with the retrieved context (defaults to gpt-4-turbo)
def query_llm(prompt, model="gpt-4-turbo"):
    completion = openai.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You answer questions based on video transcripts. Drop a new line after every sentence!"},
            {"role": "user", "content": prompt}
        ],
        temperature=0.5,
        max_tokens=1000
    )
    return completion.choices[0].message.content.strip()
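
`split_chunks` above re-encodes the whole growing chunk for every word, which can get slow on long transcripts. One possible alternative, sketched here under the assumption that summing per-word token counts is an acceptable approximation of the chunk's true token count, keeps a running total instead. `split_chunks_linear` and its `count_tokens` parameter are illustrative names. By default every word counts as one token, so the sketch runs without any tokenizer.

```python
from typing import Callable, List


def split_chunks_linear(
    transcript: str,
    max_tokens: int = 500,
    count_tokens: Callable[[str], int] = lambda word: 1,
) -> List[str]:
    """Greedy chunking with a running token total (approximate, illustrative)."""
    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0

    for word in transcript.split():
        n = count_tokens(word)
        # Close the current chunk if adding this word would exceed the budget
        if current and current_tokens + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(word)
        current_tokens += n

    if current:
        chunks.append(" ".join(current))
    return chunks


# Assumption: with tiktoken available, a per-word counter could be passed as
#   count_tokens=lambda w: len(encoding.encode(" " + w))
print(split_chunks_linear("a b c d e", max_tokens=2))
```

Because the per-word counts are only an estimate, a chunk's real token count may differ slightly from `max_tokens`; for retrieval chunks that slack is usually tolerable.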

### Set Up the Retrieval Mechanism:

In [7]:
# Entire pipeline execution
def pipeline(video_url, question):
    print("--- Prompt ---\n")
    print(question)

    # Fetching transcript:
    transcript = get_transcript(video_url)

    # Splitting transcript into chunks:
    chunks = split_chunks(transcript)

    # Getting embeddings:
    embeddings = get_embeddings(chunks)

    # Building FAISS index:
    index = build_index(embeddings)

    # Searching relevant chunks:
    relevant_chunks = search_chunks(question, chunks, index)

    context = "\n\n".join(relevant_chunks)
    prompt = f"Context from video:\n\n{context}\n\nQuestion: {question}\nStart a new line after every sentence in your answer!"

    print("\n--- Answer ---\n")
    return query_llm(prompt)
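
To make the retrieval step less of a black box: `faiss.IndexFlatL2` performs an exact, brute-force search ranked by squared Euclidean distance. The NumPy sketch below mimics that behavior on toy vectors. `l2_top_k` is a hypothetical helper for illustration only, not a replacement for the FAISS index used in this notebook.

```python
from typing import List

import numpy as np


def l2_top_k(query: np.ndarray, vectors: np.ndarray, top_k: int = 3) -> List[int]:
    """Return indices of the rows of `vectors` nearest to `query` (squared L2)."""
    # Never ask for more neighbors than there are stored vectors
    top_k = min(top_k, len(vectors))
    # Broadcasting subtracts the query from every row
    diffs = vectors - query
    distances = (diffs ** 2).sum(axis=1)
    # Smallest distance first
    return np.argsort(distances)[:top_k].tolist()


# Toy "embeddings": three 2-dimensional vectors
toy_vectors = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]], dtype="float32")
toy_query = np.array([0.9, 0.9], dtype="float32")
print(l2_top_k(toy_query, toy_vectors, top_k=2))
```

In the pipeline, the returned indices play the role of `indices[0]` in `search_chunks`: they select which transcript chunks are passed to the LLM as context.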

### Some Questions About the Content of the Video

In [8]:
question = "Do they mention transformers? In what way? Tell me in 2-3 sentences."
print(pipeline(video_url, question))


--- Prompt ---

Do they mention transformers? In what way? Tell me in 2-3 sentences.

--- Answer ---

Yes, transformers are mentioned in the context of utilizing self-attention mechanisms within neural networks. The speaker describes transformers as powerful architectures that are built upon the concept of self-attention to process input data efficiently and extract important features. Multiple self-attention heads within transformers help in forming a rich representation of data by focusing on different relevant parts of the input.


In [9]:
question = "Do they mention attention?"
print(pipeline(video_url, question))


--- Prompt ---

Do they mention attention?

--- Answer ---

Yes, they mention attention multiple times throughout the video. 
They discuss the concept of attention and self-attention as foundational mechanisms in the Transformer architecture. 
They explain how attention helps in identifying and focusing on the most important parts of the input data. 
The video also describes how attention weights are computed and used to extract features that deserve high attention.


In [10]:
question = "Do they mention back propagation? Please provide 2-3 sentences that tell about it."
backprop_answer_english = pipeline(video_url, question)
print(backprop_answer_english)

--- Prompt ---

Do they mention back propagation? Please provide 2-3 sentences that tell about it.

--- Answer ---

Yes, they mention back propagation in the context of training recurrent neural networks (RNNs). The video explains that back propagation in RNNs involves propagating errors back through each time step in the sequence, which is termed as back propagation through time (BPTT). This process helps in updating the weights of the network to minimize the overall loss, accounting for the temporal dynamics of the input sequences.
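
Each question above calls `pipeline` from scratch, so the transcript is fetched, chunked, and embedded again every time. A hedged sketch of one way to avoid that: memoize the per-video preparation step, keyed by URL. `PreparedVideoCache` and the `prepare` callable are hypothetical names. In this notebook, `prepare` could be a function that returns the chunks and the FAISS index for a URL.

```python
from typing import Any, Callable, Dict


class PreparedVideoCache:
    """Memoize an expensive per-video preparation step (illustrative sketch)."""

    def __init__(self, prepare: Callable[[str], Any]) -> None:
        self._prepare = prepare
        self._store: Dict[str, Any] = {}

    def get(self, video_url: str) -> Any:
        # Run the preparation only the first time a URL is seen
        if video_url not in self._store:
            self._store[video_url] = self._prepare(video_url)
        return self._store[video_url]


# Stand-in for the real work (transcript -> chunks -> embeddings -> index)
def fake_prepare(video_url: str) -> str:
    print(f"preparing {video_url}")
    return f"prepared:{video_url}"


cache = PreparedVideoCache(fake_prepare)
cache.get("video-a")  # triggers preparation
cache.get("video-a")  # served from the cache, nothing is recomputed
```

With such a cache, follow-up questions about the same video would only pay for one query embedding and one chat completion.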


#### Translate the Last Response to Hindi

In [11]:
prompt = f"Please translate this answer from English to Hindi: <{backprop_answer_english}>. Make sure to translate properly with the appropriate technical terms."
print(query_llm(prompt))

हां, उन्होंने पुनरावर्ती न्यूरल नेटवर्क (RNNs) को प्रशिक्षित करने के संदर्भ में बैक प्रोपेगेशन का उल्लेख किया है।
वीडियो समझाता है कि RNNs में बैक प्रोपेगेशन का अर्थ है त्रुटियों को अनुक्रम में प्रत्येक समय चरण के माध्यम से वापस प्रसारित करना, जिसे समय के माध्यम से बैक प्रोपेगेशन (BPTT) कहा जाता है।
यह प्रक्रिया नेटवर्क के वजन को अपडेट करने में मदद करती है ताकि कुल हानि को कम किया जा सके, इनपुट अनुक्रमों की समयिक गतिशीलता को ध्यान में रखते हुए।


#### Translate the Last Response to Tamil

In [12]:
prompt = f"Please translate this answer from English to Tamil: <{backprop_answer_english}>. Make sure to translate properly with the appropriate technical terms."
print(query_llm(prompt))

ஆம், அவர்கள் மீளாய்வு நரம்பியல் வலைகள் (RNNs) பயிற்சியின் சூழலில் பின்னோக்கு பரவலை குறிப்பிடுகின்றனர். 
வீடியோ விளக்கம் அளிக்கிறது என்பது ஆர்.என்.என்-களில் பின்னோக்கு பரவல் என்பது கால அடுக்குகளில் ஒவ்வொரு படியாக பிழைகளை பின்னோக்கிப் பரப்புவதை உள்ளடக்கியது, இது கால வழியாக பின்னோக்கு பரவல் (BPTT) என அழைக்கப்படுகிறது. 
இந்த செயல்முறை நெட்வொர்க்கின் எடைகளை புதுப்பித்து, மொத்த இழப்பைக் குறைப்பதற்கு உதவுகிறது, உள்ளீட்டு அடுக்குகளின் கால இயக்க விதிமுறைகளை கணக்கில் கொள்கிறது.
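
The two translation cells differ only in the target language. A small helper, sketched here with the hypothetical name `build_translation_prompt`, could build the same prompt for any language:

```python
def build_translation_prompt(
    answer: str,
    target_language: str,
    source_language: str = "English",
) -> str:
    """Build the translation prompt used above for an arbitrary target language."""
    return (
        f"Please translate this answer from {source_language} to {target_language}: "
        f"<{answer}>. Make sure to translate properly with the appropriate technical terms."
    )


# In the notebook this could be combined with query_llm, for example:
#   for language in ["Hindi", "Tamil"]:
#       print(query_llm(build_translation_prompt(backprop_answer_english, language)))
print(build_translation_prompt("Yes, they mention back propagation.", "Hindi"))
```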


### The Video's Text that the LLM Can Use to Answer:

In [13]:
print(get_transcript(video_url))

Hello everyone! I hope you enjoyed Alexander's 
first lecture. I'm Ava and in this second lecture,   Lecture 2, we're going to focus on this 
question of sequence modeling -- how   we can build neural networks that can 
handle and learn from sequential data. So in Alexander's first lecture he 
introduced the essentials of neural   networks starting with perceptrons building 
up to feed forward models and how you can   actually train these models and start 
to think about deploying them forward. Now we're going to turn our attention to 
specific types of problems that involve   sequential processing of data and we'll 
realize why these types of problems require   a different way of implementing and building 
neural networks from what we've seen so far. And I think some of the components in 
this lecture traditionally can be a bit   confusing or daunting at first but what I 
really really want to do is to build this   understanding up from the foundations walking 
through step by step de