# Analyze a YouTube video by asking the LLM
By [Lior Gazit](https://www.linkedin.com/in/liorgazit/)  

<a target="_blank" href="https://colab.research.google.com/github/LiorGazit/LLM_search_inside_youtube_videos/blob/main/Analyze_a_Youtube_video_by_asking_the_LLM.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

**Description of the notebook:**  
Pick a YouTube video and find out what value it offers you without spending the time to watch all of it.  
For instance: you found an hour-long lecture on a topic you want to learn about, and your goal is to know whether it covers all the key points before dedicating the time to watch it.  
The intuition: if it were a PDF instead of a video, you'd be able to search through it.  

**Requirements:**  
* Open this notebook in a free [Google Colab instance](https://colab.research.google.com/).  
* This code uses OpenAI's API as its LLM, so a paid **API key** is required.  

Install:

In [1]:
!pip -q install --upgrade "embedchain[youtube]"
!pip -q install pytube
!pip -q install openai
!pip -q install youtube-transcript-api


ERROR: Cannot uninstall numpy 1.25.2, RECORD file not found. You might be able to recover from this via: 'pip install --force-reinstall --no-deps numpy==1.25.2'.


Imports:

In [2]:
import os
import textwrap
import pandas as pd
import json

from embedchain import App


A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "c:\Users\gazit\miniconda3\envs\py39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\Users\gazit\miniconda3\envs\py39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "c:\Users\gazit\miniconda3\envs\py39\lib\site-packages\ipykernel_launcher.py", line 17, in <module>
    app.launch_new_instance()
  File "c:\Users\gazit\miniconda3\envs\py39\lib\site-packages\traitlets\config\application.py", line 1043, in launch_instance
    app.start()
 

AttributeError: _ARRAY_API not found

ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
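
The output above reports that a compiled dependency was built against NumPy 1.x while NumPy 2.0.2 is installed, and it suggests downgrading to `numpy<2`. As a minimal sketch, the check below tells you which side of that boundary an environment is on before you import `embedchain`. The helper name `is_numpy_2_or_newer` is hypothetical and not part of the notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def is_numpy_2_or_newer(version_string):
    """Return True when the version string has a major version of 2 or higher."""
    major = int(version_string.split(".")[0])
    return major >= 2


def installed_numpy_version():
    """Return the installed NumPy version string, or None if NumPy is absent."""
    try:
        return version("numpy")
    except PackageNotFoundError:
        return None
```

If `is_numpy_2_or_newer(installed_numpy_version())` is true and the import fails as shown, the error text itself points to installing `numpy<2` and restarting the runtime.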

#### Insert API Key

In [None]:
my_api_key = "..."

#### Save API Key to Environment Variable

In [None]:
os.environ["OPENAI_API_KEY"] = my_api_key
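
Typing the key directly into a cell makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming you would rather be prompted: the helper below reuses a key that is already in the environment and otherwise asks for one through a prompt function such as `getpass.getpass`. The name `resolve_api_key` is hypothetical and not part of the notebook.

```python
import getpass
import os


def resolve_api_key(env=os.environ, prompt_fn=getpass.getpass):
    """Return the OpenAI key from `env`, prompting for it only if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # getpass hides the typed characters, so the key never shows in cell output.
        key = prompt_fn("OpenAI API key: ").strip()
    if not key:
        raise ValueError("No API key was provided.")
    return key
```

Usage would look like `os.environ["OPENAI_API_KEY"] = resolve_api_key()`.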

#### Pick the YouTube Video and Insert its URL

In [None]:
video_url = "https://www.youtube.com/watch?v=ySEx_Bqxvvo&ab_channel=AlexanderAmini"
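
The URL above carries an extra `ab_channel` query parameter alongside the video id. As a small sketch, the standard library can pull out the id and rebuild a clean URL. This handles only the `watch?v=` form shown here, not `youtu.be` short links, and the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the value of the `v` query parameter from a youtube.com/watch URL."""
    query = parse_qs(urlparse(url).query)
    ids = query.get("v")
    if not ids:
        raise ValueError(f"No video id found in: {url}")
    return ids[0]


def canonical_watch_url(url):
    """Rebuild the URL with only the video id, dropping extras such as `ab_channel`."""
    return f"https://www.youtube.com/watch?v={extract_video_id(url)}"
```

For the URL in this notebook, `extract_video_id(video_url)` gives `ySEx_Bqxvvo`.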

Configure the LLM that answers the prompts and the model that embeds the transcript:

In [None]:
models_config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-3.5-turbo",
            "temperature": 0.5,
            "max_tokens": 1000,
            "top_p": 1,
            "stream": False
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-ada-002"
        }
    }
}

### Set Up the Retrieval Mechanism:

In [None]:
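# Build the app from the config dict; from_config is a classmethod, so no App() instance is needed first
lecture_RAG = App.from_config(config=models_config)
# Clear anything stored by a previous run so answers come only from this video
lecture_RAG.reset()
# Fetch the video's transcript, chunk it, embed it, and store it for retrieval
lecture_RAG.add(data_type="youtube_video", source=video_url)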
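
Conceptually, retrieval means splitting the transcript into chunks, embedding each chunk, and at query time handing the chunks closest to the question to the LLM. The toy sketch below illustrates that idea only; it is not embedchain's implementation. It swaps the OpenAI embeddings for plain word counts, and `chunk_text`, `embed`, `cosine`, and `top_k_chunks` are hypothetical names.

```python
import math
import re
from collections import Counter


def chunk_text(text, words_per_chunk=50):
    """Split text into consecutive chunks of at most `words_per_chunk` words."""
    words = text.split()
    return [" ".join(words[i:i + words_per_chunk])
            for i in range(0, len(words), words_per_chunk)]


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(question, chunks, k=2):
    """Return the k chunks most similar to the question, best first."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

A real embedding model places paraphrases close together, which word counts cannot do; the ranking step is the part that carries over.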



### Some Questions About the Content of the Video

In [None]:
print(lecture_RAG.query("Do they mention transformers? In what way? Tell me in 2-3 sentences."))

In [None]:
print(lecture_RAG.query("Do they mention attention?"))

In [None]:
backprop_answer_english = lecture_RAG.query("Do they mention backpropagation? Please provide 2-3 sentences that tell about it.")
print(backprop_answer_english)

#### Translate the Last Response to Hindi

In [None]:
print(lecture_RAG.query(f"Please translate this answer from English to Hindi: <{backprop_answer_english}>. Make sure to translate properly with the appropriate technical terms."))

#### Translate the Last Response to Tamil

In [None]:
print(lecture_RAG.query(f"Please translate this answer from English to Tamil: <{backprop_answer_english}>. Make sure to translate properly with the appropriate technical terms."))
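
The two translation cells differ only in the target language, so the prompt can be built by a small function. The following is a sketch with hypothetical helper names; `ask` stands for any callable that takes a prompt and returns text, for example `lecture_RAG.query`.

```python
def build_translation_prompt(answer, target_language, source_language="English"):
    """Build the translation prompt used in the cells above for any target language."""
    return (
        f"Please translate this answer from {source_language} to {target_language}: "
        f"<{answer}>. Make sure to translate properly with the appropriate technical terms."
    )


def translate_answer(ask, answer, languages):
    """Call `ask` once per language and collect the replies keyed by language."""
    return {lang: ask(build_translation_prompt(answer, lang)) for lang in languages}
```

Usage would look like `translate_answer(lecture_RAG.query, backprop_answer_english, ["Hindi", "Tamil"])`.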

### The Video's Text that the LLM Can Use to Answer:

In [None]:
lecture_RAG.db.get()['documents']
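
The raw list can be long and hard to read in cell output. A minimal sketch for previewing it, assuming `lecture_RAG.db.get()['documents']` is a list of strings; `preview_chunks` is a hypothetical helper built on the `textwrap` module imported earlier.

```python
import textwrap


def preview_chunks(documents, width=80, max_chunks=3, max_chars=300):
    """Return a readable preview: the first few chunks, shortened and wrapped."""
    blocks = []
    for i, doc in enumerate(documents[:max_chunks]):
        # shorten() collapses whitespace and truncates on a word boundary.
        snippet = textwrap.shorten(doc, width=max_chars, placeholder=" ...")
        blocks.append(f"[chunk {i}]\n" + textwrap.fill(snippet, width=width))
    return "\n\n".join(blocks)
```

Usage would look like `print(preview_chunks(lecture_RAG.db.get()['documents']))`.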