# Analyze a YouTube video by asking the LLM
By [Lior Gazit](https://www.linkedin.com/in/liorgazit/)  

<a target="_blank" href="https://colab.research.google.com/github/LiorGazit/LLM_search_inside_youtube_videos/blob/main/Analyze_a_Youtube_video_by_asking_the_LLM.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

**Description of the notebook:**  
Pick a YouTube video and find out what value it offers you without spending the time to watch all of it.  
For instance, given an hour-long lecture on a topic you want to learn, you can check whether it touches on all the key points before dedicating time to watch it.  
The intuition: if the lecture were a PDF instead of a video, you would be able to search through it.  

**Requirements:**  
* Open this notebook in a free [Google Colab instance](https://colab.research.google.com/).  
* This notebook uses OpenAI's API as its LLM provider, so a paid **API key** is required.   

Install:

In [1]:
!pip -q install --upgrade "embedchain[youtube]"
!pip -q install pytube
!pip -q install openai
!pip -q install youtube-transcript-api
!pip install --upgrade pandas




In [3]:
!pip install onnxruntime



Imports:

In [1]:
import os
import textwrap
import pandas as pd
import json

from embedchain import App


#### Insert API Key

In [None]:
my_api_key = "..."

#### Save API Key to Environment Variable

In [None]:
os.environ["OPENAI_API_KEY"] = my_api_key
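Pasting the key into a cell risks leaking it when the notebook is shared. A minimal alternative sketch, assuming an interactive session: read the key from the environment and prompt for it with `getpass` only when it is missing. The helper name `ensure_api_key` and its `prompt_fn` parameter are hypothetical and not part of the original notebook.

```python
import os
from getpass import getpass


def ensure_api_key(env_var="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the key from the environment, prompting for it only if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed characters, so the key is not echoed in the notebook.
        key = prompt_fn(f"Enter value for {env_var}: ").strip()
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` once before building the app could stand in for the two cells above.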

#### Pick the YouTube Video and Insert its URL

In [None]:
video_url = "https://www.youtube.com/watch?v=ySEx_Bqxvvo&ab_channel=AlexanderAmini"
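Only the `v` query parameter identifies the video; extra parameters such as `ab_channel` can be dropped. As a quick sanity check before loading, the ID can be pulled out of the URL with the standard library. This is a sketch, and `extract_video_id` is a hypothetical helper, not something the notebook or Embedchain provides.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the YouTube video ID from a watch or youtu.be URL, or None if absent."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.endswith("youtu.be"):
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    # Standard links carry it in the "v" query parameter; other parameters
    # such as "ab_channel" are ignored.
    values = parse_qs(parsed.query).get("v")
    return values[0] if values else None


video_url = "https://www.youtube.com/watch?v=ySEx_Bqxvvo&ab_channel=AlexanderAmini"
print(extract_video_id(video_url))  # ySEx_Bqxvvo
```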

Set up the configuration for the embedding model and the prompting LLM:

In [None]:
models_config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-3.5-turbo",
            "temperature": 0.5,
            "max_tokens": 1000,
            "top_p": 1,
            "stream": False
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-ada-002"
        }
    }
}
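To experiment with different generation settings without editing the dictionary by hand, one option is to derive variants from a base config. The sketch below assumes the same nested `llm` / `config` layout as `models_config`; the helper `with_llm_overrides` is hypothetical. A lower `temperature` generally makes answers less varied, which may suit factual questions about a lecture.

```python
import copy


def with_llm_overrides(base_config, **overrides):
    """Return a copy of base_config with the given keys replaced under llm.config."""
    new_config = copy.deepcopy(base_config)  # leave the original dict untouched
    new_config["llm"]["config"].update(overrides)
    return new_config


# Same shape as the models_config dict above, trimmed to the LLM part for brevity.
example_config = {
    "llm": {"provider": "openai", "config": {"model": "gpt-3.5-turbo", "temperature": 0.5}}
}
factual_config = with_llm_overrides(example_config, temperature=0.0)
print(factual_config["llm"]["config"]["temperature"])  # 0.0
print(example_config["llm"]["config"]["temperature"])  # 0.5
```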

### Set Up the Retrieval Mechanism:

In [None]:
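# from_config is a class method, so there is no need to instantiate App first.
lecture_RAG = App.from_config(config=models_config)
# Clear any data left over from a previous run.
lecture_RAG.reset()
# Fetch the video's transcript, split it into chunks, and embed them.
lecture_RAG.add(data_type="youtube_video", source=video_url)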
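Conceptually, a retrieval setup like this one splits the transcript into chunks, embeds each chunk, and at query time hands the chunks most similar to the question to the LLM. The toy sketch below illustrates only that ranking step. It uses word-count vectors in place of a real embedding model and made-up transcript chunks, and it is not Embedchain's actual implementation.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, top_k=1):
    """Return the top_k chunks most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(q_vec, embed(chunk)), reverse=True)
    return ranked[:top_k]


# Made-up transcript chunks, for illustration only.
chunks = [
    "today we introduce neural networks and the perceptron",
    "backpropagation computes gradients of the loss with respect to the weights",
    "thanks everyone and see you in the next lecture",
]
print(retrieve("how are gradients computed with backpropagation", chunks))
```

In the real pipeline the retrieved chunks are added to the prompt, so the LLM answers from the video's own text rather than from memory alone.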



### Some Questions About the Content of the Video

In [None]:
print(lecture_RAG.query("Do they mention transformers? In what way? Tell me in 2-3 sentences."))

In [None]:
print(lecture_RAG.query("Do they mention attention?"))

In [None]:
backprop_answer_english = lecture_RAG.query("Do they mention backpropagation? Please provide 2-3 sentences that tell about it.")
print(backprop_answer_english)
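Long answers print as a single wide line. The notebook imports `textwrap` but never uses it; a small helper along these lines could make the output easier to read. The name `print_wrapped` and the placeholder text are illustrative assumptions.

```python
import textwrap


def print_wrapped(text, width=80):
    """Print text wrapped to the given width and return the wrapped string."""
    wrapped = textwrap.fill(text, width=width)
    print(wrapped)
    return wrapped


# Placeholder text standing in for a real answer string such as backprop_answer_english.
sample_answer = "This is a placeholder answer. " * 6
wrapped = print_wrapped(sample_answer, width=40)
```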

#### Translate the Last Response to Hindi

In [None]:
print(lecture_RAG.query(f"Please translate this answer from English to Hindi: <{backprop_answer_english}>. Make sure to translate properly with the appropriate technical terms."))

#### Translate the Last Response to Tamil

In [None]:
print(lecture_RAG.query(f"Please translate this answer from English to Tamil: <{backprop_answer_english}>. Make sure to translate properly with the appropriate technical terms."))

### The Video's Text that the LLM Can Use to Answer:

In [None]:
lecture_RAG.db.get()['documents']
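The raw list of chunks can be long. A short preview helper, sketched below, shows one line per chunk and can filter by keyword, which approximates the "search it like a PDF" idea from the description. It assumes `documents` is a list of strings; the helper name and the sample chunks are made up for illustration.

```python
import textwrap


def preview_chunks(documents, keyword=None, width=70):
    """Return one-line previews of chunks, optionally only those containing keyword."""
    previews = []
    for index, chunk in enumerate(documents):
        if keyword and keyword.lower() not in chunk.lower():
            continue
        # shorten collapses whitespace and truncates with a placeholder.
        previews.append(f"[{index}] " + textwrap.shorten(chunk, width=width, placeholder=" ..."))
    return previews


# Made-up chunks for illustration; in the notebook, pass the 'documents' list from the cell above.
sample_documents = [
    "Welcome to the course. Today we cover the basics of deep learning.",
    "Backpropagation applies the chain rule to compute gradients layer by layer.",
]
for line in preview_chunks(sample_documents, keyword="backpropagation"):
    print(line)
```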