<h2>YouTube audio</h2>
<p>
Building chat or QA applications on YouTube videos is a topic of high interest.

Below we show how to go from a YouTube URL to the video's audio, then to text, and finally to chat.

We will use OpenAIWhisperParser, which calls the OpenAI Whisper API to transcribe audio to text, and OpenAIWhisperParserLocal for local transcription, for example on private clouds or on-premises.

Note: You will need to have an <b>OPENAI_API_KEY</b> supplied.</p>
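
<p>
One way to fail early when the key is missing is a small check like the sketch below. Note that <b>require_env</b> is a hypothetical helper written for this example; it is not part of LangChain or the OpenAI SDK.
</p>

```python
import os


def require_env(name, env=None):
    """Return the value of environment variable `name`, or raise if it is unset or empty.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the notebook.")
    return value
```

<p>
Calling <b>require_env("OPENAI_API_KEY")</b> at the top of the notebook would raise immediately instead of failing later, in the middle of a transcription run.
</p>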

In [1]:
%pip install --upgrade --quiet  yt_dlp
%pip install --upgrade --quiet  pydub
%pip install --upgrade --quiet  librosa

Note: you may need to restart the kernel to use updated packages.



[notice] A new release of pip available: 22.2.1 -> 23.3.2
[notice] To update, run: python.exe -m pip install --upgrade pip



In [13]:
!pip show langchain

Name: langchain
Version: 0.1.0
Summary: Building applications with LLMs through composability
Home-page: https://github.com/langchain-ai/langchain
Author: 
Author-email: 
License: MIT
Location: c:\pythonenv\python3106_langchain\lib\site-packages
Requires: aiohttp, async-timeout, dataclasses-json, jsonpatch, langchain-community, langchain-core, langsmith, numpy, pydantic, PyYAML, requests, SQLAlchemy, tenacity
Required-by: langchain-experimental


In [1]:
from langchain_community.document_loaders.blob_loaders.youtube_audio import (
    YoutubeAudioLoader,
)
from langchain_community.document_loaders.generic import GenericLoader
from langchain_community.document_loaders.parsers.audio import (
    OpenAIWhisperParser,
    OpenAIWhisperParserLocal,
)

<h3>YouTube url to text</h3>
<p>
Use YoutubeAudioLoader to download the audio files.
Then use OpenAIWhisperParser() to transcribe them to text.
Let’s take two of Andrej Karpathy’s YouTube lectures as an example!
</p>
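
<p>
Conceptually, the loader yields raw audio blobs and the parser turns each blob into text documents. The sketch below illustrates that composition with plain Python; <b>FakeAudioLoader</b>, <b>FakeTranscriber</b> and <b>load_all</b> are hypothetical stand-ins that download and transcribe nothing, and the example.com URLs are placeholders.
</p>

```python
class FakeAudioLoader:
    """Hypothetical stand-in for a blob loader: yields one raw 'blob' per URL."""

    def __init__(self, urls):
        self.urls = urls

    def yield_blobs(self):
        for url in self.urls:
            # A real loader would download the audio; here we only fake the payload.
            yield {"source": url, "data": f"<audio bytes for {url}>"}


class FakeTranscriber:
    """Hypothetical stand-in for a parser: turns one blob into text documents."""

    def lazy_parse(self, blob):
        # A real parser would send the audio to a speech-to-text model.
        yield {
            "page_content": f"transcript of {blob['source']}",
            "metadata": {"source": blob["source"]},
        }


def load_all(blob_loader, parser):
    """Feed every blob from the loader through the parser and collect the documents."""
    documents = []
    for blob in blob_loader.yield_blobs():
        documents.extend(parser.lazy_parse(blob))
    return documents


example_docs = load_all(
    FakeAudioLoader(["https://example.com/a", "https://example.com/b"]),
    FakeTranscriber(),
)
```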

In [2]:
# set a flag to switch between local and remote parsing
# change this to True if you want to use local parsing
local = False
# Two Karpathy lecture videos
urls = ["https://youtu.be/kCc8FmEb1nY", "https://youtu.be/VMj-3S1tku0"]

# Directory to save audio files
save_dir = "~/Downloads/YouTube"

# Transcribe the videos to text
if local:
    loader = GenericLoader(
        YoutubeAudioLoader(urls, save_dir), OpenAIWhisperParserLocal()
    )
else:
    loader = GenericLoader(YoutubeAudioLoader(urls, save_dir), OpenAIWhisperParser())
# Note: antivirus software may block the download step below; allow it if needed.
docs = loader.load()

[youtube] Extracting URL: https://youtu.be/kCc8FmEb1nY
[youtube] kCc8FmEb1nY: Downloading webpage
[youtube] kCc8FmEb1nY: Downloading ios player API JSON
[youtube] kCc8FmEb1nY: Downloading android player API JSON
[youtube] kCc8FmEb1nY: Downloading m3u8 information
[info] kCc8FmEb1nY: Downloading 1 format(s): 140
[download] C:\Users\leonwoo\Downloads\YouTube\Let's build GPT： from scratch, in code, spelled out..m4a has already been downloaded
[download] 100% of  107.73MiB
[ExtractAudio] Not converting audio C:\Users\leonwoo\Downloads\YouTube\Let's build GPT： from scratch, in code, spelled out..m4a; file is already in target format m4a
[youtube] Extracting URL: https://youtu.be/VMj-3S1tku0
[youtube] VMj-3S1tku0: Downloading webpage
[youtube] VMj-3S1tku0: Downloading ios player API JSON
[youtube] VMj-3S1tku0: Downloading android player API JSON
[youtube] VMj-3S1tku0: Downloading m3u8 information
[info] VMj-3S1tku0: Downloading 1 format(s): 140
[download] C:\Users\leonwoo\Downloads\YouTube\T

In [3]:
# Returns a list of Documents, which can be easily viewed or parsed
docs[0].page_content[0:500]

'Hi everyone. So by now you have probably heard of ChatGPT. It has taken the world and the AI community by storm and it is a system that allows you to interact with an AI and give it text-based tasks. So for example, we can ask ChatGPT to write us a small haiku about how important it is that people understand AI and then they can use it to improve the world and make it more prosperous. So when we run this, AI knowledge brings prosperity for all to see, embrace its power. Okay, not bad. And so you'

<h3>Building a chat app from YouTube video

In [4]:
# Given Documents, we can easily enable chat / question answering.
from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

In [5]:
# Combine the transcripts of all documents into one string
combined_docs = [doc.page_content for doc in docs]
text = " ".join(combined_docs)

In [6]:
# Split the combined text into overlapping chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=150)
splits = text_splitter.split_text(text)
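
<p>
To make the roles of chunk_size and chunk_overlap concrete, here is a minimal sketch that uses fixed-size character windows. This is a simplification: RecursiveCharacterTextSplitter prefers to break at separators such as paragraphs, lines and spaces, which this sketch does not attempt. <b>split_with_overlap</b> is a hypothetical helper.
</p>

```python
def split_with_overlap(text, chunk_size=1500, chunk_overlap=150):
    """Cut `text` into windows of at most `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters
demo_chunks = split_with_overlap(sample, chunk_size=12, chunk_overlap=4)
```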

In [8]:
# Build an index
embeddings = OpenAIEmbeddings()
vectordb = FAISS.from_texts(splits, embeddings)
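
<p>
The index stores one vector per chunk and answers queries by similarity. The sketch below shows the idea with a toy word-count "embedding" and cosine similarity; it is an assumption-laden illustration, not how OpenAIEmbeddings or FAISS work internally, and <b>embed</b>, <b>cosine</b> and <b>top_k</b> are hypothetical helpers.
</p>

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query, texts, k=2):
    """Return the k texts most similar to the query."""
    q = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:k]


toy_chunks = [
    "we zero the gradient before each backward pass",
    "the tokenizer maps characters to integers",
    "attention uses queries keys and values",
]
best = top_k("why zero the gradient", toy_chunks, k=1)
```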

In [9]:
# Build a QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    chain_type="stuff",
    retriever=vectordb.as_retriever(),
)
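
<p>
With chain_type="stuff", the retrieved chunks are all placed into a single prompt together with the question. A rough sketch of that assembly step follows; <b>build_stuff_prompt</b> is hypothetical and the prompt wording is an assumption, not LangChain's actual template.
</p>

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Concatenate every retrieved chunk into one context block, then append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


demo_prompt = build_stuff_prompt("Why zero the gradient?", ["chunk one", "chunk two"])
```

<p>
Because every chunk is sent in one request, this approach only suits cases where the retrieved chunks fit within the model's context window.
</p>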

In [12]:
# Ask a question!
query = "Why do we need to zero out the gradient before backprop at each step?"
qa_chain.invoke({"query": query})

{'query': 'Why do we need to zero out the gradient before backprop at each step?',
 'result': "We need to zero out the gradient before backpropagation at each step because the gradients accumulate during the backward pass. If we don't reset the gradients to zero, the gradients from previous iterations will continue to accumulate and affect the current iteration, leading to incorrect updates of the model parameters. By zeroing out the gradients before each backward pass, we ensure that only the gradients from the current iteration are considered for updating the model parameters."}

In [None]:
query = "What is the difference between an encoder and decoder?"
qa_chain.invoke({"query": query})

In [None]:
query = "For any token, what are x, k, v, and q?"
qa_chain.invoke({"query": query})

</h3>