YouTube Video Transcript

In [59]:
from youtube_transcript_api import YouTubeTranscriptApi
import re

In [63]:
yt_api = YouTubeTranscriptApi()

video_url = input("Enter YouTube video URL: ")


def get_video_id(url):
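    # Escape the dot so "youtu.be" is matched literally rather than as "youtu" + any character + "be".
    match = re.search(r"(?:v=|youtu\.be/)([\w-]{11})", url)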
    return match.group(1) if match else None

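video_id = get_video_id(video_url)
if video_id is None:
    # Fail early with a readable message instead of passing None to the API.
    raise ValueError(f"Could not find a video ID in: {video_url}")
video_transcript = yt_api.get_transcript(video_id)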

Enter YouTube video URL:  https://www.youtube.com/watch?v=C6xOUzgaIts
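
The regular expression above covers the watch?v= and youtu.be forms. As a minimal sketch of a stricter alternative, the helper below parses the URL with the standard library and also accepts /embed/, /shorts/ and /live/ paths. The name extract_video_id and the list of accepted path prefixes are assumptions made for illustration, not part of the notebook.

```python
import re
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the 11-character video ID, or None if none is found.

    Hypothetical helper: the accepted hosts and path prefixes are assumptions.
    """
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    candidate = None

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            # Regular watch pages carry the ID in the "v" query parameter.
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2 and parts[0] in {"embed", "shorts", "live"}:
                candidate = parts[1]

    if candidate and re.fullmatch(r"[A-Za-z0-9_-]{11}", candidate):
        return candidate
    return None


print(extract_video_id("https://www.youtube.com/watch?v=C6xOUzgaIts"))
```

Unlike the regex, this version rejects URLs from other hosts that merely contain "v=" followed by 11 characters.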


In [64]:
# Join the caption segments into one string, with a blank line after each segment.
output = ''
for x in video_transcript:
    sentence = x["text"]
    output += f' {sentence}\n\n'
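
The captions contain hard line breaks inside a single segment (for example "NASA has used it to\nconstruct massive rockets"). A minimal sketch of one way to flatten them before splitting is shown below. It assumes each segment is a dict with a "text" key, as in the loop above; flatten_captions is a hypothetical helper and the sample list is copied from the first few captions of this video.

```python
def flatten_captions(segments):
    """Join caption segments into one string with single spaces.

    Assumes each segment is a dict with a "text" key.
    """
    pieces = []
    for segment in segments:
        # str.split() with no arguments drops all runs of whitespace,
        # including the line breaks inside a single caption.
        text = " ".join(segment["text"].split())
        if text:
            pieces.append(text)
    return " ".join(pieces)


sample = [
    {"text": "This is the Vehicle Assembly Building."},
    {"text": "NASA has used it to\nconstruct massive rockets"},
    {"text": "before they lift off from the launch pad."},
]
print(flatten_captions(sample))
```

Note that flattened text has no blank-line separators left, so a splitter that prefers to break on "\n\n" will fall back to breaking on spaces.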
    

In [65]:
output

' This is the Vehicle Assembly Building.\n\n NASA has used it to\nconstruct massive rockets\n\n before they lift off from the launch pad.\n\n In this video, we\'ll\nsee inside the building\n\n and how it\'s been used for\nmore than half a century.\n\n (graphics whooshing)\n(explosion booms)\n\n This video is sponsored by Brilliant.\n\n In the 1960s, NASA was headed to the moon.\n\n The launch vehicle for this\nwas the giant Saturn V Rocket\n\n with the Apollo Spacecraft at the top.\n\n They needed a place to assemble\nthe pieces of the rocket\n\n and the launch pad wasn\'t set up for this.\n\n This is why they built the\nVehicle Assembly Building,\n\n often just called the VAB.\n\n This building is still in use today.\n\n It\'s a controlled environment\n\n that protects from the outside weather.\n\n It keeps the rocket\nsafe until it has moved\n\n to one of the launch pads,\n\n usually just a few\nweeks before the launch.\n\n The VAB has been mainly used\nto assemble three rockets.\n\n

Text Splitting and Embedding the Transcript

In [66]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [67]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)

chunks = text_splitter.split_text(output)
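
To make chunk_size and chunk_overlap concrete, here is a deliberately simplified sketch of a fixed-width sliding window over characters. It is not the algorithm RecursiveCharacterTextSplitter uses (that splitter tries to break on separators such as blank lines first); split_with_overlap is a hypothetical function that only illustrates how consecutive chunks share text.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


demo = split_with_overlap("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
print(demo)
```

Each chunk starts with the last three characters of the previous one, which is the same effect as the repeated "with the Apollo Spacecraft at the top." sentence at the boundary of the first two chunks in the output of the next cell.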

In [68]:
chunks

["This is the Vehicle Assembly Building.\n\n NASA has used it to\nconstruct massive rockets\n\n before they lift off from the launch pad.\n\n In this video, we'll\nsee inside the building\n\n and how it's been used for\nmore than half a century.\n\n (graphics whooshing)\n(explosion booms)\n\n This video is sponsored by Brilliant.\n\n In the 1960s, NASA was headed to the moon.\n\n The launch vehicle for this\nwas the giant Saturn V Rocket\n\n with the Apollo Spacecraft at the top.",
 "with the Apollo Spacecraft at the top.\n\n They needed a place to assemble\nthe pieces of the rocket\n\n and the launch pad wasn't set up for this.\n\n This is why they built the\nVehicle Assembly Building,\n\n often just called the VAB.\n\n This building is still in use today.\n\n It's a controlled environment\n\n that protects from the outside weather.\n\n It keeps the rocket\nsafe until it has moved\n\n to one of the launch pads,\n\n usually just a few\nweeks before the launch.",
 'usually just a few\nw

In [69]:
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

In [70]:
embedding_text = embeddings.embed_documents(chunks)
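
embed_documents returns one list of floats per chunk. Two such vectors are usually compared by the angle between them; the sketch below computes cosine similarity in plain Python. The function name and the toy vectors are illustrative assumptions, and real embeddings are far longer than two or three numbers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

With the notebook's variables, cosine_similarity(embedding_text[0], embedding_text[1]) would compare the first two chunks.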

In [71]:
embedding_text

[[0.001707395322705383,
  -0.0027429694580502774,
  -0.00889534046338013,
  -0.012243667738258898,
  -0.015505771305683775,
  -0.051245209335321375,
  0.009189936423089526,
  0.03302341359658199,
  -0.01850920747603731,
  0.038915317889608586,
  -0.011417363882970943,
  -0.010145573976913154,
  -0.03382816199066776,
  -0.033109635441390714,
  0.008952822624575196,
  0.055585101795339244,
  -0.01183410838748195,
  0.014083092239888132,
  0.0010131204135106326,
  -0.0022974841057723815,
  0.01641111278336898,
  -0.02270539313608363,
  -0.016641041428149247,
  -0.0027860810789465777,
  0.004900340970532667,
  -0.034834098414597565,
  0.0022741318904753935,
  0.010598245181417047,
  0.00578412698701569,
  0.03719085864169207,
  0.014542947666803504,
  0.013687903196579306,
  -0.008406743490205933,
  0.013702273504047426,
  0.005342233978774178,
  0.03359823334588749,
  0.03566758742361958,
  0.026010608557235453,
  -0.024688521643716658,
  0.005040453796653305,
  0.014427984275735953,
  -0

Storing the Embeddings in a Vector Database

In [72]:
from langchain_core.documents import Document

document = [Document(page_content=chunk) for chunk in chunks]

In [73]:
from langchain_community.vectorstores import FAISS

vector_store = FAISS.from_documents(document, embeddings)

In [74]:
retriever = vector_store.as_retriever()
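
Conceptually, the retriever embeds the question and returns the stored chunks whose vectors are closest to it. The sketch below shows that idea as a brute-force search over a small in-memory list using squared Euclidean distance. It is an illustration under assumptions, not FAISS itself: top_k_by_l2 and the toy vectors are hypothetical, and the default of k=4 mirrors what LangChain's similarity search uses by default as far as I know.

```python
def top_k_by_l2(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors closest to query_vec (squared L2 distance)."""
    scored = []
    for index, vec in enumerate(doc_vecs):
        distance = sum((q - d) ** 2 for q, d in zip(query_vec, vec))
        scored.append((distance, index))
    scored.sort()  # smallest distance first
    return [index for _, index in scored[:k]]


toy_vectors = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1], [5.0, 5.0]]
print(top_k_by_l2([1.0, 1.0], toy_vectors, k=2))
```

A real index avoids scanning every vector, but the ranking it produces answers the same question: which chunks are nearest to the query.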

Define Prompt and Initialize LLM

In [75]:
from langchain.prompts import PromptTemplate
from langchain.chat_models import ChatOpenAI

prompt_template = """
Answer the question "{input}" based on the context below:

<context>
{context}
</context>
"""

prompt = PromptTemplate.from_template(prompt_template)
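
To see what the model will eventually receive, the template can be filled in by hand. The sketch below uses plain str.format on a copy of the template text; the question is a made-up example and the context is one caption taken from the transcript above.

```python
# Copy of the template text from the cell above.
prompt_template = """
Answer the question "{input}" based on the context below:

<context>
{context}
</context>
"""

filled = prompt_template.format(
    input="What is the VAB?",  # hypothetical question
    context="This is why they built the Vehicle Assembly Building, often just called the VAB.",
)
print(filled)
```

Both placeholders are replaced with plain text, so whatever the retriever returns ends up inside the context tags verbatim.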

In [76]:
llm = ChatOpenAI(model="gpt-3.5-turbo")

Building the Retrieval Chain

In [77]:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain

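# Place the retrieved documents into the prompt's {context} slot and send it to the LLM.
combine_docs_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)
# Retrieve documents for the "input" question, then hand them to the chain above.
chain = create_retrieval_chain(retriever, combine_docs_chain)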
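
The data flow of the two chains can be sketched without LangChain or an API key. In the block below, fake_retriever and fake_llm are stand-ins invented for illustration, and run_retrieval_chain is a hypothetical function that mimics the shape of the result (a dict with "input", "context" and "answer" keys). It is a rough model of the behavior, not the library's implementation.

```python
TEMPLATE = 'Answer the question "{input}" based on the context below:\n\n<context>\n{context}\n</context>\n'


def fake_retriever(question):
    # Stand-in for the vector store retriever: always returns the same two chunks.
    return ["often just called the VAB.", "This building is still in use today."]


def fake_llm(prompt_text):
    # Stand-in for the chat model: reports the prompt length instead of answering.
    return f"(model reply to a {len(prompt_text)}-character prompt)"


def run_retrieval_chain(question, retrieve, generate, template):
    docs = retrieve(question)                # 1. fetch relevant chunks
    context = "\n\n".join(docs)              # 2. "stuff" them into one string
    prompt_text = template.format(input=question, context=context)
    answer = generate(prompt_text)           # 3. ask the model
    return {"input": question, "context": docs, "answer": answer}


result = run_retrieval_chain("What is the VAB?", fake_retriever, fake_llm, TEMPLATE)
print(result["answer"])
```

Keeping the retrieved chunks under "context" in the result makes it possible to check which passages an answer was based on.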

Querying and Inspecting the Chain

In [82]:

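# Note: `query` is not assigned anywhere above this line. This first call
# presumably reuses the value left over from an earlier run of the loop below;
# in a fresh kernel, assign `query` before running it.
result = chain.invoke({"input": query})
print("\n\n",result["answer"])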

while True:
    query = input("What is your question:\n")
    
    result = chain.invoke({"input": query})
    print("\n\n Answer: ",result["answer"])

    # Ask if user wants to continue
    cont = input("\n\nDo you have any other question? (yes/no): ").strip().lower()
    if cont != "yes":
        print("Bye bye! 👋")
        break



 NASA builds their rockets at various locations across the United States, with the final assembly taking place at the Vehicle Assembly Building (VAB) at Kennedy Space Center in Florida.


What is your question:
 where do NASA build their rockets




 Answer:  NASA builds their rockets at various locations across the United States, and then they are transported to the Kennedy Space Center in Florida for final assembly and launch.




Do you have any other question? (yes/no):  yes
What is your question:
 how many branches are there in Kennedy Space Center




 Answer:  There are four branches in Kennedy Space Center, numbered 1, 2, 3, and 4, which are the High Bays where the rockets are assembled.




Do you have any other question? (yes/no):  no


Bye bye! 👋
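
The loop above sends empty questions to the chain and treats anything other than an exact "yes" as a request to stop. A minimal sketch of a slightly more forgiving loop follows. The names wants_to_continue and chat_loop, and the injected read and write parameters (added so the loop can be exercised without a keyboard), are assumptions made for this example.

```python
def wants_to_continue(reply):
    """Accept "yes" or "y" in any letter case, ignoring surrounding spaces."""
    return reply.strip().lower() in {"yes", "y"}


def chat_loop(answer_fn, read=input, write=print):
    """Ask questions until the user declines to continue.

    answer_fn takes the question string and returns the answer string.
    """
    while True:
        query = read("What is your question:\n").strip()
        if not query:
            write("Please type a question.")
            continue  # do not call the chain with an empty question
        write("\n\n Answer: ", answer_fn(query))
        if not wants_to_continue(read("\n\nDo you have any other question? (yes/no): ")):
            write("Bye bye! 👋")
            break


# With the notebook's chain, the loop could be started like this:
# chat_loop(lambda q: chain.invoke({"input": q})["answer"])
```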
