# Veritasium - RetrieverQA

## Downloading & Importing Libraries

In [1]:
!pip install langchain langchain_community langchain_openai langchain-pinecone pinecone-client

Collecting langchain
  Downloading langchain-0.2.7-py3-none-any.whl (983 kB)
Collecting langchain_community
  Downloading langchain_community-0.2.7-py3-none-any.whl (2.2 MB)
Collecting langchain_openai
  Downloading langchain_openai-0.1.16-py3-none-any.whl (46 kB)
Collecting langchain-pinecone
  Downloading langchain_pinecone-0.1.1-py3-none-any.whl (8.4 kB)
Collecting pinecone-client
  Downloading pinecone_client-4.1.2-py3-none-any.whl (216 kB)
Collecting langchain-core<0.3.0,>=0.2.12 (from l

In [2]:
import os
from google.colab import files
from google.colab import userdata
from google.colab import runtime

from pinecone import Pinecone, ServerlessSpec
from langchain_openai.chat_models import ChatOpenAI
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain.schema import format_document
from langchain.schema.runnable import RunnableMap, RunnableSequence, RunnablePassthrough
from langchain.chains import RetrievalQA, LLMChain
from langchain.vectorstores import Pinecone as LCPinecone
from langchain.schema import SystemMessage, HumanMessage, AIMessage

import logging

In [3]:
OPENAI_API_KEY = userdata.get('Ironhack-GPT')
PC_API_KEY = userdata.get('PineCone')
HF_TOKEN = userdata.get('HF')


os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
os.environ['HF_TOKEN'] = HF_TOKEN

## Initialize and Retrieve Embeddings


In [5]:
# Initialize Pinecone
pc = Pinecone(api_key=PC_API_KEY)

# Initialize the Pinecone index
index_name = "veritasium-vs-final"
pinecone_index = pc.Index(index_name)

# Initialize embeddings
embeddings_model = OpenAIEmbeddings(api_key=OPENAI_API_KEY, model='text-embedding-ada-002')

# Initialize LangChain Pinecone vector store with the transcription field as text_key
vector_store = LCPinecone(
    index=pinecone_index,
    embedding=embeddings_model,
    text_key="transcription"
)
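
To make the role of `text_key` concrete, here is a minimal sketch in plain Python of how a vector store can turn one index match into a document: the metadata field named by `text_key` becomes the document body and the remaining fields stay as metadata. This is an illustration under assumptions, not LangChain's actual code; `match_to_document` and the sample `match` are hypothetical.

```python
def match_to_document(match: dict, text_key: str) -> dict:
    """Split one Pinecone-style match into page text and remaining metadata."""
    metadata = dict(match["metadata"])  # copy so the input is not mutated
    page_content = metadata.pop(text_key)  # the text_key field becomes the body
    return {"page_content": page_content, "metadata": metadata}


# Hypothetical match with placeholder values (not real index contents)
match = {
    "id": "example_chunk_0",
    "metadata": {
        "title": "example title",
        "summary": "short summary",
        "transcription": "full transcript chunk",
    },
}

doc = match_to_document(match, text_key="transcription")
print(doc["page_content"])
print(sorted(doc["metadata"]))
```

This matches what the outputs in this notebook show: the transcript text appears as `page_content`, while `summary`, `title`, and `url` remain in `metadata`.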



In [6]:
# Initialize the Chat LLM
llm = ChatOpenAI(api_key=OPENAI_API_KEY, model="gpt-3.5-turbo")

# Define the prompt template
LLM_CONTEXT_PROMPT = ChatPromptTemplate.from_template(
    """You are an assistant for question-answering tasks. Use the following pieces of retrieved context from Veritasium videos to answer the question. If you don't know the answer, just say that you don't know. Be as verbose and educational in your response as possible.

    Context: {context}
    Question: "{question}"
    Answer:
    """
)

# Create the LLM chain with the prompt template
# (this chain is not passed to RetrievalQA below, which uses its default "stuff" prompt)
llm_chain = LLMChain(prompt=LLM_CONTEXT_PROMPT, llm=llm)

# Set up the retrieval-based QA chain using RetrievalQA.from_chain_type
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vector_store.as_retriever(),
    return_source_documents=True
)

# Example query
query = "Which country has the lowest vaccination rate in the world?"

# Construct the input for the QA chain
qa_input = {
    "query": query
}

# Get the answer
try:
    answer = qa_chain.invoke(qa_input)
    print("---- Answer ----")
    print(answer)
except Exception as e:
    print("Error occurred:", str(e))




---- Answer ----
{'query': 'Which country has the lowest vaccination rate in the world?', 'result': "I'm sorry, but there is no mention of the country with the lowest vaccination rate in the provided pieces of context.", 'source_documents': [Document(metadata={'category': 'Biology', 'chunk_id': '7ziWrneMYss_3', 'description': 'this video is sponsored by brilliant the first 200 people to sign up via https brilliant org veritasium get 20 off a yearly', 'published_at': '2022-03-22T11:55:53Z', 'summary': "In 1870, a British military doctor, Edward Nicholson, was stationed in Burma. He noticed that the older snake handlers were less affected by accidental bites than the younger ones. 20 years later in Saigon, a French medical researcher named Albert Calmet was vaccinating local residents against smallpox. He wondered if it was possible to make a vaccine for snake bites. Back in Paris, he tried injecting rabbits with a tiny amount of cobra venom, starting with just.03 milligrams. After 8 mon
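
The `chain_type="stuff"` setting used above can be pictured with a small plain-Python sketch: all retrieved chunks are joined into one context string and placed into a single prompt. The function name `stuff_prompt`, the shortened template, and the sample chunks are hypothetical, and the real chain may differ in its separator and prompt wording.

```python
def stuff_prompt(question: str, chunks: list[str], template: str) -> str:
    """Join every retrieved chunk into one context block and fill the template."""
    context = "\n\n".join(chunks)  # assumed separator between chunks
    return template.format(context=context, question=question)


# Shortened, hypothetical template with the same two variables as the notebook prompt
TEMPLATE = "Context: {context}\nQuestion: {question}\nAnswer:"

prompt = stuff_prompt(
    "What is special about the number 37?",
    ["chunk one", "chunk two"],
    TEMPLATE,
)
print(prompt)
```

Because every chunk goes into one prompt, the number and size of retrieved chunks directly determine how much of the model's context window a single question consumes.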

In [7]:
# Example query
query = "what us special about the number 37?"

# Construct the input for the QA chain
qa_input = {
    "query": query
}

# Get the answer
try:
    answer = qa_chain.invoke(qa_input)
    print("---- Answer ----")
    print(answer)
except Exception as e:
    print("Error occurred:", str(e))

---- Answer ----
{'query': 'what us special about the number 37?', 'result': 'The number 37 is considered special for a variety of reasons. It has interesting coincidences and patterns associated with it, such as being a prime number, appearing frequently in different aspects of life, and being commonly perceived as a random and distinct number by many people. Additionally, 37 has unique properties like being a lucky prime, a sexy prime, and a permutable prime. Its presence in various contexts and its intriguing mathematical characteristics contribute to its special status for many individuals.', 'source_documents': [Document(metadata={'category': 'Mathematics', 'chunk_id': 'd6iQrh2TK98_5', 'description': 'the number 37 is on your mind more than you think head to https brilliant org veritasium to start your free 30 day trial and get', 'published_at': '2024-03-28T18:48:10Z', 'summary': "The number 37 is one of our most prominent prime numbers, and most of all, our ideal number for makin

In [8]:
# Example query
query = "what do you know about snake bites?"

# Construct the input for the QA chain
qa_input = {
    "query": query
}

# Get the answer
try:
    answer = qa_chain.invoke(qa_input)
    print("---- Answer ----")
    print(answer)
except Exception as e:
    print("Error occurred:", str(e))

---- Answer ----
{'query': 'what do you know about snake bites?', 'result': "Snakebites can be dangerous and potentially deadly due to the venom injected by some species of snakes. The venom can have various effects on the human body, such as neurotoxicity (affecting the nervous system), hemotoxicity (affecting blood cells), cytotoxicity (attacking cells), and myotoxicity (destroying muscles). Different snakes have evolved different types of venom based on their prey and environment. In the case of a snakebite, it's important to remain calm, immobilize the affected limb, and seek medical help as soon as possible. Antivenom, produced from injecting large animals like horses with diluted venom, is used to counteract the effects of snake venom. If left untreated, snakebites can lead to severe complications and even death.", 'source_documents': [Document(metadata={'category': 'Biology', 'chunk_id': '7ziWrneMYss_2', 'description': 'this video is sponsored by brilliant the first 200 people t

In [9]:
# Example query
query = "who's the president of Spain??"

# Construct the input for the QA chain
qa_input = {
    "query": query
}

# Get the answer
try:
    answer = qa_chain.invoke(qa_input)
    print("---- Answer ----")
    print(answer)
except Exception as e:
    print("Error occurred:", str(e))

---- Answer ----
{'query': "who's the president of Spain??", 'result': "I don't know the current president of Spain.", 'source_documents': [Document(metadata={'category': 'Space', 'chunk_id': '6YOz9Pxnzho_1', 'description': 'what it s like to see the earth from orbit special thanks to col chris hadfield for chatting with me http chrishadfield ca space', 'published_at': '2015-02-09T16:33:14Z', 'summary': '"I\'m confident this isn\'t the end of the world. This is just a problem that we\'re facing that is going to change things, but we\'re going to have to figure out a way to deal with it," he says. "It\'s us or me or I that has to make the change"', 'title': 'an astronaut s view of earth', 'url': 'https://www.youtube.com/watch?v=6YOz9Pxnzho', 'video_id': '6YOz9Pxnzho'}, page_content='individually That s who has to make the change You can t say they or him or her or it It s us or me or I that has to make the change and it s not going to be perfect and it s going to have to get a little bi

In [10]:
# Example query
query = "give me 5 topics you know about physics??"

# Construct the input for the QA chain
qa_input = {
    "query": query
}

# Get the answer
try:
    answer = qa_chain.invoke(qa_input)
    print("---- Answer ----")
    print(answer)
except Exception as e:
    print("Error occurred:", str(e))

---- Answer ----
{'query': 'give me 5 topics you know about physics??', 'result': '1. Center of mass and balance: The phenomenon of balancing objects and finding the center of mass.\n2. Phone flip physics: Exploring why flipping a phone end over end causes rotation in different directions.\n3. Electric charges and water: Understanding the attraction of charged objects to water.\n4. Magnetic cereal: Exploring the magnetic properties of certain types of cereal.\n5. Tea bag rocket: Investigating the science behind creating a rocket from a tea bag.', 'source_documents': [Document(metadata={'category': 'Physics', 'chunk_id': '1Xp_imnO6WE_0', 'description': 'five cool physics tricks but how do they work explanations http youtu be jimihpdmbpy check out audible com', 'published_at': '2014-08-06T06:46:32Z', 'summary': "Five fun physics phenomena. Have you ever tried to spin your phone? If you do it in this direction, it's pretty easy. But if you try to flip your phone end over end like this, yo

In [11]:
# Example query
query = "who is Derek Muller?"

# Construct the input for the QA chain
qa_input = {
    "query": query
}

# Get the answer
try:
    answer = qa_chain.invoke(qa_input)
    print("---- Answer ----")
    print(answer)
except Exception as e:
    print("Error occurred:", str(e))

---- Answer ----
{'query': 'who is Derek Muller?', 'result': 'Derek Muller is a scientist, educator, and filmmaker who explores and explains scientific phenomena through his video series, such as "Veritasium."', 'source_documents': [Document(metadata={'category': 'Physics', 'chunk_id': 'liqF6EamiE4_1', 'description': 'when sunlight shines through a small hole it casts a circular image on the wall regardless of the shape of the hole the size of the', 'published_at': '2011-06-13T22:30:47Z', 'summary': "re seeing is... A projection of the sun. I've never thought about it before. Well, I did photography at school and we did pinhole cameras. The hole actually reflects what it's showing on the wall. So? So you see what you see on the other side.", 'title': 'can you solve this shadow illusion', 'url': 'https://www.youtube.com/watch?v=liqF6EamiE4', 'video_id': 'liqF6EamiE4'}, page_content='re seeing is A projection of the sun I ve never thought about it before I think that you actually know an

In [12]:
# Example query
query = "can you fetch me some youtube video urls about physics??"

# Construct the input for the QA chain
qa_input = {
    "query": query
}

# Get the answer
try:
    answer = qa_chain.invoke(qa_input)
    print("---- Answer ----")
    print(answer)
except Exception as e:
    print("Error occurred:", str(e))

---- Answer ----
{'query': 'can you fetch me some youtube video urls about physics??', 'result': "I don't know.", 'source_documents': [Document(metadata={'category': 'Physics', 'chunk_id': '5THOUSvpCKk_0', 'description': 'veritasium is a channel of science and engineering videos featuring experiments expert interviews cool demos and discussions', 'published_at': '2013-02-11T06:09:19Z', 'summary': "Sometimes the simplest questions have the most amazing answers. Where does the Sun get that energy from? Where do they get the matter to make the tree? What is a candle flame really made of? Whoa! How does it do that? Go the laws of physics! I can't see the X. I guess the question is why not?", 'title': 'veritasium trailer', 'url': 'https://www.youtube.com/watch?v=5THOUSvpCKk', 'video_id': '5THOUSvpCKk'}, page_content='Sometimes the simplest questions have the most amazing answers Like is there a speed limit in the universe Where does the Sun get that energy from Where do they get the matter 
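
The model answers "I don't know." here even though each retrieved document carries a `url` in its metadata, as the output shows. One possible workaround, sketched below, is to read the links from `source_documents` directly instead of asking the model for them. `Doc` is a hypothetical stand-in for a LangChain `Document`, and `unique_video_links` is an illustrative helper, not part of any library.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def unique_video_links(source_documents):
    """Return (title, url) pairs in retrieval order, one entry per video."""
    seen = set()
    links = []
    for doc in source_documents:
        url = doc.metadata.get("url")
        if url and url not in seen:  # skip chunks without a URL and repeated videos
            seen.add(url)
            links.append((doc.metadata.get("title", ""), url))
    return links


# Sample data: title and URL copied from the output above, chunk text is a placeholder
docs = [
    Doc("chunk a", {"title": "veritasium trailer",
                    "url": "https://www.youtube.com/watch?v=5THOUSvpCKk"}),
    Doc("chunk b", {"title": "veritasium trailer",
                    "url": "https://www.youtube.com/watch?v=5THOUSvpCKk"}),
    Doc("chunk c", {"title": "no url here"}),
]

links = unique_video_links(docs)
for title, url in links:
    print(f"{title}: {url}")
```

With the real chain, the same helper could be applied to `answer["source_documents"]`, assuming those documents expose `metadata` the way the printed output suggests.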

In [13]:
# # Test response
# response = pinecone_index.fetch(ids=["vVKFBaaL4uM_1"])
# print(response)

## Chat test

In [14]:
chat_model = ChatOpenAI(api_key=OPENAI_API_KEY, model_name="gpt-3.5-turbo")

In [15]:
# Define the function to ask GPT with retriever
def ask_gpt_with_retriever(query, context=""):
    # Use the qa_chain to get the response and source documents
    result = qa_chain.invoke({"query": query})
    response = result["result"]
    source_documents = result["source_documents"]

    # Log retrieved documents for verification
    retrieved_texts = "\n\n".join(doc.page_content for doc in source_documents)
    print("Retrieved Documents:\n", retrieved_texts)

    # Combine retrieved texts with the existing context
    combined_context = context + "\n\nRetrieved documents:\n" + retrieved_texts

    messages = [
        SystemMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved info from Veritasium videos to answer the question. If the info doesn't help, just say that you don't know and be concise in your response. Otherwise, if the retrieved info is helpful, be as verbose and educational in your response as possible."),
        HumanMessage(content="Here is some info retrieved from Veritasium videos:\n" + combined_context),
        HumanMessage(content="Based on this info, please answer the following question:"),
        HumanMessage(content=query)
    ]

    # The messages are already fully formed, so call the chat model directly
    gpt_response = chat_model.invoke(messages).content
    return gpt_response

# Define the function to simulate the conversation
def simulate_conversation(queries):
    context = ""
    for i, query in enumerate(queries):
        # Process the query using GPT-3.5 Turbo with retriever
        response = ask_gpt_with_retriever(query, context)

        # Update context with the current query and response
        context += f"\nUser Query {i+1}: {query}\nBot Response {i+1}: {response}\n"

        # Print the conversation
        print(f"User Query {i+1}: {query}")
        print(f"Bot Response {i+1}: {response}")
        print("-" * 50)

# Define a set of conversational queries for testing
test_queries = [
    "how are you?",
    "tell me about the number 37?",
    "where do you get this info from?",
    "What are some other fun math facts?",
    "Can you fetch me some YouTube video URLs about physics?",
    "Tell me about the speed limit in the universe.",
    "How does quantum entanglement work?",
    "Can you summarize the video about imaginary numbers?",
    "What are some fun physics phenomena?",
    "Who is the president of Spain?",
]

# Run the simulated conversation
simulate_conversation(test_queries)



Retrieved Documents:
 ces Not that I can think of Okay so what forces are acting on you right now

A human being is a part of the whole called by us universe A part limited in time and space He experiences himself his thoughts and feelings as something separated from the rest a kind of optical delusion of his consciousness This delusion is a kind of prison for us restricting us to our personal desires and to affection for a few things persons nearest to us Our task must be to free ourselves from this prison by widening our circle of compassion to embrace all living creatures and the whole of nature in it Nobody is able to achieve this completely but the striving for such achievement is in itself a part of the liberation and a foundation for inner security

ltimate question of life the universe and everything from the hitchhiker s guide to the galaxy How are you doing Well what s your favorite ice cream flavor Pistachio or rocky road Do you truly love turbulent flow or do you just fake 



User Query 1: how are you?
Bot Response 1: I don't have specific information on how the person is feeling at the moment based on the retrieved text.
--------------------------------------------------
Retrieved Documents:
 a lot I started back in the 80s There was a comedy routine by Charles Fleischer and he went through this sort of litany of coincidences about the number 37 like there are 37 holes in the speaker part of a telephone Shakespeare wrote 37 plays there s 37 movements in Beethoven s nine symphonies There are all these amazing coincidences that he rattled off I was amazed I ve been collecting them ever since Since like 1981 Yeah so 43 years probably I built the 37 website for the first time in 1994 I don t know how the website got out But somehow it got out there I started getting email from strangers I ve got maybe a half a dozen people from around the world who every week or month will post their latest batch of 37s that they ve seen out and about And they ve been doing th
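
In `simulate_conversation`, the `context` string grows with every turn, so later prompts carry the whole conversation plus newly retrieved text. A minimal sketch of one way to bound this is to keep only the most recent turns. The `ConversationWindow` class and its `max_turns` parameter are hypothetical and are not part of the notebook or of LangChain.

```python
from collections import deque


class ConversationWindow:
    """Keep only the most recent (query, response) turns."""

    def __init__(self, max_turns: int = 3):
        # A deque with maxlen silently drops the oldest turn when full
        self._turns = deque(maxlen=max_turns)

    def add(self, query: str, response: str) -> None:
        self._turns.append((query, response))

    def render(self) -> str:
        """Format the retained turns as a context string."""
        return "\n".join(f"User: {q}\nBot: {r}" for q, r in self._turns)


window = ConversationWindow(max_turns=2)
for i in range(1, 4):
    window.add(f"question {i}", f"answer {i}")

context = window.render()
print(context)
```

The trade-off is that the model loses access to turns older than the window, so follow-up questions that refer far back in the conversation may no longer resolve.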

In [16]:
test_queries = [
    "can you share a video url explaining how bikes work?",
    "who is Derek Muller??",
    "How many videos do you have?"
]

# Run the simulated conversation
simulate_conversation(test_queries)

Retrieved Documents:
 Most people don t know how bicycles actually work So we modified this bike to prove it This video is sponsored by KiwiCo More about them at the end of the show controller that allows him to lock out the steering to one side So what he s going to do is as I m biking he s going to pick whether I can turn either to the left or to the right So go for it I m giving it a left turn it pulls the pin out But you can see that you can still fully steer after I ve pulled the pin out I ve armed it There s where it locks Okay Now that s when your LED comes on That just says turn that way Turn left Yeah And if I try to turn right Can t I can t And if I try to turn left you can I can So the question is can I successfully execute this left hand turn Should we give it a shot I mean he s not going to tell me whether it s left or right so I have to look at the LED to know which way I can still turn You let me know when you re ready Okay No That was meant to be a turn to the right but