# Practice Lab: Vector Stores and Retrieval Augmented Generation (RAG)

In this practice lab, you will work with vector stores and implement a Retrieval Augmented Generation (RAG) system. Complete the TODOs and follow the hints to build your understanding of these concepts.

## Installation

First, let's install the necessary packages. Execute the cell below to install all required libraries.

In [None]:
# TODO: Install the required packages for working with LangChain, vector stores, and RAG
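# HINT: You need packages for LangChain, HuggingFace, OpenAI, Chroma, FAISS, PDF processing, and Gradio,
# plus python-dotenv (for the .env file) and beautifulsoup4 (used by the web loader in Part 5)

# %pip install langchain langchain-core langchain-community langchain-openai langchain-huggingface langchain-chroma langchain-text-splitters sentence-transformers faiss-cpu pypdf gradio langsmith python-dotenv beautifulsoup4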

## Environment Setup

The OpenAI-based cells in this lab read your API key from an environment variable. To set it up:

1. Create a `.env` file in the same folder as this notebook.
2. Add your OpenAI API key to the `.env` file in the format `OPENAI_API_KEY=your_api_key_here`.
3. Run the code below to load the environment variables.
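To make the `.env` format concrete, here is a minimal sketch of how `KEY=value` lines can be turned into a dictionary. It is only an illustration: `parse_env_lines` is a hypothetical helper, not part of the dotenv library, and the real library handles more cases than this.

```python
from typing import Dict, Iterable


def parse_env_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blanks and # comments (illustrative only)."""
    values: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Sample content in the format described above (placeholder key, not a real one).
sample = ["# local secrets", "OPENAI_API_KEY=your_api_key_here", ""]
parsed = parse_env_lines(sample)
print(sorted(parsed))
```

Printing only the key names, as above, is a reasonable habit so that the secret itself never ends up in notebook output.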

In [None]:
# TODO: Write code to load environment variables from the .env file
# HINT: Use the dotenv library to load the .env file

# from dotenv import load_dotenv
# load_dotenv()

## Part 1: Working with Documents and Vector Stores

### Creating Documents Manually

In LangChain, a **Document** represents a unit of text and associated metadata. It has two main attributes:
- **page_content**: a string representing the content
- **metadata**: a dictionary containing arbitrary metadata
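As a plain-Python illustration of that two-attribute shape, the sketch below uses a hypothetical `SimpleDocument` dataclass as a stand-in for LangChain's `Document`. It is not the library class, only a way to see how content and metadata travel together.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SimpleDocument:
    """Hypothetical stand-in with the same two attributes as a LangChain Document."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


sample_docs = [
    SimpleDocument("Toys come alive and have a blast doing so", {"year": 1995, "genre": "animated"}),
    SimpleDocument("A bunch of scientists bring back dinosaurs", {"year": 1993, "rating": 7.7}),
]

# Metadata is an ordinary dict, so keys may be missing; use .get() when filtering.
rated = [d for d in sample_docs if d.metadata.get("rating") is not None]
print([d.metadata["year"] for d in rated])
```

Because metadata keys are arbitrary, documents in the same list do not need to share the same fields.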

Let's create some documents manually:

In [None]:
# TODO: Import the Document class from langchain_core.documents
# Then create a list of documents about movies with appropriate metadata
# HINT: Each document should have page_content with movie descriptions and metadata with details like year, director, rating, etc.

# from langchain_core.documents import Document

# documents = [
#     Document(
#         page_content="A bunch of scientists bring back dinosaurs and mayhem breaks loose",
#         metadata={"year": 1993, "rating": 7.7, "genre": "science fiction"},
#     ),
#     Document(
#         page_content="Leo DiCaprio gets lost in a dream within a dream within a dream within a ...",
#         metadata={"year": 2010, "director": "Christopher Nolan", "rating": 8.2},
#     ),
#     Document(
#         page_content="A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea",
#         metadata={"year": 2006, "director": "Satoshi Kon", "rating": 8.6},
#     ),
#     Document(
#         page_content="A bunch of normal-sized women are supremely wholesome and some men pine after them",
#         metadata={"year": 2019, "director": "Greta Gerwig", "rating": 8.3},
#     ),
#     Document(
#         page_content="Toys come alive and have a blast doing so",
#         metadata={"year": 1995, "genre": "animated"},
#     ),
#     Document(
#         page_content="Three men walk into the Zone, three men walk out of the Zone",
#         metadata={
#             "year": 1979,
#             "director": "Andrei Tarkovsky",
#             "genre": "thriller",
#             "rating": 9.9,
#         },
#     ),
# ]

### Creating a Vector Store (Chroma) and Persisting Documents

**Chroma** is an AI-native open-source vector database. It can run in different modes:
- **in-memory**: temporary storage that disappears when the session ends
- **in-memory with persistence**: store/load to disk from a script or notebook
- **in a docker container**: as a server running on your local machine or in the cloud
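Whatever the mode, a vector store does the same basic job: it keeps an embedding next to each text and ranks texts by similarity to a query embedding. The sketch below shows that idea with a toy in-memory store. `ToyVectorStore` and the bag-of-words `embed` function are assumptions made for illustration; Chroma uses dense vectors from a real embedding model and its own indexing.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (an assumption; real models return dense float vectors)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Keeps (embedding, text) pairs in a list and ranks them on every query."""

    def __init__(self) -> None:
        self._items: List[Tuple[Counter, str]] = []

    def add_texts(self, texts: List[str]) -> None:
        for text in texts:
            self._items.append((embed(text), text))

    def similarity_search(self, query: str, k: int = 1) -> List[str]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


store = ToyVectorStore()
store.add_texts(["toys come alive and have a blast", "scientists bring back dinosaurs"])
top = store.similarity_search("dinosaurs and scientists", k=1)
print(top)
```

The in-memory list here disappears when the session ends, which is the behavior the first mode describes; persistence adds writing that state to disk.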

Let's create a vector store using Chroma and store our documents:

In [None]:
# TODO: Import the necessary libraries and create a Chroma vector store
# HINT: You need to import Chroma and either OpenAIEmbeddings or HuggingFaceEmbeddings
# Then create a vector store from the documents and persist it to a directory

# from langchain_chroma import Chroma
# from langchain_openai import OpenAIEmbeddings
# from langchain_huggingface.embeddings import HuggingFaceEmbeddings

# Choose one of the embedding options below:
# embeddings = OpenAIEmbeddings()  # If you have an OpenAI API key
# embeddings = HuggingFaceEmbeddings()  # Free alternative using HuggingFace

# vectorstore = Chroma.from_documents(
#     documents, embedding=embeddings, persist_directory="chromadb_practice"
# )

### Loading a Vector Store from a Persist Directory

Now let's see how to load a vector store from a persist directory:

In [None]:
# TODO: Load the Chroma vector store from the persist directory you created above
# HINT: Use the Chroma constructor with the persist_directory and embedding_function parameters

# vectorstore = Chroma(
#     persist_directory="chromadb_practice", embedding_function=embeddings
# )

### Using Vector Store as a Retriever

In LangChain:
- VectorStore objects do not subclass Runnable
- Retrievers are Runnables that can be used in chains with LCEL (LangChain Expression Language)

Vector stores implement an **as_retriever** method that generates a Retriever (specifically a VectorStoreRetriever).
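Conceptually, a retriever fixes the search settings once and then exposes a single "query in, documents out" call. The sketch below illustrates that with a hypothetical `make_retriever` helper and a toy word-overlap score; it is not how `as_retriever` is implemented, only a picture of what settings such as `k` and a score threshold do.

```python
from typing import Callable, List, Optional


def jaccard(a: str, b: str) -> float:
    """Toy similarity: shared words divided by all words (an assumption for illustration)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def make_retriever(
    texts: List[str], k: int = 4, score_threshold: Optional[float] = None
) -> Callable[[str], List[str]]:
    """Return a callable with the search settings baked in (hypothetical helper)."""

    def retrieve(query: str) -> List[str]:
        scored = sorted(((jaccard(query, t), t) for t in texts), reverse=True)
        if score_threshold is not None:
            # Drop weak matches instead of always returning k results.
            scored = [(s, t) for s, t in scored if s >= score_threshold]
        return [t for _, t in scored[:k]]

    return retrieve


texts = ["dreams within dreams", "toys come alive", "three men walk into the zone"]
top_one = make_retriever(texts, k=1)
strict = make_retriever(texts, k=3, score_threshold=0.2)
print(top_one("lost in dreams"), strict("lost in dreams"))
```

With a threshold, the number of results depends on the query: a query that matches nothing well returns an empty list rather than the least-bad `k` documents.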

Let's create a retriever from our vector store:

In [None]:
# TODO: Create a retriever from the vector store and test it with a query
# HINT: Use the as_retriever method with appropriate search_type and search_kwargs parameters
# Then invoke the retriever with a test query

# retriever = vectorstore.as_retriever(
#     search_type="similarity",  # Other options: mmr, similarity_score_threshold
#     search_kwargs={"k": 1},  # Return 1 most similar document
# )

# # Test the retriever with a query
# retriever.invoke("tell me about the movie directed by Satoshi Kon")

## Part 2: Loading and Indexing PDF Documents with FAISS

Now let's see how to load a PDF document, split it into chunks, and create a vector store using FAISS.

### Loading the PDF

First, make sure a PDF file named `travel-policy.pdf` is in the same directory as this notebook, or update the file path in the loader to point to your own PDF.

In [None]:
# TODO: Import PyPDFLoader and load documents from a PDF file
# HINT: Use PyPDFLoader from langchain_community.document_loaders.pdf

# from langchain_community.document_loaders.pdf import PyPDFLoader
# from langchain_text_splitters import CharacterTextSplitter

# # Update the file path to your PDF file
# loader = PyPDFLoader(file_path="travel-policy.pdf")
# documents = loader.load()

In [None]:
# TODO: Print one of the loaded documents to see its content
# HINT: Try accessing documents[1] to see the second document

# documents[1]

### Splitting Documents into Chunks

After loading documents, we need to split them into smaller chunks for better retrieval:

In [None]:
# TODO: Create a text splitter and split the loaded documents into chunks
# HINT: Use CharacterTextSplitter with appropriate chunk size and overlap

# text_splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=50, separator="\n")
# docs = text_splitter.split_documents(documents=documents)
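To build intuition for what a chunk size and a chunk overlap control, here is a minimal sliding-window sketch. It is a simplification: `split_with_overlap` is a hypothetical helper that cuts at fixed character offsets, whereas `CharacterTextSplitter` first splits on the separator and then merges pieces up to the size limit.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-size windows where each window repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.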

### Creating a FAISS Vector Store

Now let's create a FAISS vector store from the document chunks:

In [None]:
# TODO: Import FAISS and create a vector store from the document chunks
# HINT: Use FAISS.from_documents and save it locally

# from langchain_community.vectorstores import FAISS
# from langchain_openai.embeddings import OpenAIEmbeddings
# from langchain_huggingface.embeddings import HuggingFaceEmbeddings

# Choose one embedding option:
# embeddings = OpenAIEmbeddings()  # If you have an OpenAI API key
# embeddings = HuggingFaceEmbeddings()  # Free alternative using HuggingFace

# vectorstore = FAISS.from_documents(docs, embeddings)
# vectorstore.save_local("faiss_store_practice")

## Part 3: Creating a Retrieval Chain using Vector Store

Now let's create a retrieval chain using our vector store as a retriever. This is the core of a RAG (Retrieval Augmented Generation) system.
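The flow behind such a chain can be sketched in plain Python: retrieve the most relevant passages, paste ("stuff") them into the prompt as context, and send the prompt to a model. Everything in the sketch is an assumption made for illustration: the word-overlap ranking, the `fake_llm` placeholder, and the three made-up policy sentences. Only the result keys `input`, `context`, and `answer` mirror what this lab reads from the chain's response.

```python
from typing import Dict, List


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Rank passages by the number of words shared with the query (toy scoring)."""
    q = set(query.lower().split())
    ranked = sorted(corpus, key=lambda p: len(q & set(p.lower().split())), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: List[str]) -> str:
    """'Stuff' every retrieved passage into a single prompt."""
    context = "\n\n".join(passages)
    return (
        "Answer this question using the provided context only.\n"
        f"{question}\n"
        "Context:\n"
        f"{context}"
    )


def fake_llm(prompt: str) -> str:
    """Placeholder for a chat model call; it only reports the prompt size."""
    return f"(a model would answer from a {len(prompt)}-character prompt)"


def rag_answer(question: str, corpus: List[str]) -> Dict[str, object]:
    passages = retrieve(question, corpus)
    answer = fake_llm(build_prompt(question, passages))
    return {"input": question, "context": passages, "answer": answer}


# Made-up sample passages, not taken from any real policy document.
corpus = [
    "Meals are reimbursed up to a daily limit.",
    "Flights must be booked in economy class.",
    "Hotel stays are reimbursed with receipts.",
]
result = rag_answer("which meals are reimbursed", corpus)
print(result["context"])
```

Returning the retrieved passages alongside the answer makes it possible to check which context the model was actually given.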

In [None]:
# TODO: Import necessary modules for creating a retrieval chain
# HINT: You need to import ChatPromptTemplate, RunnablePassthrough, ChatOpenAI, and functions to create chains

# from langchain_core.prompts import ChatPromptTemplate
# from langchain_core.runnables import RunnablePassthrough
# from langchain_openai import ChatOpenAI
# from langchain.chains.combine_documents.stuff import create_stuff_documents_chain
# from langchain.chains.retrieval import create_retrieval_chain

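# # Load the saved FAISS vector store
# # (allow_dangerous_deserialization=True is required because part of the saved index is pickled;
# # only enable it for index files you created yourself)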
# vectorstore = FAISS.load_local(
#     "faiss_store_practice", embeddings=embeddings, allow_dangerous_deserialization=True
# )

# # Create a retriever
# retriever = vectorstore.as_retriever(
#     search_type="similarity",  # Other options: mmr, similarity_score_threshold
#     search_kwargs={"k": 3},  # Return 3 most similar documents
# )

# # Create a prompt template for the RAG system
# message = """
#         Answer this question using the provided context only.
#         If the information is not available in the context, just reply with "I don't know"
#         {input}
#         Context:
#         {context}
#         """

# prompt = ChatPromptTemplate.from_messages([("human", message)])

# # Create a language model
# llm = ChatOpenAI()

# # Create a question-answering chain that combines retrieved documents
# question_answer_chain = create_stuff_documents_chain(llm, prompt)

# # Create the final RAG chain by combining the retriever and the question-answering chain
# rag_chain = create_retrieval_chain(retriever, question_answer_chain)
# print(rag_chain)

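# # Alternative way to create a RAG chain using LCEL.
# # The dict keys must match the prompt variables ({context} and {input}), and this
# # chain is invoked with a plain string, e.g. rag_chain.invoke("your question"):
# # rag_chain = (
# #    {"context": retriever, "input": RunnablePassthrough()}
# #    | prompt
# #    | llm
# # )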

### Invoking the Retrieval Chain

Now let's test our RAG chain with a query:

In [None]:
# TODO: Invoke the RAG chain with a query and display the results
# HINT: Use the invoke method with a dictionary containing the input key

# response = rag_chain.invoke({"input": "tell me about all the reimbursement policies"})
# print(response)
# print("\nAnswer:")
# print(response['answer'])
# print("\nRetrieved contexts:")
# for doc in response["context"]:
#     print(doc.page_content)

## Part 4: Building a ChatPDF Application

Now let's build an application similar to ChatPDF that allows users to upload a PDF and chat with it. We'll use Gradio to create a simple web interface.

### Creating a PDF Loading Function

First, let's create a function to load a PDF into a vector store:

In [None]:
# TODO: Create a function to load a PDF into a vector store
# HINT: The function should take a file (tempfile), load it, split it into chunks, and store it in a vector store

# import tempfile
# from langchain_chroma import Chroma
# from langchain_openai import OpenAIEmbeddings

# def load_pdf_into_vectorstore(file) -> str:
#     try:
#         print("======Loading file==================")
#         # gr.File may pass a temp-file object or a plain path, depending on the Gradio version
#         file_path = getattr(file, "name", file)
#         loader = PyPDFLoader(file_path=file_path)
#         documents = loader.load()
#         text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=30, separator="\n")
#         docs = text_splitter.split_documents(documents=documents)
#         # Use the same embedding model that getresponse uses to reopen the store
#         embeddings = HuggingFaceEmbeddings()  # or OpenAIEmbeddings() if you have an API key

#         # Choose one option below:
#         # Option 1: FAISS
#         # vectorstore = FAISS.from_documents(docs, embeddings)
#         # vectorstore.save_local("pdf_store")

#         # Option 2: Chroma
#         vectorstore = Chroma.from_documents(
#             docs, embedding=embeddings, persist_directory="chromadb_chatpdf"  # index the chunks, not the whole pages
#         )

#         print("======File Loaded================== ")

#         return 'Document uploaded and index created successfully. You can chat now.'
#     except Exception as e:
#         print(e)
#         return str(e)

### Creating a Response Function

Now let's create a function to handle chat queries:

In [None]:
# TODO: Create a function to get responses from the RAG system for chat queries
# HINT: The function should load the vector store, create a RAG chain, and process the query

# import gradio as gr
# from langchain_community.document_loaders import PyPDFLoader
# from langchain_core.prompts import ChatPromptTemplate
# from langchain_community.chat_message_histories import ChatMessageHistory
# from langchain_core.chat_history import BaseChatMessageHistory
# from langchain_core.runnables.history import RunnableWithMessageHistory
# from langchain_core.messages import HumanMessage, SystemMessage, AIMessage

# model = ChatOpenAI()

# def getresponse(query, history: list) -> tuple:
#     # Load the vector store
#     vectorstore = Chroma(
#         persist_directory="chromadb_chatpdf", embedding_function=embeddings
#     )
#     # Alternative: FAISS
#     # vectorstore = FAISS.load_local("pdf_store", embeddings=embeddings, allow_dangerous_deserialization=True)

#     # Create the prompt template
#     message = """
#     Answer this question using the provided context. If information is not available in the context,
#     just respond saying "I don't know"
#     {input}
#     Context:
#     {context}
#     """

#     prompt = ChatPromptTemplate.from_messages([("human", message)])
#     llm = ChatOpenAI()

#     # Create the RAG chain
#     question_answer_chain = create_stuff_documents_chain(llm, prompt)
#     rag_chain = create_retrieval_chain(vectorstore.as_retriever(), question_answer_chain)

#     # Process the query
#     response = rag_chain.invoke({"input": query})
#     print(response)
#     history.append((query, response['answer']))
#     return "", history

### Building the Gradio Interface

Now let's build a Gradio interface for our ChatPDF application:

In [None]:
# TODO: Create a Gradio interface for the ChatPDF application
# HINT: Use gr.Blocks to create a UI with file upload, chat, and other components

# with gr.Blocks() as demo:
#     with gr.Row():
#         with gr.Column():
#             file = gr.components.File(
#                 label='Upload your PDF file',
#                 file_count='single',
#                 file_types=['.pdf'])
#             upload = gr.components.Button(
#                 value='Upload', variant='primary')
#         label = gr.components.Textbox()
#     
#     chatbot = gr.Chatbot(label='Talk to the Document')
#     msg = gr.Textbox()
#     clear = gr.ClearButton([msg, chatbot])
#     vectorStore = None

#     # Connect components to functions
#     upload.click(load_pdf_into_vectorstore, [file], [label])
#     msg.submit(getresponse, [msg, chatbot], [msg, chatbot])

# if __name__ == '__main__':
#     demo.launch(debug=True)

## Part 5: Retrieving Data from Web URLs

Let's see how to retrieve data from a web URL, create embeddings, and use it for retrievals:

In [None]:
# TODO: Create a web loader to fetch and process content from a URL
# HINT: Use WebBaseLoader and RecursiveCharacterTextSplitter

# from langchain_chroma import Chroma
# from langchain_community.document_loaders import WebBaseLoader
# from langchain_openai import OpenAIEmbeddings
# from langchain_text_splitters import RecursiveCharacterTextSplitter

# # Load blog post (or any other web page)
# loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
# data = loader.load()

# # Split the content into smaller chunks
# text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
# splits = text_splitter.split_documents(data)

# # Create a vector database
# embedding = HuggingFaceEmbeddings()  # or OpenAIEmbeddings()
# vectordb = Chroma.from_documents(documents=splits, embedding=embedding)
# vectordb
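Before a web page can be split and embedded, its HTML has to be reduced to plain text. The sketch below shows that step with the standard library on an inline HTML string, so it needs no network access. `TextExtractor` and `html_to_text` are hypothetical helpers written for illustration and are not how `WebBaseLoader` is implemented.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collect visible text while ignoring the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Made-up sample page used only to exercise the helper.
page = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Agents</h1><p>Planning and <b>memory</b></p>"
    "<script>var x = 1;</script></body></html>"
)
text = html_to_text(page)
print(text)
```

Stripping scripts and styles matters for retrieval quality, since otherwise that markup would be chunked and embedded alongside the article text.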

## Part 6: Using MultiQueryRetriever

The MultiQueryRetriever automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents.
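The "unique union" step can be sketched without any LLM. In the code below, `fake_generate_queries` and `fake_retrieve` are hypothetical stand-ins that return canned, made-up values in place of the LLM-generated queries and the vector store lookups; the point is only how results from several queries are merged without duplicates.

```python
from typing import Callable, Dict, Iterable, List


def unique_union(result_lists: Iterable[List[str]]) -> List[str]:
    """Merge result lists, keeping the first occurrence of each document in order."""
    seen = set()
    merged: List[str] = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def multi_query_retrieve(
    question: str,
    generate_queries: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Retrieve once per generated query and return the de-duplicated union."""
    return unique_union(retrieve(q) for q in generate_queries(question))


# Canned placeholder data: two rephrasings and the documents each would retrieve.
canned_results: Dict[str, List[str]] = {
    "How can a task be decomposed?": ["chain of thought", "tree of thoughts"],
    "What methods split a task into steps?": ["tree of thoughts", "task-specific instructions"],
}


def fake_generate_queries(question: str) -> List[str]:
    return list(canned_results)


def fake_retrieve(query: str) -> List[str]:
    return canned_results[query]


merged = multi_query_retrieve(
    "What are the approaches to Task Decomposition?", fake_generate_queries, fake_retrieve
)
print(merged)
```

The merged list is larger than what either query returns alone, which is the benefit described above, while the shared document appears only once.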

In [None]:
# TODO: Create and use a MultiQueryRetriever
# HINT: Import MultiQueryRetriever and create an instance from a vector store and LLM

# from langchain.retrievers.multi_query import MultiQueryRetriever
# from langchain_openai import ChatOpenAI

# question = "What are the approaches to Task Decomposition?"
# llm = ChatOpenAI(temperature=0)
# retriever_from_llm = MultiQueryRetriever.from_llm(
#     retriever=vectordb.as_retriever(), llm=llm
# )

# # Set up logging to see the multiple queries that are generated
# import logging
# logging.basicConfig()
# logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

# # Use the retriever
# unique_docs = retriever_from_llm.invoke(question)
# print(unique_docs)