## **Learning Objectives**

By completing these exercises, you will:

- Understand Retrieval-Augmented Generation (RAG) and its components.
- Load, preprocess, and handle PDF documents effectively.
- Convert textual data into embeddings for efficient retrieval.
- Implement and test document retrieval systems using LangChain and FAISS.
- Integrate retrieval systems with free large language models (LLMs) served by Groq through LangChain's `ChatGroq`.
- Build an interactive chat-based Q&A system.

---

## **Exercise 1: Setup and Warm-up**

In this exercise, you'll set up your environment and select a suitable language model.

**Steps:**

1. **Load Environment Variables:** Ensure your environment variables (e.g., API keys, tokens) are securely stored and loaded.
2. **Choose LLM:** Select a free LLM available through `ChatGroq`.
3. **Instantiate the Model:** Create an instance of your chosen model.
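
Before instantiating the model, it can help to confirm that the API key was actually loaded. The sketch below is one way to do that with the standard library only. `require_env` and `mask_secret` are hypothetical helpers, not part of LangChain, and `GROQ_API_KEY` is assumed to be the variable name that `ChatGroq` reads by default. The demo uses a made-up key in a plain dictionary so it runs without real credentials.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or raise a clear error."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file before creating the model."
        )
    return value


def mask_secret(value, visible=4):
    """Hide all but the last few characters so the key never lands in notebook output."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Demo with a fake environment (assumption: the real key lives in GROQ_API_KEY).
demo_env = {"GROQ_API_KEY": "gsk_example_1234"}
print(mask_secret(require_env("GROQ_API_KEY", demo_env)))
```

In the notebook itself, calling `require_env("GROQ_API_KEY")` right after `load_dotenv()` would fail early with a readable message instead of an authentication error on the first request.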


In [3]:
# Import necessary libraries
import warnings

from dotenv import load_dotenv
from langchain_groq import ChatGroq

# Load environment variables (API keys) from a local .env file
load_dotenv()

# Silence library warnings to keep the notebook output readable
warnings.filterwarnings("ignore")

# temperature=0 keeps answers as repeatable as possible for Q&A
llm = ChatGroq(
    model="llama-3.1-8b-instant",  # "llama3-8b-8192"
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)


---

## **Exercise 2: Data Ingestion**

In this exercise, you'll learn to load PDF data into a Python environment.

**Steps:**

1. **Import PDF Loader:** Use LangChain’s `PyPDFLoader`.
2. **Load PDF File:** Create a function to read the PDF file.
3. **Display PDF Content:** Print the number of pages and first page content.
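
Because the loader is given a relative path, a wrong working directory is a common cause of failure. The sketch below shows one possible pre-check using only `pathlib`. `resolve_pdf_path` is a hypothetical helper introduced for illustration, and the temporary `sample.pdf` exists only to make the demo runnable.

```python
import tempfile
from pathlib import Path


def resolve_pdf_path(pdf_path):
    """Return an absolute Path to an existing .pdf file, or raise a clear error."""
    path = Path(pdf_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No file at {path} (current directory: {Path.cwd()})")
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {path.name}")
    return path


# Demo: create a placeholder file in a temporary directory and validate its path.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.pdf"
    sample.write_bytes(b"%PDF-1.4\n")
    checked = resolve_pdf_path(sample)
    print(checked.name)
```

The returned path could then be passed to the loader as `str(checked)`; the check only validates the location and extension, not the PDF contents.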

In [4]:
# Import PyPDFLoader
from langchain_community.document_loaders import PyPDFLoader

# Example function to load PDF

def load_pdf(pdf_path):
    """Load a PDF and return a list of LangChain Document objects."""
    loader = PyPDFLoader(file_path=pdf_path)
    documents = loader.load()
    return documents


In [5]:
# Load your PDF and print out content here
react_docs = load_pdf("../documents/react_paper.pdf")

print(f"Number of pages loaded: {len(react_docs)}")
print("\nFirst page preview:\n")
print(react_docs[0].page_content[:1200])


Number of pages loaded: 33

First page preview:

Published as a conference paper at ICLR 2023
REAC T: S YNERGIZING REASONING AND ACTING IN
LANGUAGE MODELS
Shunyu Yao∗*,1, Jeffrey Zhao2, Dian Yu2, Nan Du2, Izhak Shafran2, Karthik Narasimhan1, Yuan Cao2
1Department of Computer Science, Princeton University
2Google Research, Brain team
1{shunyuy,karthikn}@princeton.edu
2{jeffreyzhao,dianyu,dunan,izhak,yuancao}@google.com
ABSTRACT
While large language models (LLMs) have demonstrated impressive performance
across tasks in language understanding and interactive decision making, their
abilities for reasoning (e.g. chain-of-thought prompting) and acting (e.g. action
plan generation) have primarily been studied as separate topics. In this paper, we
explore the use of LLMs to generate both reasoning traces and task-speciﬁc actions
in an interleaved manner, allowing for greater synergy between the two: reasoning
traces help the model induce, track, and update action plans as well as handle
except

---

## **Exercise 3: Document Chunking**

This exercise introduces splitting large documents into manageable text chunks.

**Steps:**

1. **Import Text Splitter:** Use `RecursiveCharacterTextSplitter`.
2. **Chunk Document:** Write a function that splits loaded documents into chunks.
3. **Test Function:** Verify by displaying the resulting chunks.
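
To build intuition for `chunk_size` and `chunk_overlap`, the sketch below splits a string with a plain fixed-width sliding window. This is a deliberate simplification and not the algorithm `RecursiveCharacterTextSplitter` uses, which prefers to break on separators such as paragraphs, lines, and spaces. `sliding_window_chunks` is a hypothetical name.

```python
def sliding_window_chunks(text, chunk_size=150, chunk_overlap=40):
    """Split text into fixed-width chunks where neighbours share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive.")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size).")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Demo on 400 characters of synthetic text.
sample_text = "".join(chr(97 + i % 26) for i in range(400))
sample_chunks = sliding_window_chunks(sample_text, chunk_size=150, chunk_overlap=40)

print(len(sample_chunks))
# The tail of one chunk is repeated at the head of the next one.
print(sample_chunks[0][-40:] == sample_chunks[1][:40])
```

The overlap exists so that a sentence cut at a chunk boundary still appears intact in at least one chunk. Smaller chunks give more precise matches but less context per retrieved result.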


In [8]:
# Import RecursiveCharacterTextSplitter
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Example chunking function
def chunk_documents(documents, chunk_size=150, chunk_overlap=40):
    """Split loaded documents into overlapping chunks."""
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
    )
    chunks = splitter.split_documents(documents)

    # Add a stable chunk id for easier debugging/inspection.
    for idx, chunk in enumerate(chunks):
        chunk.metadata["id"] = f"chunk_{idx}"

    return chunks


In [10]:
# Execute your chunking function and display results here
react_chunks = chunk_documents(react_docs, chunk_size=150, chunk_overlap=40)

print(f"Number of chunks created: {len(react_chunks)}")
print("\nFirst chunk metadata:\n", react_chunks[0].metadata)
print("\nFirst chunk content:\n", react_chunks[0].page_content)


Number of chunks created: 1094

First chunk metadata:
 {'producer': 'pdfTeX-1.40.21', 'creator': 'LaTeX with hyperref', 'creationdate': '2023-03-13T00:09:11+00:00', 'author': '', 'keywords': '', 'moddate': '2023-03-13T00:09:11+00:00', 'ptex.fullbanner': 'This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2', 'subject': '', 'title': '', 'trapped': '/False', 'source': '../documents/react_paper.pdf', 'total_pages': 33, 'page': 0, 'page_label': '1', 'id': 'chunk_0'}

First chunk content:
 Published as a conference paper at ICLR 2023
REAC T: S YNERGIZING REASONING AND ACTING IN
LANGUAGE MODELS



---

## **Exercise 4: Embedding and Storage**

In this exercise, you will create embeddings from text chunks and store them efficiently.

**Steps:**

1. **Choose Embedding Model:** Use `sentence-transformers/all-mpnet-base-v2` from Hugging Face.
2. **Generate Embeddings:** Transform document chunks into embeddings.
3. **Store Embeddings:** Save these embeddings using FAISS locally.
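
The code in this exercise normalizes embeddings before indexing. The sketch below illustrates, with two toy vectors and no external libraries, why that matters: for unit-length vectors the dot product equals the cosine similarity, and the squared Euclidean distance is `2 - 2 * cosine`, so ranking by either measure gives the same order. The helper names are hypothetical and the vectors are made up.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length, as normalize_embeddings=True requests."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("Cannot normalize a zero vector.")
    return [x / norm for x in vector]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

cosine = dot(a, b)
# For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
print(cosine, squared_l2(a, b), 2 - 2 * cosine)
```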


In [11]:
# Import libraries
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.vectorstores.faiss import DistanceStrategy

# Example function for embeddings and storage
def embed_and_store(chunks, db_name="react_exercise"):
    """Create embeddings for chunks and store them in a local FAISS index."""
    embedding_model = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-mpnet-base-v2",
        encode_kwargs={"normalize_embeddings": True},
    )

    vectorstore = FAISS.from_documents(
        documents=chunks,
        embedding=embedding_model,
        distance_strategy=DistanceStrategy.COSINE,
    )

    save_path = f"../vector_databases/vector_db_{db_name}"
    vectorstore.save_local(save_path)
    return vectorstore, save_path


In [12]:
# Generate embeddings and save them locally
react_vectorstore, react_vector_db_path = embed_and_store(react_chunks, db_name="react_exercise")

print(f"Stored FAISS index at: {react_vector_db_path}")
print(f"Documents in vector store: {react_vectorstore.index.ntotal}")


Stored FAISS index at: ../vector_databases/vector_db_react_exercise
Documents in vector store: 1094


---

## **Exercise 5: Retrieval from FAISS**

Here, you will learn how to retrieve documents from a vector database using embeddings.

**Steps:**

1. **Load Embeddings:** Load stored embeddings from the FAISS database.
2. **Implement Retrieval:** Create logic to retrieve relevant chunks based on queries.
3. **Test Retriever:** Execute retrieval using sample queries.
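
Conceptually, retrieval means embedding the query, scoring it against every stored chunk, and keeping the `k` best matches. The sketch below shows that idea with a brute-force search. The `embed` function is a toy bag-of-words stand-in for a real embedding model, and all names and sample chunks are hypothetical; FAISS performs the same kind of nearest-neighbour search far more efficiently over dense vectors.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    numerator = sum(a[word] * b[word] for word in set(a) & set(b))
    denominator = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return numerator / denominator if denominator else 0.0


def top_k(query, chunks, k=3):
    """Return the k (score, chunk) pairs most similar to the query, best first."""
    query_vector = embed(query)
    scored = [(cosine(query_vector, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


toy_chunks = [
    "reasoning and acting in language models",
    "faiss stores vectors for similarity search",
    "the pdf loader reads one page at a time",
]

for score, chunk in top_k("language models reasoning", toy_chunks, k=2):
    print(f"{score:.2f}  {chunk}")
```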

In [13]:
# Implement retrieval logic from your FAISS database
def load_retriever(vector_db_path):
    """Load a local FAISS index and expose a retriever."""
    embedding_model = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-mpnet-base-v2",
        encode_kwargs={"normalize_embeddings": True},
    )

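    # The saved index includes pickled data, so loading requires an explicit
    # opt-in. Only load indexes that you created yourself or fully trust.
    vectorstore = FAISS.load_local(
        folder_path=vector_db_path,
        embeddings=embedding_model,
        allow_dangerous_deserialization=True,
        distance_strategy=DistanceStrategy.COSINE,
    )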

    retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
    return retriever, vectorstore


react_retriever, react_vectorstore = load_retriever(react_vector_db_path)


In [14]:
# Test your retrieval system with queries
query = "What is ReAct in language models?"
retrieved_docs = react_retriever.invoke(query)

print(f"Retrieved {len(retrieved_docs)} documents for query: {query}\n")
for i, doc in enumerate(retrieved_docs, start=1):
    print(f"Result {i} metadata: {doc.metadata}")
    print(doc.page_content[:300])
    print("-" * 80)


Retrieved 3 documents for query: What is ReAct in language models?

Result 1 metadata: {'producer': 'pdfTeX-1.40.21', 'creator': 'LaTeX with hyperref', 'creationdate': '2023-03-13T00:09:11+00:00', 'author': '', 'keywords': '', 'moddate': '2023-03-13T00:09:11+00:00', 'ptex.fullbanner': 'This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2', 'subject': '', 'title': '', 'trapped': '/False', 'source': '../documents/react_paper.pdf', 'total_pages': 33, 'page': 3, 'page_label': '4', 'id': 'chunk_148'}
Since decision making and reasoning capabilities are integrated into a large language model, ReAct
--------------------------------------------------------------------------------
Result 2 metadata: {'producer': 'pdfTeX-1.40.21', 'creator': 'LaTeX with hyperref', 'creationdate': '2023-03-13T00:09:11+00:00', 'author': '', 'keywords': '', 'moddate': '2023-03-13T00:09:11+00:00', 'ptex.fullbanner': 'This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpat

---

## **Exercise 6: Connecting Retrieval with LLM**

You will now connect document retrieval to the language model, so that answers are generated from the retrieved chunks.

**Steps:**

1. **Create Retrieval Chain:** Link your retrieval system to your instantiated LLM.
2. **Test the Chain:** Confirm it works by generating answers from retrieved documents.
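
Linking retrieval to the LLM comes down to placing the retrieved chunks into the prompt as context. The sketch below shows that "stuffing" step with plain string formatting. `build_prompt` is a hypothetical helper, the sample chunks are invented, and joining chunks with a blank line is an assumption about how the context is assembled rather than a statement about LangChain internals.

```python
PROMPT_TEMPLATE = """You are a helpful assistant. Use only the provided context to answer.
If the answer is not in the context, say you do not know.

Context:
{context}

Question: {input}"""


def build_prompt(question, chunks, separator="\n\n"):
    """Concatenate retrieved chunks and insert them into the prompt template."""
    context = separator.join(chunks)
    return PROMPT_TEMPLATE.format(context=context, input=question)


retrieved = [
    "ReAct interleaves reasoning traces with task-specific actions.",
    "Reasoning traces help the model track and update action plans.",
]

prompt_text = build_prompt("What is ReAct?", retrieved)
print(prompt_text)
```

Printing the assembled prompt this way is a useful debugging step: if the context does not contain the answer, the model cannot produce it either.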

In [15]:
# Write a function to create retrieval and document processing chains
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains.retrieval import create_retrieval_chain


def create_rag_chain(retriever, llm):
    """Connect retriever with an LLM using a retrieval QA prompt."""
    prompt = ChatPromptTemplate.from_template(
        """You are a helpful assistant. Use only the provided context to answer.
If the answer is not in the context, say you do not know.

Context:
{context}

Question: {input}"""
    )

    combine_docs_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)
    retrieval_chain = create_retrieval_chain(
        retriever=retriever,
        combine_docs_chain=combine_docs_chain,
    )
    return retrieval_chain


react_retrieval_chain = create_rag_chain(react_retriever, llm)


In [16]:
# Invoke your chain with a sample question
sample_question = "Give a short summary of the ReAct paper."
response = react_retrieval_chain.invoke({"input": sample_question})

print("Question:", sample_question)
print("Answer:", response["answer"])


Question: Give a short summary of the ReAct paper.
Answer: Based on the provided context, I can summarize the ReAct paper as follows:

The ReAct paper (arXiv:2210.03629v3) discusses a model called ReAct, which can change its behavior drastically when given certain hints or modifications, such as adding 17 to a specific point (Act 23). The paper aims to demonstrate the differences between ReAct and another model called IM, highlighting the importance of internal modifications in ReAct.


---

## **Exercise 7: Interactive Chat System**

In the final exercise, you will build an interactive chat-based query system.

**Steps:**

1. **Create Chat Interface:** Develop a simple function for interactive querying.
2. **Run the Chat:** Allow users to ask questions and receive immediate responses.


In [17]:
# Define your interactive chat querying function
def interactive_chat(retrieval_chain):
    """Simple terminal-like chat loop for querying the indexed documents."""
    print("RAG chat is ready. Type 'exit' to stop.")

    while True:
        user_question = input("\nYou: ").strip()

        if user_question.lower() in {"exit", "quit", "q"}:
            print("Assistant: Session ended.")
            break

        if not user_question:
            print("Assistant: Please enter a question.")
            continue

        result = retrieval_chain.invoke({"input": user_question})
        print("Assistant:", result["answer"].strip())
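
A loop that calls `input()` is awkward to test. The sketch below shows one possible variant in which the read and write functions are injected, so the same logic can be exercised with scripted questions and without calling the API. `chat_loop` and `EchoChain` are hypothetical names; `EchoChain` only imitates the `invoke({"input": ...})` call and the `"answer"` key used above.

```python
class EchoChain:
    """Stand-in for the retrieval chain: same invoke() shape, no LLM call."""

    def invoke(self, payload):
        return {"answer": f"You asked: {payload['input']}"}


def chat_loop(retrieval_chain, read_line=input, write_line=print):
    """Chat loop with injectable I/O; defaults behave like a normal terminal session."""
    write_line("RAG chat is ready. Type 'exit' to stop.")

    while True:
        user_question = read_line("\nYou: ").strip()

        if user_question.lower() in {"exit", "quit", "q"}:
            write_line("Assistant: Session ended.")
            break

        if not user_question:
            write_line("Assistant: Please enter a question.")
            continue

        result = retrieval_chain.invoke({"input": user_question})
        write_line("Assistant: " + result["answer"].strip())


# Scripted session: one question, one empty line, then exit.
scripted_inputs = iter(["What is ReAct?", "", "exit"])
transcript = []
chat_loop(
    EchoChain(),
    read_line=lambda prompt: next(scripted_inputs),
    write_line=transcript.append,
)

for line in transcript:
    print(line)
```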


In [None]:
# Run and test your interactive chat system
interactive_chat(react_retrieval_chain)


I don't know. The provided context does not mention the specific problem that ReAct tries to solve. It only mentions the differences between ReAct and IM, and the ability of ReAct to change its behavior.


---

## **Conclusion & Reflection**

After completing these exercises:

- Summarize key concepts learned.
- Reflect on the effectiveness and limitations of the free LLM and RAG system you've built.
- Consider how you might improve or extend your system in practical applications.

---