# RAG (Retrieval-Augmented Generation) using LangChain and Gemini

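This notebook implements a RAG pipeline with LangChain and Google's Gemini models: it loads a blog post, splits it into chunks, embeds the chunks into a Chroma vector store, and answers questions by passing the retrieved chunks to a Gemini chat model.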

![RAG Pipeline](download.png)

In [38]:
# Install required packages
%pip install langchain langchain-core langchain-community langchainhub \
            langchain-google-genai chromadb tiktoken python-dotenv bs4 --quiet

Note: you may need to restart the kernel to use updated packages.


In [39]:
import os
from dotenv import load_dotenv
import bs4
from langchain import hub
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

# Load environment variables
load_dotenv()

# Verify Gemini API key is set
google_api_key = os.getenv("GEMINI_API_KEY")
if not google_api_key:
    raise ValueError("Please set the GEMINI_API_KEY environment variable (for example, in a .env file)")

# langchain-google-genai reads the key from GOOGLE_API_KEY, so mirror it there
os.environ["GOOGLE_API_KEY"] = google_api_key
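
The key check above accepts only `GEMINI_API_KEY`. As a minimal sketch of a more tolerant lookup, assuming you might have the key stored under either name, the hypothetical helper `resolve_api_key` below (not part of LangChain or `python-dotenv`) returns the first non-empty value it finds. It takes a plain mapping so it can be tried without touching the real environment.

```python
def resolve_api_key(env: dict, names=("GEMINI_API_KEY", "GOOGLE_API_KEY")) -> str:
    """Return the first non-empty key found under any of the given names.

    Hypothetical helper for illustration; the notebook itself only
    checks GEMINI_API_KEY.
    """
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    raise ValueError(f"Set one of {', '.join(names)} (for example, in a .env file)")


# A plain dict stands in for os.environ in this example.
example_env = {"GEMINI_API_KEY": "", "GOOGLE_API_KEY": "example-key"}
resolved = resolve_api_key(example_env)
print(resolved)
```

In the notebook, the same helper could be called as `resolve_api_key(dict(os.environ))` after `load_dotenv()`.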

In [40]:
#### INDEXING ####

# Load the blog post, parsing only the tags listed in the SoupStrainer.
# Note: because "div" is in the list, most of the page body is likely kept.
def load_webpage():
    url = "https://lilianweng.github.io/posts/2023-06-23-agent/"
    loader = WebBaseLoader(
        web_paths=(url,),
        bs_kwargs=dict(
            parse_only=bs4.SoupStrainer(
                ["article", "div", "p", "h1", "h2", "h3", "h4", "li"]
            )
        )
    )
    return loader.load()

# Load documents
docs = load_webpage()

# Split documents into chunks of up to 2000 characters, with 400 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=400)
splits = text_splitter.split_documents(docs)

# Create embeddings using Gemini
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# Embed the chunks and store them in an in-memory Chroma vector store
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embeddings
)

# Create a retriever that returns the 6 most similar chunks for each query
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 6}
)
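
The `chunk_size` and `chunk_overlap` settings used above may be easier to reason about with a toy example. The sketch below is a simplified fixed-window character chunker, not the `RecursiveCharacterTextSplitter` algorithm (which prefers to break on separators such as paragraphs and lines). `sliding_chunks` is a hypothetical helper introduced only for illustration, and the tiny sizes are chosen so the overlap is visible.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration; not part of LangChain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in demo_chunks:
    print(chunk)
```

In this toy model, each chunk repeats the last 4 characters of the previous one, so any passage no longer than the overlap appears whole in at least one chunk. The notebook's settings apply the same idea at a larger scale (2000 characters with 400 of overlap).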


In [41]:
#### RETRIEVAL and GENERATION ####

# Get the RAG prompt from LangChain hub
prompt = hub.pull("rlm/rag-prompt")

# Initialize Gemini model
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)

# Join the retrieved chunks into a single context string, separated by blank lines
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

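# Create the RAG chain: the input question is sent to the retriever to build
# the context and is also passed through unchanged as the "question" field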
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
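
The chain above maps a single input question to two prompt fields, `context` and `question`. A rough plain-Python sketch of that data flow, with no LangChain dependency, might look like the following. `FakeDoc`, `fake_retriever`, and `build_prompt_inputs` are hypothetical stand-ins, and `TEMPLATE` is a made-up string for illustration, not the text of `rlm/rag-prompt`.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Stand-in for a LangChain Document; only page_content is modeled."""
    page_content: str


def format_docs(docs):
    # Same logic as the notebook's format_docs: join chunk texts with blank lines.
    return "\n\n".join(doc.page_content for doc in docs)


def fake_retriever(question: str) -> list[FakeDoc]:
    # Hypothetical retriever: ignores the question and returns fixed chunks.
    return [FakeDoc("Chunk about planning."), FakeDoc("Chunk about memory.")]


def build_prompt_inputs(question: str) -> dict:
    # Mirrors {"context": retriever | format_docs, "question": RunnablePassthrough()}:
    # the same question feeds both branches.
    return {
        "context": format_docs(fake_retriever(question)),
        "question": question,
    }


# Illustrative template only; not the text of the hub prompt.
TEMPLATE = "Context:\n{context}\n\nQuestion: {question}"

inputs = build_prompt_inputs("How do agents use memory?")
filled_prompt = TEMPLATE.format(**inputs)
print(filled_prompt)
```

In the real chain, the filled prompt is then sent to the chat model, and `StrOutputParser` extracts the text of the reply.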

In [42]:
# Test the RAG system
questions = [
    "What is Task Decomposition and how is it used in AI agents?",
    "How do AI agents handle memory and what types of memory are discussed?",
    "What are the main challenges in LLM-powered autonomous agents?"
]

for question in questions:
    print(f"\nQuestion: {question}")
    print("\nAnswer:")
    response = rag_chain.invoke(question)
    print(response)
    print("\n" + "-"*80)


Question: What is Task Decomposition and how is it used in AI agents?

Answer:
I am sorry, but this context does not contain information about task decomposition in AI agents.  Therefore, I cannot answer your question.

--------------------------------------------------------------------------------

Question: How do AI agents handle memory and what types of memory are discussed?

Answer:
AI agents handle memory using short-term memory (in-context learning within a limited context window) and long-term memory (external vector stores for retrieval).  The types of memory discussed include short-term, long-term, and  mappings to human memory types like sensory, explicit/declarative, and implicit/procedural memory.

--------------------------------------------------------------------------------

Question: What are the main challenges in LLM-powered autonomous agents?

Answer:
The provided text only mentions that there are "a couple common limitations" in building LLM-centered agents, but