# Hands‑On: RAG & Agents with Gemini

**Objective:** Build a Retrieval-Augmented Generation (RAG) system using Google Gemini, LangChain, and FAISS. We will evolve this from a simple Q&A pipeline into an intelligent **Agent** capable of using tools.

**What we will build:**
1.  **Ingestion:** Load documents about a fictional company ("TechNova").
2.  **Indexing:** Split text into chunks and create vector embeddings.
3.  **Simple RAG:** A linear pipeline (Retrieve $\rightarrow$ Generate).
4.  **Agentic RAG:** A reasoning agent (LangGraph) that decides *when* to search.

## Install libraries
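We need to install the following core libraries:
* `google-genai` & `langchain-google-genai`: To interface with Gemini.
* `faiss-cpu`: A vector store for efficient similarity search.
* `langchain_community`, `langchain_text_splitters` & `langchain_huggingface`: LangChain integrations for the FAISS vector store, text splitting, and Hugging Face embeddings.
* `langgraph`: To build the agent workflow.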

In [None]:
!pip install google-genai faiss-cpu
!pip install langchain_community langchain_text_splitters langchain_huggingface langgraph langchain_google_genai

## Import & Configure Gemini Client

We retrieve the Gemini API key from Google Colab's secure `userdata` store and log in to the Hugging Face Hub with a token stored the same way.

In [None]:
import os
from google import genai
from google.colab import userdata
from huggingface_hub import login


google_api = userdata.get('GEMINI_API_KEY')
client = genai.Client(api_key=google_api)

hf_token = userdata.get('HF_TOKEN')
login(hf_token, add_to_git_credential=True)
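
Outside Colab, `google.colab` is not importable, so the cell above fails. A minimal sketch of a fallback, assuming the key is also exported as an environment variable with the same name; `load_api_key` is a hypothetical helper, not part of any SDK:

```python
import os


def load_api_key(name: str = "GEMINI_API_KEY") -> str:
    """Return the secret `name` from Colab if available, else from the environment."""
    try:
        # This import only succeeds inside Google Colab.
        from google.colab import userdata
    except ImportError:
        # Assumption: outside Colab the key lives in an environment variable.
        key = os.environ.get(name)
    else:
        key = userdata.get(name)
    if not key:
        raise RuntimeError(f"Secret {name!r} not found; set it before running the notebook.")
    return key
```

With this helper, `google_api = load_api_key()` would work both in Colab and in a local Jupyter session.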

## Document Dataset

We will use a fictional dataset about **TechNova Inc.**
In a real-world scenario, this text would be loaded from PDFs, websites, or SQL databases.

In [None]:
docs = [
    "TechNova Inc. is a leading technology company that has been at the forefront of delivering innovative software solutions for businesses across multiple industries, helping organizations optimize their operations and achieve sustainable growth through cutting-edge technology.",
    "NovaSuite, the flagship product of TechNova, is a comprehensive enterprise software platform that integrates project management, workflow automation, collaboration tools, and real-time analytics to provide businesses with a single unified system for managing complex operations efficiently.",
    "TechNova Analytics is an advanced data analytics solution designed to help organizations uncover actionable insights from their data, allowing decision-makers to monitor key performance indicators, forecast trends, and make informed business decisions based on real-time and historical data.",
    "The TechNova Agents platform provides intelligent automation capabilities that enable businesses to streamline repetitive tasks, improve operational efficiency, and reduce human error by leveraging smart workflows and customizable automation rules tailored to each organization’s specific needs.",
    "TechNova Cloud is a highly secure and scalable cloud infrastructure service that allows companies to host their applications, store critical data, and deploy enterprise solutions with confidence, while benefiting from high availability, fast performance, and strict compliance with international security standards.",
    "TechNova Gemini is a versatile generative AI model offered as an API service, enabling businesses to create interactive applications, personalized customer experiences, and innovative content solutions, all while integrating seamlessly with existing enterprise systems and digital platforms.",
    "In addition to its products, TechNova offers consulting services that cover software integration, IT strategy planning, digital transformation, and organizational change management, ensuring that clients can implement new technologies effectively and achieve measurable results.",
    "The company maintains strategic partnerships with leading hardware and software vendors to enhance the functionality and compatibility of its products, providing clients with a rich ecosystem of complementary tools and solutions that maximize business value.",
    "TechNova places a strong emphasis on customer support and training, offering dedicated teams, comprehensive documentation, and hands-on workshops to help clients fully leverage the capabilities of their software and ensure successful adoption across all levels of the organization.",
    "Sustainability and ethical practices are central to TechNova’s operations, as the company continuously works to minimize its environmental footprint, promote social responsibility, and ensure that all products, services, and business practices adhere to high standards of integrity and transparency."
]


## Building the RAG


### Step 1: Indexing
RAG avoids feeding every document to the LLM at once; instead, only the passages relevant to each query are retrieved and passed along. To make that possible, we must:
1.  **Chunk:** Break text into smaller pieces.
2.  **Embed:** Convert text into numeric vectors (lists of numbers) that represent meaning.
3.  **Store:** Save these vectors in a database (FAISS).

Chunking

In [None]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS

# Initialize the recursive text splitter
splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,        # maximum size of each chunk
    chunk_overlap=30,      # overlap between chunks
    separators=["\n\n", "\n", " ", ""]  # hierarchy of separators
)

# Split the documents
doc_chunks = splitter.split_text(' '.join(docs))
print(f"Number of chunks: {len(doc_chunks)}")
print(f"Example Chunk:{doc_chunks[1]}")


Number of chunks: 11
Example Chunk:NovaSuite, the flagship product of TechNova, is a comprehensive enterprise software platform that integrates project management, workflow automation, collaboration tools, and real-time analytics to provide businesses with a single unified system for managing complex operations efficiently. TechNova
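
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a plain sliding window over characters. This is a simplified sketch, not the algorithm `RecursiveCharacterTextSplitter` uses (to our understanding, it first splits on the separators and then merges the pieces, which is why the example chunk above ends on a whole word). `chunk_with_overlap` is a hypothetical helper.

```python
def chunk_with_overlap(text: str, chunk_size: int = 300, chunk_overlap: int = 30) -> list[str]:
    """Cut `text` into fixed-size character windows that share `chunk_overlap` characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Tiny example so the overlap is easy to see.
print(chunk_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
```

Each chunk starts with the last character of the previous one, which is the role the 30-character overlap plays in the cell above: context that straddles a boundary appears in both neighbouring chunks.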


Embedding

In [None]:
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_core.documents import Document


# Load Sentence Transformer embeddings
embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Convert doc_chunks (list of strings) into a list of Document objects
document_objects = [Document(page_content=chunk) for chunk in doc_chunks]



Storing

In [None]:
# Create FAISS vector store
vector_store = FAISS.from_documents(document_objects, embedding_model)
# Save FAISS index and document mapping
vector_store.save_local("faiss_index")

### Step 2: Build a basic retriever
This component acts as the "Search Engine." It takes a user query, converts it into numbers (embedding), and finds the most mathematically similar chunks in our FAISS index.
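
What "mathematically similar" means can be sketched with a brute-force search over toy vectors. To our knowledge, LangChain's FAISS wrapper defaults to a flat index with Euclidean (L2) distance, so the sketch ranks by squared L2 distance; the three-dimensional vectors and chunk labels are made up for illustration, and `top_k` is a hypothetical helper.

```python
def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vec: list[float], index: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Return the texts of the k vectors closest to query_vec (smallest distance first)."""
    ranked = sorted(index, key=lambda item: squared_l2(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Toy index: (chunk text, embedding). Real embeddings have hundreds of dimensions.
toy_index = [
    ("chunk about analytics", [0.9, 0.1, 0.0]),
    ("chunk about cloud hosting", [0.1, 0.9, 0.0]),
    ("chunk about consulting", [0.0, 0.2, 0.9]),
]

print(top_k([0.8, 0.2, 0.1], toy_index, k=2))
```

FAISS performs the same kind of ranking, but with optimized index structures instead of a Python loop over every vector.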

In [None]:
from langchain_community.vectorstores import FAISS

def retrieve(query, k=3, vector_store=None, embedding_model=None, faiss_index_path="faiss_index"):

    if embedding_model is None:
        raise ValueError("embedding_model must be provided.")

    # Load FAISS if not passed
    if vector_store is None:
        vector_store = FAISS.load_local(
            faiss_index_path,
            embedding_model,
            allow_dangerous_deserialization=True
        )

    # Embed the query
    query_embedding = embedding_model.embed_query(query)

    # Perform similarity search; returns a list of Document objects (no scores)
    results = vector_store.similarity_search_by_vector(
        query_embedding, k=k
    )

    return results


In [None]:

query = "What is TechNova?"
top_docs = retrieve(query, k=3, vector_store=vector_store, embedding_model=embedding_model)

for i, doc in enumerate(top_docs, 1):
    print(f"Result {i}: {doc.page_content}\n")


Result 1: TechNova Inc. is a leading technology company that has been at the forefront of delivering innovative software solutions for businesses across multiple industries, helping organizations optimize their operations and achieve sustainable growth through cutting-edge technology. NovaSuite, the flagship

Result 2: of its products, providing clients with a rich ecosystem of complementary tools and solutions that maximize business value. TechNova places a strong emphasis on customer support and training, offering dedicated teams, comprehensive documentation, and hands-on workshops to help clients fully

Result 3: NovaSuite, the flagship product of TechNova, is a comprehensive enterprise software platform that integrates project management, workflow automation, collaboration tools, and real-time analytics to provide businesses with a single unified system for managing complex operations efficiently. TechNova



### Step 3: Create the Generator
This is the "Writer." We wrap the retrieved context into a prompt and ask Gemini to answer based **only** on that information.
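
One optional way to wrap the retrieved chunks is to number them and separate them with blank lines, so the model can tell where one passage ends and the next begins. A minimal sketch; `build_context` is a hypothetical helper and the numbering format is an assumption, not something Gemini requires:

```python
def build_context(chunks: list[str]) -> str:
    """Join retrieved chunks into one context string, numbered as [1], [2], ..."""
    return "\n\n".join(f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, 1))


print(build_context(["NovaSuite is the flagship product. ", " TechNova Cloud hosts applications."]))
```

The resulting string can be passed as the `context` argument of the generator function.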

In [None]:
from google.genai.types import GenerateContentConfig

def generate_answer(context, query, client, model_name="gemini-2.5-flash", temperature=0.7, max_output_tokens=512):
    """
    Generates an answer using Gemini LLM based on provided context and query.
    If the context does not contain relevant information, the model should respond 'I don't know.'
    """
    # Prompt that restricts the model to the retrieved context
    prompt = f"""
You are an assistant that answers questions **based only on the given context**.
- Use the context to answer the question as accurately as possible.
- If the context does NOT contain enough information to answer the question, reply: "I don't know."
- Do NOT make up information.

Context:
{context}

Question:
{query}

Answer:
"""
    # Generate content
    response = client.models.generate_content(
        model=model_name,
        contents=prompt,
        config=GenerateContentConfig(
            temperature=temperature,
            max_output_tokens=max_output_tokens
        )
    )

    # Return first candidate
    return response.text


### Step 4: Putting It All Together (RAG Pipeline)
Here, the pipeline first retrieves the relevant chunks, then asks the LLM to answer from that retrieved context.

In [None]:
def rag(query, vector_store, embedding_model, client, k=3):

    # Retrieve documents
    top_docs = retrieve(query, k=k, vector_store=vector_store, embedding_model=embedding_model)

    # Here we assume top_docs is a list of Document objects
    context = "\n".join([doc.page_content for doc in top_docs])

    # Generate answer
    answer = generate_answer(context, query, client)

    return answer, top_docs


In [None]:
query = "What is the mission of TechNova?"

answer, top_docs = rag(query, vector_store, embedding_model, client, k=3)

print("=== Generated Answer ===")
print(answer)



=== Generated Answer ===
TechNova's mission is delivering innovative software solutions for businesses across multiple industries, helping organizations optimize their operations and achieve sustainable growth through cutting-edge technology.


## Moving to Agents (LangGraph)
A simple RAG pipeline is rigid: it *always* searches, even if you just say "Hello".

An **Agent** is dynamic. It uses an LLM to **reason**:
1.  User asks a question.
2.  Agent thinks: "Do I know this? Or do I need to use a tool?"
3.  If needed, Agent calls the `rag_search_tool`.
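
The three steps above can be sketched as a small loop in plain Python. This is a toy illustration of the control flow only, not LangGraph's implementation: `toy_model` is a hypothetical stand-in for the LLM (a keyword rule instead of real reasoning), and `fake_search` stands in for the retrieval tool.

```python
def toy_model(messages: list[dict]) -> dict:
    """Hypothetical stand-in for the LLM: either request a tool call or answer."""
    last = messages[-1]
    if last["role"] == "tool":
        # A tool result is available, so write the final answer from it.
        return {"role": "assistant", "content": f"Based on the catalogue: {last['content']}"}
    if "technova" in last["content"].lower():
        # The question needs company knowledge, so ask for the search tool.
        return {"role": "assistant", "tool_call": {"name": "rag_search_tool", "query": last["content"]}}
    return {"role": "assistant", "content": "Hello! How can I help?"}


def fake_search(query: str) -> str:
    """Stand-in for the retrieval tool; returns a fixed snippet."""
    return "TechNova Inc. is a technology company."


def run_agent(user_input: str, model, tools: dict, max_steps: int = 5) -> list[dict]:
    """Alternate between model and tools until the model answers without a tool call."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        reply = model(messages)
        messages.append(reply)
        call = reply.get("tool_call")
        if call is None:
            break  # plain answer: the agent is done
        result = tools[call["name"]](call["query"])
        messages.append({"role": "tool", "content": result})
    return messages


tools_by_name = {"rag_search_tool": fake_search}
print(len(run_agent("Hello", toy_model, tools_by_name)))              # no tool used
print(len(run_agent("What is TechNova?", toy_model, tools_by_name)))  # tool used once
```

The greeting produces two messages (user, answer), while the TechNova question produces four (user, tool request, tool result, answer). That difference is exactly what the fixed pipeline cannot do.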

In [None]:
def rag_search_tool(query: str) -> dict:
    """
    Search the TechNova knowledge base.
    Reuses the existing `retrieve()` function without modifying it.
    Returns the retrieved chunks; the agent's LLM writes the final answer.
    """
    # Retrieve documents with the existing retrieve() function
    top_docs = retrieve(
        query=query,
        k=3,
        vector_store=vector_store,
        embedding_model=embedding_model
    )

    # Return result in tool-friendly format
    return {
        "documents": [doc.page_content for doc in top_docs]
    }


In [None]:
tools = [rag_search_tool]


In [None]:
from langchain.chat_models import init_chat_model
llm = init_chat_model(
    "gemini-2.0-flash",
    model_provider="google_genai",
    google_api_key=google_api
)


In [None]:
sys_msg = """
You are an AI agent for TechNova that uses tools when needed.
When the user asks for information stored in the company catalogue,
you MUST call the `rag_search_tool` tool.

If the user asks for general knowledge or reasoning, respond normally.
"""


In [None]:
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver

agent = create_react_agent(
    model=llm,
    tools=tools,
    prompt=sys_msg,
    checkpointer=MemorySaver()
)


<langgraph.graph.state.CompiledStateGraph object at 0x7d9b97b62ed0>


/tmp/ipython-input-388110057.py:4: LangGraphDeprecatedSinceV10: create_react_agent has been moved to `langchain.agents`. Please update your import to `from langchain.agents import create_agent`. Deprecated in LangGraph V1.0 to be removed in V2.0.
  agent = create_react_agent(


In [None]:

result = agent.invoke(
    {
        "messages": [
            {"role": "user", "content": "Hi, can you tell me more about TechNova?"}
        ]
    },
    config={"configurable": {"thread_id": "user_1"}},
)


print("\n=== Agent Response ===")

# Print every message in the conversation: user input, tool call, tool result, final answer
for msg in result["messages"]:
    msg.pretty_print()



=== Agent Response ===

Hi, can you tell me more about TechNova?
Tool Calls:
  rag_search_tool (abe42138-39ad-4750-b866-452793c0f599)
 Call ID: abe42138-39ad-4750-b866-452793c0f599
  Args:
    query: TechNova
Name: rag_search_tool

{"documents": ["TechNova Inc. is a leading technology company that has been at the forefront of delivering innovative software solutions for businesses across multiple industries, helping organizations optimize their operations and achieve sustainable growth through cutting-edge technology. NovaSuite, the flagship", "of its products, providing clients with a rich ecosystem of complementary tools and solutions that maximize business value. TechNova places a strong emphasis on customer support and training, offering dedicated teams, comprehensive documentation, and hands-on workshops to help clients fully", "NovaSuite, the flagship product of TechNova, is a comprehensive enterprise software platform that integrates project management, workflow automation, col

## Things to try
- Print the retrieved chunks for each query and inspect their relevance.
- Change `k` in `retrieve()` to see how adding or removing context changes the answer.
- Adjust the prompt wording, for example to tell the LLM to rely more strictly on the context.
- Try a different Gemini model or different generation settings (e.g., a higher `temperature`) and see whether it helps or hurts.
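
To compare values of `k` more systematically than by eye, one option is a small hit-rate check: for each test query, does any of the top-k chunks contain an expected phrase? The sketch below is self-contained and uses a toy corpus with a keyword-overlap retriever as a stand-in; `hit_rate`, `keyword_retrieve`, and the labeled queries are all hypothetical.

```python
def hit_rate(retrieve_fn, labeled_queries: list[tuple[str, str]], k: int) -> float:
    """Fraction of queries whose top-k chunks contain the expected phrase."""
    hits = 0
    for query, expected in labeled_queries:
        chunks = retrieve_fn(query, k)
        if any(expected.lower() in chunk.lower() for chunk in chunks):
            hits += 1
    return hits / len(labeled_queries)


# Toy stand-in for the vector store: rank chunks by shared words with the query.
corpus = [
    "NovaSuite is the flagship product.",
    "TechNova Cloud hosts applications.",
    "TechNova offers consulting services.",
]


def keyword_retrieve(query: str, k: int) -> list[str]:
    words = set(query.lower().split())
    ranked = sorted(corpus, key=lambda c: len(words & set(c.lower().split())), reverse=True)
    return ranked[:k]


labeled = [
    ("what is the flagship product", "NovaSuite"),
    ("who offers consulting", "consulting services"),
    ("is the company hosting apps", "Cloud"),
]

for k in (1, 2):
    print(k, round(hit_rate(keyword_retrieve, labeled, k), 2))
```

In the notebook, the stand-in retriever could be replaced by a wrapper such as `lambda q, k: [d.page_content for d in retrieve(q, k=k, vector_store=vector_store, embedding_model=embedding_model)]`, with labeled queries written by hand for the TechNova documents.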