<a href="https://colab.research.google.com/github/sudeep2711/langchain-rag-tutorial/blob/main/%F0%9F%A6%9C%F0%9F%94%97_LangChain_using_Gemini_API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🦜🔗 LangChain using Gemini API

**Author:** [Sudeep Kurian](https://www.linkedin.com/in/sudeep-kurian-75721614b)  
**Notebook Type:** Master Notebook — All Lessons Combined  
**Description:**  
This notebook is a complete, step-by-step learning journey for building with [LangChain](https://www.langchain.com/) using the **Google Gemini API**.  
We start with the basics (prompt templates, LLM chains) and progress to advanced topics like agents, document retrieval, embeddings, and a **capstone RAG Agent**.

---

> **Tip:** Run cells in order. Each lesson builds on the previous.  
> Tested in **Google Colab** with Python 3.11 and `langchain` ≥ 0.2.


# 📚 LangChain Learning Curriculum — Table of Contents

<table style="font-size:1.2em; width:100%;">
  <tr>
    <th>Lesson</th>
    <th>Topic</th>
    <th>Key Focus</th>
  </tr>
  <tr>
    <td>1</td>
    <td><a href="#lesson-1"><b>LangChain in Notebooks: Setup & First Call</b></a></td>
    <td>Install, set up Gemini API, run first LLM call</td>
  </tr>
  <tr>
    <td>2</td>
    <td><a href="#lesson-2"><b>Prompt Templates</b></a></td>
    <td>Create reusable, parameterized prompts</td>
  </tr>
  <tr>
    <td>3</td>
    <td><a href="#lesson-3"><b>Simple LLM Chains</b></a></td>
    <td>Combine prompts & models with LCEL</td>
  </tr>
  <tr>
    <td>4</td>
    <td><a href="#lesson-4"><b>Adding Memory</b></a></td>
    <td>Maintain conversational state with <code>ConversationBufferMemory</code></td>
  </tr>
  <tr>
    <td>5</td>
    <td><a href="#lesson-5"><b>Multiple Chains & Sequential Execution</b></a></td>
    <td>Build multi-step workflows</td>
  </tr>
  <tr>
    <td>6</td>
    <td><a href="#lesson-6"><b>Introduction to Agents</b></a></td>
    <td>Use built-in tools & decision-making</td>
  </tr>
  <tr>
    <td>7</td>
    <td><a href="#lesson-7"><b>Building a Custom Tool</b></a></td>
    <td>Create and integrate your own tool</td>
  </tr>
  <tr>
    <td>8</td>
    <td><a href="#lesson-8"><b>Loading and Splitting Documents</b></a></td>
    <td>Ingest PDFs, text, and web data</td>
  </tr>
  <tr>
    <td>9</td>
    <td><a href="#lesson-9"><b>Vectorstores</b></a></td>
    <td>Create & query embeddings with FAISS/Chroma</td>
  </tr>
  <tr>
    <td>10</td>
    <td><a href="#lesson-10"><b>Retrieval-Augmented Generation (RAG)</b></a></td>
    <td>Build Q&amp;A over documents</td>
  </tr>
  <tr>
    <td>11</td>
    <td><a href="#lesson-11"><b>Capstone Project: RAG Agent</b></a></td>
    <td>Full pipeline from data → retrieval → answer</td>
  </tr>
</table>

---


## Lesson 1 — LangChain in Notebooks: Setup & First Call <a name="lesson-1"></a>

### 🎯 Objective
We’ll get our notebook ready for LangChain (Gemini-only) and make our **first LLM call**.  
Together, we’ll see how LangChain pieces fit — **models**, **prompts**, and **chains** — and run a minimal example we can reuse in future lessons.

---

### 📖 What & Why (Concepts in Plain English)

- **LangChain**: A Python framework for building LLM-powered apps with modular building blocks — **models**, **prompts**, **chains**, **tools**, **memory**, and **retrievers**.
- **LLM (Model)**: The actual chat/completion API (here: Google Gemini). In LangChain, we wrap it with `ChatGoogleGenerativeAI`.
- **Prompt**: The instruction + inputs we send to the model. LangChain gives us `PromptTemplate` so prompts are parameterized and reusable.
- **Chain**: A sequence that wires our prompt to a model (and later, memory/tools/retrieval).  
  In this lesson, we’ll do the simplest form: “prompt → model → answer”.

> **Our Mental Model:**  
> We write a **Prompt** with placeholders → feed variables at runtime → a **Model** produces output → optionally add memory, retrieval, or tools later.  
> We start simple and will add complexity step-by-step.

---

### ✅ Prerequisites
- **Jupyter** or **Google Colab** notebook.
- A Gemini API key stored in Colab’s secrets as `GEMINI_API_KEY`.
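
Since the prerequisites allow plain Jupyter as well as Colab, the key lookup can be written so it works in both. The following is a minimal sketch, not part of the original notebook: the helper name `load_gemini_key` and its `ask` parameter are hypothetical, and it assumes the key may also be available as an environment variable outside Colab.

```python
import os
from getpass import getpass


def load_gemini_key(secret_name="GEMINI_API_KEY", ask=getpass):
    """Return the key from Colab secrets, else the environment, else a prompt."""
    try:
        # google.colab is only importable inside Colab.
        from google.colab import userdata
        key = userdata.get(secret_name)
    except Exception:
        # Outside Colab the import fails; inside Colab a missing secret also raises.
        key = None
    key = key or os.environ.get(secret_name) or os.environ.get("GOOGLE_API_KEY")
    if not key:
        # Last resort: ask interactively without echoing the key.
        key = ask(f"Enter {secret_name}: ")
    return key
```

To use it, assign the result to the variable the Gemini integration reads: `os.environ["GOOGLE_API_KEY"] = load_gemini_key()`.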

---

### 🚀 Minimal, Runnable Example (Gemini-Only)

1) **Install packages** (run once per environment)  
```python
!pip install -qU langchain langchain-google-genai langchain-community langchain-core tiktoken python-dotenv
```


In [3]:
!pip install langchain langchain-openai langchain-google-genai langchain-community langchain-core tiktoken python-dotenv


In [4]:
# ==== Lesson 1: Gemini-Only Setup & Demo ====

# 1) Install required packages (only if LangChain is not already importable)
try:
    import langchain
except ImportError:
    !pip install -qU langchain langchain-google-genai langchain-community langchain-core tiktoken python-dotenv

In [5]:
# 2) Load Gemini API key from Colab secrets
import os
from google.colab import userdata
os.environ["GOOGLE_API_KEY"] = userdata.get('GEMINI_API_KEY')
print("Google Gemini key set?", "GOOGLE_API_KEY" in os.environ)



Google Gemini key set? True


In [6]:
# 3) Minimal demo function to verify setup
def demo_summary():
    from langchain.prompts import PromptTemplate
    from langchain_google_genai import ChatGoogleGenerativeAI

    text = "LangChain helps developers build LLM apps using modular components like prompts, models, and chains."
    prompt = PromptTemplate.from_template("You are a concise assistant. Summarize in 1–2 sentences: {text}")
    model = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.2)
    chain = prompt | model
    return chain.invoke({"text": text}).content

# 4) Test the chain
print("Model response:", demo_summary())


Model response: LangChain assists developers in creating LLM applications by providing modular components such as prompts, models, and chains. This allows for flexible and structured development of AI-powered tools.


## Lesson 2 — Prompt Templates <a name="lesson-2"></a>

### 🎯 Objective
We’ll learn how to create **reusable, parameterized prompts** in LangChain using `PromptTemplate`.  
Prompt templates let us separate instructions from data, making our prompts **consistent**, **scalable**, and **easy to modify**.

---

### 📖 What & Why

- **What is a Prompt Template?**  
  A reusable prompt pattern where placeholders (like `{variable_name}`) get replaced with actual input values at runtime.

- **Why use them?**
  - Avoid rewriting the same prompt logic multiple times.
  - Parameterize inputs so the same structure works for different data.
  - Keep prompts readable and maintainable, especially for longer instructions.

---

### 🔑 Key Points
1. **Variables**: Placeholders in curly braces `{}` that get replaced at runtime.
2. **Consistency**: One definition, many inputs — useful for batch processing.
3. **Integration**: Works with any LangChain model, chain, or agent.
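
To make the placeholder mechanics concrete, here is a plain-Python sketch of what a template does when it is filled. It uses only the standard library and is not LangChain's implementation; the helper names `placeholder_names` and `fill` are hypothetical. LangChain's `PromptTemplate.from_template` infers variable names from the braces in a comparable way.

```python
from string import Formatter


def placeholder_names(template: str) -> list[str]:
    """Return the {placeholder} names in a template, in order of first appearance."""
    names = []
    for _literal, field, _spec, _conversion in Formatter().parse(template):
        if field and field not in names:
            names.append(field)
    return names


def fill(template: str, **values: str) -> str:
    """Fill a template, failing loudly if any placeholder has no value."""
    missing = [name for name in placeholder_names(template) if name not in values]
    if missing:
        raise KeyError(f"missing values for: {missing}")
    return template.format(**values)


template = "Write a {style} tweet about {topic}."
print(placeholder_names(template))

# One definition, many inputs: the same template serves every style.
for style in ["funny", "inspirational"]:
    print(fill(template, topic="LangChain", style=style))
```

Checking for missing values before formatting mirrors the main benefit of a template object over an ad hoc f-string: the set of required inputs is explicit and can be validated.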

---

### 🚀 Minimal, Runnable Example (Gemini-Only)

```python
from langchain.prompts import PromptTemplate

# Step 1: Define a prompt template
prompt = PromptTemplate(
    input_variables=["topic", "style"],
    template="Write a {style} tweet about {topic}."
)

# Step 2: Preview the filled prompt
print(prompt.format(topic="LangChain", style="funny"))

# Step 3: Pipe the prompt into Gemini model
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.7)

chain = prompt | model
result = chain.invoke({"topic": "LangChain", "style": "inspirational"})
print(result.content)
```


### 🧪 Time to Practice — Lesson 2

**Goal:**  
We will practice building `PromptTemplate` objects to make prompts reusable and dynamic.

---

#### **Part 1 — Basic PromptTemplate**
1. Create a `PromptTemplate` with:
   - Variables: `{SuperHero}` and `{Country}`
   - Template: `"Write a story about {SuperHero} visiting {Country}"`
2. Use `.format()` to preview the prompt with:
   - `SuperHero`: `"Spider Man"`
   - `Country`: `"India"`
3. Pipe the prompt into a `ChatGoogleGenerativeAI` model (`temperature=0.7`) and generate the story.
4. Print the story output.

---

#### **Part 2 — Multiple Inputs**
1. Create a list of at least **three countries**.
2. Loop through the list and run the chain for each country.
3. Print each story on a separate line with a clear label.


In [15]:
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables = ["SuperHero","Country"],
    template = "Write a story about  {SuperHero}  visiting  {Country}"
)
print(prompt.format(SuperHero = "Spider Man", Country = "India"))

Write a story about  Spider Man  visiting  India


In [16]:
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.7)

In [17]:
chain = prompt | model
result = chain.invoke({"SuperHero": "Spider Man", "Country": "India"})
print(result.content)

The hum of the Quinjet was a familiar lullaby, but the view from its viewport was anything but. Below, a sprawling, vibrant tapestry of greens, browns, and the occasional glint of water stretched to the horizon, eventually giving way to a colossal urban sprawl that seemed to breathe with a life of its own.

"Alright, Spidey, this is your stop," Captain America’s voice crackled over the comms. "Intel suggests a rogue AI, calling itself 'Maya,' has been disrupting critical infrastructure in Mumbai. It’s too subtle for a full Avengers response, but its patterns are… web-like. Thought you might have a knack for it."

Peter Parker, clad in his iconic red and blue, gave a nervous chuckle. "Web-like, huh? Hope it's not a fan of my work. Never thought I’d be doing global tech support."

He leaped from the jet, a thin line of webbing shooting out to catch a distant skyscraper. The wind rushed past him, carrying a symphony of unfamiliar sounds: a cacophony of horns, the distant chanting of a tem

In [21]:
countries = ["India", "Japan", "Brazil"]
for c in countries:
    print(chain.invoke({"SuperHero": "Spider Man", "Country": c}).content)


The humid air of Mumbai hit Peter Parker like a physical force the moment he stepped off the private jet. Not exactly how he usually traveled, but when Tony Stark “suggested” he needed a “cultural experience” after accidentally web-slinging through a diplomat’s garden party, Peter found himself on a very long flight.

His Spider-Man suit, carefully folded into a nondescript backpack, felt strangely heavy. He’d tried to stick to the tourist trail for a day – Gateway of India, Marine Drive, the bustling markets. It was incredible, overwhelming, a symphony of sights, sounds, and smells unlike anything New York had to offer. The sheer volume of people was a spider-sense nightmare, a constant, low thrum of humanity.

He’d just finished a plate of vada pav – spicy, delicious, and making his eyes water – when it happened.

A surge of panic rippled through the crowded street. Not the usual jostling. This was different. A high-pitched, almost digital whine cut through the cacophony. Traffic lig

## Lesson 3 — Simple LLM Chains <a name="lesson-3"></a>

### 🎯 Objective
We’ll learn how to connect a **prompt template** and a **model** into a **chain** so we can run the entire prompt → model → output flow in a single, reusable object.

---

### 📖 What & Why

- **What is an LLM Chain?**  
  A chain in LangChain is a sequence of components that process inputs and produce outputs.  
  The simplest LLM chain has:
  1. A **PromptTemplate**
  2. An **LLM model** (in our case, Gemini)
  3. (Optionally) post-processing logic

- **Why use LLM Chains?**
  - Encapsulates the prompt + model into one reusable unit.
  - Makes code cleaner and easier to maintain.
  - Works well with batch processing and integration into larger workflows.

---

### 🔑 Key Points
1. The **LangChain Expression Language (LCEL)** lets us “pipe” components together using `|`.
2. Once we have a chain, we can run it with `.invoke()` (single input) or `.batch()` (multiple inputs).
3. Chains can be expanded later to include memory, tools, or retrieval.
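
The `|` operator and the `.invoke()` / `.batch()` pair can be illustrated without calling any API. The sketch below is a simplified, hypothetical `Step` class written for this explanation; it is not how LangChain implements LCEL, and `fake_model` stands in for Gemini.

```python
class Step:
    """A minimal runnable: wraps a function and supports `|`, invoke, and batch."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new Step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        # Run the step on a single input.
        return self.func(value)

    def batch(self, values):
        # Run the step on each input and collect the results in order.
        return [self.invoke(value) for value in values]


# A prompt step that fills a template from a dict of variables.
prompt = Step(lambda inputs: "Write a {tone} LinkedIn post about {topic}.".format(**inputs))
# A stand-in model that echoes the prompt it received.
fake_model = Step(lambda text: f"<response to: {text}>")

chain = prompt | fake_model
print(chain.invoke({"topic": "AI-powered analytics", "tone": "professional"}))
print(chain.batch([
    {"topic": "LCEL", "tone": "casual"},
    {"topic": "RAG", "tone": "formal"},
]))
```

Because the composed object exposes the same methods as its parts, a chain can itself be piped into further steps, which is what later lessons rely on.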

---

### 🚀 Minimal, Runnable Example (Gemini-Only)

```python
from langchain.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI

# Step 1: Create a prompt template
prompt = PromptTemplate(
    input_variables=["topic", "tone"],
    template="Write a {tone} LinkedIn post about {topic}."
)

# Step 2: Initialize Gemini model
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.7)

# Step 3: Pipe prompt into model to form a chain
chain = prompt | model

# Step 4: Run the chain
result = chain.invoke({"topic": "AI-powered analytics", "tone": "professional"})
print(result.content)
```




### 🧪 Time to Practice — Lesson 3

**Goal:**  
We will practice creating a simple LLM chain that combines a `PromptTemplate` and the Gemini model into a single reusable workflow.

---

#### **Part 1 — Basic Interview Question Generator**
1. Create a `PromptTemplate` with:
   - Variables: `{role}` and `{skill}`
   - Template: `"Generate one behavioral interview question for a {role} role that tests {skill}."`
2. Initialize `ChatGoogleGenerativeAI` with:
   - `model="gemini-1.5-flash"`
   - `temperature=0.5`
3. Create a chain by piping the prompt into the model.
4. Invoke the chain with:
   - `role`: `"Data Scientist"`
   - `skill`: `"communication"`
5. Print the generated interview question.

---

#### **Part 2 — Multiple Skills**
1. Create a Python list of at least **three skills** (e.g., `["communication", "problem-solving", "leadership"]`).
2. Loop through the skills list.
3. For each skill, run the chain and print the question.

---

#### **Part 3 — Variety with Temperature**
1. Run the same chain with:
   - `temperature=0.2` (more deterministic)
   - `temperature=0.8` (more creative)
2. Compare how the style and wording of questions change.

---

> When we finish, we’ll discuss how chaining improves workflow reusability and makes it easy to adapt the same logic for multiple inputs.


In [22]:
from langchain.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI

# Step 1: Create a prompt template
prompt = PromptTemplate(
    input_variables=["role", "skill"],
    template="Generate one behavioral interview question for a {role} role that tests {skill}."
)

# Step 2: Initialize Gemini model
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.5)

# Step 3: Pipe prompt into model to form a chain
chain = prompt | model

# Step 4: Run the chain
result = chain.invoke({"role": "Data Scientist", "skill": "communication"})
print(result.content)

"Describe a time you had to explain a complex technical finding or analysis to a non-technical audience (e.g., executives, stakeholders).  What approach did you take, and what were the results?  What did you learn from that experience about communicating complex information effectively?"


In [23]:
#2 Loop

skills = ["communication", "problem-solving", "leadership"]

for skill in skills:
  print(skill,": ",chain.invoke({"role": "Data Scientist", "skill": skill}).content)


communication :  "Describe a time you had to explain a complex technical finding or analysis to a non-technical audience (e.g., executives, stakeholders).  What approach did you take, and what was the outcome?  Focus on the communication strategies you employed and how you adapted your message based on your audience."
problem-solving :  Tell me about a time you had to analyze a dataset with significant missing data.  Describe the approach you took to handle the missing values, the rationale behind your choices, and the impact your decision had on the final results.  What would you do differently if you had to repeat the process?
leadership :  Tell me about a time you had to lead a team to overcome a significant challenge in a data project where there were conflicting priorities or differing opinions among team members.  How did you navigate those differences to achieve a successful outcome?  What was your approach to ensuring everyone felt heard and valued, even if their ideas weren't 

In [26]:
# 3 Different temperatures

def run_with_temperature(temp):
    model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=temp)
    chain = prompt | model
    result = chain.invoke({"role": "Data Scientist", "skill": "Leadership"})
    print(f"--- Temperature {temp} ---")
    print(result.content, "\n")

for t in [0.2, 0.5, 0.8,1]:
    run_with_temperature(t)


--- Temperature 0.2 ---
Tell me about a time you had to lead a team through a complex data analysis project where there were conflicting priorities or disagreements among team members regarding the approach. How did you navigate these challenges and ensure the project stayed on track and delivered valuable results? 

--- Temperature 0.5 ---
Tell me about a time you had to lead a team through a complex data analysis project where there were conflicting priorities or differing opinions on the best approach.  How did you navigate those challenges and ensure the team remained productive and collaborative while still delivering high-quality results? 

--- Temperature 0.8 ---
"Tell me about a time you had to lead a team through a complex data analysis project where there were conflicting priorities or disagreements among team members regarding the approach.  How did you navigate these challenges, and what was the outcome?" 

--- Temperature 1 ---
Tell me about a time you had to lead a team t

## Lesson 4 — Adding Memory to Conversations <a name="lesson-4"></a>

### 🎯 Objective
We’ll learn how to add **memory** to a LangChain conversation so the model can remember earlier exchanges and use them in later responses.

---

### 📖 What & Why

- **What is Memory in LangChain?**  
  Memory stores parts of the conversation and feeds them back into the prompt on future calls.

- **Why use it?**
  - Without memory, each call is independent — the model forgets what we said before.
  - With memory, we can build **chatbots**, **assistants**, and **multi-turn conversations** that feel natural and coherent.

---

### 🔑 Key Points
1. Memory is **separate from the LLM** — it’s a component we attach to chains or agents.
2. Common memory types:
   - `ConversationBufferMemory` — stores the entire conversation verbatim (simplest).
   - `ConversationSummaryMemory` — stores a summarized history to save tokens.
3. Memory automatically injects stored history into the **prompt** through the variable named by `memory_key` (here `{history}`), so the key and the placeholder must match.
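
What a buffer memory does on each turn can be sketched in plain Python. This is an illustration under stated assumptions, not LangChain's `ConversationBufferMemory`: the `BufferMemory` class, the `chat` helper, and `counting_model` (a stand-in for Gemini) are all hypothetical names.

```python
class BufferMemory:
    """Plain-Python stand-in for a buffer memory: keeps every turn verbatim."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) pairs, oldest first

    def save(self, human: str, ai: str) -> None:
        self.turns.append(("Human", human))
        self.turns.append(("AI", ai))

    def history(self) -> str:
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


TEMPLATE = "Conversation so far:\n{history}\nHuman: {input}\nAI:"


def counting_model(prompt_text: str) -> str:
    """Stand-in for Gemini: reports how many earlier human turns it was shown."""
    earlier = prompt_text.count("Human:") - 1  # the last "Human:" is the new message
    return f"(saw {earlier} earlier human turns)"


def chat(memory: BufferMemory, user_input: str, model=counting_model) -> str:
    # 1) Inject stored history into the prompt, 2) call the model, 3) store the new turn.
    prompt_text = TEMPLATE.format(history=memory.history(), input=user_input)
    reply = model(prompt_text)
    memory.save(user_input, reply)
    return reply


memory = BufferMemory()
print(chat(memory, "Hi, my name is Alex."))
print(chat(memory, "What is my name?"))
print(memory.history())
```

The model itself stays stateless; it only appears to remember because each prompt carries the earlier turns. That also shows the cost of the buffer approach: the prompt grows with every exchange.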

---

### 🚀 Minimal, Runnable Example


In [27]:
from langchain.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.memory import ConversationBufferMemory
from langchain.chains import LLMChain

# Step 1: Define prompt with placeholders
prompt = PromptTemplate(
    input_variables=["history", "input"],
    template=(
        "The following is a friendly conversation between a human and AI.\n"
        "Conversation so far:\n{history}\n"
        "Human: {input}\n"
        "AI:"
    )
)

# Step 2: Create memory object
memory = ConversationBufferMemory(memory_key="history")

# Step 3: Initialize Gemini model
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.7)

# Step 4: Create chain
chain = LLMChain(llm=model, prompt=prompt, memory=memory)

# Step 5: Custom function to debug and run
def run_and_debug(user_input):
    # Get current history from memory
    current_history = memory.load_memory_variables({}).get("history", "")

    # Show the actual prompt sent to Gemini
    print("\n--- Prompt Sent to Model ---")
    print(prompt.format(history=current_history, input=user_input))

    # Run chain and print result
    response = chain.run(input=user_input)
    print("\n--- Model Response ---")
    print(response)

# Simulate conversation
run_and_debug("Hi, my name is Alex.")
run_and_debug("What is my name?")
run_and_debug("Suggest a hobby I might enjoy.")





--- Prompt Sent to Model ---
The following is a friendly conversation between a human and AI.
Conversation so far:

Human: Hi, my name is Alex.
AI:

--- Model Response ---
AI: Hello Alex, it's nice to meet you.  How can I help you today?

--- Prompt Sent to Model ---
The following is a friendly conversation between a human and AI.
Conversation so far:
Human: Hi, my name is Alex.
AI: AI: Hello Alex, it's nice to meet you.  How can I help you today?
Human: What is my name?
AI:

--- Model Response ---
AI: Your name is Alex.

--- Prompt Sent to Model ---
The following is a friendly conversation between a human and AI.
Conversation so far:
Human: Hi, my name is Alex.
AI: AI: Hello Alex, it's nice to meet you.  How can I help you today?
Human: What is my name?
AI: AI: Your name is Alex.
Human: Suggest a hobby I might enjoy.
AI:

--- Model Response ---
AI: That depends on what you like!  To give you the best suggestion, tell me a little about your interests. Do you prefer indoor or outdoor a

### 🧪 Time to Practice — Lesson 4

**Goal:**  
We will create a multi-turn conversation chain that remembers past exchanges and optionally displays the full prompt sent to Gemini for debugging.

---

#### **Part 1 — Basic Memory Chat**
1. Use `ConversationBufferMemory` with `memory_key="history"`.
2. Create a `PromptTemplate` with:
   - `{history}` for past conversation.
   - `{input}` for the new user message.
3. Attach the memory to an `LLMChain` with `ChatGoogleGenerativeAI`.
4. Simulate **three turns**:
   - Turn 1: Introduce yourself and state your favorite programming language.
   - Turn 2: Ask the model to recall your favorite programming language.
   - Turn 3: Ask it to suggest a project in that language.

---

#### **Part 2 — Using ConversationSummaryMemory**
1. Replace `ConversationBufferMemory` with `ConversationSummaryMemory`.
2. Keep the same three-turn conversation.
3. Compare:
   - Does the model still remember your favorite programming language?
   - How does the stored history look compared to the full buffer?

---

> **Tip:**  
> Prompt debugging is a powerful way to understand what the LLM “sees” on each turn and why it responds the way it does.


In [28]:

# Part 1

from langchain.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.memory import ConversationBufferMemory
from langchain.chains import LLMChain


# Step 1: Define prompt with placeholders
prompt = PromptTemplate(
    input_variables=["history", "input"],
    template=(
        "The following is a friendly conversation between a human and AI.\n"
        "Conversation so far:\n{history}\n"
        "Human: {input}\n"
        "AI:"
    )
)

# Step 2: Create memory object
memory = ConversationBufferMemory(memory_key="history")

# Step 3: Initialize Gemini model
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.7)

# Step 4: Create chain
chain = LLMChain(llm=model, prompt=prompt, memory=memory)

# Step 5: Custom function to debug and run
def run_and_debug(user_input):
    # Get current history from memory
    current_history = memory.load_memory_variables({}).get("history", "")

    # Show the actual prompt sent to Gemini
    print("\n--- Prompt Sent to Model ---")
    print(prompt.format(history=current_history, input=user_input))

    # Run chain and print result
    response = chain.run(input=user_input)
    print("\n--- Model Response ---")
    print(response)

# Simulate conversation
run_and_debug("Hi, my name is Sudeep. My Favourite Programming Language is Python")
run_and_debug("What is my favourite Programming Language?")
run_and_debug("Suggest a project I might enjoy in my favourite programming language.")



--- Prompt Sent to Model ---
The following is a friendly conversation between a human and AI.
Conversation so far:

Human: Hi, my name is Sudeep. My Favourite Programming Language is Python
AI:

--- Model Response ---
AI: Hi Sudeep! It's nice to meet you. Python is a great choice for a favorite programming language.  It's versatile and widely used. What do you enjoy most about using Python?

--- Prompt Sent to Model ---
The following is a friendly conversation between a human and AI.
Conversation so far:
Human: Hi, my name is Sudeep. My Favourite Programming Language is Python
AI: AI: Hi Sudeep! It's nice to meet you. Python is a great choice for a favorite programming language.  It's versatile and widely used. What do you enjoy most about using Python?
Human: What is my favourite Programming Language?
AI:

--- Model Response ---
Your favorite programming language is Python.

--- Prompt Sent to Model ---
The following is a friendly conversation between a human and AI.
Conversation so 

In [30]:
from langchain.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.memory import ConversationSummaryMemory
from langchain.chains import LLMChain

# Prompt
prompt = PromptTemplate(
    input_variables=["history", "input"],
    template=(
        "The following is a friendly conversation between a human and AI.\n"
        "Conversation so far:\n{history}\n"
        "Human: {input}\n"
        "AI:"
    )
)

# Two model instances: one for chat, one for summarization (a single shared instance also works)
chat_llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.7)
summary_llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.0)  # deterministic for summaries

# Memory requires an LLM
memory = ConversationSummaryMemory(llm=summary_llm, memory_key="history")

# Chain with memory
chain = LLMChain(llm=chat_llm, prompt=prompt, memory=memory)

# Debug helper
def run_and_debug(user_input: str):
    current_history = memory.load_memory_variables({}).get("history", "")
    print("\n--- Prompt Sent to Model ---")
    print(prompt.format(history=current_history, input=user_input))
    response = chain.run(input=user_input)
    print("\n--- Model Response ---")
    print(response)

# Conversation
run_and_debug("Hi, my name is Sudeep. My favorite programming language is Python.")
run_and_debug("What is my favorite programming language?")
run_and_debug("Suggest a project I might enjoy in my favorite programming language.")



--- Prompt Sent to Model ---
The following is a friendly conversation between a human and AI.
Conversation so far:

Human: Hi, my name is Sudeep. My favorite programming language is Python.
AI:

--- Model Response ---
Hi Sudeep!  It's nice to meet you. Python's a great choice – it's versatile and widely used.  What are you working on with Python at the moment, if you don't mind me asking?

--- Prompt Sent to Model ---
The following is a friendly conversation between a human and AI.
Conversation so far:
Sudeep introduced himself, stating his name is Sudeep and that his favorite programming language is Python. The AI responded with a greeting, complimented his choice of Python, noting its versatility and widespread use, and then asked about his current Python projects.
Human: What is my favorite programming language?
AI:

--- Model Response ---
You said your favorite programming language is Python.

--- Prompt Sent to Model ---
The following is a friendly conversation between a human an

## Lesson 5 — Multiple Chains & Sequential Execution <a name="lesson-5"></a>

### 🎯 Objective
We’ll learn how to run **multiple LangChain chains in sequence**, passing the output of one chain as the input to another.  
This is useful for **multi-step workflows**, where each step builds on the result of the previous step.

---

### 📖 What & Why

#### **What is an `LLMChain`?**
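- An **LLMChain** is LangChain's classic building block. Recent releases mark it as deprecated in favor of the LCEL `prompt | model` form from Lesson 3; it is used here because the sequential chains below take `LLMChain` objects.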
- It’s a **wrapper** that connects:
  1. A **PromptTemplate** (instructions + variables).
  2. An **LLM model** (e.g., Gemini, OpenAI).
- When you call `.run()` or `.invoke()`, it:
  - Fills the prompt with your inputs.
  - Sends it to the LLM.
  - Returns the model’s output.

**Why use it?**
- Keeps prompt logic and LLM configuration in one place.
- Makes the chain **reusable** across workflows.

---

#### **What is a `SimpleSequentialChain`?**
- A **SimpleSequentialChain** runs **two or more LLMChains back-to-back**.
- It assumes:
  - Each chain’s output is **a single string**.
  - That string is passed as the **only input** to the next chain.

**When to use:**
- When your steps pass just **one string result** to the next step.
- For more complex workflows (multiple variables), you’d use `SequentialChain`.
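
The difference between single-string and multi-variable workflows can be sketched in plain Python. The example below is a hypothetical illustration of passing several named values between steps; it is not the `SequentialChain` API, and `run_named_steps` is an invented helper.

```python
def run_named_steps(steps, inputs: dict) -> dict:
    """Run steps in order; each step reads named keys and adds one new key."""
    state = dict(inputs)  # copy so the caller's dict is not mutated
    for output_key, func in steps:
        state[output_key] = func(state)
    return state


# Each step is (name of the value it produces, function of the current state).
steps = [
    ("outline", lambda s: f"Outline for '{s['topic']}' aimed at {s['audience']}"),
    ("intro", lambda s: f"Intro ({s['tone']}) based on: {s['outline']}"),
]

result = run_named_steps(
    steps, {"topic": "AI comedy", "audience": "writers", "tone": "playful"}
)
print(result["intro"])
```

Here the second step needs both the first step's output and an original input (`tone`), which a single-string hand-off cannot express.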

---

### 🔑 Key Points
1. `LLMChain` = Prompt + LLM → single callable unit.
2. `SimpleSequentialChain` = Chain of `LLMChain` objects, passing output sequentially.
3. Each chain should have compatible input/output formats for smooth passing.
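
The hand-off rule in point 2 amounts to function composition over strings. A minimal plain-Python sketch follows, with `idea_step` and `pitch_step` as hypothetical stand-ins for the two LLM-backed chains:

```python
from functools import reduce


def run_sequentially(steps, first_input: str) -> str:
    """Feed first_input to the first step, then each output to the next step."""
    return reduce(lambda text, step: step(text), steps, first_input)


def idea_step(topic: str) -> str:
    # Stand-in for the idea-generation chain.
    return f"Idea about {topic}"


def pitch_step(text: str) -> str:
    # Stand-in for the elevator-pitch chain.
    return f"Pitch for: {text}"


print(run_sequentially([idea_step, pitch_step], "AI-powered analytics"))
```

Only the final step's output is returned, which matches the behavior seen in this lesson's examples: the intermediate idea is consumed by the next step and not printed.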

---

### 🚀 Minimal, Runnable Example (Gemini-Only, SimpleSequentialChain)



In [31]:
from langchain.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.chains import LLMChain, SimpleSequentialChain

# Step 1: Initialize the model (can be reused across chains)
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.7)

# Step 2: First chain → Idea generation
prompt_idea = PromptTemplate(
    input_variables=["topic"],
    template="Generate one creative startup idea about {topic}."
)
chain_idea = LLMChain(llm=model, prompt=prompt_idea)

# Step 3: Second chain → Elevator pitch creation
prompt_pitch = PromptTemplate(
    input_variables=["text"],
    template="Write a short, engaging elevator pitch for this startup idea: {text}"
)
chain_pitch = LLMChain(llm=model, prompt=prompt_pitch)

# Step 4: Combine into a SimpleSequentialChain
overall_chain = SimpleSequentialChain(chains=[chain_idea, chain_pitch])

# Step 5: Run the combined chain
result = overall_chain.run("AI-powered analytics")
print(result)


Is your non-profit drowning in data but struggling to tell your impact story?  ImpactVerse uses AI to transform your numbers into compelling narratives and stunning visuals, automatically generating reports, fundraising materials, and even predicting future outcomes.  We empower non-profits to connect with donors and achieve greater success – all at an affordable price.


### 🧪 Time to Practice — Lesson 5

**Goal:**  
We will practice building a multi-step workflow using `LLMChain` and `SimpleSequentialChain` so that one chain’s output becomes the next chain’s input.

---

#### **Part 1 — Blog Post Workflow**
1. **First Chain** — Blog Outline Generator:
   - Input variable: `{topic}`
   - Task: Generate a blog post outline in bullet points about the topic.
2. **Second Chain** — Intro Paragraph Writer:
   - Input variable: `{text}`
   - Task: Write a short introductory paragraph based on the outline.
3. Combine both chains using `SimpleSequentialChain`.
4. Run the workflow with:
   - `topic = "The Future of Comedy in the Age of AI"`
5. Print the final output.

---

#### **Part 2 — Experiment with Temperature**
1. Run the same workflow with:
   - Both chains having `temperature=0.2` (more deterministic).
   - Both chains having `temperature=0.8` (more creative).
2. Compare how the structure, detail, and tone change.
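
One common way to picture temperature is as a divisor applied to the model's token scores before they are converted to probabilities. The sketch below uses made-up scores and the standard softmax formula. It is an illustration under that assumption and says nothing about how Gemini implements sampling internally.

```python
import math
from typing import List

def softmax_with_temperature(scores: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
scores = [2.0, 1.0, 0.5]
low = softmax_with_temperature(scores, 0.2)
high = softmax_with_temperature(scores, 0.8)
print("temperature=0.2:", [round(p, 3) for p in low])
print("temperature=0.8:", [round(p, 3) for p in high])
```

At the lower temperature almost all probability mass sits on the top-scoring token, which is why low-temperature runs tend to repeat the same structure while higher values vary more.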

---

#### **Part 3 — Three-Step Workflow (Optional)**
1. Add a third chain:
   - Input variable: `{text}`
   - Task: Summarize the blog post into **a single engaging tweet**.
2. Update the `SimpleSequentialChain` to include all three chains.
3. Run the workflow end-to-end and check for coherence across all steps.

---

> **Tip:**  
> Keep your prompt instructions very clear for each step so the model knows exactly what’s expected.


In [37]:
from langchain.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.chains import LLMChain, SimpleSequentialChain

# Step 1: Initialize the model (can be reused across chains)
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.9)

# Step 2: First chain → Outline generator
prompt_blog_outline = PromptTemplate(
    input_variables=["topic"],
    template="Generate a blog post outline in bullet points about the topic: {topic}."
)
chain_blog_outline = LLMChain(llm=model, prompt=prompt_blog_outline)

# Step 3: Second chain → Blog post writer (expands the outline)
prompt_blog_paragraph = PromptTemplate(
    input_variables=["text"],
    template="Write a short blog post based on the outline: {text}"
)
chain_blog_paragraph = LLMChain(llm=model, prompt=prompt_blog_paragraph)


# Step 4: Third chain → Tweet summarizer (Part 3)
prompt_blog_tweet = PromptTemplate(
    input_variables=["text"],
    template="Summarize the blog post into a single engaging tweet. Here's the blog post content: {text}"
)
chain_blog_tweet = LLMChain(llm=model, prompt=prompt_blog_tweet)

# Step 5: Combine into a SimpleSequentialChain
overall_chain = SimpleSequentialChain(
    chains=[chain_blog_outline, chain_blog_paragraph, chain_blog_tweet]
)

# Step 6: Run the combined chain
result = overall_chain.run("The Future of Comedy in the Age of AI")
print(result)


Can AI write jokes?  Yes, but can it *be* funny? 🤔  A new blog post explores AI's growing role in comedy—from writing scripts to tailoring shows—but argues the human element, with its emotional depth & lived experience, remains irreplaceable.  #AI #comedy #artificialintelligence #humor


## Lesson 6 — Introduction to Agents <a name="lesson-6"></a>

### 🎯 Objective
We’ll learn what **agents** are in LangChain, how they work, and why they’re different from fixed chains.  
Agents let the LLM **decide at runtime** what steps to take and which tools to use to accomplish a goal.

---

### 📖 What & Why

#### **What is an Agent?**
- An **Agent** is a LangChain component that:
  1. Gets a **goal** or question from the user.
  2. Chooses which **tool(s)** to use (search, calculator, database, etc.).
  3. Decides **the order** of tool usage.
  4. Loops until the goal is met.

#### **Why use Agents?**
- **Chains** are static: they run the same steps every time.
- **Agents** are dynamic: they reason about the problem, decide the next step, and choose tools as needed.
- Great for:
  - Tasks with uncertain steps.
  - Real-time data fetching.
  - Multi-step problem solving.

---

### 🔑 New Concepts in This Lesson

#### 1. **`Tool`**
- A `Tool` in LangChain is a wrapper around a Python function that the LLM can call.
- The tool has:
  - `name`: Identifier the LLM uses.
  - `func`: The Python function that will run when the tool is called.
  - `description`: A natural language explanation of when and how to use the tool.
- **Why wrap functions in a Tool?**  
  The agent doesn't execute Python directly — it decides *which tool* to call, and LangChain handles running the underlying function.
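
To make the three fields concrete, here is a dependency-free sketch of what a tool wrapper boils down to: the model only ever sees the name and description as text, and the framework looks up and runs the function. `SimpleTool`, `render_tool_list`, and `call_tool` are hypothetical names for illustration, not LangChain classes or functions.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: a function plus the text the LLM reads."""
    name: str
    func: Callable[[str], str]
    description: str

def render_tool_list(tools: List[SimpleTool]) -> str:
    """Build the 'name: description' lines a text prompt could show to the model."""
    return "\n".join(f"{t.name}: {t.description}" for t in tools)

def call_tool(tools: List[SimpleTool], name: str, tool_input: str) -> str:
    """Look up the tool the model named and run its Python function."""
    registry = {t.name: t for t in tools}
    if name not in registry:
        return f"{name} is not a valid tool."
    return registry[name].func(tool_input)

shout = SimpleTool(
    name="Shout",
    func=lambda text: text.upper(),
    description="Upper-cases the input text.",
)
print(render_tool_list([shout]))
print(call_tool([shout], "Shout", "hello"))
```

Since the description is the only guidance the model gets, a vague description usually shows up as the wrong tool being chosen.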

#### 2. **`initialize_agent`**
- Function that creates an agent by combining:
  - Your **tool list**.
  - An **LLM**.
  - An **agent type** (how the LLM decides on actions).
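- Returns an **agent executor** you can run. Newer LangChain releases mark `initialize_agent` as deprecated, so the cell may print a deprecation warning; the example still runs.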

#### 3. **`AgentType.ZERO_SHOT_REACT_DESCRIPTION`**
- A built-in LangChain agent type.
- "Zero-shot" = The LLM decides what to do without seeing examples of previous runs.
- "ReAct" = Based on the **Reason + Act** framework:
  1. **Reason** about the task.
  2. Choose a **tool**.
  3. Act by calling the tool.
  4. Observe the result.
  5. Repeat until finished.
- "Description" = Uses your tool descriptions to decide which one to call.
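
The Reason → Act → Observe cycle can be sketched without any LLM by scripting the model's replies. Everything here is a simplified assumption for illustration: `make_scripted_llm`, `run_react_loop`, and the reply format are hypothetical, and the real agent executor handles parsing and errors more carefully.

```python
import re
from typing import Callable, Dict, List

def multiply_numbers(query: str) -> str:
    result = 1
    for number in re.findall(r"-?\d+", query):
        result *= int(number)
    return str(result)

TOOLS: Dict[str, Callable[[str], str]] = {"Multiplier": multiply_numbers}

def make_scripted_llm(replies: List[str]) -> Callable[[str], str]:
    """Return a fake LLM that replays canned replies instead of calling an API."""
    remaining = iter(replies)
    return lambda prompt: next(remaining, "Final Answer: (script exhausted)")

def run_react_loop(question: str, llm: Callable[[str], str], max_iterations: int = 3) -> str:
    scratchpad = ""
    for _ in range(max_iterations):
        reply = llm(question + scratchpad)                      # Reason
        final = re.search(r"Final Answer:\s*(.*)", reply)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action:\s*(.*)", reply).group(1).strip()
        action_input = re.search(r"Action Input:\s*(.*)", reply).group(1).strip().strip('"')
        tool = TOOLS.get(action)                                # Act
        observation = tool(action_input) if tool else f"{action} is not a valid tool."
        scratchpad += f"\n{reply}\nObservation: {observation}\nThought:"  # Observe
    return "Agent stopped due to iteration limit."

scripted = make_scripted_llm([
    'Thought: I should multiply.\nAction: Multiplier\nAction Input: "7 8"',
    "Thought: I now know the final answer\nFinal Answer: 56",
])
answer = run_react_loop("Multiply 7 and 8", scripted)
print(answer)
```

The `max_iterations` guard matters in practice: a model that never emits a final answer would otherwise keep calling tools indefinitely.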

---

### 🚀 Minimal, Runnable Example (Gemini-Only, Calculator Agent)


In [38]:
import re

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.agents import initialize_agent, Tool, AgentType

# Step 1: Create a simple tool (Python function)
def multiply_numbers(query: str) -> str:
    """Multiply every integer found in the input string."""
    # A regex tolerates inputs such as "7, 8" or "7 and 8", where str.isdigit() would skip "7,"
    numbers = [int(x) for x in re.findall(r"-?\d+", query)]
    if not numbers:
        return "No numbers found."
    result = 1
    for num in numbers:
        result *= num
    return str(result)

# Wrap the tool for LangChain
tools = [
    Tool(
        name="Multiplier",
        func=multiply_numbers,
        description="Multiplies all numbers in a given string."
    )
]

# Step 2: Initialize Gemini model
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)

# Step 3: Create the agent
agent = initialize_agent(
    tools=tools,
    llm=model,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Step 4: Ask the agent to use the tool
agent.run("Multiply 7 and 8")


> Entering new AgentExecutor chain...
Thought: I need to multiply the numbers 7 and 8.  I can use the Multiplier tool for this.
Action: Multiplier
Action Input: "7 8"
Observation: 56
Thought:Thought: I now know the final answer
Final Answer: 56

> Finished chain.


'56'

### 🧪 Time to Practice — Lesson 6

**Goal:**  
We will practice creating an agent that can dynamically use a custom Python tool to search through a list of movies.

---

#### **Part 1 — Build the Tool**
1. Create a Python function named `search_movies(query: str)` that:
   - Searches a predefined Python list of movie titles (at least 10 titles).
   - Returns all matches that contain the query term (case-insensitive).
   - If no match is found, return `"No results found."`.
2. Wrap the function in a `Tool` object:
   - `name`: `"Movie Search Tool"`
   - `func`: Your `search_movies` function.
   - `description`: `"Searches for movies in a predefined list by matching the given keyword."`

---

#### **Part 2 — Create the Agent**
1. Initialize `ChatGoogleGenerativeAI` with:
   - `model="gemini-1.5-flash"`
   - `temperature=0` (for predictable output).
2. Use `initialize_agent` to create the agent with:
   - Your movie search tool inside the `tools` list.
   - `AgentType.ZERO_SHOT_REACT_DESCRIPTION`.
   - `verbose=True` to print the reasoning steps.

---

#### **Part 3 — Test the Agent**
1. Ask: `"Find all movies that have the word 'Matrix' in them"`.
2. Ask: `"List all movies starting with the letter I"`.
3. Ask: `"Do we have any movies containing the word 'Star'?"`.

---

#### **Part 4 — Stretch Challenge (Optional)**
1. Add a second tool called `count_movies`:
   - Takes no arguments.
   - Returns the total number of movies in the list.
2. Update the agent to include both tools.
3. Test with: `"How many movies are in our collection?"`.

---

> **Tip:**  
> When designing tools, make the `description` very clear — the agent uses this text to decide when and how to call the tool.


In [7]:
# Lesson 6 — Agent with Custom Tools (Solution)

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.agents import initialize_agent, Tool, AgentType

# Sample dataset
MOVIES = [
    "The Matrix",
    "The Matrix Reloaded",
    "The Matrix Revolutions",
    "Interstellar",
    "Inception",
    "Inside Out",
    "Iron Man",
    "The Imitation Game",
    "Star Wars: A New Hope",
    "Star Trek",
    "Indiana Jones and the Last Crusade",
    "Whiplash",
    "Arrival",
    "Inglourious Basterds",
]

# Tool 1: substring search (case-insensitive)
def search_movies(query: str) -> str:
    q = query.strip().lower()
    if not q:
        return "No results found."
    results = [t for t in MOVIES if q in t.lower()]
    return "\n".join(results) if results else "No results found."

# Tool 2: prefix search (case-insensitive)
def search_movies_by_prefix(prefix: str) -> str:
    p = prefix.strip().lower()
    if not p:
        return "No results found."
    results = [t for t in MOVIES if t.lower().startswith(p)]
    return "\n".join(results) if results else "No results found."

# Tool 3 (stretch): count movies
def count_movies(_: str = "") -> str:
    return str(len(MOVIES))

# Wrap tools for the agent
tools = [
    Tool(
        name="Movie Search Tool",
        func=search_movies,
        description=(
            "Searches a predefined list of movie titles and returns all titles containing the given keyword. "
            "Use for general keyword searches like 'Matrix' or 'Star'."
        ),
    ),
    Tool(
        name="Movie Prefix Tool",
        func=search_movies_by_prefix,
        description=(
            "Returns all movie titles that start with the provided prefix (case-insensitive). "
            "Use for requests like 'starting with the letter I'."
        ),
    ),
    Tool(
        name="Movie Count Tool",
        func=count_movies,
        description=(
            "Returns the total number of movies in the collection. "
            "Use when asked to count movies."
        ),
    ),
]

# Initialize Gemini model (temperature 0 for predictable tool use)
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)

# Create the agent
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # Set to False to reduce logs
)

# Tests
print("\n--- Query 1 ---")
print(agent.run("Find all movies that have the word 'Matrix' in them"))

print("\n--- Query 2 ---")
print(agent.run("List all movies starting with the letter I"))

print("\n--- Query 3 ---")
print(agent.run("Do we have any movies containing the word 'Star'?"))

print("\n--- Stretch: Count ---")
print(agent.run("How many movies are in our collection?"))


# Negative tests
print("\n--- Negative: No Matches ---")
print(agent.run("Find all movies containing 'Batman'"))

print("\n--- Negative: Empty Search Term ---")
print(agent.run("Find all movies containing ''"))

print("\n--- Negative: Invalid Prefix ---")
print(agent.run("List all movies starting with 'ZZZ'"))



--- Query 1 ---


> Entering new AgentExecutor chain...
Thought: I need to search for movies containing the keyword "Matrix".  The Movie Search Tool is appropriate for this.
Action: Movie Search Tool
Action Input: Matrix
Observation: The Matrix
The Matrix Reloaded
The Matrix Revolutions
Thought:Thought: I now know the final answer
Final Answer: The Matrix, The Matrix Reloaded, The Matrix Revolutions

> Finished chain.
The Matrix, The Matrix Reloaded, The Matrix Revolutions

--- Query 2 ---


> Entering new AgentExecutor chain...
Thought: I need to use the Movie Prefix Tool to find all movies starting with 'I'.
Action: Movie Prefix Tool
Action Input: I
Observation: Interstellar
Inception
Inside Out
Iron Man
Indiana Jones and the Last Crusade
Inglourious Basterds
Thought:Thought: I now know the final answer
Final Answer: Interstellar, Inception, Inside Out, Iron Man,

## Lesson 7 — Building a Custom Tool <a name="lesson-7"></a>

### 🎯 Objective
We’ll learn how to create **custom tools** the agent can call at runtime. We will:
- Wrap a Python function as a **Tool** (simple arguments).
- Define a **Structured Tool** with a schema (validated, multi-argument calls).
- Add both tools to an agent and see them used through the agent loop.

---

### 📖 What & Why

#### What is a Tool?
A **Tool** is a callable capability that the agent can invoke to do non‑LLM work (e.g., search, math, DB lookups). Tools are wrappers around Python functions with **name** and **description** so the LLM knows when to call them.

- **Why tools?** Agents can’t access data or perform actions by themselves; tools are how they interact with the outside world.

#### `Tool` vs `StructuredTool`
- **`Tool`** (simple): Wraps a Python function that takes a **single string** input and returns a string. Good for quick utilities and simple prompts like “search for X”.
- **`StructuredTool` / `@tool(..., args_schema=...)`**: Adds a **Pydantic schema** for multiple/typed arguments with validation. The agent can pass structured kwargs reliably.

## 🛠 Tool vs Structured Tool in LangChain

### 1. **Tool**
- **Definition:** A wrapper around a **single-argument Python function** that the agent can call.
- **Input/Output:** Takes one string as input and returns one string as output.
- **Use Case:** Quick, simple actions where all necessary input can be expressed in one string.
- **Example:**

```python
from langchain.agents import Tool

def keyword_search(query: str) -> str:
    return f"Searching for: {query}"

search_tool = Tool(
    name="Keyword Search",
    func=keyword_search,
    description="Search for items that contain a given keyword."
)
```


### 2. **Structured Tool**
- **Definition:** A wrapper around a Python function that can take multiple typed arguments, validated with a Pydantic schema.
- **Input/Output:** Accepts structured, typed inputs (e.g., integers, floats, strings, booleans), not just one string.
- **Use Case:** When you want automatic input validation so the agent doesn’t send wrong types or missing fields.
- **Example:**

```python
from langchain.tools import tool
from pydantic import BaseModel, Field

# Define the schema
class FilterArgs(BaseModel):
    prefix: str = Field(..., description="Title must start with this prefix.")
    min_length: int = Field(0, description="Minimum title length.")

# Create the structured tool
@tool("filtered_search", args_schema=FilterArgs)
def filtered_search(prefix: str, min_length: int = 0) -> str:
    return f"Filtering titles starting with {prefix}, min length {min_length}"
```
---


### 📊 Comparison Table

| Feature               | Tool                           | Structured Tool                        |
|-----------------------|--------------------------------|-----------------------------------------|
| Input type            | Single `str`                   | Multiple typed arguments                |
| Validation            | ❌ None                        | ✅ Pydantic validation                   |
| Example               | Keyword search                 | Filtered search with prefix & min length|
| Agent compatibility   | Works with most classic agents | Needs tool-calling or OpenAI-style agents|

---

### 3. **What is a Pydantic schema?**

Pydantic is a Python library for data validation and type enforcement.

A schema is a Python class (a subclass of `BaseModel`) that:

- Lists all the inputs (fields) your function needs.
- Specifies the type of each field (`str`, `int`, `float`, etc.).
- Adds optional descriptions (`Field(description="...")`).

Benefits in LangChain tools:

- The LLM sees the schema and knows exactly what inputs to provide.
- Malformed tool calls (e.g., a string where an integer is required) are rejected with a validation error instead of reaching your function.

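
To show the kind of check a schema performs, here is a dependency-free sketch that validates the same two arguments by hand. It is a deliberately simplified stand-in, not Pydantic's implementation: `FILTER_SCHEMA` and `validate_tool_args` are hypothetical names, and Pydantic's real validation and coercion rules are much richer.

```python
from typing import Any, Dict

_REQUIRED = object()  # sentinel marking fields that have no default

# Hypothetical hand-written schema mirroring FilterArgs: field -> (type, default)
FILTER_SCHEMA = {
    "prefix": (str, _REQUIRED),
    "min_length": (int, 0),
}

def validate_tool_args(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Check required fields and types before the tool function is called."""
    validated: Dict[str, Any] = {}
    for name, (expected_type, default) in FILTER_SCHEMA.items():
        if name not in raw:
            if default is _REQUIRED:
                raise ValueError(f"Missing required argument: {name}")
            validated[name] = default
            continue
        value = raw[name]
        # Accept whole-number floats such as 100.0 for int fields
        if expected_type is int and isinstance(value, float) and value.is_integer():
            value = int(value)
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"Argument {name!r} must be of type {expected_type.__name__}")
        validated[name] = value
    return validated

print(validate_tool_args({"prefix": "The", "min_length": 100.0}))
```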
---

#### `initialize_agent`
Combines the following and returns an **agent executor**:
- Your **tools** list
- The **LLM**
- An **agent type** (the decision framework, e.g., `ZERO_SHOT_REACT_DESCRIPTION`)

Calling `.run(prompt)` lets the LLM select tools, call them, observe results, and iterate.

---

### 🔧 New APIs Explained

- **`Tool(name, func, description)`**
  - `name`: Short identifier the LLM will mention when choosing a tool.
  - `func`: Python function to execute.
  - `description`: Natural language guide that the LLM reads to decide when to use it.
- **`@tool(args_schema=...)` (decorator)**
  - Wraps a function as a **structured tool**.
  - `args_schema`: a Pydantic `BaseModel` describing parameters, types, and descriptions.
- **`AgentType.ZERO_SHOT_REACT_DESCRIPTION`**
  - Zero-shot: no few-shot examples; agent decides steps from scratch.
  - ReAct: **Reason → Act → Observe → Repeat**.
  - Description: Chooses tools based on your tool **descriptions**.

---



In [60]:
# ✅ Example: Tool vs Structured Tool in LangChain (Gemini)

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.agents import initialize_agent, Tool, AgentType
from langchain.tools import tool
from pydantic import BaseModel, Field

# Sample dataset
MOVIES = ["The Matrix", "Interstellar", "Inception", "Inside Out", "Iron Man"]

# ------------------------------
# 1) Tool — Single String Input
# ------------------------------
def keyword_search(query: str) -> str:
    """Returns all movies containing the keyword (case-insensitive)."""
    q = query.strip().lower()
    hits = [m for m in MOVIES if q in m.lower()]
    return "\n".join(hits) if hits else "No results found."

simple_tool = Tool(
    name="Movie Keyword Search",
    func=keyword_search,
    description="Search for movies containing the given keyword (string)."
)

# ------------------------------
# 2) Structured Tool — Multiple Typed Arguments
# ------------------------------
class FilterArgs(BaseModel):
    prefix: str = Field(..., description="Title must start with this prefix (case-insensitive).")
    min_length: int = Field(0, description="Minimum title length in characters.")

@tool("filtered_movie_search", args_schema=FilterArgs)
def filtered_movie_search(prefix: str, min_length: int = 0) -> str:
    """Returns movies starting with a given prefix and meeting min length."""
    hits = [m for m in MOVIES if m.lower().startswith(prefix.lower()) and len(m) >= min_length]
    return "\n".join(hits) if hits else "No results found."

# ------------------------------
# 3) Run Tool directly
# ------------------------------
print("\n--- Direct Tool Test ---")
print("Keyword Search('Matrix') →")
print(simple_tool.run("Matrix"))

# ------------------------------
# 4) Run Structured Tool directly
# ------------------------------
print("\n--- Direct Structured Tool Test ---")
print("Filtered Search(prefix='In', min_length=10) →")
print(filtered_movie_search.run({"prefix": "In", "min_length": 10}))

# ------------------------------
# 5) Using the Tool inside a classic agent
# ------------------------------
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)

# Only add simple_tool here — classic ZERO_SHOT agent can't use multi-input tools
agent_simple = initialize_agent(
    tools=[simple_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    max_iterations=3,  # cap the number of reason/act steps so a looping agent stops
    verbose=True
)

print("\n--- Agent using simple Tool ---")
print(agent_simple.run("Find movies with the word Matrix"))

# ------------------------------
# 6) Structured tools require a tool-calling agent (not classic ZERO_SHOT)
# ------------------------------
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents import create_tool_calling_agent, AgentExecutor

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Use tools when needed."),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad")
])

tools_for_structured = [filtered_movie_search]
agent_structured = create_tool_calling_agent(llm=llm, tools=tools_for_structured, prompt=prompt)
executor = AgentExecutor(agent=agent_structured, tools=tools_for_structured, verbose=True)

print("\n--- Agent using Structured Tool ---")
print(executor.invoke({"input": "List movies starting with 'In' that are at least 10 characters long"})["output"])



--- Direct Tool Test ---
Keyword Search('Matrix') →
The Matrix

--- Direct Structured Tool Test ---
Filtered Search(prefix='In', min_length=10) →
Interstellar
Inside Out

--- Agent using simple Tool ---


> Entering new AgentExecutor chain...
Thought: I need to search for movies that contain the keyword "Matrix".  I can use the Movie Keyword Search tool for this.
Action: Movie Keyword Search
Action Input: Matrix
Observation: The Matrix
Thought:Thought: I need to search for movies that contain the keyword "Matrix".  I can use the Movie Keyword Search tool for this.
Action: Movie Keyword Search
Action Input: Matrix
Observation: The Matrix
Thought:Thought: I need to search for movies that contain the keyword "Matrix".  I can use the Movie Keyword Search tool for this.
Action: Movie Keyword Search
Action Input: Matrix
Observation: The Matrix
Thought:

> Finish

In [55]:
print("\n--- Agent using Structured Tool Negative test ---")
print(executor.invoke({"input": "List movies that start with The and is 100 characters long "})["output"])


--- Agent using Structured Tool Negative test ---


> Entering new AgentExecutor chain...

Invoking: `filtered_movie_search` with `{'min_length': 100.0, 'prefix': 'The'}`


No results found.
There are no movies found that start with "The" and have a length of 100 characters.


> Finished chain.
There are no movies found that start with "The" and have a length of 100 characters.



## Lesson 8 — Loading and Splitting Documents <a name="lesson-8"></a>

### 🎯 Objective
We’ll learn how to:
1. **Load documents** into LangChain from common sources (text, PDFs, web pages).
2. **Split documents** into smaller chunks so LLMs can process them efficiently.

---

### 📖 What & Why

#### **What is a Document in LangChain?**
- A `Document` in LangChain is an object with:
  - `.page_content`: The text of the document (string).
  - `.metadata`: Extra info about the document (e.g., source, page number).
- LLMs have context length limits — they can’t process entire books or long PDFs in one go.

#### **Why Split Documents?**
- Splitting long text into **chunks**:
  - Prevents exceeding token limits.
  - Makes retrieval more relevant (you fetch only the most relevant chunks).
  - Keeps RAG prompts short, which lowers token cost and latency.

---

### 🔑 New Classes

#### 1. **Document Loaders**
LangChain provides many loaders for different formats:
- `TextLoader` — Loads plain text files.
- `PyPDFLoader` — Loads PDFs, one page per document.
- `UnstructuredFileLoader` — Handles multiple formats (PDF, DOCX, etc.).
- `WebBaseLoader` — Loads text from a web URL.

#### 2. **Text Splitters**
- `CharacterTextSplitter` — Splits by character count.
- `RecursiveCharacterTextSplitter` — Splits by characters, but tries to keep logical boundaries (paragraphs, sentences).
- **Parameters:**
  - `chunk_size`: Maximum chunk length, measured in characters by default (the splitter's `length_function` is `len` unless you change it).
  - `chunk_overlap`: How much content overlaps between chunks (helps preserve context).
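
To see how `chunk_size` and `chunk_overlap` interact, here is a naive fixed-window splitter in plain Python. It is only a sketch of the sliding-window idea. `split_with_overlap` is a hypothetical helper, and unlike `RecursiveCharacterTextSplitter` it makes no attempt to break at paragraph or sentence boundaries.

```python
from typing import List

def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks

chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

Each chunk starts with the last two characters of the previous one, which is the context-preserving effect of the overlap. A larger overlap means more chunks and more duplicated text to embed.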

---


In [8]:
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Step 1: Load document
loader = TextLoader("sample.txt", encoding="utf-8")  # Replace with your file
docs = loader.load()
print(f"Loaded {len(docs)} document(s).")
print("Sample content:", docs[0].page_content[:200])

# Step 2: Split into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=200,
    chunk_overlap=50
)
chunks = splitter.split_documents(docs)
print(f"Split into {len(chunks)} chunks.")
print("First chunk:", chunks[0].page_content)

Loaded 1 document(s).
Sample content: LangChain is a framework for developing applications powered by large language models (LLMs). It provides a standard interface for chains, modules for various tasks, and integration with many LLM prov
Split into 8 chunks.
First chunk: LangChain is a framework for developing applications powered by large language models (LLMs). It provides a standard interface for chains, modules for various tasks, and integration with many LLM


### 🧪 Time to Practice — Lesson 8

**Goal:**  
We will practice loading text from files and other sources, then splitting it into smaller chunks for LLM processing.



**Content**

---
LangChain is a framework for developing applications powered by large language models (LLMs). It provides a standard interface for chains, modules for various tasks, and integration with many LLM providers.

One of the core challenges when working with LLMs is managing the limited context length. If you try to send too much text at once, the model may fail to process it or lose important details. LangChain addresses this by allowing you to split documents into smaller, overlapping chunks.

Splitting text into chunks improves retrieval quality when you search for relevant information. For example, in a Retrieval-Augmented Generation (RAG) setup, a user's query is matched against these smaller chunks, and only the most relevant ones are sent to the LLM. This reduces the token cost and improves accuracy.

LangChain supports a variety of document loaders for different formats, including plain text, PDFs, HTML, and Word documents. It also offers multiple text splitting strategies, such as character-based and recursive splitting, which respect natural boundaries like sentences and paragraphs.

---

#### **Part 1 — Load a Local Text File**
1. Save the provided sample content into a file named `sample.txt`.
2. Use `TextLoader` to load the file.
3. Print:
   - Number of documents loaded.
   - First 150 characters of the first document.

---

#### **Part 2 — Split into Chunks**
1. Use `RecursiveCharacterTextSplitter` with:
   - `chunk_size=150`
   - `chunk_overlap=30`
2. Print:
   - Total number of chunks.
   - The first 2 chunks’ text.

---

#### **Part 3 — Load a PDF**
1. Install:
   ```bash
   pip install pypdf
   ```
2. Use `PyPDFLoader` to load any PDF file you have.
3. Split the loaded content into chunks (choose your own `chunk_size` and `chunk_overlap`).
4. Print:
   - The total number of chunks.
   - The metadata and first 200 characters of the first chunk.

---

#### **Part 4 — Load from a URL (Stretch)**
1. Install the required library for web scraping:
   ```bash
   pip install beautifulsoup4
   ```
2. Use `WebBaseLoader` to load content from any public web page.
3. Split it into chunks using `RecursiveCharacterTextSplitter`.
4. Print:
   - The total number of chunks.
   - The text of the first chunk.

In [2]:
# Part 1 — Load a Local Text File

from langchain_community.document_loaders import TextLoader

# Step 1: Create loader for sample.txt
loader = TextLoader("sample.txt", encoding="utf-8")

# Step 2: Load document(s)
docs = loader.load()

# Step 3: Print results
print(f"Total documents loaded: {len(docs)}")
if docs:
    print("\nFirst 150 characters of first document:\n")
    print(docs[0].page_content[:150])
else:
    print("No documents found.")


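> **Tip:**  
> If this cell fails with `ModuleNotFoundError: No module named 'langchain_community'`, install the package with `pip install langchain-community` and re-run the cell.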

In [62]:
# Part 2 — Split into Chunks

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Step 1: Create splitter
splitter = RecursiveCharacterTextSplitter(
    chunk_size=150,      # max characters per chunk
    chunk_overlap=30     # overlap between chunks to preserve context
)

# Step 2: Split the loaded docs into smaller chunks
chunks = splitter.split_documents(docs)

# Step 3: Print results
print(f"Total chunks created: {len(chunks)}\n")

# Preview the first two chunks
for i, chunk in enumerate(chunks[:2]):
    print(f"--- Chunk {i+1} ---")
    print(chunk.page_content)
    print()


Total chunks created: 11

--- Chunk 1 ---
LangChain is a framework for developing applications powered by large language models (LLMs). It provides a standard interface for chains, modules for

--- Chunk 2 ---
for chains, modules for various tasks, and integration with many LLM providers.



In [66]:
!pip install pypdf

Collecting pypdf
  Downloading pypdf-5.9.0-py3-none-any.whl.metadata (7.1 kB)
Downloading pypdf-5.9.0-py3-none-any.whl (313 kB)
Installing collected packages: pypdf
Successfully installed pypdf-5.9.0


In [67]:
# Part 3 — Load a PDF and Split


# 1) Load PDF (replace with your file path, e.g., "documents/report.pdf")
from langchain_community.document_loaders import PyPDFLoader
pdf_path = "my_doc.pdf"  # <-- change this to your actual PDF
loader = PyPDFLoader(pdf_path)

docs_pdf = loader.load()
print(f"Total PDF pages loaded as documents: {len(docs_pdf)}")
if not docs_pdf:
    raise FileNotFoundError(f"No pages loaded. Check the path: {pdf_path}")

# Show first page metadata and preview
print("\n--- First page metadata ---")
print(docs_pdf[0].metadata)
print("\n--- First 200 chars of first page ---")
print(docs_pdf[0].page_content[:200])

# 2) Split into chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=60)
chunks_pdf = splitter.split_documents(docs_pdf)

print(f"\nTotal PDF chunks: {len(chunks_pdf)}")
print("\n--- First chunk metadata ---")
print(chunks_pdf[0].metadata)
print("\n--- First 200 chars of first chunk ---")
print(chunks_pdf[0].page_content[:200])


Total PDF pages loaded as documents: 4

--- First page metadata ---
{'producer': 'Acrobat Distiller 6.0 (Windows)', 'creator': 'Illustrator', 'creationdate': '2006-06-06T16:59:12+10:00', 'subject': 'test', 'author': 'Anne', 'moddate': '2006-06-06T16:59:29+10:00', 'title': 'test', 'source': 'my_doc.pdf', 'total_pages': 4, 'page': 0, 'page_label': '1'}

--- First 200 chars of first page ---
ANOTHER LOOK AT FORECAST -ACCURACY METRICS
FOR INTERMITTENT DEMAND
by Rob J. Hyndman
Preview: Some traditional measurements of forecast accuracy are unsuitable for intermittent-demand data
because the

Total PDF chunks: 72

--- First chunk metadata ---
{'producer': 'Acrobat Distiller 6.0 (Windows)', 'creator': 'Illustrator', 'creationdate': '2006-06-06T16:59:12+10:00', 'subject': 'test', 'author': 'Anne', 'moddate': '2006-06-06T16:59:29+10:00', 'title': 'test', 'source': 'my_doc.pdf', 'total_pages': 4, 'page': 0, 'page_label': '1'}

--- First 200 chars of first chunk ---
ANOTHER LOOK AT FORECAST -ACCU

In [72]:
# Part 4 — Load from a URL and Split

# 1) Confirm the dependency is available (install once with: pip install beautifulsoup4)
import bs4  # noqa: F401

# 2) Load web page content
from langchain_community.document_loaders import WebBaseLoader

url = "https://python.langchain.com/docs/introduction/"  # <-- replace with a public page you want to test
loader = WebBaseLoader(url)

docs_web = loader.load()
print(f"Total web documents loaded: {len(docs_web)}")
if not docs_web:
    raise RuntimeError(f"No content loaded from URL: {url}")

# Show basic metadata and a short preview
print("\n--- First doc metadata ---")
print(docs_web[0].metadata)
print("\n--- First 300 chars of first doc ---")
print(docs_web[0].page_content[:300])

# 3) Split into chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=60)
chunks_web = splitter.split_documents(docs_web)

print(f"\nTotal web chunks: {len(chunks_web)}")
print("\n--- First chunk preview ---")
print(chunks_web[0].page_content[:300])


Total web documents loaded: 1

--- First doc metadata ---
{'source': 'https://python.langchain.com/docs/introduction/', 'title': 'Introduction | 🦜️🔗 LangChain', 'description': 'LangChain is a framework for developing applications powered by large language models (LLMs).', 'language': 'en'}

--- First 300 chars of first doc ---

Introduction | 🦜️🔗 LangChain

Skip to main contentOur Building Ambient Agents with LangGraph course is now available on LangChain Academy!IntegrationsAPI ReferenceMoreContributingPeopleError referenceLangSmithLangGraphLangChain HubLangChain JS/TSv0.3v0.3v0.2v0.1💬SearchIntroduc

Total web chunks: 55

--- First chunk preview ---
Introduction | 🦜️🔗 LangChain


## Lesson 9 — Embeddings & Vector Stores (FAISS) <a name="lesson-9"></a>

### 🎯 Objective
We’ll learn how to convert text chunks into **numeric embeddings**, store them in a **vector database (FAISS)**, and perform **similarity search** to power retrieval for RAG.

---

### 📖 What & Why

#### What are Embeddings?
Embeddings are **dense numeric vectors** that represent text such that **semantic similarity ≈ vector proximity**. If two texts are similar, their vectors are close in vector space.
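
To make "semantic similarity ≈ vector proximity" concrete, here is a minimal sketch that measures proximity with cosine similarity. The three vectors are hand-made toy values, not the output of any embedding model; real embeddings have hundreds of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (made-up values for illustration only)
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Related texts should score higher than unrelated ones
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```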

#### Why do we need them?
LLMs can’t “search” long corpora by themselves. We:
1) **Embed** our documents into vectors  
2) **Index** them in a vector store  
3) **Retrieve** the top‑k most similar chunks for a query  
4) Feed those chunks to the LLM as **context** (RAG)
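
The four steps above can be sketched end to end in plain Python. This is only an illustration: `embed` is a hypothetical word-count stand-in for a real embedding model, the "index" is a plain list, and no LLM is called; the last step just assembles the prompt.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = [
    "faiss stores vectors for fast similarity search",
    "chunks are small overlapping pieces of a document",
    "gemini is a family of large language models",
]

# 1) Embed and 2) index: here the "index" is just a list of (vector, text) pairs
index = [(embed(d), d) for d in docs]

def retrieve(query: str, k: int = 2) -> list[str]:
    """3) Retrieve the k texts whose vectors are most similar to the query vector."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[0]), reverse=True)
    return [text for _, text in ranked[:k]]

# 4) Feed the retrieved chunks to the LLM as context (prompt assembly only)
question = "how does faiss search vectors"
context = "\n".join(retrieve(question, k=1))
prompt = f"Context:\n{context}\n\nQuestion: {question}"
print(prompt)
```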

---

### 🔑 New Classes / Functions

- **`GoogleGenerativeAIEmbeddings`**  
  Creates embeddings using Gemini’s embedding model.  
  - `model`: embedding model name (e.g., `"models/text-embedding-004"`).  
  - `.embed_documents(list[str])` returns a list of vectors.  
  - `.embed_query(str)` returns a single vector for the query.

- **`FAISS` (Vector Store)**  
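  A fast, in-memory vector index. FAISS supports both exact and approximate nearest-neighbor (ANN) search; LangChain's default wrapper builds an exact flat L2 index.  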
  - `FAISS.from_documents(docs, embeddings)` builds an index from `Document` chunks.  
  - `.similarity_search(query, k=4)` returns the top‑k most similar chunks.  
  - `.as_retriever(search_kwargs={"k": 4})` produces a **retriever** interface for use in chains.

- **`k` (top‑k)**  
  The number of nearest chunks to return for each search. Typical values: 3–8.

- **Similarity vs. Relevance**  
  Vector similarity is semantic; it’s a strong proxy for relevance but not perfect. We’ll later add reranking/filters.
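
One simple filter, sketched here under stated assumptions: top-k search always returns k chunks, even when none are relevant, so results below a minimum score can be dropped. The `filter_by_score` helper, the scores, and the 0.5 cut-off are all hypothetical. The sketch also assumes that higher scores mean more similar; some vector stores return distances where lower is better, so check your store's convention first.

```python
def filter_by_score(hits: list[tuple[str, float]], min_score: float) -> list[str]:
    """Keep only hits whose similarity score reaches min_score (higher = more similar)."""
    return [text for text, score in hits if score >= min_score]

# Hypothetical top-3 results for an off-topic query (scores are invented)
hits = [
    ("LangChain is a framework for LLM apps.", 0.31),
    ("FAISS is a vector index.", 0.28),
    ("Chunks overlap to preserve context.", 0.12),
]

# Nothing is similar enough to count as relevant, so the filtered list is empty
print(filter_by_score(hits, min_score=0.5))  # []
```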

---

### 🚀 Minimal, Runnable Example (uses chunks from Lesson 8)

**Assumption:** You ran Lesson 8 and have `chunks` (list of `Document`) ready. If not, quickly load and split `sample.txt` again.

In [1]:
!pip install faiss-cpu


Collecting faiss-cpu
  Downloading faiss_cpu-1.11.0.post1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (5.0 kB)
Downloading faiss_cpu-1.11.0.post1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (31.3 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.11.0.post1


In [9]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import FAISS

# Step 1: Initialize embeddings (Gemini)
emb = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")

# Step 2: Build the FAISS vector store from document chunks
vectorstore = FAISS.from_documents(chunks, emb)

# Step 3: Similarity search
query = "Why do we split documents into chunks for RAG?"
results = vectorstore.similarity_search(query, k=3)

print(f"Top {len(results)} results:")
for i, doc in enumerate(results, 1):
    print(f"\n--- Result {i} ---")
    print(doc.page_content)


Top 3 results:

--- Result 1 ---
Splitting text into chunks improves retrieval quality when you search for relevant information. For example, in a Retrieval-Augmented Generation (RAG) setup, a user's query is matched against these

--- Result 2 ---
may fail to process it or lose important details. LangChain addresses this by allowing you to split documents into smaller, overlapping chunks.

--- Result 3 ---
multiple text splitting strategies, such as character-based and recursive splitting, which respect natural boundaries like sentences and paragraphs.


### 🧪 Time to Practice — Lesson 9: Embeddings & FAISS Vector Store

**Goal:**  
We will convert document chunks into embeddings, store them in a FAISS index, and perform similarity searches to retrieve the most relevant chunks for a given query.

---

#### **Part 1 — Build the Index**
1. Ensure you have your document chunks from Lesson 8 (variable: `chunks`).
2. Initialize the embeddings object:

   ```python
   from langchain_google_genai import GoogleGenerativeAIEmbeddings
   emb = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")
   ```

3. Build the FAISS index:

   ```python
   from langchain_community.vectorstores import FAISS
   vectorstore = FAISS.from_documents(chunks, emb)
   ```

4. Print the number of documents stored:

   ```python
   print("Documents in index:", len(vectorstore.docstore._dict))
   ```

---
#### **Part 2 — Run Similarity Searches**
1. Create three different queries related to your document content (e.g., "What is LangChain?", "Why split documents into chunks?", "What are text splitting strategies?").

2. For each query, run:

```python
results = vectorstore.similarity_search(query, k=3)
for i, doc in enumerate(results, 1):
    print(f"\n--- Result {i} ---")
    print(doc.page_content)
```
3. Check if the retrieved chunks are on-topic and relevant.

---

#### **Part 3 — Create and Use a Retriever**

1. Create a retriever:

```python
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
```

2. Use it to get documents:


```python
docs = retriever.invoke("Your query here")  # `get_relevant_documents` is deprecated in recent LangChain releases
```

3. Compare the retriever's output with the output of `similarity_search` for the same query.
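
A small helper can make the comparison mechanical. This is a sketch only: `same_chunks` is a hypothetical function, and the `SimpleNamespace` objects stand in for LangChain `Document` objects so the snippet runs on its own. In the notebook, pass your real `results` and `docs` lists instead.

```python
from types import SimpleNamespace

def same_chunks(a, b) -> bool:
    """True if both result lists contain the same page_content in the same order."""
    return [d.page_content for d in a] == [d.page_content for d in b]

# Stand-ins for Document objects (only the page_content attribute is used)
results = [SimpleNamespace(page_content="chunk A"), SimpleNamespace(page_content="chunk B")]
docs = [SimpleNamespace(page_content="chunk A"), SimpleNamespace(page_content="chunk B")]

print(same_chunks(results, docs))  # True
```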


---

# Practice Code Below

In [13]:
chunks

[Document(metadata={'source': 'sample.txt'}, page_content='LangChain is a framework for developing applications powered by large language models (LLMs). It provides a standard interface for chains, modules for various tasks, and integration with many LLM'),
 Document(metadata={'source': 'sample.txt'}, page_content='for various tasks, and integration with many LLM providers.'),
 Document(metadata={'source': 'sample.txt'}, page_content='One of the core challenges when working with LLMs is managing the limited context length. If you try to send too much text at once, the model may fail to process it or lose important details.'),
 Document(metadata={'source': 'sample.txt'}, page_content='may fail to process it or lose important details. LangChain addresses this by allowing you to split documents into smaller, overlapping chunks.'),
 Document(metadata={'source': 'sample.txt'}, page_content="Splitting text into chunks improves retrieval quality when you search for relevant information. For e

In [14]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
emb = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")

In [15]:
from langchain_community.vectorstores import FAISS
vectorstore = FAISS.from_documents(chunks, emb)

In [17]:
print("Documents in index:", len(vectorstore.docstore._dict))

Documents in index: 8


In [18]:
query = 'What is LangChain?'
results = vectorstore.similarity_search(query, k=3)
for i, doc in enumerate(results, 1):
    print(f"\n--- Result {i} ---")
    print(doc.page_content)


--- Result 1 ---
LangChain is a framework for developing applications powered by large language models (LLMs). It provides a standard interface for chains, modules for various tasks, and integration with many LLM

--- Result 2 ---
may fail to process it or lose important details. LangChain addresses this by allowing you to split documents into smaller, overlapping chunks.

--- Result 3 ---
LangChain supports a variety of document loaders for different formats, including plain text, PDFs, HTML, and Word documents. It also offers multiple text splitting strategies, such as


In [19]:
query = 'Why split documents into chunks'
results = vectorstore.similarity_search(query, k=3)
for i, doc in enumerate(results, 1):
    print(f"\n--- Result {i} ---")
    print(doc.page_content)


--- Result 1 ---
may fail to process it or lose important details. LangChain addresses this by allowing you to split documents into smaller, overlapping chunks.

--- Result 2 ---
Splitting text into chunks improves retrieval quality when you search for relevant information. For example, in a Retrieval-Augmented Generation (RAG) setup, a user's query is matched against these

--- Result 3 ---
multiple text splitting strategies, such as character-based and recursive splitting, which respect natural boundaries like sentences and paragraphs.


In [20]:
query = 'What are text splitting strategies?'
results = vectorstore.similarity_search(query, k=3)
for i, doc in enumerate(results, 1):
    print(f"\n--- Result {i} ---")
    print(doc.page_content)


--- Result 1 ---
multiple text splitting strategies, such as character-based and recursive splitting, which respect natural boundaries like sentences and paragraphs.

--- Result 2 ---
Splitting text into chunks improves retrieval quality when you search for relevant information. For example, in a Retrieval-Augmented Generation (RAG) setup, a user's query is matched against these

--- Result 3 ---
may fail to process it or lose important details. LangChain addresses this by allowing you to split documents into smaller, overlapping chunks.


In [27]:
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
docs = retriever.invoke("What are text splitting strategies?")
# Iterate over the retriever's output (`docs`), not `results` from the previous cell
for i, doc in enumerate(docs, 1):
    print(f"\n--- Result {i} ---")
    print(doc.page_content)


--- Result 1 ---
multiple text splitting strategies, such as character-based and recursive splitting, which respect natural boundaries like sentences and paragraphs.

--- Result 2 ---
Splitting text into chunks improves retrieval quality when you search for relevant information. For example, in a Retrieval-Augmented Generation (RAG) setup, a user's query is matched against these

--- Result 3 ---
may fail to process it or lose important details. LangChain addresses this by allowing you to split documents into smaller, overlapping chunks.


## Lesson 10 — Retrieval-Augmented Generation (RAG) Chain <a name="lesson-10"></a>

### 🎯 Objective
We will connect our FAISS retriever to a Gemini model so that:
1. The user asks a question.
2. Relevant document chunks are retrieved from the vector store.
3. The chunks are fed into the LLM prompt so answers are grounded in the source material.

---

### 📖 What & Why

#### What is RAG?
- **Retrieval-Augmented Generation** is an approach where the LLM is given **retrieved context** before generating an answer.
- This helps:
  - Reduce hallucinations (model making up facts)
  - Improve accuracy on domain-specific questions
  - Keep answers grounded in your dataset

#### How it Works
1. **Retriever** finds the top-k most relevant chunks for the query.
2. **Prompt Template** merges those chunks with the user’s question into a single prompt.
3. **LLM** generates an answer, with the prompt instructing it to use **only** that context.
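
Step 2 can be sketched in plain Python to show roughly what "merging chunks with the question" means. `stuff_prompt` is a hypothetical helper written for illustration, not a LangChain function, and the wording of the instructions is an assumption; the chains used in this lesson handle this formatting for you.

```python
def stuff_prompt(chunks: list[str], question: str) -> str:
    """Join retrieved chunks into one context block and merge it with the question."""
    context = "\n\n".join(chunks)
    return (
        "Answer using only the context below. "
        'If the answer is not in the context, say "I don\'t know."\n\n'
        f"Context:\n{context}\n\n"
        f"Question:\n{question}"
    )

# Example chunks (shortened from the sample document used earlier)
retrieved = [
    "Splitting text into chunks improves retrieval quality.",
    "LangChain lets you split documents into smaller, overlapping chunks.",
]
print(stuff_prompt(retrieved, "Why do we split documents into chunks?"))
```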

---

### 🔑 New Classes / Functions
- **`vectorstore.as_retriever()`** — Wraps FAISS into a retriever interface.
- **`ChatPromptTemplate`** — Template for building structured prompts.
- **`create_stuff_documents_chain()`** — Creates a chain that “stuffs” all retrieved documents into the prompt.
- **`create_retrieval_chain()`** — Combines retriever + LLM chain into a single callable RAG pipeline.

---

### 🚀 Minimal Runnable Example (Gemini)



In [28]:

# Pre-req: You must have `vectorstore` from Lesson 9
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain

# Step 1: Create retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# Step 2: Initialize Gemini chat model
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)

# Step 3: Create prompt template
prompt = ChatPromptTemplate.from_template("""
You are an assistant that answers questions based only on the provided context.
If the answer is not in the context, say "I don't know."

Context:
{context}

Question:
{input}
""")

# Step 4: Create document chain (stuffing context into prompt)
doc_chain = create_stuff_documents_chain(llm, prompt)

# Step 5: Create RAG chain
rag_chain = create_retrieval_chain(retriever, doc_chain)

# Step 6: Ask a question
response = rag_chain.invoke({"input": "Why do we split documents into chunks?"})
print(response["answer"])


Splitting text into chunks improves retrieval quality when searching for relevant information.  It also addresses the problem that processing large documents may cause failure to process or loss of important details.


### 🧪 Time to Practice — Lesson 10: RAG with Web Scrape

**Goal:**  
Scrape a single public article, chunk it, store it in FAISS, and query it via RAG.

---

#### **Step 1 — Scrape and Load**
```python
!pip install -qU beautifulsoup4
from langchain_community.document_loaders import WebBaseLoader

url = "https://example.com/article"  # replace with your chosen article
loader = WebBaseLoader(url)
web_docs = loader.load()
```


---
#### **Step 2 — Chunk**

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=120)
chunks_web = splitter.split_documents(web_docs)
```

---
#### **Step 3 — Build Vector Store**
```python
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import FAISS

emb = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")
vectorstore = FAISS.from_documents(chunks_web, emb)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
```


---
#### **Step 4 — RAG Chain**

```python
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)
prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {input}"
)
doc_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, doc_chain)
```

---
#### **Step 5 — Ask 2 Questions**
```python
for q in [
    "Ask something clearly in the article",
    "Ask something not covered in the article"
]:
    print(f"\nQ: {q}")
    print("A:", rag_chain.invoke({"input": q})["answer"])
```

# Practice Code Below

In [29]:
from langchain_community.document_loaders import WebBaseLoader

url = "https://en.wikipedia.org/wiki/Naruto"  # replace with your chosen article
loader = WebBaseLoader(url)
web_docs = loader.load()



In [30]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=120)
chunks_web = splitter.split_documents(web_docs)

In [46]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import FAISS

emb = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")
vectorstore = FAISS.from_documents(chunks_web, emb)
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

In [47]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.6)
prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {input}"
)
doc_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, doc_chain)

In [48]:
for q in [
    "What is Naruto",
    "Who is Frieza?"
]:
    print(f"\nQ: {q}")
    print("A:", rag_chain.invoke({"input": q})["answer"])


Q: What is Naruto
A: Naruto is one of the best-selling manga series of all time, having 250 million copies in circulation worldwide.  It's one of Viz Media's best-selling manga series, with English translations appearing on USA Today and The New York Times bestseller lists.  The seventh volume won a Quill Award in 2006.  It has been praised for its character development, storylines, and action sequences, though some felt the latter slowed the story down.  Critics noted its coming-of-age themes and cultural references to Japanese mythology and Confucianism.  The story continues in *Boruto*, where Naruto's son Boruto Uzumaki creates his own ninja path.

Q: Who is Frieza?
A: This question cannot be answered from the given context.


## Lesson 11 — Capstone Project: RAG Agent <a name="lesson-11"></a>

### 🎯 Objective
We will build a **RAG-powered Agent** that can:
1. Search through a custom knowledge base (your ingested documents).
2. Answer grounded questions using Gemini.
3. Use extra tools when needed (e.g., Python calculations, web search).

---

### 📖 What & Why

#### Why RAG + Agent?
- **RAG** ensures answers are based on your data.
- **Agent** adds flexibility:
  - Can decide when to retrieve from knowledge base.
  - Can call external tools for supplemental information.
  - Can handle multi-step reasoning.

#### Flow
1. **Load & Chunk Docs** — From local files or scraped web pages.
2. **Embed & Index** — Build FAISS vector store.
3. **Retriever Tool** — Wrap retriever as a LangChain `Tool`.
4. **LLM Agent** — Use `ZERO_SHOT_REACT_DESCRIPTION` or a tool-calling agent with Gemini.
5. **Interaction** — Ask open-ended queries; agent decides which tool to call.
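
To show what "agent decides which tool to call" looks like mechanically, here is a minimal sketch of one iteration of a ReAct-style loop: parse the model's text for `Action:` and `Action Input:` lines, then dispatch to the named tool. The regex, `fake_retriever`, `run_one_step`, and the hard-coded model reply are all assumptions for illustration; LangChain's own parsing and loop are more elaborate.

```python
import re

def fake_retriever(query: str) -> str:
    """Hypothetical stand-in for the FAISS-backed retriever tool."""
    return f"(top chunks for: {query})"

# Tool registry: maps the tool name the model writes to a Python callable
tools = {"Document Retriever": fake_retriever}

def run_one_step(llm_output: str) -> str:
    """Parse a ReAct-style reply and call the tool it names, returning the observation."""
    match = re.search(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)", llm_output)
    if match is None:
        raise ValueError("No Action / Action Input found in model output")
    tool_name = match.group(1)
    tool_input = match.group(2).strip().strip('"')
    return tools[tool_name](tool_input)

# Hard-coded example of what a ReAct-style model reply can look like
reply = (
    "Thought: I should search the knowledge base.\n"
    "Action: Document Retriever\n"
    'Action Input: "What is LangChain?"'
)
print("Observation:", run_one_step(reply))
```

In a full agent, the observation is appended to the prompt and the model is called again until it produces a final answer.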

---

### 🔑 Key Components

- **`Tool` / `StructuredTool`** — To wrap the FAISS retriever as an agent-usable function.
- **`AgentExecutor`** — Runs the agent loop with tools.
- **`GoogleGenerativeAIEmbeddings`** — To embed chunks.
- **`FAISS`** — Fast, local vector store.
- **`ChatGoogleGenerativeAI`** — LLM backbone for reasoning & answering.

---

### 🚀 Minimal RAG Agent Example (Gemini)


In [53]:
# 1) Imports
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings, ChatGoogleGenerativeAI
from langchain_community.vectorstores import FAISS
from langchain.agents import Tool, AgentType, initialize_agent

# 2) Load docs (example: scrape a web article)
loader = WebBaseLoader("https://en.wikipedia.org/wiki/Naruto")
docs = loader.load()

# 3) Chunk
splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# 4) Build embeddings + FAISS
emb = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")
vectorstore = FAISS.from_documents(chunks, emb)
retriever = vectorstore.as_retriever(search_kwargs={"k": 10})

# 5) Wrap retriever as a Tool
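def retrieve_docs(query: str) -> str:
    """Return the top-k retrieved chunks joined into one string for the agent."""
    # `invoke` replaces the deprecated `get_relevant_documents`
    hits = retriever.invoke(query)
    return "\n\n".join(d.page_content for d in hits)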

retriever_tool = Tool(
    name="Document Retriever",
    func=retrieve_docs,
    description="Use this to search the knowledge base for answers."
)

# 6) Create LLM agent
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)
agent = initialize_agent(
    tools=[retriever_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# 7) Ask a question
response = agent.run("Please summarise the story of Part 1 of Naruto?")
print("\nFinal Answer:\n", response)




> Entering new AgentExecutor chain...
Thought: I need to retrieve information about the plot summary of Part 1 of Naruto.  I will use the Document Retriever to find a relevant summary.

Action: Document Retriever
Action Input: "Summary of Naruto Part 1"
Observation: Plot
Part I
See also: List of Naruto chapters (Part I)

Retrieved from "https://en.wikipedia.org/w/index.php?title=Naruto&oldid=1304030051"

said that the central theme in Part I of Naruto is how people accept each other, citing Naruto's development across the series as an example.[8]

to prevent a coup; he accepted, on the condition that Sasuke would be spared. Devastated by this revelation, Sasuke joins the Akatsuki to destroy Konoha in revenge. As Konoha ninjas defeat several Akatsuki members, the Akatsuki figurehead leader, Nagato, kills Jiraiya and devastates Konoha, but Naruto defeats and redeems him, earning the village's respect and admiration.

from Jiraiya to prepare himself