# 🤖 AI Agents with Gemini and LangChain

This notebook walks you through building practical LLM-based agents using:
- Google Gemini (via LangChain)
- Tools (calculator, Python, SQL on DataFrame)
- Retrieval-Augmented Generation (RAG) with Chroma
- A real-world **Data Analyst Agent** that runs SQL queries on pandas

🧠 By the end, you’ll have an agent that can:
- Answer from documents (RAG)
- Execute code (Python tool)
- Perform SQL queries on a DataFrame
- Use memory and tools like a smart assistant


## 🔧 Install Dependencies

We’ll install all required packages, including LangChain, the Gemini connector, and Chroma.


In [4]:
!pip install -U langchain langchain-community langchain-google-genai google-generativeai chromadb langchain-experimental pandasql


## 🔑 Gemini API Setup

You'll need an API key from Google AI Studio: [https://makersuite.google.com](https://makersuite.google.com)

Make sure to replace `"your-api-key"` with your actual key in the cell below.


In [5]:
import os
# Hardcoding a key is not good practice; it is done here only to keep the tutorial short.
os.environ["GOOGLE_API_KEY"] = "your-api-key"  # Replace this with your actual Gemini API key
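
A hardcoded key is easy to leak when a notebook is shared. One common alternative, sketched here, is to read the key from the environment and fall back to a hidden prompt only when it is missing. The helper name `load_api_key` and its parameters are hypothetical and not part of LangChain or the Gemini SDK.

```python
import os
from getpass import getpass


def load_api_key(var_name="GOOGLE_API_KEY", env=None, prompt=getpass):
    """Return the key from the environment, prompting only if it is missing.

    `load_api_key` is a hypothetical helper written for this tutorial.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        # getpass hides the typed value so it does not end up in cell output.
        key = prompt(f"Enter {var_name}: ").strip()
        env[var_name] = key
    return key
```

Calling `load_api_key()` at the top of the notebook could then stand in for the hardcoded assignment.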


In [17]:
import time

## 💬 Initialize Gemini with LangChain

This will connect Gemini as your main LLM for agents, RAG, and tools.


In [6]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.7)


In [7]:
print(llm.invoke("What is the difference between a crocodile and an alligator?").content)


While both crocodiles and alligators are large, semi-aquatic reptiles belonging to the order Crocodilia, they have several distinct differences that allow you to tell them apart.

Here's a breakdown of the key distinctions:

| Feature           | Alligator                                 | Crocodile                                  |
| :---------------- | :---------------------------------------- | :----------------------------------------- |
| **Snout Shape**   | **Broader, U-shaped** (like a shovel)     | **Narrower, V-shaped** (more pointed)      |
| **Teeth Visibility** | When their mouth is closed, you generally only see the teeth from their **upper jaw** (the lower teeth fit into sockets in the upper jaw). The large 4th tooth on the lower jaw is *hidden*. | When their mouth is closed, teeth from **both the upper and lower jaws** are visible and interlock. The large 4th tooth on the lower jaw prominently sticks up on the outside of the upper jaw. |
| **Color**         | Typically 

## 🤖 Conversational Chatbot with Memory (Gemini + LangChain)

In this section, we'll build a chatbot that:
- Maintains **conversation history**
- Feels **personal and friendly**
- Can remember what the user said earlier
- Responds with context-aware answers using **Gemini (via LangChain)**

We'll use:
- `ConversationChain`
- `ConversationBufferMemory`
- `PromptTemplate` for personality


In [8]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.prompts import PromptTemplate


### 🔑 Step 1: Initialize Gemini LLM

We'll use `ChatGoogleGenerativeAI` with `temperature=0.7` for balanced creativity.


In [9]:
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.7)


### 🧠 Step 2: Create Conversation Memory

`ConversationBufferMemory` stores the full chat history in memory.  
This allows the agent to remember and respond with awareness of earlier turns.
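
To make the idea concrete, here is a toy stand-in for a buffer memory written in plain Python. It is only a sketch of the concept (keep every turn, replay the whole transcript), not LangChain's implementation; the class name `TinyBufferMemory` is made up for illustration.

```python
class TinyBufferMemory:
    """Toy stand-in for a buffer memory: keeps every turn in a list."""

    def __init__(self):
        self.turns = []  # (speaker, text) tuples, oldest first

    def save_turn(self, user_text, ai_text):
        self.turns.append(("Human", user_text))
        self.turns.append(("AI", ai_text))

    def history(self):
        # Render the whole transcript as text that can be placed in a prompt.
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


memory_demo = TinyBufferMemory()
memory_demo.save_turn("My name is Sara.", "Nice to meet you, Sara!")
memory_demo.save_turn("What is my name?", "You told me it is Sara.")
print(memory_demo.history())
```

Because the full transcript is replayed on every turn, the prompt grows with the length of the conversation.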


In [10]:
memory = ConversationBufferMemory()  # default return_messages=False yields plain-text history for the string prompt




### 🎭 Step 3: Add AI Personality with PromptTemplate

We'll give the chatbot a fun and friendly persona using `PromptTemplate`.
The `{history}` and `{input}` placeholders allow dynamic conversation flow.
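
As a rough illustration of how the placeholders get filled, the sketch below uses plain `str.format` as a stand-in for the template engine. The sample history and input strings are invented for the example.

```python
template = (
    "You are a helpful, humorous AI friend named Bomba.\n\n"
    "Chat History:\n{history}\n\n"
    "User: {input}\nBomba:"
)

# On each turn, the stored history and the new user message are substituted in.
filled = template.format(
    history="Human: My name is Sara.\nAI: Nice to meet you, Sara!",
    input="What is my name?",
)
print(filled)
```

The prompt ends with `Bomba:` so that the model continues the text in the persona's voice.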


In [11]:
chat_prompt = PromptTemplate.from_template(
    """You are a helpful, humorous AI friend named Bomba.
Keep the conversation light and friendly, but don't talk too much.

Chat History:
{history}

User: {input}
Bomba:"""
)


### 🔄 Step 4: Create the ConversationChain

We'll connect the LLM, memory, and prompt into a single chain.


In [12]:
chatbot = ConversationChain(
    llm=llm,
    memory=memory,
    prompt=chat_prompt,
    verbose=False,  # set to True to show LangChain logs
)




### 🗣️ Step 5: Chat with Bomba!

Try saying things and refer to them later. Bomba will remember what you said earlier.


In [13]:
while True:
    user_input = input("You: ")
    if user_input.lower() in ["exit", "quit"]:
        break
    reply = chatbot.run(user_input)
    print(f"Bomba: {reply}")


You: exit



## 🧰 LLM Agent with Tools

We'll now build an agent that can:
- Solve math
- Execute Python code
- Get current time
- Fetch content from the web



```
Note: LangChain requires every Tool's func to accept exactly one argument, even if it is unused. That is why the helper functions below take a placeholder `_` parameter.
```
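
If you already have a zero-argument function, one way to satisfy the single-argument requirement is a small wrapper. This is a minimal sketch; `as_single_arg` and `now_string` are hypothetical names, not LangChain utilities.

```python
from datetime import datetime


def now_string():
    """A zero-argument function we would like to expose as a tool."""
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


def as_single_arg(func):
    """Wrap a zero-argument function so it accepts (and ignores) one input."""
    def wrapper(_tool_input=""):
        return func()
    return wrapper


current_time = as_single_arg(now_string)
print(current_time("anything"))  # the argument is ignored
```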



In [14]:
from langchain.agents import Tool, initialize_agent, AgentType
from langchain_experimental.tools.python.tool import PythonREPLTool
from datetime import datetime
import requests

# 🕒 Tool: Current Time
def current_time_tool(_):
    return f"📅 Current datetime: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"

# 🌐 Tool: Fetch example.com
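def fetch_example_dot_com(_):
    # A timeout keeps the agent from hanging if the site is unreachable
    response = requests.get("https://example.com", timeout=10)
    return response.text[:1000]  # truncate so the observation stays short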

# 🧮 Safe Calculator Tool
def safe_calculator(query: str) -> str:
    try:
        # Only allow digits, arithmetic operators, parentheses, dots, and spaces
        allowed = "0123456789+-*/(). "
        if any(c not in allowed for c in query):
            return "Only math expressions are allowed."
        return str(eval(query))  # eval is acceptable only because of the whitelist above
    except Exception as e:
        return f"Error: {e}"

# ✅ Tools list
tools = [
    PythonREPLTool(),
    Tool(
        name="Calculator",
        func=safe_calculator,
        description="Performs basic math: addition, subtraction, multiplication, division."
    ),
    Tool(
        name="CurrentTime",
        func=current_time_tool,
        description="Returns the current date and time."
    ),
    Tool(
        name="FetchExampleDotCom",
        func=fetch_example_dot_com,
        description="Fetches the HTML content of example.com."
    ),
]
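
The character whitelist in `safe_calculator` still lets `**` through (it is just two `*` characters), so an input such as `9**9**9` could tie up the kernel. One alternative, sketched below, is to parse the expression with the standard `ast` module and evaluate only the node types you allow. The names `ast_calculator` and `_evaluate` are illustrative and not part of the original notebook.

```python
import ast
import operator

# Map AST operator node types to the arithmetic they represent.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate a parsed expression, rejecting anything unexpected."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("Only +, -, *, / and numbers are allowed.")


def ast_calculator(query: str) -> str:
    try:
        tree = ast.parse(query.strip(), mode="eval")
        return str(_evaluate(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as e:
        return f"Error: {e}"


print(ast_calculator("(25 + 5) * 3"))
print(ast_calculator("2 ** 100000"))  # rejected: ** is not in the allowed set
```

Since it takes a single string and returns a string, `ast_calculator` could be passed as `func=` to the `Calculator` tool in place of `safe_calculator`.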


In [15]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.7)

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,  # zero-shot ReAct: no chat history, only per-query reasoning steps
    verbose=True,  # set to False to hide the reasoning trace
)
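
A ReAct-style agent alternates between model text of the form `Action: ...` / `Action Input: ...` and tool observations until the model emits `Final Answer:`. The sketch below shows one step of that loop in plain Python. The regular expression, the `run_one_step` helper, and the `WordCount` demo tool are illustrative assumptions, not LangChain's actual output parser.

```python
import re

ACTION_PATTERN = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.*)", re.DOTALL)


def run_one_step(llm_text, tool_functions):
    """Parse one model reply and either finish or call the named tool."""
    if "Final Answer:" in llm_text:
        return ("final", llm_text.split("Final Answer:", 1)[1].strip())
    match = ACTION_PATTERN.search(llm_text)
    if match is None:
        return ("error", "Could not parse an action.")
    tool_name, tool_input = match.group(1).strip(), match.group(2).strip()
    if tool_name not in tool_functions:
        return ("error", f"Unknown tool: {tool_name}")
    # The observation would be appended to the prompt for the next model call.
    return ("observation", tool_functions[tool_name](tool_input))


# Hypothetical demo tool: counts the words in its input.
demo_tools = {"WordCount": lambda text: str(len(text.split()))}

print(run_one_step("Action: WordCount\nAction Input: one two three", demo_tools))
print(run_one_step("I now know the final answer\nFinal Answer: 90", demo_tools))
```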




In [18]:
agent.run("What is (25 + 5) * 3?")
time.sleep(2)
agent.run("Give me the current time.")
time.sleep(2)
agent.run("Generate the first 10 prime numbers in Python.")
time.sleep(2)
agent.run("Fetch the content from example.com")


> Entering new AgentExecutor chain...
Action: Calculator
Action Input: (25 + 5) * 3
Observation: 90
Thought: I now know the final answer
Final Answer: 90

> Finished chain.


> Entering new AgentExecutor chain...
Action: CurrentTime
Action Input: _
Observation: 📅 Current datetime: 2025-07-29 10:06:38
Thought: I now know the final answer
Final Answer: The current time is 10:06:38 on 2025-07-29.

> Finished chain.


> Entering new AgentExecutor chain...
Action: Python_REPL
Action Input:
def is_prime(num):
    if num < 2:
        return False
    for i in range(2, int(num**0.5) + 1):
        if num % i == 0:
            return False
    return True

primes = []
num = 2
while len(primes) < 10:
    if is_prime(num):
        primes.append(num)
    num += 1
print(primes)
Observation: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
Thought: I now know the final answer
Final Answer: The first 10 prime numbers are [2, 3, 5, 7, 11, 13, 17, 19, 23, 29].

> Finished chain.


> Entering new AgentExecutor chain...
Action: FetchExampleDotCom
Action Input:
Observation: <!doctype html>
<html>
<head>
    <title>Example Domain</title>

    <meta charset="utf-8" />
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <style type="text/css">
    body {
        background-

'<!doctype html>\n<html>\n<head>\n    <title>Example Domain</title>\n\n    <meta charset="utf-8" />\n    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />\n    <meta name="viewport" content="width=device-width, initial-scale=1" />\n    <style type="text/css">\n    body {\n        background-color: #f0f0f2;\n        margin: 0;\n        padding: 0;\n        font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;\n        \n    }\n    div {\n        width: 600px;\n        margin: 5em auto;\n        padding: 2em;\n        background-color: #fdfdff;\n        border-radius: 0.5em;\n        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);\n    }\n    a:link, a:visited {\n        color: #38488f;\n        text-decoration: none;\n    }\n    @media (max-width: 700px) {\n        div {\n            margin: 0 auto;\n            width: auto;\n        }\n    }\n    </style>    \n</head>\n\n<body>\n<div>\n    <

## 📚 RAG: Teach the LLM Using Your Own Knowledge

We'll create an in-memory knowledge base and let Gemini answer based on that.


In [19]:
from langchain.schema import Document

rag_text = """
Return Policy:
You may return items within 30 days of receipt for a full refund. Items must be unused and in original packaging.
Electronics must be returned within 14 days.

Shipping Info:
Shipping takes 2-5 business days locally, and up to 10 days internationally.
Free shipping is available on orders above $100.

Support:
Customer support is available 24/7 via chat and email.
Phone support is available from 9 AM to 6 PM, Monday to Friday.
For urgent inquiries, use live chat on our website.
"""

docs = [Document(page_content=rag_text)]


In [20]:
from langchain.text_splitter import CharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings

splitter = CharacterTextSplitter(chunk_size=250, chunk_overlap=50)
chunks = splitter.split_documents(docs)

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")


In [21]:
from langchain_community.vectorstores import Chroma

# Use in-memory Chroma DB
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    collection_name="rag-kb"
)

# k=1: return only the single most similar chunk for each query
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})


In [22]:
from langchain.chains import RetrievalQA

rag_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever
)


In [23]:
rag_chain.run("How long do I have to return electronics?")


'Electronics must be returned within 14 days.'
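
To show what the retrieve-then-answer flow does conceptually, here is a dependency-free sketch. It swaps the embedding similarity used by Chroma for a simple word-overlap score, so it is a toy stand-in rather than how the vector store actually ranks chunks. All function names are hypothetical.

```python
import re


def split_paragraphs(text):
    """Split on blank lines: a simple stand-in for the text splitter."""
    return [part.strip() for part in re.split(r"\n\s*\n", text) if part.strip()]


def tokenize(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, chunks, k=1):
    """Rank chunks by how many words they share with the question (toy similarity)."""
    question_words = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(question_words & tokenize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Place the retrieved text in front of the question before calling the LLM."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


policy_text = """Return Policy:
You may return items within 30 days of receipt for a full refund.
Electronics must be returned within 14 days.

Shipping Info:
Shipping takes 2-5 business days locally, and up to 10 days internationally.

Support:
Customer support is available 24/7 via chat and email."""

question = "How long do I have to return electronics?"
top = retrieve(question, split_paragraphs(policy_text), k=1)
print(build_prompt(question, top))
```

With `k=1`, only the best-matching chunk reaches the model, which keeps the prompt small but can miss answers that span several chunks.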

## 🔍 Compare Gemini with and without RAG

Use the same query to see how much better Gemini performs with knowledge from your text chunks.


In [24]:
def compare_rag_vs_no_rag(question: str):
    print(f"❓ Question: {question}\n")
    print("🤖 Without RAG (LLM only):")
    print(llm.invoke(question).content)
    print("\n📚 With RAG (using context):")
    print(rag_chain.run(question))
    print("--------------------\n")


In [25]:
import time
compare_rag_vs_no_rag("What are the customer support hours?")
time.sleep(2)
compare_rag_vs_no_rag("Is international shipping free?")
time.sleep(2)
compare_rag_vs_no_rag("Can I return used items?")


❓ Question: What are the customer support hours?

🤖 Without RAG (LLM only):
I can't tell you without knowing **whose** customer support you're asking about!

Customer support hours vary wildly depending on the company, product, or service.

Please tell me:

*   **Which company** are you trying to contact? (e.g., Apple, Amazon, your bank, a specific software company, an airline, etc.)

Once you tell me who you're looking for, I might be able to help you find their specific hours, or at least direct you to where you can find them (usually on their official website's "Contact Us" or "Support" page).

📚 With RAG (using context):
Customer support is available 24/7 via chat and email. Phone support is available from 9 AM to 6 PM, Monday to Friday.
--------------------

❓ Question: Is international shipping free?

🤖 Without RAG (LLM only):
Generally, **no, international shipping is not free.**

International shipping involves significant costs due to:

1.  **Carrier Fees:** Costs for the actu

## 📊 Gemini-Powered Data Analyst Agent (SQL on pandas)

Let the agent accept natural-language questions like:
- `What's the average income?`

It will translate each question into a SQL query and run it against the DataFrame using `pandasql`.
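
By default, `pandasql` runs queries by loading the DataFrame into an in-memory SQLite database. The sketch below reproduces that idea with the standard `sqlite3` module and the same sample rows as the DataFrame defined in the next cell. It illustrates the mechanism only and is not pandasql's own code.

```python
import sqlite3

# Same sample rows as the tutorial DataFrame: (age, income, city)
rows = [
    (22, 3000, "Jeddah"),
    (35, 4500, "Riyadh"),
    (58, 5200, "Dammam"),
    (45, 4800, "Jeddah"),
    (32, 3900, "Riyadh"),
]

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE df (age INTEGER, income INTEGER, city TEXT)")
connection.executemany("INSERT INTO df VALUES (?, ?, ?)", rows)

# The kind of SQL the agent is expected to produce from a natural-language question
average_income = connection.execute("SELECT AVG(income) FROM df").fetchone()[0]
top_city = connection.execute(
    "SELECT city FROM df ORDER BY income DESC LIMIT 1"
).fetchone()[0]
connection.close()

print(average_income, top_city)
```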


In [26]:
from pandasql import sqldf
from langchain.tools import Tool

import pandas as pd

# Sample DataFrame
df = pd.DataFrame({
    "age": [22, 35, 58, 45, 32],
    "income": [3000, 4500, 5200, 4800, 3900],
    "city": ["Jeddah", "Riyadh", "Dammam", "Jeddah", "Riyadh"]
})

print(df)

# Simple SQL-on-DataFrame tool using pandasql
def sql_query_tool(query: str) -> str:
    try:
        # Enable use of 'df' as table name
        result = sqldf(query, {"df": df})
        return result.to_string(index=False) if not result.empty else "No results."
    except Exception as e:
        return f"⚠️ SQL Error: {e}"


   age  income    city
0   22    3000  Jeddah
1   35    4500  Riyadh
2   58    5200  Dammam
3   45    4800  Jeddah
4   32    3900  Riyadh


In [27]:
tools = [
    Tool(
        name="SQLOnDataFrame",
        func=sql_query_tool,
        description="Query the DataFrame using SQL."
    )
]


In [28]:
data_sql_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)


In [29]:
data_sql_agent.run("find me the average income")


> Entering new AgentExecutor chain...
Action: SQLOnDataFrame
Action Input: SELECT AVG(income) FROM df
Observation:  AVG(income)
      4280.0
Thought: I now know the final answer
Final Answer: 4280.0

> Finished chain.


'4280.0'

In [30]:
data_sql_agent.run("ايش المدينة اللي فيها اعلى راتب")  # Arabic: "Which city has the highest salary?"


> Entering new AgentExecutor chain...
Action: SQLOnDataFrame
Action Input: SELECT City FROM df ORDER BY Salary DESC LIMIT 1
Observation: ⚠️ SQL Error: (sqlite3.OperationalError) no such column: Salary
[SQL: SELECT City FROM df ORDER BY Salary DESC LIMIT 1]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
Thought: Action: SQLOnDataFrame
Action Input: SELECT * FROM df LIMIT 1
Observation:  age  income   city
  22    3000 Jeddah
Thought: Action: SQLOnDataFrame
Action Input: SELECT city FROM df ORDER BY income DESC LIMIT 1
Observation:   city
Dammam
Thought: I now know the final answer
Final Answer: Dammam

> Finished chain.


'Dammam'

In [31]:
# Why did the agent make several mistakes? How can we fix it?
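
In the trace above, the agent's first query guessed a `Salary` column, most likely because the tool description never states the schema; it only recovered after inspecting a row. One plausible fix, sketched under that assumption, is to spell out the table name and columns in the tool description. `describe_table` is a hypothetical helper, and the wording of the description is only a suggestion.

```python
def describe_table(table_name, columns):
    """Build a tool description that spells out the schema.

    `columns` maps column name -> type label. In the notebook it could be
    built from the DataFrame, e.g. {c: str(t) for c, t in df.dtypes.items()}.
    """
    column_list = ", ".join(f"{name} ({dtype})" for name, dtype in columns.items())
    return (
        f"Runs a SQLite query against the table `{table_name}`. "
        f"Available columns: {column_list}. "
        "Input must be a single SQL statement using only these column names."
    )


description = describe_table(
    "df", {"age": "int64", "income": "int64", "city": "object"}
)
print(description)
```

Passing this string as `description=` when creating the `SQLOnDataFrame` tool gives the model the column names up front, which should reduce the number of failed first attempts.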

##contributed by: Yazan Alshoibi