# 🤖 Building Smart AI Helpers

In this lab, we’ll learn how to build **AI assistants** step‑by‑step, starting from a simple chatbot and turning it into a powerful agent that can actually **do things**.

---

## 🧠 Step 1: Using a Language Model (LLM)
A **Language Model** is a type of AI trained to understand and write text.  
We’ll use **Google Gemini** (through a library called LangChain) so the AI can:
- Understand your questions.
- Give answers in natural language.

Think of it like a very smart "text predictor" — like when your phone guesses the next word, but much more advanced.

---

## 💬 Step 2: Making a Chatbot with Memory
We’ll turn the LLM into a **chatbot** — an AI that can hold a conversation.
- **Without memory**: It forgets what you said before.
- **With memory**: It remembers earlier messages and responds like a real person.

---

## 🛠 Step 3: Giving the AI Tools
Right now, the AI can only **talk**.  
We’ll give it **tools** so it can **take action**:
1. **Calculator Tool** → So it can solve math problems accurately.
2. **Python Tool** → So it can run small programs and calculations.
3. **SQL on DataFrame Tool** → So it can search and analyze data stored in tables.

---

## 📚 Step 4: Retrieval-Augmented Generation (RAG)
Sometimes the AI doesn’t know the answer from memory.  
With **RAG**, it can **look up information in a set of documents** before answering.
- We’ll store our documents in **Chroma**, a special database for searching text.
- The AI will search for the most relevant pieces, then use them to give an accurate answer.

This is like giving the AI **access to a library** so it can find the right page before replying.

---

## 📊 Step 5: Data Analyst Agent
Finally, we’ll combine everything to make a **Data Analyst Agent** that can:
- Read a spreadsheet or table of data.
- Run **SQL queries** on it (SQL is a language for searching and analyzing data).
- Answer real-world questions about the data.

---

## 🎯 By the end of this lab:
You will have built an AI that can:
- Chat like a friend.
- Remember your conversation.
- Use tools to calculate, code, and search data.
- Look up answers in real documents.
- Analyze data like a human analyst.


## 🔧 Install Dependencies

We’ll install all required packages including LangChain, Gemini connector, and Chroma.


In [2]:
!pip install -U langchain langchain-community langchain-google-genai google-generativeai chromadb langchain-experimental pandasql

Collecting langchain-community
  Downloading langchain_community-0.3.27-py3-none-any.whl.metadata (2.9 kB)
Collecting langchain-google-genai
  Downloading langchain_google_genai-2.1.8-py3-none-any.whl.metadata (7.0 kB)
Collecting chromadb
  Downloading chromadb-1.0.15-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.0 kB)
Collecting langchain-experimental
  Downloading langchain_experimental-0.3.4-py3-none-any.whl.metadata (1.7 kB)
Collecting pandasql
  Downloading pandasql-0.7.3.tar.gz (26 kB)
  Preparing metadata (setup.py) ... [?25l[?25hdone
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain-community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain-community)
  Downloading pydantic_settings-2.10.1-py3-none-any.whl.metadata (3.4 kB)
Collecting httpx-sse<1.0.0,>=0.4.0 (from langchain-community)
  Downloading httpx_sse-0.4.1-py3-none-any.whl.metadata (9.4 kB)
Collecting filetyp

## 🔑 Gemini API Setup

You'll need an API key from Google AI Studio: [https://makersuite.google.com](https://makersuite.google.com)

Make sure to replace `"your-key"` with your actual key in the cell below.


In [26]:
import os
# Add your own api key - not a good practice to do it like this but since it is a tutorial -
os.environ["GOOGLE_API_KEY"] = "Your-api-key"  # Replace this with your actual Gemini API key


In [4]:
import time

# 🧠 Part 1: Simple LLM (Language Model)

A **Language Model (LLM)** is like a super-smart "predict the next word" machine.

Think of it like this:
- If I say: “The capital of France is…”
- Your brain instantly guesses: “Paris.”
- An LLM does the same thing — but it’s been trained on **billions of sentences** from books, websites, and articles.

💡 **Key idea:**  
It doesn’t “think” like humans — it just uses **patterns** it has learned to guess the next most likely words.

Example:
> **Prompt:** `Write a short poem about cats`  
> **LLM Output:**  
> *Cats dance in moonlight / Silent paws on the ground / Purring in the night*

In this section, we will:
1. Give the LLM a text prompt.
2. Let it reply using only its built-in knowledge.
3. No memory, no tools — just text in → text out.


In [5]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.7)


In [6]:
print(llm.invoke("Write me a poem about cats").content)
# too long? try to add some instructions

With silent paws and velvet grace,
They wander through each human space.
A liquid flow, a supple bend,
From tip of tail to furry end.

Eyes like emeralds, keen and bright,
Reflecting secrets of the night.
A twitching ear, a knowing blink,
More wisdom in their thoughts, we think.

They stretch and yawn, a lazy sprawl,
Then suddenly stand lithe and tall.
A hunter's crouch, a sudden leap,
While tiny mice in shadows sleep.

Upon a sunbeam, warm and deep,
They fall into a blissful sleep.
A rumbling purr, a gentle thrum,
The sweetest sound when comfort's come.

They knead and nuzzle, soft and slow,
Then turn and watch the world below.
A whiskered kiss, a gentle head-nudge,
No greater love can truly trudge.

For though they walk with haughty air,
And seem as if they do not care,
They fill our homes with quiet charm,
Protecting us from every harm
(Or so they think, with watchful gaze),
Through all our days, in feline ways.

So raise a hand to stroke their fur,
And listen to their gentle purr.


# 💬 Part 2: Chatbot (LLM + Conversation Memory)

A chatbot is like a **simple LLM that remembers what you just said**.

Without memory:
- You: "My name is Ali."
- Later: "What’s my name?"  
- AI: "I don’t know."

With memory:
- You: "My name is Ali."
- Later: "What’s my name?"  
- AI: "You told me your name is Ali."

💡 **How we do it in code:**
1. We **store the conversation history** (everything you and the AI said).
2. We **send this history** every time we ask for a new reply.
3. This way, it feels like the AI remembers.

📌 **Analogy:**  
It’s like talking to a friend who **remembers the whole conversation** instead of forgetting after every sentence.

In this section, we will:
- Build a chatbot that can keep track of your previous messages.
- See how this makes conversations more natural.


In [None]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.prompts import PromptTemplate


### 🔑 Step 1: Initialize Gemini LLM

We'll use `ChatGoogleGenerativeAI` with `temperature=0.7` for balanced creativity.


In [None]:
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.7)


### 🧠 Step 2: Create Conversation Memory

`ConversationBufferMemory` stores the full chat history in memory.  
This allows the agent to remember and respond with awareness of earlier turns.


In [None]:
memory = ConversationBufferMemory(return_messages=True)


  memory = ConversationBufferMemory(return_messages=True)


# 🗨 Giving Our Chatbot a Personality

This is where we **give our chatbot a character**.

We choose:
- **Name** → Pixel 🎮 (fun, techy, and easy to remember)
- **Personality** → Friendly gamer friend who likes music and helps with homework
- **Tone** → Casual and short, like texting a friend

📌 **Why this matters:**
If you tell the AI “you are a scientist,” it will act formal.
If you say “you are a gamer friend,” it will act playful.

Here:
- `history` → remembers what we already talked about
- `input` → what the user just typed
- `Pixel:` → tells the AI where to start its reply


In [None]:
chat_prompt = PromptTemplate.from_template(
    """You are a helpful, witty AI friend named **Pixel** 🎮.
You love video games, music, and helping with homework.
Keep the conversation casual, fun, and easy to understand.
Don’t write super long replies — think like a real friend texting you.

Previous Chat:
{history}

User: {input}
Pixel:"""
)


### 🔄 Step 4: Create the ConversationChain

We'll connect the LLM, memory, and prompt into a single chain.


In [None]:
chatbot = ConversationChain(
    llm=llm,
    memory=memory,
    prompt=chat_prompt,
    verbose=False # verbose=False  # to turn off LangChain logs
)


  chatbot = ConversationChain(


In [None]:
while True:
    user_input = input("You: ")
    if user_input.lower() in ["exit", "quit"]:
        break
    reply = chatbot.run(user_input)
    print(f"Pixel: {reply}")


You: Im Yazan who are you 


  reply = chatbot.run(user_input)


Pixel: Hey Yazan! 👋 I'm Pixel! 🎮 Your friendly neighborhood AI, ready for some game talk, music jams, or even homework help if you need it. What's up? 😊
You: I like elden ring, and souls game in genral 
Pixel: Elden Ring and Souls games?! 🤩 My kinda person! You must love a good challenge... and maybe a little bit of pain? 😂 What's your favorite boss to conquer? 💪
You: What do you know about me ?
Pixel: Okay, Yazan! Let's see... 🤔

I know your name is Yazan, and you've got excellent taste because you're a big fan of Elden Ring and Souls games! That tells me you probably love a good challenge and don't mind dying a few (hundred) times. 😂

Did I get that right? Anything else I should know? 😉
You: exit 
Pixel: Noooo, don't press 'quit game'! 😂 Just kidding, Yazan! It was fun chatting! Catch ya later! 👋🎮
You: quit


# 🤖 Part 3: AI Agent (LLM + Tools)

Now we give the AI **hands and eyes** — the ability to **do things**.

Without tools:
- You: "What’s 23423 × 8982?"
- AI: Might get it wrong.

With tools (like a calculator):
- AI: Uses the calculator tool to get the correct number.

💡 **What can agents do?**
- Search Google 🌐
- Read and write files 📂
- Work with Excel 📊
- Use APIs to talk to other apps

📌 **The “Agent Loop”:**
1. Understand your task
2. Decide which tool to use
3. Use the tool
4. Look at the result
5. Give you the answer

**Analogy:**  
An AI agent is like a **student with access to the library, calculator, and internet** — not just what’s in their head.

In this section, we will:
- Add tools to our chatbot.
- Watch it think, choose a tool, and give better answers.


In [7]:
from langchain.agents import Tool, initialize_agent, AgentType
from langchain_experimental.tools.python.tool import PythonREPLTool
from datetime import datetime
import requests

# 🕒 Tool: Current Time
def current_time_tool(_):
    return f"📅 Current datetime: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"

# 🌐 Tool: Fetch example.com
def fetch_example_dot_com(website):
    response = requests.get(website)
    return response.text[:1000]

# 🧮 Safe Calculator Tool
def safe_calculator(query: str) -> str:
    try:
        # Only allow math operators
        allowed = "0123456789+-*/(). "
        if any(c not in allowed for c in query):
            return "Only math expressions are allowed."
        return str(eval(query))
    except Exception as e:
        return f"Error: {str(e)}"

# ✅ Tools list
tools = [
    PythonREPLTool(),
    Tool(
        name="Calculator",
        func=safe_calculator,
        description="Performs basic math: addition, subtraction, multiplication, division."
    ),
    Tool(
        name="CurrentTime",
        func=current_time_tool,
        description="Returns the current date and time."
    ),
    Tool(
        name="FetchExampleDotCom",
        func=fetch_example_dot_com,
        description="Fetches the HTML content using the user input"
    ),
]


In [8]:
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", temperature=0.7)

agent = initialize_agent(
    tools=tools,
    llm=llm,
    handle_parsing_errors=True,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, # zero shot ReAct doesn't depend on history, only reasoning and steps
    verbose=True #false to hide the process
)


  agent = initialize_agent(


In [11]:
agent.run("What is (25 + 5) * 3?")
time.sleep(2)
agent.run("Give me the current time.")
time.sleep(2)
agent.run("Generate the first 10 prime numbers in Python.")
time.sleep(2)
agent.run("Fetch the content from https://example.com")



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mAction: Calculator
Action Input: (25 + 5) * 3[0m
Observation: [33;1m[1;3m90[0m
Thought:[32;1m[1;3mI now know the final answer
Final Answer: 90[0m

[1m> Finished chain.[0m


[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mAction: CurrentTime
Action Input: _[0m
Observation: [38;5;200m[1;3m📅 Current datetime: 2025-08-02 13:49:22[0m
Thought:[32;1m[1;3mI now know the final answer
Final Answer: 2025-08-02 13:49:22[0m

[1m> Finished chain.[0m


[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mAction: Python_REPL
Action Input:
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

primes = []
num = 2
while len(primes) < 10:
    if is_prime(num):
        primes.append(num)
    num += 1
print(primes)[0m
Observation: [36;1m[1;3m[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
[0m
Thought:[32;1m[1;3mI now

'The HTML content from https://example.com is: <!doctype html><html><head> <title>Example Domain</title> <meta charset="utf-8" /> <meta http-equiv="Content-type" content="text/html; charset=utf-8" /> <meta name="viewport" content="width=device-width, initial-scale=1" /> <style type="text/css"> body { background-color: #f0f0f2; margin: 0; padding: 0; font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif; } div { width: 600px; margin: 5em auto; padding: 2em; background-color: #fdfdff; border-radius: 0.5em; box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02); } a:link, a:visited { color: #38488f; text-decoration: none; } @media (max-width: 700px) { div { margin: 0 auto; width: auto; } } </style> </head><body><div> <h1>Example Domain</h1> <p>This domai'

# 📚 RAG: School Event Help Desk

Our RAG system will have a **School Knowledge Vault** with:
- 📅 Dates for school events
- 🎤 Club meeting times
- 📚 Library rules
- 🏆 Competitions and deadlines

📌 **How it works:**
1. You ask a question → "When is the school science fair?"
2. The AI searches the School Knowledge Vault.
3. It finds the exact answer and gives it back to you.

💡 **Why this is fun:**
- You can ask about **realistic school activities**.
- Shows how RAG is useful in daily school life.


In [12]:
from langchain.schema import Document

rag_text = """
🎉 School Events Info:
- The school science fair happens every March in the main hall.
- The art exhibition takes place in April in the art building.
- The annual sports day is held in May at the football field.

🎤 Club Meeting Times:
- Drama Club: Tuesdays at 4 PM in Room 12.
- Coding Club: Wednesdays at 3:30 PM in the computer lab.
- Music Band: Fridays at 5 PM in the music room.

📚 Library Rules:
- You can borrow books for up to 14 days.
- Late returns cost 2 SAR per day.
- Study rooms must be booked in advance.

🏆 Competitions:
- Chess Tournament: Registration closes February 15th.
- Debate Competition: Finals on April 20th in the auditorium.
"""

docs = [Document(page_content=rag_text)]


In [13]:
from langchain.text_splitter import CharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings

text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=100)
chunks = text_splitter.split_documents(docs)


embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")


In [14]:
from langchain.vectorstores import Chroma

# Use in-memory Chroma DB
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    collection_name="school-kb"
)

retriever = vectorstore.as_retriever(search_kwargs={"k": 1})


In [15]:
from langchain.chains import RetrievalQA

rag_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever
)


## 🔍 Compare Gemini with and without RAG

Use the same query to see how much better Gemini performs with knowledge from your text chunks.


In [16]:
def compare_rag_vs_no_rag(question: str):
    print(f"❓ Question: {question}\n")
    print("🤖 Without RAG (LLM only):")
    print(llm.invoke(question).content)
    print("\n📚 With RAG (using context):")
    print(rag_chain.run(question))
    print("--------------------\n")


In [17]:
import time
compare_rag_vs_no_rag("When is the chess tournament registration deadline?")
time.sleep(2)
compare_rag_vs_no_rag("Where is the art exhibition held?")
time.sleep(2)
compare_rag_vs_no_rag("What are the rules for study rooms?")


❓ Question: When is the chess tournament registration deadline?

🤖 Without RAG (LLM only):
There isn't a single, universal registration deadline for "the" chess tournament.

Chess tournaments are held all the time, all over the world, by various organizations (local clubs, schools, national federations, online platforms). Each tournament has its own specific registration deadline.

**To find the registration deadline for a specific tournament, you need to:**

1.  **Identify the specific tournament** you are interested in.
2.  **Go to its official website or announcement page.** This is where you'll find all the details, including dates, location, entry fees, and the registration deadline.

If you can tell me which tournament you're looking for, I might be able to help you find the information!

📚 With RAG (using context):
Registration for the Chess Tournament closes on February 15th.
--------------------

❓ Question: Where is the art exhibition held?

🤖 Without RAG (LLM only):
I don't 

## 📊 Gemini-Powered Data Analyst Agent (SQL on pandas)

Let the agent accept SQL queries like:
- `What's the average income?`

It will translate the SQL to pandas queries and return results using `pandasql`.


In [18]:
from pandasql import sqldf
from langchain.tools import Tool

import pandas as pd

# Sample DataFrame
df = pd.DataFrame({
    "age": [22, 35, 58, 45, 32],
    "income": [3000, 4500, 5200, 4800, 3900],
    "city": ["Jeddah", "Riyadh", "Dammam", "Jeddah", "Riyadh"]
})

print(df)

# Simple SQL-on-DataFrame tool using pandasql
def sql_query_tool(query: str) -> str:
    try:
        # Enable use of 'df' as table name
        result = sqldf(query, {"df": df})
        return result.to_string(index=False) if not result.empty else "No results."
    except Exception as e:
        return f"⚠️ SQL Error: {e}"


   age  income    city
0   22    3000  Jeddah
1   35    4500  Riyadh
2   58    5200  Dammam
3   45    4800  Jeddah
4   32    3900  Riyadh


In [19]:
tools = [
    Tool(
        name="SQLOnDataFrame",
        func=sql_query_tool,
        description="Query the DataFrame using SQL. "
    )
]


In [20]:
data_sql_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)


In [21]:
data_sql_agent.run("find me the average income")




[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mAction: SQLOnDataFrame
Action Input: SELECT AVG(income) FROM df[0m
Observation: [36;1m[1;3m AVG(income)
      4280.0[0m
Thought:[32;1m[1;3mI now know the final answer
Final Answer: The average income is 4280.0.[0m

[1m> Finished chain.[0m


'The average income is 4280.0.'

In [24]:
data_sql_agent.run("ايش المدينة اللي فيها اعلى راتب")




[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mAction: SQLOnDataFrame
Action Input: SELECT City, Salary FROM df ORDER BY Salary DESC LIMIT 1[0m
Observation: [36;1m[1;3m⚠️ SQL Error: (sqlite3.OperationalError) no such column: Salary
[SQL: SELECT City, Salary FROM df ORDER BY Salary DESC LIMIT 1]
(Background on this error at: https://sqlalche.me/e/20/e3q8)[0m
Thought:[32;1m[1;3mAction: SQLOnDataFrame
Action Input: SELECT * FROM df LIMIT 1[0m
Observation: [36;1m[1;3m age  income   city
  22    3000 Jeddah[0m
Thought:[32;1m[1;3mAction: SQLOnDataFrame
Action Input: SELECT City, income FROM df ORDER BY income DESC LIMIT 1[0m
Observation: [36;1m[1;3m  city  income
Dammam    5200[0m
Thought:[32;1m[1;3mI now know the final answer
Final Answer: Dammam[0m

[1m> Finished chain.[0m


'Dammam'

In [25]:
# Why the Agent made many mitakes? how to fix it ?

##contributed by: Yazan Alshoibi