<a href="https://colab.research.google.com/github/ammarasim/ai-agent-workshop-lums-sep-20/blob/main/02_rag_chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
from google.colab import userdata
import google.generativeai as genai

# Retrieve the API key from Colab Secrets
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY') # Replace 'GOOGLE_API_KEY' with the name you used

# Configure the Generative AI SDK with your API key
genai.configure(api_key=GOOGLE_API_KEY)

# Now you can initialize and use the Gemini API
model = genai.GenerativeModel('gemini-1.5-flash')
response = model.generate_content("Hello, Gemini!")
print(response.text)

Hello there!  How can I help you today?



**LLM ChatBot**

In [None]:
from google.colab import userdata
import google.generativeai as genai

# Retrieve the API key from Colab Secrets
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY') # Replace 'GOOGLE_API_KEY' with the name you used

# Configure the Generative AI SDK with your API key
genai.configure(api_key=GOOGLE_API_KEY)

def summarize_paper_no_rag(title_or_doi: str, model_name: str = "gemini-1.5-flash"):
    """
    Ask Gemini to summarize a paper given title/DOI only.
    If paper is new, model may hallucinate.
    """
    prompt = f"""
    Summarize the following research paper in 150 words.
    Paper: {title_or_doi}
    Highlight key contributions and novelty.
    """
    try:
        model = genai.GenerativeModel(model_name)
        response = model.generate_content(prompt)
        return response.text
    except Exception as e:
        return f"An error occurred: {e}"

if __name__ == "__main__":
    # Example: brand new paper
    paper_title = "Numerical investigations on CdTe-based rectangular photonic waveguides"
    # print("🔹 LLM-only Summarization (Expect hallucination if paper is new):")
    print(summarize_paper_no_rag(paper_title))


This research paper numerically investigates the optical properties of CdTe-based rectangular photonic waveguides using finite-element methods.  The key contribution lies in the comprehensive analysis of waveguide performance across a range of geometrical parameters (width and height) and wavelengths.  The study explores the impact of these parameters on confinement factor, effective refractive index, and propagation loss, providing valuable design guidelines for optimizing CdTe waveguide performance for specific applications.  Novelty is established through a detailed investigation of polarization characteristics and the identification of optimal dimensions for achieving low propagation loss and strong light confinement, particularly at wavelengths relevant to infrared applications. The findings offer significant insights for the design and fabrication of efficient CdTe-based photonic integrated circuits.



**AI Agent**

In [None]:
from google.colab import userdata
import google.generativeai as genai

# Retrieve the API key from Colab Secrets
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY') # Replace 'GOOGLE_API_KEY' with the name you used

# Configure the Generative AI SDK with your API key
genai.configure(api_key=GOOGLE_API_KEY)

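def process_with_gemini(query, model_name: str = "gemini-1.5-flash"):
    """Use the Gemini API to extract the time preference from the query."""
    prompt = f"Extract the time preference (morning or afternoon) from this query: '{query}'. Return only 'morning' or 'afternoon', or None if unclear."
    try:
        # Build the model here so the function does not rely on a global `model`
        # left over from an earlier cell, and so `model_name` is actually used.
        model = genai.GenerativeModel(model_name)
        response = model.generate_content(prompt)
        # Normalize whitespace and case before comparing against the labels
        preference = response.text.strip().lower()
        if preference in ["morning", "afternoon"]:
            return preference
        return None
    except Exception as e:
        print(f"Gemini API error: {e}")
        return None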

def select_time_slot(preference):
    """Select a time slot based on preference (internal knowledge)."""
    time_slots = {
        "morning": ["9:00 AM", "10:00 AM", "11:00 AM"],
        "afternoon": ["1:00 PM", "2:00 PM", "3:00 PM"]
    }
    if preference in time_slots:
        return time_slots[preference][0]  # Select first available slot
    return None

def ai_agent(query):
    """Simple AI Agent for scheduling a meeting."""
    print(f"Query: {query}")

    # Step 1: Process input with Gemini API
    preference = process_with_gemini(query)
    if not preference:
        print("Error: Could not identify preference (morning/afternoon).")
        return None
    print(f"Preference: {preference}")

    # Step 2: Select time slot (action)
    time_slot = select_time_slot(preference)
    if not time_slot:
        print("Error: No available time slots for the preference.")
        return None
    print(f"Action: Selected time slot {time_slot}")

    # Step 3: Output result
    result = f"Meeting scheduled at {time_slot}"
    print(f"Result: {result}")
    return result

# Run the agent
if __name__ == "__main__":
    query = "Schedule a meeting in the afternoon"
    result = ai_agent(query)

Query: Schedule a meeting in the afternoon
Preference: afternoon
Action: Selected time slot 1:00 PM
Result: Meeting scheduled at 1:00 PM
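
The agent above only accepts a reply that is exactly `morning` or `afternoon`, so a reply such as `Afternoon.` or `'morning'` would be rejected. The following is a minimal sketch of a more tolerant parser. The helper name `normalize_preference` is hypothetical and not part of the notebook, and it makes no API call.

```python
import re
from typing import Optional

def normalize_preference(raw: Optional[str]) -> Optional[str]:
    """Map a free-form model reply to 'morning', 'afternoon', or None."""
    if not raw:
        return None
    # Lowercase and keep only alphabetic words, so quotes and punctuation are ignored.
    words = set(re.findall(r"[a-z]+", raw.lower()))
    hits = words & {"morning", "afternoon"}
    # Require exactly one label; a reply with both or neither is treated as unclear.
    return hits.pop() if len(hits) == 1 else None

print(normalize_preference("Afternoon.\n"))
print(normalize_preference("morning or afternoon"))
```

One possible use is to call it on `response.text` inside `process_with_gemini` in place of the exact-match check.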


**AI Agent with RAG**

In [None]:
# Install dependencies
!pip install chromadb pypdf2 google-generativeai
from google.colab import userdata
import google.generativeai as genai

# Retrieve the API key from Colab Secrets
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY') # Replace 'GOOGLE_API_KEY' with the name you used

# Configure the Generative AI SDK with your API key
genai.configure(api_key=GOOGLE_API_KEY)

from google.colab import files
import chromadb
from chromadb.utils import embedding_functions
from PyPDF2 import PdfReader


CHROMA_DB = "paper_db"
CHUNK_SIZE = 800    # words approx.
CHUNK_OVERLAP = 100
# ============================================

# Upload a PDF from your computer
uploaded = files.upload()
pdf_path = list(uploaded.keys())[0]
print(f"✅ Uploaded file: {pdf_path}")


# Setup embedding function (Gemini embeddings for ChromaDB)
embedding_func = embedding_functions.GoogleGenerativeAiEmbeddingFunction(
    api_key=GOOGLE_API_KEY,
    model_name="models/embedding-001"   # Gemini embeddings
)


# Initialize Chroma client
chroma_client = chromadb.PersistentClient(path=CHROMA_DB)

# Reset collection if exists
try:
    chroma_client.delete_collection("papers")
except Exception:
    # The collection does not exist yet, so there is nothing to reset
    pass

collection = chroma_client.create_collection(
    name="papers",
    embedding_function=embedding_func
)

# ---------------- PDF Parsing ----------------
def load_pdf_text(pdf_path):
    reader = PdfReader(pdf_path)
    text = ""
    for page in reader.pages:
        txt = page.extract_text()
        if txt:
            text += txt + "\n"
    return text

# ---------------- Chunking ----------------
def chunk_text(text, chunk_size=800, overlap=100):
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = min(start + chunk_size, len(words))
        chunk = " ".join(words[start:end])
        if len(chunk.split()) > 50:  # filter out tiny fragments
            chunks.append(chunk)
        start += chunk_size - overlap
    return chunks

# ---------------- Load + Store in Chroma ----------------
doc_text = load_pdf_text(pdf_path)
chunks = chunk_text(doc_text, CHUNK_SIZE, CHUNK_OVERLAP)

for i, chunk in enumerate(chunks):
    collection.add(documents=[chunk], ids=[str(i)])

print("✅ ChromaDB collection built successfully! Total chunks:", len(chunks))

# ---------------- RAG Query ----------------
def rag_query(question, top_k=4, show_chunks=False, model_name="gemini-1.5-flash"):
    results = collection.query(query_texts=[question], n_results=top_k)
    retrieved_docs = results["documents"][0]

    if show_chunks:
        print("\n🔍 Retrieved Chunks:")
        for i, d in enumerate(retrieved_docs, start=1):
            print(f"\n--- Chunk {i} ---\n{d[:500]}...\n")

    context = "\n\n".join(retrieved_docs)
    prompt = f"""
    You are a research assistant. Based ONLY on the context, answer the question clearly.
    If context is insufficient, say "I don’t know".

    Context:
    {context}

    Question: {question}
    """

    try:
        model = genai.GenerativeModel(model_name)
        response = model.generate_content(prompt)
        return response.text
    except Exception as e:
        return f"❌ Error: {e}"


# ---------------- Run Example ----------------
answer = rag_query("What are the main contributions of this paper?")
print("\n🔹 RAG Answer:", answer)
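
To see how the overlapping word windows behave, here is a small sketch on toy input. It follows the same sliding-window idea as `chunk_text` above, with two differences that are assumptions of this sketch rather than the notebook's code: it rejects an `overlap` that is not smaller than `chunk_size` (which would otherwise never advance), and it stops once a window reaches the end of the text. It also omits the 50-word minimum filter so that tiny inputs still produce output. The name `chunk_words` is hypothetical.

```python
from typing import List

def chunk_words(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into word windows that share `overlap` words with the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already covers the end of the text
    return chunks

demo = " ".join(f"w{i}" for i in range(10))
for c in chunk_words(demo, chunk_size=4, overlap=1):
    print(c)
# w0 w1 w2 w3
# w3 w4 w5 w6
# w6 w7 w8 w9
```

Each chunk starts with the last word of the previous one, which is the overlap that helps a retrieved chunk keep some surrounding context.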



**AI Agent with RAG and conversation memory**

In [None]:
!pip install chromadb pypdf2 google-generativeai
from google.colab import userdata
import google.generativeai as genai
import chromadb
from chromadb.utils import embedding_functions
from PyPDF2 import PdfReader
from google.colab import files

# ============================================
# 🔑 API Key
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)

# ============================================
# 📂 Database + Chunking Settings
CHROMA_DB = "paper_db"
CHUNK_SIZE = 800
CHUNK_OVERLAP = 100

# ============================================
# 🧠 Embedding Function (Gemini)
embedding_func = embedding_functions.GoogleGenerativeAiEmbeddingFunction(
    api_key=GOOGLE_API_KEY,
    model_name="models/embedding-001"
)

# Initialize Chroma client
chroma_client = chromadb.PersistentClient(path=CHROMA_DB)

# Reset collection if exists
try:
    chroma_client.delete_collection("papers")
except Exception:
    # The collection does not exist yet, so there is nothing to reset
    pass

collection = chroma_client.create_collection(
    name="papers",
    embedding_function=embedding_func
)

# ============================================
# 📑 PDF Handling
def load_pdf_text(pdf_path):
    reader = PdfReader(pdf_path)
    text = ""
    for page in reader.pages:
        txt = page.extract_text()
        if txt:
            text += txt + "\n"
    return text

def chunk_text(text, chunk_size=800, overlap=100):
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        end = min(start + chunk_size, len(words))
        chunk = " ".join(words[start:end])
        if len(chunk.split()) > 50:
            chunks.append(chunk)
        start += chunk_size - overlap
    return chunks

def upload_and_store_pdfs():
    uploaded = files.upload()
    for filename in uploaded.keys():
        print(f"✅ Uploaded: {filename}")
        doc_text = load_pdf_text(filename)
        chunks = chunk_text(doc_text, CHUNK_SIZE, CHUNK_OVERLAP)
        for i, chunk in enumerate(chunks):
            collection.add(documents=[chunk], ids=[f"{filename}_{i}"])
    print("✅ All PDFs added to ChromaDB")

# ============================================
# 💬 Conversation Memory
chat_history = []

def reset_memory():
    global chat_history
    chat_history = []
    print("🧹 Memory cleared.")

# ============================================
# 🔍 RAG Query (Hybrid: Chroma + Gemini)
def rag_query(question, top_k=4, show_chunks=False, model_name="gemini-1.5-flash"):
    # Search Chroma
    results = collection.query(query_texts=[question], n_results=top_k)
    retrieved_docs = results["documents"][0] if results["documents"] else []

    context = "\n\n".join(retrieved_docs)

    if show_chunks and retrieved_docs:
        print("\n🔍 Retrieved Chunks:")
        for i, d in enumerate(retrieved_docs, start=1):
            print(f"\n--- Chunk {i} ---\n{d[:500]}...\n")

    # Build prompt with history + context
    history_text = "\n".join([f"User: {h['q']}\nAssistant: {h['a']}" for h in chat_history])
    prompt = f"""
    You are a research assistant.
    Use the following:
    1. Retrieved context from papers (if available).
    2. Your world knowledge for missing info.

    If context is insufficient, combine both sources carefully.
    If still unsure, say "I don't know".

    Conversation so far:
    {history_text}

    Context from papers:
    {context}

    Question: {question}
    """

    try:
        model = genai.GenerativeModel(model_name)
        response = model.generate_content(prompt)
        answer = response.text
        # Save to memory
        chat_history.append({"q": question, "a": answer})
        return answer
    except Exception as e:
        return f"❌ Error: {e}"

# ============================================
# 🚀 Usage
# Step 1: Upload PDFs
upload_and_store_pdfs()

# Step 2: Ask Questions
print("\n🔹 RAG Answer:", rag_query("What are the main contributions of this paper?"))
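
Because `chat_history` grows with every question, the prompt built in `rag_query` grows too. One option, shown here only as a sketch, is to include just the most recent turns. The helper `format_history` and its `max_turns` parameter are hypothetical. The entry shape (`q` and `a` keys) matches the dictionaries appended in the cell above.

```python
from typing import Dict, List

def format_history(history: List[Dict[str, str]], max_turns: int = 3) -> str:
    """Render the last `max_turns` question/answer pairs as prompt text."""
    # Guard max_turns <= 0 explicitly: history[-0:] would return the whole list.
    recent = history[-max_turns:] if max_turns > 0 else []
    return "\n".join(f"User: {h['q']}\nAssistant: {h['a']}" for h in recent)

history = [{"q": f"question {i}", "a": f"answer {i}"} for i in range(1, 6)]
print(format_history(history, max_turns=2))
# User: question 4
# Assistant: answer 4
# User: question 5
# Assistant: answer 5
```

In the cell above, `history_text` could be produced by `format_history(chat_history)` instead of joining every stored turn.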


**Voice-enabled RAG Agent**

In [None]:
# ==============================
# Install dependencies
# ==============================
!pip install --upgrade --quiet chromadb PyPDF2 gTTS "click==8.1.8" openai-whisper pydub google-generativeai
from google.colab import userdata
import google.generativeai as genai

# Retrieve the API key from Colab Secrets
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY') # Replace 'GOOGLE_API_KEY' with the name you used

# Configure the Generative AI SDK with your API key
genai.configure(api_key=GOOGLE_API_KEY)

# ==============================
# Imports
# ==============================
import os
import chromadb
from chromadb.utils import embedding_functions
from PyPDF2 import PdfReader
import whisper
from gtts import gTTS
from IPython.display import Audio, display
from google.colab import files


# ==============================
# PDF Upload + Text Chunking
# ==============================
uploaded = files.upload()
pdf_path = list(uploaded.keys())[0]
print(f"✅ Uploaded file: {pdf_path}")

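def extract_pdf_text(pdf_file):
    reader = PdfReader(pdf_file)
    text = ""
    for page in reader.pages:
        # extract_text() can return None or an empty string for pages
        # without a text layer, so skip those instead of concatenating
        txt = page.extract_text()
        if txt:
            text += txt + "\n"
    return text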

def chunk_text(text, chunk_size=1000, overlap=200):
    chunks = []
    start = 0
    while start < len(text):
        end = start + chunk_size
        chunks.append(text[start:end])
        start += chunk_size - overlap
    return chunks

pdf_text = extract_pdf_text(pdf_path)
chunks = chunk_text(pdf_text)

print(f"📄 Extracted {len(chunks)} chunks from PDF")

# ==============================
# Setup ChromaDB + Embeddings
# ==============================
embedding_func = embedding_functions.DefaultEmbeddingFunction()
client = chromadb.Client()
try:
    client.delete_collection("papers")
except Exception:
    # The collection does not exist yet, so there is nothing to reset
    pass

collection = client.create_collection("papers", embedding_function=embedding_func)

for i, chunk in enumerate(chunks):
    collection.add(documents=[chunk], ids=[str(i)])

print("✅ ChromaDB collection built successfully!")

# ==============================
# Whisper Model (Speech-to-Text)
# ==============================
stt_model = whisper.load_model("base")

def speech_to_text(audio_file):
    """Convert speech audio file to text using Whisper"""
    result = stt_model.transcribe(audio_file)
    return result["text"]

# ==============================
# TTS with gTTS
# ==============================
def text_to_speech(text, filename="response.mp3"):
    """Convert text to speech"""
    tts = gTTS(text=text, lang="en")
    tts.save(filename)
    return Audio(filename, autoplay=True)

# ==============================
# RAG Query Function
# ==============================
def rag_query(user_query):
    # Retrieve top 3 chunks from Chroma
    results = collection.query(query_texts=[user_query], n_results=3)
    retrieved_texts = " ".join(results["documents"][0])

    # Combine query with retrieved context
    prompt = f"""
You are an AI assistant. Use the following retrieved text plus your own knowledge
to answer the question.

Retrieved Context:
{retrieved_texts}

User Question:
{user_query}

Answer clearly and concisely:
"""

    # Gemini Response
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content(prompt)
    return response.text

# ==============================
# Interactive Usage
# ==============================
print("✅ System ready! You can now query by text or voice.")

# Example 1: Text Query
user_question = "What are the main contributions of this paper?"
rag_answer = rag_query(user_question)
print("🤖 RAG Answer:", rag_answer)
display(text_to_speech(rag_answer))

# Example 2: Voice Query (Upload audio file)
print("\n🎤 Upload an audio file to ask a question...")
uploaded_audio = files.upload()
audio_path = list(uploaded_audio.keys())[0]

query_text = speech_to_text(audio_path)
print("🎤 Recognized Speech:", query_text)

rag_answer = rag_query(query_text)
print("🤖 RAG Answer:", rag_answer)
display(text_to_speech(rag_answer))




Saving Sample.pdf to Sample.pdf
✅ Uploaded file: Sample.pdf
📄 Extracted 51 chunks from PDF


/root/.cache/chroma/onnx_models/all-MiniLM-L6-v2/onnx.tar.gz: 100%|██████████| 79.3M/79.3M [00:00<00:00, 86.5MiB/s]


✅ ChromaDB collection built successfully!


100%|███████████████████████████████████████| 139M/139M [00:08<00:00, 17.9MiB/s]


✅ System ready! You can now query by text or voice.
🤖 RAG Answer: The provided text is an abstract and references section, not the full paper.  Therefore, its main contributions cannot be determined. The abstract touches upon the theoretical exploration of the asymmetry between electric and magnetic monopoles, posing questions about the universe's preference for electric monopoles and potential implications in alternate realities.  It does not state specific findings or novel contributions.




🎤 Upload an audio file to ask a question...


KeyboardInterrupt: 
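
The run above reports 51 chunks for `Sample.pdf`. With the character-based `chunk_text` in this cell (`chunk_size=1000`, `overlap=200`), the window advances 800 characters per iteration, so the chunk count depends only on the length of the extracted text. The sketch below works that arithmetic out. `count_char_chunks` is a hypothetical helper, and the text lengths used are illustrative, since the actual length of the extracted text is not shown in the output.

```python
def count_char_chunks(text_len: int, chunk_size: int = 1000, overlap: int = 200) -> int:
    """Number of windows produced by a loop that starts at 0 and advances
    by (chunk_size - overlap) while the start index is below text_len."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    return (text_len + step - 1) // step  # integer ceiling of text_len / step

# Any extracted text of 40,001 to 40,800 characters yields 51 chunks.
print(count_char_chunks(40_001), count_char_chunks(40_800))  # 51 51
print(count_char_chunks(40_000), count_char_chunks(40_801))  # 50 52
```

Note that the final window can be much shorter than `chunk_size`, because the loop emits a chunk for every start index below the text length.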