<a href="https://colab.research.google.com/github/LashawnFofung/RAG-Pipelines/blob/main/Gradio/Task_Full_RAG_Pipeline_with_Interactive_Gradio_Chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Full RAG Pipeline with Interactive Gradio Chatbot**

*An end-to-end Retrieval-Augmented Generation pipeline designed to categorize and query complex PDF 'blobs' using Gemini 2.0 and BGE embeddings.*

<br>

This notebook demonstrates a multi-document RAG (Retrieval-Augmented Generation) system. Where a standard RAG pipeline treats all text as a single flat source, this one uses a Large Language Model (LLM) to detect document boundaries and categorize pages, for example identifying where a PaySlip ends and a Contract begins within a single PDF upload.
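
One way to picture the boundary detection described above is as per-page classification followed by merging consecutive pages that share a label. The sketch below is a minimal illustration under that assumption, not the notebook's implementation: `split_by_doc_type` and `keyword_classifier` are hypothetical names, and the keyword rules stand in for a Gemini prompt so the example runs offline.

```python
from itertools import groupby
from typing import Callable, List, Tuple


def split_by_doc_type(
    pages: List[str],
    classify_page: Callable[[str], str],
) -> List[Tuple[str, int, int]]:
    """Label each page, then merge consecutive pages that share a label.

    Returns (doc_type, first_page, last_page) tuples with 1-based page numbers.
    """
    labels = [classify_page(text) for text in pages]
    segments = []
    page_no = 1
    for doc_type, run in groupby(labels):
        length = len(list(run))
        segments.append((doc_type, page_no, page_no + length - 1))
        page_no += length
    return segments


def keyword_classifier(text: str) -> str:
    """Hypothetical stand-in for an LLM call that returns a document type."""
    lowered = text.lower()
    if "net pay" in lowered:
        return "PaySlip"
    if "agreement" in lowered:
        return "Contract"
    return "Other"


pages = [
    "Net pay for March: 3,200",
    "Net pay for April: 3,200",
    "This agreement is made between the parties",
]
segments = split_by_doc_type(pages, keyword_classifier)
print(segments)  # [('PaySlip', 1, 2), ('Contract', 3, 3)]
```

Passing the classifier in as a callable keeps the grouping logic testable without an API key; in the notebook it could be replaced by a function that prompts the LLM with the page text.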

<br>


By combining llama-index for orchestration, sentence-transformers for semantic search, and Google Gemini for reasoning, this tool provides high-precision answers filtered by document metadata.
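
To make "filtered by document metadata" concrete, the following library-free sketch ranks pages by cosine similarity after first discarding pages whose `doc_type` does not match. Everything here is illustrative: the two-dimensional vectors are toy values rather than real BGE embeddings, and `retrieve` is a hypothetical helper, not a LlamaIndex call.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vec: Sequence[float],
    pages: List[Dict],
    doc_type: Optional[str] = None,
    top_k: int = 2,
) -> List[Dict]:
    """Keep only pages of the requested type, then rank them by similarity."""
    candidates = [p for p in pages if doc_type is None or p["doc_type"] == doc_type]
    ranked = sorted(
        candidates, key=lambda p: cosine(query_vec, p["vector"]), reverse=True
    )
    return ranked[:top_k]


# Toy index: each entry mimics a page with metadata and an embedding vector.
toy_pages = [
    {"page": 1, "doc_type": "PaySlip", "vector": [0.9, 0.1]},
    {"page": 2, "doc_type": "Contract", "vector": [0.8, 0.2]},
    {"page": 3, "doc_type": "Contract", "vector": [0.1, 0.9]},
]

unfiltered = retrieve([1.0, 0.0], toy_pages, top_k=1)
filtered = retrieve([1.0, 0.0], toy_pages, doc_type="Contract", top_k=1)
print([p["page"] for p in unfiltered])  # [1]
print([p["page"] for p in filtered])    # [2]
```

Without the filter the PaySlip page wins on similarity alone; with the filter the best Contract page is returned instead, which is the effect the metadata tags are meant to provide.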

<br>

### **✨ Key Features**

- **🧠 Intelligent Document Splitting:** Uses Gemini 2.0 Flash to analyze page transitions, automatically detecting when a new document type starts within a merged PDF.

- **🏷️ Metadata-Aware Indexing:** Every page is tagged with a "Doc Type" (Resume, ID, PaySlip, etc.), allowing for hyper-targeted retrieval.

- **🔍 High-Precision Semantic Search:** Embeds pages with the `BAAI/bge-small-en-v1.5` model, loaded through LlamaIndex's `HuggingFaceEmbedding` wrapper, for vector-similarity retrieval.

- **🔀 Intent-Based Query Routing:** Before searching, the AI analyzes your question to decide which document category contains the answer, reducing "noise" from irrelevant pages.

- **🎨 Custom Gradio Interface:** Features a bordered chatbot UI, a wide-scale status window for real-time analysis logs, and clear visual dividers.

- **🔒 Secure Credential Management:** Fully integrated with Google Colab "Secrets" (🔑) to keep API keys and Hugging Face tokens private.
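
The intent-based routing feature listed above can be sketched as a small prompt-and-parse step that runs before retrieval. This is a hedged illustration only: `CATEGORIES`, `build_routing_prompt`, and `route_query` are hypothetical names, the prompt wording is an assumption, and `fake_llm` replaces the Gemini call so the example runs offline.

```python
from typing import Callable, Optional, Sequence

# Assumed category list, taken from the document types mentioned in this notebook.
CATEGORIES = ("Resume", "ID", "PaySlip", "Contract")


def build_routing_prompt(question: str, categories: Sequence[str]) -> str:
    """Ask the model to name exactly one category for the question."""
    options = ", ".join(categories)
    return (
        "Pick the single document type most likely to answer the question.\n"
        f"Options: {options}\n"
        f"Question: {question}\n"
        "Reply with one option only."
    )


def route_query(
    question: str,
    ask_llm: Callable[[str], str],
    categories: Sequence[str] = CATEGORIES,
) -> Optional[str]:
    """Return the chosen category, or None if the reply is not a known option."""
    reply = ask_llm(build_routing_prompt(question, categories))
    normalized = reply.strip(" .\n\"'").lower()
    for category in categories:
        if category.lower() == normalized:
            return category
    return None  # unrecognized reply: the caller can fall back to searching all pages


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return " payslip.\n"


route = route_query("What was my net pay in March?", fake_llm)
print(route)  # PaySlip
```

Returning `None` for unrecognized replies keeps the router fail-open: a malformed model answer widens the search instead of filtering out every page.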

<br>

### **🛠️ Tech Stack**

<br>

|Component |Technology |
| ---| ---|
|**Orchestration** |LlamaIndex |
|**LLM** |Google Gemini 2.0 Flash|
|**Embeddings** |Sentence-Transformers (BGE)|
|**UI Framework** |Gradio |
|**PDF Parsing** |PyPDF2 / PyMuPDF|

<br>

### **Notebook Structure**
- [⚙️ Step 1: Installation & Configuration](#scrollTo=nGRBzZ23LzFg&line=1&uniqifier=1)
- [📚 Step 2: Main Application (Full Pipeline)](#scrollTo=m0OKGN7vL4Lm&line=1&uniqifier=1)

<br>


# **⚙️ Step 1: Installation & Configuration**

Install the necessary libraries for PDF processing, vector storage, and embedding generation.

In [1]:
# 1. 📦 INSTALLATIONS

# Core RAG Framework & File Readers
!pip install -q llama-index llama-index-readers-file llama-index-embeddings-huggingface

# PDF Parsing
!pip install -q PyPDF2 pymupdf

# AI Models & Embeddings
!pip install -q transformers sentence-transformers google-generativeai

# UI Framework
!pip install -q gradio

# Notebook compatibility: jedi (IPython dependency) and nest-asyncio (nested event loops)
!pip install -q jedi nest-asyncio



# **📚 Step 2: Main Application (Full Pipeline)**

This cell contains the whole application: credential setup, PDF parsing, page-level indexing, retrieval, and the Gradio interface.

In [None]:
import os
from google.colab import userdata
import gradio as gr
import nest_asyncio
import google.generativeai as genai
from PyPDF2 import PdfReader
from llama_index.core import VectorStoreIndex, Document
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

nest_asyncio.apply()

# --- 2. CONFIG & MODELS ---
# Credentials are read from Colab Secrets (the key icon in the left sidebar).

# 1. Gemini API key (required): stop early if it is unavailable
try:
    API_KEY = userdata.get('GEMINI_API_KEY')
except (userdata.SecretNotFoundError, userdata.NotebookAccessError) as e:
    raise RuntimeError(
        "GEMINI_API_KEY is missing or not shared with this notebook. "
        "Add it under Colab Secrets and enable notebook access."
    ) from e
if not API_KEY:
    raise ValueError("GEMINI_API_KEY is empty. Please set it in Colab Secrets.")

# Configure the Gemini library globally
genai.configure(api_key=API_KEY)
print("✅ Gemini API Key successfully loaded and configured.")

# 2. Hugging Face token (optional), stored under the custom secret name HFACE_API_KEY
try:
    HF_TOKEN = userdata.get('HFACE_API_KEY')
except (userdata.SecretNotFoundError, userdata.NotebookAccessError):
    HF_TOKEN = None

if HF_TOKEN:
    # Assign the token to the environment variable Hugging Face checks for
    os.environ["HF_TOKEN"] = HF_TOKEN
    print("✅ Hugging Face API Token loaded and set to os.environ['HF_TOKEN'].")
else:
    print("⚠️ Warning: HFACE_API_KEY not found. Hugging Face models requiring auth may fail.")


# Load Embedding Model
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
current_index = None

# Custom CSS for UI styling
custom_css = """
#chat_container {
    border: 2px solid #4f46e5 !important;
    border-radius: 12px !important;
    padding: 15px !important;
}
.divider {
    margin: 20px 0;
    border-bottom: 2px dashed #e5e7eb;
}
"""

# --- 3. LOGIC ---
def gemini_model(prompt):
    model = genai.GenerativeModel("models/gemini-2.0-flash")
    return model.generate_content(prompt).text.strip()

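def process_blob_pdf(file):
    global current_index
    if file is None:
        return "Please upload a PDF."
    # gr.File may pass a plain path string or an object exposing .name
    reader = PdfReader(getattr(file, "name", file))
    raw_pages = [(page.extract_text() or "") for page in reader.pages]
    # Skip pages with no extractable text (e.g. scanned images) so they are not embedded
    final_docs = [Document(text=t, metadata={"page": i + 1}) for i, t in enumerate(raw_pages) if t.strip()]
    if not final_docs:
        return "⚠️ No extractable text found in this PDF."
    current_index = VectorStoreIndex.from_documents(final_docs, embed_model=embed_model)
    return f"✅ Indexed {len(final_docs)} of {len(raw_pages)} pages.\nReady to answer questions."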

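def chat_with_rag(message, history):
    global current_index
    history = history or []
    # Ignore empty submissions instead of sending a blank question to the model
    if not message or not message.strip():
        return "", history
    # Record the user turn first so it is shown even when no PDF is indexed yet
    history.append({"role": "user", "content": message})
    if current_index is None:
        history.append({"role": "assistant", "content": "Please upload a PDF first."})
        return "", history

    # Retrieve the two most similar pages and pass their text to Gemini as context
    retriever = current_index.as_retriever(similarity_top_k=2)
    results = retriever.retrieve(message)
    context = "\n".join([r.text for r in results])
    answer = gemini_model(f"Context: {context}\n\nQuestion: {message}")

    history.append({"role": "assistant", "content": answer})
    return "", history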

# --- 4. UI LAYOUT ---
# Theme and custom CSS are passed to gr.Blocks
with gr.Blocks(theme=gr.themes.Soft(primary_hue="indigo"), css=custom_css) as demo:
    gr.Markdown("# 📑 Multi-Document Intelligence RAG")

    with gr.Row():
        with gr.Column(scale=2, min_width=400):
            gr.Markdown("### 📥 Document Ingestion")
            file_input = gr.File(label="Upload Merged PDF")
            process_btn = gr.Button("🧠 Analyze & Index", variant="primary")

            status = gr.Textbox(
                label="System Status",
                lines=12,
                placeholder="Processing updates will appear here..."
            )

        with gr.Column(scale=3, min_width=600):
            gr.Markdown("### 💬 AI Knowledge Assistant")

            with gr.Column(elem_id="chat_container"):
                # type="messages" matches the role/content dicts built in chat_with_rag
                chatbot = gr.Chatbot(type="messages", height=550, render_markdown=True)
                gr.HTML("<div class='divider'></div>")
                msg = gr.Textbox(placeholder="Ask a question...", label="Query")
                with gr.Row():
                    submit = gr.Button("Submit", variant="primary")
                    clear = gr.Button("Clear History")

    process_btn.click(process_blob_pdf, inputs=[file_input], outputs=[status])
    submit.click(chat_with_rag, inputs=[msg, chatbot], outputs=[msg, chatbot])
    msg.submit(chat_with_rag, inputs=[msg, chatbot], outputs=[msg, chatbot])
    clear.click(lambda: [], None, chatbot)

# debug=True shows errors in the cell output; share=True creates a temporary public URL
demo.launch(debug=True, share=True)
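
The chat handler above joins the retrieved page texts into one context string. A possible refinement, sketched here under stated assumptions, is to label each chunk with its page number so the model can cite where an answer came from. `build_grounded_prompt` is a hypothetical helper and the prompt wording is illustrative; in the notebook, the `(page, text)` pairs could be built from each result's `r.node.metadata["page"]` and `r.text`.

```python
from typing import List, Tuple


def build_grounded_prompt(chunks: List[Tuple[int, str]], question: str) -> str:
    """Label each retrieved chunk with its page number before prompting."""
    context = "\n\n".join(f"[page {page}] {text.strip()}" for page, text in chunks)
    return (
        "Answer using only the context below. "
        "Cite page numbers in square brackets. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Example chunks are made-up values for illustration.
chunks = [(3, "Gross salary: 4,000 EUR"), (7, "Notice period: 30 days ")]
prompt = build_grounded_prompt(chunks, "What is the notice period?")
print(prompt)
```

Keeping prompt construction in its own function makes it easy to inspect exactly what is sent to the model without launching the UI.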