# 🚀 InsightRAG: Retrieval-Augmented Generation Document Q&A System
Welcome to **InsightRAG**, a system for answering questions about your documents with Retrieval-Augmented Generation (RAG): it retrieves the passages most relevant to a question and hands them to a language model that writes the answer.
 
---
 
### 📄 What Can You Do With This Notebook?
- **Upload and process PDF documents** effortlessly
- **Ask questions** about your document content
- **Receive AI-generated answers** with relevant page citations
- **Save and download session logs** for future reference
 
---
 
### 🧩 System Components Overview
- **PDF Processing**: Extracts and chunks text from your PDF documents
- **Embedding Generation**: Converts text chunks into numerical representations for efficient search
- **Vector Search**: Finds the most relevant document sections for your queries
- **Language Model**: Generates coherent answers using Microsoft's Phi-2
- **Web Interface**: Flask app for uploading PDFs and asking questions in the browser

## ⚙️ Environment Setup & Imports
Before you start, ensure all required packages are installed and the necessary libraries are imported.
 
**Key Libraries:**
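- `PyMuPDF` (imported as `fitz`): Fast PDF processing
- `Flask` & `flask_cors`: Web application framework and CORS support
- `sentence-transformers`: For generating text embeddings
- `transformers`: For loading and running the language model
- `pyngrok` & `nest_asyncio`: ngrok tunnelling and nested event-loop support for serving the app from a notebook
 
> _Tip: Run the next cell to install the extra packages and import everything in one go!_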

In [None]:
!pip install PyMuPDF flask_cors pyngrok nest_asyncio

import re
import os
import fitz
import torch
import logging
import sys
import atexit
import datetime
from pathlib import Path
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModelForCausalLM
from flask import Flask, request, jsonify, render_template_string, redirect
from werkzeug.utils import secure_filename
from flask_cors import CORS
from transformers import BitsAndBytesConfig
import nest_asyncio
import warnings
from pyngrok import ngrok
import glob

# Directory where uploaded PDFs are stored (Colab path)
COLAB_UPLOADS_PATH = "/content/uploads"
os.makedirs(COLAB_UPLOADS_PATH, exist_ok=True)

## 📝 Logging System Setup
A robust logging system is configured to help you:
- Track all queries and responses
- Monitor retrieved document chunks
- Measure system performance
- Capture error messages for debugging
 
**Log Outputs:**
1. Detailed technical logs (`.log` files)
2. Human-readable session summaries (`.txt` files)
 
> _Logs are automatically saved for every session!_
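
The two-destination setup described above can be sketched with the standard library alone. This is a minimal sketch rather than the notebook's exact configuration: the helper `build_logger`, the logger name `rag_demo`, and the temporary directory are illustrative assumptions.

```python
import logging
import sys
import tempfile
from pathlib import Path


def build_logger(log_path, name="rag_demo"):
    """Return a logger that writes each record to stdout and to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers when the cell is re-run
    formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    for handler in (logging.StreamHandler(sys.stdout), logging.FileHandler(log_path)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


log_dir = Path(tempfile.mkdtemp())  # stand-in for /content/rag_logs (assumption)
log_file = log_dir / "rag_session_demo.log"
demo_logger = build_logger(log_file)
demo_logger.info("Query received: %s", "What is RAG?")
```

Clearing the handlers first matters in notebooks, where re-running a cell would otherwise attach a second pair of handlers and duplicate every log line.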

In [None]:
warnings.filterwarnings('ignore')

# Setup directories and logging
logs_dir = Path("/content/rag_logs")
logs_dir.mkdir(exist_ok=True)
uploads_dir = Path("/content/uploads")
uploads_dir.mkdir(exist_ok=True)

session_timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
log_filename = logs_dir / f"rag_session_{session_timestamp}.log"

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.StreamHandler(sys.stdout),
        logging.FileHandler(log_filename)
    ]
)
logger = logging.getLogger(__name__)
session_history = []

## PDF Processing & Embedding Generation
This section defines the core functions to:
1. **Extract text** from PDF documents
2. **Split text** into manageable chunks for efficient retrieval
3. **Generate embeddings** for each chunk using a transformer model
4. **Store embeddings** for fast and accurate search
 
> _Uses the `all-mpnet-base-v2` sentence-transformers model, which maps each chunk to a 768-dimensional vector._
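
To make the chunking step concrete, here is a small sketch that splits page text on blank lines and keeps the page number and source name next to each chunk. The function name `chunk_pages` and the sample strings are hypothetical; the list `pages` stands in for the text that PyMuPDF's `page.get_text()` would return for each page.

```python
def chunk_pages(page_texts, source_name):
    """Split each page on blank lines and keep page/source metadata per chunk."""
    chunks, page_numbers, sources = [], [], []
    for page_num, text in enumerate(page_texts, start=1):
        for piece in text.split("\n\n"):
            piece = piece.strip()
            if piece:  # skip empty fragments
                chunks.append(piece)
                page_numbers.append(page_num)
                sources.append(source_name)
    return chunks, page_numbers, sources


# Hypothetical page texts; a real run would read them from a PDF
pages = ["Intro paragraph.\n\nSecond paragraph.", "\n\nOnly paragraph on page two.\n\n"]
chunks, page_numbers, sources = chunk_pages(pages, "demo.pdf")
print(list(zip(chunks, page_numbers)))
```

Keeping three parallel lists means index `i` always identifies one chunk together with its page and source, which is what the citation step relies on later.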

In [None]:
def save_session_summary():
    try:
        summary_filename = logs_dir / f"session_summary_{session_timestamp}.txt"
        with open(summary_filename, 'w', encoding='utf-8') as f:
            f.write(f"RAG Session Summary - {datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")
            f.write("="*80 + "\n\n")
            if session_history:
                for i, interaction in enumerate(session_history, 1):
                    f.write(f"Interaction {i}:\n")
                    f.write("-"*40 + "\n")
                    f.write(f"Query: {interaction['query']}\n")
                    f.write(f"Timestamp: {interaction['timestamp']}\n")
                    f.write(f"Retrieved Pages: {interaction['citations']}\n")
                    f.write(f"Response: {interaction['response']}\n")
                    f.write("\n" + "="*40 + "\n\n")
            else:
                f.write("No queries were made during this session.\n")
        logger.info(f"Session summary saved to {summary_filename}")
    except Exception as e:
        logger.error(f"Error saving session summary: {e}")

def process_input_source(source):
    pdf_files = []
    if isinstance(source, str):
        if os.path.isdir(source):
            pdf_files.extend(glob.glob(os.path.join(source, "*.pdf")))
        elif os.path.isfile(source) and source.lower().endswith('.pdf'):
            pdf_files.append(source)
    return pdf_files

def process_multiple_pdfs(pdf_paths):
    all_chunks = []
    all_page_numbers = []
    all_sources = []

    for pdf_path in pdf_paths:
        try:
            doc = fitz.open(pdf_path)
            pdf_name = os.path.basename(pdf_path)
            chunk_count = 0  # chunks extracted from this PDF across all pages
            for page_num, page in enumerate(doc, 1):
                text = page.get_text()
                # Treat blocks separated by blank lines as individual chunks
                for chunk in text.split('\n\n'):
                    if chunk.strip():
                        all_chunks.append(chunk.strip())
                        all_page_numbers.append(page_num)
                        all_sources.append(pdf_name)
                        chunk_count += 1
            doc.close()
            logger.info(f"Processed {pdf_name}: {chunk_count} chunks extracted")
        except Exception as e:
            logger.error(f"Error processing {pdf_path}: {str(e)}")
    return all_chunks, all_page_numbers, all_sources

embedding_model = SentenceTransformer('all-mpnet-base-v2')
def generate_embeddings(text_chunks):
    return embedding_model.encode(text_chunks)

## 🗂️ Vector Store Implementation
A simple, memory-based vector store is implemented to:
- Store text chunks and their embeddings
- Perform fast similarity search using cosine similarity
- Return the most relevant text chunks with page numbers and sources
- Log similarity scores for transparency and debugging
 
> _Best for documents up to a few thousand pages. For larger datasets, consider a scalable vector database._
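
The ranking idea behind the store can be illustrated without any tensor library. The sketch below computes cosine similarity in plain Python and returns the best-scoring indices; `cosine_similarity`, `top_k_indices`, and the two-dimensional toy vectors are illustrative assumptions, not the notebook's own code, which works on PyTorch tensors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


def top_k_indices(query, embeddings, k=3):
    """Return (index, score) pairs for the k most similar embeddings."""
    scores = [cosine_similarity(query, e) for e in embeddings]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k]]


toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
results = top_k_indices([1.0, 0.1], toy_embeddings, k=2)
print(results)
```

Because cosine similarity ignores vector length, a long chunk and a short chunk about the same topic can score equally well against a query.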

In [None]:
class SimpleVectorStore:
    def __init__(self, embeddings, texts, page_numbers, sources):
        self.embeddings = embeddings
        self.texts = texts
        self.page_numbers = page_numbers
        self.sources = sources

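    def search(self, query_embedding, top_k=3):
        # Cosine similarity between the query (1 x d) and every stored embedding (n x d)
        similarities = torch.nn.functional.cosine_similarity(
            torch.tensor(query_embedding).unsqueeze(0),
            torch.tensor(self.embeddings)
        )
        # Convert tensor indices to plain ints before indexing the Python lists
        top_indices = similarities.argsort(descending=True)[:top_k].tolist()
        for i in top_indices:
            logger.info(f"Similarity {similarities[i].item():.4f}: {self.sources[i]} (Page {self.page_numbers[i]})")
        return [(self.texts[i], self.page_numbers[i], self.sources[i]) for i in top_indices]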

## 🤖 Language Model Setup: Microsoft Phi-2
This section configures the **Phi-2** language model for generating answers:
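- Loads the model in 16-bit floating point (`torch.float16`) to reduce memory use
- Sets up prompt templates for high-quality responses
- Configures sampling parameters (`temperature=0.7`, `top_p=0.95`, up to 500 new tokens)
- Strips the echoed prompt by keeping only the text after the last `Response:` or `Answer:` marker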
 
**Why Phi-2?**
- Excellent performance on knowledge-based tasks
- Efficient resource usage (runs locally)
- Reliable, coherent answers
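
A causal language model returns the prompt followed by its continuation, so the answer has to be separated from the echoed prompt. The sketch below shows that clean-up step on a plain string; the helper name `extract_answer` and the sample text are hypothetical, and no model is loaded.

```python
def extract_answer(decoded_text, markers=("Response:", "Answer:")):
    """Return the text after the last occurrence of the first marker found."""
    for marker in markers:
        if marker in decoded_text:
            return decoded_text.split(marker)[-1].strip()
    return decoded_text.strip()  # no marker: return the text unchanged


# Hypothetical decoded output: the prompt followed by the model's continuation
decoded = "Question: What is RAG?\n\nAnswer: Retrieval-Augmented Generation."
print(extract_answer(decoded))
```

One limitation of marker-based clean-up is worth keeping in mind: if the generated answer itself contains the marker text, everything before its last occurrence is discarded.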

In [None]:
model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)

def generate_response(prompt, max_new_tokens=500):
    try:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            do_sample=True,
            top_p=0.95,
            pad_token_id=tokenizer.eos_token_id
        )
        response = tokenizer.decode(outputs[0], skip_special_tokens=True)
        cleaned_response = response
        if "Response:" in response:
            cleaned_response = response.split("Response:")[-1].strip()
        elif "Answer:" in response:
            cleaned_response = response.split("Answer:")[-1].strip()
        return cleaned_response
    except Exception as e:
        logger.error(f"Error in generate_response: {str(e)}")
        raise  # re-raise with the original traceback

## 🔗 RAG Pipeline: Bringing It All Together
This section combines all components into a seamless RAG pipeline:
1. **Process incoming queries** from the user
2. **Retrieve relevant context** from your documents
3. **Generate prompts** for the language model
4. **Produce answers** with clear citations
5. **Log all interactions** for transparency
 
**Pipeline Features:**
- Accurate context retrieval
- Automatic citation tracking
- Robust error handling
- Clean, user-friendly responses
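
Steps 2 and 3 can be sketched as two small functions: one that packs retrieved chunks into a character budget while recording citations, and one that assembles the prompt. The names `build_context` and `build_prompt`, the `Context:` label, and the sample hits are assumptions made for illustration; the tuple layout `(text, page, source)` follows what the vector store returns.

```python
def build_context(hits, max_chars=5000):
    """Join de-duplicated chunks until the character budget is used up."""
    seen, parts, citations, used = set(), [], [], 0
    for text, page, source in hits:
        text = text.strip()
        if not text or text in seen:
            continue  # skip empty or repeated chunks
        seen.add(text)
        if used + len(text) > max_chars:
            break  # the next chunk would exceed the budget
        parts.append(text)
        citations.append(f"{source} (Page {page})")  # cite only what is used
        used += len(text)
    return " ".join(parts), citations


def build_prompt(context, question):
    """Place the retrieved context ahead of the question (layout is an assumption)."""
    return f"Context: {context}\n\nQuestion: {question}\n\nAnswer:"


hits = [
    ("Phi-2 is a small language model.", 3, "demo.pdf"),
    ("Phi-2 is a small language model.", 7, "demo.pdf"),  # duplicate text
    ("It runs in 16-bit precision here.", 4, "demo.pdf"),
]
context, citations = build_context(hits, max_chars=80)
prompt = build_prompt(context, "What is Phi-2?")
print(citations)
```

Recording a citation only after a chunk has been accepted keeps the source list in step with the text the model actually sees.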

In [None]:
def convert_markdown_to_html(text):
    """Convert markdown-style formatting to HTML."""
    # Convert **text** to <strong>text</strong>
    text = re.sub(r'\*\*(.*?)\*\*', r'<strong>\1</strong>', text)
    # Convert __text__ to <strong>text</strong> (alternative bold syntax)
    text = re.sub(r'__(.*?)__', r'<strong>\1</strong>', text)
    return text

def remove_duplicate_lines(text):
    """Removes repeated lines from the model's output."""
    lines = text.split("\n")
    seen = set()
    new_lines = []
    for line in lines:
        stripped_line = line.strip()
        if stripped_line and stripped_line not in seen:
            seen.add(stripped_line)
            new_lines.append(line)
    return "\n".join(new_lines)

def remove_repeated_questions(text, question):
    """Ensures the question does not appear multiple times in the response."""
    pattern = re.compile(re.escape(question), re.IGNORECASE)
    occurrences = pattern.findall(text)
    if len(occurrences) > 1:
        first_occurrence = True
        def replacer(match):
            nonlocal first_occurrence
            if first_occurrence:
                first_occurrence = False
                return match.group(0)
            else:
                return ""
        text = pattern.sub(replacer, text)
    return text

def format_answer(question, answer, citations):
    """Formats the final answer in HTML with enhanced styling."""

    # Convert markdown formatting in the answer
    answer = convert_markdown_to_html(answer)

    def structure_content(text):
        sections = text.split('\n\n')
        main_content = sections[0]

        lists = []
        for section in sections[1:]:
            if any(line.strip().startswith(('•', '-', '*', '1.', '2.', '3.')) for line in section.split('\n')):
                lists.append(section)

        return main_content, lists

    main_content, additional_sections = structure_content(answer)

    # Format lists if they exist
    list_html = ""
    if additional_sections:
        list_html = "<div class='additional-content'>"
        for section in additional_sections:
            items = [line.strip().lstrip('•-*123456789. ') for line in section.split('\n') if line.strip()]
            list_html += "<ul class='content-list'>"
            for item in items:
                list_html += f"<li>{convert_markdown_to_html(item)}</li>"
            list_html += "</ul>"
        list_html += "</div>"

    # Format citations
    citations_html = " ".join(
        f'<span class="citation-tag">{citation}</span>'
        for citation in citations
    )

    return f"""
    <div class="answer-message">
        <div class="answer-header">
            <div class="question-text">Q: {question}</div>
        </div>

        <div class="answer-content">
            <div class="main-content">{main_content}</div>
            {list_html}
        </div>

        <div class="answer-footer">
            <div class="citations-section">
                <span class="citations-label">Sources:</span>
                <div class="citations-container">{citations_html}</div>
            </div>
        </div>
    </div>
    """

def rag_pipeline(query, vector_store, max_context_length=5000):
    try:
        query_embedding = embedding_model.encode([query])[0]
        relevant_texts_with_info = vector_store.search(query_embedding)
        citations = []
        context_parts = []
        total_length = 0
        context_set = set()

        for text, page, source in relevant_texts_with_info:
            cleaned_text = text.strip()
            if cleaned_text in context_set:
                continue  # skip duplicate chunks
            context_set.add(cleaned_text)
            # Truncate an oversized chunk so that it can still fit the budget
            if len(cleaned_text) > max_context_length:
                cleaned_text = cleaned_text[:max_context_length - 3] + "..."
            if total_length + len(cleaned_text) > max_context_length:
                break
            # Cite a page only when its text actually enters the context
            context_parts.append(cleaned_text)
            citations.append(f"{source} (Page {page})")
            total_length += len(cleaned_text)

        context = " ".join(context_parts)

        prompt = f"""You are a chatbot that answers questions concisely. The context contains information retrieved from a vector store by similarity search, and the question comes from a user of a web chatbot. Answer the question using only the context. If the context is not sufficient to answer accurately, say that you do not have enough information to give a reliable answer. Do not refer to the context in the answer and do not add references, because citations are shown separately. Do not start with a prefix such as 'according to the context'; start with the answer itself. Include troubleshooting steps only when the question asks for troubleshooting and the context contains them. The output should contain only the answer. If no context is provided, respond \"Sorry. I don't find information for the question.\" and do not make up an answer.

Context: {context}

Question: {query}

Answer:"""

        response = generate_response(prompt)
        response = remove_duplicate_lines(response)
        response = remove_repeated_questions(response, query)

        session_history.append({
            'timestamp': datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
            'query': query,
            'citations': citations,
            'response': response
        })

        return format_answer(query, response, citations)
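    except Exception as e:
        # Record the failure so that it appears in the session log
        logger.error(f"Error in rag_pipeline: {e}")
        return """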
        <div class="error">
            I apologize, but I encountered an error while processing your query.
            Please try asking a more specific or shorter question.
        </div>"""

## 🌐 Web Interface: Modern Q&A Experience
A sleek, Flask-based web interface is provided with:
- **Modern, responsive design** for all devices
- **Real-time interaction** and instant feedback
- **Question history tracking** for easy review
- **Session management** and log downloading
 
**Colab-Optimized Features:**
- Loading indicators and error handling
- Clear citation display
- Easy session control and export
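
The upload page's script posts the selected files to `/setup` and then reads the `success` and `error` fields of the JSON reply. The sketch below shows one way to build such a reply from a list of uploaded filenames. It is a plain-Python illustration under stated assumptions: the function name `build_setup_response` and the `files` key are hypothetical, and the notebook's actual route handler is not reproduced here.

```python
def build_setup_response(filenames):
    """Return a JSON-style payload with the fields the upload page reads."""
    # Keep PDFs only, matching the case-insensitive check done in the browser
    pdf_names = [name for name in filenames if name.lower().endswith(".pdf")]
    if not pdf_names:
        return {"success": False, "error": "No PDF files were uploaded."}
    # "files" is an illustrative extra field, not something the page requires
    return {"success": True, "files": pdf_names}


print(build_setup_response(["report.pdf", "notes.txt", "SCAN.PDF"]))
```

Repeating the extension check on the server is a deliberate choice, since a check done only in the browser can be bypassed by any client that posts to the endpoint directly.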

In [None]:
INITIAL_TEMPLATE = '''
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>InsightRAG - Document Setup</title>
    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&display=swap" rel="stylesheet">
    <style>
        :root {
            --primary-color: #3182ce;
            --secondary-color: #2c5282;
            --background-light: #f0f4f8;
            --text-dark: #1a202c;
            --text-muted: #4a5568;
            --border-color: #e2e8f0;
            --white: #ffffff;
            --error-color: #e53e3e;
        }

        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }

        body {
            font-family: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;
            line-height: 1.6;
            color: var(--text-dark);
            background-color: var(--background-light);
            display: flex;
            justify-content: center;
            align-items: center;
            min-height: 100vh;
            padding: 1rem;
        }

        .container {
            background: var(--white);
            border-radius: 16px;
            box-shadow: 0 12px 32px rgba(0, 0, 0, 0.08), 0 4px 8px rgba(0, 0, 0, 0.04);
            width: 100%;
            max-width: 600px;
            padding: 2.5rem;
            animation: fadeIn 0.5s ease-out;
        }

        @keyframes fadeIn {
            from { opacity: 0; transform: translateY(20px); }
            to { opacity: 1; transform: translateY(0); }
        }

        .app-logo {
            display: flex;
            justify-content: center;
            align-items: center;
            margin-bottom: 1.5rem;
            font-size: 3rem;
        }

        .title {
            text-align: center;
            margin-bottom: 1.5rem;
            color: var(--text-dark);
        }

        .title h1 {
            font-size: 1.75rem;
            font-weight: 700;
        }

        .subtitle {
            text-align: center;
            color: var(--text-muted);
            margin-bottom: 2rem;
        }

        .input-methods {
            display: grid;
            gap: 1rem;
        }

        .method-card {
            border: 2px solid var(--border-color);
            border-radius: 12px;
            padding: 1.25rem;
            display: flex;
            align-items: center;
            gap: 1rem;
            cursor: pointer;
            transition: all 0.3s ease;
        }

        .method-card:hover {
            border-color: var(--primary-color);
            box-shadow: 0 8px 24px rgba(49, 130, 206, 0.1);
        }

        .method-card.active {
            border-color: var(--primary-color);
            background-color: rgba(49, 130, 206, 0.05);
        }

        .method-icon {
            font-size: 2rem;
            color: var(--primary-color);
        }

        .method-details {
            flex-grow: 1;
        }

        .method-title {
            font-weight: 600;
            margin-bottom: 0.25rem;
        }

        .method-description {
            color: var(--text-muted);
            font-size: 0.875rem;
        }

        .input-section {
            display: none;
            margin-top: 1.5rem;
            animation: slideIn 0.3s ease-out;
        }

        .input-section.active {
            display: block;
        }

        @keyframes slideIn {
            from { opacity: 0; transform: translateY(-10px); }
            to { opacity: 1; transform: translateY(0); }
        }

        .input-field {
            width: 100%;
            padding: 0.75rem 1rem;
            border: 2px solid var(--border-color);
            border-radius: 8px;
            font-size: 1rem;
            transition: all 0.3s ease;
        }

        .input-field:focus {
            outline: none;
            border-color: var(--primary-color);
            box-shadow: 0 0 0 3px rgba(49, 130, 206, 0.1);
        }

        .submit-button {
            width: 100%;
            padding: 0.875rem 1.25rem;
            background-color: var(--primary-color);
            color: var(--white);
            border: none;
            border-radius: 8px;
            font-weight: 600;
            cursor: pointer;
            transition: all 0.3s ease;
            margin-top: 1rem;
        }

        .submit-button:hover {
            background-color: var(--secondary-color);
        }

        .submit-button:disabled {
            background-color: var(--border-color);
            cursor: not-allowed;
        }

        .error-message {
            color: var(--error-color);
            margin-top: 0.5rem;
            text-align: center;
            font-size: 0.875rem;
        }

        .loading-spinner {
            border: 4px solid rgba(49, 130, 206, 0.1);
            border-top: 4px solid var(--primary-color);
            border-radius: 50%;
            width: 40px;
            height: 40px;
            animation: spin 1s linear infinite;
            margin: 1rem auto;
            display: none;
        }

        .loading-text {
            text-align: center;
            color: var(--text-muted);
            margin-top: 0.5rem;
            font-size: 0.875rem;
            display: none;
        }

        @keyframes spin {
            0% { transform: rotate(0deg); }
            100% { transform: rotate(360deg); }
        }

        @media (max-width: 480px) {
            .container {
                padding: 1.5rem;
            }
        }
    </style>
</head>
<body>
    <div class="container">
        <div class="app-logo">📚</div>
        <div class="title">
            <h1>InsightRAG</h1>
        </div>
        <div class="subtitle">
            Upload documents to start your intelligent Q&A journey
        </div>

        <div class="input-methods">
            <div id="method-1" class="method-card" onclick="showInputSection('1')">
                <div class="method-icon">📤</div>
                <div class="method-details">
                    <div class="method-title">Upload PDF Files</div>
                    <div class="method-description">Select and upload documents from your computer</div>
                </div>
            </div>

            <div id="method-3" class="method-card" onclick="showInputSection('3')">
                <div class="method-icon">📂</div>
                <div class="method-details">
                    <div class="method-title">Local Directory</div>
                    <div class="method-description">Upload PDF files from a local directory</div>
                </div>
            </div>
        </div>

        <div id="input-1" class="input-section">
            <input
                type="file"
                id="pdf-file"
                class="input-field"
                accept=".pdf"
                multiple
                onchange="handleFileSelect(this)"
                aria-label="PDF file upload"
            >
            <div id="file-count-1" style="margin-top: 0.5rem; color: var(--text-muted);"></div>
            <button
                id="submit-1"
                class="submit-button"
                onclick="submitForm('1')"
                disabled
            >
                Process PDFs
            </button>
        </div>

        <div id="input-3" class="input-section">
            <input
                type="file"
                id="dir-files"
                class="input-field"
                webkitdirectory
                directory
                multiple
                onchange="handleDirectorySelect(this)"
                aria-label="Directory upload"
            >
            <div id="file-count-3" style="margin-top: 0.5rem; color: var(--text-muted);"></div>
            <button
                id="submit-3"
                class="submit-button"
                onclick="submitForm('3')"
                disabled
            >
                Process Directory
            </button>
        </div>

        <div id="error" class="error-message"></div>
        <div id="loading" class="loading-spinner"></div>
        <div id="loading-text" class="loading-text">Processing PDF files...</div>
    </div>

    <script>
        function handleFileSelect(input) {
            const fileCount = Array.from(input.files).filter(file =>
                file.name.toLowerCase().endsWith('.pdf')).length;
            updateFileCount('1', fileCount);
        }

        function handleDirectorySelect(input) {
            const fileCount = Array.from(input.files).filter(file =>
                file.name.toLowerCase().endsWith('.pdf')).length;
            updateFileCount('3', fileCount);
        }

        function updateFileCount(method, count) {
            const fileCountDiv = document.getElementById(`file-count-${method}`);
            const submitButton = document.getElementById(`submit-${method}`);

            if (count > 0) {
                fileCountDiv.textContent = `Selected ${count} PDF file${count === 1 ? '' : 's'}`;
                submitButton.disabled = false;
            } else {
                fileCountDiv.textContent = 'No PDF files selected';
                submitButton.disabled = true;
            }
        }

        function submitForm(method) {
            const fileInput = method === '1' ?
                document.getElementById('pdf-file') :
                document.getElementById('dir-files');
            const submitButton = document.getElementById(`submit-${method}`);
            const errorElement = document.getElementById('error');
            const loadingSpinner = document.getElementById('loading');
            const loadingText = document.getElementById('loading-text');

            // Reset previous states
            errorElement.textContent = '';

            // Validate files
            if (!fileInput.files || fileInput.files.length === 0) {
                errorElement.textContent = 'Please select files first';
                return;
            }

            // Create FormData and append files
            const formData = new FormData();
            formData.append('method', method);

            // Log the number of files being processed
            console.log(`Processing ${fileInput.files.length} files...`);

            // Add all files to formData
            Array.from(fileInput.files).forEach(file => {
                if (file.name.toLowerCase().endsWith('.pdf')) {
                    formData.append('file', file);
                    console.log(`Adding file: ${file.name}`);
                }
            });

            // Show loading state
            submitButton.disabled = true;
            loadingSpinner.style.display = 'block';
            loadingText.style.display = 'block';
            loadingText.textContent = `Processing ${fileInput.files.length} files...`;

            // Send to server
            fetch('/setup', {
                method: 'POST',
                body: formData
            })
            .then(response => {
                if (!response.ok) {
                    throw new Error(`HTTP error! status: ${response.status}`);
                }
                return response.json();
            })
            .then(data => {
                console.log('Server response:', data);
                if (data.success) {
                    window.location.href = '/chat';
                } else {
                    throw new Error(data.error || 'Unknown error occurred');
                }
            })
            .catch(error => {
                console.error('Error:', error);
                errorElement.textContent = error.message || 'Failed to process files. Please try again.';
            })
            .finally(() => {
                submitButton.disabled = false;
                loadingSpinner.style.display = 'none';
                loadingText.style.display = 'none';
            });
        }

        function showInputSection(method) {
            const methodCards = document.querySelectorAll('.method-card');
            const inputSections = document.querySelectorAll('.input-section');

            methodCards.forEach(card => card.classList.remove('active'));
            document.getElementById(`method-${method}`).classList.add('active');

            inputSections.forEach(section => section.classList.remove('active'));
            document.getElementById(`input-${method}`).classList.add('active');
        }

        // Initialize when the document loads
        document.addEventListener('DOMContentLoaded', () => {
            showInputSection('1');
        });
    </script>
</body>
</html>
'''

HTML_TEMPLATE = '''
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>InsightRAG - Document Q&A</title>
    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&display=swap" rel="stylesheet">
    <style>
        /* Root Variables */
        :root {
            --primary-color: #3182ce;
            --secondary-color: #2c5282;
            --success-color: #48bb78;
            --background-light: #f0f4f8;
            --text-dark: #1a202c;
            --text-muted: #4a5568;
            --border-color: #e2e8f0;
            --white: #ffffff;
            --error-color: #e53e3e;
            --chat-bg: #f8fafc;
        }

        /* Base Styles */
        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }

        body {
            font-family: 'Inter', -apple-system, BlinkMacSystemFont, sans-serif;
            line-height: 1.6;
            color: var(--text-dark);
            background-color: var(--background-light);
            display: flex;
            justify-content: center;
            min-height: 100vh;
            padding: 1rem;
        }

        /* Chat Container */
        .chat-container {
            display: flex;
            flex-direction: column;
            height: 100vh;
            max-height: 100vh;
            width: 100%;
            max-width: 1000px;
            background: var(--white);
            border-radius: 16px;
            box-shadow: 0 12px 32px rgba(0, 0, 0, 0.08);
            overflow: hidden;
        }

        /* Chat Header */
        .chat-header {
            display: flex;
            justify-content: space-between;
            align-items: center;
            padding: 1.25rem 1.5rem;
            border-bottom: 1px solid var(--border-color);
            background-color: var(--white);
            z-index: 100;
        }

        .chat-header-left {
            display: flex;
            align-items: center;
            gap: 12px;
        }

        .app-logo {
            width: 40px;
            height: 40px;
            border-radius: 10px;
            background-color: var(--primary-color);
            display: flex;
            align-items: center;
            justify-content: center;
            color: var(--white);
            font-size: 20px;
        }

        .app-info h1 {
            font-size: 1.25rem;
            color: var(--text-dark);
        }

        .app-info p {
            font-size: 0.875rem;
            color: var(--text-muted);
        }

        /* Intro Message */
        .intro-message {
            text-align: center;
            padding: 2rem 1rem;
            max-width: 600px;
            margin: 2rem auto;
            background: var(--white);
            border-radius: 12px;
            box-shadow: 0 4px 6px rgba(0, 0, 0, 0.05);
            animation: fadeIn 0.5s ease-out;
        }

        .intro-icon {
            font-size: 3rem;
            margin-bottom: 1rem;
        }

        .intro-message h2 {
            color: var(--text-dark);
            margin-bottom: 1rem;
            font-size: 1.5rem;
        }

        .intro-message p {
            color: var(--text-muted);
            margin-bottom: 1.5rem;
            line-height: 1.6;
        }

        .intro-tips {
            text-align: left;
            background: var(--chat-bg);
            padding: 1.5rem;
            border-radius: 8px;
            margin-top: 1.5rem;
        }

        .intro-tips h3 {
            color: var(--text-dark);
            font-size: 1rem;
            margin-bottom: 0.75rem;
        }

        .intro-tips ul {
            list-style: none;
            padding: 0;
        }

        .intro-tips li {
            color: var(--text-muted);
            margin-bottom: 0.5rem;
            padding-left: 1.5rem;
            position: relative;
        }

        .intro-tips li::before {
            content: '•';
            color: var(--primary-color);
            position: absolute;
            left: 0.5rem;
        }

        /* Chat Content Area */
        .chat-content {
            flex: 1;
            overflow-y: auto;
            padding: 1rem;
            scroll-behavior: smooth;
            position: relative;
            display: flex;
            flex-direction: column;
        }

        .message-container {
            flex: 1;
            display: flex;
            flex-direction: column;
            gap: 20px;
            padding-bottom: 1rem;
        }

        /* Message Styles */
        .message {
            display: flex;
            gap: 12px;
            max-width: 85%;
            animation: slideIn 0.3s ease-out;
        }

        .message.user-message {
            flex-direction: row-reverse;
            margin-left: auto;
        }

        /* Avatar Styles */
        .avatar {
            width: 40px;
            height: 40px;
            border-radius: 50%;
            flex-shrink: 0;
            display: flex;
            align-items: center;
            justify-content: center;
            font-size: 20px;
            color: var(--white);
            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
        }

        .avatar.ai {
            background-color: var(--primary-color);
        }

        .avatar.user {
            background-color: var(--success-color);
        }

        /* Message Content */
        .message-content {
            background: var(--white);
            padding: 12px 16px;
            border-radius: 12px;
            box-shadow: 0 2px 4px rgba(0,0,0,0.05);
            position: relative;
            transition: all 0.3s ease;
        }

        .user-message .message-content {
            background: var(--primary-color);
            color: var(--white);
        }

        .message-time {
            font-size: 0.75rem;
            color: var(--text-muted);
            margin-top: 4px;
        }

        /* Answer Styles */
        .answer-message {
            background: var(--white);
            border-radius: 12px;
            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
            margin: 1rem 0;
            overflow: hidden;
            transition: all 0.3s ease;
        }

        .answer-message:hover {
            box-shadow: 0 4px 8px rgba(0,0,0,0.15);
        }

        .answer-header {
            padding: 1rem;
            border-bottom: 1px solid var(--border-color);
        }

        .question-text {
            font-weight: 600;
            color: var(--text-dark);
        }

        .answer-content {
            padding: 1rem;
        }

        .main-content {
            color: var(--text-dark);
            line-height: 1.6;
            margin-bottom: 1rem;
        }

        .additional-content {
            margin-top: 1rem;
            padding-top: 1rem;
            border-top: 1px solid var(--border-color);
        }

        .content-list {
            list-style-type: none;
            padding-left: 1rem;
        }

        .content-list li {
            position: relative;
            padding-left: 1.5rem;
            margin-bottom: 0.5rem;
            line-height: 1.5;
        }

        .content-list li::before {
            content: '•';
            position: absolute;
            left: 0;
            color: var(--primary-color);
        }

        .answer-footer {
            padding: 1rem;
            background: var(--chat-bg);
            border-top: 1px solid var(--border-color);
        }

        .citations-section {
            display: flex;
            flex-wrap: wrap;
            gap: 0.5rem;
            align-items: center;
        }

        .citations-label {
            font-weight: 500;
            color: var(--text-muted);
        }

        .citations-container {
            display: flex;
            flex-wrap: wrap;
            gap: 0.5rem;
        }

        .citation-tag {
            background: rgba(49, 130, 206, 0.1);
            color: var(--primary-color);
            padding: 0.25rem 0.5rem;
            border-radius: 4px;
            font-size: 0.875rem;
            transition: all 0.2s ease;
        }

        .citation-tag:hover {
            background: rgba(49, 130, 206, 0.2);
        }

        /* Chat Input Area */
        .chat-input-area {
            position: sticky;
            bottom: 0;
            left: 0;
            right: 0;
            background: var(--white);
            padding: 1rem 1.5rem;
            border-top: 1px solid var(--border-color);
            z-index: 100;
            display: flex;
            gap: 12px;
        }

        .chat-input {
            flex-grow: 1;
            padding: 0.75rem 1rem;
            border: 2px solid var(--border-color);
            border-radius: 8px;
            font-size: 1rem;
            transition: all 0.3s ease;
        }

        .chat-input:focus {
            outline: none;
            border-color: var(--primary-color);
            box-shadow: 0 0 0 3px rgba(49, 130, 206, 0.1);
        }

        .chat-submit-btn {
            padding: 0.75rem 1.5rem;
            background-color: var(--primary-color);
            color: var(--white);
            border: none;
            border-radius: 8px;
            font-weight: 500;
            cursor: pointer;
            transition: all 0.3s ease;
            display: flex;
            align-items: center;
            gap: 8px;
        }

        .chat-submit-btn:disabled {
            background-color: var(--border-color);
            cursor: not-allowed;
        }

        /* New Messages Indicator */
        .new-messages-indicator {
            position: fixed;
            bottom: 80px;
            left: 50%;
            transform: translateX(-50%);
            background: var(--primary-color);
            color: white;
            padding: 8px 16px;
            border-radius: 20px;
            cursor: pointer;
            opacity: 0;
            transition: opacity 0.3s ease;
            pointer-events: none;
            z-index: 1000;
        }

        .new-messages-indicator.visible {
            opacity: 1;
            pointer-events: auto;
        }

        /* Animations */
        @keyframes slideIn {
            from { opacity: 0; transform: translateY(10px); }
            to { opacity: 1; transform: translateY(0); }
        }

        @keyframes fadeIn {
            from { opacity: 0; transform: translateY(20px); }
            to { opacity: 1; transform: translateY(0); }
        }

        /* Markdown-style text formatting */
        .answer-content strong {
            color: var(--primary-color);
            font-weight: 600;
        }

        .answer-content em {
            color: var(--secondary-color);
            font-style: italic;
        }

        .answer-content code {
            background: var(--chat-bg);
            padding: 0.2em 0.4em;
            border-radius: 3px;
            font-family: monospace;
            font-size: 0.9em;
        }

        /* Responsive Design */
        @media (max-width: 768px) {
            .chat-container {
                height: 100vh;
                border-radius: 0;
                margin: -1rem;
            }

            .message {
                max-width: 90%;
            }

            .chat-input-area {
                padding: 0.75rem;
            }

            .citations-section {
                flex-direction: column;
                align-items: flex-start;
            }
        }
    </style>
</head>
<body>
    <div class="chat-container">
        <div class="chat-header">
            <div class="chat-header-left">
                <div class="app-logo">🤖</div>
                <div class="app-info">
                    <h1>InsightRAG</h1>
                    <p>AI Document Assistant</p>
                </div>
            </div>
            <button class="chat-submit-btn" onclick="endSession()">End Session</button>
        </div>

        <div class="chat-content" id="chatContent">
            <!-- Intro Message -->
            <div class="intro-message" id="introMessage">
                <div class="intro-icon">👋</div>
                <h2>Welcome to InsightRAG!</h2>
                <p>I'm your AI assistant for document analysis. I can help you understand your documents better by answering questions about their content.</p>
                <div class="intro-tips">
                    <h3>Tips for better results:</h3>
                    <ul>
                        <li>Ask specific questions about the document content</li>
                        <li>One question at a time works best</li>
                        <li>I'll provide sources for my answers</li>
                    </ul>
                </div>
            </div>
            <div class="message-container" id="messageContainer"></div>
        </div>

        <!-- New Messages Indicator -->
        <div class="new-messages-indicator" id="newMessagesIndicator">
            New messages ↓
        </div>

        <div class="chat-input-area">
            <input
                type="text"
                class="chat-input"
                id="queryInput"
                placeholder="Ask a question about your documents..."
                aria-label="Query input"
            >
            <button
                class="chat-submit-btn"
                onclick="submitQuery()"
                id="submitQueryBtn"
            >
                Ask
            </button>
        </div>
    </div>
    <script>
        // Debug configuration
        const DEBUG = true;

        // DOM Elements
        const queryInput = document.getElementById('queryInput');
        const submitQueryBtn = document.getElementById('submitQueryBtn');
        const messageContainer = document.getElementById('messageContainer');
        const chatContent = document.getElementById('chatContent');
        const newMessagesIndicator = document.getElementById('newMessagesIndicator');

        // Logging functions
        function log(...args) {
            if (DEBUG) {
                console.log('[InsightRAG]:', ...args);
            }
        }

        function logError(...args) {
            if (DEBUG) {
                console.error('[InsightRAG Error]:', ...args);
            }
        }

        // Utility functions
        function formatTimestamp() {
            return new Date().toLocaleTimeString([], {
                hour: '2-digit',
                minute: '2-digit'
            });
        }

        function createMessageElement(isUser, content, timestamp) {
            log('Creating message element:', { isUser, timestamp });
            const messageDiv = document.createElement('div');
            messageDiv.classList.add('message');
            if (isUser) messageDiv.classList.add('user-message');

            const avatar = document.createElement('div');
            avatar.classList.add('avatar');
            avatar.classList.add(isUser ? 'user' : 'ai');
            avatar.innerHTML = isUser ? '👤' : '🤖';

            const contentWrapper = document.createElement('div');

            const messageContent = document.createElement('div');
            messageContent.classList.add('message-content');
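            // Insert user text as plain text so typed markup is never interpreted as HTML
            if (isUser) {
                messageContent.textContent = content;
            } else {
                messageContent.innerHTML = content;
            }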

            const timeSpan = document.createElement('div');
            timeSpan.classList.add('message-time');
            timeSpan.textContent = timestamp;

            contentWrapper.appendChild(messageContent);
            contentWrapper.appendChild(timeSpan);

            if (isUser) {
                messageDiv.appendChild(contentWrapper);
                messageDiv.appendChild(avatar);
            } else {
                messageDiv.appendChild(avatar);
                messageDiv.appendChild(contentWrapper);
            }

            log('Message element created:', messageDiv);
            return messageDiv;
        }

        function scrollToBottom() {
            const previousScroll = chatContent.scrollTop;
            chatContent.scrollTop = chatContent.scrollHeight;
            log('Scrolled to bottom:', {
                previous: previousScroll,
                new: chatContent.scrollTop,
                height: chatContent.scrollHeight
            });
            newMessagesIndicator.classList.remove('visible');
        }

        function checkScroll() {
            const isScrolledToBottom = chatContent.scrollHeight - chatContent.clientHeight <= chatContent.scrollTop + 100;
            log('Scroll check:', {
                isAtBottom: isScrolledToBottom,
                scrollHeight: chatContent.scrollHeight,
                clientHeight: chatContent.clientHeight,
                scrollTop: chatContent.scrollTop
            });

            if (!isScrolledToBottom) {
                newMessagesIndicator.classList.add('visible');
            } else {
                newMessagesIndicator.classList.remove('visible');
            }
        }

        function submitQuery() {
            const query = queryInput.value.trim();
            if (!query) return;

            log('Submitting query:', query);

            // Hide intro message when first message is sent
            const introMessage = document.getElementById('introMessage');
            if (introMessage) {
                log('Hiding intro message');
                introMessage.style.display = 'none';
            }

            const timestamp = formatTimestamp();
            log('Timestamp:', timestamp);

            // Add user message
            const userMessage = createMessageElement(true, query, timestamp);
            messageContainer.appendChild(userMessage);
            log('Added user message to container');
            scrollToBottom();

            // Disable input during processing
            queryInput.disabled = true;
            submitQueryBtn.disabled = true;
            log('Disabled input controls');

            // Show loading state
            const loadingMessage = createMessageElement(
                false,
                '<div class="typing-indicator">Thinking...</div>',
                timestamp
            );
            messageContainer.appendChild(loadingMessage);
            log('Added loading indicator');
            scrollToBottom();

            // Send query to server
            log('Sending request to server...');
            fetch('/', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/x-www-form-urlencoded',
                },
                body: `query=${encodeURIComponent(query)}`
            })
            .then(response => {
                log('Received response from server');
                if (!response.ok) {
                    throw new Error(`HTTP error! status: ${response.status}`);
                }
                return response.json();
            })
            .then(data => {
                log('Processing server response:', data);

                // Remove loading message
                messageContainer.removeChild(loadingMessage);
                log('Removed loading indicator');

                // Add AI response
                const aiMessage = createMessageElement(false, data.response, timestamp);
                messageContainer.appendChild(aiMessage);
                log('Added AI response to container');

                // Log the HTML structure of the response for debugging
                log('Response HTML structure:', {
                    fullHTML: data.response,
                    citations: data.response.match(/<span class="citation-tag">(.*?)<\/span>/g),
                    mainContent: data.response.match(/<div class="main-content">(.*?)<\/div>/s)
                });

                // Reset input state
                queryInput.value = '';
                queryInput.disabled = false;
                submitQueryBtn.disabled = true;
                log('Reset input controls');

                // Scroll to bottom
                scrollToBottom();
            })
            .catch(error => {
                logError('Error processing query:', error);
                messageContainer.removeChild(loadingMessage);

                const errorMessage = createMessageElement(
                    false,
                    '<div class="error-message">An error occurred. Please try again.</div>',
                    timestamp
                );
                messageContainer.appendChild(errorMessage);

                queryInput.disabled = false;
                submitQueryBtn.disabled = false;
                scrollToBottom();
            });
        }

        function endSession() {
            if (confirm('Are you sure you want to end this session?')) {
                log('Ending session...');
                fetch('/shutdown', { method: 'POST' })
                    .then(() => {
                        log('Session ended successfully');
                        window.location.href = '/';
                    })
                    .catch(error => {
                        logError('Error ending session:', error);
                        alert('Failed to end session. Please try again.');
                    });
            }
        }

        // Event Listeners
        document.addEventListener('DOMContentLoaded', () => {
            log('Document loaded, initializing...');

            chatContent.addEventListener('scroll', checkScroll);
            log('Scroll listener added');

            newMessagesIndicator.addEventListener('click', scrollToBottom);
            log('New messages indicator listener added');

            queryInput.addEventListener('input', () => {
                const isEmpty = queryInput.value.trim() === '';
                submitQueryBtn.disabled = isEmpty;
                log('Input changed:', { isEmpty, value: queryInput.value });
            });

            queryInput.addEventListener('keypress', (e) => {
                if (e.key === 'Enter' && !submitQueryBtn.disabled) {
                    log('Enter key pressed, submitting query');
                    submitQuery();
                }
            });

            // Observer setup
            const observer = new MutationObserver(mutations => {
                mutations.forEach(mutation => {
                    if (mutation.addedNodes.length) {
                        log('New content added to message container:', mutation.addedNodes);
                        scrollToBottom();
                    }
                });
            });

            observer.observe(messageContainer, { childList: true, subtree: true });
            log('Message container observer set up');

            // Window resize handler
            window.addEventListener('resize', () => {
                log('Window resized');
                scrollToBottom();
            });

            // Initial state
            submitQueryBtn.disabled = true;
            log('Initial setup complete');
        });
    </script>
</body>
</html>
'''

# Flask app initialization and routes
app = Flask(__name__)
CORS(app)
vector_store = None

@app.after_request
def after_request(response):
    response.headers.add('Access-Control-Allow-Origin', '*')
    response.headers.add('Access-Control-Allow-Headers', 'Content-Type')
    response.headers.add('Access-Control-Allow-Methods', 'GET,POST')
    return response

@app.route('/', methods=['GET'])
def index():
    return render_template_string(INITIAL_TEMPLATE)

@app.route('/chat', methods=['GET'])
def chat():
    if vector_store is None:
        return redirect('/')
    return render_template_string(HTML_TEMPLATE)

@app.route('/setup', methods=['POST'])
def setup():
    try:
        method = request.form.get('method')
        pdf_files = []

        if 'file' not in request.files:
            return jsonify({'success': False, 'error': 'No files uploaded'})

        files = request.files.getlist('file')
        if not files or files[0].filename == '':
            return jsonify({'success': False, 'error': 'No files selected'})

        # Create upload directory if it doesn't exist
        if not os.path.exists(uploads_dir):
            os.makedirs(uploads_dir)

        # Process uploaded files based on method
        upload_timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        current_upload_dir = uploads_dir / f"upload_{upload_timestamp}"
        current_upload_dir.mkdir(exist_ok=True)

        for file in files:
            if file and file.filename.lower().endswith('.pdf'):
                if method == '1':  # Single file upload
                    filename = secure_filename(file.filename)
                    filepath = str(current_upload_dir / filename)
                else:  # Directory upload: keep only the sanitized base name
                    relative_path = Path(file.filename)
                    filepath = str(current_upload_dir / secure_filename(relative_path.name))

                # Ensure directory exists
                os.makedirs(os.path.dirname(filepath), exist_ok=True)

                # Save file
                file.save(filepath)
                pdf_files.append(filepath)
                logger.info(f"Saved file: {file.filename}")

        if not pdf_files:
            return jsonify({'success': False, 'error': 'No valid PDF files found'})

        logger.info(f"Processing {len(pdf_files)} PDF files...")

        # Process the PDFs and create vector store
        global vector_store
        text_chunks, page_numbers, sources = process_multiple_pdfs(pdf_files)

        if not text_chunks:
            return jsonify({'success': False, 'error': 'No text content extracted from PDFs'})

        embeddings = generate_embeddings(text_chunks)
        vector_store = SimpleVectorStore(embeddings, text_chunks, page_numbers, sources)

        logger.info("Vector store created successfully")
        return jsonify({'success': True})

    except Exception as e:
        logger.error(f"Error in setup: {str(e)}")
        return jsonify({'success': False, 'error': str(e)})

@app.route('/', methods=['POST'])
def process_query():
    # Return real HTTP error codes so the client's `response.ok` check can catch failures
    if vector_store is None:
        return jsonify({'error': 'No documents loaded. Please set up the system first.'}), 400

    query = request.form.get('query')
    if not query:
        return jsonify({'error': 'No query provided'}), 400

    try:
        response = rag_pipeline(query, vector_store)
        return jsonify({'response': response})
    except Exception as e:
        logger.error(f"Error processing query: {str(e)}")
        return jsonify({
            'error': 'An error occurred while processing your query',
            'details': str(e)
        }), 500

@app.route('/shutdown', methods=['POST'])
def shutdown():
    save_session_summary()
    return jsonify({'success': True})

## Main Execution
Run the complete system:
1. Accept PDF upload
2. Process document
3. Initialize vector store
4. Start web interface
5. Enable log downloading

**Usage Instructions:**
1. Run all cells in order
2. Open the printed ngrok URL and upload your PDF when prompted
3. Use the web interface to ask questions
4. Click "End Session" when finished
5. Download logs if desired
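
For scripted checks, the question endpoint can also be called without the browser. The sketch below builds the same form-encoded `POST /` request that the chat page sends (a single `query` field), using only the standard library. `BASE_URL`, `build_query_request`, and `ask` are hypothetical names introduced for illustration; the `response` key in the reply is assumed from the `process_query` route, and the base URL should be replaced with whatever address the server prints on startup.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholder: substitute the public URL printed when the server starts.
BASE_URL = "http://localhost:5000"


def build_query_request(question, base_url=BASE_URL):
    """Build the form-encoded POST request that the chat page sends to '/'."""
    body = urllib.parse.urlencode({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def ask(question, base_url=BASE_URL, timeout=120):
    """Send a question to a running server and return the decoded JSON reply."""
    request = build_query_request(question, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

Once documents have been uploaded through the web interface, `ask("What is the main topic?")` should return a dictionary whose `response` entry holds the HTML fragment that the chat page would render.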

In [None]:
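import os
from pyngrok import ngrok
# Load the ngrok auth token from the environment instead of hardcoding the secret
# (the variable name NGROK_AUTHTOKEN is an assumption; set it before running this cell)
ngrok.set_auth_token(os.environ["NGROK_AUTHTOKEN"])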

def run_flask_with_public_url():
    try:
        nest_asyncio.apply()
        ngrok.kill()
        public_url = ngrok.connect(5000)

        print("\n" + "="*50)
        print("🚀 RAG System is Live!")
        print("="*50)
        print(f"Main URL: {public_url}")
        print("="*50)

        app.run(
            host='0.0.0.0',
            port=5000,
            debug=False,
            use_reloader=False
        )
    except Exception as e:
        print(f"Error starting server: {str(e)}")
        raise  # re-raise with the original traceback intact


if __name__ == "__main__":
    run_flask_with_public_url()


🚀 RAG System is Live!
Main URL: NgrokTunnel: "https://39ca-34-125-239-233.ngrok-free.app" -> "http://localhost:5000"
 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5000
 * Running on http://172.28.0.12:5000
INFO:werkzeug:[33mPress CTRL+C to quit[0m
INFO:werkzeug:127.0.0.1 - - [16/Apr/2025 06:39:31] "GET / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [16/Apr/2025 06:39:32] "[33mGET /favicon.ico HTTP/1.1[0m" 404 -
INFO:werkzeug:127.0.0.1 - - [16/Apr/2025 06:39:46] "POST /setup HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [16/Apr/2025 06:39:47] "GET /chat HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [16/Apr/2025 06:40:14] "POST / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [16/Apr/2025 06:40:42] "POST / HTTP/1.1" 200 -
