# Medical RAG System - Admin Panel

## 🔧 Administrative Interface

Use this interface to manage the medical knowledge base, rebuild indexes, and perform system maintenance.

**⚠️ Access Control**: This panel should be accessible only to administrators.

---

## 🎓 Educational Mode Toggle

This notebook supports **two learning paths** for understanding RAG systems:

### 📚 Local Mode (FAISS + JSON)
**Best for:** Learning RAG fundamentals, offline development, rapid experimentation

**You'll learn:**
- How to chunk documents with semantic boundaries
- How embedding generation works with Azure OpenAI
- How FAISS vector indexes are built and queried
- How to cache and persist data locally

**Storage:** Everything saved in local `cache/` directory
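
As a rough illustration of what a flat inner-product index computes, the sketch below scores a query against every stored vector after unit-normalizing both sides, so the inner product equals cosine similarity. It is a pure-Python toy under that assumption, not FAISS itself; `normalize` and `top_k_by_inner_product` are hypothetical helper names.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k_by_inner_product(query, corpus, k):
    """Brute-force search: compare the query against every stored vector."""
    unit_query = normalize(query)
    scored = []
    for position, vec in enumerate(corpus):
        unit_vec = normalize(vec)
        score = sum(a * b for a, b in zip(unit_query, unit_vec))
        scored.append((score, position))
    # Highest similarity first
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(position, score) for score, position in scored[:k]]


corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
results = top_k_by_inner_product([2.0, 0.1], corpus, k=2)
print(results)
```

Because every stored vector is compared, the result is exact; the cost grows linearly with the number of vectors.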

### ☁️ Azure Mode (Cosmos DB + Azure AI Search)
**Best for:** Understanding production RAG systems, cloud architecture, scalability

**You'll learn:**
- How to store documents in Azure Cosmos DB (NoSQL database)
- How to populate Azure AI Search with embeddings
- How HNSW vector search works in the cloud
- How to manage production-scale RAG infrastructure

**Storage:** Documents in Cosmos DB, vectors in Azure AI Search
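
Populating a remote index usually happens in batches with a pause between requests. The following is a minimal sketch of that pattern, assuming a caller-supplied `handler` that processes one batch. The names `process_in_batches`, `handler`, and `delay_seconds` are illustrative and are not part of the Azure SDK or this project's `rag` package.

```python
import time


def process_in_batches(items, batch_size, handler, delay_seconds=0.0, sleep=time.sleep):
    """Run handler on consecutive slices of items, pausing between batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Ceiling division: number of batches needed to cover all items
    total_batches = (len(items) + batch_size - 1) // batch_size
    results = []
    for batch_num in range(1, total_batches + 1):
        start = (batch_num - 1) * batch_size
        batch = items[start:start + batch_size]
        results.extend(handler(batch))
        # No pause after the final batch
        if batch_num < total_batches:
            sleep(delay_seconds)
    return results


pauses = []
output = process_in_batches(
    list(range(5)),
    batch_size=2,
    handler=lambda batch: [value * 10 for value in batch],
    delay_seconds=1.0,
    sleep=pauses.append,  # record pauses instead of actually sleeping
)
print(output, pauses)
```

Passing `sleep` as a parameter keeps the example fast to run; in real use the default `time.sleep` applies.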

---

**Current Mode:** The system will detect `STORAGE_MODE` from your `.env` file and show you the appropriate workflow below.
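
A minimal sketch of how a mode setting like this could be resolved, assuming the value has already been loaded into the process environment (loading the `.env` file itself is not shown). `resolve_storage_mode`, `VALID_MODES`, and the `"local"` default are assumptions for illustration; the project's `rag.config` module may behave differently.

```python
import os

# Assumed set of accepted values; the real config may differ
VALID_MODES = {"local", "azure"}


def resolve_storage_mode(environ=None, default="local"):
    """Return a normalized storage mode read from environment variables."""
    env = os.environ if environ is None else environ
    raw_value = env.get("STORAGE_MODE", default)
    mode = raw_value.strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(f"Unsupported STORAGE_MODE: {raw_value!r}")
    return mode


print(resolve_storage_mode({"STORAGE_MODE": " Azure "}))
print(resolve_storage_mode({}))
```

Rejecting unknown values early avoids a situation where one part of the notebook treats a typo as local mode and another treats it as Azure mode.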

In [None]:
# System initialization
import warnings
warnings.filterwarnings('ignore')

import ipywidgets as widgets
from IPython.display import display, HTML, clear_output
import os
from pathlib import Path
import json
from datetime import datetime

from rag import config

# Display current mode prominently
mode_color = "#0066cc" if config.STORAGE_MODE == "local" else "#28a745"
mode_icon = "📚" if config.STORAGE_MODE == "local" else "☁️"
mode_name = "Local (FAISS + JSON)" if config.STORAGE_MODE == "local" else "Azure (Cosmos DB + AI Search)"

display(HTML(f'''
<div style="background-color: {mode_color}; color: white; padding: 15px; border-radius: 8px; margin-bottom: 20px;">
    <h2 style="margin: 0;">{mode_icon} Current Mode: {mode_name}</h2>
    <p style="margin: 5px 0 0 0; opacity: 0.9;">
        To change modes, update <code style="background-color: rgba(255,255,255,0.2); padding: 2px 6px; border-radius: 3px;">STORAGE_MODE</code> in your <code style="background-color: rgba(255,255,255,0.2); padding: 2px 6px; border-radius: 3px;">.env</code> file and restart this notebook.
    </p>
</div>
'''))

# Import mode-specific modules
if config.STORAGE_MODE == "azure":
    from rag import azure_cosmos, azure_search
    print("✅ Azure modules loaded (Cosmos DB + AI Search)")
else:
    from rag.cache import save_chunks, save_faiss_index, save_metadata, load_chunks, load_faiss_index
    print("✅ Local cache modules loaded (FAISS + JSON)")

print(f"📁 Data directory: {config.DATA_DIR}")
# Branch on "azure" like the import block above, so any other value is treated as local
if config.STORAGE_MODE == "azure":
    print(f"☁️  Cosmos DB: {config.COSMOS_DB_NAME}")
    print(f"🔍 Azure Search Index: {config.AZURE_SEARCH_INDEX_NAME}")
else:
    print(f"📦 Cache directory: {config.CACHE_DIR}")

---

## 📊 System Status

In [None]:
# Status display
status_output = widgets.Output()

def refresh_status(button=None):
    with status_output:
        clear_output()
        
        if config.STORAGE_MODE == "azure":
            # Azure mode status
            try:
                from rag import azure_cosmos, azure_search
                
                # Get counts from Azure services
                doc_count = 0
                chunk_count = 0
                search_count = 0
                
                try:
                    cosmos_stats = azure_cosmos.get_stats()
                    doc_count = cosmos_stats.get('document_count', 0)
                    chunk_count = cosmos_stats.get('chunk_count', 0)
                except Exception as e:
                    display(HTML(f'<p style="color: #dc3545;">⚠️ Could not connect to Cosmos DB: {str(e)}</p>'))
                
                try:
                    search_count = azure_search.get_document_count()
                except Exception as e:
                    display(HTML(f'<p style="color: #dc3545;">⚠️ Could not connect to Azure Search: {str(e)}</p>'))
                
                # Build status HTML
                status_html = '<div style="background-color: #f0fff4; padding: 20px; border-radius: 10px; border-left: 5px solid #28a745;">'
                status_html += '<h3 style="margin-top: 0; color: #28a745;">☁️ Azure Mode Status</h3>'
                status_html += '<h4 style="color: #666;">Azure Cosmos DB (Document Storage)</h4>'
                status_html += f'<p><strong>Documents:</strong> {doc_count:,}</p>'
                status_html += f'<p><strong>Chunks:</strong> {chunk_count:,}</p>'
                status_html += '<h4 style="color: #666; margin-top: 15px;">Azure AI Search (Vector Index)</h4>'
                status_html += f'<p><strong>Indexed Chunks:</strong> {search_count:,}</p>'
                
                if chunk_count > 0 and search_count == chunk_count:
                    status_html += f'<p style="color: #28a745; font-weight: bold; margin-top: 15px;">✅ System is fully synchronized and operational!</p>'
                elif chunk_count > 0 and search_count == 0:
                    status_html += f'<p style="color: #ffc107; font-weight: bold; margin-top: 15px;">⚠️ Chunks exist in Cosmos DB but not indexed in Azure Search. Run "Process Documents" to populate the index.</p>'
                elif chunk_count > 0 and search_count != chunk_count:
                    status_html += f'<p style="color: #ffc107; font-weight: bold; margin-top: 15px;">⚠️ Mismatch: {chunk_count} chunks in Cosmos DB but {search_count} in Azure Search. Consider rebuilding.</p>'
                else:
                    status_html += f'<p style="color: #dc3545; font-weight: bold; margin-top: 15px;">❌ No data found. Run "Process Documents" to populate Azure services.</p>'
                
                status_html += '</div>'
                display(HTML(status_html))
                
            except Exception as e:
                display(HTML(f'<p style="color: #dc3545;">Error checking Azure status: {str(e)}</p>'))
        
        else:
            # Local mode status
            from rag.cache import load_chunks, load_faiss_index
            
            # Check for existing data
            chunks = load_chunks()
            index = load_faiss_index()
            
            # Count PDFs
            pdf_count = len(list(config.PDF_DIR.glob('*.pdf')))
            
            # Check cache files
            cache_files = {
                'chunks.pkl': (config.CACHE_DIR / 'chunks.pkl').exists(),
                'faiss_index.bin': (config.CACHE_DIR / 'faiss_index.bin').exists(),
                'chunk_metadata.json': (config.CACHE_DIR / 'chunk_metadata.json').exists()
            }
            
            # Build status HTML
            status_html = '<div style="background-color: #f0f9ff; padding: 20px; border-radius: 10px; border-left: 5px solid #0066cc;">'
            status_html += '<h3 style="margin-top: 0; color: #0066cc;">📚 Local Mode Status</h3>'
            status_html += f'<p><strong>PDF Documents:</strong> {pdf_count}</p>'
            status_html += f'<p><strong>Processed Chunks:</strong> {len(chunks) if chunks else 0}</p>'
            status_html += f'<p><strong>FAISS Index:</strong> {"✅ Built" if index else "❌ Not found"}</p>'
            status_html += '<p><strong>Cache Files:</strong></p><ul>'
            for file, exists in cache_files.items():
                icon = '✅' if exists else '❌'
                status_html += f'<li>{icon} {file}</li>'
            status_html += '</ul>'
            
            if chunks:
                status_html += f'<p style="color: #28a745; font-weight: bold; margin-top: 15px;">✅ System is operational and ready to serve queries.</p>'
            else:
                status_html += f'<p style="color: #dc3545; font-weight: bold; margin-top: 15px;">❌ System needs initialization. Please process documents below.</p>'
            
            status_html += '</div>'
            display(HTML(status_html))

refresh_button = widgets.Button(
    description='🔄 Refresh Status',
    button_style='info',
    layout=widgets.Layout(width='200px', margin='10px 0')
)
refresh_button.on_click(refresh_status)

display(refresh_button)
display(status_output)
refresh_status()

---

## 📄 Document Management

Upload PDF documents to the system for processing.

In [None]:
# File upload interface
upload_output = widgets.Output()

file_upload = widgets.FileUpload(
    accept='.pdf',
    multiple=True,
    description='Upload PDFs'
)

def handle_upload(change):
    with upload_output:
        clear_output()
        uploaded_files = change['new']
        
        if not uploaded_files:
            return
        
        display(HTML(f'<p>📤 Uploading {len(uploaded_files)} file(s)...</p>'))
        
        for file_info in uploaded_files:
            # Keep only the base name so an uploaded name cannot point outside PDF_DIR
            filename = Path(file_info['name']).name
            content = file_info['content']
            filepath = config.PDF_DIR / filename
            
            with open(filepath, 'wb') as f:
                f.write(content)
            
            display(HTML(f'<p style="color: #28a745;">✅ Uploaded: {filename}</p>'))
        
        display(HTML('<p style="font-weight: bold; margin-top: 15px;">Upload complete! Now run "Process Documents" below.</p>'))
        refresh_status()

file_upload.observe(handle_upload, names='value')

display(file_upload)
display(upload_output)

---

## ⚙️ Processing Pipeline

Process documents through the complete RAG pipeline:

**Local Mode:** extraction → chunking → header generation → embedding → FAISS indexing → local cache

**Azure Mode:** extraction → chunking → header generation → Cosmos DB storage → embedding generation → Azure AI Search indexing

Click the button below to see step-by-step execution with educational explanations!
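
To make the chunking stage concrete, here is a minimal sketch that packs whole paragraphs into chunks without exceeding a word budget. It assumes paragraphs are separated by blank lines; `chunk_by_paragraph` is a hypothetical helper and is not the project's `SemanticChunker`, whose boundary rules may be more elaborate.

```python
def chunk_by_paragraph(text, max_words):
    """Group consecutive paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words is kept whole as its own
    oversized chunk rather than being split mid-paragraph.
    """
    chunks = []
    current = []       # paragraphs collected for the chunk being built
    current_words = 0  # running word count of `current`
    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        if not words:
            continue  # skip empty paragraphs
        # Close the current chunk if adding this paragraph would exceed the budget
        if current and current_words + len(words) > max_words:
            chunks.append("\n\n".join(current))
            current, current_words = [], 0
        current.append(" ".join(words))
        current_words += len(words)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


sample = "a b c\n\nd e\n\nf g h i"
print(chunk_by_paragraph(sample, max_words=5))
```

Splitting only at paragraph breaks keeps each chunk readable on its own, at the cost of uneven chunk sizes.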

In [None]:
# Processing controls
process_output = widgets.Output()

process_button = widgets.Button(
    description='🚀 Process Documents',
    button_style='success',
    icon='cogs',
    layout=widgets.Layout(width='200px', height='45px', margin='10px 0')
)

rebuild_button = widgets.Button(
    description='🔨 Rebuild Index',
    button_style='warning',
    icon='refresh',
    layout=widgets.Layout(width='200px', height='45px', margin='10px 0')
)

def process_documents_azure(process_output):
    """Azure mode: Process documents and populate Cosmos DB + Azure AI Search."""
    from rag import azure_cosmos, azure_search
    from rag.embeddings import get_embeddings_batch
    import numpy as np
    import time
    
    display(HTML('<h3>☁️ Starting Azure Processing Pipeline...</h3>'))
    display(HTML('<p style="color: #666; font-style: italic;">This will teach you how production RAG systems work in the cloud!</p>'))
    
    try:
        # Step 1: Load documents from both JSON and PDFs
        display(HTML('''
        <div style="background-color: #e7f3ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">📖 Step 1/7: Loading Documents</h4>
            <p><strong>Learning:</strong> Documents can come from multiple sources (JSON files, PDFs, APIs, etc.)</p>
        </div>
        '''))
        
        from rag.ingestion import extract_text_from_pdfs, load_json_documents
        
        # Load JSON documents (web-scraped)
        json_docs = load_json_documents(config.DATA_DIR)
        display(HTML(f'<p style="color: #28a745;">✅ Loaded {len(json_docs)} JSON documents</p>'))
        
        # Extract PDF documents
        pdf_docs = extract_text_from_pdfs(config.PDF_DIR)
        display(HTML(f'<p style="color: #28a745;">✅ Extracted {len(pdf_docs)} PDF documents</p>'))
        
        # Combine all documents
        documents = json_docs + pdf_docs
        display(HTML(f'<p style="color: #0066cc; font-weight: bold;">📚 Total: {len(documents)} documents loaded into memory</p>'))
        
        # Step 2: Save documents to Cosmos DB
        display(HTML('''
        <div style="background-color: #e7f3ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">☁️ Step 2/7: Storing Documents in Cosmos DB</h4>
            <p><strong>Learning:</strong> Azure Cosmos DB is a globally distributed NoSQL database. It stores your documents with automatic indexing, low-latency access, and built-in replication.</p>
            <p><strong>Why this matters:</strong> Unlike local files, Cosmos DB provides enterprise-grade durability, scalability, and multi-region support.</p>
        </div>
        '''))
        
        azure_cosmos.save_documents(documents)
        display(HTML(f'<p style="color: #28a745;">‚úÖ Saved {len(documents)} documents to Cosmos DB container: <code>{config.COSMOS_CONTAINER_DOCUMENTS}</code></p>'))
        
        # Step 3: Chunk documents
        display(HTML('''
        <div style="background-color: #e7f3ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">✂️ Step 3/7: Semantic Chunking</h4>
            <p><strong>Learning:</strong> Documents are too long to embed as single units. We split them into smaller "chunks" at semantic boundaries (paragraphs, sections) for better retrieval granularity.</p>
            <p><strong>Why this matters:</strong> Smaller chunks = more precise retrieval. A query about "diabetes symptoms" will match the specific paragraph, not the entire 50-page document.</p>
        </div>
        '''))
        
        from rag.chunking import SemanticChunker
        chunker = SemanticChunker(max_words=config.SEMANTIC_MAX_WORDS)
        chunks = chunker.chunk_documents(documents)
        display(HTML(f'<p style="color: #28a745;">✅ Created {len(chunks)} semantic chunks (max {config.SEMANTIC_MAX_WORDS} words each)</p>'))
        
        # Step 4: Generate contextual headers
        display(HTML('''
        <div style="background-color: #e7f3ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">🏷️ Step 4/7: Contextual Header Generation</h4>
            <p><strong>Learning:</strong> Each chunk gets a "contextual header" that describes its hierarchical position in the document (e.g., "NIH Guidelines → Diabetes → Type 2 → Treatment Options").</p>
            <p><strong>Why this matters:</strong> This is the secret sauce! Headers provide context that dramatically improves embedding quality and retrieval accuracy. A chunk about "insulin dosing" is more meaningful when you know it's from a diabetes treatment protocol.</p>
            <p><strong>How it works:</strong> We use Azure OpenAI's GPT model to analyze document structure and generate these headers automatically.</p>
        </div>
        '''))
        
        from rag.headers import ContextualHeaderGenerator
        header_gen = ContextualHeaderGenerator()
        chunks = header_gen.generate_headers_batch(chunks, batch_size=config.BATCH_SIZE)
        display(HTML(f'<p style="color: #28a745;">✅ Generated contextual headers for {len(chunks)} chunks using Azure OpenAI</p>'))
        
        # Step 5: Save chunks to Cosmos DB
        display(HTML('''
        <div style="background-color: #e7f3ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">☁️ Step 5/7: Storing Chunks in Cosmos DB</h4>
            <p><strong>Learning:</strong> Now we store the processed chunks (with headers) in a separate Cosmos DB container.</p>
            <p><strong>Why separate containers:</strong> Documents and chunks have different access patterns. This separation allows us to query chunks efficiently without loading entire documents.</p>
        </div>
        '''))
        
        azure_cosmos.save_chunks(chunks)
        display(HTML(f'<p style="color: #28a745;">✅ Saved {len(chunks)} chunks to Cosmos DB container: <code>{config.COSMOS_CONTAINER_CHUNKS}</code></p>'))
        
        # Step 6: Generate embeddings
        # f-string so that {config.EMBED_BATCH_SIZE} is interpolated instead of shown literally
        display(HTML(f'''
        <div style="background-color: #e7f3ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">🧮 Step 6/7: Generating Vector Embeddings</h4>
            <p><strong>Learning:</strong> Embeddings are numerical vector representations of text. Azure OpenAI's <code>text-embedding-3-large</code> model converts each chunk (with header) into a 3072-dimensional vector.</p>
            <p><strong>Why this matters:</strong> Vectors enable semantic similarity search. "What are symptoms of diabetes?" will match chunks about "signs of high blood sugar" even though they use different words!</p>
            <p><strong>Rate limiting:</strong> We batch requests ({config.EMBED_BATCH_SIZE} at a time) with delays to respect API limits.</p>
        </div>
        '''))
        
        texts = [c.augmented_chunk for c in chunks]
        embeddings_list = []
        batch_size = config.EMBED_BATCH_SIZE
        total_batches = (len(texts) + batch_size - 1) // batch_size
        
        for i in range(0, len(texts), batch_size):
            batch = texts[i:i + batch_size]
            batch_num = i // batch_size + 1
            
            batch_emb = get_embeddings_batch(batch)
            embeddings_list.extend(batch_emb)
            
            display(HTML(f'<p>📊 Batch {batch_num}/{total_batches} complete ({len(batch)} embeddings)</p>'))
            
            if batch_num < total_batches:
                time.sleep(config.EMBED_DELAY_SECONDS)
        
        embeddings = np.asarray(embeddings_list, dtype=np.float32)
        display(HTML(f'<p style="color: #28a745;">✅ Generated {embeddings.shape[0]:,} embeddings with {embeddings.shape[1]:,} dimensions each</p>'))
        
        # Step 7: Create Azure AI Search index and upload
        display(HTML('''
        <div style="background-color: #e7f3ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">🔍 Step 7/7: Populating Azure AI Search Index</h4>
            <p><strong>Learning:</strong> Azure AI Search is a managed search service with vector search support. Here it uses the HNSW (Hierarchical Navigable Small World) algorithm for approximate nearest neighbor search.</p>
            <p><strong>How HNSW works:</strong> It builds a multi-layer graph in which each higher layer contains fewer nodes. Search starts at the sparse top layer and descends layer by layer, narrowing in on the nearest neighbors.</p>
            <p><strong>Why Azure Search:</strong> Managed scaling, low-latency approximate search over large vector collections, and enterprise security features.</p>
            <p><strong>What we're indexing:</strong> Each chunk's embedding (3072-D vector) along with metadata (title, source, header, etc.)</p>
        </div>
        '''))
        
        # Create index if needed
        # Derive the dimension from the generated vectors rather than hardcoding 3072
        azure_search.create_search_index(embedding_dimensions=int(embeddings.shape[1]))
        display(HTML(f'<p style="color: #28a745;">✅ Created/verified Azure Search index: <code>{config.AZURE_SEARCH_INDEX_NAME}</code></p>'))
        
        # Upload chunks with embeddings
        azure_search.upload_chunks(chunks, embeddings, batch_size=100)
        search_count = azure_search.get_document_count()
        display(HTML(f'<p style="color: #28a745;">✅ Uploaded {search_count:,} documents to Azure AI Search</p>'))
        
        # Success message
        display(HTML(f'''
            <div style="background-color: #d4edda; border: 1px solid #c3e6cb; color: #155724; padding: 20px; border-radius: 10px; margin-top: 20px;">
                <h3 style="margin-top: 0;">🎉 Azure Processing Complete!</h3>
                <p><strong>What you just learned:</strong></p>
                <ul>
                    <li>✅ How to store documents in Cosmos DB (globally distributed NoSQL)</li>
                    <li>✅ How semantic chunking improves retrieval granularity</li>
                    <li>✅ How contextual headers enhance embedding quality</li>
                    <li>✅ How Azure OpenAI generates vector embeddings</li>
                    <li>✅ How Azure AI Search indexes vectors with HNSW algorithm</li>
                </ul>
                <p><strong>Your production RAG system is now live!</strong></p>
                <ul style="margin-bottom: 0;">
                    <li>📄 Documents in Cosmos DB: {len(documents):,}</li>
                    <li>✂️ Chunks created: {len(chunks):,}</li>
                    <li>🧮 Embeddings generated: {embeddings.shape[0]:,} × {embeddings.shape[1]:,}D</li>
                    <li>🔍 Vectors in Azure Search: {search_count:,}</li>
                </ul>
            </div>
        '''))
        
        refresh_status()
        
    except Exception as e:
        import html
        import traceback
        display(HTML(f'<p style="color: #dc3545; font-weight: bold;">❌ Error: {html.escape(str(e))}</p>'))
        # Escape the traceback so frames such as <module> are displayed instead of parsed as HTML tags
        display(HTML(f'<pre style="background-color: #f8f9fa; padding: 10px; border-radius: 5px; font-size: 11px;">{html.escape(traceback.format_exc())}</pre>'))

def process_documents_local(process_output):
    """Local mode: Process documents and build FAISS index."""
    from rag.cache import save_chunks, save_faiss_index, save_metadata
    
    display(HTML('<h3>📚 Starting Local Processing Pipeline...</h3>'))
    display(HTML('<p style="color: #666; font-style: italic;">This will teach you how RAG systems work from the ground up!</p>'))
    
    try:
        # Step 1: Load documents from both JSON and PDFs
        display(HTML('''
        <div style="background-color: #f0f9ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">📖 Step 1/6: Loading Documents</h4>
            <p><strong>Learning:</strong> Documents can come from multiple sources (JSON files, PDFs, APIs, etc.)</p>
        </div>
        '''))
        
        from rag.ingestion import extract_text_from_pdfs, load_json_documents
        
        # Load JSON documents (web-scraped)
        json_docs = load_json_documents(config.DATA_DIR)
        display(HTML(f'<p style="color: #28a745;">✅ Loaded {len(json_docs)} JSON documents</p>'))
        
        # Extract PDF documents
        pdf_docs = extract_text_from_pdfs(config.PDF_DIR)
        display(HTML(f'<p style="color: #28a745;">✅ Extracted {len(pdf_docs)} PDF documents</p>'))
        
        # Combine all documents
        documents = json_docs + pdf_docs
        display(HTML(f'<p style="color: #0066cc; font-weight: bold;">📚 Total: {len(documents)} documents</p>'))
        
        # Step 2: Chunk documents
        display(HTML('''
        <div style="background-color: #f0f9ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">✂️ Step 2/6: Semantic Chunking</h4>
            <p><strong>Learning:</strong> We split documents into smaller chunks at semantic boundaries for better retrieval precision.</p>
            <p><strong>Why:</strong> Smaller chunks mean more accurate matches. A query about "symptoms" will retrieve just that section, not the entire document.</p>
        </div>
        '''))
        
        from rag.chunking import SemanticChunker
        chunker = SemanticChunker(max_words=config.SEMANTIC_MAX_WORDS)
        chunks = chunker.chunk_documents(documents)
        display(HTML(f'<p style="color: #28a745;">✅ Created {len(chunks)} chunks</p>'))
        
        # Step 3: Generate contextual headers
        display(HTML('''
        <div style="background-color: #f0f9ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">🏷️ Step 3/6: Contextual Headers</h4>
            <p><strong>Learning:</strong> We add contextual headers (e.g., "NIH → Diabetes → Treatment") to each chunk before embedding.</p>
            <p><strong>Why:</strong> Context improves embedding quality. "Insulin dosing" means more when you know it's from a diabetes treatment guide.</p>
        </div>
        '''))
        
        from rag.headers import ContextualHeaderGenerator
        header_gen = ContextualHeaderGenerator()
        chunks = header_gen.generate_headers_batch(chunks, batch_size=config.BATCH_SIZE)
        display(HTML(f'<p style="color: #28a745;">✅ Generated headers for all chunks</p>'))
        
        # Step 4: Generate embeddings with batching
        display(HTML('''
        <div style="background-color: #f0f9ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">🧮 Step 4/6: Vector Embeddings</h4>
            <p><strong>Learning:</strong> Azure OpenAI converts text into 3072-dimensional vectors that capture semantic meaning.</p>
            <p><strong>How:</strong> Similar concepts have similar vectors. "diabetes" and "high blood sugar" will be close in vector space.</p>
        </div>
        '''))
        
        from rag.embeddings import get_embeddings_batch
        from rag.cache import save_embeddings
        import time
        
        texts_to_embed = [f"{chunk.ctx_header}\n\n{chunk.raw_chunk}" for chunk in chunks]
        embeddings = []
        batch_size = config.EMBED_BATCH_SIZE
        total_batches = (len(texts_to_embed) + batch_size - 1) // batch_size
        
        for i in range(0, len(texts_to_embed), batch_size):
            batch = texts_to_embed[i:i + batch_size]
            batch_embeddings = get_embeddings_batch(batch)
            
            # Check for all-zero vectors (failed embeddings); testing every component
            # avoids a false alarm when nonzero values happen to sum to zero
            if any(not any(emb) for emb in batch_embeddings):
                raise RuntimeError(f"Embedding generation failed for batch {i//batch_size + 1} (returned zero vectors)")
            
            embeddings.extend(batch_embeddings)
            batch_num = i // batch_size + 1
            display(HTML(f'<p>📊 Completed batch {batch_num}/{total_batches}</p>'))
            
            # Delay between batches (except last)
            if batch_num < total_batches:
                time.sleep(config.EMBED_DELAY_SECONDS)
        
        display(HTML(f'<p style="color: #28a745;">✅ Generated {len(embeddings)} embeddings</p>'))
        
        # Step 5: Build FAISS index
        display(HTML('''
        <div style="background-color: #f0f9ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">🔍 Step 5/6: FAISS Index</h4>
            <p><strong>Learning:</strong> FAISS (Facebook AI Similarity Search) builds an index for fast nearest-neighbor search.</p>
            <p><strong>Algorithm:</strong> We use IndexFlatIP (Inner Product) with normalized vectors for cosine similarity.</p>
            <p><strong>How it works:</strong> When you search, FAISS compares your query vector against all stored vectors and returns the closest matches.</p>
        </div>
        '''))
        
        import numpy as np
        import faiss
        embeddings_array = np.array(embeddings).astype('float32')
        dimension = embeddings_array.shape[1]
        index = faiss.IndexFlatIP(dimension)  # Inner product for cosine similarity
        faiss.normalize_L2(embeddings_array)  # Normalize for cosine similarity
        index.add(embeddings_array)
        display(HTML(f'<p style="color: #28a745;">✅ Built FAISS index with {index.ntotal} vectors ({dimension} dimensions)</p>'))
        
        # Step 6: Save everything to cache
        display(HTML('''
        <div style="background-color: #f0f9ff; padding: 15px; border-left: 4px solid #0066cc; margin: 15px 0;">
            <h4 style="margin-top: 0;">💾 Step 6/6: Save to Local Cache</h4>
            <p><strong>Learning:</strong> We save chunks (Python pickle), index (binary), and metadata (JSON) to disk.</p>
            <p><strong>Why cache:</strong> Rebuilding takes time. Caching lets us reload instantly for demos and queries.</p>
        </div>
        '''))
        
        save_chunks(chunks)
        save_faiss_index(index)
        
        # Build metadata for retrieval
        chunk_records = []
        for i, chunk in enumerate(chunks):
            chunk_records.append({
                'chunk_id': chunk.chunk_id,
                'doc_title': chunk.doc_title,
                'source_url': chunk.source_url,
                'ctx_header': chunk.ctx_header,
                'chunk_index': chunk.chunk_index
            })
        save_metadata(chunk_records)
        
        display(HTML('<p style="color: #28a745;">✅ Saved to cache</p>'))
        
        # Success message
        display(HTML(f'''
            <div style="background-color: #d4edda; border: 1px solid #c3e6cb; color: #155724; padding: 20px; border-radius: 10px; margin-top: 20px;">
                <h3 style="margin-top: 0;">🎉 Local Processing Complete!</h3>
                <p><strong>What you just learned:</strong></p>
                <ul>
                    <li>✅ How to chunk documents semantically</li>
                    <li>✅ How contextual headers improve retrieval</li>
                    <li>✅ How Azure OpenAI generates embeddings</li>
                    <li>✅ How FAISS indexes vectors for fast search</li>
                    <li>✅ How to cache data for quick reloading</li>
                </ul>
                <p><strong>Your local RAG system is ready!</strong></p>
                <ul style="margin-bottom: 0;">
                    <li>JSON documents: {len(json_docs)}</li>
                    <li>PDF documents: {len(pdf_docs)}</li>
                    <li>Total documents processed: {len(documents)}</li>
                    <li>Chunks created: {len(chunks)}</li>
                    <li>Embeddings generated: {len(embeddings)}</li>
                    <li>Index built: {index.ntotal} vectors</li>
                </ul>
            </div>
        '''))
        
        refresh_status()
        
    except Exception as e:
        import html
        import traceback
        display(HTML(f'<p style="color: #dc3545; font-weight: bold;">❌ Error: {html.escape(str(e))}</p>'))
        # Escape the traceback so frames such as <module> are displayed instead of parsed as HTML tags
        display(HTML(f'<pre style="background-color: #f8f9fa; padding: 10px; border-radius: 5px; font-size: 11px;">{html.escape(traceback.format_exc())}</pre>'))

def process_documents(button):
    """Route to appropriate processing function based on storage mode."""
    with process_output:
        clear_output(wait=True)
        
        if config.STORAGE_MODE == "azure":
            process_documents_azure(process_output)
        else:
            process_documents_local(process_output)

def rebuild_index(button):
    """Rebuild index from existing chunks."""
    with process_output:
        clear_output(wait=True)
        
        if config.STORAGE_MODE == "azure":
            # Azure rebuild: regenerate embeddings and re-upload to Azure Search
            display(HTML('<h3>🔨 Rebuilding Azure AI Search Index...</h3>'))
            
            try:
                from rag import azure_cosmos, azure_search
                from rag.embeddings import get_embeddings_batch
                import numpy as np
                import time
                
                # Load existing chunks from Cosmos DB
                chunks = azure_cosmos.load_chunks()
                if not chunks:
                    display(HTML('<p style="color: #dc3545;">❌ No chunks found in Cosmos DB. Please process documents first.</p>'))
                    return
                
                display(HTML(f'<p>📦 Loaded {len(chunks)} chunks from Cosmos DB</p>'))
                
                # Regenerate embeddings
                display(HTML('<p>🧮 Regenerating embeddings...</p>'))
                texts = [c.augmented_chunk for c in chunks]
                embeddings_list = []
                batch_size = config.EMBED_BATCH_SIZE
                total_batches = (len(texts) + batch_size - 1) // batch_size
                
                for i in range(0, len(texts), batch_size):
                    batch = texts[i:i + batch_size]
                    batch_num = i // batch_size + 1
                    
                    batch_emb = get_embeddings_batch(batch)
                    embeddings_list.extend(batch_emb)
                    display(HTML(f'<p>📊 Batch {batch_num}/{total_batches} complete</p>'))
                    
                    if batch_num < total_batches:
                        time.sleep(config.EMBED_DELAY_SECONDS)
                
                embeddings = np.asarray(embeddings_list, dtype=np.float32)
                display(HTML(f'<p style="color: #28a745;">✅ Generated {embeddings.shape[0]:,} embeddings</p>'))
                
                # Recreate the index, sized to the embeddings actually returned, then upload
                azure_search.create_search_index(embedding_dimensions=embeddings.shape[1], force_recreate=True)
                azure_search.upload_chunks(chunks, embeddings, batch_size=100)
                search_count = azure_search.get_document_count()
                
                display(HTML(f'''
                    <div style="background-color: #d4edda; border: 1px solid #c3e6cb; color: #155724; padding: 20px; border-radius: 10px; margin-top: 20px;">
                        <h3 style="margin-top: 0;">✅ Azure Index Rebuilt!</h3>
                        <p style="margin-bottom: 0;">Azure AI Search updated with {search_count:,} vectors.</p>
                    </div>
                '''))
                
                refresh_status()
                
            except Exception as e:
                display(HTML(f'<p style="color: #dc3545; font-weight: bold;">❌ Error: {str(e)}</p>'))
                import traceback
                display(HTML(f'<pre style="background-color: #f8f9fa; padding: 10px; border-radius: 5px; font-size: 11px;">{traceback.format_exc()}</pre>'))
        
        else:
            # Local rebuild: re-embed the cached chunks and build a fresh FAISS index
            display(HTML('<h3>🔨 Rebuilding FAISS Index...</h3>'))
            
            try:
                from rag.cache import load_chunks, save_faiss_index
                from rag.embeddings import get_embeddings_batch
                import numpy as np
                import faiss
                import time
                
                # Load existing chunks
                chunks = load_chunks()
                if not chunks:
                    display(HTML('<p style="color: #dc3545;">❌ No chunks found. Please process documents first.</p>'))
                    return
                
                display(HTML(f'<p>📦 Loaded {len(chunks)} existing chunks</p>'))
                
                # Regenerate embeddings with batching
                display(HTML('<p>🧮 Regenerating embeddings...</p>'))
                
                texts_to_embed = [f"{chunk.ctx_header}\n\n{chunk.raw_chunk}" for chunk in chunks]
                embeddings = []
                batch_size = config.EMBED_BATCH_SIZE
                total_batches = (len(texts_to_embed) + batch_size - 1) // batch_size
                
                for i in range(0, len(texts_to_embed), batch_size):
                    batch = texts_to_embed[i:i + batch_size]
                    batch_embeddings = get_embeddings_batch(batch)
                    
                    # Check for zero vectors (failed embeddings). Test every component
                    # rather than the sum, which can be zero for a valid vector.
                    if batch_embeddings and any(not np.any(emb) for emb in batch_embeddings):
                        raise RuntimeError(f"Embedding generation failed for batch {i//batch_size + 1} (returned zero vectors)")
                    
                    embeddings.extend(batch_embeddings)
                    batch_num = i // batch_size + 1
                    display(HTML(f'<p>📊 Completed batch {batch_num}/{total_batches}</p>'))
                    
                    # Delay between batches (except last)
                    if batch_num < total_batches:
                        time.sleep(config.EMBED_DELAY_SECONDS)
                
                display(HTML(f'<p style="color: #28a745;">✅ Generated {len(embeddings)} embeddings</p>'))
                
                # Rebuild index: inner product over L2-normalized vectors equals cosine similarity
                display(HTML('<p>🔍 Building new FAISS index...</p>'))
                embeddings_array = np.array(embeddings).astype('float32')
                dimension = embeddings_array.shape[1]
                index = faiss.IndexFlatIP(dimension)
                faiss.normalize_L2(embeddings_array)
                index.add(embeddings_array)
                
                # Save
                save_faiss_index(index)
                
                display(HTML(f'''
                    <div style="background-color: #d4edda; border: 1px solid #c3e6cb; color: #155724; padding: 20px; border-radius: 10px; margin-top: 20px;">
                        <h3 style="margin-top: 0;">✅ Index Rebuilt Successfully!</h3>
                        <p style="margin-bottom: 0;">FAISS index updated with {index.ntotal} vectors.</p>
                    </div>
                '''))
                
                refresh_status()
                
            except Exception as e:
                display(HTML(f'<p style="color: #dc3545; font-weight: bold;">❌ Error: {str(e)}</p>'))
                import traceback
                display(HTML(f'<pre style="background-color: #f8f9fa; padding: 10px; border-radius: 5px; font-size: 11px;">{traceback.format_exc()}</pre>'))

process_button.on_click(process_documents)
rebuild_button.on_click(rebuild_index)

display(widgets.HBox([process_button, rebuild_button]))
display(process_output)
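
Both rebuild handlers above share the same batching pattern: slice the texts into groups of `config.EMBED_BATCH_SIZE`, embed each group, and pause for `config.EMBED_DELAY_SECONDS` between groups (but not after the last one). The following is a minimal, standalone sketch of that loop. The names `embed_in_batches` and `fake_embed` are hypothetical stand-ins introduced for illustration; they are not part of the `rag` package, and `fake_embed` only mimics the shape of what `get_embeddings_batch` is assumed to return (one vector per input text).

```python
import math
import time
from typing import Callable, List, Sequence

Vector = List[float]


def embed_in_batches(
    texts: Sequence[str],
    embed_fn: Callable[[Sequence[str]], List[Vector]],
    batch_size: int,
    delay_seconds: float = 0.0,
) -> List[Vector]:
    """Embed texts batch by batch, pausing between batches (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    total_batches = math.ceil(len(texts) / batch_size)
    vectors: List[Vector] = []

    for batch_num, start in enumerate(range(0, len(texts), batch_size), start=1):
        batch = texts[start:start + batch_size]
        batch_vectors = embed_fn(batch)

        # Treat an all-zero vector as a failed embedding.
        if any(not any(vec) for vec in batch_vectors):
            raise RuntimeError(f"Batch {batch_num} returned a zero vector")

        vectors.extend(batch_vectors)

        # Pause between batches, but not after the final one.
        if batch_num < total_batches:
            time.sleep(delay_seconds)

    return vectors


def fake_embed(batch: Sequence[str]) -> List[Vector]:
    """Stand-in for a real embedding call: two-dimensional toy vectors."""
    return [[float(len(text)), 1.0] for text in batch]


vectors = embed_in_batches(
    ["a", "bb", "ccc", "dddd", "eeeee"], fake_embed, batch_size=2
)
print(len(vectors))
```

With five texts and a batch size of 2, the loop runs three batches and sleeps twice. Swapping `fake_embed` for the real embedding function should leave the control flow unchanged, assuming it accepts a list of strings and returns one vector per string.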

---

## 🗑️ Data Management

**Local Mode:** Clear local cache files

**Azure Mode:** Delete data from Cosmos DB and Azure AI Search
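
As a rough illustration of the local-mode path, the sketch below deletes a fixed set of cache files from a directory and leaves everything else alone. It runs against a temporary directory so it is safe to execute. The helper name `clear_cache_files` is hypothetical, and the file names are assumed to match the ones the handler in the next cell removes from `config.CACHE_DIR`.

```python
import tempfile
from pathlib import Path
from typing import List, Sequence

# Assumed cache file names (taken from the local-mode handler in this notebook).
CACHE_FILE_NAMES = ("chunks.pkl", "faiss_index.bin", "chunk_metadata.json")


def clear_cache_files(
    cache_dir: Path, names: Sequence[str] = CACHE_FILE_NAMES
) -> List[str]:
    """Delete the named files if present and return the names actually removed."""
    deleted: List[str] = []
    for name in names:
        path = cache_dir / name
        if path.is_file():
            path.unlink()
            deleted.append(name)
    return deleted


# Demonstrate on a throwaway directory rather than the real cache.
with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    (cache_dir / "chunks.pkl").write_bytes(b"")
    (cache_dir / "notes.txt").write_text("unrelated file")

    deleted = clear_cache_files(cache_dir)
    remaining = sorted(p.name for p in cache_dir.iterdir())

print(deleted, remaining)
```

Returning the list of deleted names makes it easy to report exactly what was removed, which matters for a destructive action like this one.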

In [None]:
# Data management
cache_output = widgets.Output()

clear_cache_button = widgets.Button(
    description='🗑️ Clear All Data',
    button_style='danger',
    layout=widgets.Layout(width='200px', margin='10px 0')
)

def clear_cache(button):
    with cache_output:
        clear_output()
        
        if config.STORAGE_MODE == "azure":
            # Azure mode: delete from Cosmos DB and Azure Search
            display(HTML('<p style="color: #dc3545; font-weight: bold;">⚠️ WARNING: This will delete all data from Azure services!</p>'))
            display(HTML('<p>Clearing Azure data...</p>'))
            
            try:
                from rag import azure_cosmos, azure_search
                
                # Delete from Cosmos DB
                display(HTML('<p>🗑️ Deleting documents from Cosmos DB...</p>'))
                azure_cosmos.delete_all_documents()
                display(HTML('<p style="color: #28a745;">✅ Documents deleted</p>'))
                
                display(HTML('<p>🗑️ Deleting chunks from Cosmos DB...</p>'))
                azure_cosmos.delete_all_chunks()
                display(HTML('<p style="color: #28a745;">✅ Chunks deleted</p>'))
                
                # Delete Azure Search index
                display(HTML('<p>🗑️ Deleting Azure AI Search index...</p>'))
                from azure.search.documents.indexes import SearchIndexClient
                from azure.core.credentials import AzureKeyCredential
                
                index_client = SearchIndexClient(
                    endpoint=config.AZURE_SEARCH_ENDPOINT,
                    credential=AzureKeyCredential(config.AZURE_SEARCH_KEY)
                )
                
                try:
                    index_client.delete_index(config.AZURE_SEARCH_INDEX_NAME)
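                    display(HTML('<p style="color: #28a745;">✅ Azure Search index deleted</p>'))
                except Exception as index_error:
                    # Surface the underlying error instead of assuming the index was missing
                    display(HTML(f'<p style="color: #ffc107;">⚠️ Index could not be deleted (it may not exist): {str(index_error)}</p>'))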
                
                display(HTML('<p style="font-weight: bold; margin-top: 15px; color: #28a745;">✅ All Azure data cleared. Run "Process Documents" to rebuild.</p>'))
                refresh_status()
                
            except Exception as e:
                display(HTML(f'<p style="color: #dc3545;">❌ Error: {str(e)}</p>'))
        
        else:
            # Local mode: clear cache files
            display(HTML('<p>⚠️ Clearing local cache files...</p>'))
            
            cache_files = [
                config.CACHE_DIR / 'chunks.pkl',
                config.CACHE_DIR / 'faiss_index.bin',
                config.CACHE_DIR / 'chunk_metadata.json'
            ]
            
            for filepath in cache_files:
                if filepath.exists():
                    filepath.unlink()
                    display(HTML(f'<p style="color: #28a745;">✅ Deleted: {filepath.name}</p>'))
            
            display(HTML('<p style="font-weight: bold; margin-top: 15px;">Cache cleared. Run "Process Documents" to rebuild.</p>'))
            refresh_status()

clear_cache_button.on_click(clear_cache)

display(clear_cache_button)
display(cache_output)

---

## 📈 System Information

In [None]:
# Display system configuration
if config.STORAGE_MODE == "azure":
    info_html = f'''
<div style="background-color: #f0fff4; padding: 20px; border-radius: 10px; border: 1px solid #28a745;">
    <h3 style="margin-top: 0; color: #28a745;">⚙️ Azure Mode Configuration</h3>
    <table style="width: 100%; border-collapse: collapse;">
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Storage Mode:</td>
            <td style="padding: 8px;"><code>azure</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Azure OpenAI Endpoint:</td>
            <td style="padding: 8px;"><code>{config.AZURE_OPENAI_ENDPOINT[:50]}...</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Cosmos DB:</td>
            <td style="padding: 8px;"><code>{config.COSMOS_DB_NAME}</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Cosmos Containers:</td>
            <td style="padding: 8px;"><code>{config.COSMOS_CONTAINER_DOCUMENTS}</code>, <code>{config.COSMOS_CONTAINER_CHUNKS}</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Azure Search Index:</td>
            <td style="padding: 8px;"><code>{config.AZURE_SEARCH_INDEX_NAME}</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Embedding Model:</td>
            <td style="padding: 8px;"><code>{config.AOAI_EMBED_MODEL}</code> (3072 dimensions)</td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Chat Model:</td>
            <td style="padding: 8px;"><code>{config.AOAI_CHAT_MODEL}</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Max Chunk Words:</td>
            <td style="padding: 8px;">{config.SEMANTIC_MAX_WORDS}</td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Embedding Batch Size:</td>
            <td style="padding: 8px;">{config.EMBED_BATCH_SIZE}</td>
        </tr>
        <tr>
            <td style="padding: 8px; font-weight: bold;">Vector Search Algorithm:</td>
            <td style="padding: 8px;">HNSW (Hierarchical Navigable Small World)</td>
        </tr>
    </table>
    <p style="margin-top: 15px; color: #666; font-size: 14px;">
        <strong>📚 Learning:</strong> This configuration shows the Azure resources this notebook uses. 
        Documents and chunks are stored in Cosmos DB, and vectors are served from Azure AI Search.
    </p>
</div>
'''
else:
    info_html = f'''
<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; border: 1px solid #dee2e6;">
    <h3 style="margin-top: 0; color: #495057;">⚙️ Local Mode Configuration</h3>
    <table style="width: 100%; border-collapse: collapse;">
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Storage Mode:</td>
            <td style="padding: 8px;"><code>local</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Data Directory:</td>
            <td style="padding: 8px;"><code>{config.DATA_DIR}</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">PDF Directory:</td>
            <td style="padding: 8px;"><code>{config.PDF_DIR}</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Cache Directory:</td>
            <td style="padding: 8px;"><code>{config.CACHE_DIR}</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Embedding Model:</td>
            <td style="padding: 8px;"><code>{config.AOAI_EMBED_MODEL}</code> (3072 dimensions)</td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Chat Model:</td>
            <td style="padding: 8px;"><code>{config.AOAI_CHAT_MODEL}</code></td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Max Chunk Words:</td>
            <td style="padding: 8px;">{config.SEMANTIC_MAX_WORDS}</td>
        </tr>
        <tr style="border-bottom: 1px solid #dee2e6;">
            <td style="padding: 8px; font-weight: bold;">Embedding Batch Size:</td>
            <td style="padding: 8px;">{config.EMBED_BATCH_SIZE}</td>
        </tr>
        <tr>
            <td style="padding: 8px; font-weight: bold;">Vector Search Algorithm:</td>
            <td style="padding: 8px;">FAISS IndexFlatIP (exact inner product on L2-normalized vectors, equivalent to cosine similarity)</td>
        </tr>
    </table>
    <p style="margin-top: 15px; color: #666; font-size: 14px;">
        <strong>📚 Learning:</strong> This configuration shows your local development setup. 
        All data is stored on your machine for learning and experimentation.
    </p>
</div>
'''

display(HTML(info_html))
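
The local-mode table describes `IndexFlatIP` as a cosine-similarity search. That holds only because the rebuild step L2-normalizes the embeddings before adding them to the index: once two vectors have unit length, their inner product is their cosine similarity. The pure-Python sketch below checks that identity on a toy pair of vectors. The helper names are hypothetical and FAISS itself is not used here.

```python
import math
from typing import List, Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector cannot be normalized."""
    norm = math.sqrt(inner_product(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity computed directly from the definition."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


query = [3.0, 4.0]
doc = [4.0, 3.0]

# Inner product after normalization, which is what an inner-product index scores.
ip_after_norm = inner_product(l2_normalize(query), l2_normalize(doc))

print(math.isclose(ip_after_norm, cosine_similarity(query, doc)))
```

The same reasoning implies that a query vector should also be L2-normalized before searching such an index; otherwise the scores are scaled by the query's length rather than being true cosine values.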