# 🔬 Aavishkar.ai Expert System Notebook

<img src="https://github.com/astitvac/AI4Science/raw/main/assets/AA_Main_Banner.jpg" alt="Aavishkar.ai Banner" width="600"/>

### <span style="color:#6C5CE7;">AI for Science</span>
<small><i>Democratizing advanced AI capabilities for scientific research</i></small>

---

## 👋 Welcome to Aavishkar.ai Expert Systems!

<small>This notebook is part of the <b>Aavishkar.ai AI4Science</b> initiative, which develops LLM-based expert systems to enhance scientific research workflows. Our expert systems formalize scientific cognitive processes using Large Language Models, structured knowledge representations, and interactive interfaces.</small>

### 🧠 About This Expert System

<small>This notebook implements one of the five scientific cognitive archetypes developed by Aavishkar.ai:</small>

<small>
1. 📚 <b>Literature Synthesist</b>: Identifies patterns, contradictions, and knowledge gaps across research corpora<br>
2. 🧪 <b>Experimental Architect</b>: Translates abstract hypotheses into methodologically sound experimental designs<br>
3. 📊 <b>Analytical Navigator</b>: Constructs adaptive analytical pathways through complex datasets<br>
4. 📝 <b>Research Documentarian</b>: Structures and articulates scientific findings and methodologies<br>
5. 🔄 <b>Interdisciplinary Translator</b>: Establishes conceptual bridges between disparate knowledge domains
</small>

### 👥 Who Can Use This?

<small>
Aavishkar.ai tools are designed for all practitioners of hypothesis-driven science:<br>
• 🎓 Academic researchers and students<br>
• 🏢 Commercial/industrial researchers<br>
• 🏛️ Government scientists<br>
• 🔭 Citizen scientists<br>
• 🧩 Independent researchers
</small>

<small>No matter your technical background or institutional affiliation, this notebook provides accessible AI capabilities for rigorous scientific work.</small>

---

### ⚙️ Setup Instructions

<small>

**Google Colab**
* Click on "Runtime" in the menu
* Select "Run all" to install dependencies and initialize the system
* Ensure you have your API keys ready for the LLM provider

**Local Environment**
* Ensure you have Python 3.8+ installed
* Install dependencies by running the installation cell below
* Set up your API keys as instructed in the initialization section

**Prerequisites**
* Python 3.8+
* API key for OpenAI or Google Vertex AI
* Basic familiarity with Jupyter notebooks

</small>

---

### 📜 License

<small>This project is licensed under the <b>MIT License</b></small>

<small>
<details>
<summary>View License Text</summary>
MIT License<br><br>
Copyright (c) 2023-2024 Aavishkar.ai<br><br>
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:<br><br>
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
</details>
</small>

<small>

### 🔗 Connect with Aavishkar.ai
* 📦 **GitHub**: [github.com/astitvac/AI4Science](https://github.com/astitvac/AI4Science)
* 🌐 **Website**: [aavishkar.ai](https://aavishkar.ai)
* 💬 **Community**: [Discord](https://discord.gg/aavishkar)
* 🤝 **Contribute**: [Contribution Guidelines](https://github.com/astitvac/AI4Science/tree/main/Contributing)

</small>

## ⚙️ Installation
<small>

This cell installs all required dependencies for this expert system notebook. The installation process uses uv for faster package management when available, with automatic fallback to standard pip.

**Key components being installed:**
* LLM frameworks: LangChain and provider-specific libraries
* Data modeling: Pydantic
* UI: Gradio
* Core utilities: Data processing and visualization libraries

**Troubleshooting tips:**
* If you encounter errors, try running the cell again
* For persistent issues, check your Python version (3.8+ required)
* In Colab, restart the runtime if packages aren't recognized after installation

**Note**: Initial installation may take 1-2 minutes to complete. A confirmation message will appear when successful.

</small>
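
If packages still are not recognized after installation, a quick importability check can narrow down which ones are missing. The following is a minimal sketch using only the standard library; `check_packages` and the `IMPORT_NAMES` mapping are hypothetical helpers introduced here for illustration, not part of the notebook.

```python
import importlib.util

# Import names differ from some pip names (e.g. python-dotenv -> dotenv).
# This mapping is an illustrative assumption; extend it as needed.
IMPORT_NAMES = {
    "langchain": "langchain",
    "pydantic": "pydantic",
    "python-dotenv": "dotenv",
    "gradio": "gradio",
}

def check_packages(import_names):
    """Return {pip_name: bool} indicating whether each top-level module can be found."""
    status = {}
    for pip_name, module_name in import_names.items():
        # find_spec returns None when a top-level module is not installed
        status[pip_name] = importlib.util.find_spec(module_name) is not None
    return status

missing = [name for name, ok in check_packages(IMPORT_NAMES).items() if not ok]
print("Missing packages:", missing or "none")
```

The check only looks up modules without importing them, so it is cheap to rerun after restarting the runtime.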

In [None]:
# Installation
import sys, os, subprocess, time
from IPython.display import HTML, display

# All required packages (no version constraints for better future-proofing)
ALL_PACKAGES = {
    "core": "langchain pydantic python-dotenv",
    "providers": "langchain-openai langchain-google-vertexai",
    "ui": "gradio",
    "data": "numpy pandas matplotlib plotly",
    "documents": "PyPDF2 pillow",
    "vectors": "chromadb sentence-transformers"
}

# Environment detection
IN_COLAB = 'google.colab' in sys.modules

def show(msg, type="info"):
    """Display styled message"""
    colors = {"info": "#3a7bd5", "success": "#00c853", "warning": "#f57c00", "error": "#d50000"}
    icons = {"info": "ℹ️", "success": "✅", "warning": "⚠️", "error": "❌"}
    display(HTML(f"<div style='color:white; background:{colors[type]}; padding:5px; margin:2px 0; border-radius:3px'>{icons[type]} {msg}</div>"))

def install_packages():
    """Install all packages using uv when possible, with minimal messaging"""
    start = time.time()
    show("Starting installation...", "info")
    
    # Try to use uv for faster installation
    try:
        subprocess.run("pip install -q uv", shell=True, check=True, timeout=30)
        installer = "uv pip"
    except Exception:  # uv could not be installed (error or timeout); fall back to pip
        installer = "pip"
    
    # Install each category
    success_count = 0
    total_categories = len(ALL_PACKAGES)
    
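    for category, packages in ALL_PACKAGES.items():
        try:
            # Install entire category at once for speed
            cmd = f"{installer} install -q {packages}"
            result = subprocess.run(cmd, shell=True, capture_output=True, timeout=120)
            
            # uv can fail where pip works (e.g. outside a virtual environment),
            # so retry the category with plain pip before counting it as failed
            if result.returncode != 0 and installer != "pip":
                cmd = f"pip install -q {packages}"
                result = subprocess.run(cmd, shell=True, capture_output=True, timeout=120)
            
            if result.returncode == 0:
                success_count += 1
        except Exception:
            pass  # Silent failure, will be reflected in final success rate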
    
    # Simple verification of core packages
    try:
        import langchain
        import pydantic
        import gradio
        verification = "with verification"
    except ImportError:
        verification = "with partial verification failures"
    
    # Single completion message with success rate
    elapsed = time.time() - start
    success_rate = int((success_count / total_categories) * 100)
    show(f"Installation completed in {elapsed:.1f}s ({success_rate}% success) {verification}", 
         "success" if success_rate > 80 else "warning")
    
    return success_rate > 80

# Run installation
install_packages()

## 🔧 Initialization

<small>This section configures the LLM provider, API keys, and core components needed for this expert system. The implementation follows a modular architecture that supports multiple AI providers and environments.</small>

## Purpose

<small>The initialization process:
1. **Sets up environment variables** including API keys
2. **Configures the LLM provider** with appropriate models and settings
3. **Initializes specialized capabilities** when needed (e.g., vision, embedding)
4. **Validates the environment** to ensure all requirements are met
</small>

## Configuration Options

<small>
You can customize the initialization by adjusting these parameters:

| Parameter | Description | Default |
|-----------|-------------|---------|
| **Provider** | AI service to use (OpenAI, Google, etc.) | OpenAI |
| **Model** | Specific model name | Depends on provider |
| **Temperature** | Creativity level (0.0-1.0) | 0.7 |
| **Features** | Additional capabilities to enable | None |

**💡 Tip**: For reproducible results, use lower temperature values (0.0-0.3).
</small>

## Provider Support

<small>
This notebook supports these LLM providers:

- **OpenAI**: GPT-4, GPT-3.5-Turbo
- **Google**: Gemini Pro, PaLM
- **Anthropic**: Claude (optional)
- **Local**: Ollama with various models (optional)

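**Note**: The initialization cell below is configured for OpenAI only (see its `PROVIDER-SPECIFIC: OPENAI` block); using another provider requires replacing that block. Different providers may have varying capabilities and pricing structures.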
</small>

## Setup Instructions

<small>
**For Google Colab:**
1. Store your API key in Colab Secrets (the initialization cell looks for a secret named `openai_api_key`) or paste it into the form field
2. Select your model in the form fields
3. Run the initialization cell

**For Local Environment:**
1. Create a `.env` file with your API keys
2. Select your model
3. Run the initialization cell

**API Key Variables:**
- OpenAI: `OPENAI_API_KEY`
- Google: `GOOGLE_API_KEY`
- Anthropic: `ANTHROPIC_API_KEY`
</small>
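
The lookup order described above (form field first, then environment) can be sketched as a small pure function. This is an illustration under assumptions: `resolve_api_key` is a hypothetical helper, not the notebook's own function, and the key values in the usage lines are placeholders.

```python
import os
from typing import Mapping, Optional

def resolve_api_key(form_value: str = "",
                    env: Optional[Mapping[str, str]] = None,
                    var_name: str = "OPENAI_API_KEY") -> Optional[str]:
    """Return the first non-empty key: the form field wins, then the environment."""
    env = os.environ if env is None else env
    for candidate in (form_value, env.get(var_name, "")):
        if candidate and candidate.strip():
            return candidate.strip()
    return None  # caller decides how to prompt the user

# Placeholder values, for illustration only
print(resolve_api_key("", {"OPENAI_API_KEY": "placeholder-key"}))
print(resolve_api_key("form-key", {"OPENAI_API_KEY": "placeholder-key"}))
```

Passing the environment as a mapping keeps the function easy to test without touching the real `os.environ`.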

## Troubleshooting

<small>
Common issues:
- **Authentication errors**: Check your API key is correctly set
- **Model unavailability**: Ensure you have access to the specified model
- **Import errors**: Run the installation cell first
- **Memory issues**: Select a smaller model or reduce context length

The initialization cell includes diagnostics that will help identify any configuration problems.
</small>


In [None]:
# 🔧 LLM Setup
import os, sys
from IPython.display import Markdown, display
from typing import Dict, Any, Tuple, Optional

# Colab form fields for configuration
# @title LLM Configuration
api_key = "" # @param {type:"string"}
model = "gpt-4o" # @param ["gpt-4o", "gpt-4-turbo", "gpt-4", "gpt-3.5-turbo"]
embedding_model = "text-embedding-3-small" # @param ["text-embedding-3-small", "text-embedding-3-large", "text-embedding-ada-002"]
temperature = 0.7 # @param {type:"slider", min:0, max:1, step:0.1}
debug = True # @param {type:"boolean"}

# Environment detection
IN_COLAB = 'google.colab' in sys.modules

def show(msg, type="info"):
    """Display styled message"""
    if type == "debug" and not debug:
        return
    colors = {"success": "#00C853", "info": "#2196F3", "warning": "#FF9800", "error": "#F44336", "debug": "#9C27B0"}
    icons = {"success": "✅", "info": "ℹ️", "warning": "⚠️", "error": "❌", "debug": "🔍"}
    display(Markdown(f"<div style='padding:8px;border-radius:4px;background:{colors[type]};color:white'>{icons[type]} {msg}</div>"))

def get_api_key() -> Optional[str]:
    """Get API key from various possible sources"""
    # Check form input first
    key = api_key
    
    # Try Colab secret if empty and in Colab
    if not key and IN_COLAB:
        try:
            from google.colab import userdata
            key = userdata.get('openai_api_key')
            if key:
                show("API key loaded from Colab secret", "success")
        except Exception as e:
            show(f"Error accessing Colab secrets: {e}", "debug")
    
    # Try environment variable
    if not key:
        key = os.environ.get("OPENAI_API_KEY", "")
        if key:
            show("API key loaded from environment variable", "debug")
    
    # Try .env file
    if not key:
        try:
            from dotenv import load_dotenv
            load_dotenv()
            key = os.environ.get("OPENAI_API_KEY", "")
            if key:
                show("API key loaded from .env file", "debug")
        except Exception:  # python-dotenv not installed or .env unreadable
            pass
    
    # Final check and request if needed
    if not key:
        if IN_COLAB:
            show("""
            No API key found. Either:
            1. Add it in the form field above
            2. Set a Colab secret named 'openai_api_key'
            """, "warning")
        else:
            show("No API key found. Add it in the form field or set OPENAI_API_KEY environment variable", "warning")
        return None
        
    return key

# === PROVIDER-SPECIFIC: OPENAI ===
def initialize_models(api_key: str) -> Tuple[Optional[Any], Optional[Any]]:
    """Initialize OpenAI models with the provided API key"""
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings
    
    # Set environment variable for consistency
    os.environ["OPENAI_API_KEY"] = api_key
    
    try:
        llm = ChatOpenAI(
            model_name=model,
            temperature=temperature,
            openai_api_key=api_key
        )
        
        embeddings = OpenAIEmbeddings(
            model=embedding_model,
            openai_api_key=api_key
        )
        
        show(f"OpenAI initialized with {model} and {embedding_model}", "success")
        return llm, embeddings
        
    except Exception as e:
        show(f"Error initializing OpenAI: {e}", "error")
        return None, None
# === END PROVIDER-SPECIFIC ===

def initialize_llm() -> Tuple[Optional[Any], Optional[Any]]:
    """Main function to set up and initialize LLM"""
    show("Initializing LLM...", "info")
    
    # Get API key
    key = get_api_key()
    if not key:
        return None, None
    
    # Initialize models
    llm, embeddings = initialize_models(key)
    
    if llm and embeddings:
        show("Initialization complete! LLM and embeddings ready to use.", "success")
    
    return llm, embeddings

# Run initialization
llm, embeddings = initialize_llm()

In [2]:
# 🛠️ Core Utilities
"""
Core utilities for Aavishkar.ai expert systems.
Includes logging, caching, error handling, and JSON parsing.
"""

import os, json, time, hashlib, functools
from typing import Dict, Any, Optional, Callable, Union
from IPython.display import Markdown, display

# === GLOBAL SETTINGS ===
DEBUG_MODE = False
CACHE_ENABLED = True
CACHE_DIR = "./cache"
os.makedirs(CACHE_DIR, exist_ok=True)

# === DISPLAY & ERROR HANDLING ===
def show(msg: str, level: str = "info") -> None:
    """Display formatted message with appropriate styling.
    
    Args:
        msg: Message to display
        level: Message level (success, info, warning, error, debug)
    """
    colors = {"success": "#00C853", "info": "#2196F3", "warning": "#FF9800", "error": "#F44336", "debug": "#9C27B0"}
    icons = {"success": "✅", "info": "ℹ️", "warning": "⚠️", "error": "❌", "debug": "🔍"}
    
    if level == "debug" and not DEBUG_MODE:
        return
        
    color = colors.get(level, colors["info"])
    icon = icons.get(level, icons["info"])
    display(Markdown(f"<div style='padding:6px;border-radius:4px;background:{color};color:white'>{icon} {msg}</div>"))

def retry(max_attempts: int = 3, delay: float = 1.0) -> Callable:
    """Decorator for retrying functions with exponential backoff."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == max_attempts:
                        raise
                    wait = delay * (2 ** (attempt - 1))
                    show(f"Attempt {attempt} failed: {str(e)}. Retrying in {wait:.1f}s...", "warning")
                    time.sleep(wait)
        return wrapper
    return decorator

# === CACHE SYSTEM ===
def cache_key(**kwargs) -> str:
    """Generate a cache key from input parameters."""
    serialized = json.dumps({k: v for k, v in kwargs.items() if v is not None}, sort_keys=True)
    return hashlib.md5(serialized.encode()).hexdigest()

def get_cache(key: str) -> Optional[Any]:
    """Get item from cache if available and not expired."""
    if not CACHE_ENABLED:
        return None
        
    path = os.path.join(CACHE_DIR, f"{key}.json")
    if not os.path.exists(path):
        return None
        
    try:
        with open(path, 'r') as f:
            data = json.load(f)
            
        # Check if expired (default: 1 day)
        if time.time() - data.get("timestamp", 0) > 86400:
            return None
            
        return data.get("value")
    except Exception:  # unreadable or malformed cache file counts as a miss
        return None

def set_cache(key, value):
    """Store value in cache with current timestamp."""
    if not CACHE_ENABLED:
        return
        
    path = os.path.join(CACHE_DIR, f"{key}.json")
    try:
        # Handle Pydantic models by converting to dictionaries
        def serialize_pydantic(obj):
            if hasattr(obj, 'model_dump'):  # Pydantic v2 models use model_dump
                return obj.model_dump()
            elif hasattr(obj, 'dict'):      # Older Pydantic models use dict()
                return obj.dict()
            elif isinstance(obj, list):
                return [serialize_pydantic(item) for item in obj]
            elif isinstance(obj, dict):
                return {k: serialize_pydantic(v) for k, v in obj.items()}
            return obj
            
        serialized_value = serialize_pydantic(value)
        
        with open(path, 'w') as f:
            json.dump({"timestamp": time.time(), "value": serialized_value}, f)
            
    except Exception as e:
        show(f"Cache write error: {str(e)}", "debug")

def clear_cache(older_than: Optional[int] = None) -> int:
    """Clear cache entries, optionally only those older than specified seconds.
    
    Returns:
        Number of entries cleared
    """
    if not os.path.exists(CACHE_DIR):
        return 0
        
    count = 0
    for filename in os.listdir(CACHE_DIR):
        if not filename.endswith('.json'):
            continue
            
        path = os.path.join(CACHE_DIR, filename)
        
        if older_than:
            try:
                with open(path, 'r') as f:
                    data = json.load(f)
                if time.time() - data.get("timestamp", 0) <= older_than:
                    continue
            except Exception:  # unreadable entries fall through and are removed
                pass
        
        try:
            os.remove(path)
            count += 1
        except OSError:  # file already gone or not removable; skip it
            pass
            
    return count

# === LLM & JSON HELPERS ===
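def call_llm_with_cache(llm, prompt: str, **kwargs) -> Any:
    """Call LLM with caching to avoid redundant API calls.
    
    Note: a cache hit returns the JSON-serialized form written by set_cache
    (e.g. a dict for Pydantic-based responses), not the original response object.
    """
    key = cache_key(prompt=prompt, **kwargs) if CACHE_ENABLED else None
    if key is not None:
        cached = get_cache(key)
        if cached is not None:  # empty but valid cached values still count as hits
            show("Using cached response", "debug")
            return cached
    
    response = llm.invoke(prompt, **kwargs)
    
    if key is not None:
        set_cache(key, response)
    
    return response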

def parse_json_safely(text: str, default: Any = None) -> Any:
    """Extract and parse JSON from text with multiple fallback strategies."""
    import re
    
    # Try direct parsing first
    try:
        return json.loads(text)
    except (json.JSONDecodeError, TypeError):
        pass
    
    # Try to extract JSON blocks
    try:
        # Try code blocks with JSON
        if "```json" in text:
            json_block = text.split("```json")[1].split("```")[0].strip()
            return json.loads(json_block)
            
        # Try any code blocks
        if "```" in text:
            code_block = text.split("```")[1].split("```")[0].strip()
            if code_block.strip().startswith(("{", "[")):
                return json.loads(code_block)
        
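        # Scan for embedded JSON objects/arrays. raw_decode parses a complete
        # value starting at each "{" or "[", so nested structures survive
        # (a non-greedy regex would stop at the first closing bracket).
        decoder = json.JSONDecoder()
        for match in re.finditer(r'[\{\[]', text):
            try:
                value, _ = decoder.raw_decode(text, match.start())
                return value
            except json.JSONDecodeError:
                continue
    except Exception:
        pass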
    
    return default

# Initialization
show("Core utilities initialized", "debug")
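
Two properties of the utilities above are easy to check in isolation: the cache key does not depend on keyword order, and the retry delays double on each attempt. The sketch below re-creates both with standalone, hypothetical helpers (`demo_cache_key`, `backoff_schedule`) so it can run without the rest of the cell; it mirrors the logic shown above rather than replacing it.

```python
import hashlib
import json

def demo_cache_key(**kwargs) -> str:
    """Same recipe as cache_key: drop None values, sort keys, hash the JSON."""
    serialized = json.dumps({k: v for k, v in kwargs.items() if v is not None}, sort_keys=True)
    return hashlib.md5(serialized.encode()).hexdigest()

def backoff_schedule(max_attempts: int = 3, delay: float = 1.0):
    """Waits between attempts; the final attempt re-raises instead of waiting."""
    return [delay * (2 ** (attempt - 1)) for attempt in range(1, max_attempts)]

k1 = demo_cache_key(prompt="Summarize the abstract", temperature=0.2)
k2 = demo_cache_key(temperature=0.2, prompt="Summarize the abstract")
k3 = demo_cache_key(prompt="Summarize the abstract", temperature=0.2, stop=None)

print(k1 == k2 == k3)        # True: order and None-valued arguments do not matter
print(backoff_schedule())    # [1.0, 2.0]
```

One consequence worth keeping in mind: any keyword argument that is not JSON-serializable makes `json.dumps` raise, so cache keys should be built from plain values.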

## 📚 Literature Synthesis: Single Document Analysis

## Purpose
Extracts structured knowledge from individual scientific papers to facilitate comprehension and identify research opportunities.

## Core Functions

* **Information Extraction**
  * Identifies key concepts, methods, findings, and claims
  * Extracts experimental parameters and statistical results
  * Maps cited literature references

* **Knowledge Structure**
  * Creates concept hierarchies and relationships
  * Links methods to findings with evidence strength
  * Generates machine-readable knowledge representation

* **Research Gap Detection**
  * Identifies limitations acknowledged by authors
  * Highlights unanswered questions
  * Suggests logical extensions to the work

* **Visualization**
  * Generates concept relationship networks
  * Creates hierarchical representation of findings
  * Supports interactive exploration
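
As a rough illustration of what a concept relationship network reduces to, the sketch below counts how many relationships touch each concept. The edge list is invented for illustration, and `degree_counts` is a hypothetical helper, not the notebook's visualization code.

```python
from collections import Counter

def degree_counts(edges):
    """Count how many relationships each concept participates in."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    return counts

# Made-up (source, target) pairs standing in for extracted relationships
edges = [
    ("concept A", "concept B"),
    ("concept B", "concept C"),
    ("concept A", "concept C"),
    ("concept A", "concept D"),
]

for concept, degree in degree_counts(edges).most_common(2):
    print(concept, degree)
```

Highly connected concepts are natural candidates for the center of a network layout.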

## Input

* PDF upload, plain text, URL, DOI, or arXiv ID
* Best results with complete papers containing clear section structure
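
One way to route the input types listed above is a simple pattern-based guess. The sketch below is a heuristic under stated assumptions: the regular expressions cover the common DOI shape and new-style arXiv identifiers only, and `guess_source_type` is a hypothetical helper, not the notebook's own routing logic.

```python
import re

# Assumed patterns: DOIs start with "10.<registrant>/", and new-style arXiv IDs
# look like YYMM.NNNNN with an optional version suffix. Older arXiv IDs are not covered.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_PATTERN = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")

def guess_source_type(value: str) -> str:
    """Guess whether the input is a PDF path, DOI, arXiv ID, or plain text."""
    candidate = value.strip()
    if candidate.lower().endswith(".pdf"):
        return "pdf"
    if DOI_PATTERN.match(candidate):
        return "doi"
    if ARXIV_PATTERN.match(candidate):
        return "arxiv"
    return "text"

for example in ["paper.pdf", "10.1000/xyz123", "2301.01234v2", "Some pasted abstract text"]:
    print(example, "->", guess_source_type(example))
```

Anything that does not match a known pattern falls back to plain text, which keeps the guess conservative.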

## Usage

1. Run setup cells (installation and initialization)
2. Upload or identify a single scientific paper
3. Review the extracted knowledge structure

**Note**: This single-document analyzer precedes multi-document synthesis capabilities.

---

*Implementation of the Literature Synthesist cognitive archetype from Aavishkar.ai*

## Data Models

## Purpose

<small>
This section defines the structured data representations that power our Literature Synthesis system. These Pydantic models perform two essential functions:

1. **Represent Knowledge**: Define how scientific concepts, relationships, and documents are structured
2. **Control System Behavior**: Configure how the system processes and analyzes content
</small>

## How It Works

<small>
Our system uses Pydantic models to ensure data validation and clear structure. Think of these models as "smart containers" that:

- Validate data to prevent errors
- Provide helpful error messages when something is wrong
- Include documentation for each field
- Support extensibility for specialized needs
</small>
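
To make the "smart container" idea concrete, here is a minimal sketch of Pydantic rejecting an out-of-range value. `DemoConcept` and `validation_errors` are stand-ins invented for this example; the notebook's real models are defined in the code cell that follows.

```python
from pydantic import BaseModel, Field, ValidationError

class DemoConcept(BaseModel):
    """Stand-in model with one constrained field."""
    name: str
    confidence: float = Field(default=0.8, ge=0.0, le=1.0)

def validation_errors(**fields):
    """Return the list of validation errors for the given field values (empty if valid)."""
    try:
        DemoConcept(**fields)
        return []
    except ValidationError as exc:
        return exc.errors()

print(validation_errors(name="entropy", confidence=0.9))   # no errors
for err in validation_errors(name="entropy", confidence=1.5):
    # Each error names the offending field and explains the violated constraint
    print(err["loc"], err["msg"])
```

The error entries carry the field location, so a bad value can be traced to its source without reading a stack trace.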

## Core Models

<small>
Our implementation uses these key models:

| Model | Purpose |
|-------|---------|
| **LitSynthConfig** | Consolidated configuration for all system parameters |
| **Concept** | Scientific concepts extracted from literature |
| **Relationship** | Connections between scientific concepts |
| **ResearchGap** | Identified research gaps and opportunities |
| **LiteratureSynthesisOutput** | Complete analysis results container |

We've simplified the configuration into a single model (`LitSynthConfig`) to make customization easier. You can adjust parameters by modifying the `config` variable in the code cell.
</small>

## Customization

<small>
To customize the system behavior, simply modify the config variable:

```python
# Example: Increase sensitivity to detect more concepts
config.extraction_confidence = 0.6
config.max_concepts = 40

# Example: Focus only on high-importance concepts
config.min_concept_importance = "high"
```

This approach allows you to tune the system's behavior without changing the core models or implementation.
</small>
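
One caveat when tuning values by attribute assignment: Pydantic models do not re-validate assigned values unless `validate_assignment` is enabled, so an out-of-range value can slip through. The sketch below shows this with a stand-in `DemoConfig` (an assumption for illustration, reusing two field names from `LitSynthConfig`) and a hypothetical `updated` helper that rebuilds the config so constraints are checked again. It assumes Pydantic v2.

```python
from pydantic import BaseModel, Field, ValidationError

class DemoConfig(BaseModel):
    """Stand-in for the notebook's configuration model."""
    extraction_confidence: float = Field(default=0.7, ge=0.0, le=1.0)
    max_concepts: int = Field(default=25, ge=5, le=100)

cfg = DemoConfig()
cfg.max_concepts = 500  # accepted silently: assignment is not validated by default

def updated(config, **changes):
    """Rebuild the config so the changed values pass through validation again."""
    return type(config).model_validate({**config.model_dump(), **changes})

safe = updated(DemoConfig(), max_concepts=40)
print(safe.max_concepts)

try:
    updated(DemoConfig(), max_concepts=500)
except ValidationError:
    print("max_concepts=500 rejected (limit is 100)")
```

Rebuilding the model costs little for a configuration object and keeps the field constraints meaningful.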


In [None]:
# 📋 Data Models
"""
This section defines the data structures used throughout the Literature Synthesis system.
You can customize these models to better match your scientific domain or analysis needs.

CUSTOMIZATION TIPS:
1. Update field descriptions to match your domain terminology
2. Adjust default values to better suit your analysis preferences
3. Add domain-specific fields to capture additional information
4. Modify validators to enforce domain-specific rules

Each model includes validators to ensure data integrity.
"""

from pydantic import BaseModel, Field, field_validator
from typing import List, Dict, Optional, Literal, Any
from datetime import datetime
from IPython.display import Markdown, display

def show_info(message):
    """Display styled info message."""
    display(Markdown(f"<div style='padding:5px;border-radius:3px;background:#2196F3;color:white'>ℹ️ {message}</div>"))

# ===== INPUT MODELS =====
# Models for document input and configuration

class DocumentSource(BaseModel):
    """Input document source with type and content.
    
    This model represents an input document to be analyzed.
    It supports multiple source types and includes metadata.
    
    Attributes:
        source_type: The type of document source (pdf, text, etc.)
        content: The document content or file path
        metadata: Additional information about the document
    """
    source_type: Literal["pdf", "text", "url", "doi", "arxiv"]
    content: str
    metadata: Dict[str, Any] = Field(default_factory=dict)
    
    # You can add custom validation methods here if needed
    # Example:
    # @field_validator('content')
    # @classmethod
    # def validate_content(cls, v):
    #     if len(v) < 10:  # Enforce minimum content length
    #         raise ValueError("Document content too short")
    #     return v

# ===== KNOWLEDGE REPRESENTATION MODELS =====
# Models for representing extracted knowledge

class Concept(BaseModel):
    """Scientific concept extracted from literature.
    
    This model represents a key scientific concept identified in the text.
    Each concept has a name, definition, and importance rating.
    
    Customization:
    - Add domain-specific concept types
    - Adjust default importance level
    - Add additional fields for your domain
    
    Attributes:
        name: The concept name or term
        definition: Clear definition of the concept
        importance: How important this concept is (high, medium, low)
        concept_type: Classification of the concept (term, method, theory, etc.)
        sources: List of sources where concept was found
        related_terms: List of related concepts
        confidence: Confidence score for extraction (0-1)
    """
    name: str
    definition: str
    importance: Literal["high", "medium", "low"] = "medium"
    concept_type: str = "term"
    sources: List[str] = Field(default_factory=list)
    related_terms: List[str] = Field(default_factory=list)
    confidence: float = Field(default=0.8, ge=0.0, le=1.0)
    
    # Example custom validator (commented out)
    # @field_validator('name')
    # @classmethod
    # def normalize_name(cls, v):
    #     """Normalize concept names to title case"""
    #     return v.strip().title()

class Relationship(BaseModel):
    """Connection between scientific concepts.
    
    This model represents how concepts relate to each other in the text.
    Each relationship connects two concepts with a specific relationship type.
    
    Customization:
    - Add domain-specific relationship types
    - Adjust confidence thresholds
    - Add fields for relationship strength or context
    
    Attributes:
        source: The source concept name
        target: The target concept name
        relationship_type: Type of relationship (causes, influences, etc.)
        evidence: Text evidence supporting this relationship
        confidence: Confidence score for extraction (0-1)
        bidirectional: Whether the relationship applies in both directions
    """
    source: str
    target: str
    relationship_type: str
    evidence: Optional[str] = None
    confidence: float = Field(default=0.7, ge=0.0, le=1.0)
    bidirectional: bool = False
    
    # Example of domain-specific validation (uncomment and customize)
    # @field_validator('relationship_type')
    # @classmethod
    # def validate_relationship_type(cls, v):
    #     """Validate relationship types match domain expectations"""
    #     valid_types = ["causes", "influences", "measures", "contains", "precedes"]
    #     if v.lower() not in valid_types:
    #         raise ValueError(f"Relationship type must be one of: {', '.join(valid_types)}")
    #     return v

class ResearchGap(BaseModel):
    """Identified research gap or opportunity.
    
    This model represents potential areas for future research,
    based on limitations or unexplored connections in the literature.
    
    Customization:
    - Adjust importance criteria
    - Add fields for research difficulty or potential impact
    - Add categorization for types of research gaps
    
    Attributes:
        description: Clear description of the research gap
        related_concepts: List of concepts related to this gap
        evidence: Text evidence supporting this gap
        importance: How significant this gap is (high, medium, low)
        confidence: Confidence score for extraction (0-1)
    """
    description: str
    related_concepts: List[str] = Field(default_factory=list)
    evidence: Optional[str] = None
    importance: Literal["high", "medium", "low"] = "medium"
    confidence: float = Field(default=0.7, ge=0.0, le=1.0)

class LiteratureSynthesisOutput(BaseModel):
    """Complete output from literature synthesis process.
    
    This model represents the complete results of analyzing a document,
    including concepts, relationships, research gaps, and a synthesis.
    
    Customization:
    - Add domain-specific metadata fields
    - Add fields for additional analysis artifacts
    
    Attributes:
        document_id: Unique identifier for the document
        document_metadata: Additional information about the document
        concepts: List of extracted concepts
        relationships: List of identified relationships
        research_gaps: List of potential research gaps
        synthesis_text: Comprehensive synthesis text
        timestamp: When the analysis was performed
    """
    document_id: str
    document_metadata: Dict[str, Any] = Field(default_factory=dict)
    concepts: List[Concept] = Field(default_factory=list)
    relationships: List[Relationship] = Field(default_factory=list)
    research_gaps: List[ResearchGap] = Field(default_factory=list)
    synthesis_text: Optional[str] = None
    timestamp: datetime = Field(default_factory=datetime.now)
    
    # You could add methods for analyzing results
    # Example:
    # def get_central_concepts(self, top_n=5):
    #     """Return the most connected concepts"""
    #     concept_connections = {}
    #     for rel in self.relationships:
    #         concept_connections[rel.source] = concept_connections.get(rel.source, 0) + 1
    #         concept_connections[rel.target] = concept_connections.get(rel.target, 0) + 1
    #     
    #     return sorted(concept_connections.items(), key=lambda x: x[1], reverse=True)[:top_n]

# ===== CONFIGURATION MODEL =====
# Controls system behavior and processing parameters

class LitSynthConfig(BaseModel):
    """Configuration for the Literature Synthesis system.
    
    This model controls how the system processes and analyzes documents.
    Adjust these parameters to customize the analysis for your needs.
    
    Customization Tips:
    - Increase chunk size for more context (but slower processing)
    - Adjust confidence thresholds based on LLM quality
    - Set your scientific domain for more targeted analysis
    - Update max counts based on document size and complexity
    
    Attributes:
        text_chunk_size: Characters per text chunk for processing
        text_chunk_overlap: Overlap between chunks to maintain context
        min_concept_importance: Minimum importance level to include
        extraction_confidence: Minimum confidence for concept extraction
        max_concepts: Maximum concepts to extract
        relationship_confidence: Minimum confidence for relationships
        max_relationships: Maximum relationships to extract
        scientific_domain: Optional domain specialization
    """
    # Text Processing Parameters
    text_chunk_size: int = Field(
        default=2000, ge=500, le=8000, 
        description="Characters per text chunk"
    )
    text_chunk_overlap: int = Field(
        default=200, ge=50, le=1000,
        description="Overlap between chunks"
    )
    
    # Concept Extraction Parameters
    min_concept_importance: Literal["low", "medium", "high"] = Field(
        default="medium", 
        description="Minimum importance level to include"
    )
    extraction_confidence: float = Field(
        default=0.7, ge=0.0, le=1.0,
        description="Minimum extraction confidence"
    )
    max_concepts: int = Field(
        default=25, ge=5, le=100,
        description="Maximum concepts to extract"
    )
    
    # Relationship Parameters
    relationship_confidence: float = Field(
        default=0.6, ge=0.0, le=1.0,
        description="Minimum relationship confidence"
    )
    max_relationships: int = Field(
        default=50, ge=10, le=200,
        description="Maximum relationships to extract"
    )
    
    # Customization
    scientific_domain: Optional[str] = Field(
        default=None,
        description="Scientific domain for specialized analysis"
    )
    
    @field_validator('text_chunk_overlap')
    @classmethod
    def validate_overlap(cls, v, info):
        """Ensure overlap is less than chunk size."""
        if 'text_chunk_size' in info.data and v >= info.data['text_chunk_size']:
            raise ValueError("text_chunk_overlap must be less than text_chunk_size")
        return v
        
    @field_validator('min_concept_importance')
    @classmethod
    def validate_importance(cls, v):
        """Validate importance level values."""
        valid_levels = ["low", "medium", "high"]
        if v not in valid_levels:
            raise ValueError(f"min_concept_importance must be one of {valid_levels}")
        return v
    
    # You can add domain-specific validation here
    # Example:
    # @field_validator('scientific_domain')
    # @classmethod
    # def validate_domain(cls, v):
    #     """Validate scientific domain if provided."""
    #     if v is not None:
    #         valid_domains = ["biology", "chemistry", "physics", "computer science", "medicine"]
    #         if v.lower() not in valid_domains:
    #             raise ValueError(f"Domain must be one of: {', '.join(valid_domains)}")
    #     return v

# Initialize default configuration
config = LitSynthConfig()

# Example of how to customize for a specific domain (commented out)
"""
# Customize for biology papers
bio_config = LitSynthConfig(
    text_chunk_size=3000,  # Longer chunks for biological context
    scientific_domain="biology",
    max_concepts=40,  # Biology papers often have more terms
    relationship_confidence=0.65  # Slightly higher threshold
)
"""

# Show confirmation
show_info("Data models configured successfully - feel free to customize them for your scientific domain")
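
The `text_chunk_size` and `text_chunk_overlap` settings above are easiest to understand on a toy example. The sketch below is a simplified fixed-size sliding window, not the notebook's actual chunker (which also looks for section headers and sentence boundaries). The function name `sliding_chunks` and the sample string are hypothetical and exist only for illustration.

```python
from typing import List

def sliding_chunks(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break  # last window reached, stop instead of re-emitting the tail
        start = end - overlap  # step back so neighbouring chunks share context
    return chunks

# 25-character demo string: "0123456789012345678901234"
demo = "".join(str(i % 10) for i in range(25))
parts = sliding_chunks(demo, chunk_size=10, overlap=3)
for part in parts:
    print(repr(part))
```

Each chunk starts with the last three characters of the previous one, which is the role `text_chunk_overlap` plays at a larger scale: concepts that straddle a chunk boundary still appear whole in at least one chunk.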

## Core Functions

<small>
This section contains the heart of our Literature Synthesis system - the functions that analyze documents, extract key information, and generate insights.

### What's Included

1. **Prompt Library**: The instructions we give to the AI model
2. **Function Definitions**: The code that processes documents and manages the analysis

### How to Customize

You can easily modify the system's behavior by:

- **Changing prompts**: Edit the instructions to focus on specific types of information
- **Adjusting parameters**: Fine-tune the analysis by modifying the `config` settings

No coding knowledge is required - simply edit the text of prompts in the first code cell below.
</small>
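
When editing a prompt, it helps to know how the braces behave: a single pair such as `{text}` marks a placeholder that the system fills in, while a doubled pair `{{ ... }}` produces a literal brace in the final prompt. The sketch below uses Python's built-in `str.format` as a stand-in to show that convention; the notebook itself builds prompts with LangChain's `ChatPromptTemplate`, and `demo_prompt` is a hypothetical, shortened prompt used only for illustration.

```python
# Hypothetical, shortened prompt for illustration only
demo_prompt = """
TEXT:
{text}

Return ONLY a JSON array with this structure:
[
  {{
    "name": "Concept name",
    "importance": "high|medium|low"
  }}
]
"""

# {text} is replaced; the doubled braces collapse to single literal braces
rendered = demo_prompt.format(text="CRISPR enables targeted genome editing.")
print(rendered)
```

If you add your own JSON examples to a prompt, double every literal brace in the same way; a single unescaped brace would be read as a missing placeholder.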


In [None]:
# 📖 Prompt Library
"""
This section contains all prompts used by the system.
You can safely modify these prompts to customize how the AI analyzes your documents.
Each prompt includes clear instructions for the AI and expected output format.

CUSTOMIZATION TIPS:
1. Keep the output format instructions intact (especially for JSON responses)
2. Feel free to add domain-specific instructions or examples
3. You can emphasize certain aspects by adding more detailed instructions
4. Test your changes with small documents first
"""

PROMPTS = {
    # === CONCEPT EXTRACTION ===
    # Purpose: Extract scientific concepts with definitions and importance
    # Output: List of concept objects
    "concept_extraction": """
    You are a scientific knowledge extraction expert. Extract the key scientific concepts 
    from the following text.
    
    TEXT:
    {text}
    
    INSTRUCTIONS:
    1. Identify main scientific concepts, terms, and ideas
    2. Provide clear definitions for each concept
    3. Rate importance as "high", "medium", or "low"
    4. Focus on domain-specific terminology
    
    Return ONLY a JSON array of concepts with this structure:
    ```json
    [
      {{
        "name": "Concept name",
        "definition": "Clear definition of the concept",
        "importance": "high|medium|low",
        "concept_type": "term|method|theory|etc"
      }}
    ]
    ```
    
    CONFIG PARAMETERS:
    {config}
    """,
    
    # === RELATIONSHIP MAPPING ===
    # Purpose: Identify how concepts relate to each other
    # Output: List of relationship objects
    "relationship_mapping": """
    You are a scientific knowledge graph expert. Identify relationships between the 
    following scientific concepts.
    
    TEXT:
    {text}
    
    CONCEPTS:
    {concepts}
    
    INSTRUCTIONS:
    1. Analyze how concepts relate to each other in the text
    2. Identify specific relationship types (causes, influences, measures, etc.)
    3. Only include relationships with evidence in the text
    4. Assign confidence scores based on clarity of evidence
    
    Return ONLY a JSON array of relationships with this structure:
    ```json
    [
      {{
        "source": "Source concept name",
        "target": "Target concept name",
        "relationship_type": "Specific type of relationship",
        "evidence": "Brief evidence from text",
        "confidence": 0.8
      }}
    ]
    ```
    """,
    
    # === RESEARCH GAP IDENTIFICATION ===
    # Purpose: Identify potential research gaps or opportunities
    # Output: List of research gap objects
    "research_gap": """
    You are a research direction consultant. Identify potential research gaps based on 
    the following analysis.
    
    TEXT:
    {text}
    
    CONCEPTS:
    {concepts}
    
    RELATIONSHIPS:
    {relationships}
    
    INSTRUCTIONS:
    1. Identify unexplored connections between concepts
    2. Look for limitations mentioned in the text
    3. Consider methodological gaps
    4. Identify questions raised but not answered
    
    Return ONLY a JSON array of research gaps with this structure:
    ```json
    [
      {{
        "description": "Clear description of the research gap",
        "related_concepts": ["Concept1", "Concept2"],
        "evidence": "Supporting evidence from text",
        "importance": "high|medium|low"
      }}
    ]
    ```
    """,
    
    # === SYNTHESIS GENERATION ===
    # Purpose: Generate comprehensive synthesis of the analysis
    # Output: Formatted text summary
    "synthesis_generation": """
    You are a scientific research synthesizer. Create a comprehensive synthesis of the following analysis.
    
    TEXT EXTRACT:
    {text}
    
    CONCEPTS:
    {concepts}
    
    RELATIONSHIPS:
    {relationships}
    
    RESEARCH GAPS:
    {gaps}
    
    INSTRUCTIONS:
    1. Summarize the key concepts and their relationships
    2. Highlight the most important findings
    3. Discuss potential research opportunities
    4. Use clear, concise language suitable for researchers
    
    Format your response as markdown with sections for:
    1. Overview
    2. Key Concepts
    3. Relationships & Patterns
    4. Research Opportunities
    5. Conclusion
    """
}

# You can customize prompts above to better suit your specific domain or preferences
# The system will use these prompts when analyzing documents
# Remember to keep the JSON output structure intact for structured data prompts

show("Prompt library initialized with 4 customizable prompts", "success")
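
The structured prompts above ask the model to return only a JSON array, usually wrapped in a fenced `json` block. The sketch below shows one way such a reply could be turned into Python data using only the standard library. It is an illustration under stated assumptions, not the notebook's own parser: `extract_json_array` and `sample_reply` are hypothetical names, and the fence marker is built programmatically so that no literal backtick fence appears inside the example.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built programmatically for this example

def extract_json_array(reply: str) -> list:
    """Return the first JSON array found in an LLM reply, or [] if none parses."""
    candidates = []
    # 1. Prefer the contents of a fenced block, if one is present
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, reply, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1))
    # 2. Otherwise fall back to the outermost [...] span in the reply
    start, end = reply.find("["), reply.rfind("]")
    if start != -1 and end > start:
        candidates.append(reply[start:end + 1])
    for candidate in candidates:
        try:
            data = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(data, list):
            return data
    return []

sample_reply = (
    "Here are the concepts:\n"
    + FENCE + "json\n"
    + '[{"name": "Entropy", "importance": "high", "tags": ["thermo"]}]\n'
    + FENCE
)
concepts = extract_json_array(sample_reply)
print(concepts)
```

This is why the customization tips recommend keeping the output-format instructions intact: if an edited prompt stops asking for a JSON array, there is nothing for the parsing step to recover.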

In [None]:
# 📊 Core Functions
"""
This section contains all the core functionality for the Literature Synthesis system.
Each function is designed to be modular, well-documented, and easy to customize.
"""

import json, re, os, time, hashlib
from typing import List, Dict, Any, Optional, Type
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableLambda

# ===== DOCUMENT PROCESSING =====
# These functions handle loading documents and text processing
# You can customize chunking parameters in the LitSynthConfig

def load_document(source_type: str, content: str) -> Dict[str, Any]:
    """Load document from various sources (text or PDF)
    
    Args:
        source_type: Type of document ("text" or "pdf")
        content: Document content or file path
        
    Returns:
        Dictionary with extracted text and metadata
    """
    if source_type == "text":
        return {"text": content, "metadata": {"source_type": "text", "length": len(content)}}
    elif source_type == "pdf":
        try:
            from pypdf import PdfReader
            reader = PdfReader(content)
            text = "\n\n".join(page.extract_text() for page in reader.pages)
            return {"text": text, "metadata": {"source_type": "pdf", "filename": os.path.basename(content), "pages": len(reader.pages)}}
        except Exception as e:
            return {"text": "", "metadata": {"error": str(e)}}
    else:
        return {"text": "", "metadata": {"error": "Unsupported source type"}}

def chunk_text(text: str, config: LitSynthConfig) -> List[str]:
    """Split text into semantically coherent chunks optimized for scientific papers
    
    Args:
        text: The text to chunk
        config: Configuration providing text_chunk_size and text_chunk_overlap
        
    Returns:
        List of text chunks optimized for scientific content
    """
    if not text: 
        return []
    
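    # Safety bounds: clamp the configured chunk size to 1000-4000 characters and
    # cap the overlap at a quarter of the chunk size, whatever the config says
    chunk_size = min(max(config.text_chunk_size, 1000), 4000)
    overlap = min(config.text_chunk_overlap, chunk_size // 4)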
    
    # Scientific paper section headers (common patterns in papers)
    section_headers = [
        r'\n+\s*ABSTRACT\s*\n+',
        r'\n+\s*INTRODUCTION\s*\n+', 
        r'\n+\s*METHODS?\s*\n+',
        r'\n+\s*RESULTS\s*\n+',
        r'\n+\s*DISCUSSION\s*\n+',
        r'\n+\s*CONCLUSION\s*\n+',
        r'\n+\s*REFERENCES\s*\n+'
    ]
    
    # First try to split by major sections
    section_splits = [0]
    for pattern in section_headers:
        for match in re.finditer(pattern, text, re.IGNORECASE):
            section_splits.append(match.start())
    section_splits.append(len(text))
    section_splits = sorted(set(section_splits))
    
    # Generate base chunks from sections
    base_chunks = []
    if len(section_splits) > 2:  # More than just start and end
        for i in range(len(section_splits) - 1):
            start, end = section_splits[i], section_splits[i+1]
            if end - start > 100:  # Avoid tiny sections
                base_chunks.append(text[start:end].strip())
    else:
        base_chunks = [text]  # No clear sections, use whole text
    
    # Further split any chunks that exceed the max size
    final_chunks = []
    for chunk in base_chunks:
        if len(chunk) <= chunk_size:
            final_chunks.append(chunk)
        else:
            # Split large chunks by paragraphs or sentences
            start = 0
            while start < len(chunk):
                end = min(start + chunk_size, len(chunk))
                
                if end < len(chunk):
                    # Try paragraph boundaries
                    para_end = chunk.rfind("\n\n", start + (chunk_size // 2), end)
                    if para_end > start + 200:
                        end = para_end + 2
                    else:
                        # Fall back to sentence boundaries
                        for sep in [". ", ".\n", "? ", "! "]:
                            sent_end = chunk.rfind(sep, start + (chunk_size // 2), end)
                            if sent_end > start + 200:
                                end = sent_end + len(sep)
                                break
                
                final_chunks.append(chunk[start:end].strip())
                if end >= len(chunk):
                    break  # Reached the end; avoid emitting a redundant overlap-only tail chunk
                start = max(end - overlap, start + (chunk_size // 2))
    
    return final_chunks

# ===== LLM INTEGRATION =====
# These functions manage LLM interactions with enhanced robustness
# The parsing logic handles various LLM output formats

def create_chain(prompt_name: str, output_model=None):
    """Create an LCEL chain for a specific LLM task
    
    Args:
        prompt_name: Name of the prompt from PROMPTS dictionary
        output_model: Optional Pydantic model for structured output
        
    Returns:
        LCEL chain configured for the specified task
    """
    # Guard against missing prompts
    if prompt_name not in PROMPTS:
        show(f"Prompt '{prompt_name}' not found in prompt library", "error")
        return None
        
    template = ChatPromptTemplate.from_template(PROMPTS[prompt_name])
    
    def parse_output(response):
        """Parse LLM response with multi-strategy approach"""
        content = response.content if hasattr(response, 'content') else response
        
        # For text output (synthesis), return directly
        if not output_model:
            return content
        
        # For structured output, try multiple parsing strategies:
        
        # 1. Code block extraction
        if isinstance(content, str) and "```" in content:
            try:
                # Handle ```json blocks
                if "```json" in content:
                    json_block = content.split("```json")[1].split("```")[0].strip()
                    data = json.loads(json_block)
                    return [output_model(**item) for item in data]
                # Handle any other code blocks
                else:
                    code_block = content.split("```")[1].split("```")[0].strip()
                    if code_block.strip().startswith("[") and code_block.strip().endswith("]"):
                        data = json.loads(code_block)
                        return [output_model(**item) for item in data]
            except Exception as e:
                show(f"Code block parsing error: {str(e)}", "debug")
        
        # 2. Direct JSON parsing
        try:
            data = json.loads(content)
            return [output_model(**item) for item in data]
        except Exception:
            pass
            
        # 3. Fallback: parse the outermost [...] span (also handles nested arrays
        #    such as "related_concepts": [...], which a non-greedy regex would cut short)
        try:
            start, end = content.find("["), content.rfind("]")
            if start != -1 and end > start:
                data = json.loads(content[start:end + 1])
                return [output_model(**item) for item in data]
        except Exception as e:
            show(f"All parsing methods failed: {str(e)}", "debug")
        
        return []
    
    # Create and return the chain
    if "llm" in globals():
        return template | llm | RunnableLambda(parse_output)
    else:
        return RunnableLambda(lambda _: [] if output_model else "Placeholder output (no LLM configured)")

def cached_run(chain, inputs: Dict, key_prefix: str = ""):
    """Run chain with caching to minimize API calls
    
    This function handles the complexity of caching LLM responses and properly
    reconstructing Pydantic model objects when retrieving from cache. Without
    this reconstruction step, cached results would be plain dictionaries
    lacking the methods and behaviors of the original model classes.
    
    Args:
        chain: LCEL chain to run
        inputs: Input parameters 
        key_prefix: Cache key prefix for identification (e.g., "concepts", "relationships")
        
    Returns:
        Chain output (cached or fresh), with proper Pydantic model types
    """
    # Map key_prefix to the appropriate model class (None means plain-text output)
    model_map = {
        "concepts": Concept,
        "relationships": Relationship,
        "gaps": ResearchGap
    }
    output_model = model_map.get(key_prefix)
    
    # Without a chain, return an empty result of the type the caller expects
    if not chain: 
        return [] if output_model else "No chain available"
    
    # Use the existing caching functions from Core Utilities
    if 'CACHE_ENABLED' in globals() and CACHE_ENABLED:
        # Prepare inputs for caching - hash long text so the key stays short
        # while still distinguishing documents that share the same opening
        cache_inputs = {}
        for k, v in inputs.items():
            if k == 'text' and isinstance(v, str) and len(v) > 500:
                cache_inputs[k] = hashlib.md5(v.encode()).hexdigest()  # Digest of the full text
            else:
                cache_inputs[k] = v
                
        # Add prefix to differentiate between similar calls
        cache_inputs['_function'] = key_prefix
        
        # Try to get cached result
        try:
            key = cache_key(**cache_inputs)
            cached_result = get_cache(key)
            
            if cached_result is not None:
                show(f"Using cached result for {key_prefix}", "debug")
                
                # Convert dictionaries back to Pydantic models if needed
                if output_model and isinstance(cached_result, list) and cached_result:
                    # Check if we need to reconstruct models (if first item is a dict)
                    if isinstance(cached_result[0], dict):
                        try:
                            cached_result = [output_model(**item) for item in cached_result]
                        except Exception as e:
                            show(f"Model reconstruction error: {str(e)}", "debug")
                
                return cached_result
        except Exception as e:
            show(f"Cache access error: {str(e)}", "debug")
    
    # Run chain if not in cache or caching disabled
    try:
        result = chain.invoke(inputs)
        
        # Try to cache the result
        if 'CACHE_ENABLED' in globals() and CACHE_ENABLED:
            try:
                set_cache(key, result)
            except Exception as e:
                show(f"Cache storage error: {str(e)}", "debug")
                
        return result
    except Exception as e:
        error_msg = str(e)
        show(f"Error in {key_prefix}: {error_msg}", "error")
        return [] if output_model else f"Error: {error_msg}"

# ===== ANALYSIS FUNCTIONS =====
# These functions implement the core literature analysis capabilities
# Each function can be customized through the corresponding prompt

def extract_concepts(text: str, config: LitSynthConfig) -> List[Concept]:
    """Extract scientific concepts from text
    
    Args:
        text: Scientific text to analyze
        config: Configuration parameters
        
    Returns:
        List of extracted Concept objects with definitions
    """
    if not text or len(text.strip()) < 20:
        show("Text too short for concept extraction", "warning")
        return []
    
    show(f"Extracting concepts from text ({len(text)} chars)...", "info")
    
    try:
        # Create chain and run extraction
        chain = create_chain("concept_extraction", Concept)
        concepts = cached_run(chain, {"text": text, "config": config.model_dump()}, "concepts")
        
        # Filter and sort by importance
        importance_map = {"high": 3, "medium": 2, "low": 1}
        min_value = importance_map.get(config.min_concept_importance, 1)
        
        if not concepts:
            show("No concepts extracted", "warning")
            return []
            
        filtered = [c for c in concepts if importance_map.get(c.importance, 0) >= min_value]
        filtered.sort(key=lambda c: importance_map.get(c.importance, 0), reverse=True)
        
        show(f"Extracted {len(filtered[:config.max_concepts])} concepts", "info")
        return filtered[:config.max_concepts]
        
    except Exception as e:
        show(f"Error in concept extraction: {str(e)}", "error")
        return []

def identify_relationships(text: str, concepts: List[Concept], config: LitSynthConfig) -> List[Relationship]:
    """Map relationships between concepts
    
    Args:
        text: Source text
        concepts: List of extracted concepts
        config: Configuration parameters
        
    Returns:
        List of Relationship objects connecting concepts
    """
    if not text or not concepts: 
        return []
    
    # Format concepts for prompt
    concepts_text = "\n".join([f"- {c.name}: {c.definition}" for c in concepts])
    
    chain = create_chain("relationship_mapping", Relationship)
    relationships = cached_run(chain, {
        "text": text, 
        "concepts": concepts_text
    }, "relationships")
    
    # Filter valid relationships
    concept_names = {c.name for c in concepts}
    valid = [r for r in relationships 
            if r.source in concept_names and r.target in concept_names 
            and (not hasattr(r, 'confidence') or r.confidence >= config.relationship_confidence)]
    
    # Sort by confidence if available
    if valid and hasattr(valid[0], 'confidence'):
        valid.sort(key=lambda r: getattr(r, 'confidence', 0), reverse=True)
        
    return valid[:config.max_relationships]

def identify_research_gaps(text: str, concepts: List[Concept], relationships: List[Relationship]) -> List[ResearchGap]:
    """Identify research gaps and opportunities
    
    Args:
        text: Source text
        concepts: List of extracted concepts
        relationships: List of identified relationships
        
    Returns:
        List of ResearchGap objects
    """
    if not text: 
        return []
    
    # Prepare formatted inputs
    concepts_text = "\n".join([f"- {c.name}: {c.definition}" for c in concepts])
    relationships_text = "\n".join([f"- {r.source} {r.relationship_type} {r.target}" for r in relationships])
    
    chain = create_chain("research_gap", ResearchGap)
    return cached_run(chain, {
        "text": text, 
        "concepts": concepts_text, 
        "relationships": relationships_text
    }, "gaps")

def generate_synthesis(text: str, concepts: List[Concept], relationships: List[Relationship], gaps: List[ResearchGap]) -> str:
    """Generate synthesis text summarizing the analysis
    
    Args:
        text: Source text
        concepts: List of extracted concepts
        relationships: List of identified relationships
        gaps: List of research gaps
        
    Returns:
        Formatted synthesis text
    """
    if not text or not concepts: 
        return "Insufficient data for synthesis."
    
    # Prepare formatted inputs
    concepts_text = "\n".join([f"- {c.name}: {c.definition} (Importance: {c.importance})" for c in concepts])
    relationships_text = "\n".join([f"- {r.source} {r.relationship_type} {r.target}" for r in relationships])
    gaps_text = "\n".join([f"- {g.description} (Importance: {g.importance})" for g in gaps])
    
    chain = create_chain("synthesis_generation")
    return cached_run(chain, {
        "text": text, 
        "concepts": concepts_text, 
        "relationships": relationships_text,
        "gaps": gaps_text
    }, "synthesis")

# ===== MAIN ANALYSIS FUNCTION =====
# This orchestrates the entire analysis process

def analyze_document(source_type: str, content: str, config: Optional[LitSynthConfig] = None) -> LiteratureSynthesisOutput:
    """Complete end-to-end document analysis
    
    Args:
        source_type: Document type ("text" or "pdf")
        content: Document content or file path
        config: Configuration parameters (optional)
        
    Returns:
        Complete analysis results in LiteratureSynthesisOutput container
    """
    config = config or LitSynthConfig()
    
    try:
        # 1. Load and process document
        doc_data = load_document(source_type, content)
        text = doc_data.get("text", "")
        if not text: 
            raise ValueError("Failed to extract text")
        doc_id = hashlib.md5(text[:1000].encode()).hexdigest()[:10]
        
        # 2. Extract concepts (chunking if needed)
        if len(text) > config.text_chunk_size:
            # Process in chunks
            chunks = chunk_text(text, config)
            all_concepts = []
            for chunk in chunks:
                all_concepts.extend(extract_concepts(chunk, config))
            
            # Deduplicate keeping highest importance
            concepts_map = {}
            importance_rank = {"high": 3, "medium": 2, "low": 1}
            for concept in all_concepts:
                name = concept.name.lower()
                if name not in concepts_map or importance_rank.get(concept.importance, 0) > importance_rank.get(concepts_map[name].importance, 0):
                    concepts_map[name] = concept
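            # Re-rank and cap: each chunk can contribute up to max_concepts on its own
            concepts = sorted(
                concepts_map.values(),
                key=lambda c: importance_rank.get(c.importance, 0),
                reverse=True
            )[:config.max_concepts]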
        else:
            # Process directly
            concepts = extract_concepts(text, config)
        
        # 3. Extract relationships, gaps, and generate synthesis
        relationships = identify_relationships(text, concepts, config)
        gaps = identify_research_gaps(text, concepts, relationships)
        synthesis_text = generate_synthesis(text, concepts, relationships, gaps)
        
        # 4. Create output container
        return LiteratureSynthesisOutput(
            document_id=doc_id,
            document_metadata=doc_data.get("metadata", {}),
            concepts=concepts,
            relationships=relationships,
            research_gaps=gaps,
            synthesis_text=synthesis_text
        )
        
    except Exception as e:
        # Return error container
        return LiteratureSynthesisOutput(
            document_id=f"error_{int(time.time())}",
            document_metadata={"error": str(e)},
            concepts=[],
            relationships=[],
            research_gaps=[],
            synthesis_text=f"Analysis error: {str(e)}"
        )

# Initialization complete
show("Core functions initialized", "success")
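
When a document is processed in chunks, the same concept can be extracted several times with different importance ratings. The sketch below illustrates the merge step in isolation: keep the highest-rated copy of each concept (compared case-insensitively), rank the survivors, and cap the list. It is a simplified stand-in, not the notebook's implementation; `DemoConcept` and `merge_concepts` are hypothetical names, and `DemoConcept` replaces the real `Concept` model so the example runs without any dependencies.

```python
from typing import Dict, List, NamedTuple

class DemoConcept(NamedTuple):
    """Hypothetical stand-in for the notebook's Concept model."""
    name: str
    importance: str

IMPORTANCE_RANK = {"high": 3, "medium": 2, "low": 1}

def rank(concept: DemoConcept) -> int:
    """Numeric rank of a concept's importance (unknown labels rank lowest)."""
    return IMPORTANCE_RANK.get(concept.importance, 0)

def merge_concepts(concepts: List[DemoConcept], max_concepts: int) -> List[DemoConcept]:
    """Deduplicate by lower-cased name, keeping the most important copy."""
    best: Dict[str, DemoConcept] = {}
    for concept in concepts:
        key = concept.name.strip().lower()
        if key not in best or rank(concept) > rank(best[key]):
            best[key] = concept
    # Highest importance first, then cap the list
    return sorted(best.values(), key=rank, reverse=True)[:max_concepts]

merged = merge_concepts(
    [
        DemoConcept("Entropy", "low"),       # from chunk 1
        DemoConcept("entropy", "high"),      # same concept, rated higher in chunk 2
        DemoConcept("Enthalpy", "medium"),
        DemoConcept("Gibbs energy", "low"),
    ],
    max_concepts=2,
)
print([c.name for c in merged])
```

Ranking before truncating matters here: capping first could discard a high-importance concept simply because it appeared late in the document.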

## Launch UI

In [None]:
# System Initialization
from pathlib import Path

try:
    # Initialize the Literature Synthesis System
    class LitSynthSystem:
        """Main system class that coordinates all components of the Literature Synthesis system."""
        
        def __init__(self, config=None):
            """Initialize the system with configuration and components."""
            # Use existing config or create new one
            self.config = config or globals().get('config', LitSynthConfig())
            
            # No cache setup needed - already handled in Core Utilities
        
        def analyze_document(self, source_type, content):
            """Main entry point to analyze a document."""
            return analyze_document(source_type, content, self.config)
        
        def extract_concepts_from_text(self, text):
            """Extract concepts from text directly."""
            return extract_concepts(text, self.config)
        
        def identify_relationships_from_concepts(self, text, concepts):
            """Identify relationships between concepts."""
            return identify_relationships(text, concepts, self.config)
        
        def identify_gaps_from_concepts_relationships(self, text, concepts, relationships):
            """Identify research gaps from concepts and relationships."""
            return identify_research_gaps(text, concepts, relationships)
        
        def generate_synthesis_from_components(self, text, concepts, relationships, gaps):
            """Generate synthesis from all components."""
            return generate_synthesis(text, concepts, relationships, gaps)

    # Initialize the system (connects previously defined components)
    litsynth = LitSynthSystem()
    
    # Confirm successful initialization
    show_info("Literature Synthesis System initialized successfully")
    
except Exception as e:
    # Report the failure and remind the user of the required cell order
    error_msg = f"System failed to initialize: {str(e)}"
    show(f"{error_msg}\nPlease make sure you run the cells in order: Installation, Initialization, Data Models, Core Functions", "error")

In [None]:
# UI Structure
import gradio as gr
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

def create_ui():
    """Create optimized Literature Synthesis UI."""
    
    with gr.Blocks() as app:
        gr.Markdown("# Literature Synthesis Expert System")
        
        with gr.Tabs() as tabs:
            # === INPUT TAB ===
            with gr.Tab("Document Input"):
                with gr.Row():
                    with gr.Column(scale=3):
                        # PDF Upload (only option)
                        pdf_input = gr.File(
                            label="Upload Scientific PDF", 
                            file_types=[".pdf"],
                            file_count="single"
                        )
                        
                        with gr.Row():
                            analysis_mode = gr.Radio(
                                choices=["quick", "balanced", "thorough"],
                                label="Analysis Mode",
                                value="quick",
                                info="Quick: 30-40s, Balanced: 1-2min, Thorough: 3-5min"
                            )
                            analyze_btn = gr.Button("Analyze Document", variant="primary")
                    
                    with gr.Column(scale=2):
                        status_box = gr.Textbox(label="Status", interactive=False)
                        progress_bar = gr.Slider(
                            minimum=0, maximum=100, value=0, 
                            label="Processing Progress",
                            interactive=False
                        )
            
            # === CONCEPTS TAB ===
            with gr.Tab("Concepts & Relationships"):
                with gr.Row():
                    # Add document metadata box at the top
                    doc_info = gr.Markdown("*Upload and analyze a document to see results*")
                
                with gr.Row():
                    with gr.Column():
                        concepts_filter = gr.Radio(
                            choices=["all", "high", "medium", "low"],
                            label="Filter by Importance",
                            value="all"
                        )
                        concepts_table = gr.DataFrame(
                            headers=["Concept", "Definition", "Importance", "Confidence"]
                        )
                    
                    with gr.Column():
                        relationships_table = gr.DataFrame(
                            headers=["Source", "Relationship", "Target", "Evidence", "Confidence"]
                        )
            
            # === VISUALIZATION TAB ===
            with gr.Tab("Visualization"):
                with gr.Row():
                    with gr.Column():
                        with gr.Row():
                            min_confidence = gr.Slider(
                                minimum=0.0, maximum=1.0, value=0.5, step=0.1,
                                label="Minimum Confidence"
                            )
                            layout_type = gr.Dropdown(
                                choices=["Force-directed", "Circular", "Spectral", "Spring"],
                                label="Layout Type",
                                value="Force-directed"
                            )
                            refresh_viz_btn = gr.Button("Refresh")
                        
                        network_plot = gr.Plot(label="Concept Network")
                            
            # === SYNTHESIS TAB ===
            with gr.Tab("Research Synthesis"):
                with gr.Row():
                    with gr.Column(scale=3):
                        synthesis_output = gr.Markdown()
                    
                    with gr.Column(scale=2):
                        gr.Markdown("### Research Gaps")
                        gaps_table = gr.DataFrame(
                            headers=["Description", "Related Concepts", "Importance"]
                        )
                        
                        with gr.Row():
                            export_format = gr.Dropdown(
                                choices=["Markdown", "Text", "JSON"],
                                label="Export Format",
                                value="Markdown"
                            )
                            export_btn = gr.Button("Export")
                            
            # === SETTINGS TAB ===
            with gr.Tab("Settings"):
                with gr.Row():
                    with gr.Column():
                        # Analysis mode
                        gr.Markdown("#### Analysis Settings")
                        settings_analysis_mode = gr.Radio(
                            choices=["quick", "balanced", "thorough"],
                            label="Analysis Mode",
                            value="quick",
                            info="Affects document sampling and processing depth"
                        )
                        
                        # Text processing settings
                        gr.Markdown("#### Text Processing")
                        chunk_size = gr.Slider(
                            minimum=500, maximum=8000, value=2000, step=500,
                            label="Chunk Size (chars)",
                            info="Larger chunks capture more context but process slower"
                        )
                        chunk_overlap = gr.Slider(
                            minimum=50, maximum=1000, value=200, step=50,
                            label="Chunk Overlap"
                        )
                        
                        # Concept settings
                        gr.Markdown("#### Concepts")
                        min_importance = gr.Dropdown(
                            choices=["low", "medium", "high"],
                            label="Min Importance",
                            value="medium"
                        )
                        max_concepts = gr.Slider(
                            minimum=5, maximum=100, value=25, step=5,
                            label="Max Concepts"
                        )
                        
                        # Relationship settings
                        gr.Markdown("#### Relationships")
                        relationship_confidence = gr.Slider(
                            minimum=0.0, maximum=1.0, value=0.6, step=0.1,
                            label="Min Confidence"
                        )
                        max_relationships = gr.Slider(
                            minimum=10, maximum=200, value=50, step=10,
                            label="Max Relationships"
                        )
                        
                        apply_settings_btn = gr.Button("Apply Settings", variant="primary")
                        settings_status = gr.Textbox(label="Settings Status", interactive=False)
        
        # === State Variables ===
        results_state = gr.State(None)
    
    return app

# Create the UI
ui = create_ui()

# Display success message in notebook
show_info("UI structure defined successfully")

In [None]:
# UI Launch
import tempfile
import os
import time
import json
import hashlib  # used below to derive a short document ID
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd
from pathlib import Path

def launch_litsynth_ui():
    """Launch the Literature Synthesis Expert System UI."""
    
    # Document Processing Functions
    def smart_sample_document(text, sample_percentage=15, min_chars=4000, max_chars=15000):
        """Sample document to reduce processing time while keeping key sections."""
        if not text or len(text) <= min_chars:
            return (text, 100) if text else ("", 0)
            
        target_size = max(min_chars, min(max_chars, int(len(text) * sample_percentage / 100)))
        
        # Extract document sections. Case-insensitivity is applied through
        # re.IGNORECASE rather than inline (?i) flags: joining the patterns with
        # '|' would leave (?i) in the middle of the expression, which
        # Python 3.11+ rejects with an error.
        import re
        section_patterns = {
            'abstract': r'abstract\s*\n',
            'introduction': r'(introduction|background)\s*\n',
            'methods': r'(methods|methodology|materials\s+and\s+methods)\s*\n',
            'results': r'results\s*\n',
            'discussion': r'discussion\s*\n',
            'conclusion': r'(conclusion|conclusions|summary)\s*\n'
        }
        any_heading = re.compile('|'.join(section_patterns.values()), re.IGNORECASE)
        
        sections = {}
        for name, pattern in section_patterns.items():
            match = re.search(pattern, text, re.IGNORECASE)
            if match:
                # A section runs from the end of its heading to the next heading
                start = match.end()
                next_heading = any_heading.search(text, start)
                end = next_heading.start() if next_heading else len(text)
                sections[name] = (start, end)
        
        # Prioritize sections or use beginning-middle-end approach
        sampled_text = ""
        if sections:
            priority_order = ['abstract', 'introduction', 'conclusion', 'discussion', 'results', 'methods']
            chars_remaining = target_size
            
            for section in priority_order:
                if section in sections and chars_remaining > 0:
                    start, end = sections[section]
                    section_text = text[start:end]
                    chars_to_take = min(len(section_text), chars_remaining)
                    sampled_text += section_text[:chars_to_take] + "\n\n"
                    chars_remaining -= chars_to_take
        
        if len(sampled_text) < min_chars or not sections:
            sampled_text = ""
            part_size = target_size // 3
            
            sampled_text += text[:part_size] + "\n\n"
            if len(text) > part_size * 3:
                middle_start = len(text) // 2 - part_size // 2
                sampled_text += "[...]\n\n" + text[middle_start:middle_start + part_size] + "\n\n"
            if len(text) > part_size * 2:
                # Slice to the end of the text (the original indexed a single character)
                sampled_text += "[...]\n\n" + text[max(len(text) - part_size, part_size * 2):]
        
        coverage = min(100, round(len(sampled_text) / len(text) * 100))
        return sampled_text, coverage
    
    def process_pdf_document(pdf_file, analysis_mode="quick"):
        """Process PDF with staged concept extraction and relationship mapping."""
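        if not pdf_file:
            # This function is a generator, so messages must be yielded;
            # a value passed to `return` would never reach the UI.
            yield "Please upload a PDF file.", None, 0
            return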
        
        try:
            yield "Loading PDF document...", None, 0
            start_time = time.time()
            
            # Process file upload
            temp_path = Path(tempfile.mkdtemp()) / "uploaded.pdf"
            if hasattr(pdf_file, 'name'):
                with open(pdf_file.name, "rb") as src_file, open(temp_path, "wb") as dest_file:
                    dest_file.write(src_file.read())
            else:
                with open(temp_path, "wb") as f:
                    f.write(pdf_file)
            
            yield "Extracting text...", None, 10
            doc_data = load_document("pdf", str(temp_path))
            text = doc_data.get("text", "")
            
            if not text:
                # Yield the message (generator), then stop processing
                yield "Failed to extract text from PDF.", None, 0
                return
            
            # Sample text based on analysis mode
            if len(text) <= 4000:
                sampled_text, coverage = text, 100
            else:
                sample_percent = {"quick": 10, "balanced": 25, "thorough": 50}.get(analysis_mode, 10)
                target_size = min(len(text), max(4000, int(len(text) * sample_percent / 100)))
                
                segment_size = min(target_size // 3, 3000)
                start_text = text[:segment_size]
                mid_point = len(text) // 2
                mid_text = text[mid_point - segment_size//2:mid_point + segment_size//2]
                end_text = text[max(0, len(text) - segment_size):]
                
                sampled_text = start_text + "\n\n[...]\n\n" + mid_text + "\n\n[...]\n\n" + end_text
                coverage = round((len(sampled_text) / len(text)) * 100)
            
            yield f"Processing document ({len(text)} characters, {coverage}% sample)...", None, 20
            doc_id = hashlib.md5(text[:1000].encode()).hexdigest()[:10]
            
            # Multi-stage analysis
            concepts, relationships, gaps = [], [], []
            synthesis = "No synthesis generated."
            
            # 1. Extract concepts
            try:
                yield "Extracting concepts...", None, 30
                current_config = config if 'config' in globals() else LitSynthConfig()
                concepts = extract_concepts(sampled_text, current_config)
                yield f"Found {len(concepts)} concepts", None, 50
            except Exception as e:
                print(f"ERROR in concept extraction: {str(e)}")
                yield f"Error extracting concepts: {str(e)}", None, 50
            
            if concepts:
                # 2. Identify relationships
                try:
                    yield "Identifying relationships...", None, 60
                    relationships = identify_relationships(sampled_text, concepts, current_config)
                    yield f"Found {len(relationships)} relationships", None, 70
                except Exception as e:
                    yield f"Error identifying relationships ({e}), continuing...", None, 70
                
                # 3. Identify research gaps
                try:
                    yield "Identifying research gaps...", None, 80
                    gaps = identify_research_gaps(sampled_text, concepts, relationships)
                    yield f"Found {len(gaps)} research gaps", None, 90
                except Exception as e:
                    yield f"Error identifying research gaps ({e}), continuing...", None, 90
                
                # 4. Generate synthesis
                try:
                    yield "Generating synthesis...", None, 95
                    synthesis = generate_synthesis(sampled_text, concepts, relationships, gaps)
                except Exception as e:
                    synthesis = "Synthesis generation failed. Please check the extracted concepts and relationships."
            else:
                yield "No concepts found, skipping further analysis...", None, 95
            
            # Package results
            results = LiteratureSynthesisOutput(
                document_id=doc_id,
                document_metadata={
                    "original_length": len(text),
                    "processed_length": len(sampled_text),
                    "coverage_percentage": coverage,
                    "processing_time": round(time.time() - start_time, 2),
                    "analysis_mode": analysis_mode
                },
                concepts=concepts or [],
                relationships=relationships or [],
                research_gaps=gaps or [],
                synthesis_text=synthesis or "No synthesis available."
            )
            
            processing_time = round(time.time() - start_time, 2)
            status_msg = (f"Analysis complete in {processing_time}s: {len(results.concepts)} concepts, "
                        f"{len(results.relationships)} relationships, {len(results.research_gaps)} research gaps "
                        f"({coverage}% of document processed)")
            
            yield status_msg, results, 100
            
        except Exception as e:
            # Yield rather than return so the error message is shown in the UI
            yield f"Error analyzing document: {str(e)}", None, 0
    
    # Visualization Function
    def create_network_visualization(concepts, relationships, min_confidence=0.5):
        """Create network visualization with improved readability."""
        if not concepts or len(concepts) < 2:
            fig, ax = plt.subplots(figsize=(8, 6))
            ax.text(0.5, 0.5, "Not enough concepts to create visualization", 
                   ha='center', va='center', fontsize=12)
            ax.axis('off')
            return fig
        
        # Create directed graph
        G = nx.DiGraph()
        
        # Add nodes and edges
        for concept in concepts:
            G.add_node(concept.name, importance=concept.importance, definition=concept.definition)
        
        edge_count = 0
        for rel in relationships:
            if rel.confidence >= min_confidence and rel.source in G.nodes and rel.target in G.nodes:
                G.add_edge(rel.source, rel.target, 
                          relationship=rel.relationship_type,
                          evidence=rel.evidence,
                          confidence=rel.confidence)
                edge_count += 1
        
        # Enhanced visualization styling
        plt.rcParams.update({'font.size': 12})
        fig, ax = plt.subplots(figsize=(12, 10))
        
        importance_colors = {"high": "#e41a1c", "medium": "#377eb8", "low": "#4daf4a"}
        node_colors = [importance_colors.get(G.nodes[node]["importance"], "#999999") for node in G.nodes]
        
        centrality = nx.degree_centrality(G)
        node_sizes = [3000 * (centrality[node] + 0.1) for node in G.nodes]
        
        pos = nx.spring_layout(G, k=0.4, seed=42)
        
        # Draw network with improved visibility
        nx.draw_networkx_nodes(G, pos, node_color=node_colors, node_size=node_sizes, 
                              alpha=0.85, edgecolors='white', linewidths=1.5)
        nx.draw_networkx_edges(G, pos, edge_color='#555555', width=2.0, alpha=0.7, 
                              arrows=True, arrowsize=20, node_size=node_sizes)
        
        # Improved label rendering
        labels_pos = {node: (pos[node][0], pos[node][1] + 0.02) for node in G.nodes}
        nx.draw_networkx_labels(G, labels_pos, font_size=12, font_weight='bold', 
                               bbox=dict(facecolor='white', alpha=0.7, edgecolor='none', 
                                        boxstyle='round,pad=0.3'))
        
        # Add legend
        legend_elements = [plt.Line2D([0], [0], marker='o', color='w', 
                                     markerfacecolor=color, markersize=12, 
                                     label=f"{importance.capitalize()} Importance") 
                          for importance, color in importance_colors.items()]
        ax.legend(handles=legend_elements, loc='upper right', fontsize=11, 
                 frameon=True, facecolor='white', edgecolor='#cccccc')
        
        ax.axis('off')
        ax.set_title(f"Concept Relationship Network ({edge_count} connections at {min_confidence:.1f}+ confidence)",
                     fontsize=14, fontweight='bold', pad=20)
        
        return fig
    
    # Display Helper Functions
    def update_doc_info(results):
        """Format document metadata for display."""
        if not results:
            return "*No document analyzed yet*"
            
        metadata = results.document_metadata
        return (f"### Document Analysis Details\n"
                f"**Coverage**: {metadata.get('coverage_percentage', 'Unknown')}% of document processed\n"
                f"**Processing Time**: {metadata.get('processing_time', 'Unknown')}s\n"
                f"**Analysis Mode**: {metadata.get('analysis_mode', 'Unknown')}\n"
                f"**Concepts**: {len(results.concepts)}, "
                f"**Relationships**: {len(results.relationships)}, "
                f"**Research Gaps**: {len(results.research_gaps)}")
    
    def update_concepts_display(results, filter_type="all"):
        """Format concepts data for display with optional filtering."""
        if not results or not hasattr(results, 'concepts') or not results.concepts:
            return None
        
        filtered_concepts = results.concepts if filter_type == "all" else [c for c in results.concepts if c.importance == filter_type]
        
        return pd.DataFrame({
            "Concept": [c.name for c in filtered_concepts],
            "Definition": [c.definition for c in filtered_concepts],
            "Importance": [c.importance.capitalize() for c in filtered_concepts],
            "Confidence": [f"{c.confidence:.2f}" for c in filtered_concepts]
        })
    
    def update_relationships_display(results):
        """Format relationships data for display."""
        if not results or not hasattr(results, 'relationships') or not results.relationships:
            return None
        
        return pd.DataFrame({
            "Source": [r.source for r in results.relationships],
            "Relationship": [r.relationship_type for r in results.relationships],
            "Target": [r.target for r in results.relationships],
            "Evidence": [r.evidence or "N/A" for r in results.relationships],
            "Confidence": [f"{r.confidence:.2f}" for r in results.relationships]
        })
    
    def update_gaps_display(results):
        """Format research gaps data for display."""
        if not results or not hasattr(results, 'research_gaps') or not results.research_gaps:
            return None
        
        return pd.DataFrame({
            "Description": [g.description for g in results.research_gaps],
            "Related Concepts": [", ".join(g.related_concepts) for g in results.research_gaps],
            "Importance": [g.importance.capitalize() for g in results.research_gaps]
        })
    
    def update_visualization(results, min_confidence):
        """Update network visualization based on minimum confidence."""
        if not results or not hasattr(results, 'concepts') or len(results.concepts) < 2:
            fig, ax = plt.subplots(figsize=(8, 6))
            ax.text(0.5, 0.5, "Not enough concepts to create visualization", 
                   ha='center', va='center', fontsize=12)
            ax.axis('off')
            return fig
        
        return create_network_visualization(results.concepts, results.relationships, min_confidence)
    
    def update_config(analysis_mode, chunk_size, chunk_overlap, min_importance, 
                     max_concepts, relationship_confidence, max_relationships):
        """Update system configuration."""
        try:
            new_config = LitSynthConfig(
                text_chunk_size=chunk_size,
                text_chunk_overlap=chunk_overlap,
                min_concept_importance=min_importance,
                max_concepts=max_concepts,
                relationship_confidence=relationship_confidence,
                max_relationships=max_relationships
            )
            
            litsynth.config = new_config
            settings_summary = f"Settings updated: {analysis_mode} mode, {chunk_size} chunk size, {min_importance} min importance"
            return new_config, settings_summary
        except Exception as e:
            return None, f"Error updating configuration: {str(e)}"
    
    # Create UI
    with gr.Blocks(css="""
        /* Table cell wrapping */
        table td {
            white-space: normal !important;
            word-wrap: break-word !important;
            max-width: 300px !important;
        }
        
        /* Clean styling */
        .section-header {
            font-weight: bold;
            margin-top: 10px;
            margin-bottom: 5px;
            border-bottom: 1px solid rgba(128, 128, 128, 0.3);
            padding-bottom: 3px;
        }
        
        .info-text {
            font-style: italic;
            margin-bottom: 10px;
        }
    """) as app:
        gr.Markdown("# Literature Synthesis Expert System")
        
        # State variables
        results_state = gr.State(None)
        
        with gr.Tabs() as tabs:
            # INPUT TAB
            with gr.Tab("Document Input"):
                with gr.Row():
                    with gr.Column(scale=3):
                        pdf_input = gr.File(
                            label="Upload Scientific PDF", 
                            file_types=[".pdf"],
                            file_count="single"
                        )
                        
                        with gr.Row():
                            analysis_mode = gr.Radio(
                                choices=["quick", "balanced", "thorough"],
                                label="Analysis Mode",
                                value="quick",
                                info="Quick: 30-40s, Balanced: 1-2min, Thorough: 3-5min"
                            )
                            analyze_btn = gr.Button("Analyze Document", variant="primary")
                    
                    with gr.Column(scale=2):
                        status_box = gr.Textbox(
                            label="Status", 
                            interactive=False,
                            value="Ready to analyze. Please upload a PDF document."
                        )
                        progress_bar = gr.Slider(
                            minimum=0, maximum=100, value=0, 
                            label="Processing Progress",
                            interactive=False
                        )
            
            # CONCEPTS TAB
            with gr.Tab("Concepts & Relationships"):
                with gr.Row():
                    doc_info = gr.Markdown("*Upload and analyze a document to see results*")
                
                gr.Markdown("### Understanding the Concepts Table")
                gr.Markdown("This table shows key concepts extracted from the document. Use the filter to focus on specific importance levels.")
                
                with gr.Row():
                    with gr.Column():
                        gr.Markdown("#### Key Concepts", elem_classes=["section-header"])
                        concepts_filter = gr.Radio(
                            choices=["all", "high", "medium", "low"],
                            label="Filter by Importance",
                            value="all"
                        )
                        concepts_table = gr.DataFrame(
                            headers=["Concept", "Definition", "Importance", "Confidence"]
                        )
                    
                    with gr.Column():
                        gr.Markdown("#### Relationships Between Concepts", elem_classes=["section-header"])
                        gr.Markdown("This table shows how concepts connect to each other.", elem_classes=["info-text"])
                        relationships_table = gr.DataFrame(
                            headers=["Source", "Relationship", "Target", "Evidence", "Confidence"]
                        )
            
            # VISUALIZATION TAB
            with gr.Tab("Visualization"):
                gr.Markdown("### Network Visualization Guide")
                gr.Markdown("This visualization shows how concepts relate to each other:")
                gr.Markdown("- **Red nodes**: High importance concepts")
                gr.Markdown("- **Blue nodes**: Medium importance concepts")
                gr.Markdown("- **Green nodes**: Low importance concepts")
                gr.Markdown("- **Node size**: Larger nodes have more connections")
                
                with gr.Row():
                    with gr.Column():
                        with gr.Row():
                            min_confidence = gr.Slider(
                                minimum=0.0, maximum=1.0, value=0.5, step=0.1,
                                label="Minimum Confidence",
                                info="Only show relationships with confidence above this threshold"
                            )
                            refresh_viz_btn = gr.Button("Refresh Visualization")
                        
                        network_plot = gr.Plot(label="Concept Network")
                            
            # SYNTHESIS TAB
            with gr.Tab("Research Synthesis"):
                with gr.Row():
                    with gr.Column(scale=3):
                        gr.Markdown("#### Overview & Key Findings", elem_classes=["section-header"])
                        synthesis_output = gr.Markdown()
                    
                    with gr.Column(scale=2):
                        gr.Markdown("#### Research Gaps", elem_classes=["section-header"])
                        gr.Markdown("These are potential areas for future research that weren't fully addressed.", elem_classes=["info-text"])
                        gaps_table = gr.DataFrame(
                            headers=["Description", "Related Concepts", "Importance"]
                        )
                            
            # SETTINGS TAB
            with gr.Tab("Settings"):
                gr.Markdown("### Settings Guide")
                gr.Markdown("These settings control how document analysis works. For most users, the default settings work well.", elem_classes=["info-text"])
                
                with gr.Row():
                    with gr.Column():
                        gr.Markdown("#### Analysis Settings", elem_classes=["section-header"])
                        settings_analysis_mode = gr.Radio(
                            choices=["quick", "balanced", "thorough"],
                            label="Analysis Mode",
                            value="quick",
                            info="Quick (30-40s): Basic overview | Balanced (1-2min): Standard analysis | Thorough (3-5min): Deep analysis"
                        )
                        
                        gr.Markdown("#### Text Processing", elem_classes=["section-header"])
                        chunk_size = gr.Slider(
                            minimum=500, maximum=8000, value=2000, step=500,
                            label="Chunk Size (chars)",
                            info="Larger = Better context but slower | Smaller = Faster but less context"
                        )
                        chunk_overlap = gr.Slider(
                            minimum=50, maximum=1000, value=200, step=50,
                            label="Chunk Overlap",
                            info="Higher overlap maintains context between chunks"
                        )
                        
                        gr.Markdown("#### Concepts Settings", elem_classes=["section-header"])
                        min_importance = gr.Dropdown(
                            choices=["low", "medium", "high"],
                            label="Min Importance",
                            value="medium",
                            info="Only include concepts above this importance level"
                        )
                        max_concepts = gr.Slider(
                            minimum=5, maximum=100, value=25, step=5,
                            label="Max Concepts",
                            info="Maximum number of concepts to extract"
                        )
                        
                        gr.Markdown("#### Relationships Settings", elem_classes=["section-header"])
                        relationship_confidence = gr.Slider(
                            minimum=0.0, maximum=1.0, value=0.6, step=0.1,
                            label="Min Confidence",
                            info="Only include relationships with confidence above this threshold"
                        )
                        max_relationships = gr.Slider(
                            minimum=10, maximum=200, value=50, step=10,
                            label="Max Relationships",
                            info="Maximum number of relationships to identify"
                        )
                        
                        apply_settings_btn = gr.Button("Apply Settings", variant="primary")
                        settings_status = gr.Textbox(label="Settings Status", interactive=False)
        
        # Event Handlers
        analyze_btn.click(
            fn=process_pdf_document,
            inputs=[pdf_input, analysis_mode],
            outputs=[status_box, results_state, progress_bar]
        ).then(
            fn=update_doc_info,
            inputs=[results_state],
            outputs=[doc_info]
        ).then(
            fn=update_concepts_display,
            inputs=[results_state, concepts_filter],  # respect the currently selected filter
            outputs=[concepts_table]
        ).then(
            fn=update_relationships_display,
            inputs=[results_state],
            outputs=[relationships_table]
        ).then(
            fn=update_gaps_display,
            inputs=[results_state],
            outputs=[gaps_table]
        ).then(
            fn=lambda x: x.synthesis_text if x and hasattr(x, 'synthesis_text') else "No synthesis available.",
            inputs=[results_state],
            outputs=[synthesis_output]
        ).then(
            fn=update_visualization,
            inputs=[results_state, min_confidence],
            outputs=[network_plot]
        )
        
        # Filter concepts by importance
        concepts_filter.change(
            fn=update_concepts_display,
            inputs=[results_state, concepts_filter],
            outputs=[concepts_table]
        )
        
        # Refresh visualization
        refresh_viz_btn.click(
            fn=update_visualization,
            inputs=[results_state, min_confidence],
            outputs=[network_plot]
        )
        
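        # Update system settings. Only the status message is routed to the UI:
        # update_config also returns the new config object, and writing that
        # into results_state would overwrite the stored analysis results.
        apply_settings_btn.click(
            fn=lambda *settings: update_config(*settings)[1],
            inputs=[
                settings_analysis_mode, chunk_size, chunk_overlap, 
                min_importance, max_concepts, relationship_confidence, max_relationships
            ],
            outputs=[settings_status]
        )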
    
    # Launch the app
    app.launch(inline=True, share=False)

# Launch with error handling
try:
    launch_litsynth_ui()
    show("UI launched successfully", "success")
except Exception as e:
    show("UI launch failed: " + str(e), "error")
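
The start/middle/end sampling that `process_pdf_document` applies to long documents can be tried in isolation. The sketch below is a minimal standalone version under stated assumptions: the function name `sample_start_middle_end` and the parameters `min_chars` and `max_segment` are illustrative and not part of the notebook's API, and the synthetic `document` string stands in for extracted PDF text. The percentages and limits mirror the values used in the cell above.

```python
def sample_start_middle_end(text, sample_percent=10, min_chars=4000, max_segment=3000):
    """Return (sampled_text, coverage_percent) built from three slices of text.

    Hypothetical helper for illustration; it mirrors the sampling branch of
    process_pdf_document but is not defined elsewhere in the notebook.
    """
    # Short documents are used in full
    if len(text) <= min_chars:
        return text, 100

    # Overall character budget, then an equal share for each of three segments
    target_size = min(len(text), max(min_chars, int(len(text) * sample_percent / 100)))
    segment_size = min(target_size // 3, max_segment)

    start_text = text[:segment_size]
    mid_point = len(text) // 2
    mid_text = text[mid_point - segment_size // 2 : mid_point + segment_size // 2]
    end_text = text[len(text) - segment_size:]

    # Mark the omitted regions so downstream prompts can see the gaps
    sampled = "\n\n[...]\n\n".join([start_text, mid_text, end_text])
    coverage = min(100, round(len(sampled) / len(text) * 100))
    return sampled, coverage


# Synthetic stand-in for extracted PDF text
document = "".join(f"sentence {i}. " for i in range(5000))
sampled, coverage = sample_start_middle_end(document, sample_percent=10)
print(f"{len(document)} chars in, {len(sampled)} chars sampled, {coverage}% coverage")
```

Because each segment is capped at `max_segment` characters, very long documents end up with a lower effective coverage than the nominal percentage for the chosen analysis mode, which is why the status message reports the measured coverage rather than the requested one.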

## [Temp] Diagnostics

In [None]:
# Diagnostic code to identify cache serialization issues
import json, os, time, hashlib
from pydantic import BaseModel, Field
from typing import List, Dict

# Create a simple test model
class TestModel(BaseModel):
    name: str
    value: int
    tags: List[str] = Field(default_factory=list)

# Set up a test cache
CACHE_DIR = "./test_cache"
os.makedirs(CACHE_DIR, exist_ok=True)

# Test models
test_model = TestModel(name="Test Item", value=42, tags=["test", "diagnostics"])
test_list = [
    TestModel(name="Item 1", value=10, tags=["first"]),
    TestModel(name="Item 2", value=20, tags=["second"])
]

# Functions to check
def cache_key(**kwargs):
    serialized = json.dumps({k: v for k, v in kwargs.items() if v is not None}, sort_keys=True)
    return hashlib.md5(serialized.encode()).hexdigest()

def basic_set_cache(key, value):
    """Basic version with no special handling"""
    path = os.path.join(CACHE_DIR, f"{key}_basic.json")
    try:
        with open(path, 'w') as f:
            json.dump({"timestamp": time.time(), "value": value}, f)
        return True
    except Exception as e:
        print(f"Basic cache error: {str(e)}")
        return False

def model_dump_cache(key, value):
    """Version with model_dump"""
    path = os.path.join(CACHE_DIR, f"{key}_dump.json")
    try:
        # Convert Pydantic models to dict
        def convert_model(obj):
            if hasattr(obj, 'model_dump'):
                return obj.model_dump()
            elif isinstance(obj, list):
                return [convert_model(item) for item in obj]
            elif isinstance(obj, dict):
                return {k: convert_model(v) for k, v in obj.items()}
            return obj
            
        with open(path, 'w') as f:
            json.dump({"timestamp": time.time(), "value": convert_model(value)}, f)
        return True
    except Exception as e:
        print(f"Model dump cache error: {str(e)}")
        return False

def get_cache(key, suffix):
    """Get from cache"""
    path = os.path.join(CACHE_DIR, f"{key}_{suffix}.json")
    try:
        with open(path, 'r') as f:
            return json.load(f)["value"]
    except Exception as e:
        print(f"Get cache error ({suffix}): {str(e)}")
        return None

# Run diagnostics
print("=== CACHE DIAGNOSTICS ===")
print(f"Test model: {test_model}")
print(f"Type: {type(test_model)}")
print(f"Has model_dump: {hasattr(test_model, 'model_dump')}")
print(f"Has dict method: {hasattr(test_model, 'dict')}")

# Generate keys
single_key = cache_key(model="single")
list_key = cache_key(model="list")

# Test basic caching (expect failure)
print("\n=== BASIC CACHE TEST ===")
basic_result_single = basic_set_cache(single_key, test_model)
basic_result_list = basic_set_cache(list_key, test_list)
print(f"Basic cache single model: {'Success' if basic_result_single else 'Failed'}")
print(f"Basic cache model list: {'Success' if basic_result_list else 'Failed'}")

# Test model_dump caching
print("\n=== MODEL DUMP CACHE TEST ===")
dump_result_single = model_dump_cache(single_key, test_model)
dump_result_list = model_dump_cache(list_key, test_list)
print(f"Model dump cache single: {'Success' if dump_result_single else 'Failed'}")
print(f"Model dump cache list: {'Success' if dump_result_list else 'Failed'}")

# Test retrieving
print("\n=== RETRIEVAL TEST ===")
# Try to get data (only the model_dump version should have worked)
retrieved_data = get_cache(single_key, "dump")
print(f"Retrieved data: {retrieved_data}")
print(f"Retrieved type: {type(retrieved_data)}")

# Test if we can access attributes directly (should fail with dict)
print("\n=== ATTRIBUTE ACCESS TEST ===")
try:
    name = retrieved_data.name
    print(f"Direct access succeeded: {name}")
except Exception as e:
    print(f"Direct access failed: {str(e)}")

# Test recreating models from cached data
print("\n=== MODEL RECONSTRUCTION TEST ===")
try:
    reconstructed = TestModel(**retrieved_data)
    print(f"Reconstructed: {reconstructed}")
    print(f"Can access attribute: {reconstructed.name}")
    print(f"Is same type: {type(reconstructed) is type(test_model)}")
except Exception as e:
    print(f"Reconstruction failed: {str(e)}")

# Test with list
print("\n=== LIST RECONSTRUCTION TEST ===")
retrieved_list = get_cache(list_key, "dump")
try:
    if retrieved_list:
        reconstructed_list = [TestModel(**item) for item in retrieved_list]
        print(f"Reconstructed list length: {len(reconstructed_list)}")
        print(f"First item: {reconstructed_list[0]}")
        print(f"Can access attribute: {reconstructed_list[0].name}")
except Exception as e:
    print(f"List reconstruction failed: {str(e)}")

print("\n=== DIAGNOSIS COMPLETE ===")
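
The diagnostics above suggest that a value containing Pydantic models only survives a JSON cache if it is converted with `model_dump()` on write and rebuilt into model instances on read. A minimal sketch of that round trip follows. It uses only the standard library: `Concept` is a hypothetical stand-in that merely mimics the `model_dump()` method, and `save_models` / `load_models` are illustrative helper names, not functions from this notebook.

```python
import json
import os
import tempfile
import time
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Concept:
    """Hypothetical stand-in for a Pydantic model (only mimics model_dump)."""
    name: str
    score: float

    def model_dump(self) -> dict:
        return asdict(self)


def save_models(path: str, items: List[Concept]) -> None:
    """Write plain dicts to disk, since json cannot serialize the objects themselves."""
    payload = {"timestamp": time.time(), "value": [item.model_dump() for item in items]}
    with open(path, "w") as f:
        json.dump(payload, f)


def load_models(path: str) -> List[Concept]:
    """Read the cached dicts and rebuild instances so attribute access works again."""
    with open(path, "r") as f:
        raw = json.load(f)["value"]
    return [Concept(**item) for item in raw]


cache_path = os.path.join(tempfile.mkdtemp(), "concepts.json")
original = [Concept("RNA-Seq", 0.9), Concept("cDNA", 0.7)]
save_models(cache_path, original)
restored = load_models(cache_path)
print(restored[0].name)
```

With real Pydantic models the rebuild step would presumably look like the `TestModel(**item)` call in the reconstruction test above; the stand-in class is used here only so the sketch runs without extra dependencies.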

## Test
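
The test cell that follows runs its checks as a chain: concept extraction feeds relationship identification, which feeds gap identification, which feeds synthesis. The sketch below illustrates that staged pattern in isolation, under stated assumptions. `run_stages` and the toy stage functions are hypothetical names introduced for illustration, and treating an empty result as a failure is a choice made for this sketch rather than the notebook's behavior.

```python
from typing import Any, Callable, Dict, List, Tuple

# Each stage is (name, function); the function receives the outputs of all
# earlier stages keyed by stage name. All names here are illustrative.
Stage = Tuple[str, Callable[[Dict[str, Any]], Any]]


def run_stages(stages: List[Stage]) -> List[Dict[str, Any]]:
    """Run stages in order; once one fails, mark the remaining ones as skipped."""
    outputs: Dict[str, Any] = {}
    report: List[Dict[str, Any]] = []
    blocked = False
    for name, func in stages:
        if blocked:
            report.append({"name": name, "status": "skipped", "error": "Dependency failed"})
            continue
        try:
            result = func(outputs)
            if not result:
                raise ValueError("empty result")
            outputs[name] = result
            report.append({"name": name, "status": "success", "error": None})
        except Exception as exc:
            blocked = True
            report.append({"name": name, "status": "failure", "error": str(exc)})
    return report


# Toy stages standing in for the LLM-backed functions
stages: List[Stage] = [
    ("concepts", lambda out: ["RNA-Seq", "cDNA"]),
    ("relationships", lambda out: []),  # returns nothing, so it counts as a failure
    ("gaps", lambda out: ["never reached"]),
]
report = run_stages(stages)
for row in report:
    print(row["name"], row["status"])
```

Recording skipped stages explicitly, rather than silently omitting them, keeps the results table the same length on every run, which makes failures easier to compare across runs.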

In [None]:
# 📊 Testing & Evaluation Framework
"""
This modular testing framework evaluates LLM function performance.
You can extend it with custom test cases and metrics.

EXTENSIBILITY FEATURES:
1. Add new test cases by creating TestCase subclasses
2. Define custom metrics by adding to the MetricsCollector
3. Modify UI display with custom result formatters
"""

import time, json, uuid, re, tempfile, os
import pandas as pd
import matplotlib.pyplot as plt
import gradio as gr
from typing import Dict, List, Any, Optional, Tuple, Callable, Union, Type
from IPython.display import Markdown, display, HTML
import tiktoken
from enum import Enum

# ===== CORE METRICS SYSTEM =====

class MetricsCollector:
    """Collects and aggregates metrics for LLM function evaluation"""
    
    def __init__(self):
        self.reset()
    
    def reset(self):
        """Reset all metrics"""
        self.calls = []
        self.total_tokens = 0
        self.token_cost = 0
        self.start_time = time.time()
    
    def record_call(self, function_name: str, duration: float, tokens: int = 0, 
                   success: bool = True, metadata: Dict = None):
        """Record an LLM function call with metrics"""
        self.calls.append({
            "function": function_name,
            "duration": duration,
            "tokens": tokens,
            "success": success,
            "timestamp": time.time() - self.start_time,
            "metadata": metadata or {}
        })
        self.total_tokens += tokens
        
        # Estimate cost (very rough approximation)
        # Can be extended with more precise model-specific costs
        self.token_cost += tokens * 0.00001
    
    def get_stats(self) -> Dict[str, Any]:
        """Get aggregated statistics"""
        if not self.calls:
            return {
                "calls": 0, 
                "avg_duration": 0, 
                "max_duration": 0,
                "total_tokens": 0,
                "success_rate": 0,
                "est_cost": "$0.00"
            }
        
        durations = [c["duration"] for c in self.calls]
        success_count = sum(1 for c in self.calls if c["success"])
        
        return {
            "calls": len(self.calls),
            "avg_duration": sum(durations) / len(durations),
            "max_duration": max(durations),
            "total_tokens": self.total_tokens,
            "success_rate": success_count / len(self.calls) if self.calls else 0,
            "est_cost": f"${self.token_cost:.4f}"
        }
    
    def get_function_stats(self) -> Dict[str, Dict[str, Any]]:
        """Get statistics broken down by function"""
        if not self.calls:
            return {}
            
        functions = {}
        for call in self.calls:
            func_name = call["function"]
            if func_name not in functions:
                functions[func_name] = {
                    "calls": 0,
                    "durations": [],
                    "tokens": 0,
                    "successes": 0
                }
            
            functions[func_name]["calls"] += 1
            functions[func_name]["durations"].append(call["duration"])
            functions[func_name]["tokens"] += call["tokens"]
            if call["success"]:
                functions[func_name]["successes"] += 1
        
        # Calculate aggregates
        for name, data in functions.items():
            data["avg_duration"] = sum(data["durations"]) / len(data["durations"])
            data["max_duration"] = max(data["durations"])
            data["success_rate"] = data["successes"] / data["calls"]
            data["durations"] = None  # Remove raw data
            
        return functions

# Global metrics collector
metrics = MetricsCollector()

# ===== TOKEN ESTIMATION =====

def estimate_tokens(text_or_obj: Any) -> int:
    """Estimate token count for OpenAI models with various input types"""
    if not text_or_obj:
        return 0
    
    # Convert to string based on type
    if isinstance(text_or_obj, str):
        text = text_or_obj
    elif isinstance(text_or_obj, list):
        if not text_or_obj:
            return 0
        # Sample up to 3 items and extrapolate
        samples = min(3, len(text_or_obj))
        sample_text = "".join(str(item) for item in text_or_obj[:samples])
        tokens_per_item = estimate_tokens(sample_text) / samples
        return int(tokens_per_item * len(text_or_obj))
    else:
        # For other objects, convert to string representation
        text = str(text_or_obj)
    
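    # Estimate tokens
    try:
        encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
        return len(encoding.encode(text))
    except Exception:
        # Fallback estimation (roughly 1.3 tokens per whitespace-separated word);
        # cast to int so the return type matches the annotation
        return int(len(text.split()) * 1.3)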

# ===== TEST RESULT MODEL =====

class TestStatus(Enum):
    PENDING = "pending"
    SUCCESS = "success"
    FAILURE = "failure"
    SKIPPED = "skipped"

class TestResult:
    """Holds results and metrics for a single test"""
    
    def __init__(self, name: str, description: str = ""):
        self.name = name
        self.description = description
        self.latency = 0
        self.status = TestStatus.PENDING
        self.output = None
        self.error = None
        self.details = {}
        self.tokens = 0
    
    def mark_success(self, output: Any = None, details: Dict = None):
        """Mark test as successful"""
        self.status = TestStatus.SUCCESS
        self.output = output
        if details:
            self.details.update(details)
    
    def mark_failure(self, error: str, output: Any = None):
        """Mark test as failed"""
        self.status = TestStatus.FAILURE
        self.error = error
        self.output = output
    
    def mark_skipped(self, reason: str = "Dependency failed"):
        """Mark test as skipped"""
        self.status = TestStatus.SKIPPED
        self.error = reason
    
    def add_detail(self, key: str, value: Any):
        """Add a detail to the result"""
        self.details[key] = value
    
    def is_success(self) -> bool:
        """Check if test was successful"""
        return self.status == TestStatus.SUCCESS
    
    def get_status_icon(self) -> str:
        """Get status icon for display"""
        icons = {
            TestStatus.SUCCESS: "✅",
            TestStatus.FAILURE: "❌",
            TestStatus.PENDING: "⏳",
            TestStatus.SKIPPED: "⏭️"
        }
        return icons.get(self.status, "❓")
    
    def to_dict(self) -> Dict[str, Any]:
        """Convert to dictionary for display"""
        return {
            "Name": self.name,
            "Status": f"{self.get_status_icon()} {self.status.value}",
            "Latency": f"{self.latency:.2f}s",
            "Tokens": self.tokens,
            "Details": self.details,
            "Error": self.error
        }

# ===== TEST CASE BASE CLASSES =====

class TestCase:
    """Base class for all test cases"""
    
    def __init__(self, name: str, description: str = ""):
        self.name = name
        self.description = description
        self.result = TestResult(name, description)
    
    def run(self) -> TestResult:
        """Run the test case"""
        start_time = time.time()
        try:
            self._execute()
            self.result.latency = time.time() - start_time
            return self.result
        except Exception as e:
            self.result.mark_failure(f"Unexpected error: {str(e)}")
            self.result.latency = time.time() - start_time
            return self.result
    
    def _execute(self):
        """Implement test logic in subclasses"""
        raise NotImplementedError("Subclasses must implement _execute")

class FunctionTestCase(TestCase):
    """Test case for evaluating a function"""
    
    def __init__(self, name: str, func: Callable, args: List = None, 
                kwargs: Dict = None, description: str = ""):
        super().__init__(name, description)
        self.func = func
        self.args = args or []
        self.kwargs = kwargs or {}
        self.is_llm_function = name in ["extract_concepts", "identify_relationships", 
                                        "identify_research_gaps", "generate_synthesis"]
    
    def _execute(self):
        """Execute function and record metrics"""
        try:
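            # Run the function. Time it here: run() only sets
            # self.result.latency after _execute() returns, so without this
            # the metrics below would always record a duration of 0.
            call_start = time.time()
            try:
                result = self.func(*self.args, **self.kwargs)
            finally:
                self.result.latency = time.time() - call_start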
            
            # Calculate token usage for LLM functions
            tokens = 0
            if self.is_llm_function:
                # Estimate input tokens
                input_tokens = sum(estimate_tokens(arg) for arg in self.args 
                                  if isinstance(arg, (str, list)))
                
                # Estimate output tokens
                output_tokens = estimate_tokens(result)
                tokens = input_tokens + output_tokens
                
                # Record metrics
                metrics.record_call(
                    self.name, 
                    self.result.latency, 
                    tokens=tokens, 
                    success=True
                )
                
            self.result.tokens = tokens
            
            # Add appropriate details based on result type
            if isinstance(result, str):
                self.result.add_detail("length", len(result))
                if len(result) > 50:
                    self.result.add_detail("excerpt", result[:50] + "...")
            elif isinstance(result, list):
                self.result.add_detail("count", len(result))
                if result:
                    if hasattr(result[0], "name"):
                        self.result.add_detail("samples", ", ".join([item.name for item in result[:3]]))
                    elif hasattr(result[0], "description"):
                        self.result.add_detail("samples", result[0].description[:50] + "...")
            
            # Mark success
            self.result.mark_success(result)
            
        except Exception as e:
            # Record failure
            if self.is_llm_function:
                metrics.record_call(
                    self.name, 
                    self.result.latency, 
                    tokens=0, 
                    success=False
                )
            
            self.result.mark_failure(str(e))

# ===== TEST SUITE =====

class TestSuite:
    """Collection of test cases with dependencies"""
    
    def __init__(self, name: str):
        self.name = name
        self.test_cases = []
        self.results = []
        
    def add_test(self, test_case: TestCase):
        """Add a test case to the suite"""
        self.test_cases.append(test_case)
        return self
        
    def run(self) -> List[TestResult]:
        """Run all test cases in the suite"""
        self.results = []
        metrics.reset()
        
        for test_case in self.test_cases:
            result = test_case.run()
            self.results.append(result)
            
        return self.results
    
    def get_stats(self) -> Dict[str, Any]:
        """Get aggregated statistics"""
        return metrics.get_stats()

# ===== DOCUMENT TEST SUITE =====

class DocumentAnalysisSuite(TestSuite):
    """Test suite for document analysis functions"""
    
    def __init__(self, name: str = "Document Analysis"):
        super().__init__(name)
        
    @classmethod
    def create_sample_suite(cls):
        """Create a test suite with the reliable sample text"""
        suite = cls("Sample Text Analysis")
        
        RELIABLE_SAMPLE = """
        RNA sequencing (RNA-Seq) is a technique for analyzing gene expression patterns.
        It involves extracting RNA from biological samples, converting to cDNA, and 
        sequencing using next-generation platforms. This enables identification of 
        differentially expressed genes between conditions. RNA-Seq allows for the 
        detection of novel transcripts, alternative splicing events, and genetic variations.
        The methodology typically includes quality control, read alignment to a reference
        genome, quantification of expression levels, and differential expression analysis.
        """
        
        # Create config
        config = LitSynthConfig() if 'LitSynthConfig' in globals() else None
        
        # Add concept extraction test
        suite.add_test(FunctionTestCase(
            "extract_concepts", 
            extract_concepts,
            [RELIABLE_SAMPLE, config],
            description="Extract key concepts from sample text"
        ))
        
        # Make other tests dependent on concept extraction
        def add_dependent_tests(results):
            concepts = results[0].output if results and results[0].is_success() else []
            
            if concepts:
                # Add relationship test
                suite.add_test(FunctionTestCase(
                    "identify_relationships",
                    identify_relationships,
                    [RELIABLE_SAMPLE, concepts, config],
                    description="Identify relationships between concepts"
                ))
                
                # Add gaps test (depends on relationships)
                relationships = results[1].output if len(results) > 1 and results[1].is_success() else []
                if relationships:
                    suite.add_test(FunctionTestCase(
                        "identify_research_gaps",
                        identify_research_gaps,
                        [RELIABLE_SAMPLE, concepts, relationships],
                        description="Identify research gaps"
                    ))
                    
                    # Add synthesis test (depends on gaps)
                    gaps = results[2].output if len(results) > 2 and results[2].is_success() else []
                    suite.add_test(FunctionTestCase(
                        "generate_synthesis",
                        generate_synthesis,
                        [RELIABLE_SAMPLE, concepts, relationships, gaps],
                        description="Generate research synthesis"
                    ))
        
        # Store the callback for executing after initial test
        suite.add_dependent_tests = add_dependent_tests
        
        return suite
    
    @classmethod
    def create_pdf_suite(cls, pdf_path: str):
        """Create a test suite for a PDF document"""
        suite = cls(f"PDF Analysis: {os.path.basename(pdf_path)}")
        
        # Create config
        config = LitSynthConfig() if 'LitSynthConfig' in globals() else None
        
        # Add document loading test
        suite.add_test(FunctionTestCase(
            "load_document",
            load_document,
            ["pdf", pdf_path],
            description="Load document from PDF"
        ))
        
        # Make other tests dependent on document loading
        def add_dependent_tests(results):
            doc_data = results[0].output if results and results[0].is_success() else {}
            
            if doc_data and "text" in doc_data:
                text = doc_data["text"]
                
                # Add chunking test
                suite.add_test(FunctionTestCase(
                    "chunk_text",
                    chunk_text,
                    [text, config],
                    description="Split text into chunks"
                ))
                
                # Check if chunking succeeded and process the first chunk
                chunks = results[1].output if len(results) > 1 and results[1].is_success() else []
                if chunks:
                    # Use first chunk for analysis
                    test_text = chunks[0]
                    
                    # Add concept extraction
                    suite.add_test(FunctionTestCase(
                        "extract_concepts",
                        extract_concepts,
                        [test_text, config],
                        description="Extract key concepts"
                    ))
                    
                    # Continue with dependent tests if concepts were found
                    concepts = results[2].output if len(results) > 2 and results[2].is_success() else []
                    if concepts:
                        # Add relationship identification
                        suite.add_test(FunctionTestCase(
                            "identify_relationships",
                            identify_relationships,
                            [test_text, concepts, config],
                            description="Identify relationships"
                        ))
                        
                        # Continue with gaps and synthesis...
                        relationships = results[3].output if len(results) > 3 and results[3].is_success() else []
                        if relationships:
                            suite.add_test(FunctionTestCase(
                                "identify_research_gaps",
                                identify_research_gaps,
                                [test_text, concepts, relationships],
                                description="Identify research gaps"
                            ))
                            
                            gaps = results[4].output if len(results) > 4 and results[4].is_success() else []
                            suite.add_test(FunctionTestCase(
                                "generate_synthesis",
                                generate_synthesis,
                                [test_text, concepts, relationships, gaps],
                                description="Generate synthesis"
                            ))
        
        # Store the callback for executing after initial tests
        suite.add_dependent_tests = add_dependent_tests
        
        return suite
    
    def run(self) -> List[TestResult]:
        """Run tests with dependency management"""
        self.results = []
        metrics.reset()
        
        while True:
            # Run every test that has not produced a result yet
            for test in self.test_cases[len(self.results):]:
                self.results.append(test.run())
            
            if not hasattr(self, 'add_dependent_tests'):
                break
            
            # The callback can only read results that already exist, so it is
            # called again after each stage. It re-adds earlier stages and may
            # queue a later stage before its input is ready, so keep only the
            # first test that is not already in the suite.
            known = len(self.test_cases)
            seen = {t.name for t in self.test_cases}
            self.add_dependent_tests(self.results)
            fresh = [t for t in self.test_cases[known:] if t.name not in seen]
            self.test_cases[known:] = fresh[:1]
            if not fresh:
                break
        
        return self.results

# ===== RESULTS VISUALIZATION =====

def format_results_table(results: List[TestResult]) -> str:
    """Format test results as HTML table"""
    html = """
    <style>
    .test-table {
        width: 100%;
        border-collapse: collapse;
        margin: 20px 0;
        font-family: inherit;
    }
    .test-table th, .test-table td {
        padding: 8px 12px;
        text-align: left;
        border-bottom: 1px solid #ddd;
    }
    .test-table th {
        font-weight: bold;
    }
    .success {
        font-weight: bold;
    }
    .failure {
        font-weight: bold;
    }
    .skipped {
        font-weight: bold;
    }
    .metrics {
        padding: 10px 15px;
        border-radius: 5px;
        margin-bottom: 15px;
    }
    </style>
    """
    # Get metrics
    stats = metrics.get_stats()
    html += f"""
    <div class="metrics">
        <h3>Test Summary</h3>
        <p>
            <strong>LLM Calls:</strong> {stats['calls']} | 
            <strong>Avg Time:</strong> {stats['avg_duration']:.2f}s | 
            <strong>Total Tokens:</strong> {stats['total_tokens']:,} |
            <strong>Est. Cost:</strong> {stats['est_cost']}
        </p>
    </div>
    """
    
    # Create table
    html += """
    <table class="test-table">
        <tr>
            <th>Function</th>
            <th>Status</th>
            <th>Latency</th>
            <th>Tokens</th>
            <th>Details</th>
        </tr>
    """
    
    # Add rows
    for result in results:
        status_class = {
            TestStatus.SUCCESS: "success",
            TestStatus.FAILURE: "failure",
            TestStatus.SKIPPED: "skipped",
            TestStatus.PENDING: ""
        }.get(result.status, "")
        
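        # Escape values before embedding them in HTML so that error text
        # containing "<" or "&" cannot break the table markup
        from html import escape
        details = "<br>".join([f"<strong>{escape(str(k))}:</strong> {escape(str(v))}" for k, v in result.details.items()])
        if result.error:
            details += f"<br><strong>Error:</strong> <span class='failure'>{escape(str(result.error))}</span>"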
            
        token_count = f"{result.tokens:,}" if result.tokens > 0 else "-"
        
        html += f"""
        <tr>
            <td>{result.name}</td>
            <td class="{status_class}">{result.get_status_icon()} {result.status.value}</td>
            <td>{result.latency:.2f}s</td>
            <td>{token_count}</td>
            <td>{details}</td>
        </tr>
        """
    
    html += "</table>"
    return html

# ===== TEST INTERFACE =====

def create_test_interface():
    """Create Gradio interface for testing"""
    
    def run_sample_test():
        """Run tests on sample text"""
        suite = DocumentAnalysisSuite.create_sample_suite()
        results = suite.run()
        return format_results_table(results)
    
    def run_pdf_test(pdf_file):
        """Run tests on uploaded PDF"""
        if not pdf_file:
            return "Please upload a PDF file."
            
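        # Gradio may pass either a file path string or a temp-file object
        # with a .name attribute, depending on version and File settings
        src_path = pdf_file if isinstance(pdf_file, str) else pdf_file.name
        
        # Save to temp file
        temp_path = os.path.join(tempfile.mkdtemp(), "test.pdf")
        with open(src_path, "rb") as src:
            with open(temp_path, "wb") as dest:
                dest.write(src.read())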
        
        # Run tests
        suite = DocumentAnalysisSuite.create_pdf_suite(temp_path)
        results = suite.run()
        return format_results_table(results)
    
    # Create interface
    with gr.Blocks() as demo:
        gr.Markdown("## Literature Synthesis Testing Framework")
        gr.Markdown("Evaluate LLM function performance with standardized tests")
        
        with gr.Tabs() as tabs:
            with gr.Tab("Sample Test"):
                gr.Markdown("""
                Run tests using a reliable sample text to verify basic functionality.
                This is useful for quick validation of the system.
                """)
                sample_btn = gr.Button("Run Sample Test", variant="primary")
                sample_results = gr.HTML("Click the button to run sample tests")
                
                sample_btn.click(
                    fn=run_sample_test,
                    outputs=sample_results
                )
            
            with gr.Tab("PDF Test"):
                gr.Markdown("""
                Upload a PDF document to test the full document analysis pipeline.
                This evaluates all components with a real-world document.
                """)
                pdf_input = gr.File(
                    label="Upload PDF Document",
                    file_types=[".pdf"]
                )
                pdf_btn = gr.Button("Run PDF Test", variant="primary")
                pdf_results = gr.HTML("Upload a PDF and click the button to run tests")
                
                pdf_btn.click(
                    fn=run_pdf_test,
                    inputs=[pdf_input],
                    outputs=pdf_results
                )
                
            with gr.Tab("Custom Test"):
                gr.Markdown("""
                #### How to Create Custom Tests
                
                Add custom test cases by extending the framework:
                
                ```python
                # Example: Create a custom test case
                class MyCustomTest(TestCase):
                    def _execute(self):
                        # Implement your test logic here
                        result = my_function()
                        if result:
                            self.result.mark_success(result)
                        else:
                            self.result.mark_failure("Test failed")
                
                # Run your custom test
                test = MyCustomTest("custom_test", "My custom test description")
                result = test.run()
                print(f"Test result: {result.status}")
                ```
                
                View the source code for more examples and extension points.
                """)
    
    return demo

# Create and launch the test interface
test_interface = create_test_interface()
test_interface.launch(inline=True, share=False)

# Show confirmation
show("Testing framework initialized", "success")
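
The cost estimate in `MetricsCollector.record_call` applies one flat rate to every token, and its comment notes that it can be extended with model-specific costs. One possible shape for that extension is sketched below. The model names and rates are hypothetical placeholders, not real prices, and `estimate_cost` is an illustrative helper that does not exist in this notebook.

```python
from typing import Dict

# Hypothetical placeholder rates in USD per 1,000 tokens. These are NOT real
# prices; substitute the current figures from your provider.
RATES_PER_1K: Dict[str, Dict[str, float]] = {
    "model-a": {"input": 0.50, "output": 1.50},
    "model-b": {"input": 5.00, "output": 15.00},
}
DEFAULT_RATE: Dict[str, float] = {"input": 1.00, "output": 1.00}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate cost from separate input and output token counts."""
    rate = RATES_PER_1K.get(model, DEFAULT_RATE)
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1000


cost = estimate_cost("model-a", input_tokens=2000, output_tokens=1000)
print(f"${cost:.4f}")
```

Keeping input and output rates separate matters because providers commonly price the two differently; a collector using this helper would need to record both counts instead of a single total.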