# Part 1: Foundations - Code to LaTeX Agent

## 🎯 Research Scenario
You're writing a research paper and need to convert your Python algorithm implementations to publication-ready LaTeX mathematical notation. Manual conversion is tedious and error-prone - let's build an intelligent agent to automate this!

## 🎓 What You'll Learn

1. **Agent Fundamentals**: Core concepts without overwhelming complexity
2. **Multi-Provider LLM Setup**: Support for global research accessibility
3. **Hardcoded vs Intelligent Approaches**: When to use each method
4. **LangGraph Basics**: Workflow orchestration for research tasks

## 📋 Prerequisites Check
Before starting, ensure you've completed `../setup_environment.ipynb` successfully.

In [1]:
# Essential imports
import sys
import os
from pathlib import Path
import time

%load_ext autoreload
%autoreload 2


# Add modules to path
current_dir = Path.cwd()
modules_dir = current_dir / "modules"
sys.path.insert(0, str(modules_dir))

# Load environment variables
from dotenv import load_dotenv
load_dotenv(current_dir.parent / ".env")

print("🚀 Environment loaded successfully!")
print(f"📁 Working directory: {current_dir}")

🚀 Environment loaded successfully!
📁 Working directory: /home/liqi/PHMGA/tutorials_research/Part1_Foundations


# Section 1: The Research Problem (30 min)

## 🔍 Problem Analysis

**Scenario**: You have this Python code in your research notebook:

```python
# Euclidean distance calculation
distance = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)

# Gaussian probability density
pdf = (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu) / sigma)**2)

# Linear regression cost function
cost = (1 / (2 * m)) * np.sum((hypothesis - y)**2)
```

**Challenge**: Convert this to LaTeX for your paper:
- Manual conversion takes several minutes per expression (Section 5 budgets about 15)
- Easy to make notation errors
- Inconsistent formatting across the paper
- Need to update when code changes

**Goal**: Build an intelligent agent that converts code to publication-ready LaTeX instantly!

In [2]:
# Let's define our test cases for this tutorial
research_code_examples = [
    # Basic mathematical functions
    "np.sqrt(x**2 + y**2)",
    "np.sin(theta) * np.cos(phi)", 
    "np.exp(-0.5 * (x/sigma)**2)",
    
    # Statistical formulas
    "(1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu) / sigma)**2)",
    "np.mean(data) + 1.96 * np.std(data) / np.sqrt(len(data))",
    
    # Linear algebra
    "np.dot(A, x) + b",
    "np.linalg.norm(gradient)",
    
    # Complex expressions
    "np.sum((hypothesis - y)**2) / (2 * len(y))",
    "alpha * learning_rate * np.gradient(cost_function)"
]

print("🧪 Test Cases for Code-to-LaTeX Conversion:")
print("=" * 50)
for i, code in enumerate(research_code_examples, 1):
    print(f"{i:2d}. {code}")
    
print(f"\n📊 Total test cases: {len(research_code_examples)}")
print("🎯 Goal: Convert all of these to publication-ready LaTeX!")

🧪 Test Cases for Code-to-LaTeX Conversion:
 1. np.sqrt(x**2 + y**2)
 2. np.sin(theta) * np.cos(phi)
 3. np.exp(-0.5 * (x/sigma)**2)
 4. (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu) / sigma)**2)
 5. np.mean(data) + 1.96 * np.std(data) / np.sqrt(len(data))
 6. np.dot(A, x) + b
 7. np.linalg.norm(gradient)
 8. np.sum((hypothesis - y)**2) / (2 * len(y))
 9. alpha * learning_rate * np.gradient(cost_function)

📊 Total test cases: 9
🎯 Goal: Convert all of these to publication-ready LaTeX!


# Section 2: Hardcoded Approach (30 min)

`★ Insight ─────────────────────────────────────`
Traditional approaches use regex patterns to match and replace code constructs. This works well for simple, predictable patterns but breaks down with complex nested expressions or unusual variable names.
`─────────────────────────────────────────────────`
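To make the insight concrete, here is a minimal sketch of a regex-based converter. The `RULES` table and `regex_to_latex` are hypothetical names written for this illustration; they are not the actual rules inside `HardcodedMathProcessor`. The first call shows a flat expression converting cleanly, the second shows a nested parenthesis defeating the `np.sqrt` rule.

```python
import re

# Hypothetical rule table: (pattern, replacement) pairs applied in order.
# Illustrative only; not taken from HardcodedMathProcessor.
RULES = [
    # np.sqrt(<no nested parentheses>) -> \sqrt{...}
    (re.compile(r"np\.sqrt\(([^()]*)\)"), r"\\sqrt{\1}"),
    # np.sin( / np.cos( / np.exp( -> \sin( / \cos( / \exp(
    (re.compile(r"np\.(sin|cos|exp)\("), r"\\\1("),
    # x**2 -> x^{2}
    (re.compile(r"\*\*(\w+)"), r"^{\1}"),
]

def regex_to_latex(code: str) -> str:
    """Apply every rule in order and wrap the result in $...$."""
    for pattern, replacement in RULES:
        code = pattern.sub(replacement, code)
    return f"${code}$"

print(regex_to_latex("np.sqrt(x**2 + y**2)"))   # $\sqrt{x^{2} + y^{2}}$
print(regex_to_latex("np.sqrt((x1 - x2)**2)"))  # $np.sqrt((x1 - x2)^{2})$
```

The second result still contains `np.sqrt` because `[^()]*` cannot span the inner parentheses. Regular expressions cannot match arbitrarily nested brackets, which is why this approach degrades on complex expressions.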

Let's first try the traditional hardcoded approach using regex patterns.

In [3]:
# Import our hardcoded processor
from agent_basics import HardcodedMathProcessor, demonstrate_agent_evolution

# Show the evolution concepts
demonstrate_agent_evolution()

🎓 AGENT EVOLUTION DEMONSTRATION

1. HARDCODED APPROACH (Traditional)
   - Fast execution
   - Predictable patterns
   - Limited to known cases
   - Requires manual updates

2. LLM AGENT APPROACH (Modern)
   - Contextual understanding
   - Handles complex cases
   - Learns from examples
   - API dependency

3. HYBRID APPROACH (Optimal)
   - Best of both worlds
   - Fast for simple cases
   - Intelligent for complex cases
   - Graceful fallback strategy

🎯 Research Application:
Choose approach based on your specific needs:
• Speed critical + simple patterns → Hardcoded
• Complex understanding required → LLM Agent
• Production system → Hybrid


In [4]:
# Create and test the hardcoded processor
hardcoded_processor = HardcodedMathProcessor()

print("🔧 HARDCODED APPROACH TESTING")
print("=" * 40)

# Test on our research examples
hardcoded_results = []

for i, code in enumerate(research_code_examples[:5], 1):  # Test first 5
    print(f"\n🧪 Test {i}: {code}")
    result = hardcoded_processor.process(code)
    hardcoded_results.append(result)
    
    if result.success:
        print(f"✅ Success: {result.output}")
        print(f"   Confidence: {result.confidence:.2f}")
        print(f"   Time: {result.processing_time:.3f}s")
    else:
        print(f"❌ Failed: {result.output}")

# Calculate overall performance
success_rate = hardcoded_processor.get_success_rate()
avg_confidence = hardcoded_processor.get_average_confidence()

print(f"\n📊 HARDCODED PERFORMANCE:")
print(f"Success Rate: {success_rate:.1%}")
print(f"Average Confidence: {avg_confidence:.2f}")
print(f"Limitations: {len(hardcoded_processor.limitations)} known issues")

🔧 HARDCODED APPROACH TESTING

🧪 Test 1: np.sqrt(x**2 + y**2)
✅ Success: $\sqrt{x^{2} + y^{2}}$
   Confidence: 1.00
   Time: 0.002s

🧪 Test 2: np.sin(theta) * np.cos(phi)
✅ Success: $\sin(theta) * \cos(phi)$
   Confidence: 0.67
   Time: 0.000s

🧪 Test 3: np.exp(-0.5 * (x/sigma)**2)
✅ Success: $e^{-0.5 * (\frac{x}{sigma}}^2)$
   Confidence: 1.00
   Time: 0.000s

🧪 Test 4: (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu) / sigma)**2)
✅ Success: $(1 / (sigma * \sqrt{2 * np.pi})) * e^{-0.5 * ((x - mu} / sigma)^2)$
   Confidence: 1.00
   Time: 0.000s

🧪 Test 5: np.mean(data) + 1.96 * np.std(data) / np.sqrt(len(data))
✅ Success: $np.mean(data) + 1.96 * np.std(data) / \sqrt{len(data})$
   Confidence: 0.33
   Time: 0.000s

📊 HARDCODED PERFORMANCE:
Success Rate: 100.0%
Average Confidence: 0.80
Limitations: 5 known issues
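Note that the 100% success rate above only means a pattern matched. Several of the outputs (Tests 3, 4 and 5) contain mismatched braces and leftover `np.` calls, so they would not compile as LaTeX. A minimal sketch of a delimiter check that could catch this, assuming a hypothetical helper `delimiters_balanced` that is not part of `agent_basics`:

```python
def delimiters_balanced(latex: str) -> bool:
    """Return True if (), {} and [] are properly nested in `latex`.

    Simplification: escaped braces such as \\{ are counted like ordinary ones.
    """
    pairs = {")": "(", "}": "{", "]": "["}
    stack = []
    for ch in latex:
        if ch in "({[":
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack

# Outputs copied from Tests 1, 3 and 5 above.
outputs = [
    r"$\sqrt{x^{2} + y^{2}}$",
    r"$e^{-0.5 * (\frac{x}{sigma}}^2)$",
    r"$np.mean(data) + 1.96 * np.std(data) / \sqrt{len(data})$",
]
for out in outputs:
    print(delimiters_balanced(out), out)
```

Only the first output passes. A check like this is cheap enough to run on every result before trusting the reported confidence.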


In [5]:
# Analyze hardcoded approach strengths and weaknesses
print("🔍 HARDCODED APPROACH ANALYSIS")
print("=" * 35)

print("\n✅ STRENGTHS:")
strengths = [
    "Fast execution (no API calls)",
    "Predictable output format", 
    "No external dependencies",
    "Works offline",
    "Deterministic results"
]
for strength in strengths:
    print(f"   • {strength}")

print("\n❌ WEAKNESSES:")
for limitation in hardcoded_processor.limitations:
    print(f"   • {limitation}")

print("\n🎯 BEST USE CASES:")
use_cases = [
    "Simple, repetitive patterns",
    "High-volume processing where speed matters",
    "When API access is restricted",
    "Preprocessing step before LLM processing"
]
for use_case in use_cases:
    print(f"   • {use_case}")

🔍 HARDCODED APPROACH ANALYSIS

✅ STRENGTHS:
   • Fast execution (no API calls)
   • Predictable output format
   • No external dependencies
   • Works offline
   • Deterministic results

❌ WEAKNESSES:
   • Cannot handle nested function calls
   • Limited to predefined patterns
   • Struggles with complex expressions
   • No contextual understanding
   • Requires manual pattern updates

🎯 BEST USE CASES:
   • Simple, repetitive patterns
   • High-volume processing where speed matters
   • When API access is restricted
   • Preprocessing step before LLM processing


# Section 3: Multi-Provider LLM Setup (45 min)

`★ Insight ─────────────────────────────────────`
Modern research is global - your LLM setup should support researchers worldwide with different provider preferences and access restrictions. A flexible provider system lets you switch between Google Gemini, OpenAI, DashScope, and others seamlessly.
`─────────────────────────────────────────────────`

Now let's set up our multi-provider LLM system for intelligent conversion.

In [6]:
# Import our LLM provider system
from llm_providers import (
    ResearchLLMFactory, 
    LLMProvider, 
    create_research_llm,
    list_research_providers
)

# Check available providers
print("🔍 Checking Available LLM Providers...")
list_research_providers()

🔍 Checking Available LLM Providers...
🔍 Available LLM Providers for Research:
------------------------------------------------------------
❌ GOOGLE     - Google Gemini - Excellent for mathematical reasoning
   Default: gemini-2.5-pro
   Fast: gemini-2.5-flash
   ⚠️  Set GEMINI_API_KEY to enable

❌ OPENAI     - OpenAI GPT - Reliable for code understanding
   Default: gpt-4o
   Fast: gpt-4o-mini
   ⚠️  Set OPENAI_API_KEY to enable

✅ DASHSCOPE  - DashScope Qwen - Cost-effective with good performance
   Default: qwen-plus
   Fast: qwen-plus

✅ ZHIPUAI    - Zhipu AI GLM - Optimized for Chinese researchers
   Default: glm-4
   Fast: glm-4-flash

🎯 Recommended: DASHSCOPE
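The listing marks a provider as available when its API key is set. A minimal sketch of that kind of check follows, written independently of `llm_providers`. `GEMINI_API_KEY` and `OPENAI_API_KEY` come from the listing above; the DashScope and Zhipu variable names, the function names, and the "first available" selection policy are assumptions made for illustration.

```python
import os

# Environment variable per provider. The last two names are assumed.
PROVIDER_ENV_VARS = {
    "google": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "dashscope": "DASHSCOPE_API_KEY",  # assumed variable name
    "zhipuai": "ZHIPUAI_API_KEY",      # assumed variable name
}

def available_providers(env=None):
    """List providers whose API key is present and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]

def pick_provider(preferred=None, env=None):
    """Return `preferred` if it is configured, else the first available provider."""
    available = available_providers(env)
    if preferred in available:
        return preferred
    if not available:
        raise RuntimeError("No LLM provider configured; add an API key to .env")
    return available[0]

# Simulated environment so the example does not depend on real keys.
demo_env = {"DASHSCOPE_API_KEY": "sk-demo", "ZHIPUAI_API_KEY": "demo"}
print(available_providers(demo_env))
print(pick_provider(preferred="zhipuai", env=demo_env))
```

Passing the environment as an argument keeps the selection logic testable without touching real credentials.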


In [8]:
# Create LLM instance for research
try:
    # Try to create a research LLM (will auto-select best available)
    research_llm = create_research_llm(
        # model="dashscope",  # Specify model type
        temperature=0.3,  # Low temperature for consistent mathematical output
        fast_mode=False   # Use high-quality model for accuracy
    )
    
    print("✅ Successfully created research LLM!")
    print(f"   Model type: {type(research_llm).__name__}")
    
    # Test the LLM with a simple query
    test_response = research_llm.invoke("Hello! Please respond with 'LLM ready for research'")
    print(f"   Test response: {test_response.content}")
    
    llm_available = True
    
except Exception as e:
    print(f"❌ Failed to create LLM: {e}")
    print("\n💡 To fix this:")
    print("1. Make sure you have API keys in your .env file")
    print("2. Check your internet connection")
    print("3. Verify API key permissions and quotas")
    
    llm_available = False

✅ Successfully created research LLM!
   Model type: ChatOpenAI
   Test response: LLM ready for research


## 🤖 Building the Intelligent Agent

Now let's create our LLM-based code conversion agent that can understand context and handle complex mathematical expressions.

In [9]:
if llm_available:
    from agent_basics import LLMAgent
    from code_to_latex import create_research_converter, CodeLanguage, ConversionType
    
    # Create our specialized code-to-LaTeX converter
    latex_converter = create_research_converter(
        llm=research_llm,
        language=CodeLanguage.PYTHON,
        conversion_type=ConversionType.INLINE_MATH
    )
    
    print("🤖 Created Code-to-LaTeX Agent!")
    print(f"   Capabilities: {len(latex_converter.capabilities)}")
    print(f"   Function mappings: {len(latex_converter.function_mappings)}")
    print(f"   Greek letter support: {len(latex_converter.greek_mapping)}")
    
else:
    print("⏭️ Skipping agent creation (no LLM available)")
    print("   The concepts still apply - you can run this when LLM is configured")

🤖 Created Code-to-LaTeX Agent!
   Capabilities: 5
   Function mappings: 24
   Greek letter support: 14


In [21]:
latex_converter

<code_to_latex.CodeToLatexAgent at 0x7ba0405f3910>

In [10]:
if llm_available:
    print("🧪 TESTING LLM AGENT APPROACH")
    print("=" * 35)
    
    # Test on the same examples we used for hardcoded
    agent_results = []
    
    for i, code in enumerate(research_code_examples[:5], 1):
        print(f"\n🔬 Test {i}: {code}")
        
        result = latex_converter.convert_expression(code)  # convert_algorithm
        
        agent_results.append(result)
        
        if result.success:
            print(f"✅ Success: {result.output}")
            print(f"   Confidence: {result.confidence:.2f}")
            print(f"   Time: {result.processing_time:.3f}s")
        else:
            print(f"❌ Failed: {result.metadata.get('error', 'Unknown error')}")
    
    # Calculate performance metrics
    agent_success_rate = latex_converter.get_success_rate()
    agent_avg_confidence = latex_converter.get_average_confidence()
    
    print(f"\n📊 AGENT PERFORMANCE:")
    print(f"Success Rate: {agent_success_rate:.1%}")
    print(f"Average Confidence: {agent_avg_confidence:.2f}")
    
else:
    print("⏭️ Skipping agent testing (no LLM available)")
    print("\n💡 Expected results with LLM agent:")
    print("   • Higher success rate on complex expressions")
    print("   • Better handling of nested functions")
    print("   • Contextual understanding of mathematical notation")
    print("   • Slower execution due to API calls")

🧪 TESTING LLM AGENT APPROACH

🔬 Test 1: np.sqrt(x**2 + y**2)
✅ Success: $\sqrt{x^2 + y^2}$
   Confidence: 1.00
   Time: 1.214s

🔬 Test 2: np.sin(theta) * np.cos(phi)
✅ Success: $\sin(\theta) \cdot \cos(\phi)$
   Confidence: 0.70
   Time: 1.567s

🔬 Test 3: np.exp(-0.5 * (x/sigma)**2)
✅ Success: $e^{-0.5 \left(\frac{x}{\sigma}\right)^2}$
   Confidence: 1.00
   Time: 1.220s

🔬 Test 4: (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu) / sigma)**2)
✅ Success: $\frac{1}{\sigma \sqrt{2 \pi}} \exp\left(-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2\right)$
   Confidence: 1.00
   Time: 1.686s

🔬 Test 5: np.mean(data) + 1.96 * np.std(data) / np.sqrt(len(data))
✅ Success: $\text{mean}(\text{data}) + 1.96 \cdot \frac{\text{std}(\text{data})}{\sqrt{\text{len}(\text{data})}}$
   Confidence: 1.00
   Time: 2.207s

📊 AGENT PERFORMANCE:
Success Rate: 100.0%
Average Confidence: 0.94


Rendered, the Test 4 result reads:

$\frac{1}{\sigma \sqrt{2 \pi}} \exp\left(-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2\right)$

## 📊 Direct Comparison: Hardcoded vs Agent

Let's compare both approaches side-by-side to understand when to use each.

In [23]:
if llm_available:
    from agent_basics import compare_approaches
    
    print("⚔️ HARDCODED vs AGENT COMPARISON")
    print("=" * 40)
    
    # Run comprehensive comparison
    comparison_results = compare_approaches(
        test_cases=research_code_examples[:5],  # Use first 5 for speed
        llm=research_llm
    )
    
    # Create comparison table
    print("\n📊 PERFORMANCE COMPARISON:")
    print("-" * 60)
    print(f"{'Approach':<12} {'Success Rate':<12} {'Avg Confidence':<15} {'Avg Time':<10}")
    print("-" * 60)
    
    for approach, metrics in comparison_results.items():
        print(f"{approach.title():<12} {metrics['success_rate']:<11.1%} {metrics['avg_confidence']:<14.2f} {metrics['avg_processing_time']:<9.3f}s")
    
    print("\n🎯 RECOMMENDATIONS:")
    print("• HARDCODED: Use for simple patterns, high-volume processing")
    print("• LLM AGENT: Use for complex expressions, varied notation")
    print("• HYBRID: Best of both - fast for simple, intelligent for complex")
    
else:
    print("⏭️ Skipping comparison (no LLM available)")
    print("\n📊 Expected comparison results:")
    print("Approach     Success Rate  Avg Confidence  Avg Time")
    print("Hardcoded    ~60%         ~0.4            ~0.001s")
    print("LLM Agent    ~95%         ~0.8            ~1.5s")
    print("Hybrid       ~95%         ~0.8            ~0.8s")

⚔️ HARDCODED vs AGENT COMPARISON

🧪 Testing HARDCODED approach...
  Case 1: np.sqrt(x**2 + y**2)...
  Case 2: np.sin(theta) * np.cos(phi)...
  Case 3: np.exp(-0.5 * (x/sigma)**2)...
  Case 4: (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 *...
  Case 5: np.mean(data) + 1.96 * np.std(data) / np.sqrt(len(...
    Success Rate: 100.00%
    Avg Confidence: 0.80
    Avg Time: 0.000s

🧪 Testing LLM_AGENT approach...
  Case 1: np.sqrt(x**2 + y**2)...
  Case 2: np.sin(theta) * np.cos(phi)...
  Case 3: np.exp(-0.5 * (x/sigma)**2)...
  Case 4: (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 *...
  Case 5: np.mean(data) + 1.96 * np.std(data) / np.sqrt(len(...
    Success Rate: 100.00%
    Avg Confidence: 0.94
    Avg Time: 1.199s

🧪 Testing HYBRID approach...
  Case 1: np.sqrt(x**2 + y**2)...
  Case 2: np.sin(theta) * np.cos(phi)...
  Case 3: np.exp(-0.5 * (x/sigma)**2)...
  Case 4: (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 *...
  Case 5: np.mean(data) + 1.96 * np.std(data) / np.sqrt(len

# Section 4: LangGraph Workflow (30 min)

`★ Insight ─────────────────────────────────────`

LangGraph provides structure to complex workflows without overwhelming complexity. For research applications, it helps organize multi-step processes like: Parse → Convert → Validate → Format → Output. This pattern is reusable across different research tasks.

`─────────────────────────────────────────────────`

Now let's organize our conversion process into a structured workflow using LangGraph.

#### 🎓 LANGGRAPH BASICS FOR RESEARCH

🔄 Core Concepts:
1. STATE: Data that flows between processing steps
2. NODES: Individual processing functions
3. EDGES: Connections between nodes
4. GRAPH: Complete workflow definition

📊 Research Workflow Pattern:
   INPUT → ANALYZE → PROCESS → VALIDATE → OUTPUT

🎯 Benefits for Research:
- Reproducible workflows
- Clear processing steps
- Error handling and validation
- Easy to modify and extend

💡 When to Use LangGraph:
- Multi-step research processes
- Need for workflow visualization
- Complex decision logic
- Collaboration between different processing steps
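
The four concepts above can be sketched in plain Python before involving LangGraph. This is a teaching sketch, not LangGraph's API: `PipelineState`, `make_node`, `NODES`, `EDGES` and `run` are hypothetical names. In LangGraph the same roles are played by a typed state, node functions, edges, and a compiled graph.

```python
from typing import Callable, Dict, List, TypedDict

class PipelineState(TypedDict):
    """STATE: the data handed from one step to the next."""
    content: str
    current_stage: str
    history: List[str]

Node = Callable[[PipelineState], PipelineState]

def make_node(stage: str) -> Node:
    """NODE factory: each node records its stage and returns a new state."""
    def node(state: PipelineState) -> PipelineState:
        return {
            "content": state["content"],
            "current_stage": stage,
            "history": state["history"] + [stage],
        }
    return node

STAGES = ["input", "analyze", "process", "validate", "output"]
NODES: Dict[str, Node] = {stage: make_node(stage) for stage in STAGES}
# EDGES: each stage points to its successor; the last stage has none.
EDGES: Dict[str, str] = dict(zip(STAGES, STAGES[1:]))

def run(state: PipelineState, entry: str = "input") -> PipelineState:
    """GRAPH execution: follow edges from the entry node until none remain."""
    stage = entry
    while stage is not None:
        state = NODES[stage](state)
        stage = EDGES.get(stage)
    return state

final = run({"content": "np.sqrt(x**2 + y**2)", "current_stage": "starting", "history": []})
print(" → ".join(final["history"]))  # input → analyze → process → validate → output
```

Each node returns a new state instead of mutating its input, which keeps every intermediate step inspectable. That property is what makes a workflow like this easy to reproduce and debug.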

In [17]:
# Create and test a simple research workflow
print("🔄 CREATING RESEARCH WORKFLOW")
print("=" * 35)
from graph_introduction import create_simple_research_workflow, ResearchTask, SimpleResearchState
# Build the workflow
research_workflow = create_simple_research_workflow()

print("✅ Workflow created with nodes:")
workflow_nodes = ["input", "analyze", "process", "validate", "output"]
for i, node in enumerate(workflow_nodes, 1):
    print(f"   {i}. {node.upper()}")

print(f"\n🔗 Workflow pattern: {' → '.join(workflow_nodes)}")

🔄 CREATING RESEARCH WORKFLOW
✅ Workflow created with nodes:
   1. INPUT
   2. ANALYZE
   3. PROCESS
   4. VALIDATE
   5. OUTPUT

🔗 Workflow pattern: input → analyze → process → validate → output


In [18]:
# Run the workflow with a research task
print("\n🚀 EXECUTING WORKFLOW")
print("=" * 25)

# Create a research task
sample_task = ResearchTask(
    task_id="tutorial_001",
    content="np.sqrt(x**2 + y**2)",
    task_type="code_conversion",
    metadata={"language": "python", "target": "latex"}
)

print(f"📋 Task: {sample_task.task_id}")
print(f"   Content: {sample_task.content}")
print(f"   Type: {sample_task.task_type}")

# Initial state
initial_state = SimpleResearchState(
    task=sample_task,
    current_stage="starting",
    analysis_result="",
    processed_content="",
    validation_status="",
    final_output="",
    processing_history=[],
    error_messages=[]
)

try:
    # Execute the workflow
    print("\n🔄 Running workflow...\n")
    result = research_workflow.invoke(initial_state)
    
    print("\n" + "=" * 40)
    print("📊 WORKFLOW RESULTS")
    print("=" * 40)
    
    print(f"Final Stage: {result['current_stage']}")
    print(f"\nFinal Output:\n{result['final_output']}")
    
    print(f"\n📚 Processing Steps ({len(result['processing_history'])}):")
    for i, step in enumerate(result['processing_history'], 1):
        print(f"   {i}. {step['stage'].upper()}: {step['details']}")
        
except Exception as e:
    print(f"❌ Workflow execution failed: {e}")
    print("💡 This is normal if running without full setup")


🚀 EXECUTING WORKFLOW
📋 Task: tutorial_001
   Content: np.sqrt(x**2 + y**2)
   Type: code_conversion

🔄 Running workflow...

📥 INPUT: Processing task 'tutorial_001'
   Content: np.sqrt(x**2 + y**2)...
🔍 ANALYZE: Analyzing task type 'code_conversion'
⚙️ PROCESS: Executing main processing for code_conversion
✅ VALIDATE: Checking output quality
📤 OUTPUT: Finalizing results

📊 WORKFLOW RESULTS
Final Stage: complete

Final Output:
RESEARCH RESULT:\nLaTeX conversion: $\\sqrt(x^2 + y^2)$\n\nValidation: Validation passed: Output appears correct

📚 Processing Steps (5):
   1. INPUT: Started processing code_conversion task
   2. ANALYZE: Code conversion task detected. Language: python
   3. PROCESS: Applied code_conversion processing logic
   4. VALIDATE: Validation passed: Output appears correct
   5. OUTPUT: Workflow completed successfully


## 🔬 Advanced Workflow: Code Conversion Pipeline

Let's create a more sophisticated workflow specifically for code-to-LaTeX conversion.

In [19]:
if llm_available:
    from graph_introduction import create_code_conversion_workflow, CodeConversionState
    
    print("🔬 ADVANCED CODE CONVERSION WORKFLOW")
    print("=" * 40)
    
    # Create specialized workflow
    conversion_workflow = create_code_conversion_workflow(latex_converter)
    
    print("✅ Advanced workflow created with stages:")
    advanced_stages = ["parse", "convert", "validate", "finalize"]
    for i, stage in enumerate(advanced_stages, 1):
        print(f"   {i}. {stage.upper()}")
    
    # Test with complex expression
    test_code = "(1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu) / sigma)**2)"
    
    print(f"\n🧪 Testing with complex expression:")
    print(f"   {test_code}")
    
    # Initial state for conversion workflow
    conversion_state = CodeConversionState(
        original_code=test_code,
        language="python",
        conversion_type="inline",
        parsed_code={},
        latex_output="",
        validation_result={},
        final_latex="",
        confidence_score=0.0,
        metadata={}
    )
    
    try:
        print("\n🔄 Running conversion workflow...\n")
        conversion_result = conversion_workflow.invoke(conversion_state)
        
        print("\n" + "=" * 50)
        print("📊 CONVERSION RESULTS")
        print("=" * 50)
        
        print(f"Original Code: {conversion_result['original_code']}")
        print(f"Final LaTeX: {conversion_result['final_latex']}")
        print(f"Confidence: {conversion_result['confidence_score']:.2f}")
        print(f"Success: {conversion_result['metadata']['conversion_successful']}")
        
    except Exception as e:
        print(f"❌ Conversion workflow failed: {e}")
        
else:
    print("⏭️ Skipping advanced workflow (no LLM available)")
    print("\n💡 Advanced workflow would provide:")
    print("   • Code parsing and analysis")
    print("   • Intelligent LaTeX conversion")
    print("   • Output validation and quality checks")
    print("   • Formatted final results with confidence scores")

🔬 ADVANCED CODE CONVERSION WORKFLOW
✅ Advanced workflow created with stages:
   1. PARSE
   2. CONVERT
   3. VALIDATE
   4. FINALIZE

🧪 Testing with complex expression:
   (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu) / sigma)**2)

🔄 Running conversion workflow...

🔍 PARSE: Analyzing python code
🔄 CONVERT: Converting to inline LaTeX
✅ VALIDATE: Checking LaTeX formatting
📋 FINALIZE: Preparing final output

📊 CONVERSION RESULTS
Original Code: (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu) / sigma)**2)
Final LaTeX: \texttt{(1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu) / sigma)**2)}
Confidence: 1.00
Success: False
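
The run above reports `Success: False` next to `Confidence: 1.00`, and `Final LaTeX` is just the input wrapped in `\texttt{}`. The confidence score alone therefore does not reveal that no conversion happened. A minimal sketch of an extra guard, assuming hypothetical names `CODE_MARKERS` and `looks_unconverted` that are not part of `graph_introduction`:

```python
# Substrings that indicate Python source survived in the "LaTeX" output.
# The marker list is an illustrative assumption; extend it for your own code.
CODE_MARKERS = ("np.", "**", "len(")

def looks_unconverted(final_latex: str) -> bool:
    """Heuristic: True if the output is a verbatim wrapper or still contains code."""
    stripped = final_latex.strip()
    if stripped.startswith(r"\texttt{"):
        return True
    return any(marker in stripped for marker in CODE_MARKERS)

code = "(1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu) / sigma)**2)"
fallback = r"\texttt{" + code + "}"
converted = (
    r"$\frac{1}{\sigma \sqrt{2 \pi}} "
    r"\exp\left(-\frac{1}{2} \left(\frac{x - \mu}{\sigma}\right)^2\right)$"
)

print(looks_unconverted(fallback))   # True
print(looks_unconverted(converted))  # False
```

A guard like this could run in the VALIDATE stage, lowering the confidence score or failing the step whenever it returns `True`.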


# Section 5: Practical Applications (15 min)

## 🎯 Research Workflow Integration

Let's see how to integrate this into your actual research workflow.

In [20]:
# Simulate a real research scenario
print("📄 RESEARCH PAPER SCENARIO")
print("=" * 30)

research_paper_sections = {
    "Introduction": [
        "distance = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)",
        "similarity = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))"
    ],
    "Methodology": [
        "cost = (1 / (2 * m)) * np.sum((hypothesis - y)**2)",
        "gradient = (1 / m) * X.T.dot(hypothesis - y)"
    ],
    "Results": [
        "accuracy = np.sum(predictions == y_test) / len(y_test)",
        "mse = np.mean((predictions - y_true)**2)"
    ]
}

print("📋 Your paper has code in multiple sections:")
total_expressions = 0
for section, expressions in research_paper_sections.items():
    print(f"\n📖 {section}:")
    for expr in expressions:
        print(f"   • {expr}")
        total_expressions += 1

print(f"\n📊 Total expressions to convert: {total_expressions}")
manual_minutes = total_expressions * 15  # assumes ~15 minutes of manual work per expression
agent_seconds = total_expressions * 2    # assumes ~2 seconds per LLM call
print(f"⏰ Manual conversion time: ~{manual_minutes} minutes")
print(f"🤖 Agent conversion time: ~{agent_seconds} seconds")
print(f"💡 Time saved: ~{manual_minutes - agent_seconds / 60:.0f} minutes!")

📄 RESEARCH PAPER SCENARIO
📋 Your paper has code in multiple sections:

📖 Introduction:
   • distance = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)
   • similarity = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

📖 Methodology:
   • cost = (1 / (2 * m)) * np.sum((hypothesis - y)**2)
   • gradient = (1 / m) * X.T.dot(hypothesis - y)

📖 Results:
   • accuracy = np.sum(predictions == y_test) / len(y_test)
   • mse = np.mean((predictions - y_true)**2)

📊 Total expressions to convert: 6
⏰ Manual conversion time: ~90 minutes
🤖 Agent conversion time: ~12 seconds
💡 Time saved: ~90 minutes!


In [21]:
if llm_available:
    print("\n🔄 BATCH PROCESSING DEMONSTRATION")
    print("=" * 35)
    
    # Collect all expressions
    all_expressions = []
    for section, expressions in research_paper_sections.items():
        all_expressions.extend(expressions)
    
    print(f"📦 Processing {len(all_expressions)} expressions...")
    
    # Batch convert using our agent
    start_time = time.time()
    batch_results = latex_converter.batch_convert(all_expressions)
    total_time = time.time() - start_time
    
    print(f"\n📊 BATCH CONVERSION RESULTS:")
    print(f"Total time: {total_time:.1f}s")
    print(f"Average per expression: {total_time/len(all_expressions):.1f}s")
    
    successful_conversions = sum(1 for r in batch_results if r.success)
    print(f"Success rate: {successful_conversions}/{len(all_expressions)} ({successful_conversions/len(all_expressions):.1%})")
    
    print("\n📄 Sample conversions:")
    for i, (original, result) in enumerate(zip(all_expressions[:3], batch_results[:3])):
        print(f"\n{i+1}. Original: {original}")
        if result.success:
            print(f"   LaTeX: {result.output}")
        else:
            print("   Error: Failed to convert")
            
else:
    print("\n⏭️ Skipping batch processing (no LLM available)")
    print("💡 Batch processing would convert all expressions efficiently")


🔄 BATCH PROCESSING DEMONSTRATION
📦 Processing 6 expressions...

📊 BATCH CONVERSION RESULTS:
Total time: 11.3s
Average per expression: 1.9s
Success rate: 6/6 (100.0%)

📄 Sample conversions:

1. Original: distance = np.sqrt((x1 - x2)**2 + (y1 - y2)**2)
   LaTeX: $ \text{distance} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} $

2. Original: similarity = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
   LaTeX: $\text{similarity} = \frac{v_1 \cdot v_2}{\|v_1\| \cdot \|v_2\|}$

3. Original: cost = (1 / (2 * m)) * np.sum((hypothesis - y)**2)
   LaTeX: $\text{cost} = \frac{1}{2m} \sum (hypothesis - y)^2$


## 💼 Integration Guidelines

### When to Use Each Approach:

1. **Hardcoded Approach**:
   - Simple, repetitive patterns
   - High-volume processing (1000+ expressions)
   - Offline processing requirements
   - Preprocessing step

2. **LLM Agent Approach**:
   - Complex mathematical expressions
   - Variable notation styles
   - Context-dependent conversion
   - Quality over speed priority

3. **LangGraph Workflow**:
   - Multi-step research processes
   - Need for validation and quality control
   - Collaborative research environments
   - Reproducible research workflows
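
The hybrid strategy mentioned in Section 2 (fast path first, LLM as fallback) can be sketched as a small router. Everything here is hypothetical: `Conversion`, `hybrid_convert`, the `0.9` threshold, and the two stand-in converters exist only so the example runs offline. In practice the two callables would wrap `HardcodedMathProcessor.process` and `latex_converter.convert_expression`.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# A converter takes code and returns (latex, confidence).
Converter = Callable[[str], Tuple[str, float]]

@dataclass
class Conversion:
    output: str
    confidence: float
    source: str  # which path produced the result

def hybrid_convert(code: str, fast: Converter, slow: Converter,
                   threshold: float = 0.9) -> Conversion:
    """Use `fast` when it is confident enough, otherwise fall back to `slow`."""
    output, confidence = fast(code)
    if confidence >= threshold:
        return Conversion(output, confidence, "hardcoded")
    output, confidence = slow(code)
    return Conversion(output, confidence, "llm")

# Stand-in converters with canned answers (assumptions for the demo).
def fake_fast(code: str) -> Tuple[str, float]:
    if code == "np.sqrt(x)":
        return r"$\sqrt{x}$", 1.0
    return f"${code}$", 0.3

def fake_slow(code: str) -> Tuple[str, float]:
    return r"$\sin(\theta)$", 0.95

print(hybrid_convert("np.sqrt(x)", fake_fast, fake_slow).source)     # hardcoded
print(hybrid_convert("np.sin(theta)", fake_fast, fake_slow).source)  # llm
```

Because the hardcoded processor in Section 2 reported confidence 1.00 on outputs with mismatched braces, a real router would likely combine the threshold with a structural check on the output instead of relying on confidence alone.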

# 🏃 Hands-on Exercise

**Challenge**: Adapt the code converter for your specific research domain!

## Exercise Tasks:

1. **Customize for Your Field**: Modify the function mappings for your research area
2. **Add New Patterns**: Extend the hardcoded processor with domain-specific patterns
3. **Create Custom Workflow**: Design a LangGraph workflow for your research process
4. **Test Real Examples**: Use expressions from your actual research work
