# 🌐 API Integration Example - LLM Survey Generator

This notebook demonstrates how to interact with the LLM Survey Generator REST API and WebSocket endpoints.

## Features Covered
1. **Authentication** - API key setup
2. **Paper Upload** - Submit papers for processing
3. **Survey Generation** - Create surveys via API
4. **Real-time Monitoring** - WebSocket progress tracking
5. **Result Retrieval** - Get completed surveys
6. **Batch Processing** - Multiple surveys in parallel

---

## 🔧 Setup and Configuration

In [None]:
import requests
import json
import time
import asyncio
import websockets
from pathlib import Path
from datetime import datetime
import pandas as pd
from IPython.display import display, Markdown, clear_output
import warnings
warnings.filterwarnings('ignore')

# API Configuration
BASE_URL = "http://localhost:8000"
API_KEY = "your-api-key-here"  # Replace with actual key

# Set up headers
headers = {
    "api-key": API_KEY,
    "Content-Type": "application/json"
}

print("🔧 API Client Configuration")
print(f"  Base URL: {BASE_URL}")
print(f"  API Key: {'*' * 10}...")

# Test connection
try:
    # A timeout keeps the cell from hanging if the server never answers
    response = requests.get(f"{BASE_URL}/health", timeout=5)
    if response.status_code == 200:
        print("✅ API is online and healthy")
        print(f"  Status: {response.json()}")
    else:
        print(f"⚠️ API responded with HTTP {response.status_code}. Start with: uvicorn src.api.main:app")
except requests.exceptions.RequestException:
    print("❌ Cannot connect to API. Please start the server first.")
    print("   Run: uvicorn src.api.main:app --reload")

## 1️⃣ Upload Papers to API

In [None]:
# Create sample paper data
sample_papers = [
    {
        "title": "Attention Is All You Need",
        "abstract": "The dominant sequence transduction models are based on complex recurrent or convolutional neural networks. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms.",
        "authors": ["Vaswani et al."],
        "year": 2017
    },
    {
        "title": "BERT: Pre-training of Deep Bidirectional Transformers",
        "abstract": "We introduce BERT, which stands for Bidirectional Encoder Representations from Transformers. BERT is designed to pre-train deep bidirectional representations.",
        "authors": ["Devlin et al."],
        "year": 2018
    },
    {
        "title": "Language Models are Few-Shot Learners",
        "abstract": "We demonstrate that scaling up language models greatly improves task-agnostic, few-shot performance.",
        "authors": ["Brown et al."],
        "year": 2020
    }
]

# Save papers to JSON file
papers_file = Path('../data/api_papers.json')
papers_file.parent.mkdir(parents=True, exist_ok=True)
with open(papers_file, 'w', encoding='utf-8') as f:
    json.dump(sample_papers, f, indent=2)

print(f"📄 Created {len(sample_papers)} sample papers")

# Upload papers via API
paper_ids = []

# Note: In production, you'd upload PDF files
# This is a simplified example using JSON data
for paper in sample_papers:
    # Simulate upload (in real API, use multipart/form-data with PDF)
    try:
        response = requests.post(
            f"{BASE_URL}/papers",
            json=paper,
            headers=headers,
            timeout=10
        )
        uploaded = response.status_code == 200
    except requests.exceptions.RequestException:
        uploaded = False  # API offline: fall through to demo mode

    if uploaded:
        paper_id = response.json().get('paper_id', f"demo-{len(paper_ids)}")
        paper_ids.append(paper_id)
        print(f"✅ Uploaded: {paper['title'][:50]}...")
        print(f"   Paper ID: {paper_id}")
    else:
        # For demo, create mock IDs
        paper_id = f"demo-{len(paper_ids)}"
        paper_ids.append(paper_id)
        print(f"📝 Demo mode - Paper ID: {paper_id}")

print(f"\n🎯 Total paper IDs collected: {len(paper_ids)}")

## 2️⃣ Create Survey Generation Job

In [None]:
# Create survey request
survey_request = {
    "topic": "Evolution of Transformer-based Language Models",
    "paper_ids": paper_ids,
    "system_type": "iterative",  # Options: baseline, lce, iterative
    "max_iterations": 3,
    "model_preference": "balanced"  # Options: fast, balanced, complex
}

print("📝 Survey Request:")
print(json.dumps(survey_request, indent=2))

# Submit survey generation request
try:
    response = requests.post(
        f"{BASE_URL}/surveys",
        json=survey_request,
        headers=headers,
        timeout=10
    )

    if response.status_code == 200:
        survey_data = response.json()
        survey_id = survey_data['survey_id']
        print(f"\n✅ Survey job created!")
        print(f"   Survey ID: {survey_id}")
        print(f"   Status: {survey_data['status']}")
    else:
        # Demo mode
        survey_id = "demo-survey-001"
        print(f"\n📝 Demo mode - Survey ID: {survey_id}")
except requests.exceptions.RequestException:
    # Only network-level failures switch to demo mode; other errors still surface
    survey_id = "demo-survey-001"
    print(f"\n📝 Demo mode (API offline) - Survey ID: {survey_id}")

## 3️⃣ Monitor Progress with Polling

In [None]:
def poll_survey_status(survey_id, max_polls=10, interval=2):
    """Poll survey status until completion or until max_polls is reached."""

    def demo_status(i):
        """Simulated status, used when the API is offline or returns an error."""
        return {
            "status": "processing" if i < 5 else "completed",
            "current_iteration": min(i // 2 + 1, 3),
            "current_phase": ["generating", "verifying", "improving"][i % 3],
            "quality_score": 3.5 + (i * 0.1)
        }

    print(f"🔄 Monitoring survey: {survey_id}\n")

    status_data = None  # stays None only if max_polls is 0
    for i in range(max_polls):
        try:
            response = requests.get(
                f"{BASE_URL}/surveys/{survey_id}/status",
                headers=headers,
                timeout=10
            )
            if response.status_code == 200:
                status_data = response.json()
            else:
                status_data = demo_status(i)
        except requests.exceptions.RequestException:
            status_data = demo_status(i)
        
        # Display status
        clear_output(wait=True)
        print(f"🔄 Survey Status Update [{i+1}/{max_polls}]")
        print("=" * 50)
        print(f"Status: {status_data['status'].upper()}")
        print(f"Iteration: {status_data.get('current_iteration', 'N/A')}")
        print(f"Phase: {status_data.get('current_phase', 'N/A')}")
        print(f"Quality Score: {status_data.get('quality_score', 0):.2f}")
        
        # Progress bar
        progress = (i + 1) / max_polls
        bar_length = 30
        filled = int(bar_length * progress)
        bar = '█' * filled + '░' * (bar_length - filled)
        print(f"\nProgress: [{bar}] {progress*100:.0f}%")
        
        if status_data['status'] == 'completed':
            print("\n✅ Survey generation completed!")
            return status_data
        
        time.sleep(interval)
    
    print("\n⏱️ Polling timeout. Survey may still be processing.")
    return status_data

# Poll for status
final_status = poll_survey_status(survey_id)
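
The fixed-interval loop above stops only on `completed`, so a job that ends in an error state would be polled until the loop runs out. The following is a minimal sketch of an alternative that backs off between polls and enforces an overall deadline. It assumes the API also reports a terminal `failed` status, which this notebook does not confirm, and `fetch_status` is a hypothetical callable that wraps the `GET /surveys/{survey_id}/status` request.

```python
import time


def wait_for_survey(fetch_status, timeout=120.0, initial_delay=1.0, max_delay=15.0,
                    terminal_states=("completed", "failed"),
                    sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() with exponential backoff until a terminal state or timeout.

    fetch_status is a hypothetical zero-argument callable returning the status dict.
    The "failed" state is an assumption about the API, not something the notebook shows.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        status = fetch_status()
        if status.get("status") in terminal_states:
            return status
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"survey still {status.get('status')!r} after {timeout}s")
        # Never sleep past the deadline; double the delay up to max_delay
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

With the names defined earlier in this notebook, `fetch_status` could be `lambda: requests.get(f"{BASE_URL}/surveys/{survey_id}/status", headers=headers, timeout=10).json()`. The `sleep` and `clock` parameters exist only so the helper can be exercised without real waiting.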

## 4️⃣ Real-time Monitoring with WebSocket

In [None]:
# WebSocket monitoring (requires async)
async def monitor_with_websocket(survey_id):
    """Stream real-time updates over WebSocket; return False if the connection fails."""

    # Derive the WebSocket URL from BASE_URL (http -> ws, https -> wss)
    ws_url = f"{BASE_URL.replace('http', 'ws', 1)}/ws/{survey_id}"
    print(f"🌐 Connecting to WebSocket: {ws_url}\n")

    try:
        async with websockets.connect(ws_url) as websocket:
            while True:
                message = await websocket.recv()
                data = json.loads(message)

                # Display update
                clear_output(wait=True)
                print("🔴 LIVE WebSocket Update")
                print("=" * 50)
                print(f"Survey ID: {data.get('survey_id', survey_id)}")
                print(f"Status: {data.get('status', 'unknown').upper()}")
                print(f"Iteration: {data.get('current_iteration', 'N/A')}/3")
                print(f"Phase: {data.get('current_phase', 'N/A')}")
                print(f"Quality: {data.get('quality_score', 0):.2f}/5.00")

                if data.get('status') == 'completed':
                    print("\n✅ Survey completed via WebSocket!")
                    return True

    except Exception as e:
        print(f"⚠️ WebSocket not available: {e}")
        return False

# Jupyter supports top-level await, so the coroutine can be awaited directly
if not await monitor_with_websocket(survey_id):
    print("📝 WebSocket demo - would show real-time updates in production")
    print("   Status: Processing → Verifying → Improving → Completed")
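
The receive loop above waits indefinitely if the server goes silent without closing the connection. One way to guard against that is to wrap each receive in `asyncio.wait_for`. The sketch below assumes the same JSON message shape as above and, as a further assumption, a terminal `failed` status. The `stream_updates` helper and its parameters are hypothetical names, not part of the API.

```python
import asyncio
import json


async def stream_updates(recv, idle_timeout=30.0, terminal_states=("completed", "failed")):
    """Collect JSON updates from recv() until a terminal status or an idle timeout.

    recv is any zero-argument coroutine function that returns one text frame,
    for example the bound method websocket.recv.
    Returns (updates, reason), where reason is the terminal status or "timeout".
    """
    updates = []
    while True:
        try:
            message = await asyncio.wait_for(recv(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            # No frame arrived within idle_timeout seconds
            return updates, "timeout"
        data = json.loads(message)
        updates.append(data)
        if data.get("status") in terminal_states:
            return updates, data["status"]
```

Inside the `async with websockets.connect(ws_url) as websocket:` block this could be called as `await stream_updates(websocket.recv)`. Errors raised by the connection itself, such as a closed socket, still propagate to the caller.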

## 5️⃣ Retrieve Completed Survey

In [None]:
# Get completed survey
def get_survey(survey_id, format='json'):
    """Retrieve completed survey."""
    
    try:
        response = requests.get(
            f"{BASE_URL}/surveys/{survey_id}",
            params={'format': format},
            headers=headers,
            timeout=10
        )

        if response.status_code == 200:
            return response.json() if format == 'json' else response.text
    except requests.exceptions.RequestException:
        pass  # API offline: fall through to the demo survey
    
    # Demo fallback
    return {
        "title": "Survey on Evolution of Transformer-based Language Models",
        "sections": [
            {
                "title": "Introduction",
                "content": "Transformer models have revolutionized NLP since 2017...",
                "citations": ["Vaswani et al.", "Devlin et al."]
            },
            {
                "title": "Architecture Evolution",
                "content": "From attention mechanisms to BERT and GPT...",
                "citations": ["Brown et al."]
            },
            {
                "title": "Applications",
                "content": "Modern applications span from translation to generation...",
                "citations": ["Devlin et al.", "Brown et al."]
            },
            {
                "title": "Conclusion",
                "content": "Transformers continue to advance the field...",
                "citations": []
            }
        ],
        "quality_score": 4.11,
        "iterations": 3,
        "generation_time": 45.2
    }

# Get survey in JSON format
survey_json = get_survey(survey_id, format='json')

print("📄 Retrieved Survey:")
print("=" * 50)
print(f"Title: {survey_json['title']}")
print(f"Sections: {len(survey_json['sections'])}")
print(f"Quality Score: {survey_json['quality_score']:.2f}/5.00")
print(f"Iterations: {survey_json['iterations']}")
print(f"Generation Time: {survey_json.get('generation_time', 'N/A')}s")

# Display sections
print("\n📚 Survey Sections:")
for i, section in enumerate(survey_json['sections'], 1):
    print(f"  {i}. {section['title']}")
    print(f"     Content: {section['content'][:80]}...")
    print(f"     Citations: {len(section['citations'])}")

## 6️⃣ Batch Processing Example

In [None]:
# Create multiple survey jobs
batch_topics = [
    "Transformer Architectures",
    "Pre-training Methods",
    "Few-shot Learning"
]

batch_jobs = []

print("🚀 Starting batch processing...\n")

for topic in batch_topics:
    request = {
        "topic": topic,
        "paper_ids": paper_ids[:2],  # Use subset for speed
        "system_type": "iterative",
        "max_iterations": 2  # Fewer iterations for batch
    }
    
    try:
        response = requests.post(
            f"{BASE_URL}/surveys",
            json=request,
            headers=headers,
            timeout=10
        )

        if response.status_code == 200:
            job_id = response.json()['survey_id']
        else:
            job_id = f"batch-{len(batch_jobs)}"
    except requests.exceptions.RequestException:
        job_id = f"batch-demo-{len(batch_jobs)}"
    
    batch_jobs.append({
        'id': job_id,
        'topic': topic,
        'status': 'submitted',
        'start_time': datetime.now()
    })
    
    print(f"📝 Submitted: {topic}")
    print(f"   Job ID: {job_id}")

print(f"\n✅ Batch submitted: {len(batch_jobs)} jobs")

# Monitor batch
print("\n⏳ Monitoring batch progress...\n")

for tick in range(5):  # Simulate monitoring
    clear_output(wait=True)
    print("📊 Batch Status Dashboard")
    print("=" * 60)

    for job in batch_jobs:
        # Simulate status updates
        if job['status'] == 'submitted':
            job['status'] = 'processing'
        elif job['status'] == 'processing':
            job['status'] = 'completed' if tick > 2 else 'processing'
        
        status_icon = {'submitted': '📝', 'processing': '🔄', 'completed': '✅'}[job['status']]
        print(f"{status_icon} {job['topic'][:30]:30s} | {job['status']:10s} | {job['id']}")
    
    completed = sum(1 for j in batch_jobs if j['status'] == 'completed')
    print(f"\n📈 Progress: {completed}/{len(batch_jobs)} completed")
    
    time.sleep(1)

print("\n🎉 Batch processing complete!")

## 7️⃣ Advanced API Features

In [None]:
# List all papers
def list_papers(skip=0, limit=10):
    """List uploaded papers with pagination."""
    
    try:
        response = requests.get(
            f"{BASE_URL}/papers",
            params={'skip': skip, 'limit': limit},
            headers=headers,
            timeout=10
        )

        if response.status_code == 200:
            return response.json()
    except requests.exceptions.RequestException:
        pass  # API offline: fall through to the demo list
    
    # Demo fallback
    return {
        'total': len(sample_papers),
        'papers': sample_papers[skip:skip+limit]
    }

# Get paper list
papers_list = list_papers()
print("📚 Available Papers:")
print(f"Total: {papers_list.get('total', len(sample_papers))}\n")

for i, paper in enumerate(papers_list.get('papers', sample_papers), 1):
    print(f"{i}. {paper['title']} ({paper.get('year', 'N/A')})")

# Rate limits (printed for reference; this cell does not query the API)
print("\n⚠️ Rate Limits:")
print("  • /upload: 30 requests/minute")
print("  • /surveys: 10 requests/minute")
print("  • Other endpoints: No limit")

# Error handling example
print("\n🛡️ Error Handling Example:")

# Try invalid request
invalid_request = {
    "topic": "",  # Empty topic
    "system_type": "invalid"  # Invalid system
}

try:
    response = requests.post(
        f"{BASE_URL}/surveys",
        json=invalid_request,
        headers=headers,
        timeout=10
    )

    if response.status_code != 200:
        print(f"Error {response.status_code}: {response.json().get('detail', 'Invalid request')}")
except (requests.exceptions.RequestException, ValueError):
    # Network failure, or an error body that is not valid JSON
    print("Error 400: Invalid request parameters (demo)")
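
Requests like the invalid one above can also be caught on the client before they count against the `/surveys` rate limit. The sketch below checks a request dict against the option values listed in the survey request cell (`baseline`, `lce`, `iterative` and `fast`, `balanced`, `complex`). Which fields the server actually requires is an assumption here, and `validate_survey_request` is a hypothetical helper, not part of the API.

```python
VALID_SYSTEM_TYPES = {"baseline", "lce", "iterative"}
VALID_MODEL_PREFERENCES = {"fast", "balanced", "complex"}


def validate_survey_request(request):
    """Return a list of problems; an empty list means the request looks valid.

    Assumed rules (not confirmed by the API): topic, paper_ids and system_type
    are required, model_preference and max_iterations are optional.
    """
    problems = []
    if not str(request.get("topic", "")).strip():
        problems.append("topic must be a non-empty string")
    if not request.get("paper_ids"):
        problems.append("paper_ids must contain at least one ID")
    if request.get("system_type") not in VALID_SYSTEM_TYPES:
        problems.append(f"system_type must be one of {sorted(VALID_SYSTEM_TYPES)}")
    preference = request.get("model_preference")
    if preference is not None and preference not in VALID_MODEL_PREFERENCES:
        problems.append(f"model_preference must be one of {sorted(VALID_MODEL_PREFERENCES)}")
    max_iterations = request.get("max_iterations")
    if max_iterations is not None and (not isinstance(max_iterations, int) or max_iterations < 1):
        problems.append("max_iterations must be a positive integer")
    return problems
```

Calling `validate_survey_request(invalid_request)` on the dict defined above reports the empty topic, the missing paper IDs and the unknown system type. The server remains the source of truth, so its error responses still need handling.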

## 8️⃣ Export Results

In [None]:
# Export survey to different formats
def export_survey(survey, format='markdown'):
    """Export survey in different formats."""
    
    if format == 'markdown':
        md = f"# {survey['title']}\n\n"
        
        for section in survey['sections']:
            md += f"## {section['title']}\n\n"
            md += f"{section['content']}\n\n"
            
            if section['citations']:
                md += "**References:**\n"
                for cite in section['citations']:
                    md += f"- {cite}\n"
                md += "\n"
        
        return md
    
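    elif format == 'html':
        from html import escape  # escape <, > and & so survey text cannot break the markup

        parts = [f"<h1>{escape(survey['title'])}</h1>\n"]

        for section in survey['sections']:
            parts.append(f"<h2>{escape(section['title'])}</h2>\n")
            parts.append(f"<p>{escape(section['content'])}</p>\n")

            if section['citations']:
                cites = ", ".join(escape(c) for c in section['citations'])
                parts.append(f"<p><em>Citations: {cites}</em></p>\n")

        return "".join(parts)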
    
    return survey  # Default JSON

# Export in different formats
output_dir = Path('../outputs/api_exports')
output_dir.mkdir(parents=True, exist_ok=True)

# Save as Markdown (explicit UTF-8 so non-ASCII survey text is written consistently)
md_content = export_survey(survey_json, format='markdown')
md_path = output_dir / f"survey_{survey_id}.md"
with open(md_path, 'w', encoding='utf-8') as f:
    f.write(md_content)
print(f"📝 Exported Markdown: {md_path}")

# Save as HTML
html_content = export_survey(survey_json, format='html')
html_path = output_dir / f"survey_{survey_id}.html"
with open(html_path, 'w', encoding='utf-8') as f:
    f.write(html_content)
print(f"🌐 Exported HTML: {html_path}")

# Save as JSON
json_path = output_dir / f"survey_{survey_id}.json"
with open(json_path, 'w', encoding='utf-8') as f:
    json.dump(survey_json, f, indent=2)
print(f"📄 Exported JSON: {json_path}")

# Display preview
print("\n📖 Markdown Preview:")
print("=" * 50)
display(Markdown(md_content[:500] + "..."))

## 🎯 API Integration Best Practices

### Authentication
```python
# Store API key securely
import os
API_KEY = os.environ.get('SURVEY_API_KEY')
```
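
`os.environ.get` leaves `API_KEY` as `None` when the variable is unset, which tends to surface later as a confusing failure. A minimal sketch that fails fast instead, and masks the key before printing it, might look like the following. `load_api_key` and `mask_key` are hypothetical helper names, and the `environ` parameter exists only to make the function easy to test.

```python
import os


def load_api_key(env_var="SURVEY_API_KEY", environ=None):
    """Return the API key from the environment, failing fast if it is missing."""
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running the notebook")
    return key


def mask_key(key, visible=4):
    """Show only the last few characters so the key can be printed safely."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

In the setup cell this would replace the hard-coded placeholder with `API_KEY = load_api_key()`, and the configuration printout could use `mask_key(API_KEY)`.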

### Error Handling
```python
try:
    response = requests.post(...)
    response.raise_for_status()
except requests.exceptions.RequestException as e:
    handle_error(e)
```
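
The snippet above leaves `handle_error` undefined. For transient failures such as a dropped connection, one common choice is to retry with exponential backoff and re-raise once the attempts run out. The sketch below is library-agnostic: `call_with_retries` is a hypothetical helper, `send` is any zero-argument callable that performs the request, and `retry_on` is the tuple of exception types considered transient.

```python
import time


def call_with_retries(send, retry_on, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry with exponential backoff on the given exception types."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With `requests`, `send` could be a lambda around `requests.post(..., timeout=10)` and `retry_on` could be `(requests.exceptions.ConnectionError, requests.exceptions.Timeout)`. Calling `raise_for_status()` on the returned response afterwards keeps 4xx errors, which retrying will not fix, out of the retry loop.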

### Rate Limiting
```python
from time import sleep
from functools import wraps

def rate_limit(max_calls=10, period=60):
    # Implement rate limiting decorator
    pass
```
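
The stub above can be filled in several ways. The following is one possible sliding-window implementation, offered as a sketch rather than the project's own limiter: it remembers the start times of recent calls and sleeps when `max_calls` have already happened within `period` seconds. The `clock` and `sleep` parameters are additions for testability, and the decorator is not thread-safe.

```python
import time
from collections import deque
from functools import wraps


def rate_limit(max_calls=10, period=60.0, clock=time.monotonic, sleep=time.sleep):
    """Decorator: allow at most max_calls per sliding window of period seconds."""
    def decorator(func):
        recent = deque()  # start times of the calls inside the current window

        @wraps(func)
        def wrapper(*args, **kwargs):
            now = clock()
            # Drop timestamps that have left the window
            while recent and now - recent[0] >= period:
                recent.popleft()
            if len(recent) >= max_calls:
                # Wait until the oldest call leaves the window, then forget it
                sleep(period - (now - recent[0]))
                now = clock()
                recent.popleft()
            recent.append(now)
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

Decorating the function that posts to `/surveys` with `@rate_limit(max_calls=10, period=60)` would keep a single-threaded client within the 10 requests/minute limit quoted earlier in this notebook.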

### WebSocket Reconnection
```python
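import asyncio
import websockets

async def connect_with_retry(url, max_retries=3):
    for attempt in range(max_retries):
        try:
            return await websockets.connect(url)
        except (OSError, websockets.exceptions.WebSocketException):
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            await asyncio.sleep(2 ** attempt)  # back off 1s, 2s, 4s, ...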
```

### Batch Processing
```python
# Use asyncio for concurrent requests
async def batch_process(topics):
    tasks = [create_survey(topic) for topic in topics]
    return await asyncio.gather(*tasks)
```
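
The snippet above assumes an async `create_survey`, while the rest of this notebook uses the blocking `requests` library. One way to bridge the two, sketched below, is to run each blocking call in a worker thread with `asyncio.to_thread` (Python 3.9+) and cap concurrency with a semaphore. Here `create_survey` is a hypothetical synchronous function passed in by the caller, and `max_concurrent` is an illustrative parameter.

```python
import asyncio


async def batch_process(topics, create_survey, max_concurrent=3):
    """Run the blocking create_survey(topic) for each topic, a few at a time.

    create_survey is a hypothetical synchronous function, for example a wrapper
    around requests.post; asyncio.to_thread keeps it off the event loop.
    Returns a dict mapping each topic to its result or to the exception raised.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(topic):
        async with semaphore:
            return await asyncio.to_thread(create_survey, topic)

    # return_exceptions=True returns failures as values instead of raising the first one
    results = await asyncio.gather(*(run_one(t) for t in topics), return_exceptions=True)
    return dict(zip(topics, results))
```

In a notebook cell this would be awaited directly, as in `await batch_process(batch_topics, create_survey)`, while a plain script would use `asyncio.run(...)`. A small `max_concurrent` also helps stay under the `/surveys` rate limit mentioned earlier.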

## 📚 API Documentation Links

- **Interactive Docs**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc
- **OpenAPI Schema**: http://localhost:8000/openapi.json

## 🚀 Next Steps

1. **Production Deployment**
   - Use HTTPS with SSL certificates
   - Implement API key rotation
   - Set up monitoring and alerts

2. **Advanced Features**
   - Custom model selection per request
   - Priority queue for urgent surveys
   - Scheduled survey generation

3. **Integration Options**
   - Build a web frontend
   - Create a mobile app
   - Integrate with research tools

---

*This notebook demonstrates complete API integration for the LLM Survey Generator system.*