# AI-Orchestrated PowerPoint Report Generator

This notebook uses **your LLM API's agentic capabilities** to automatically generate comprehensive PowerPoint reports from warpage data.

## How It Works

Instead of hardcoding visualizations and slides, this notebook:
1. **Uploads your data** to the LLM API
2. **Sends strategic prompts** to the ReAct agent
3. **Lets the agent autonomously**:
   - Analyze the data
   - Generate visualization code
   - Execute the code
   - Create PowerPoint slides
   - Add charts and formatting

**Key Advantage:** The AI adapts to your data structure, fixes errors automatically, and can suggest additional insights.

**Technology Stack:**
- Your LLM API (ReAct agent + python_coder tool)
- python-pptx (auto-generated)
- matplotlib/seaborn (auto-generated)

**Supports:** Multiple datasets, automatic comparison, adaptive visualizations

## 1. Setup & Configuration

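Run this cell as is. The client class needs no changes; only the configuration values at the bottom of the cell (`API_BASE_URL`, `USERNAME`, `PASSWORD`) depend on your environment.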
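
The client in the next cell opens each upload with a bare `open()` and closes it in a `finally` block. As a hedged alternative, the sketch below uses `contextlib.ExitStack` so that every handle is closed even if a later `open()` fails. The helper name `open_uploads` is hypothetical, and the demo only builds the `files` list in the shape the client passes to `httpx.post`; no request is sent.

```python
import tempfile
from contextlib import ExitStack
from pathlib import Path


def open_uploads(stack, file_paths):
    """Open each path in binary mode and register the handle with `stack`.

    Returns a list shaped like the multipart `files` argument used by the
    client: [("files", (filename, file_object)), ...].
    """
    uploads = []
    for file_path in file_paths:
        handle = stack.enter_context(open(file_path, "rb"))
        uploads.append(("files", (Path(file_path).name, handle)))
    return uploads


# Demo with throwaway files (hypothetical names); nothing is uploaded.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name in ("a_stats.json", "b_stats.json"):
        p = Path(tmp) / name
        p.write_text("{}", encoding="utf-8")
        paths.append(p)

    with ExitStack() as stack:
        uploads = open_uploads(stack, paths)
        upload_names = [name for _, (name, _) in uploads]
        open_during = [not f.closed for _, (_, f) in uploads]

    # Leaving the ExitStack block closes every registered handle.
    closed_after = [f.closed for _, (_, f) in uploads]

print(upload_names, open_during, closed_after)
```

Inside `chat_new` or `chat_continue`, the `httpx.post(...)` call would sit inside the `with ExitStack()` block, which makes the explicit `finally` loop unnecessary.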

In [None]:
import httpx
import json
from pathlib import Path
from datetime import datetime
import time

# LLM API Client (from API_examples.ipynb)
class LLMApiClient:
    def __init__(self, base_url: str, timeout: float = 3600.0):
        self.base_url = base_url.rstrip("/")
        self.token = None
        self.timeout = httpx.Timeout(50.0, read=timeout, write=timeout, pool=timeout)

    def _headers(self):
        h = {}
        if self.token:
            h["Authorization"] = f"Bearer {self.token}"
        return h

    def login(self, username: str, password: str):
        r = httpx.post(f"{self.base_url}/api/auth/login", json={
            "username": username, "password": password
        }, timeout=10.0)
        r.raise_for_status()
        data = r.json()
        self.token = data["access_token"]
        return data

    def list_models(self):
        # Reuse the shared header builder instead of duplicating the auth logic
        r = httpx.get(f"{self.base_url}/v1/models", headers=self._headers(), timeout=10.0)
        r.raise_for_status()
        return r.json()

    def chat_new(self, model: str, user_message: str, agent_type: str = "auto", files: list = None):
        messages = [{"role": "user", "content": user_message}]
        data = {
            "model": model,
            "messages": json.dumps(messages),
            "agent_type": agent_type
        }
        
        files_to_upload = []
        try:
            # Open uploads inside the try block so that handles opened
            # before a failing open() are still closed by the finally clause.
            for file_path in files or []:
                f = open(file_path, "rb")
                files_to_upload.append(("files", (Path(file_path).name, f)))

            r = httpx.post(
                f"{self.base_url}/v1/chat/completions",
                data=data,
                files=files_to_upload if files_to_upload else None,
                headers=self._headers(),
                timeout=self.timeout
            )
            r.raise_for_status()
            result = r.json()
            return result["choices"][0]["message"]["content"], result["x_session_id"]
        finally:
            for _, (_, f) in files_to_upload:
                f.close()

    def chat_continue(self, model: str, session_id: str, user_message: str, agent_type: str = "auto", files: list = None):
        messages = [{"role": "user", "content": user_message}]
        data = {
            "model": model,
            "messages": json.dumps(messages),
            "session_id": session_id,
            "agent_type": agent_type
        }
        
        files_to_upload = []
        try:
            # Open uploads inside the try block so that handles opened
            # before a failing open() are still closed by the finally clause.
            for file_path in files or []:
                f = open(file_path, "rb")
                files_to_upload.append(("files", (Path(file_path).name, f)))

            r = httpx.post(
                f"{self.base_url}/v1/chat/completions",
                data=data,
                files=files_to_upload if files_to_upload else None,
                headers=self._headers(),
                timeout=self.timeout
            )
            r.raise_for_status()
            result = r.json()
            return result["choices"][0]["message"]["content"], result["x_session_id"]
        finally:
            for _, (_, f) in files_to_upload:
                f.close()

# Configuration (replace with your own server URL and credentials)
API_BASE_URL = 'http://localhost:1007'
USERNAME = "leesihun"
PASSWORD = "s.hun.lee"

# Initialize client
client = LLMApiClient(API_BASE_URL, timeout=3600.0)
print(f"Client initialized: {API_BASE_URL}")

In [None]:
# Login and get model
client.login(USERNAME, PASSWORD)
models = client.list_models()
MODEL = models["data"][0]["id"]

print(f"Logged in as: {USERNAME}")
print(f"Using model: {MODEL}")

## 2. Configure Data Files

**Edit this cell** to specify your warpage statistics files:
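
The cell below only checks that each file exists. A slightly stricter pre-flight check, sketched here under assumptions, also confirms that each file parses as JSON before it is uploaded. The helper name `check_stats_file` is hypothetical, and the top-level `"files"` list is an assumption taken from how Phase 3 reads these files.

```python
import json
import tempfile
from pathlib import Path


def check_stats_file(path):
    """Return (ok, message) for one stats file without raising."""
    path = Path(path)
    if not path.is_file():
        return False, "file not found"
    try:
        with open(path, "r", encoding="utf-8") as f:
            payload = json.load(f)
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        return False, f"not valid JSON: {exc}"
    # Assumption: the top-level object carries a "files" list.
    n_entries = len(payload.get("files", [])) if isinstance(payload, dict) else 0
    return True, f"{n_entries} entries"


# Demo with throwaway files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good_stats.json"
    good.write_text(json.dumps({"files": [{}, {}, {}]}), encoding="utf-8")
    broken = Path(tmp) / "broken_stats.json"
    broken.write_text("{not json", encoding="utf-8")
    missing = Path(tmp) / "missing_stats.json"

    good_result = check_stats_file(good)
    broken_result = check_stats_file(broken)
    missing_result = check_stats_file(missing)

for label, (ok, message) in [("good", good_result),
                             ("broken", broken_result),
                             ("missing", missing_result)]:
    print(f"{label}: {'OK' if ok else 'FAILED'} ({message})")
```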

In [None]:
# ========================================
# CONFIGURATION: Define your data files
# ========================================

# Option 1: Single file
# stats_paths = [Path(f"data/uploads/{USERNAME}/20251013_stats.json")]

# Option 2: Multiple files for comparison
stats_paths = [
    Path("B8_1021_stats.json"),
    Path("B8_1027_stats.json"),
]

# Verify files exist
print(f"Configured {len(stats_paths)} data file(s):\n")


for i, path in enumerate(stats_paths, 1):
    if path.exists():
        size_kb = path.stat().st_size / 1024
        print(f"  [{i}] {path.name} ({size_kb:.1f} KB) - OK")
    else:
        print(f"  [{i}] {path.name} - FILE NOT FOUND")

# Convert to strings for API
file_paths_str = [str(p) for p in stats_paths]

## 3. Phase 1: Data Analysis (AI-Driven)

Let the AI agent analyze your data and recommend visualizations.
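
Because the agent computes the statistics itself, it can help to have a small local cross-check for the numbers it reports. The sketch below is one way to do that with the standard library only. The function names `summarize` and `farthest_points` are hypothetical, the sample values are made up, and the outlier rule (the fraction of points farthest from the PC1/PC2 centroid) mirrors the "top 5% by distance from center" wording used in the Phase 2 prompt rather than any rule confirmed by the API.

```python
import math
import statistics


def summarize(values):
    """Count, mean, sample std, min, max, and range of a numeric sequence."""
    values = list(values)
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


def farthest_points(points, fraction=0.05):
    """Indices of the `fraction` of (pc1, pc2) points farthest from the centroid.

    At least one index is always returned.
    """
    cx = statistics.fmean([x for x, _ in points])
    cy = statistics.fmean([y for _, y in points])
    distances = [math.hypot(x - cx, y - cy) for x, y in points]
    n_out = max(1, math.ceil(fraction * len(points)))
    ranked = sorted(range(len(points)), key=distances.__getitem__, reverse=True)
    return sorted(ranked[:n_out])


# Made-up values standing in for per-file means and PCA coordinates.
sample_means = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sample_pcs = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (10, 10)]

summary = summarize(sample_means)
outlier_indices = farthest_points(sample_pcs)
print(summary)
print(outlier_indices)
```

Note that `statistics.stdev` is the sample standard deviation; if the agent reports a population standard deviation, the two figures will differ slightly.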

In [None]:
analysis_prompt = f"""
I have {len(stats_paths)} warpage measurement dataset(s) attached as JSON files.

TASK: Comprehensive Data Analysis

Please perform the following analysis:

1. **Calculate Key Statistics:**
   - Total number of measurements across all datasets
   - Overall mean, std, min, max, range
   - Number of outliers, identified from the PCA values in the pc1 and pc2 fields
   - Dataset comparison between different production dates

2. **Identify Visualization Needs:**
   - What trends should be visualized?
   - What comparisons are important?
   - What outliers or anomalies should be highlighted?

3. **Return Summary:**
   Provide a clear summary with:
   - Which dataset is better, and why
   - Any concerns or interesting patterns found

Be specific and data-driven in your recommendations.
"""

print("=" * 80)
print("PHASE 1: DATA ANALYSIS")
print("=" * 80)
print(f"\nUploading {len(stats_paths)} file(s) to AI agent...\n")

start_time = time.time()
analysis_result, session_id = client.chat_new(
    MODEL,
    analysis_prompt,
    agent_type="auto",  # Let agent decide (will likely use python_coder)
    files=file_paths_str
)
elapsed = time.time() - start_time

print(f"\nAnalysis completed in {elapsed:.1f}s\n")
print("=" * 80)
print("AI ANALYSIS RESULT")
print("=" * 80)
print(analysis_result)
print("\n" + "=" * 80)

## 4. Phase 2: Generate Visualizations (AI-Driven)

Let the AI agent create all visualizations based on the analysis.
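
The prompt in the next cell renumbers each chart with an inline conditional such as `{'3' if len(stats_paths) > 1 else '2'}`. One alternative, sketched here, is to build the chart list first and let `enumerate` assign the numbers, so that adding or removing a chart cannot leave the numbering out of step. The function name `numbered_chart_list` is hypothetical; the chart titles and filenames are the ones used in the prompt.

```python
def numbered_chart_list(n_datasets):
    """Return numbered chart headings; the comparison chart appears only for 2+ datasets."""
    charts = [
        ("Temporal Trends", "temporal_trends.png"),
        ("Distribution Analysis", "distributions.png"),
        ("Box Plot Analysis", "boxplots.png"),
        ("PCA Scatter Plot", "pca_scatter.png"),
        ("Correlation Heatmap", "correlation_heatmap.png"),
        ("Control Chart", "control_chart.png"),
        ("Radar Chart", "radar_chart.png"),
        ("Summary Statistics Table", "summary_table.png"),
    ]
    if n_datasets > 1:
        # Same position as in the prompt: directly after the temporal trends chart.
        charts.insert(1, ("Dataset Comparison", "dataset_comparison.png"))
    return [
        f"{number}. **{title}** ({filename})"
        for number, (title, filename) in enumerate(charts, start=1)
    ]


single = numbered_chart_list(1)
multi = numbered_chart_list(2)
print("\n".join(multi))
```

The per-chart bullet details would still need to be attached to each heading; this sketch covers only the numbering.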

In [None]:
visualization_prompt = f"""
Based on the previous data analysis, now generate comprehensive visualizations.

TASK: Create High-Quality Visualization Charts

Create the following {8 if len(stats_paths) == 1 else 9} professional visualizations:

**Required Charts:**

1. **Temporal Trends** (temporal_trends.png)
   - Line chart: mean values over file index (temporal sequence)
   - Add median line (dashed)
   - Shade ±1 standard deviation area
   - Labels: 'File Index', 'Warpage Value'

{'''2. **Dataset Comparison** (dataset_comparison.png) - ONLY if multiple datasets
   - 2x2 grid of subplots:
     - Top-left: Bar chart of average mean by dataset
     - Top-right: Bar chart of average std by dataset
     - Bottom-left: Bar chart of average range by dataset
     - Bottom-right: Box plots of mean distribution by dataset
''' if len(stats_paths) > 1 else ''}

{'3' if len(stats_paths) > 1 else '2'}. **Distribution Analysis** (distributions.png)
   - 2x2 grid of histograms:
     - Mean distribution (with average line)
     - Std distribution (with average line)
     - Skewness distribution (with average line)
     - Kurtosis distribution (with average line and threshold at 47)

{'4' if len(stats_paths) > 1 else '3'}. **Box Plot Analysis** (boxplots.png)
   - 1x3 grid:
     - Min/Max/Range box plots
     - Mean/Median box plots
     - Std/Skewness/Kurtosis box plots (scaled for visibility)

{'5' if len(stats_paths) > 1 else '4'}. **PCA Scatter Plot** (pca_scatter.png)
   - Scatter plot: PC1 vs PC2
   - Color by file index (temporal sequence) using 'viridis' colormap
   - Annotate top 5% outliers (by distance from center)
   - Add colorbar labeled 'File Index (Time)'

{'6' if len(stats_paths) > 1 else '5'}. **Correlation Heatmap** (correlation_heatmap.png)
   - Heatmap of correlations between all numeric metrics
   - Use 'coolwarm' colormap, center at 0
   - Annotate with correlation coefficients

{'7' if len(stats_paths) > 1 else '6'}. **Control Chart** (control_chart.png)
   - Line chart of mean values over file index
   - Add center line (overall mean)
   - Add ±3σ control limits (UCL/LCL)
   - Shade ±2σ warning zone (yellow)
   - Mark out-of-control points with red 'X'

{'8' if len(stats_paths) > 1 else '7'}. **Radar Chart** (radar_chart.png)
   - Compare top 5 vs bottom 5 performers
   - Metrics: mean, std, range, skewness, kurtosis (normalized 0-1)
   - Use polar plot with filled areas

{'9' if len(stats_paths) > 1 else '8'}. **Summary Statistics Table** (summary_table.png)
   - Create table image using matplotlib
   - Show describe() output for all numeric columns
   - Color-code header row (blue) and row labels (gray)

**Technical Requirements:**
- Save all charts to 'temp_charts/' directory (create if needed)
- Use 300 DPI for all images
- Professional color scheme: Blue (#1f77b4), Orange (#ff7f0e), Green (#2ca02c), Red (#d62728), Gray (#7f7f7f)
- Set style: seaborn whitegrid
- Add proper titles, labels, legends
- Use tight_layout() before saving

**Packages:**
- matplotlib.pyplot
- seaborn
- pandas
- numpy
- sklearn.preprocessing.MinMaxScaler (for radar chart normalization)

Print confirmation after each chart is saved.
"""

print("=" * 80)
print("PHASE 2: VISUALIZATION GENERATION")
print("=" * 80)
print(f"\nGenerating {8 if len(stats_paths) == 1 else 9} professional visualizations...\n")

start_time = time.time()
viz_result, _ = client.chat_continue(
    MODEL,
    session_id,
    visualization_prompt,
    agent_type="auto"
)
elapsed = time.time() - start_time

print(f"\nVisualization generation completed in {elapsed:.1f}s\n")
print("=" * 80)
print("AI VISUALIZATION RESULT")
print("=" * 80)
print(viz_result)
print("\n" + "=" * 80)

## 5. Phase 3: PowerPoint Assembly (AI-Driven)

Let the AI agent create the PowerPoint presentation with all charts.
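
The prompt below asks for a 10 x 7.5 inch slide with each chart image 9 inches wide and centered. The arithmetic behind that layout is small enough to check by hand: PowerPoint files store lengths in English Metric Units (EMU), 914,400 per inch, which is the conversion python-pptx's `Inches` helper performs. The helper names `inches_to_emu` and `centered_left` are hypothetical and only illustrate the calculation.

```python
EMU_PER_INCH = 914_400  # Office Open XML length unit: 914,400 EMU per inch


def inches_to_emu(inches):
    """Convert inches to whole EMU."""
    return int(round(inches * EMU_PER_INCH))


def centered_left(slide_width_in, item_width_in):
    """Left offset (in inches) that horizontally centers an item on the slide."""
    return (slide_width_in - item_width_in) / 2


slide_width_in = 10.0   # slide size from the prompt
image_width_in = 9.0    # chart image width from the prompt

left_in = centered_left(slide_width_in, image_width_in)
print(f"left offset: {left_in} in = {inches_to_emu(left_in)} EMU")
print(f"slide width: {inches_to_emu(slide_width_in)} EMU")
```

A 9 inch image on a 10 inch slide therefore leaves a 0.5 inch margin on each side.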

In [None]:
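# Count the measurement entries across all datasets for the prompt.
# Assumes each stats JSON has a top-level "files" list; a missing key counts as 0.
total_files = 0
for path in stats_paths:
    with open(path, 'r', encoding='utf-8') as f:
        stats = json.load(f)
    total_files += len(stats.get('files', []))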

pptx_prompt = f"""
Create a professional PowerPoint presentation using python-pptx.

TASK: Generate Comprehensive PowerPoint Report

**Presentation Details:**
- Title: "Warpage Analysis Report"
- Subtitle: "Statistical Analysis of {total_files} Measurement Files{f' ({len(stats_paths)} Datasets)' if len(stats_paths) > 1 else ''}"
- Slide size: 10 x 7.5 inches
- Output filename: Warpage_Report_{datetime.now().strftime('%Y%m%d_%H%M%S')}.pptx

**Slide Structure:**

**Slide 1: Title Slide**
- Title: "Warpage Analysis Report" (44pt, bold, blue #1f77b4, centered)
- Subtitle: "Statistical Analysis of {total_files} Measurement Files{f' ({len(stats_paths)} Datasets)' if len(stats_paths) > 1 else ''}" (24pt, centered)
- Date: Current date (14pt, italic, centered)

**Slide 2: Executive Summary**
- Title: "Executive Summary" (32pt, bold, blue)
- 4 metric cards (colored boxes with white text):
  1. Total Files: {total_files} (blue background)
  2. Avg Mean Warpage: [calculate from data] (orange background)
  3. Avg Std Dev: [calculate from data] (green background)
  4. Outliers (Kurtosis>47): [count from data] (red background)
- Below cards: Add text summary from Phase 1 analysis

**Slide 3: Temporal Trends**
- Title: "Temporal Trends"
- Image: temp_charts/temporal_trends.png
- Description: "Mean warpage values over time with ±1σ variability envelope"

{f'''**Slide 4: Dataset Comparison**
- Title: "Dataset Comparison"
- Image: temp_charts/dataset_comparison.png
- Description: "Comparative analysis across {len(stats_paths)} datasets"
''' if len(stats_paths) > 1 else ''}

**Slide {'5' if len(stats_paths) > 1 else '4'}: Distribution Analysis**
- Title: "Distribution Analysis"
- Image: temp_charts/distributions.png
- Description: "Distribution of key statistical metrics"

**Slide {'6' if len(stats_paths) > 1 else '5'}: Variability Analysis**
- Title: "Variability Analysis"
- Image: temp_charts/boxplots.png
- Description: "Box plots showing spread and quartiles"

**Slide {'7' if len(stats_paths) > 1 else '6'}: PCA Analysis**
- Title: "PCA Analysis"
- Image: temp_charts/pca_scatter.png
- Description: "Principal Component Analysis with outlier detection"

**Slide {'8' if len(stats_paths) > 1 else '7'}: Correlation Matrix**
- Title: "Correlation Matrix"
- Image: temp_charts/correlation_heatmap.png
- Description: "Correlations between all metrics"

**Slide {'9' if len(stats_paths) > 1 else '8'}: Control Chart**
- Title: "Outlier Detection (Control Chart)"
- Image: temp_charts/control_chart.png
- Description: "Statistical process control with ±3σ limits"

**Slide {'10' if len(stats_paths) > 1 else '9'}: Top vs Bottom Performers**
- Title: "Top vs Bottom Performers"
- Image: temp_charts/radar_chart.png
- Description: "Comparison of best vs worst quality measurements"

**Slide {'11' if len(stats_paths) > 1 else '10'}: Summary Statistics**
- Title: "Summary Statistics"
- Image: temp_charts/summary_table.png
- Description: "Descriptive statistics for all metrics"

**Slide {'12' if len(stats_paths) > 1 else '11'}: Recommendations**
- Title: "Recommendations & Next Steps" (32pt, bold, blue)
- Three sections with bullet points:
  1. Quality Control (orange header, 18pt bold)
     - Investigate outlier files
     - Review out-of-control points
     - Establish tighter control limits
  2. Process Improvement (orange header, 18pt bold)
     - Target: mean warpage closer to 0
     - Reduce variability
     - Implement corrective actions
  3. Monitoring (orange header, 18pt bold)
     - Track PCA trends
     - Set up automated alerts
     - Conduct root cause analysis

**Technical Requirements:**
- Use blank layout (index 6) for all slides
- Add textboxes with specified formatting
- For chart slides:
  - Title at top (0.5 inches from top)
  - Image centered (9 inches wide)
  - Description at bottom if needed
- RGB colors: Blue (31, 119, 180), Orange (255, 127, 14), Green (44, 160, 44), Red (214, 39, 40)
- Use PP_ALIGN.CENTER for titles
- Save with timestamp in filename

Print confirmation when PowerPoint is saved, including:
- Filename
- File size
- Number of slides
- Full file path
"""

print("=" * 80)
print("PHASE 3: POWERPOINT ASSEMBLY")
print("=" * 80)
print(f"\nCreating PowerPoint with {11 if len(stats_paths) == 1 else 12} slides...\n")

start_time = time.time()
pptx_result, _ = client.chat_continue(
    MODEL,
    session_id,
    pptx_prompt,
    agent_type="auto"
)
elapsed = time.time() - start_time

print(f"\nPowerPoint creation completed in {elapsed:.1f}s\n")
print("=" * 80)
print("AI POWERPOINT RESULT")
print("=" * 80)
print(pptx_result)
print("\n" + "=" * 80)

## 6. Summary & Next Steps
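
The cell below reports the file name and size of the generated deck. If you also want to confirm that it has the expected 11 or 12 slides without opening it, one option is to inspect the file directly: a `.pptx` is a ZIP package, and slides are conventionally stored as `ppt/slides/slideN.xml`. The sketch below relies on that naming convention, which is an assumption about how the file was written; the function name `count_slides` is hypothetical, and the demo uses a stand-in archive rather than a real presentation.

```python
import re
import tempfile
import zipfile
from pathlib import Path

# Conventional part names for slides inside a .pptx package.
SLIDE_PART = re.compile(r"ppt/slides/slide\d+\.xml")


def count_slides(pptx_path):
    """Count slide parts in a .pptx by inspecting the ZIP member names."""
    with zipfile.ZipFile(pptx_path) as package:
        return sum(1 for name in package.namelist() if SLIDE_PART.fullmatch(name))


# Demo: a stand-in archive with three slide parts (not a valid presentation).
with tempfile.TemporaryDirectory() as tmp:
    fake_pptx = Path(tmp) / "Warpage_Report_demo.pptx"
    with zipfile.ZipFile(fake_pptx, "w") as package:
        for i in (1, 2, 3):
            package.writestr(f"ppt/slides/slide{i}.xml", "<sld/>")
        # Relationship files and other parts must not be counted.
        package.writestr("ppt/slides/_rels/slide1.xml.rels", "<Relationships/>")
        package.writestr("ppt/presentation.xml", "<presentation/>")
    slide_count = count_slides(fake_pptx)

print(f"slides found: {slide_count}")
```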

In [None]:
from pathlib import Path
import glob

print("\n" + "=" * 80)
print("REPORT GENERATION COMPLETE!")
print("=" * 80)

# Find generated PowerPoint
pptx_files = sorted(glob.glob("Warpage_Report_*.pptx"), reverse=True)
if pptx_files:
    latest_pptx = pptx_files[0]
    size_kb = Path(latest_pptx).stat().st_size / 1024
    print(f"\nGenerated PowerPoint:")
    print(f"  File: {latest_pptx}")
    print(f"  Size: {size_kb:.2f} KB")
else:
    print("\nWarning: Could not find generated PowerPoint file")

# Check generated charts
temp_charts = Path("temp_charts")
if temp_charts.exists():
    chart_files = list(temp_charts.glob("*.png"))
    print(f"\nGenerated Charts: {len(chart_files)}")
    for chart in sorted(chart_files):
        print(f"  - {chart.name}")

print("\n" + "=" * 80)
print("WHAT THE AI DID")
print("=" * 80)
print("""
Phase 1: Data Analysis
  - Loaded and combined all datasets
  - Calculated key statistics
  - Identified patterns and outliers
  - Recommended visualizations

Phase 2: Visualization Generation
  - Generated 8-9 professional charts
  - Used matplotlib/seaborn with 300 DPI
  - Applied professional color schemes
  - Saved all charts to temp_charts/

Phase 3: PowerPoint Assembly
  - Created 11-12 slide presentation
  - Added title, summary, charts, and recommendations
  - Formatted with professional styling
  - Saved with timestamp

Total Steps: All autonomous via ReAct agent + python_coder tool!
""")

print("=" * 80)
print("NEXT STEPS")
print("=" * 80)
print("""
1. Open the generated .pptx file
2. Review charts and insights
3. Customize branding/colors if needed
4. Add company logo
5. Present to stakeholders

To regenerate with different data:
  - Update stats_paths in Section 2
  - Run all cells again

To cleanup temporary files:
  - Delete temp_charts/ directory
  - Delete old .pptx files
""")

print("=" * 80)

## 7. Optional: Cleanup

Remove temporary chart files if desired.
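
Old reports can be pruned in a similar way. Because the report filenames embed a `%Y%m%d_%H%M%S` timestamp, sorting them by name also sorts them by age. The sketch below keeps the newest reports and deletes the rest; the function name `prune_reports` and its `keep` parameter are hypothetical, and the demo runs against a temporary directory with made-up filenames.

```python
import tempfile
from pathlib import Path


def prune_reports(directory, keep=1, pattern="Warpage_Report_*.pptx"):
    """Delete all but the `keep` newest reports; return the removed file names."""
    # Timestamped names sort chronologically, so reverse order puts the newest first.
    reports = sorted(Path(directory).glob(pattern), reverse=True)
    removed = []
    for old_report in reports[keep:]:
        old_report.unlink()
        removed.append(old_report.name)
    return removed


# Demo with empty stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    for stamp in ("20250101_090000", "20250102_090000", "20250103_090000"):
        (Path(tmp) / f"Warpage_Report_{stamp}.pptx").touch()
    removed_names = prune_reports(tmp, keep=1)
    remaining_names = sorted(p.name for p in Path(tmp).glob("*.pptx"))

print("removed:", removed_names)
print("remaining:", remaining_names)
```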

In [None]:
import shutil

cleanup = input("Delete temporary chart files? (y/n): ")
temp_charts = Path("temp_charts")
if cleanup.strip().lower() == 'y':
    if temp_charts.exists():
        shutil.rmtree(temp_charts)
        print(f"✓ Cleaned up {temp_charts}")
    else:
        # Report the no-op instead of finishing silently
        print(f"Nothing to clean up: {temp_charts}/ does not exist")
else:
    print(f"Temporary charts preserved in: {temp_charts}/")

---

## About This Notebook

This notebook demonstrates the power of **AI-orchestrated automation**:

- **Compact**: roughly 200 lines of orchestration code, versus an estimated 2000+ lines of hardcoded logic
- **Adaptive**: AI adjusts to your data structure
- **Self-healing**: The agent auto-fixes code errors (max 5 retries)
- **Multi-file support**: Built into your API
- **Leverages your infrastructure**: ReAct agent + python_coder tool

**How it works:**
1. You provide strategic prompts
2. ReAct agent reasons about the task
3. python_coder generates code
4. Code executes in sandboxed environment
5. Agent verifies results and iterates if needed
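
The generate, execute, and iterate steps above can be pictured as a bounded retry loop. The sketch below is only an illustration of that idea, not the API's actual implementation: `run_with_retries`, `generate_code`, `execute`, and `max_attempts` are hypothetical names, the two stubs stand in for the LLM and the sandbox, and treating the "max 5 retries" limit as five total attempts is an assumption.

```python
def run_with_retries(generate_code, execute, max_attempts=5):
    """Generate code, run it, and feed any error back into the next attempt.

    Returns (result, attempts_used); raises RuntimeError once attempts run out.
    """
    error = None
    for attempt in range(1, max_attempts + 1):
        code = generate_code(error)      # the previous error guides the fix
        try:
            return execute(code), attempt
        except Exception as exc:         # any failure triggers another attempt
            error = f"{type(exc).__name__}: {exc}"
    raise RuntimeError(f"gave up after {max_attempts} attempts; last error: {error}")


def fake_generate(previous_error):
    """Stub for the code generator: broken first, fixed once it sees an error."""
    return "1 / 0" if previous_error is None else "6 * 7"


def fake_execute(code):
    """Stub for the sandbox. Do not eval untrusted code outside a demo."""
    return eval(code)


result, attempts = run_with_retries(fake_generate, fake_execute)
print(f"result={result} after {attempts} attempt(s)")
```

Here the first attempt fails with a `ZeroDivisionError`, the error text is passed back to the generator, and the second attempt succeeds.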

**Result:** Comprehensive PowerPoint report generated autonomously!

---

**Version:** 1.0.0 (AI-Orchestrated)  
**Last Updated:** January 2025