# 🚀 Week 3, Day 3: Interactive Visualizations

**🎯 Goal:** Create stunning interactive visualizations and AI dashboards with Plotly

**⏱️ Time:** 60-90 minutes

**🌟 Why This Matters for AI:**
- **Interactive exploration** - Zoom, pan, hover for insights
- **Real-time dashboards** - Monitor AI models in production
- **Stakeholder communication** - Impress non-technical audiences
- **Web deployment** - Share interactive reports online
- **3D visualization** - Explore high-dimensional data

---

## 🔥 2024-2025 AI Trend Alert!

**AI Observability** is now ESSENTIAL:
- Monitor LLM costs, latency, quality in real-time
- **Interactive dashboards track production AI systems!**
- Observability platforms such as LangSmith, W&B, and Arize are built around interactive dashboards like these

**Agentic AI** requires real-time monitoring:
- Track autonomous agent decisions
- Visualize multi-step reasoning chains
- **Interactive plots show agent behavior patterns!**

**Multimodal AI Debugging**:
- Explore relationships between text, image, audio embeddings
- **3D scatter plots visualize embedding spaces!**

**You'll build the kind of dashboards used to monitor production AI systems!** 🚀

---

## 📊 What is Plotly?

**Plotly** = Interactive, web-based visualizations

Think of it as:
- Matplotlib: Static images 📷
- Plotly: Interactive web apps 🌐

**Key superpowers:**
- **Hover** to see exact values
- **Zoom** into interesting regions
- **Pan** around your data
- **Click** legends to show/hide
- **Export** to PNG, SVG, HTML
- **Share** on the web as a standalone HTML file

Let's make your first interactive plot! ✨

In [None]:
# Install Plotly (if needed)
# !pip install plotly

import plotly
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import pandas as pd
import numpy as np

# The version string lives on the top-level package, not on plotly.express
print("Plotly version:", plotly.__version__)
print("✅ Ready to create interactive visualizations!")

## 🎨 Your First Interactive Plot - One Line of Code!

In [None]:
# Simulated AI model training data (seeded so the curves are reproducible)
np.random.seed(42)
epochs = np.arange(1, 51)
training_data = pd.DataFrame({
    'Epoch': epochs,
    'Training_Loss': 2.5 * np.exp(-0.08 * epochs) + 0.1 * np.random.randn(50),
    'Validation_Loss': 2.5 * np.exp(-0.07 * epochs) + 0.15 * np.random.randn(50),
    'Learning_Rate': 0.001 * 0.95 ** epochs
})

# Create interactive line plot
fig = px.line(training_data, x='Epoch', y=['Training_Loss', 'Validation_Loss'],
             title='🤖 Interactive AI Training Monitor',
             labels={'value': 'Loss', 'variable': 'Type'})

fig.update_layout(hovermode='x unified')  # Unified hover tooltip
fig.show()

print("🎉 Try it:")
print("  - Hover over the line to see exact values")
print("  - Double-click legend items to isolate")
print("  - Single-click legend to show/hide")
print("  - Use toolbar to zoom, pan, reset")

## 📈 Interactive Scatter Plots - Explore Data Relationships

### 1️⃣ Basic Interactive Scatter

In [None]:
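# Illustrative LLM specifications: parameter counts for closed models are unofficial estimates, and scores and prices are demo values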
np.random.seed(42)
models_df = pd.DataFrame({
    'Model': ['GPT-4', 'Claude-3.5-Sonnet', 'Gemini-1.5-Pro', 'Llama-3.1-405B', 
             'GPT-3.5', 'Claude-3-Haiku', 'Gemini-1.5-Flash', 'Llama-3.1-70B',
             'GPT-4-Turbo', 'Mistral-Large', 'Command-R+', 'Phi-3'],
    'Parameters_B': [1800, 200, 540, 405, 175, 50, 100, 70, 1800, 176, 104, 14],
    'Benchmark_Score': [95, 94, 92, 89, 85, 88, 87, 86, 94, 88, 87, 82],
    'Cost_per_1M_tokens': [30, 15, 7, 5, 2, 1.5, 2.5, 3, 20, 8, 5, 1],
    'Release_Year': [2023, 2024, 2024, 2024, 2022, 2024, 2024, 2024, 2023, 2024, 2024, 2024],
    'Category': ['Frontier', 'Frontier', 'Frontier', 'Open', 
                'Legacy', 'Fast', 'Fast', 'Open',
                'Frontier', 'Specialized', 'Specialized', 'Efficient']
})

# Create interactive scatter with multiple features
fig = px.scatter(models_df, 
                x='Cost_per_1M_tokens', 
                y='Benchmark_Score',
                size='Parameters_B',  # Bubble size
                color='Category',  # Color by category
                hover_name='Model',  # Show on hover
                hover_data={'Parameters_B': ':.0f', 'Release_Year': True},
                title='🔍 LLM Performance vs Cost Analysis (2024)',
                labels={'Cost_per_1M_tokens': 'Cost per 1M Tokens ($)',
                       'Benchmark_Score': 'Performance Score'})

fig.update_layout(height=600)
fig.show()

print("💡 Insights:")
print("  - Frontier models: High performance, high cost")
print("  - Efficient models: Good performance/cost ratio")
print("  - Try hovering to see each model's details!")

### 2️⃣ Animated Scatter - Show Trends Over Time!

In [None]:
# Create synthetic historical data (2020-2024); sizes and scores are simulated for the animation, not real measurements
historical_data = pd.DataFrame({
    'Year': np.repeat([2020, 2021, 2022, 2023, 2024], 5),
    'Model': ['GPT-3', 'BERT', 'T5', 'RoBERTa', 'XLNet'] * 5,
    'Parameters_B': np.array([175, 110, 11, 125, 54] * 5) * np.repeat([1, 1.1, 1.2, 1.3, 1.4], 5),
    # np.tile repeats the 5 base scores once per year so every array has length 25
    'Benchmark_Score': np.tile([70, 65, 68, 72, 66], 5) + np.repeat([0, 3, 6, 10, 15], 5) + np.random.randn(25),
    'Category': ['LLM', 'Encoder', 'Seq2Seq', 'Encoder', 'Encoder'] * 5
})

# Create animated scatter
fig = px.scatter(historical_data,
                x='Parameters_B',
                y='Benchmark_Score',
                animation_frame='Year',  # Animate over years!
                animation_group='Model',  # Group by model
                size='Parameters_B',
                color='Category',
                hover_name='Model',
                title='📈 AI Model Evolution (2020-2024)',
                range_x=[0, 300],
                range_y=[60, 95])

fig.update_layout(height=600)
fig.show()

print("🎬 Press PLAY to see AI evolution over time!")
print("   Notice: in this synthetic data, models get bigger AND score higher each year!")

## 📊 Interactive Bar Charts & Histograms

In [None]:
# Create AI usage statistics
usage_stats = pd.DataFrame({
    'Use_Case': ['Code Generation', 'Content Writing', 'Data Analysis', 
                'Customer Support', 'Research', 'Translation', 'Summarization'],
    'GPT-4': [92, 88, 85, 90, 94, 89, 91],
    'Claude-3.5': [94, 96, 87, 88, 92, 87, 93],
    'Gemini-Pro': [90, 85, 90, 86, 88, 91, 89]
})

usage_long = usage_stats.melt(id_vars='Use_Case', var_name='Model', value_name='Score')

# Interactive grouped bar chart
fig = px.bar(usage_long,
            x='Use_Case',
            y='Score',
            color='Model',
            barmode='group',  # Grouped bars
            title='🎯 LLM Performance Across Use Cases',
            labels={'Score': 'Performance Score', 'Use_Case': 'Use Case'})

fig.update_layout(height=500)
fig.show()

print("🎨 Try clicking legend items to compare models!")

## 🌡️ Interactive Heatmaps - Correlation Analysis

In [None]:
# Create feature correlation matrix for ML model
features = ['Age', 'Income', 'Credit_Score', 'Debt_Ratio', 'Loan_Amount', 'Interest_Rate']
correlation_matrix = np.array([
    [1.00, 0.45, 0.62, -0.23, 0.38, -0.15],
    [0.45, 1.00, 0.78, -0.45, 0.68, -0.52],
    [0.62, 0.78, 1.00, -0.67, 0.42, -0.71],
    [-0.23, -0.45, -0.67, 1.00, -0.15, 0.82],
    [0.38, 0.68, 0.42, -0.15, 1.00, -0.34],
    [-0.15, -0.52, -0.71, 0.82, -0.34, 1.00]
])

# Create interactive heatmap
fig = go.Figure(data=go.Heatmap(
    z=correlation_matrix,
    x=features,
    y=features,
    colorscale='RdBu',
    zmid=0,  # Center colorscale at 0
    text=correlation_matrix,
    texttemplate='%{text:.2f}',
    textfont={"size": 12},
    colorbar=dict(title="Correlation")
))

fig.update_layout(
    title='🔥 Feature Correlation Matrix - Loan Prediction Model',
    xaxis_title='Features',
    yaxis_title='Features',
    height=600
)

fig.show()

print("💡 Feature engineering insights:")
print("  - Strong positive: Income ↔ Credit_Score (0.78)")
print("  - Strong negative: Credit_Score ↔ Interest_Rate (-0.71)")
print("  - Hover to see exact correlations!")

## 🎯 3D Scatter Plots - Visualize Embeddings!

In [None]:
# Simulate text embeddings (like BERT, GPT embeddings)
np.random.seed(42)
n_samples = 200

# Create 3 clusters (representing different topics)
cluster1 = np.random.randn(n_samples//3, 3) + [2, 2, 2]  # Tech
cluster2 = np.random.randn(n_samples//3, 3) + [-2, -2, 2]  # Health
cluster3 = np.random.randn(n_samples//3, 3) + [0, -2, -2]  # Finance

embeddings_3d = pd.DataFrame(
    np.vstack([cluster1, cluster2, cluster3]),
    columns=['Dim_1', 'Dim_2', 'Dim_3']
)
embeddings_3d['Topic'] = ['Tech']*(n_samples//3) + ['Health']*(n_samples//3) + ['Finance']*(n_samples//3)
embeddings_3d['Document_ID'] = [f'Doc_{i}' for i in range(len(embeddings_3d))]

# Create 3D scatter plot
fig = px.scatter_3d(embeddings_3d,
                    x='Dim_1',
                    y='Dim_2',
                    z='Dim_3',
                    color='Topic',
                    hover_name='Document_ID',
                    title='🌐 3D Text Embedding Visualization (t-SNE/UMAP style)',
                    labels={'Dim_1': 'Dimension 1',
                           'Dim_2': 'Dimension 2',
                           'Dim_3': 'Dimension 3'})

fig.update_traces(marker=dict(size=5))
fig.update_layout(height=700)
fig.show()

print("🎮 3D Interaction:")
print("  - Click and drag to rotate")
print("  - Scroll to zoom")
print("  - Hover to see document IDs")
print("\n✨ Similar 3D views are used to inspect document similarity in RAG systems!")

## 📊 Subplots - Create Dashboards!

In [None]:
# Create comprehensive AI monitoring dashboard (simulated metrics)
np.random.seed(42)
hours = np.arange(24)
api_calls = np.random.poisson(500, 24) + hours * 20  # Traffic grows through the day

dashboard_data = {
    'Hour': hours,
    'API_Calls': api_calls,
    'Avg_Latency_ms': np.random.gamma(2, 100, 24),
    'Error_Rate': np.random.beta(2, 100, 24),
    'Cost_USD': api_calls * 0.002  # Cost follows the same traffic at $0.002 per call
}

# Create 2x2 subplot dashboard
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=('API Calls Over Time', 'Average Latency',
                   'Error Rate', 'Cumulative Cost'),
    specs=[[{'type': 'scatter'}, {'type': 'scatter'}],
          [{'type': 'scatter'}, {'type': 'scatter'}]]
)

# Plot 1: API Calls
fig.add_trace(
    go.Scatter(x=hours, y=dashboard_data['API_Calls'], 
              mode='lines+markers', name='API Calls',
              line=dict(color='blue', width=2)),
    row=1, col=1
)

# Plot 2: Latency
fig.add_trace(
    go.Scatter(x=hours, y=dashboard_data['Avg_Latency_ms'],
              mode='lines+markers', name='Latency',
              line=dict(color='orange', width=2)),
    row=1, col=2
)

# Plot 3: Error Rate
fig.add_trace(
    go.Scatter(x=hours, y=dashboard_data['Error_Rate'] * 100,
              mode='lines+markers', name='Error Rate',
              line=dict(color='red', width=2)),
    row=2, col=1
)

# Plot 4: Cumulative Cost
fig.add_trace(
    go.Scatter(x=hours, y=np.cumsum(dashboard_data['Cost_USD']),
              mode='lines+markers', name='Cost',
              line=dict(color='green', width=2),
              fill='tozeroy'),
    row=2, col=2
)

# Update layout
fig.update_xaxes(title_text="Hour of Day", row=2, col=1)
fig.update_xaxes(title_text="Hour of Day", row=2, col=2)
fig.update_yaxes(title_text="Calls", row=1, col=1)
fig.update_yaxes(title_text="ms", row=1, col=2)
fig.update_yaxes(title_text="Error %", row=2, col=1)
fig.update_yaxes(title_text="USD", row=2, col=2)

fig.update_layout(
    title_text='🚀 AI API Monitoring Dashboard - Simulated Production Metrics',
    height=800,
    showlegend=False
)

fig.show()

print("📊 This mirrors a typical production AI monitoring view!")
print("   Core signals: traffic, latency, error rate, and cost")

## 🎯 Real AI Example: Model Performance Comparison

In [None]:
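# Create model benchmark data (approximate values for demonstration; verify against official reports before citing)
benchmarks = pd.DataFrame({
    # Scores below are listed model by model, so repeat each model name 6 times
    'Model': np.repeat(['GPT-4', 'Claude-3.5', 'Gemini-1.5-Pro', 'Llama-3.1-405B'], 6),
    'Benchmark': ['MMLU', 'GSM8K', 'HumanEval', 'TruthfulQA', 'HellaSwag', 'ARC'] * 4,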
    'Score': [
        # GPT-4
        86.4, 92.0, 67.0, 59.0, 95.3, 96.3,
        # Claude-3.5  
        88.7, 96.4, 73.0, 62.0, 95.4, 96.4,
        # Gemini-1.5-Pro
        85.9, 91.7, 71.9, 57.0, 92.5, 95.6,
        # Llama-3.1-405B
        85.2, 89.0, 61.0, 55.0, 89.3, 94.8
    ]
})

# Create interactive grouped bar chart
fig = px.bar(benchmarks,
            x='Benchmark',
            y='Score',
            color='Model',
            barmode='group',
            title='🏆 LLM Benchmark Comparison (2024)',
            labels={'Score': 'Benchmark Score (%)', 'Benchmark': 'Test'},
            text='Score')

fig.update_traces(texttemplate='%{text:.1f}', textposition='outside')
fig.update_layout(height=600)
fig.show()

print("📊 Benchmark meanings:")
print("  - MMLU: General knowledge (57 subjects)")
print("  - GSM8K: Math word problems")
print("  - HumanEval: Code generation")
print("  - TruthfulQA: Truthfulness")
print("  - HellaSwag: Common sense reasoning")
print("  - ARC: Science questions")

## 🌊 Waterfall Charts - Show Cumulative Effects

In [None]:
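# AI model optimization journey (illustrative numbers)
optimization_steps = {
    'Step': ['Baseline', 'Better Prompt', 'RAG Added', 'Fine-tuning', 
            'Caching', 'Batch Processing', 'Final'],
    # Negative = accuracy traded away for lower cost; the last value is a
    # placeholder because Plotly computes the 'total' bar itself
    'Improvement': [0, 15, 25, 20, -10, -8, 0],
    'Measure': ['relative'] * 6 + ['total']
}

# Label each step with its change and the final bar with the running total
step_changes = optimization_steps['Improvement'][:-1]
bar_labels = [f"+{v}%" if v > 0 else f"{v}%" for v in step_changes]
bar_labels.append(f"{sum(step_changes):+d}%")

fig = go.Figure(go.Waterfall(
    name="Accuracy",
    orientation="v",
    measure=optimization_steps['Measure'],
    x=optimization_steps['Step'],
    y=optimization_steps['Improvement'],
    textposition="outside",
    text=bar_labels,
    connector={"line": {"color": "rgb(63, 63, 63)"}},
))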

fig.update_layout(
    title="🎯 AI Model Optimization Journey",
    xaxis_title="Optimization Step",
    yaxis_title="Accuracy Improvement (%)",
    height=500
)

fig.show()

print("💡 Optimization insights:")
print("  - Biggest win: RAG (+25%)")
print("  - Trade-offs: Caching (-10%) and batch processing (-8%) give back some accuracy")
print("  - Final improvement: +42% total!")
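
A waterfall chart is a running sum: each `relative` bar starts where the previous one ended, and the `total` bar shows the accumulated value. The arithmetic behind the chart above, using the same step values:

```python
from itertools import accumulate

steps = ['Baseline', 'Better Prompt', 'RAG Added', 'Fine-tuning',
         'Caching', 'Batch Processing']
deltas = [0, 15, 25, 20, -10, -8]   # Same relative changes as in the chart

# Running level after each step (where each bar ends)
levels = list(accumulate(deltas))

for step, delta, level in zip(steps, deltas, levels):
    print(f"{step:<17} {delta:+4d}  ->  {level:3d}")

final_total = levels[-1]
print(f"Final total: {final_total:+d}")   # Final total: +42
```

The last running level is exactly what the `total` bar displays, which is why the placeholder value passed for that bar does not matter.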

## 🎯 MINI CHALLENGE: Build an LLM Cost Optimizer Dashboard

**Scenario:** You need to optimize LLM API costs across different models!

**Your Task:** Create an interactive dashboard with:
1. Cost per model over time
2. Quality vs Cost scatter plot
3. Token usage distribution
4. Cost-efficiency ranking (quality per dollar) to guide recommendations

In [None]:
# Create realistic LLM usage data
np.random.seed(42)
days = 30

cost_data = pd.DataFrame({
    'Day': np.tile(np.arange(1, days+1), 4),
    'Model': np.repeat(['GPT-4', 'GPT-3.5', 'Claude-3.5', 'Llama-3'], days),
    'Daily_Cost': np.concatenate([
        np.random.gamma(3, 30, days),  # GPT-4
        np.random.gamma(2, 10, days),  # GPT-3.5
        np.random.gamma(2.5, 20, days),  # Claude
        np.random.gamma(1.5, 5, days)   # Llama
    ]),
    'Quality_Score': np.concatenate([
        np.random.beta(9, 1, days) * 100,
        np.random.beta(7, 2, days) * 100,
        np.random.beta(8, 1.5, days) * 100,
        np.random.beta(6, 2.5, days) * 100
    ]),
    'Tokens_Used': np.concatenate([
        np.random.poisson(50000, days),
        np.random.poisson(150000, days),
        np.random.poisson(80000, days),
        np.random.poisson(200000, days)
    ])
})

# Reference solution (try building your own dashboard first!)
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=('Daily Cost Trends', 'Quality vs Cost Analysis',
                   'Token Usage Distribution', 'Cost Efficiency Score'),
    specs=[[{'type': 'scatter'}, {'type': 'scatter'}],
          [{'type': 'box'}, {'type': 'bar'}]]
)

# Give each model one color in every panel so the shared legend stays accurate
models = ['GPT-4', 'GPT-3.5', 'Claude-3.5', 'Llama-3']
model_colors = dict(zip(models, px.colors.qualitative.Plotly))

for model in models:
    model_data = cost_data[cost_data['Model'] == model]
    color = model_colors[model]

    # Plot 1: Cost trends (the only traces listed in the legend)
    fig.add_trace(
        go.Scatter(x=model_data['Day'], y=model_data['Daily_Cost'],
                  mode='lines', name=model, legendgroup=model,
                  line=dict(color=color)),
        row=1, col=1
    )

    # Plot 2: Quality vs Cost
    fig.add_trace(
        go.Scatter(x=model_data['Daily_Cost'], y=model_data['Quality_Score'],
                  mode='markers', name=model, legendgroup=model,
                  marker=dict(color=color), showlegend=False),
        row=1, col=2
    )

    # Plot 3: Token distribution
    fig.add_trace(
        go.Box(y=model_data['Tokens_Used'], name=model, legendgroup=model,
              marker_color=color, showlegend=False),
        row=2, col=1
    )

# Plot 4: Efficiency (average quality per dollar of average daily cost)
model_means = cost_data.groupby('Model')[['Quality_Score', 'Daily_Cost']].mean()
efficiency = (
    model_means['Quality_Score'] / model_means['Daily_Cost']
).reset_index(name='Efficiency')

fig.add_trace(
    go.Bar(x=efficiency['Model'], y=efficiency['Efficiency'],
          text=efficiency['Efficiency'].round(2),
          textposition='outside', showlegend=False),
    row=2, col=2
)

# Update axes
fig.update_xaxes(title_text="Day", row=1, col=1)
fig.update_xaxes(title_text="Daily Cost ($)", row=1, col=2)
fig.update_xaxes(title_text="Model", row=2, col=1)
fig.update_xaxes(title_text="Model", row=2, col=2)

fig.update_yaxes(title_text="Cost ($)", row=1, col=1)
fig.update_yaxes(title_text="Quality Score", row=1, col=2)
fig.update_yaxes(title_text="Tokens", row=2, col=1)
fig.update_yaxes(title_text="Quality/Cost", row=2, col=2)

fig.update_layout(
    title_text='💰 LLM Cost Optimization Dashboard',
    height=900
)

fig.show()

# Calculate insights
total_cost = cost_data.groupby('Model')['Daily_Cost'].sum()
avg_quality = cost_data.groupby('Model')['Quality_Score'].mean()

print("\n💡 Cost Optimization Recommendations:")
print(f"\nTotal {days}-day costs:")
for model in total_cost.index:
    print(f"  {model}: ${total_cost[model]:.2f} (Quality: {avg_quality[model]:.1f}/100)")

print(f"\n🏆 Best efficiency: {efficiency.loc[efficiency['Efficiency'].idxmax(), 'Model']}")
print("💸 Potential savings: Switch low-priority tasks to efficient models!")
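
The efficiency ratio ranks models, but a recommendation usually needs a constraint as well, such as picking the cheapest model that still meets a quality floor. A minimal sketch of that rule; `recommend_model`, the summary table, and the threshold values are all hypothetical:

```python
import pandas as pd

def recommend_model(summary, min_quality):
    """Return the cheapest model whose average quality meets the floor, or None."""
    eligible = summary[summary['Avg_Quality'] >= min_quality]
    if eligible.empty:
        return None
    return eligible.sort_values('Avg_Daily_Cost').iloc[0]['Model']

# Hand-written per-model averages (illustrative, not derived from the random data above)
summary = pd.DataFrame({
    'Model': ['GPT-4', 'GPT-3.5', 'Claude-3.5', 'Llama-3'],
    'Avg_Daily_Cost': [90.0, 20.0, 50.0, 7.5],
    'Avg_Quality': [90.0, 78.0, 84.0, 70.0],
})

for floor in (65, 80, 95):
    print(f"Quality >= {floor}: {recommend_model(summary, floor)}")
```

Raising the floor moves the recommendation toward more expensive models, and a floor that no model reaches returns `None` rather than a misleading pick.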

## 💾 Saving Interactive Plots

In [None]:
# Create a sample plot
fig = px.scatter(models_df, x='Cost_per_1M_tokens', y='Benchmark_Score',
                size='Parameters_B', color='Category', hover_name='Model',
                title='LLM Performance vs Cost')

# Save as interactive HTML (can share with anyone!)
fig.write_html('llm_analysis.html')

# Save as static image
# fig.write_image('llm_analysis.png')  # Requires kaleido: pip install kaleido

# Save as JSON (can reload later)
# fig.write_json('llm_analysis.json')

print("✅ Saved interactive plot as HTML!")
print("   You can:")
print("   - Open in browser")
print("   - Share with colleagues")
print("   - Embed in websites")
print("   - Host on GitHub Pages")

## 🎉 Congratulations!

**You just learned:**
- ✅ Why Plotly is essential for modern AI dashboards
- ✅ Interactive line, scatter, bar, histogram plots
- ✅ Animated visualizations (time series)
- ✅ 3D scatter plots (embedding visualization)
- ✅ Heatmaps and correlation matrices
- ✅ Multi-plot dashboards with subplots
- ✅ Real AI examples (cost optimization, model comparison)
- ✅ Saving and sharing interactive plots

**🎯 Plotly Cheat Sheet:**
```python
# Plotly Express (high-level, easy)
px.line()        # Line plots
px.scatter()     # Scatter plots
px.bar()         # Bar charts
px.histogram()   # Histograms
px.box()         # Box plots
px.scatter_3d()  # 3D scatter
px.scatter(animation_frame='Year')  # Animated!

# Graph Objects (low-level, powerful)
go.Figure()      # Custom figures
go.Scatter()     # Custom scatter
go.Bar()         # Custom bars
go.Heatmap()     # Heatmaps
make_subplots()  # Dashboards

# Save and share
fig.write_html('plot.html')  # Interactive HTML
fig.show()                   # Display
```

**🎯 Practice Exercise:**

Build a complete AI model monitoring dashboard with:
1. Real-time accuracy plot (simulated streaming data)
2. Confusion matrix heatmap
3. Feature importance bar chart
4. Prediction distribution histogram
5. 3D decision boundary visualization
6. Make it fully interactive and shareable!

---

**📚 Next Week:** Week 4 - Math for AI (The Foundation of ML!)

**💡 Fun Fact:** 
- Plotly and its Dash framework are widely used for data dashboards in industry
- TensorBoard (Google) uses similar interactive plots
- Weights & Biases can log Plotly figures directly alongside ML experiment metrics

---

**🎖️ Achievement Unlocked: Visualization Master!**

*You can now create world-class AI visualizations and dashboards!* 🚀

---

**Week 3 Complete! 🎊**

You've mastered:
- Day 1: Matplotlib (foundation)
- Day 2: Seaborn (statistical)
- Day 3: Plotly (interactive)

**You're now ready to visualize ANY AI/ML project!** ✨