# Visualization and Export

Learn how to create publication-quality charts and export results.

This notebook demonstrates:
1. SDK theme for consistent styling
2. Cost over time line charts
3. Failures by year bar charts
4. Risk distribution histograms
5. Scenario comparison charts
6. Customizing and combining charts
7. Exporting results to Parquet and saving charts as image files

## Setup

In [None]:
# Core imports
import pandas as pd
import numpy as np
from datetime import date, timedelta
import matplotlib.pyplot as plt

# SDK imports
from asset_optimization import (
    Portfolio,
    WeibullModel,
    Simulator,
    SimulationConfig,
    Optimizer,
    set_sdk_theme,
    plot_cost_over_time,
    plot_failures_by_year,
    plot_risk_distribution,
    plot_scenario_comparison,
    compare,
)

## 1. Create Sample Data

First, we'll create simulation and optimization results to visualize.

In [None]:
# Generate synthetic portfolio
np.random.seed(42)

n_assets = 500
materials = ['Cast Iron', 'PVC', 'Ductile Iron']

base_date = date(2024, 1, 1)
install_dates = [
    base_date - timedelta(days=int(np.random.uniform(20*365, 80*365)))
    for _ in range(n_assets)
]

data = pd.DataFrame({
    'asset_id': [f'PIPE-{i:04d}' for i in range(n_assets)],
    'install_date': pd.to_datetime(install_dates),
    'asset_type': 'pipe',
    'material': np.random.choice(materials, n_assets, p=[0.4, 0.35, 0.25]),
    'diameter_mm': np.random.choice([150, 200, 300, 400], n_assets),
    'length_m': np.random.uniform(50, 500, n_assets).round(0),
})

portfolio = Portfolio.from_dataframe(data)
print(portfolio)

In [None]:
# Create deterioration model
params = {
    'Cast Iron': (3.0, 60),
    'PVC': (2.5, 80),
    'Ductile Iron': (2.8, 70),
}
model = WeibullModel(params)
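
The notebook does not say how `WeibullModel` reads each tuple. A plausible reading, and only an assumption here, is `(shape, scale)` for a two-parameter Weibull distribution with the scale in years. Under that assumption, the cumulative failure probability at age $t$ is $F(t) = 1 - e^{-(t/\text{scale})^{\text{shape}}}$. The helper `weibull_failure_probability` below is hypothetical and is not part of the SDK:

```python
import math


def weibull_failure_probability(age_years, shape, scale):
    """Two-parameter Weibull CDF: probability of failure by the given age."""
    if age_years <= 0:
        return 0.0
    return 1.0 - math.exp(-((age_years / scale) ** shape))


# Same tuples as the params dict, read here as (shape, scale in years).
# That interpretation is an assumption, not something the SDK confirms.
assumed_params = {
    'Cast Iron': (3.0, 60),
    'PVC': (2.5, 80),
    'Ductile Iron': (2.8, 70),
}

for material, (shape, scale) in assumed_params.items():
    p = weibull_failure_probability(50, shape, scale)
    print(f"{material}: P(failure by age 50) = {p:.3f}")
```

With this reading, a larger shape makes failures cluster more tightly around the scale, and an asset whose age equals the scale has failed with probability $1 - e^{-1}$ regardless of shape.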

In [None]:
# Run simulation
config = SimulationConfig(
    n_years=10,
    start_year=2024,
    random_seed=42,
    failure_response='replace',
)

sim = Simulator(model, config)
sim_result = sim.run(portfolio)
print(sim_result)

In [None]:
# Run optimization
optimizer = Optimizer(strategy='greedy', min_risk_threshold=0.1)
optimizer.fit(portfolio, model, budget=500_000)
opt_result = optimizer.result
print(opt_result)

## 2. SDK Theme

All plots use a consistent professional theme. Call `set_sdk_theme()` once at the start of your notebook.

The theme provides:
- Clean white background with subtle grid
- Professional blue color palette
- Readable fonts and sizes

In [None]:
# Apply SDK theme (call once at notebook start)
set_sdk_theme()
print("SDK theme applied")
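
The internals of `set_sdk_theme()` are not shown in this notebook. As a rough illustration of how a theme like the one described could be expressed in plain matplotlib, the sketch below applies a white background, a subtle grid, and a blue palette through `rcParams`. The dictionary name `HYPOTHETICAL_THEME` and every value in it are assumptions, not the SDK's actual settings:

```python
import matplotlib as mpl
import matplotlib.pyplot as plt
from cycler import cycler

# Hypothetical settings that approximate the described look.
HYPOTHETICAL_THEME = {
    "figure.facecolor": "white",
    "axes.facecolor": "white",
    "axes.grid": True,
    "grid.alpha": 0.3,
    "axes.prop_cycle": cycler(color=["#1f77b4", "#4a90d9", "#08306b"]),
    "font.size": 11,
}

# rc_context applies the settings only inside the with-block,
# so the rest of the notebook keeps whatever theme is already active.
with mpl.rc_context(HYPOTHETICAL_THEME):
    fig, ax = plt.subplots(figsize=(6, 3))
    ax.plot([2024, 2025, 2026], [1.0, 1.4, 1.9])  # made-up values
    grid_enabled = mpl.rcParams["axes.grid"]
```

A scoped context like this is useful when one chart needs a different look without changing the theme applied by `set_sdk_theme()`.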

## 3. Cost Over Time

Line chart showing total cost trajectory over the simulation period.

In [None]:
# Basic cost over time chart
ax = plot_cost_over_time(sim_result)
plt.show()

In [None]:
# With custom title
ax = plot_cost_over_time(sim_result, title='Projected Maintenance Costs (2024-2033)')
plt.show()

## 4. Failures by Year

Bar chart showing failure counts per year.

In [None]:
# Basic failures chart
ax = plot_failures_by_year(sim_result)
plt.show()

In [None]:
# With custom title
ax = plot_failures_by_year(sim_result, title='Expected Asset Failures (2024-2033)')
plt.show()

## 5. Risk Distribution

Histogram showing the distribution of risk scores for selected interventions.

In [None]:
# Risk distribution of selected assets
ax = plot_risk_distribution(opt_result.selections)
plt.show()

In [None]:
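# You can also plot failure probability from portfolio data.
# Compute each asset's age as of the reference date used to generate the
# install dates, so the result does not depend on when the notebook is run.
portfolio_with_risk = portfolio.data.copy()
reference_date = pd.Timestamp(base_date)
portfolio_with_risk['age'] = (
    (reference_date - portfolio_with_risk['install_date']).dt.days / 365.25
)
# Enrich the portfolio with failure probabilities
portfolio_enriched = model.transform(portfolio_with_risk)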

# Plot with different column name
ax = plot_risk_distribution(
    portfolio_enriched,
    risk_column='failure_probability',
    title='Portfolio-Wide Failure Probability Distribution',
    bins=30,
)
plt.show()

## 6. Scenario Comparison

Compare the simulated scenario against an auto-generated 'do nothing' baseline.

In [None]:
# Compare simulation result against auto-generated baseline
comparison = compare(sim_result, baseline='do_nothing')

print("Comparison DataFrame:")
comparison.head(10)

In [None]:
# Plot total cost comparison
ax = plot_scenario_comparison(comparison, metric='total_cost')
plt.show()

In [None]:
# Plot failure count comparison
ax = plot_scenario_comparison(comparison, metric='failure_count')
plt.show()
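
The structure of the `comparison` DataFrame and the internals of `plot_scenario_comparison()` are not shown here. For readers who want to build a similar side-by-side view by hand, the following sketch draws a grouped bar chart in plain matplotlib. All values and scenario labels are made up for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up yearly failure counts for two scenarios.
years = np.array([2024, 2025, 2026])
baseline_failures = np.array([10, 14, 19])
scenario_failures = np.array([9, 11, 12])

x = np.arange(len(years))  # one slot per year
width = 0.4                # width of each bar within a slot

fig, ax = plt.subplots(figsize=(6, 3))
# Shift each series half a bar width left or right so the bars sit side by side.
ax.bar(x - width / 2, baseline_failures, width, label="baseline (made up)")
ax.bar(x + width / 2, scenario_failures, width, label="scenario (made up)")
ax.set_xticks(x)
ax.set_xticklabels(years)
ax.set_ylabel("Failures")
ax.legend()
```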

## 7. Customizing Charts

All plot functions return `matplotlib.axes.Axes` objects for further customization.

In [None]:
# Get axes and customize
ax = plot_cost_over_time(sim_result)

# Add headroom above the data and a reference line at the mean total cost
ax.set_ylim(0, ax.get_ylim()[1] * 1.1)  # Add 10% headroom
ax.axhline(y=sim_result.summary['total_cost'].mean(),
           color='orange', linestyle='--', alpha=0.7,
           label='Average')
ax.legend()

plt.show()
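
Because the plot functions return standard `Axes` objects, any matplotlib formatter can be attached to them. The sketch below formats y-axis ticks as currency with `StrMethodFormatter`. It uses a stand-in chart with made-up values rather than an SDK plot, and the dollar sign is an assumption about the currency:

```python
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter

# Stand-in for an Axes returned by an SDK plot function (values are made up).
fig, ax = plt.subplots(figsize=(6, 3))
ax.plot([2024, 2025, 2026], [1_200_000, 1_350_000, 1_500_000])

# Show tick values with a currency symbol and thousands separators.
currency = StrMethodFormatter("${x:,.0f}")
ax.yaxis.set_major_formatter(currency)
ax.set_ylabel("Total cost")
```

The same two formatter lines should work on the `ax` returned by `plot_cost_over_time(sim_result)`, assuming its y-axis holds plain numeric cost values.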

In [None]:
# Provide your own figure size
ax = plot_failures_by_year(sim_result, figsize=(12, 4))
ax.set_title('Wide Format Chart')
plt.show()

## 8. Exporting Results

### To Parquet

In [None]:
# Export simulation results
sim_result.to_parquet('sim_summary.parquet', format='summary')
sim_result.to_parquet('sim_projections.parquet', format='cost_projections')

# Export optimization results
opt_result.to_parquet('opt_schedule.parquet', format='minimal', year=2024)

print("Files exported:")
print("  - sim_summary.parquet")
print("  - sim_projections.parquet")
print("  - opt_schedule.parquet")

### Reading Exports

In [None]:
# Read back parquet files
summary = pd.read_parquet('sim_summary.parquet')
print("Simulation Summary:")
summary

In [None]:
# Long format is ready for seaborn/matplotlib
projections = pd.read_parquet('sim_projections.parquet')
print("Cost Projections (long format):")
projections.head(12)
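
Long format keeps one row per observation, which suits plotting libraries but is less convenient for side-by-side tables. The sketch below shows how a long table can be pivoted to wide format with pandas. The column names `year`, `cost_type`, and `value` are hypothetical, as are the values; check the actual columns of `projections` before adapting it:

```python
import pandas as pd

# Made-up long-format data with hypothetical column names.
long_df = pd.DataFrame({
    "year": [2024, 2024, 2025, 2025],
    "cost_type": ["repair", "replacement", "repair", "replacement"],
    "value": [100.0, 250.0, 120.0, 300.0],
})

# One row per year, one column per cost type.
wide_df = long_df.pivot(index="year", columns="cost_type", values="value")

# Row-wise total across the cost-type columns.
wide_df["total"] = wide_df.sum(axis=1)
print(wide_df)
```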

### Saving Charts

In [None]:
# Save chart to file
ax = plot_cost_over_time(sim_result)
plt.savefig('cost_chart.png', dpi=150, bbox_inches='tight')
print("Saved: cost_chart.png")
plt.close()
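
matplotlib infers the output format from the file extension, so the same figure can be written as both a raster and a vector file. The sketch below saves a stand-in chart (made-up values, not an SDK plot) as PNG and PDF into a temporary directory. Calling `savefig` on the figure object avoids relying on whichever figure happens to be current:

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt

# Stand-in chart with made-up values.
fig, ax = plt.subplots(figsize=(6, 3))
ax.plot([2024, 2025, 2026], [120_000, 135_000, 150_000])

# Write into a temporary directory so the working directory stays clean.
out_dir = Path(tempfile.mkdtemp())
saved_paths = []
for ext in ("png", "pdf"):
    path = out_dir / f"cost_chart.{ext}"
    fig.savefig(path, dpi=150, bbox_inches="tight")
    saved_paths.append(path)

plt.close(fig)
print("Saved:", [p.name for p in saved_paths])
```

For an SDK chart, the figure is reachable from the returned axes as `ax.figure`.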

## 9. Creating Multi-Panel Figures

Combine multiple charts into a single figure for reports or dashboards.

In [None]:
# Create 2x2 figure
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Top left: Cost over time
plot_cost_over_time(sim_result, ax=axes[0, 0], title='Total Cost Over Time')

# Top right: Failures by year
plot_failures_by_year(sim_result, ax=axes[0, 1], title='Failures by Year')

# Bottom left: Risk distribution
plot_risk_distribution(opt_result.selections, ax=axes[1, 0], title='Selected Assets Risk Distribution')

# Bottom right: Scenario comparison
plot_scenario_comparison(comparison, metric='total_cost', ax=axes[1, 1])

# Add overall title
fig.suptitle('Asset Optimization Dashboard', fontsize=16, fontweight='bold', y=1.02)

plt.tight_layout()
plt.show()
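
The multi-panel layout works because each plot function accepts an `ax` argument. The same pattern is easy to follow in your own helpers: draw on the axes you are given, create one only when none is passed, and always return it. The function `plot_counts` below is a hypothetical example of that pattern, not part of the SDK, and its data is made up:

```python
import matplotlib.pyplot as plt


def plot_counts(years, counts, ax=None, title=None, figsize=(8, 4)):
    """Draw a bar chart on `ax`, creating a new figure only if none is given."""
    if ax is None:
        _, ax = plt.subplots(figsize=figsize)
    ax.bar(years, counts)
    if title:
        ax.set_title(title)
    return ax


# Compose two panels on a shared figure.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
left = plot_counts([2024, 2025], [3, 5], ax=axes[0], title="Panel A")
right = plot_counts([2024, 2025], [4, 2], ax=axes[1], title="Panel B")

# Without `ax`, the helper creates its own standalone figure.
standalone = plot_counts([2024, 2025], [1, 2])
```

Returning the axes in both cases keeps the helper usable standalone and inside dashboards like the one above.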

In [None]:
# Save multi-panel figure
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

plot_cost_over_time(sim_result, ax=axes[0, 0])
plot_failures_by_year(sim_result, ax=axes[0, 1])
plot_risk_distribution(opt_result.selections, ax=axes[1, 0])
plot_scenario_comparison(comparison, metric='total_cost', ax=axes[1, 1])

fig.suptitle('Asset Optimization Dashboard', fontsize=16, fontweight='bold', y=1.02)
plt.tight_layout()

plt.savefig('dashboard.png', dpi=150, bbox_inches='tight')
print("Saved: dashboard.png")
plt.close()

## Summary

This notebook covered the visualization and export capabilities:

1. **SDK Theme**: `set_sdk_theme()` for consistent styling
2. **Four Chart Types**:
   - `plot_cost_over_time()` - Line chart of costs
   - `plot_failures_by_year()` - Bar chart of failures
   - `plot_risk_distribution()` - Histogram of risk scores
   - `plot_scenario_comparison()` - Grouped bar chart for scenarios
3. **Customization**: All functions return axes for further customization
4. **Export**: Parquet format for data; `plt.savefig()` for charts (PNG is shown here, and other formats such as PDF are selected by the file extension)
5. **Multi-Panel Figures**: Combine charts into dashboards

In [None]:
# Clean up temporary files
import os
for f in ['sim_summary.parquet', 'sim_projections.parquet', 
          'opt_schedule.parquet', 'cost_chart.png', 'dashboard.png']:
    if os.path.exists(f):
        os.remove(f)
        print(f"Cleaned up: {f}")