# Week 9: Advanced Visualization and Interactive Dashboards

**Instructor**: Sohn Chul

## Learning Objectives

By the end of this session, you will be able to:
1. Create advanced static visualizations with seaborn and matplotlib
2. Build interactive visualizations using Plotly
3. Develop interactive maps with Folium
4. Create real-time dashboards with Streamlit
5. Design effective data stories for climate analysis
6. Implement animation techniques for temporal data

## Prerequisites
- Completion of Weeks 1-8 materials
- Understanding of KMA heat index formula
- Basic knowledge of Python visualization libraries

## 1. Setup and Data Preparation

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import folium
from folium import plugins
from folium.features import DivIcon
import streamlit as st
from datetime import datetime, timedelta
import warnings
import json
from pathlib import Path

warnings.filterwarnings('ignore')

# Set style configurations
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

# Configure for better display
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)

In [None]:
# KMA Heat Index calculation functions
def calculate_wet_bulb_temperature(Ta, RH):
    """
    Calculate wet-bulb temperature using Stull's formula.
    
    Parameters:
    Ta: Air temperature (°C)
    RH: Relative humidity (%)
    
    Returns:
    Tw: Wet-bulb temperature (°C)
    """
    Tw = (Ta * np.arctan(0.151977 * (RH + 8.313659)**0.5) + 
          np.arctan(Ta + RH) - 
          np.arctan(RH - 1.67633) + 
          0.00391838 * RH**1.5 * np.arctan(0.023101 * RH) - 
          4.686035)
    return Tw

def calculate_heat_index_kma(Ta, RH):
    """
    Calculate heat index using Korea Meteorological Administration (KMA) formula.
    
    Parameters:
    Ta: Air temperature (°C)
    RH: Relative humidity (%)
    
    Returns:
    HI: Heat index (°C)
    """
    Tw = calculate_wet_bulb_temperature(Ta, RH)
    HI = (-0.2442 + 0.55399 * Tw + 0.45535 * Ta - 
          0.0022 * Tw**2 + 0.00278 * Tw * Ta + 3.0)
    return HI
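
As a quick check that the constants were transcribed correctly, the two formulas can be evaluated at a known point. The sketch below restates both functions compactly and evaluates them at 20 °C and 50 % relative humidity, where Stull's formula yields a wet-bulb temperature of roughly 13.7 °C. The names `stull_wet_bulb`, `kma_heat_index`, `tw_check`, `hi_check`, and `hi_curve` are local to this example.

```python
import numpy as np

def stull_wet_bulb(ta, rh):
    """Wet-bulb temperature (°C) from air temperature (°C) and relative humidity (%)."""
    return (ta * np.arctan(0.151977 * np.sqrt(rh + 8.313659))
            + np.arctan(ta + rh)
            - np.arctan(rh - 1.67633)
            + 0.00391838 * rh**1.5 * np.arctan(0.023101 * rh)
            - 4.686035)

def kma_heat_index(ta, rh):
    """Heat index (°C) using the same coefficients as calculate_heat_index_kma."""
    tw = stull_wet_bulb(ta, rh)
    return (-0.2442 + 0.55399 * tw + 0.45535 * ta
            - 0.0022 * tw**2 + 0.00278 * tw * ta + 3.0)

# Single-point check
tw_check = stull_wet_bulb(20.0, 50.0)
hi_check = kma_heat_index(20.0, 50.0)
print(f"Tw = {tw_check:.2f} °C, HI = {hi_check:.2f} °C")

# Both functions are element-wise, so arrays work without a loop
hi_curve = kma_heat_index(np.array([25.0, 30.0, 35.0]), 60.0)
print(hi_curve.round(1))
```

Because the functions use only NumPy operations, they accept scalars, arrays, and pandas Series alike, which is what allows the grid evaluation in the 3D surface plot later in this notebook.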

In [None]:
# Generate comprehensive synthetic dataset
np.random.seed(42)

# Create temporal range
date_range = pd.date_range(start='2025-04-01', end='2025-08-31', freq='10min')

# Seoul districts for spatial analysis
districts = ['Gangnam', 'Gangdong', 'Gangbuk', 'Gangseo', 'Gwanak',
             'Gwangjin', 'Guro', 'Geumcheon', 'Nowon', 'Dobong',
             'Dongdaemun', 'Dongjak', 'Mapo', 'Seodaemun', 'Seocho',
             'Seongdong', 'Seongbuk', 'Songpa', 'Yangcheon', 'Yeongdeungpo',
             'Yongsan', 'Eunpyeong', 'Jongno', 'Jung', 'Jungnang']

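# Sample for manageable size. np.random.choice returns numpy datetime64 values,
# which have no .hour or .month attributes, so wrap the result in a
# DatetimeIndex to iterate over pandas Timestamps instead.
sample_dates = np.random.choice(date_range, size=5000, replace=False)
sample_dates = pd.DatetimeIndex(np.sort(sample_dates))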

# Give each district a fixed synthetic center so that per-district mean
# coordinates are spatially distinct on the maps in Section 4
# (these are random points, not real district centroids)
district_centers = {
    d: (37.5665 + np.random.uniform(-0.1, 0.1),
        126.9780 + np.random.uniform(-0.15, 0.15))
    for d in districts[:10]
}

# Create dataset
data_list = []
for district in districts[:10]:  # Use 10 districts for demo
    center_lat, center_lon = district_centers[district]
    for date in sample_dates:
        # Simulate realistic temperature and humidity patterns
        hour = date.hour
        month = date.month
        
        # Base temperature varies by month and hour
        base_temp = 15 + (month - 4) * 3  # Increase from April to August
        daily_variation = 8 * np.sin((hour - 6) * np.pi / 12) if 6 <= hour <= 18 else -2
        temp = base_temp + daily_variation + np.random.normal(0, 2)
        
        # Humidity inversely related to temperature
        base_humidity = 70 - (month - 4) * 5
        humidity = base_humidity - daily_variation * 2 + np.random.normal(0, 5)
        humidity = np.clip(humidity, 20, 95)
        
        # Add urban heat island effect for certain districts
        # (of these three, only Gangnam is among the 10 demo districts)
        if district in ['Gangnam', 'Jung', 'Yongsan']:
            temp += np.random.uniform(0.5, 2)
        
        # Calculate KMA heat index from the final temperature
        heat_index = calculate_heat_index_kma(temp, humidity)
        
        data_list.append({
            'timestamp': date,
            'district': district,
            'temperature': temp,
            'humidity': humidity,
            'heat_index': heat_index,
            # Small jitter around the district center
            'lat': center_lat + np.random.uniform(-0.01, 0.01),
            'lon': center_lon + np.random.uniform(-0.015, 0.015)
        })

df = pd.DataFrame(data_list)
df['date'] = df['timestamp'].dt.date
df['hour'] = df['timestamp'].dt.hour
df['month'] = df['timestamp'].dt.month
df['day_of_week'] = df['timestamp'].dt.dayofweek

print(f"Dataset shape: {df.shape}")
print(f"Date range: {df['timestamp'].min()} to {df['timestamp'].max()}")
df.head()
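
The nested loop above builds 50,000 rows one dictionary at a time (10 districts × 5,000 timestamps). The same temperature and humidity patterns can be produced with array operations. The following is a minimal sketch for a single district, assuming NumPy's `Generator` API instead of the legacy `np.random` functions, so its values will not match the loop's output. The `demo` frame and the `rng` seed are illustrative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Draw 5,000 distinct 10-minute timestamps and sort them
all_times = pd.date_range("2025-04-01", "2025-08-31", freq="10min")
picked = rng.choice(all_times.to_numpy(), size=5000, replace=False)
sample = pd.DatetimeIndex(np.sort(picked))

hour = sample.hour.to_numpy()
month = sample.month.to_numpy()

# Same rules as the loop: seasonal warming plus a daytime sine bump
base_temp = 15 + (month - 4) * 3
daytime = (hour >= 6) & (hour <= 18)
daily_variation = np.where(daytime, 8 * np.sin((hour - 6) * np.pi / 12), -2.0)
temperature = base_temp + daily_variation + rng.normal(0, 2, size=len(sample))

# Humidity moves opposite to the daily temperature swing, clipped to 20-95 %
humidity = 70 - (month - 4) * 5 - 2 * daily_variation + rng.normal(0, 5, size=len(sample))
humidity = np.clip(humidity, 20, 95)

demo = pd.DataFrame({
    "timestamp": sample,
    "temperature": temperature,
    "humidity": humidity,
})
print(demo.describe().round(1))
```

Since the heat index functions are also element-wise, `calculate_heat_index_kma(demo["temperature"], demo["humidity"])` would add the third column in a single call.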

## 2. Advanced Static Visualizations

In [None]:
# Create a comprehensive visualization dashboard
fig = plt.figure(figsize=(20, 12))
gs = fig.add_gridspec(3, 3, hspace=0.3, wspace=0.3)

# 1. Heat Index Distribution by District
ax1 = fig.add_subplot(gs[0, :])
district_heat = df.groupby('district')['heat_index'].apply(list)
ax1.violinplot(district_heat.values, positions=range(len(district_heat)),
               showmeans=True, showmedians=True)
ax1.set_xticks(range(len(district_heat)))
ax1.set_xticklabels(district_heat.index, rotation=45)
ax1.set_title('KMA Heat Index Distribution by District', fontsize=14, fontweight='bold')
ax1.set_ylabel('Heat Index (°C)')
ax1.grid(True, alpha=0.3)

# 2. Temporal Heat Index Patterns
ax2 = fig.add_subplot(gs[1, 0])
hourly_avg = df.groupby('hour')['heat_index'].mean()
hourly_std = df.groupby('hour')['heat_index'].std()
ax2.plot(hourly_avg.index, hourly_avg.values, 'b-', linewidth=2, label='Mean')
ax2.fill_between(hourly_avg.index, 
                  hourly_avg - hourly_std, 
                  hourly_avg + hourly_std, 
                  alpha=0.3, label='±1 STD')
ax2.set_xlabel('Hour of Day')
ax2.set_ylabel('Heat Index (°C)')
ax2.set_title('Daily Heat Index Pattern', fontsize=12, fontweight='bold')
ax2.legend()
ax2.grid(True, alpha=0.3)

# 3. Temperature vs Humidity Scatter with Heat Index
ax3 = fig.add_subplot(gs[1, 1])
scatter = ax3.scatter(df['temperature'], df['humidity'], 
                      c=df['heat_index'], cmap='RdYlBu_r',
                      alpha=0.5, s=1)
plt.colorbar(scatter, ax=ax3, label='Heat Index (°C)')
ax3.set_xlabel('Temperature (°C)')
ax3.set_ylabel('Humidity (%)')
ax3.set_title('Temperature-Humidity-Heat Index Relationship', fontsize=12, fontweight='bold')
ax3.grid(True, alpha=0.3)

# 4. Monthly Heat Index Trends
ax4 = fig.add_subplot(gs[1, 2])
monthly_data = df.groupby('month')['heat_index'].agg(['mean', 'min', 'max'])
x = monthly_data.index
ax4.plot(x, monthly_data['mean'], 'ro-', linewidth=2, markersize=8, label='Mean')
ax4.fill_between(x, monthly_data['min'], monthly_data['max'], 
                  alpha=0.3, color='red', label='Min-Max Range')
ax4.set_xlabel('Month')
ax4.set_ylabel('Heat Index (°C)')
ax4.set_title('Monthly Heat Index Progression', fontsize=12, fontweight='bold')
ax4.set_xticks([4, 5, 6, 7, 8])
ax4.set_xticklabels(['Apr', 'May', 'Jun', 'Jul', 'Aug'])
ax4.legend()
ax4.grid(True, alpha=0.3)

# 5. Heatmap of Heat Index by Hour and Day of Week
ax5 = fig.add_subplot(gs[2, :])
pivot_data = df.pivot_table(values='heat_index', 
                             index='hour', 
                             columns='day_of_week', 
                             aggfunc='mean')
im = ax5.imshow(pivot_data.T, cmap='RdYlBu_r', aspect='auto')
ax5.set_yticks(range(7))
ax5.set_yticklabels(['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'])
ax5.set_xlabel('Hour of Day')
ax5.set_ylabel('Day of Week')
ax5.set_title('Heat Index Patterns by Hour and Day of Week', fontsize=12, fontweight='bold')
plt.colorbar(im, ax=ax5, label='Heat Index (°C)')

# Add annotations for peak hours
for i in range(7):
    for j in range(24):
        if j in [13, 14, 15] and pivot_data.iloc[j, i] > pivot_data.mean().mean() + 5:
            ax5.text(j, i, '⚠', ha='center', va='center', color='white', fontsize=12)

plt.suptitle('Seoul Heatwave Analysis Dashboard - KMA Heat Index', 
             fontsize=16, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()
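
The panel code above uses matplotlib directly. seaborn can draw the same per-district violin comparison from a long-format DataFrame in one call. A small sketch with stand-in data follows; the three district means are invented for illustration and are not taken from the dataset.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

rng = np.random.default_rng(0)

# Stand-in data: 200 readings for each of three districts (illustrative values)
demo = pd.DataFrame({
    "district": np.repeat(["Gangnam", "Gangdong", "Gangbuk"], 200),
    "heat_index": np.concatenate([
        rng.normal(27, 3, 200),
        rng.normal(25, 3, 200),
        rng.normal(24, 3, 200),
    ]),
})

# Order districts from hottest to coolest median so the ranking is readable
order = (demo.groupby("district")["heat_index"]
             .median()
             .sort_values(ascending=False)
             .index)

fig_sns, ax_sns = plt.subplots(figsize=(8, 4))
sns.violinplot(data=demo, x="district", y="heat_index",
               order=order, inner="quartile", ax=ax_sns)
ax_sns.axhline(33, color="red", linestyle="--", linewidth=1)  # danger threshold used in this notebook
ax_sns.set_title("Heat Index Distribution by District (seaborn)")
```

In a notebook, finish the cell with `plt.show()`. With the real data, pass `df` in place of `demo`.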

## 3. Interactive Visualizations with Plotly

In [None]:
# Create interactive time series with range slider
daily_avg = df.groupby('date').agg({
    'heat_index': 'mean',
    'temperature': 'mean',
    'humidity': 'mean'
}).reset_index()

fig = make_subplots(
    rows=3, cols=1,
    shared_xaxes=True,
    vertical_spacing=0.05,
    subplot_titles=('KMA Heat Index', 'Temperature', 'Humidity'),
    row_heights=[0.4, 0.3, 0.3]
)

# Heat Index
fig.add_trace(
    go.Scatter(
        x=daily_avg['date'],
        y=daily_avg['heat_index'],
        name='Heat Index',
        line=dict(color='red', width=2),
        hovertemplate='Date: %{x}<br>Heat Index: %{y:.1f}°C<extra></extra>'
    ),
    row=1, col=1
)

# Add danger zone
fig.add_hrect(y0=33, y1=40, line_width=0, fillcolor="red", opacity=0.2,
              annotation_text="Danger Zone", annotation_position="top left",
              row=1, col=1)

# Temperature
fig.add_trace(
    go.Scatter(
        x=daily_avg['date'],
        y=daily_avg['temperature'],
        name='Temperature',
        line=dict(color='orange', width=2),
        hovertemplate='Date: %{x}<br>Temperature: %{y:.1f}°C<extra></extra>'
    ),
    row=2, col=1
)

# Humidity
fig.add_trace(
    go.Scatter(
        x=daily_avg['date'],
        y=daily_avg['humidity'],
        name='Humidity',
        line=dict(color='blue', width=2),
        hovertemplate='Date: %{x}<br>Humidity: %{y:.1f}%<extra></extra>'
    ),
    row=3, col=1
)

# Update layout
fig.update_xaxes(title_text="Date", row=3, col=1)
fig.update_yaxes(title_text="Heat Index (°C)", row=1, col=1)
fig.update_yaxes(title_text="Temperature (°C)", row=2, col=1)
fig.update_yaxes(title_text="Humidity (%)", row=3, col=1)

fig.update_layout(
    title={
        'text': 'Interactive Time Series Analysis - Seoul Climate Data',
        'x': 0.5,
        'xanchor': 'center'
    },
    hovermode='x unified',
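    xaxis=dict(
        rangeselector=dict(
            buttons=list([
                dict(count=7, label="1w", step="day", stepmode="backward"),
                dict(count=1, label="1m", step="month", stepmode="backward"),
                dict(step="all")
            ])
        ),
        type="date"
    ),
    # Attach the range slider to the bottom subplot's axis so it sits below
    # all three panels instead of overlapping the second one
    xaxis3=dict(
        rangeslider=dict(visible=True),
        type="date"
    ),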
    height=800,
    showlegend=False
)

fig.show()

In [None]:
# 3D Surface Plot of Heat Index
# Create grid data for surface plot
temp_range = np.linspace(df['temperature'].min(), df['temperature'].max(), 50)
humidity_range = np.linspace(df['humidity'].min(), df['humidity'].max(), 50)
temp_grid, humidity_grid = np.meshgrid(temp_range, humidity_range)

# Calculate heat index for grid
heat_index_grid = calculate_heat_index_kma(temp_grid, humidity_grid)

# Create 3D surface plot
fig = go.Figure(data=[go.Surface(
    x=temp_range,
    y=humidity_range,
    z=heat_index_grid,
    colorscale='RdYlBu_r',
    contours={
        "z": {"show": True, "start": 20, "end": 40, "size": 2}
    },
    colorbar=dict(title="Heat Index (°C)")
)])

# Add sample points
sample_points = df.sample(500)
fig.add_trace(go.Scatter3d(
    x=sample_points['temperature'],
    y=sample_points['humidity'],
    z=sample_points['heat_index'],
    mode='markers',
    marker=dict(
        size=3,
        color=sample_points['heat_index'],
        colorscale='RdYlBu_r',
        showscale=False,
        opacity=0.8
    ),
    name='Observed Data',
    hovertemplate='Temp: %{x:.1f}°C<br>Humidity: %{y:.1f}%<br>Heat Index: %{z:.1f}°C<extra></extra>'
))

fig.update_layout(
    title='3D Heat Index Surface - KMA Formula',
    scene=dict(
        xaxis_title='Temperature (°C)',
        yaxis_title='Humidity (%)',
        zaxis_title='Heat Index (°C)',
        camera=dict(
            eye=dict(x=1.5, y=1.5, z=1.5)
        )
    ),
    width=900,
    height=700
)

fig.show()

In [None]:
# Animated scatter plot showing daily progression
# Prepare data for animation
anim_data = df.groupby(['date', 'district']).agg({
    'temperature': 'mean',
    'humidity': 'mean',
    'heat_index': 'mean'
}).reset_index()

# Create animated scatter plot
fig = px.scatter(
    anim_data,
    x='temperature',
    y='humidity',
    animation_frame='date',
    animation_group='district',
    size='heat_index',
    color='district',
    hover_name='district',
    size_max=30,
    range_x=[anim_data['temperature'].min()-2, anim_data['temperature'].max()+2],
    range_y=[anim_data['humidity'].min()-5, anim_data['humidity'].max()+5],
    title='Daily Climate Progression by District (Animated)',
    labels={
        'temperature': 'Temperature (°C)',
        'humidity': 'Humidity (%)',
        'heat_index': 'Heat Index (°C)'
    }
)

# Add reference lines for dangerous conditions
fig.add_hline(y=80, line_dash="dash", line_color="red", 
              annotation_text="High Humidity")
fig.add_vline(x=35, line_dash="dash", line_color="red", 
              annotation_text="High Temperature")

fig.update_layout(
    xaxis=dict(showgrid=True, gridwidth=1, gridcolor='LightGray'),
    yaxis=dict(showgrid=True, gridwidth=1, gridcolor='LightGray'),
    width=900,
    height=600
)

fig.show()

## 4. Interactive Maps with Folium

In [None]:
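# Create heat map with time slider
# Prepare data for heatmap with time
# Group by hour for time-based visualization
hourly_data = df.groupby(['hour', 'lat', 'lon'])['heat_index'].mean().reset_index()

# Scale heat index into (0, 1] for use as the heatmap weight
# (assumption: HeatMapWithTime expects weights in that range; check the
# documentation for your folium version)
hi_min = hourly_data['heat_index'].min()
hi_max = hourly_data['heat_index'].max()
hourly_data['weight'] = ((hourly_data['heat_index'] - hi_min) /
                         (hi_max - hi_min)).clip(lower=0.01)

# Create base map centered on Seoul
seoul_map = folium.Map(
    location=[37.5665, 126.9780],
    zoom_start=11,
    tiles='OpenStreetMap'
)

# Prepare data for HeatMapWithTime: one list of [lat, lon, weight] per hour,
# in the same sorted order as the time index labels
heat_data = [
    hourly_data.loc[hourly_data['hour'] == hour,
                    ['lat', 'lon', 'weight']].values.tolist()
    for hour in sorted(hourly_data['hour'].unique())
]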

# Add HeatMapWithTime
plugins.HeatMapWithTime(
    heat_data,
    index=[f"{h:02d}:00" for h in sorted(hourly_data['hour'].unique())],
    auto_play=True,
    max_opacity=0.8,
    radius=15,
    gradient={0.4: 'blue', 0.6: 'yellow', 0.8: 'orange', 1.0: 'red'},
    name='Heat Index by Hour'
).add_to(seoul_map)

# Add layer control
folium.LayerControl().add_to(seoul_map)

# Add title
title_html = '''
             <h3 align="center" style="font-size:20px">
             <b>Seoul Heat Index Map - Hourly Progression (KMA Formula)</b></h3>
             '''
seoul_map.get_root().html.add_child(folium.Element(title_html))

seoul_map

In [None]:
# Create district marker map (circle markers, not a true choropleth,
# since no district boundary polygons are loaded)
# Calculate district statistics
district_stats = df.groupby('district').agg({
    'heat_index': ['mean', 'max', 'std'],
    'lat': 'mean',
    'lon': 'mean'
}).round(2)
district_stats.columns = ['_'.join(col).strip() for col in district_stats.columns]
district_stats = district_stats.reset_index()

# Create map with district markers
district_map = folium.Map(
    location=[37.5665, 126.9780],
    zoom_start=11,
    tiles='CartoDB positron'
)

# Add circle markers for each district
for _, row in district_stats.iterrows():
    # Determine color based on mean heat index
    if row['heat_index_mean'] < 25:
        color = 'blue'
    elif row['heat_index_mean'] < 30:
        color = 'green'
    elif row['heat_index_mean'] < 33:
        color = 'orange'
    else:
        color = 'red'
    
    # Create popup text
    popup_text = f"""
    <b>{row['district']}</b><br>
    Mean Heat Index: {row['heat_index_mean']:.1f}°C<br>
    Max Heat Index: {row['heat_index_max']:.1f}°C<br>
    Std Dev: {row['heat_index_std']:.1f}°C
    """
    
    # Add circle marker
    folium.CircleMarker(
        location=[row['lat_mean'], row['lon_mean']],
        radius=row['heat_index_mean']/2,  # Size based on heat index
        popup=folium.Popup(popup_text, max_width=250),
        color=color,
        fill=True,
        fillColor=color,
        fillOpacity=0.6,
        weight=2
    ).add_to(district_map)
    
    # Add district label
    folium.Marker(
        location=[row['lat_mean'], row['lon_mean']],
        icon=DivIcon(
            html=f'<div style="font-size: 10pt; color: black; font-weight: bold;">{row["district"]}</div>',
        )
    ).add_to(district_map)

# Add legend
legend_html = '''
    <div style="position: fixed; 
                bottom: 50px; left: 50px; width: 180px; height: 120px; 
                background-color: white; z-index:9999; font-size:14px;
                border:2px solid grey; border-radius:5px; padding: 10px">
    <p align="center" style="margin: 0 0 5px 0;"><b>Mean Heat Index</b></p>
    <p style="margin: 5px;"><span style="color: blue;">●</span> &lt; 25°C (Safe)</p>
    <p style="margin: 5px;"><span style="color: green;">●</span> 25-30°C (Caution)</p>
    <p style="margin: 5px;"><span style="color: orange;">●</span> 30-33°C (Warning)</p>
    <p style="margin: 5px;"><span style="color: red;">●</span> &gt; 33°C (Danger)</p>
    </div>
'''
district_map.get_root().html.add_child(folium.Element(legend_html))

# Add title
title_html = '''
             <h3 align="center" style="font-size:20px">
             <b>Seoul Districts Heat Index Analysis (KMA Formula)</b></h3>
             '''
district_map.get_root().html.add_child(folium.Element(title_html))

district_map

## 5. Dashboard Creation with Streamlit

In [None]:
# Create Streamlit dashboard file
streamlit_code = '''
import streamlit as st
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
from datetime import datetime, timedelta

# Page config
st.set_page_config(
    page_title="Seoul Heatwave Dashboard",
    page_icon="🌡️",
    layout="wide",
    initial_sidebar_state="expanded"
)

# Title and description
st.title("🌡️ Seoul Heatwave Analysis Dashboard")
st.markdown("**Real-time monitoring and analysis using KMA Heat Index Formula**")

# KMA Heat Index functions
# (plain functions: they are called once per row inside load_data, which is
# itself cached, so caching each scalar call would only add overhead)
def calculate_wet_bulb_temperature(Ta, RH):
    """Calculate wet-bulb temperature using Stull's formula."""
    Tw = (Ta * np.arctan(0.151977 * (RH + 8.313659)**0.5) + 
          np.arctan(Ta + RH) - 
          np.arctan(RH - 1.67633) + 
          0.00391838 * RH**1.5 * np.arctan(0.023101 * RH) - 
          4.686035)
    return Tw

def calculate_heat_index_kma(Ta, RH):
    """Calculate heat index using KMA formula."""
    Tw = calculate_wet_bulb_temperature(Ta, RH)
    HI = (-0.2442 + 0.55399 * Tw + 0.45535 * Ta - 
          0.0022 * Tw**2 + 0.00278 * Tw * Ta + 3.0)
    return HI

# Load data function
@st.cache_data
def load_data():
    # Generate synthetic data for demo
    np.random.seed(42)
    dates = pd.date_range(start="2025-04-01", end="2025-08-31", freq="h")  # "H" is deprecated in recent pandas
    districts = ["Gangnam", "Gangdong", "Gangbuk", "Gangseo", "Gwanak"]
    
    data = []
    for date in dates:
        for district in districts:
            temp = 20 + np.random.normal(0, 5) + 10 * np.sin((date.hour - 6) * np.pi / 12)
            humidity = 60 + np.random.normal(0, 10)
            heat_index = calculate_heat_index_kma(temp, humidity)
            
            data.append({
                "timestamp": date,
                "district": district,
                "temperature": temp,
                "humidity": humidity,
                "heat_index": heat_index
            })
    
    return pd.DataFrame(data)

# Load data
df = load_data()

# Sidebar filters
st.sidebar.header("Filter Options")

# Date range filter
date_range = st.sidebar.date_input(
    "Select Date Range",
    value=(df["timestamp"].min(), df["timestamp"].max()),
    min_value=df["timestamp"].min(),
    max_value=df["timestamp"].max()
)

# District filter
selected_districts = st.sidebar.multiselect(
    "Select Districts",
    options=df["district"].unique(),
    default=df["district"].unique()
)

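# Filter data. date_input can return a single date while a range is still
# being picked, so wait until both ends are set.
if not isinstance(date_range, (tuple, list)) or len(date_range) != 2:
    st.info("Select both a start and an end date.")
    st.stop()

filtered_df = df[
    (df["timestamp"].dt.date >= date_range[0]) &
    (df["timestamp"].dt.date <= date_range[1]) &
    (df["district"].isin(selected_districts))
]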

# Main dashboard
col1, col2, col3, col4 = st.columns(4)

# Metrics
with col1:
    avg_heat_index = filtered_df["heat_index"].mean()
    st.metric(
        label="Average Heat Index",
        value=f"{avg_heat_index:.1f}°C",
        delta=f"{avg_heat_index - 30:.1f}°C from normal"
    )

with col2:
    max_heat_index = filtered_df["heat_index"].max()
    st.metric(
        label="Maximum Heat Index",
        value=f"{max_heat_index:.1f}°C",
        delta="Danger" if max_heat_index > 33 else "Safe"  # same 33°C threshold as the warning section
    )

with col3:
    avg_temp = filtered_df["temperature"].mean()
    st.metric(
        label="Average Temperature",
        value=f"{avg_temp:.1f}°C"
    )

with col4:
    avg_humidity = filtered_df["humidity"].mean()
    st.metric(
        label="Average Humidity",
        value=f"{avg_humidity:.1f}%"
    )

# Charts
st.header("📊 Analysis Charts")

# Time series chart
col1, col2 = st.columns(2)

with col1:
    st.subheader("Heat Index Time Series")
    
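    # Average per calendar day and district (grouping on the raw hourly
    # timestamp would leave every reading in its own group)
    daily_avg = (
        filtered_df
        .assign(date=filtered_df["timestamp"].dt.date)
        .groupby(["date", "district"])["heat_index"]
        .mean()
        .reset_index()
    )
    
    fig = px.line(
        daily_avg,
        x="date",
        y="heat_index",
        color="district",
        title="Daily Mean Heat Index by District"
    )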
    fig.add_hline(y=33, line_dash="dash", line_color="red", 
                  annotation_text="Danger Threshold")
    st.plotly_chart(fig, use_container_width=True)

with col2:
    st.subheader("District Comparison")
    
    district_avg = filtered_df.groupby("district")["heat_index"].agg(["mean", "max", "min"]).reset_index()
    
    fig = go.Figure()
    fig.add_trace(go.Bar(name="Mean", x=district_avg["district"], y=district_avg["mean"]))
    fig.add_trace(go.Scatter(name="Max", x=district_avg["district"], y=district_avg["max"], 
                             mode="markers", marker=dict(size=10, color="red")))
    fig.add_trace(go.Scatter(name="Min", x=district_avg["district"], y=district_avg["min"],
                             mode="markers", marker=dict(size=10, color="blue")))
    fig.update_layout(title="Heat Index Statistics by District")
    st.plotly_chart(fig, use_container_width=True)

# Heatmap
st.header("🗺️ Spatial Analysis")

hourly_pattern = filtered_df.pivot_table(
    values="heat_index",
    index=filtered_df["timestamp"].dt.hour,
    columns="district",
    aggfunc="mean"
)

fig = px.imshow(
    hourly_pattern,
    labels=dict(x="District", y="Hour of Day", color="Heat Index"),
    title="Heat Index Patterns by Hour and District",
    color_continuous_scale="RdYlBu_r"
)
st.plotly_chart(fig, use_container_width=True)

# Warning system
st.header("⚠️ Heat Wave Warnings")

danger_data = filtered_df[filtered_df["heat_index"] > 33]
if not danger_data.empty:
    st.error(f"🚨 {len(danger_data)} readings exceed danger threshold (33°C)!")
    
    danger_summary = danger_data.groupby("district")["heat_index"].count().reset_index()
    danger_summary.columns = ["District", "Danger Count"]
    
    col1, col2 = st.columns([1, 2])
    with col1:
        st.dataframe(danger_summary)
    with col2:
        fig = px.pie(danger_summary, values="Danger Count", names="District",
                     title="Distribution of Danger Readings")
        st.plotly_chart(fig, use_container_width=True)
else:
    st.success("✅ No danger level heat index readings detected.")

# Footer
st.markdown("---")
st.markdown("*Dashboard powered by KMA Heat Index Formula | Instructor: Sohn Chul*")
'''

# Save the Streamlit app code (UTF-8 so the emoji are written correctly on any platform)
app_path = Path('../streamlit_dashboard.py')
app_path.write_text(streamlit_code, encoding='utf-8')

print(f"Streamlit dashboard code saved to '{app_path}'")
print(f"To run the dashboard, use: streamlit run {app_path}")
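
The dashboard's filtering step is easier to verify when it lives in a plain function that can be exercised without launching Streamlit. The sketch below is one way to do that; `filter_readings` is a hypothetical helper, not part of the generated app, and it assumes the date widget may hand back either one date or a (start, end) pair.

```python
from datetime import date

import pandas as pd

def filter_readings(df, selected_dates, districts):
    """Return rows inside the selected date span (inclusive) for the chosen districts.

    selected_dates may be a single date or a tuple/list holding one or two dates;
    a single date is treated as a one-day span.
    """
    if isinstance(selected_dates, (tuple, list)):
        dates = tuple(selected_dates)
    else:
        dates = (selected_dates,)
    start, end = dates[0], dates[-1]

    day = df["timestamp"].dt.date
    mask = (day >= start) & (day <= end) & df["district"].isin(districts)
    return df.loc[mask]

# Tiny hand-written frame for demonstration (values are made up)
demo = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2025-07-01 12:00", "2025-07-02 12:00",
        "2025-07-03 12:00", "2025-07-02 15:00",
    ]),
    "district": ["Gangnam", "Gangnam", "Gangnam", "Gwanak"],
    "heat_index": [30.1, 33.4, 34.0, 31.2],
})

two_days = filter_readings(demo, (date(2025, 7, 1), date(2025, 7, 2)), ["Gangnam"])
one_day = filter_readings(demo, (date(2025, 7, 2),), ["Gangnam", "Gwanak"])
print(two_days)
print(one_day)
```

Inside the app, the call would take the sidebar values directly, for example `filter_readings(df, date_range, selected_districts)`.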

## 6. Advanced Visualization Techniques

In [None]:
# Ridgeline plot for distribution over time
from scipy.stats import gaussian_kde

fig, axes = plt.subplots(5, 1, figsize=(12, 10), sharex=True)
months = [4, 5, 6, 7, 8]
colors = plt.cm.coolwarm(np.linspace(0, 1, 5))

for idx, (ax, month, color) in enumerate(zip(axes, months, colors)):
    month_data = df[df['month'] == month]['heat_index'].values
    
    # Create KDE
    kde = gaussian_kde(month_data)
    x_range = np.linspace(month_data.min(), month_data.max(), 200)
    density = kde(x_range)
    
    # Plot
    ax.fill_between(x_range, 0, density, color=color, alpha=0.7)
    ax.plot(x_range, density, color=color, linewidth=2)
    
    # Styling
    ax.set_ylabel(['Apr', 'May', 'Jun', 'Jul', 'Aug'][idx], rotation=0, ha='right')
    ax.set_yticks([])
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.spines['left'].set_visible(False)
    
    # Add mean line
    mean_val = month_data.mean()
    ax.axvline(mean_val, color='black', linestyle='--', alpha=0.5, linewidth=1)
    ax.text(mean_val, ax.get_ylim()[1]*0.8, f'{mean_val:.1f}°C', 
            ha='center', fontsize=9)

axes[-1].set_xlabel('KMA Heat Index (°C)', fontsize=12)
fig.suptitle('Heat Index Distribution Progression (April - August 2025)', 
             fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

In [None]:
# Radial heatmap for 24-hour patterns
import matplotlib.patches as patches

fig, ax = plt.subplots(figsize=(10, 10), subplot_kw=dict(projection='polar'))

# Prepare data
hourly_heat = df.groupby('hour')['heat_index'].mean().values
hours = np.arange(24)
theta = hours * 2 * np.pi / 24

# Create radial bars
width = 2 * np.pi / 24
colors = plt.cm.RdYlBu_r((hourly_heat - hourly_heat.min()) / 
                         (hourly_heat.max() - hourly_heat.min()))

# Bar height is measured from the baseline, so subtract it to make each bar end at its value
bars = ax.bar(theta, hourly_heat - 15, width=width, bottom=15, color=colors, alpha=0.8)

# Customize the plot
ax.set_theta_zero_location('N')
ax.set_theta_direction(-1)
ax.set_xticks(theta)
ax.set_xticklabels([f'{h:02d}:00' for h in hours])
ax.set_ylim(15, hourly_heat.max() + 2)
ax.set_yticks([20, 25, 30, 35])
ax.set_yticklabels(['20°C', '25°C', '30°C', '35°C'])
ax.set_title('24-Hour KMA Heat Index Pattern\n(Radial Visualization)', 
             fontsize=14, fontweight='bold', pad=20)

# Add danger zone annotation
danger_hours = [h for h, val in enumerate(hourly_heat) if val > 30]
for h in danger_hours:
    ax.annotate('⚠', xy=(theta[h], hourly_heat[h] + 1), 
                ha='center', fontsize=12, color='red')

plt.tight_layout()
plt.show()

## 7. Best Practices for Climate Data Visualization

### Color Scheme Selection
- **Temperature/Heat Index**: Use diverging color schemes (RdYlBu_r, coolwarm) when values are read relative to a reference point such as a danger threshold
- **Categorical Data**: Use distinct colors (Set2, tab10)
- **Sequential Data**: Use single-hue progression (Blues, Reds)
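
A diverging colormap is most informative when its neutral midpoint falls on a meaningful value. The sketch below centers `RdYlBu_r` on the 33 °C danger threshold used in this notebook by means of matplotlib's `TwoSlopeNorm`. The 15-40 °C display range is an assumption chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm

# Map 15 °C -> 0.0, 33 °C -> 0.5 (colormap midpoint), 40 °C -> 1.0
norm = TwoSlopeNorm(vmin=15, vcenter=33, vmax=40)
cmap = plt.get_cmap("RdYlBu_r")

values = np.array([15.0, 24.0, 33.0, 36.5, 40.0])
positions = np.asarray(norm(values))   # position of each value along the colormap
colors = cmap(positions)               # RGBA rows, one per value

for v, p in zip(values, positions):
    print(f"{v:5.1f} °C -> colormap position {p:.2f}")
```

To apply it, pass both arguments to the plotting call, for example `ax.scatter(x, y, c=heat_index, cmap=cmap, norm=norm)`. Values below the threshold then stay on the blue side and values above it on the red side, regardless of how the data are distributed.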

### Accessibility Considerations
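1. **Color Blindness**: Use colorblind-safe palettes (for example, viridis or cividis for sequential data) and avoid relying on red-green contrasts alone
2. **High Contrast**: Ensure sufficient contrast for readability; WCAG asks for a contrast ratio of at least 4.5:1 for normal-size text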
3. **Alternative Encodings**: Use patterns, shapes, or labels in addition to color
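
Contrast can be checked numerically rather than by eye. The following sketch implements the relative-luminance and contrast-ratio formulas from the WCAG guidelines listed in the resources section. The function names are chosen for this example, and colors are given as 8-bit sRGB triples.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (R, G, B) with 0-255 channels."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio between two colors; ranges from 1 (identical) to 21."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

WHITE, BLACK, RED = (255, 255, 255), (0, 0, 0), (255, 0, 0)

white_on_black = contrast_ratio(WHITE, BLACK)
white_on_red = contrast_ratio(WHITE, RED)
print(f"white on black: {white_on_black:.2f}:1")
print(f"white on red:   {white_on_red:.2f}:1")
```

White text on pure red comes out below 4.5:1, which is worth keeping in mind when placing white labels or warning symbols on the hot end of a red colormap.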

### Interactive Features
1. **Tooltips**: Provide detailed information on hover
2. **Zoom/Pan**: Allow exploration of dense data
3. **Filters**: Enable data subset selection
4. **Export**: Allow users to save visualizations

### Performance Optimization
1. **Data Aggregation**: Pre-aggregate for large datasets
2. **Lazy Loading**: Load data on demand
3. **Caching**: Cache computed results
4. **Sampling**: Use representative samples for initial views
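
Pre-aggregation and sampling are both one-liners in pandas. The sketch below reduces a week of 10-minute readings to hourly summaries and draws a fixed-size random preview. The data are randomly generated stand-ins, and the names `raw`, `hourly`, and `preview` are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# One week of 10-minute readings (synthetic values)
timestamps = pd.date_range("2025-07-01", periods=6 * 24 * 7, freq="10min")
raw = pd.DataFrame({
    "timestamp": timestamps,
    "heat_index": 28 + rng.normal(0, 2, size=len(timestamps)),
})

# Pre-aggregate: six readings per hour collapse into one row with mean and max
hourly = (raw.set_index("timestamp")["heat_index"]
             .resample("1h")
             .agg(["mean", "max"]))

# Sample: a reproducible 200-row preview, re-sorted for plotting
preview = raw.sample(n=200, random_state=0).sort_values("timestamp")

print(f"raw: {len(raw)} rows, hourly: {len(hourly)} rows, preview: {len(preview)} rows")
```

Keeping the maximum alongside the mean matters for heat data, since averaging alone can hide short exceedances of a danger threshold.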

## 8. Exercise: Create Your Own Visualization

### Task
Create a comprehensive visualization dashboard that:
1. Shows the relationship between temperature, humidity, and KMA heat index
2. Identifies periods of extreme heat (heat index > 35°C)
3. Compares patterns across different districts
4. Includes both static and interactive elements

### Requirements
- Use KMA heat index formula for all calculations
- Include at least 3 different visualization types
- Make at least one visualization interactive
- Follow accessibility best practices
- Document your design choices

In [None]:
# Your solution here
# Example structure provided:

def create_custom_dashboard(df):
    """
    Create a custom visualization dashboard for heat wave analysis.
    
    Parameters:
    df: DataFrame with columns ['timestamp', 'district', 'temperature', 'humidity', 'heat_index']
    
    Returns:
    fig: Combined figure with multiple visualizations
    """
    # Your implementation here
    pass

# Test your function
# create_custom_dashboard(df)

## 9. Summary and Key Takeaways

### What We Learned
1. **Static Visualizations**: Created publication-quality figures with matplotlib and seaborn
2. **Interactive Plots**: Built dynamic visualizations with Plotly
3. **Geospatial Mapping**: Developed interactive maps with Folium
4. **Dashboard Development**: Created real-time dashboards with Streamlit
5. **Advanced Techniques**: Implemented ridgeline plots, radial visualizations, and animations
6. **KMA Formula Integration**: Consistently used KMA heat index throughout all visualizations

### Key Concepts
- **Visual Encoding**: Mapping data to visual properties effectively
- **Interactivity**: Enhancing user engagement and exploration
- **Storytelling**: Creating narrative through visualization
- **Performance**: Optimizing for large datasets
- **Accessibility**: Ensuring visualizations are usable by all

### Next Steps
- Week 10: Final Project - Comprehensive Seoul Heatwave Analysis System
- Integrate all techniques learned throughout the course
- Build a production-ready climate monitoring system

## Resources and References

### Visualization Libraries
- [Plotly Documentation](https://plotly.com/python/)
- [Folium Documentation](https://python-visualization.github.io/folium/)
- [Streamlit Documentation](https://docs.streamlit.io/)
- [Matplotlib Gallery](https://matplotlib.org/stable/gallery/index.html)
- [Seaborn Gallery](https://seaborn.pydata.org/examples/index.html)

### Climate Visualization Best Practices
- [Climate.gov Data Visualization](https://www.climate.gov/)
- [NOAA Climate Data Visualization](https://www.ncdc.noaa.gov/)
- [KMA Climate Information Portal](http://www.climate.go.kr/)

### Color Theory and Accessibility
- [ColorBrewer 2.0](https://colorbrewer2.org/)
- [Accessible Colors](https://accessible-colors.com/)
- [WCAG Guidelines](https://www.w3.org/WAI/WCAG21/quickref/)