# Week 9 - Part 3: Data Visualization with Streamlit

**Course:** Python Data Analysis for Business Intelligence  
**Week:** 9 | **Session:** Wednesday | **Part:** 3 of 3  
**Duration:** 20 minutes | **Date:** June 4, 2025

## Learning Objectives
By the end of this session, you will be able to:
- Integrate matplotlib, seaborn, and plotly with Streamlit
- Create interactive business dashboards with real-time data
- Connect to Supabase for live data visualization
- Implement caching strategies for optimal performance
- Build professional data storytelling applications

---

## 🎯 Business Context: Real-Time Business Intelligence

**Executive Challenge**: Your CEO walks into a board meeting and says:
> "I need to see our customer satisfaction trends, revenue performance, and operational metrics updating in real-time. Our competitors are making data-driven decisions faster than us."

**The Traditional Problem**:
- Static reports delivered weekly via email
- Data analysts spending hours creating PowerPoint presentations
- Decision-makers working with outdated information
- No ability to drill down into specific metrics during meetings

**Today's Solution**: Real-time, interactive dashboards that connect directly to your database and update automatically.

**Business Impact**:
- ⚡ Real-time decision making
- 📊 Interactive data exploration
- 🎯 Targeted business insights
- 💰 Faster response to market changes

---

## 🛠️ Setup: Environment and Supabase Connection

Let's import the visualization libraries and set the plotting defaults. The Supabase connection itself is configured in Section 2.

In [1]:
# Essential imports for data visualization
import streamlit as st
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from datetime import datetime, timedelta
import time

# Plot styling defaults
plt.style.use('default')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 12

print("✅ Visualization environment ready!")
print("🔗 Ready to build real-time dashboards with live data")

✅ Visualization environment ready!
🔗 Ready to build real-time dashboards with live data
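
The setup cell imports matplotlib and seaborn, but the apps in this session draw their charts with Plotly. As a minimal sketch of the matplotlib route, the helper below builds a figure object that Streamlit can display with `st.pyplot`. The function names and the sample scores are made up for illustration. An axes-level seaborn function such as `sns.histplot` can draw on the same axes if you pass it `ax=ax`.

```python
import matplotlib
matplotlib.use("Agg")  # Non-interactive backend; nothing is drawn to a window
import matplotlib.pyplot as plt


def build_satisfaction_histogram(scores):
    """Return a matplotlib Figure with a histogram of 1-5 satisfaction scores."""
    fig, ax = plt.subplots(figsize=(10, 6))
    # One bin per integer score, centred on 1, 2, 3, 4, 5
    ax.hist(scores, bins=[0.5, 1.5, 2.5, 3.5, 4.5, 5.5],
            color="#1f77b4", edgecolor="white")
    ax.set_xlabel("Satisfaction score")
    ax.set_ylabel("Number of orders")
    ax.set_title("Customer Satisfaction Distribution")
    return fig


def show_in_streamlit(fig):
    """Display the figure inside a running Streamlit app."""
    import streamlit as st  # Imported here so the helper above works without Streamlit
    st.pyplot(fig)


# Illustrative scores only, not course data
sample_scores = [5, 4, 4, 3, 5, 2, 4, 5, 1, 4]
fig = build_satisfaction_histogram(sample_scores)
```

Inside a Streamlit script you would call `show_in_streamlit(fig)`, or simply `st.pyplot(fig)`, at the point where the chart should appear.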


## 📊 Section 1: Streamlit + Plotly Integration (8 minutes)

The app below builds a multi-tab dashboard with Plotly charts, sidebar filters, and KPI cards. It runs on simulated data, so you can try it without a database.
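
The simulated hourly order series produced by `generate_realtime_data()` in the app below is the sum of a baseline, a weekly sine wave, a daily sine wave, a linear trend, and Gaussian noise, floored at 5. Written out with $t$ as the hour index starting at 0 (the symbols are ours; the constants come from the code):

$$
\text{orders}(t) = \max\!\Big(5,\; 50 + 10\sin\!\Big(\frac{2\pi t}{168}\Big) + 20\sin\!\Big(\frac{2\pi t}{24}\Big) + 0.01\,t + \varepsilon_t\Big), \qquad \varepsilon_t \sim \mathcal{N}(0,\, 5^2)
$$

Here $168 = 24 \times 7$ is the number of hours in a week. Over 90 days (about 2,160 hours) the trend term adds roughly $0.01 \times 2160 = 21.6$ orders per hour to the baseline. The result is truncated to an integer when the DataFrame is built.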

In [2]:
%%writefile interactive_dashboard_app.py

import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from datetime import datetime, timedelta
import time

# Configure page for wide layout
st.set_page_config(
    page_title="Real-Time Business Dashboard",
    page_icon="📊",
    layout="wide",
    initial_sidebar_state="expanded"
)

# Custom CSS for professional appearance
st.markdown("""
<style>
    .main > div {
        padding-top: 1rem;
    }
    .stMetric {
        background: white;
        border: 1px solid #e0e0e0;
        padding: 1rem;
        border-radius: 8px;
        box-shadow: 0 2px 4px rgba(0,0,0,0.1);
    }
</style>
""", unsafe_allow_html=True)

# Generate realistic business data with time series
@st.cache_data(ttl=300)  # Cache for 5 minutes
def generate_realtime_data():
    """
    Generate realistic time-series business data.
    In production, this would be replaced with Supabase queries.
    """
    np.random.seed(int(time.time()) % 1000)  # Semi-random for "real-time" feel
    
    # Generate 90 days of hourly data
    end_date = datetime.now()
    start_date = end_date - timedelta(days=90)
    date_range = pd.date_range(start_date, end_date, freq='h')  # 'H' is deprecated since pandas 2.2
    
    # Simulate realistic business patterns
    base_orders = 50
    seasonal_pattern = np.sin(np.arange(len(date_range)) * 2 * np.pi / (24 * 7)) * 10  # Weekly pattern
    daily_pattern = np.sin(np.arange(len(date_range)) * 2 * np.pi / 24) * 20  # Daily pattern
    growth_trend = np.arange(len(date_range)) * 0.01  # Growth trend
    noise = np.random.normal(0, 5, len(date_range))
    
    orders = base_orders + seasonal_pattern + daily_pattern + growth_trend + noise
    orders = np.maximum(orders, 5)  # Minimum 5 orders per hour
    
    # Generate correlated metrics
    revenue = orders * np.random.normal(85, 15, len(date_range))
    satisfaction = np.random.normal(4.2, 0.3, len(date_range))
    satisfaction = np.clip(satisfaction, 1, 5)
    
    # Create DataFrame
    df = pd.DataFrame({
        'timestamp': date_range,
        'orders': orders.astype(int),
        'revenue': revenue,
        'satisfaction': satisfaction,
        'customers': orders * np.random.uniform(0.7, 1.3, len(date_range)),
        'conversion_rate': np.random.normal(3.5, 0.5, len(date_range))
    })
    
    # Add categorical data
    categories = ['Electronics', 'Fashion', 'Home & Garden', 'Books', 'Sports']
    states = ['São Paulo', 'Rio de Janeiro', 'Minas Gerais', 'Bahia', 'Paraná']
    
    category_data = []
    for _, row in df.iterrows():
        for cat in categories:
            orders_cat = int(row['orders'] * np.random.uniform(0.1, 0.3))
            if orders_cat > 0:
                category_data.append({
                    'timestamp': row['timestamp'],
                    'category': cat,
                    'orders': orders_cat,
                    'revenue': orders_cat * np.random.uniform(70, 120)
                })
    
    category_df = pd.DataFrame(category_data)
    
    return df, category_df

# Load data
df, category_df = generate_realtime_data()

# Header with real-time update
header_col1, header_col2 = st.columns([3, 1])

with header_col1:
    st.title("📊 Real-Time Business Intelligence Dashboard")
    st.markdown("**Live insights from Olist Brazilian E-commerce Platform**")

with header_col2:
    st.markdown(f"**🕒 Last Updated:** {datetime.now().strftime('%H:%M:%S')}")
    if st.button("🔄 Refresh Data"):
        st.cache_data.clear()
        st.rerun()

# Sidebar controls
st.sidebar.header("📊 Dashboard Controls")

# Time range selector
time_range = st.sidebar.selectbox(
    "📅 Time Range:",
    ['Last 24 Hours', 'Last 7 Days', 'Last 30 Days', 'Last 90 Days'],
    index=1
)

# Filter data based on time range
if time_range == 'Last 24 Hours':
    cutoff = datetime.now() - timedelta(hours=24)
elif time_range == 'Last 7 Days':
    cutoff = datetime.now() - timedelta(days=7)
elif time_range == 'Last 30 Days':
    cutoff = datetime.now() - timedelta(days=30)
else:
    cutoff = datetime.now() - timedelta(days=90)

filtered_df = df[df['timestamp'] >= cutoff]
filtered_category_df = category_df[category_df['timestamp'] >= cutoff]

# Chart type selector
chart_style = st.sidebar.selectbox(
    "📈 Chart Style:",
    ['Professional', 'Colorful', 'Minimal', 'Dark Theme']
)

# Set color palette based on style
if chart_style == 'Professional':
    color_palette = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd']
elif chart_style == 'Colorful':
    color_palette = px.colors.qualitative.Set1
elif chart_style == 'Minimal':
    color_palette = ['#34495e', '#7f8c8d', '#95a5a6', '#bdc3c7', '#ecf0f1']
else:  # Dark Theme
    color_palette = ['#e74c3c', '#f39c12', '#f1c40f', '#2ecc71', '#3498db']

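# Auto-refresh toggle
# NOTE: time.sleep() blocks the script at this point, so while the box is checked
# the KPIs and charts below are never redrawn. For a working auto-refresh, move the
# sleep-and-rerun block to the very end of the script, after everything has rendered.
auto_refresh = st.sidebar.checkbox("🔄 Auto-refresh (30s)", value=False)
if auto_refresh:
    time.sleep(30)
    st.rerun()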

# Real-time KPIs
st.subheader("⚡ Real-Time Key Performance Indicators")

kpi_col1, kpi_col2, kpi_col3, kpi_col4, kpi_col5 = st.columns(5)

# Calculate current period vs previous period
current_period = filtered_df.tail(len(filtered_df)//2)
previous_period = filtered_df.head(len(filtered_df)//2)

with kpi_col1:
    current_orders = current_period['orders'].sum()
    previous_orders = previous_period['orders'].sum()
    orders_change = ((current_orders - previous_orders) / previous_orders * 100) if previous_orders > 0 else 0
    
    st.metric(
        label="📦 Total Orders",
        value=f"{current_orders:,}",
        delta=f"{orders_change:+.1f}%"
    )

with kpi_col2:
    current_revenue = current_period['revenue'].sum()
    previous_revenue = previous_period['revenue'].sum()
    revenue_change = ((current_revenue - previous_revenue) / previous_revenue * 100) if previous_revenue > 0 else 0
    
    st.metric(
        label="💰 Revenue",
        value=f"R$ {current_revenue:,.0f}",
        delta=f"{revenue_change:+.1f}%"
    )

with kpi_col3:
    current_satisfaction = current_period['satisfaction'].mean()
    previous_satisfaction = previous_period['satisfaction'].mean()
    satisfaction_change = current_satisfaction - previous_satisfaction
    
    st.metric(
        label="⭐ Avg Satisfaction",
        value=f"{current_satisfaction:.2f}/5.0",
        delta=f"{satisfaction_change:+.2f}"
    )

with kpi_col4:
    current_customers = current_period['customers'].sum()
    previous_customers = previous_period['customers'].sum()
    customers_change = ((current_customers - previous_customers) / previous_customers * 100) if previous_customers > 0 else 0
    
    st.metric(
        label="👥 Customers",
        value=f"{current_customers:,.0f}",
        delta=f"{customers_change:+.1f}%"
    )

with kpi_col5:
    current_conversion = current_period['conversion_rate'].mean()
    previous_conversion = previous_period['conversion_rate'].mean()
    conversion_change = current_conversion - previous_conversion
    
    st.metric(
        label="🎯 Conversion Rate",
        value=f"{current_conversion:.1f}%",
        delta=f"{conversion_change:+.1f}%"
    )

# Interactive Charts Section
st.markdown("---")
st.subheader("📈 Interactive Business Analytics")

# Chart tabs for better organization
chart_tab1, chart_tab2, chart_tab3, chart_tab4 = st.tabs([
    "📊 Time Series", "🏷️ Categories", "🗺️ Geographic", "📋 Summary"
])

with chart_tab1:
    # Time series charts
    ts_col1, ts_col2 = st.columns(2)
    
    with ts_col1:
        # Orders and Revenue over time
        fig = make_subplots(
            rows=2, cols=1,
            subplot_titles=['Orders Over Time', 'Revenue Over Time'],
            vertical_spacing=0.1
        )
        
        # Group by day for cleaner visualization
        daily_data = filtered_df.groupby(filtered_df['timestamp'].dt.date).agg({
            'orders': 'sum',
            'revenue': 'sum'
        }).reset_index()
        
        fig.add_trace(
            go.Scatter(
                x=daily_data['timestamp'],
                y=daily_data['orders'],
                mode='lines+markers',
                name='Orders',
                line=dict(color=color_palette[0], width=3)
            ),
            row=1, col=1
        )
        
        fig.add_trace(
            go.Scatter(
                x=daily_data['timestamp'],
                y=daily_data['revenue'],
                mode='lines+markers',
                name='Revenue',
                line=dict(color=color_palette[1], width=3)
            ),
            row=2, col=1
        )
        
        fig.update_layout(
            height=500,
            title_text="Business Performance Trends",
            showlegend=False
        )
        
        st.plotly_chart(fig, use_container_width=True)
    
    with ts_col2:
        # Customer satisfaction and conversion trends
        fig = make_subplots(
            rows=2, cols=1,
            subplot_titles=['Customer Satisfaction', 'Conversion Rate'],
            vertical_spacing=0.1
        )
        
        daily_metrics = filtered_df.groupby(filtered_df['timestamp'].dt.date).agg({
            'satisfaction': 'mean',
            'conversion_rate': 'mean'
        }).reset_index()
        
        fig.add_trace(
            go.Scatter(
                x=daily_metrics['timestamp'],
                y=daily_metrics['satisfaction'],
                mode='lines+markers',
                name='Satisfaction',
                line=dict(color=color_palette[2], width=3),
                fill='tonexty'
            ),
            row=1, col=1
        )
        
        fig.add_trace(
            go.Scatter(
                x=daily_metrics['timestamp'],
                y=daily_metrics['conversion_rate'],
                mode='lines+markers',
                name='Conversion',
                line=dict(color=color_palette[3], width=3)
            ),
            row=2, col=1
        )
        
        fig.update_layout(
            height=500,
            title_text="Quality Metrics Trends",
            showlegend=False
        )
        
        st.plotly_chart(fig, use_container_width=True)

with chart_tab2:
    # Category analysis
    cat_col1, cat_col2 = st.columns(2)
    
    with cat_col1:
        # Category performance pie chart
        category_summary = filtered_category_df.groupby('category').agg({
            'orders': 'sum',
            'revenue': 'sum'
        }).reset_index()
        
        fig = px.pie(
            category_summary,
            values='revenue',
            names='category',
            title="Revenue by Product Category",
            color_discrete_sequence=color_palette
        )
        
        fig.update_traces(textposition='inside', textinfo='percent+label')
        st.plotly_chart(fig, use_container_width=True)
    
    with cat_col2:
        # Category trends over time
        category_daily = filtered_category_df.groupby([
            filtered_category_df['timestamp'].dt.date, 'category'
        ])['revenue'].sum().reset_index()
        
        fig = px.line(
            category_daily,
            x='timestamp',
            y='revenue',
            color='category',
            title="Category Revenue Trends",
            color_discrete_sequence=color_palette
        )
        
        fig.update_layout(height=400)
        st.plotly_chart(fig, use_container_width=True)
    
    # Category performance table
    st.subheader("📊 Category Performance Summary")
    
    category_summary['avg_order_value'] = category_summary['revenue'] / category_summary['orders']
    category_summary['revenue_share'] = (category_summary['revenue'] / category_summary['revenue'].sum() * 100)
    
    st.dataframe(
        category_summary.style.format({
            'orders': '{:,}',
            'revenue': 'R$ {:,.0f}',
            'avg_order_value': 'R$ {:.2f}',
            'revenue_share': '{:.1f}%'
        }),
        use_container_width=True
    )

with chart_tab3:
    # Geographic analysis (simulated)
    st.subheader("🗺️ Geographic Performance")
    
    # Generate sample geographic data
    states = ['São Paulo', 'Rio de Janeiro', 'Minas Gerais', 'Bahia', 'Paraná', 'Santa Catarina']
    geo_data = []
    
    for state in states:
        orders = np.random.randint(1000, 5000)
        revenue = orders * np.random.uniform(80, 120)
        satisfaction = np.random.uniform(3.8, 4.5)
        
        geo_data.append({
            'state': state,
            'orders': orders,
            'revenue': revenue,
            'satisfaction': satisfaction
        })
    
    geo_df = pd.DataFrame(geo_data)
    
    geo_col1, geo_col2 = st.columns(2)
    
    with geo_col1:
        # Revenue by state
        fig = px.bar(
            geo_df.sort_values('revenue', ascending=True),
            x='revenue',
            y='state',
            orientation='h',
            title="Revenue by State",
            color='satisfaction',
            color_continuous_scale='RdYlGn'
        )
        
        fig.update_layout(height=400)
        st.plotly_chart(fig, use_container_width=True)
    
    with geo_col2:
        # Orders vs Satisfaction scatter
        fig = px.scatter(
            geo_df,
            x='orders',
            y='satisfaction',
            size='revenue',
            color='state',
            title="Orders vs Satisfaction by State",
            hover_data=['revenue'],
            color_discrete_sequence=color_palette
        )
        
        fig.update_layout(height=400)
        st.plotly_chart(fig, use_container_width=True)

with chart_tab4:
    # Summary dashboard
    st.subheader("📋 Executive Summary")
    
    summary_col1, summary_col2 = st.columns(2)
    
    with summary_col1:
        # Key insights
        st.markdown("### 🎯 Key Insights")
        
        total_orders = filtered_df['orders'].sum()
        total_revenue = filtered_df['revenue'].sum()
        avg_satisfaction = filtered_df['satisfaction'].mean()
        
        st.info(f"📊 **Total Orders**: {total_orders:,} in selected period")
        st.info(f"💰 **Total Revenue**: R$ {total_revenue:,.0f}")
        st.info(f"⭐ **Average Satisfaction**: {avg_satisfaction:.2f}/5.0")
        
        # Performance indicators
        if avg_satisfaction >= 4.0:
            st.success("✅ Customer satisfaction is above target (4.0)")
        else:
            st.warning("⚠️ Customer satisfaction needs attention")
        
        if orders_change > 0:
            st.success(f"📈 Orders growing by {orders_change:.1f}%")
        else:
            st.error(f"📉 Orders declining by {abs(orders_change):.1f}%")
    
    with summary_col2:
        # Performance gauge
        overall_score = (avg_satisfaction / 5.0) * 100
        
        fig = go.Figure(go.Indicator(
            mode = "gauge+number+delta",
            value = overall_score,
            domain = {'x': [0, 1], 'y': [0, 1]},
            title = {'text': "Overall Performance Score"},
            delta = {'reference': 80},
            gauge = {
                'axis': {'range': [None, 100]},
                'bar': {'color': color_palette[0]},
                'steps': [
                    {'range': [0, 60], 'color': "lightgray"},
                    {'range': [60, 80], 'color': "yellow"},
                    {'range': [80, 100], 'color': "lightgreen"}
                ],
                'threshold': {
                    'line': {'color': "red", 'width': 4},
                    'thickness': 0.75,
                    'value': 90
                }
            }
        ))
        
        fig.update_layout(height=400)
        st.plotly_chart(fig, use_container_width=True)

# Footer
st.markdown("---")
st.markdown(
    "**📊 Real-Time Dashboard** | "
    "Built with Streamlit & Plotly | "
    "**🔗 Data Source**: Simulated data (Supabase in production) | "
    f"**🕒 Generated**: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}"
)

Writing interactive_dashboard_app.py
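
The KPI deltas in the app above compare the second half of the selected time window with the first half. Here is a minimal sketch of that calculation on a plain list instead of a DataFrame. The function names and the sample numbers are ours, chosen for illustration.

```python
def split_periods(values):
    """Split a sequence into (previous, current) halves, like head(n//2) and tail(n//2)."""
    half = len(values) // 2
    previous = values[:half]
    current = values[len(values) - half:]
    return previous, current


def percent_change(previous_total, current_total):
    """Percent change from previous to current; 0.0 when there is no positive baseline."""
    if previous_total > 0:
        return (current_total - previous_total) / previous_total * 100
    return 0.0


# Illustrative hourly order counts, not real data
hourly_orders = [40, 50, 45, 55, 60, 65]
previous, current = split_periods(hourly_orders)
change = percent_change(sum(previous), sum(current))
print(f"{sum(current):,} orders ({change:+.1f}%)")  # prints: 180 orders (+33.3%)
```

Note that with an odd number of rows the middle row belongs to neither half, which is also how the `head`/`tail` split in the app behaves.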


## 🔗 Section 2: Supabase Integration Setup (7 minutes)

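The next app shows the pieces needed for a live Supabase connection: credential lookup, a cached connection object, and cached query functions. The queries themselves are simulated so the app runs without a database.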
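
The credential lookup in the app below prefers Streamlit secrets and falls back to environment variables. As a small sketch of that order of precedence, the function below takes both sources as plain mappings so it can be tried without Streamlit. The function name is hypothetical and the values are the placeholders from the setup instructions.

```python
import os


def resolve_supabase_credentials(secrets, environ=os.environ):
    """Return (url, key), preferring the secrets mapping over environment variables."""
    section = secrets.get("supabase")
    if section:
        return section.get("url"), section.get("anon_key")
    return environ.get("SUPABASE_URL"), environ.get("SUPABASE_ANON_KEY")


# Plain dicts stand in for st.secrets and os.environ (placeholder values)
from_secrets = resolve_supabase_credentials(
    {"supabase": {"url": "https://your-project.supabase.co",
                  "anon_key": "your-anon-key-here"}},
    environ={},
)
from_env = resolve_supabase_credentials(
    {},
    environ={"SUPABASE_URL": "https://your-project.supabase.co",
             "SUPABASE_ANON_KEY": "your-anon-key-here"},
)
missing = resolve_supabase_credentials({}, environ={})
```

When neither source is configured the result is `(None, None)`, which is the case where the app shows its setup instructions.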

In [None]:
%%writefile supabase_integration_app.py

import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime, timedelta
import os

# Configure page
st.set_page_config(
    page_title="Live Supabase Dashboard",
    page_icon="🔗",
    layout="wide"
)

st.title("🔗 Live Supabase Data Integration")
st.markdown("**Real-time business dashboard with live database connectivity**")

# Supabase connection helper
@st.cache_resource
def init_supabase_connection():
    """
    Initialize Supabase connection using environment variables or Streamlit secrets.
    In production, use st.secrets for secure credential management.
    """
    try:
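        # Try Streamlit secrets first. Accessing st.secrets can raise when no
        # secrets.toml file exists, so guard the lookup to keep the fallback reachable.
        url = key = None
        try:
            if 'supabase' in st.secrets:
                url = st.secrets['supabase']['url']
                key = st.secrets['supabase']['anon_key']
        except Exception:
            pass  # No usable secrets; fall through to environment variables

        if not url or not key:
            # Fallback to environment variables
            url = os.getenv('SUPABASE_URL')
            key = os.getenv('SUPABASE_ANON_KEY')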
        
        if not url or not key:
            st.error("❌ Supabase credentials not found. Please set up your credentials.")
            st.info(
                "Add your Supabase URL and anon key to `.streamlit/secrets.toml` "
                "or set SUPABASE_URL and SUPABASE_ANON_KEY environment variables."
            )
            return None
        
        # Note: In a real application, you would import and use the supabase client here
        # from supabase import create_client, Client
        # supabase: Client = create_client(url, key)
        # return supabase
        
        # For this demo, we'll simulate the connection
        st.success("✅ Supabase connection established!")
        return {'url': url, 'status': 'connected'}
        
    except Exception as e:
        st.error(f"❌ Failed to connect to Supabase: {str(e)}")
        return None

# Data loading functions that would query Supabase
@st.cache_data(ttl=300)  # Cache for 5 minutes
def load_orders_data(_supabase_client, limit=1000):
    """
    Load orders data from Supabase.
    In production, this would execute actual Supabase queries.

    The leading underscore tells st.cache_data not to hash the client argument,
    which matters once this is a real (unhashable) Supabase client.
    """
    if not _supabase_client:
        return None
    
    # Simulated query: _supabase_client.table('orders').select('*').limit(limit).execute()
    # For demo, generate realistic data
    np.random.seed(42)
    
    # End the series at the current time so the default "last 30 days" filter matches rows
    dates = pd.date_range(end=datetime.now(), periods=limit, freq='h')
    
    data = {
        'order_id': [f"ORD_{i:08d}" for i in range(1, limit + 1)],
        'customer_id': [f"CUST_{np.random.randint(1, 10000):06d}" for _ in range(limit)],
        'order_date': dates,
        'product_category': np.random.choice([
            'Electronics', 'Fashion', 'Home & Garden', 'Books', 'Sports',
            'Beauty', 'Automotive', 'Toys', 'Health', 'Food'
        ], limit),
        'order_value': np.random.exponential(100, limit),
        'customer_state': np.random.choice([
            'SP', 'RJ', 'MG', 'BA', 'PR', 'RS', 'PE', 'CE', 'SC', 'GO'
        ], limit),
        'order_status': np.random.choice([
            'completed', 'processing', 'shipped', 'cancelled'
        ], limit, p=[0.8, 0.1, 0.08, 0.02]),
        'satisfaction_score': np.random.choice([1, 2, 3, 4, 5], limit, p=[0.05, 0.1, 0.2, 0.35, 0.3])
    }
    
    df = pd.DataFrame(data)
    df['order_value'] = np.round(df['order_value'], 2)
    
    return df

@st.cache_data(ttl=300)
def load_customer_metrics(_supabase_client):
    """
    Load customer metrics from Supabase.
    The leading underscore keeps st.cache_data from hashing the client.
    """
    if not _supabase_client:
        return None
    
    # Simulated aggregated metrics query
    metrics = {
        'total_customers': 45230,
        'new_customers_today': 127,
        'customer_ltv': 485.50,
        'churn_rate': 2.3,
        'avg_satisfaction': 4.2,
        'nps_score': 67
    }
    
    return metrics

# Initialize connection
supabase_client = init_supabase_connection()

if supabase_client:
    # Sidebar controls
    st.sidebar.header("🔧 Database Controls")
    
    # Data refresh controls
    if st.sidebar.button("🔄 Refresh Data"):
        st.cache_data.clear()
        st.success("✅ Data refreshed from Supabase!")
        st.rerun()
    
    # Query limit
    query_limit = st.sidebar.slider(
        "Query Limit:",
        min_value=100,
        max_value=5000,
        value=1000,
        step=100,
        help="Limit number of records to fetch for performance"
    )
    
    # Filter controls
    st.sidebar.subheader("📊 Data Filters")
    
    date_filter = st.sidebar.date_input(
        "Orders Since:",
        value=datetime.now().date() - timedelta(days=30),
        help="Filter orders from this date forward"
    )
    
    status_filter = st.sidebar.multiselect(
        "Order Status:",
        ['completed', 'processing', 'shipped', 'cancelled'],
        default=['completed', 'processing', 'shipped']
    )
    
    # Load data
    with st.spinner("🔄 Loading data from Supabase..."):
        orders_df = load_orders_data(supabase_client, query_limit)
        customer_metrics = load_customer_metrics(supabase_client)
    
    if orders_df is not None and customer_metrics is not None:
        # Apply filters
        filtered_orders = orders_df[
            (orders_df['order_date'].dt.date >= date_filter) &
            (orders_df['order_status'].isin(status_filter))
        ]
        
        # Display connection info
        info_col1, info_col2, info_col3 = st.columns(3)
        
        with info_col1:
            st.info(f"🔗 **Connected to Supabase**\n\nRecords loaded: {len(orders_df):,}")
        
        with info_col2:
            st.info(f"📊 **Filtered Dataset**\n\nRecords after filters: {len(filtered_orders):,}")
        
        with info_col3:
            st.info(f"🕒 **Last Updated**\n\n{datetime.now().strftime('%H:%M:%S')}")
        
        # Live KPIs from Supabase
        st.subheader("📊 Live Business Metrics")
        
        kpi_col1, kpi_col2, kpi_col3, kpi_col4, kpi_col5, kpi_col6 = st.columns(6)
        
        with kpi_col1:
            st.metric(
                "👥 Total Customers",
                f"{customer_metrics['total_customers']:,}",
                f"+{customer_metrics['new_customers_today']}"
            )
        
        with kpi_col2:
            total_revenue = filtered_orders['order_value'].sum()
            st.metric(
                "💰 Revenue",
                f"R$ {total_revenue:,.0f}",
                "Live data"
            )
        
        with kpi_col3:
            avg_order_value = filtered_orders['order_value'].mean()
            st.metric(
                "💳 Avg Order Value",
                f"R$ {avg_order_value:.2f}",
                "Real-time"
            )
        
        with kpi_col4:
            st.metric(
                "⭐ Satisfaction",
                f"{customer_metrics['avg_satisfaction']:.1f}/5.0",
                f"NPS: {customer_metrics['nps_score']}"
            )
        
        with kpi_col5:
            st.metric(
                "💼 Customer LTV",
                f"R$ {customer_metrics['customer_ltv']:.2f}",
                "Lifetime Value"
            )
        
        with kpi_col6:
            st.metric(
                "📉 Churn Rate",
                f"{customer_metrics['churn_rate']:.1f}%",
                "Monthly"
            )
        
        # Interactive visualizations
        st.markdown("---")
        st.subheader("📈 Live Data Visualizations")
        
        viz_col1, viz_col2 = st.columns(2)
        
        with viz_col1:
            # Revenue by category
            category_revenue = filtered_orders.groupby('product_category')['order_value'].sum().reset_index()
            category_revenue = category_revenue.sort_values('order_value', ascending=False)
            
            fig = px.bar(
                category_revenue,
                x='product_category',
                y='order_value',
                title="Revenue by Product Category (Live Data)",
                labels={'order_value': 'Revenue (R$)', 'product_category': 'Category'}
            )
            fig.update_xaxes(tickangle=45)
            st.plotly_chart(fig, use_container_width=True)
        
        with viz_col2:
            # Orders by state
            state_orders = filtered_orders['customer_state'].value_counts().reset_index()
            state_orders.columns = ['state', 'orders']
            
            fig = px.pie(
                state_orders.head(8),  # Top 8 states
                values='orders',
                names='state',
                title="Orders by State (Top 8)"
            )
            st.plotly_chart(fig, use_container_width=True)
        
        # Time series analysis
        st.subheader("📊 Time Series Analysis")
        
        # Daily revenue trend
        daily_revenue = filtered_orders.groupby(
            filtered_orders['order_date'].dt.date
        ).agg({
            'order_value': 'sum',
            'order_id': 'count',
            'satisfaction_score': 'mean'
        }).reset_index()
        
        daily_revenue.columns = ['date', 'revenue', 'orders', 'avg_satisfaction']
        
        # Single-axis line chart of daily revenue
        fig = go.Figure()
        
        # Revenue line
        fig.add_trace(
            go.Scatter(
                x=daily_revenue['date'],
                y=daily_revenue['revenue'],
                mode='lines+markers',
                name='Daily Revenue',
                line=dict(color='#1f77b4', width=3)
            )
        )
        
        fig.update_layout(
            title="Daily Revenue Trend (Live Supabase Data)",
            xaxis_title="Date",
            yaxis_title="Revenue (R$)",
            height=400
        )
        
        st.plotly_chart(fig, use_container_width=True)
        
        # Data preview
        st.subheader("📋 Live Data Preview")
        
        preview_tabs = st.tabs(["📦 Recent Orders", "📊 Summary Stats", "🔍 Raw Data"])
        
        with preview_tabs[0]:
            # Show recent orders
            recent_orders = filtered_orders.nlargest(10, 'order_date')
            st.dataframe(
                recent_orders[[
                    'order_id', 'order_date', 'product_category',
                    'order_value', 'customer_state', 'order_status'
                ]],
                use_container_width=True
            )
        
        with preview_tabs[1]:
            # Summary statistics
            stats_col1, stats_col2 = st.columns(2)
            
            with stats_col1:
                st.markdown("**📊 Order Statistics**")
                st.write(f"Total Orders: {len(filtered_orders):,}")
                st.write(f"Total Revenue: R$ {filtered_orders['order_value'].sum():,.2f}")
                st.write(f"Average Order Value: R$ {filtered_orders['order_value'].mean():.2f}")
                st.write(f"Median Order Value: R$ {filtered_orders['order_value'].median():.2f}")
            
            with stats_col2:
                st.markdown("**⭐ Quality Metrics**")
                st.write(f"Average Satisfaction: {filtered_orders['satisfaction_score'].mean():.2f}/5.0")
                completion_rate = (filtered_orders['order_status'] == 'completed').mean() * 100
                st.write(f"Completion Rate: {completion_rate:.1f}%")
                st.write(f"Unique Customers: {filtered_orders['customer_id'].nunique():,}")
                st.write(f"Product Categories: {filtered_orders['product_category'].nunique()}")
        
        with preview_tabs[2]:
            # Raw data with search
            search_term = st.text_input("🔍 Search orders (by Order ID or Customer ID):")
            
            if search_term:
                search_results = filtered_orders[
                    (filtered_orders['order_id'].str.contains(search_term, case=False)) |
                    (filtered_orders['customer_id'].str.contains(search_term, case=False))
                ]
                st.dataframe(search_results, use_container_width=True)
            else:
                st.dataframe(filtered_orders.head(20), use_container_width=True)
        
        # Export functionality
        st.markdown("---")
        st.subheader("💾 Data Export")
        
        export_col1, export_col2, export_col3 = st.columns(3)
        
        with export_col1:
            # st.download_button is itself the trigger; no wrapping st.button is needed
            st.download_button(
                label="📊 Export to CSV",
                data=filtered_orders.to_csv(index=False),
                file_name=f"olist_orders_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv",
                mime="text/csv"
            )
        
        with export_col2:
            # Same pattern: one click downloads the daily summary
            st.download_button(
                label="📈 Export Summary Report",
                data=daily_revenue.to_csv(index=False),
                file_name=f"daily_summary_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv",
                mime="text/csv"
            )
        
        with export_col3:
            if st.button("📧 Email Dashboard"):
                # Placeholder: no email is actually sent in this demo
                st.info("📧 Email functionality would be implemented here")
    
    else:
        st.error("❌ Failed to load data from Supabase")
        st.info("Please check your database connection and try again.")

else:
    # Show connection setup instructions
    st.warning("⚠️ Supabase connection not configured")
    
    with st.expander("🔧 Setup Instructions", expanded=True):
        st.markdown("""
        ### Setting up Supabase Connection
        
        To connect this dashboard to live Supabase data:
        
        1. **Create a `.streamlit/secrets.toml` file** in your project root:
        ```toml
        [supabase]
        url = "https://your-project.supabase.co"
        anon_key = "your-anon-key-here"
        ```
        
        2. **Or set environment variables**:
        ```bash
        export SUPABASE_URL="https://your-project.supabase.co"
        export SUPABASE_ANON_KEY="your-anon-key-here"
        ```
        
        3. **Install the Supabase client**:
        ```bash
        pip install supabase
        ```
        
        4. **Ensure your Supabase tables match the expected schema**:
        - `orders` table with columns: order_id, customer_id, order_date, product_category, order_value, customer_state, order_status, satisfaction_score
        - Proper Row Level Security (RLS) policies for data access
        
        5. **Restart your Streamlit app** after configuration
        """)

# Footer
st.markdown("---")
st.markdown(
    "**🔗 Live Supabase Integration** | "
    "Real-time business intelligence with secure database connectivity | "
    "**📚 Next:** Advanced dashboard features and deployment"
)

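The setup instructions in the dashboard above offer two configuration routes: a `[supabase]` table in `.streamlit/secrets.toml`, or the `SUPABASE_URL` and `SUPABASE_ANON_KEY` environment variables. One way to support both is a small lookup helper, sketched here with only the standard library. The function name `resolve_supabase_credentials` and the secrets-first precedence are assumptions for illustration, not part of the course template.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_supabase_credentials(
    secrets: Mapping, environ: Mapping = os.environ
) -> Optional[Tuple[str, str]]:
    """Return (url, anon_key), or None if a value is missing from both sources.

    Hypothetical helper: each value is read from the [supabase] secrets table
    first, then from SUPABASE_URL / SUPABASE_ANON_KEY in the environment.
    """
    section = secrets.get("supabase") or {}
    url = section.get("url") or environ.get("SUPABASE_URL")
    key = section.get("anon_key") or environ.get("SUPABASE_ANON_KEY")
    if url and key:
        return url, key
    return None


# Example: no secrets file content, environment variables present
creds = resolve_supabase_credentials(
    secrets={},
    environ={
        "SUPABASE_URL": "https://your-project.supabase.co",
        "SUPABASE_ANON_KEY": "your-anon-key-here",
    },
)
print("configured" if creds else "not configured")
```

In a Streamlit app the first argument would typically come from `st.secrets`. A `None` result corresponds to the "connection not configured" branch that shows the setup instructions.
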
## ⚡ Section 3: Performance and Caching (5 minutes)

Learn essential performance optimization techniques for production dashboards:

In [None]:
# Performance optimization examples
import streamlit as st
import pandas as pd
import time

# Example 1: Data Caching
st.subheader("🚀 Performance Optimization Techniques")

st.markdown("""
### 1. Data Caching with `@st.cache_data`

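**Problem**: Streamlit reruns the script on every user interaction, so uncached database queries run again each time  
**Solution**: Cache expensive operations, with a time-to-live (TTL) so the data stays reasonably fresh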

```python
@st.cache_data(ttl=300)  # Cache for 5 minutes
def load_sales_data():
    # Expensive database query
    return pd.read_sql(query, connection)
```
""")

st.markdown("""
### 2. Resource Caching with `@st.cache_resource`

**Use for**: Database connections, ML models, global objects  

```python
@st.cache_resource
def init_database_connection():
    return create_supabase_client()
```
""")

st.markdown("""
### 3. Conditional Rendering

**Avoid**: Rendering expensive components unnecessarily  

```python
if show_advanced_charts:
    # Only create expensive charts when needed
    create_complex_visualization()
```
""")

st.markdown("""
### 4. Efficient Data Filtering

**Best Practice**: Filter at database level, not in Python  

```python
# Good: Filter in SQL (placeholder syntax depends on the DB driver; %s is psycopg2-style)
query = "SELECT * FROM orders WHERE order_date >= %s"
df = pd.read_sql(query, conn, params=[start_date])

# Avoid: Loading all data then filtering
df = pd.read_sql("SELECT * FROM orders", conn)
df = df[df['order_date'] >= start_date]
```
""")

# Interactive performance demo
st.markdown("---")
st.subheader("🔬 Performance Comparison Demo")

demo_col1, demo_col2 = st.columns(2)

with demo_col1:
    st.markdown("**Without Caching:**")
    if st.button("Load Data (No Cache)"):
        start_time = time.time()
        # Simulate expensive operation
        time.sleep(2)
        data = pd.DataFrame({'x': range(1000), 'y': range(1000)})
        end_time = time.time()
        st.write(f"⏱️ Time: {end_time - start_time:.2f} seconds")
        st.write(f"📊 Loaded {len(data)} records")

with demo_col2:
    st.markdown("**With Caching:**")
    
    @st.cache_data
    def load_cached_data():
        time.sleep(2)  # Simulate expensive operation
        return pd.DataFrame({'x': range(1000), 'y': range(1000)})
    
    if st.button("Load Data (Cached)"):
        start_time = time.time()
        data = load_cached_data()
        end_time = time.time()
        st.write(f"⏱️ Time: {end_time - start_time:.2f} seconds")
        st.write(f"📊 Loaded {len(data)} records")
        st.info("🚀 Subsequent calls will be instant!")

print("✅ Performance optimization examples ready!")
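
To make the `ttl` idea from technique 1 concrete, here is a conceptual sketch of time-based caching using only the standard library. It is not how Streamlit implements `@st.cache_data`. The decorator name `ttl_cache` and the injectable `clock` parameter are assumptions, chosen so the expiry behaviour can be followed without waiting five minutes.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds, clock=time.monotonic):
    """Cache a zero-argument function's result for ttl_seconds (illustrative only)."""
    def decorator(func):
        state = {"value": None, "expires_at": None}

        @wraps(func)
        def wrapper():
            now = clock()
            if state["expires_at"] is None or now >= state["expires_at"]:
                # Cache miss or expired entry: recompute and reset the expiry
                state["value"] = func()
                state["expires_at"] = now + ttl_seconds
            return state["value"]
        return wrapper
    return decorator


# Demo with a fake clock so no real waiting is needed
fake_now = [0.0]
query_runs = []


@ttl_cache(ttl_seconds=300, clock=lambda: fake_now[0])
def load_sales_data():
    query_runs.append(1)  # stands in for an expensive database query
    return {"rows": 1000}


load_sales_data()    # first call: runs the "query"
load_sales_data()    # within the TTL: served from the cache
fake_now[0] = 301.0  # just over five minutes later
load_sales_data()    # entry expired: runs the "query" again
print(f"Query executed {len(query_runs)} times")
```

Three calls trigger only two executions of the underlying function. A shorter TTL gives fresher data at the cost of more database queries.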

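Technique 4 can be tried without a Supabase project. The sketch below uses Python's built-in `sqlite3` module with an in-memory table standing in for `orders`. The sample rows are made up for illustration. Note that SQLite uses `?` placeholders, whereas the snippet in technique 4 uses `%s`.

```python
import sqlite3

# In-memory stand-in for the orders table (illustrative rows only)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, order_date TEXT, order_value REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("A1", "2025-05-30", 120.0),
        ("A2", "2025-06-01", 75.5),
        ("A3", "2025-06-03", 210.0),
    ],
)

start_date = "2025-06-01"

# Filter in SQL: only matching rows leave the database
filtered = conn.execute(
    "SELECT order_id, order_value FROM orders WHERE order_date >= ? ORDER BY order_date",
    (start_date,),
).fetchall()

# Load everything, then filter in Python: every row is transferred first
all_rows = conn.execute("SELECT order_id, order_date, order_value FROM orders").fetchall()
filtered_in_python = [(oid, val) for oid, date, val in all_rows if date >= start_date]

print(f"Rows transferred: {len(filtered)} (SQL filter) vs {len(all_rows)} (load all)")
conn.close()
```

Both approaches return the same orders, but the SQL filter moves fewer rows. Passing `start_date` as a parameter rather than formatting it into the query string also keeps user input out of the SQL text.
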
## 🎯 Key Takeaways

✅ **Interactive Visualizations**: Plotly integration for professional charts  
✅ **Live Data Connectivity**: Supabase integration for real-time dashboards  
✅ **Performance Optimization**: Caching strategies for production apps  
✅ **Business Intelligence**: Executive-level dashboard design  
✅ **Data Export**: CSV downloads and sharing capabilities  

## 🔜 What's Next

Tomorrow (Thursday), we'll dive into advanced Streamlit features:

**Thursday Topics:**
- Multi-page applications and navigation
- Advanced business dashboard patterns
- Production deployment to Streamlit Cloud
- Security and authentication considerations

---

## 💼 Today's Assignment: Basic Streamlit Dashboard

**Create your first business intelligence dashboard:**

### Requirements:
1. **Data Source**: Connect to Olist sample data (provided)
2. **KPI Metrics**: Display 4-6 key business metrics
3. **Interactive Filters**: Date range, category, and state filters
4. **Visualizations**: 2-3 charts showing business insights
5. **Professional Design**: Clean layout with proper styling

### Business Focus:
- **Audience**: E-commerce operations manager
- **Goal**: Daily performance monitoring
- **Key Questions**: 
  - How are sales trending?
  - Which categories perform best?
  - What's our customer satisfaction?

### Deliverable:
- Working Streamlit app (`.py` file)
- Screenshots of dashboard in action
- Brief explanation of business insights discovered

### Due: Before Thursday's class

**Tip**: Start with the interactive dashboard template from today's session and customize it for the assignment requirements.

---

*Next: [Thursday - Advanced Streamlit Features →](02_streamlit_advanced_part1_business_dashboards.ipynb)*