# Polars Data Visualization - Comprehensive Workshop

Master data visualization with Polars using Matplotlib and Plotly.

## What You'll Learn:
- Converting Polars DataFrames for plotting
- Matplotlib integration (static plots)
- Plotly integration (interactive plots)
- Line charts, bar charts, scatter plots, histograms
- Time series visualization
- Multi-panel plots and subplots
- Customizing styles and themes
- Real-world dashboard examples

## Why Visualize with Polars?
- ✅ Fast data preparation with Polars expressions
- ✅ Efficient aggregation before plotting
- ✅ Seamless integration with popular viz libraries
- ✅ Handle large datasets efficiently

## Libraries:
- **Matplotlib**: Static plots, publication-quality graphics
- **Plotly**: Interactive plots, web-ready visualizations

In [None]:
import polars as pl
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
from datetime import date, datetime, timedelta

# Set style for matplotlib
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline

print(f"Polars version: {pl.__version__}")

---
# Part 1: Data Preparation for Plotting

## 1.1 Creating Sample Data

In [None]:
# Create sample sales data
np.random.seed(42)
dates = [date(2024, 1, 1) + timedelta(days=i) for i in range(365)]

df_sales = pl.DataFrame({
    'date': dates,
    'product': np.random.choice(['Laptop', 'Phone', 'Tablet', 'Headphones'], 365),
    'region': np.random.choice(['North', 'South', 'East', 'West'], 365),
    'revenue': np.random.randint(100, 2000, 365),
    'units_sold': np.random.randint(1, 20, 365)
})

print("Sample data:")
print(df_sales.head(10))
print(f"\nTotal rows: {len(df_sales)}")

## 1.2 Converting Polars to Plotting-Friendly Formats

In [None]:
# Method 1: Extract columns as Python lists (for matplotlib)
dates_list = df_sales['date'].to_list()
revenue_list = df_sales['revenue'].to_list()

print(f"As lists: {len(dates_list)} dates, {len(revenue_list)} revenue values")

# Method 2: Convert to NumPy arrays (faster for matplotlib)
dates_np = df_sales['date'].to_numpy()
revenue_np = df_sales['revenue'].to_numpy()

print(f"As NumPy: {dates_np.shape}, {revenue_np.shape}")

# Method 3: Pass the Polars DataFrame directly (accepted by recent Plotly Express
# releases; on older versions, convert first with df_sales.to_pandas())
print("\nPlotly Express can use Polars DataFrames directly!")

---
# Part 2: Matplotlib - Static Plots

## 2.1 Line Charts

In [None]:
# Simple line chart
plt.figure(figsize=(12, 6))
plt.plot(df_sales['date'].to_list(), df_sales['revenue'].to_list())
plt.title('Daily Revenue Over Time', fontsize=16, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Revenue ($)', fontsize=12)
plt.xticks(rotation=45)
plt.tight_layout()
plt.grid(True, alpha=0.3)
plt.show()

In [None]:
# Multiple lines - Revenue by Product
products = df_sales['product'].unique().sort().to_list()  # unique() alone has no guaranteed order

plt.figure(figsize=(14, 7))

for product in products:
    product_data = df_sales.filter(pl.col('product') == product).sort('date')
    
    # 7-row rolling mean: each product has gaps between sale dates,
    # so the window spans 7 sales rather than exactly 7 calendar days
    product_data = product_data.with_columns(
        pl.col('revenue').rolling_mean(window_size=7).alias('revenue_ma7')
    )
    
    plt.plot(
        product_data['date'].to_list(),
        product_data['revenue_ma7'].to_list(),
        label=product,
        linewidth=2
    )

plt.title('Revenue Trend by Product (7-Day Moving Average)', fontsize=16, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Revenue ($)', fontsize=12)
plt.legend(loc='best', fontsize=10)
plt.xticks(rotation=45)
plt.tight_layout()
plt.grid(True, alpha=0.3)
plt.show()

## 2.2 Bar Charts

In [None]:
# Total revenue by product
product_revenue = (
    df_sales
    .group_by('product')
    .agg(pl.col('revenue').sum().alias('total_revenue'))
    .sort('total_revenue', descending=True)
)

plt.figure(figsize=(10, 6))
plt.bar(
    product_revenue['product'].to_list(),
    product_revenue['total_revenue'].to_list(),
    color=['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728']
)
plt.title('Total Revenue by Product', fontsize=16, fontweight='bold')
plt.xlabel('Product', fontsize=12)
plt.ylabel('Total Revenue ($)', fontsize=12)
plt.xticks(rotation=0)
plt.tight_layout()
plt.grid(axis='y', alpha=0.3)

# Add value labels on bars
for i, revenue in enumerate(product_revenue['total_revenue'].to_list()):
    plt.text(i, revenue + 1000, f'${revenue:,.0f}', ha='center', fontsize=10)

plt.show()
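
As an alternative to positioning each label by hand, Matplotlib 3.4 and later provide `Axes.bar_label`. The sketch below uses hypothetical totals in place of `product_revenue`, and selects the `Agg` backend only so that it runs outside a notebook (an assumption of this example; omit that line when working inline).

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend so the sketch runs outside a notebook
import matplotlib.pyplot as plt

# Hypothetical totals, standing in for product_revenue
names = ["Laptop", "Phone", "Tablet"]
totals = [98000, 91500, 87250]

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.bar(names, totals)

# bar_label creates one text per bar; padding is in points,
# so the offset does not depend on the scale of the y-axis
label_texts = ax.bar_label(bars, labels=[f"${v:,.0f}" for v in totals], padding=3)
plt.close(fig)
```

Because the padding is expressed in points rather than data units, the labels stay close to the bars whether the totals are in the hundreds or the millions.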

In [None]:
# Grouped bar chart - Revenue by Product and Region
product_region = (
    df_sales
    .group_by(['product', 'region'])
    .agg(pl.col('revenue').sum().alias('total_revenue'))
    .sort(['product', 'region'])
)

# Sort so bar order and legend order are the same on every run
products = product_region['product'].unique().sort().to_list()
regions = product_region['region'].unique().sort().to_list()

fig, ax = plt.subplots(figsize=(12, 6))

x = np.arange(len(products))
width = 0.2

for i, region in enumerate(regions):
    region_data = product_region.filter(pl.col('region') == region)
    revenues = []
    
    for product in products:
        prod_revenue = region_data.filter(pl.col('product') == product)['total_revenue']
        revenues.append(prod_revenue[0] if len(prod_revenue) > 0 else 0)
    
    ax.bar(x + i * width, revenues, width, label=region)

ax.set_title('Revenue by Product and Region', fontsize=16, fontweight='bold')
ax.set_xlabel('Product', fontsize=12)
ax.set_ylabel('Total Revenue ($)', fontsize=12)
ax.set_xticks(x + width * (len(regions) - 1) / 2)  # center each tick under its group
ax.set_xticklabels(products)
ax.legend()
ax.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

## 2.3 Scatter Plots

In [None]:
# Scatter plot: Units Sold vs Revenue
plt.figure(figsize=(10, 6))

# Sorted so each product keeps the same color on every run
products = df_sales['product'].unique().sort().to_list()
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728']

for product, color in zip(products, colors):
    product_data = df_sales.filter(pl.col('product') == product)
    plt.scatter(
        product_data['units_sold'].to_list(),
        product_data['revenue'].to_list(),
        label=product,
        alpha=0.6,
        s=50,
        color=color
    )

plt.title('Revenue vs Units Sold by Product', fontsize=16, fontweight='bold')
plt.xlabel('Units Sold', fontsize=12)
plt.ylabel('Revenue ($)', fontsize=12)
plt.legend(loc='best')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## 2.4 Histograms

In [None]:
# Distribution of revenue
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Histogram
axes[0].hist(df_sales['revenue'].to_list(), bins=30, color='skyblue', edgecolor='black')
axes[0].set_title('Revenue Distribution', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Revenue ($)', fontsize=12)
axes[0].set_ylabel('Frequency', fontsize=12)
axes[0].grid(axis='y', alpha=0.3)

# Box plot
products = df_sales['product'].unique().sort().to_list()
data_by_product = [df_sales.filter(pl.col('product') == p)['revenue'].to_list() for p in products]

axes[1].boxplot(data_by_product)
axes[1].set_xticklabels(products)  # avoids boxplot's `labels` keyword, deprecated in Matplotlib 3.9
axes[1].set_title('Revenue Distribution by Product', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Product', fontsize=12)
axes[1].set_ylabel('Revenue ($)', fontsize=12)
axes[1].grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

## 2.5 Pie Charts

In [None]:
# Market share by product
product_revenue = (
    df_sales
    .group_by('product')
    .agg(pl.col('revenue').sum().alias('total_revenue'))
    .sort('total_revenue', descending=True)
)

plt.figure(figsize=(10, 8))
plt.pie(
    product_revenue['total_revenue'].to_list(),
    labels=product_revenue['product'].to_list(),
    autopct='%1.1f%%',
    startangle=90,
    colors=['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728']
)
plt.title('Revenue Market Share by Product', fontsize=16, fontweight='bold')
plt.axis('equal')
plt.tight_layout()
plt.show()

## 2.6 Heatmaps

In [None]:
# Create month and product revenue heatmap
monthly_product = (
    df_sales
    .with_columns(pl.col('date').dt.month().alias('month'))
    .group_by(['month', 'product'])
    .agg(pl.col('revenue').sum().alias('total_revenue'))
)

# Pivot for heatmap
pivot = monthly_product.pivot(
    on='product',  # Polars >= 1.0 calls this argument `on` (formerly `columns`)
    index='month',
    values='total_revenue'
).sort('month')

# Convert to numpy for heatmap
products = [col for col in pivot.columns if col != 'month']
data_matrix = pivot.select(products).to_numpy()

plt.figure(figsize=(12, 6))
plt.imshow(data_matrix, cmap='YlOrRd', aspect='auto')
plt.colorbar(label='Revenue ($)')
plt.title('Monthly Revenue Heatmap by Product', fontsize=16, fontweight='bold')
plt.xlabel('Product', fontsize=12)
plt.ylabel('Month', fontsize=12)
plt.xticks(range(len(products)), products, rotation=45)
plt.yticks(range(len(pivot)), pivot['month'].to_list())
plt.tight_layout()
plt.show()

---
# Part 3: Plotly - Interactive Plots

## 3.1 Interactive Line Charts

In [None]:
# Simple interactive line chart
# Plotly works directly with Polars DataFrames!

# Daily revenue
daily_revenue = df_sales.sort('date')

fig = px.line(
    daily_revenue,
    x='date',
    y='revenue',
    title='Daily Revenue (Interactive)',
    labels={'revenue': 'Revenue ($)', 'date': 'Date'}
)
fig.update_layout(hovermode='x unified')
fig.show()

In [None]:
# Multiple lines by product
# Calculate weekly aggregates for cleaner visualization
weekly_revenue = (
    df_sales
    .sort('date')
    .with_columns([
        pl.col('date').dt.truncate('1w').alias('week')
    ])
    .group_by(['week', 'product'])
    .agg(pl.col('revenue').sum().alias('revenue'))
    .sort('week')
)

fig = px.line(
    weekly_revenue,
    x='week',
    y='revenue',
    color='product',
    title='Weekly Revenue by Product',
    labels={'revenue': 'Revenue ($)', 'week': 'Week', 'product': 'Product'}
)
fig.update_layout(hovermode='x unified')
fig.show()

## 3.2 Interactive Bar Charts

In [None]:
# Total revenue by product
product_revenue = (
    df_sales
    .group_by('product')
    .agg([
        pl.col('revenue').sum().alias('total_revenue'),
        pl.col('units_sold').sum().alias('total_units')
    ])
    .sort('total_revenue', descending=True)
)

fig = px.bar(
    product_revenue,
    x='product',
    y='total_revenue',
    title='Total Revenue by Product',
    labels={'total_revenue': 'Total Revenue ($)', 'product': 'Product'},
    color='total_revenue',
    color_continuous_scale='Blues'
)
fig.show()

In [None]:
# Grouped bar chart
product_region = (
    df_sales
    .group_by(['product', 'region'])
    .agg(pl.col('revenue').sum().alias('total_revenue'))
)

fig = px.bar(
    product_region,
    x='product',
    y='total_revenue',
    color='region',
    barmode='group',
    title='Revenue by Product and Region',
    labels={'total_revenue': 'Total Revenue ($)', 'product': 'Product', 'region': 'Region'}
)
fig.show()

## 3.3 Interactive Scatter Plots

In [None]:
# Scatter plot with color and size
fig = px.scatter(
    df_sales,
    x='units_sold',
    y='revenue',
    color='product',
    size='revenue',
    hover_data=['date', 'region'],
    title='Revenue vs Units Sold (Interactive)',
    labels={'units_sold': 'Units Sold', 'revenue': 'Revenue ($)'}
)
fig.show()

## 3.4 Interactive Box Plots

In [None]:
# Box plot by product and region
fig = px.box(
    df_sales,
    x='product',
    y='revenue',
    color='region',
    title='Revenue Distribution by Product and Region',
    labels={'revenue': 'Revenue ($)', 'product': 'Product'}
)
fig.show()

## 3.5 Interactive Heatmaps

In [None]:
# Create heatmap data
monthly_product = (
    df_sales
    .with_columns(pl.col('date').dt.month().alias('month'))
    .group_by(['month', 'product'])
    .agg(pl.col('revenue').sum().alias('total_revenue'))
)

# Pivot for heatmap
pivot = monthly_product.pivot(
    on='product',  # Polars >= 1.0 calls this argument `on` (formerly `columns`)
    index='month',
    values='total_revenue'
).sort('month')

# Create heatmap
products = [col for col in pivot.columns if col != 'month']
z_data = pivot.select(products).to_numpy()

fig = go.Figure(data=go.Heatmap(
    z=z_data,
    x=products,
    y=pivot['month'].to_list(),
    colorscale='YlOrRd',
    text=z_data,
    texttemplate='$%{text:,.0f}',
    textfont={"size": 10},
    colorbar=dict(title="Revenue ($)")
))

fig.update_layout(
    title='Monthly Revenue Heatmap by Product',
    xaxis_title='Product',
    yaxis_title='Month'
)
fig.show()

## 3.6 Sunburst Charts

In [None]:
# Hierarchical view: Region > Product
hierarchy_data = (
    df_sales
    .group_by(['region', 'product'])
    .agg(pl.col('revenue').sum().alias('total_revenue'))
)

fig = px.sunburst(
    hierarchy_data,
    path=['region', 'product'],
    values='total_revenue',
    title='Revenue Hierarchy: Region → Product',
    color='total_revenue',
    color_continuous_scale='RdYlGn'
)
fig.show()

---
# Part 4: Time Series Visualization

## 4.1 Rolling Averages and Trends

In [None]:
# Calculate multiple rolling windows
time_series = (
    df_sales
    .group_by('date')
    .agg(pl.col('revenue').sum().alias('revenue'))
    .sort('date')  # group_by does not preserve order, so sort before the rolling windows
    .with_columns([
        pl.col('revenue').rolling_mean(window_size=7).alias('ma_7day'),
        pl.col('revenue').rolling_mean(window_size=30).alias('ma_30day')
    ])
)

# Plotly for interactive time series
fig = go.Figure()

fig.add_trace(go.Scatter(
    x=time_series['date'].to_list(),
    y=time_series['revenue'].to_list(),
    mode='lines',
    name='Daily Revenue',
    line=dict(color='lightgray', width=1),
    opacity=0.5
))

fig.add_trace(go.Scatter(
    x=time_series['date'].to_list(),
    y=time_series['ma_7day'].to_list(),
    mode='lines',
    name='7-Day MA',
    line=dict(color='blue', width=2)
))

fig.add_trace(go.Scatter(
    x=time_series['date'].to_list(),
    y=time_series['ma_30day'].to_list(),
    mode='lines',
    name='30-Day MA',
    line=dict(color='red', width=2)
))

fig.update_layout(
    title='Revenue with Moving Averages',
    xaxis_title='Date',
    yaxis_title='Revenue ($)',
    hovermode='x unified'
)
fig.show()

## 4.2 Seasonality and Patterns

In [None]:
# Day of week patterns
dow_revenue = (
    df_sales
    .with_columns([
        pl.col('date').dt.weekday().alias('day_of_week'),
        pl.col('date').dt.strftime('%A').alias('day_name')
    ])
    .group_by(['day_of_week', 'day_name'])
    .agg(pl.col('revenue').mean().alias('avg_revenue'))
    .sort('day_of_week')
)

fig = px.bar(
    dow_revenue,
    x='day_name',
    y='avg_revenue',
    title='Average Revenue by Day of Week',
    labels={'avg_revenue': 'Average Revenue ($)', 'day_name': 'Day of Week'},
    color='avg_revenue',
    color_continuous_scale='Viridis'
)
fig.show()

---
# Part 5: Multi-Panel Dashboards

## 5.1 Matplotlib Subplots

In [None]:
# Create comprehensive dashboard
fig, axes = plt.subplots(2, 2, figsize=(16, 12))
fig.suptitle('Sales Dashboard', fontsize=20, fontweight='bold')

# 1. Time series (top-left)
daily = df_sales.group_by('date').agg(pl.col('revenue').sum().alias('revenue')).sort('date')
daily = daily.with_columns(pl.col('revenue').rolling_mean(window_size=7).alias('ma7'))
axes[0, 0].plot(daily['date'].to_list(), daily['ma7'].to_list(), linewidth=2, color='blue')
axes[0, 0].set_title('Daily Revenue Trend (7-Day MA)', fontsize=14, fontweight='bold')
axes[0, 0].set_xlabel('Date')
axes[0, 0].set_ylabel('Revenue ($)')
axes[0, 0].grid(True, alpha=0.3)
axes[0, 0].tick_params(axis='x', rotation=45)

# 2. Product revenue (top-right)
product_rev = df_sales.group_by('product').agg(pl.col('revenue').sum().alias('revenue')).sort('revenue', descending=True)
axes[0, 1].barh(product_rev['product'].to_list(), product_rev['revenue'].to_list(), color='skyblue')
axes[0, 1].set_title('Total Revenue by Product', fontsize=14, fontweight='bold')
axes[0, 1].set_xlabel('Revenue ($)')
axes[0, 1].grid(axis='x', alpha=0.3)

# 3. Regional distribution (bottom-left)
region_rev = df_sales.group_by('region').agg(pl.col('revenue').sum().alias('revenue'))
axes[1, 0].pie(
    region_rev['revenue'].to_list(),
    labels=region_rev['region'].to_list(),
    autopct='%1.1f%%',
    startangle=90
)
axes[1, 0].set_title('Revenue by Region', fontsize=14, fontweight='bold')

# 4. Units distribution (bottom-right)
axes[1, 1].hist(df_sales['units_sold'].to_list(), bins=20, color='coral', edgecolor='black')
axes[1, 1].set_title('Units Sold Distribution', fontsize=14, fontweight='bold')
axes[1, 1].set_xlabel('Units Sold')
axes[1, 1].set_ylabel('Frequency')
axes[1, 1].grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

## 5.2 Plotly Subplots (Interactive Dashboard)

In [None]:
from plotly.subplots import make_subplots

# Create subplots
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=('Daily Revenue Trend', 'Revenue by Product', 'Revenue by Region', 'Units Distribution'),
    specs=[[{"type": "scatter"}, {"type": "bar"}],
           [{"type": "pie"}, {"type": "histogram"}]]
)

# 1. Time series
daily = df_sales.group_by('date').agg(pl.col('revenue').sum().alias('revenue')).sort('date')
daily = daily.with_columns(pl.col('revenue').rolling_mean(window_size=7).alias('ma7'))
fig.add_trace(
    go.Scatter(x=daily['date'].to_list(), y=daily['ma7'].to_list(), name='7-Day MA', line=dict(color='blue')),
    row=1, col=1
)

# 2. Product revenue
product_rev = df_sales.group_by('product').agg(pl.col('revenue').sum().alias('revenue')).sort('revenue', descending=True)
fig.add_trace(
    go.Bar(x=product_rev['product'].to_list(), y=product_rev['revenue'].to_list(), name='Product Revenue'),
    row=1, col=2
)

# 3. Regional pie
region_rev = df_sales.group_by('region').agg(pl.col('revenue').sum().alias('revenue'))
fig.add_trace(
    go.Pie(labels=region_rev['region'].to_list(), values=region_rev['revenue'].to_list(), name='Region'),
    row=2, col=1
)

# 4. Units histogram
fig.add_trace(
    go.Histogram(x=df_sales['units_sold'].to_list(), name='Units', nbinsx=20),
    row=2, col=2
)

fig.update_layout(height=800, showlegend=False, title_text="Interactive Sales Dashboard")
fig.show()

---
# Part 6: Advanced Customization

## 6.1 Custom Themes and Styles

In [None]:
# Matplotlib: Custom style
plt.style.use('seaborn-v0_8-darkgrid')

custom_colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#FFA07A']

product_rev = df_sales.group_by('product').agg(pl.col('revenue').sum().alias('revenue')).sort('revenue', descending=True)

fig, ax = plt.subplots(figsize=(12, 6))
bars = ax.bar(product_rev['product'].to_list(), product_rev['revenue'].to_list(), color=custom_colors)

# Customize
ax.set_title('Custom Styled Bar Chart', fontsize=18, fontweight='bold', pad=20)
ax.set_xlabel('Product', fontsize=14, fontweight='bold')
ax.set_ylabel('Revenue ($)', fontsize=14, fontweight='bold')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.grid(axis='y', alpha=0.3, linestyle='--')

# Add value labels
for bar in bars:
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2., height,
            f'${height:,.0f}',
            ha='center', va='bottom', fontsize=12, fontweight='bold')

plt.tight_layout()
plt.show()
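
`plt.style.use` changes the style for the rest of the session. To style a single figure, `plt.style.context` can be used as a context manager instead. The sketch below uses the built-in `ggplot` style and hypothetical values, and selects the `Agg` backend only so that it runs outside a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend so the sketch runs outside a notebook
import matplotlib.pyplot as plt

facecolor_before = plt.rcParams["axes.facecolor"]

# The style applies only inside the with-block
with plt.style.context("ggplot"):
    fig, ax = plt.subplots(figsize=(6, 3))
    ax.bar(["A", "B", "C"], [3, 5, 2])  # hypothetical values
    facecolor_inside = plt.rcParams["axes.facecolor"]
    plt.close(fig)

# Settings are restored automatically on exit
facecolor_after = plt.rcParams["axes.facecolor"]
```

This keeps one-off styling from leaking into later cells of the notebook.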

In [None]:
# Plotly: Custom theme
import plotly.graph_objects as go
import plotly.io as pio

# Set template
pio.templates.default = "plotly_dark"

weekly = (
    df_sales
    .with_columns(pl.col('date').dt.truncate('1w').alias('week'))
    .group_by(['week', 'product'])
    .agg(pl.col('revenue').sum().alias('revenue'))
    .sort('week')
)

fig = px.line(
    weekly,
    x='week',
    y='revenue',
    color='product',
    title='Dark Theme - Weekly Revenue',
    template='plotly_dark'
)

fig.update_layout(
    font=dict(size=14),
    hovermode='x unified',
    plot_bgcolor='rgba(0,0,0,0)',
    paper_bgcolor='rgba(0,0,0,0)'
)

fig.show()

# Reset to default
pio.templates.default = "plotly"

---
# Part 7: Performance Tips

## 7.1 Aggregate Before Plotting

In [None]:
# ❌ BAD: Plotting raw data with millions of points
# plt.plot(huge_df['date'].to_list(), huge_df['value'].to_list())

# ✅ GOOD: Aggregate first, then plot
# For time series, aggregate by hour/day/week
aggregated = (
    df_sales
    .with_columns(pl.col('date').dt.truncate('1w').alias('week'))
    .group_by('week')
    .agg([
        pl.col('revenue').sum().alias('revenue'),
        pl.col('units_sold').sum().alias('units')
    ])
    .sort('week')
)

plt.figure(figsize=(12, 6))
plt.plot(aggregated['week'].to_list(), aggregated['revenue'].to_list(), linewidth=2)
plt.title('Weekly Aggregated Revenue (Efficient)', fontsize=14, fontweight='bold')
plt.xlabel('Week')
plt.ylabel('Revenue ($)')
plt.xticks(rotation=45)
plt.tight_layout()
plt.grid(True, alpha=0.3)
plt.show()

print(f"Reduced from {len(df_sales)} to {len(aggregated)} points!")

## 7.2 Use Lazy Evaluation

In [None]:
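# Use lazy evaluation for large datasets. When the data lives in files, start from
# pl.scan_csv / pl.scan_parquet so that filters can be pushed down to the reader.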
lazy_result = (
    df_sales.lazy()
    .filter(pl.col('product') == 'Laptop')
    .group_by('date')
    .agg(pl.col('revenue').sum().alias('revenue'))
    .sort('date')
    .collect()  # Only collect when ready to plot
)

# Now plot
fig = px.line(lazy_result, x='date', y='revenue', title='Laptop Revenue (Lazy Evaluation)')
fig.show()

---
# Summary

## Key Takeaways:

### 1. **Choosing the Right Library**
   - **Matplotlib**: Publication-quality static plots, full customization
   - **Plotly**: Interactive plots, web dashboards, renders in the notebook or browser without a separate server

### 2. **Data Preparation**
   - Aggregate data before plotting (performance!)
   - Use Polars expressions for efficient transformations
   - Convert to lists/numpy for Matplotlib
   - Plotly works directly with Polars DataFrames!

### 3. **Common Plot Types**

| Plot Type | Use Case | Matplotlib | Plotly |
|-----------|----------|------------|--------|
| **Line** | Time series, trends | `plt.plot()` | `px.line()` |
| **Bar** | Categorical comparison | `plt.bar()` | `px.bar()` |
| **Scatter** | Correlations | `plt.scatter()` | `px.scatter()` |
| **Histogram** | Distributions | `plt.hist()` | `px.histogram()` |
| **Box** | Statistical summary | `plt.boxplot()` | `px.box()` |
| **Heatmap** | Matrix data | `plt.imshow()` | `go.Heatmap()` |
| **Pie** | Part-to-whole | `plt.pie()` | `px.pie()` |

### 4. **Best Practices**
   - ✅ Aggregate large datasets before plotting
   - ✅ Use meaningful titles and labels
   - ✅ Choose appropriate plot types for your data
   - ✅ Add legends for multiple series
   - ✅ Use color purposefully
   - ✅ Consider your audience (static vs interactive)

### 5. **Performance Tips**
   - Use lazy evaluation for large datasets
   - Aggregate time series (hourly/daily/weekly)
   - Sample data if appropriate (`.sample()`)
   - Use NumPy arrays for Matplotlib (faster)

### 6. **Common Patterns**

```python
# Matplotlib pattern
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(df['x'].to_list(), df['y'].to_list())
ax.set_title('Title')
plt.show()

# Plotly pattern (works with Polars directly!)
fig = px.line(df, x='x', y='y', title='Title')
fig.show()
```

## Remember:
> **Prepare data with Polars, visualize with Matplotlib/Plotly!**  
> The best visualization is one that clearly communicates your insights!

---
# Practice Exercises

In [None]:
# Exercise data - E-commerce store
np.random.seed(123)
dates_exercise = [date(2024, 1, 1) + timedelta(days=i) for i in range(180)]

df_exercise = pl.DataFrame({
    'date': dates_exercise,
    'category': np.random.choice(['Electronics', 'Clothing', 'Books', 'Home'], 180),
    'sales': np.random.randint(50, 500, 180),
    'customers': np.random.randint(1, 50, 180),
    'returns': np.random.randint(0, 20, 180)
})

print("Exercise data:")
print(df_exercise.head(10))

In [None]:
# Exercise 1: Create a line chart showing total sales over time (weekly aggregation)
# Use Matplotlib or Plotly
# Your code here:


In [None]:
# Exercise 2: Create a grouped bar chart showing total sales by category, grouped by month
# Your code here:


In [None]:
# Exercise 3: Create a scatter plot showing relationship between customers and sales
# Color by category
# Your code here:


In [None]:
# Exercise 4: Create a 2x2 dashboard showing:
# - Time series of sales
# - Bar chart of sales by category
# - Box plot of return rates
# - Histogram of daily customers
# Your code here:


In [None]:
# Exercise 5: Create an interactive Plotly chart with:
# - Line showing 7-day and 30-day moving averages of sales
# - Dropdown to select category
# Your code here:
