# Interactive Sankey Library - Comprehensive Demonstrations

This notebook demonstrates all the features of the interactive Sankey diagram library.

## Features Demonstrated:
1. Basic Sankey Diagram
2. Time Series Animation
3. Histogram Edge Coloring
4. Interactive Filters
5. Custom Metrics
6. Complete Dashboard (All Features Combined)

In [1]:
# Setup and imports
import sys
sys.path.insert(0, '../src')

import pandas as pd
import numpy as np
from sankey_interactive import SankeyDiagram

print("✅ All imports successful!")

✅ All imports successful!


In [2]:
# Create sample time series data for energy flows
np.random.seed(42)

# 'ME' (month end) requires pandas 2.2 or newer; older versions use 'M'
dates = pd.date_range('2024-01', periods=12, freq='ME')
data = []

sources = ['Solar', 'Wind', 'Coal', 'Nuclear', 'Gas']
intermediates = ['Grid', 'Storage', 'Direct']
targets = ['Residential', 'Commercial', 'Industrial']

for date in dates:
    # Simulate seasonal variations
    month = date.month
    solar_factor = 1 + 0.3 * np.sin(2 * np.pi * month / 12)
    wind_factor = 1 + 0.2 * np.cos(2 * np.pi * month / 12)
    
    # Create flows from sources to intermediates
    for source in sources:
        for intermediate in intermediates:
            if source == 'Solar':
                base_value = 100 * solar_factor
            elif source == 'Wind':
                base_value = 80 * wind_factor
            else:
                base_value = np.random.randint(50, 150)
            
            value = base_value * np.random.uniform(0.8, 1.2)
            
            data.append({
                'date': date,
                'source': source,
                'target': intermediate,
                'value': value,
                'type': 'generation'
            })
    
    # Create flows from intermediates to targets
    for intermediate in intermediates:
        for target in targets:
            value = np.random.uniform(50, 200)
            data.append({
                'date': date,
                'source': intermediate,
                'target': target,
                'value': value,
                'type': 'distribution'
            })

df = pd.DataFrame(data)
print(f"✅ Created sample data with {len(df)} flow records across {len(dates)} time periods")
df.head(10)

✅ Created sample data with 288 flow records across 12 time periods


| | date | source | target | value | type |
|---|---|---|---|---|---|
| 0 | 2024-01-31 | Solar | Grid | 109.228845 | generation |
| 1 | 2024-01-31 | Solar | Storage | 135.732858 | generation |
| 2 | 2024-01-31 | Solar | Direct | 125.671721 | generation |
| 3 | 2024-01-31 | Wind | Grid | 97.560299 | generation |
| 4 | 2024-01-31 | Wind | Storage | 80.942465 | generation |
| 5 | 2024-01-31 | Wind | Direct | 80.941559 | generation |
| 6 | 2024-01-31 | Coal | Grid | 121.978745 | generation |
| 7 | 2024-01-31 | Coal | Storage | 127.714862 | generation |
| 8 | 2024-01-31 | Coal | Direct | 42.028157 | generation |
| 9 | 2024-01-31 | Nuclear | Grid | 55.528775 | generation |
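
The seasonal factors in the data-generation cell are plain sinusoids of the month number. The sketch below tabulates them with the same formulas and checks the record count; it is only an illustration and not part of the library.

```python
import math

# Same formulas as in the data-generation cell, evaluated for months 1..12.
months = range(1, 13)
solar = {m: 1 + 0.3 * math.sin(2 * math.pi * m / 12) for m in months}
wind = {m: 1 + 0.2 * math.cos(2 * math.pi * m / 12) for m in months}

for m in months:
    print(f"month {m:2d}: solar x{solar[m]:.3f}, wind x{wind[m]:.3f}")

# Solar peaks in March (sin = 1) and bottoms out in September (sin = -1);
# wind peaks in December (cos = 1) and bottoms out in June (cos = -1).
solar_peak = max(solar, key=solar.get)
wind_peak = max(wind, key=wind.get)

# Record count: each period has 5*3 generation flows and 3*3 distribution flows.
records = 12 * (5 * 3 + 3 * 3)
```

The factors scale only the Solar and Wind base values; the other sources draw a random base value each month.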


## Demo 1: Basic Sankey Diagram

Simple flow visualization showing basic node-to-node connections.

In [3]:
# Demo 1: Basic Sankey
simple_data = pd.DataFrame({
    'source': ['A', 'A', 'B', 'B', 'C'],
    'target': ['D', 'E', 'D', 'E', 'E'],
    'value': [10, 20, 15, 5, 25]
})

diagram = SankeyDiagram(simple_data)
fig = diagram.render('source', 'target', 'value', 
                     title="Demo 1: Basic Sankey Diagram",
                     show_timeline=False)
fig.show()
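
Sankey renderers generally work on integer node indices rather than labels; Plotly's `go.Sankey` trace, for instance, takes `link.source` and `link.target` as indices into `node.label`. How `SankeyDiagram` does this internally is not shown in this notebook, so the following is only a sketch of the usual mapping, and the helper name `to_sankey_arrays` is hypothetical.

```python
import pandas as pd

def to_sankey_arrays(df, source_col, target_col, value_col):
    """Hypothetical helper: convert a flow table to label/index arrays."""
    # Collect node labels in order of first appearance, sources first.
    labels = list(pd.unique(pd.concat([df[source_col], df[target_col]])))
    index = {label: i for i, label in enumerate(labels)}
    return {
        "labels": labels,
        "source": df[source_col].map(index).tolist(),
        "target": df[target_col].map(index).tolist(),
        "value": df[value_col].tolist(),
    }

simple_data = pd.DataFrame({
    "source": ["A", "A", "B", "B", "C"],
    "target": ["D", "E", "D", "E", "E"],
    "value": [10, 20, 15, 5, 25],
})
arrays = to_sankey_arrays(simple_data, "source", "target", "value")
print(arrays["labels"])
print(arrays["source"], arrays["target"])
```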

## Demo 2: Time Series Animation ⏱️

Interactive timeline with Play/Pause controls. Click **Play** to see the flows change over time!
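
A timeline of this kind needs one flow table per time step. The library's internals are not shown here, so the following is only a sketch of that splitting step using `groupby`; `split_by_period` is a hypothetical name.

```python
import pandas as pd

def split_by_period(df, time_column):
    """Hypothetical helper: one flow table per distinct timestamp, in order."""
    return {
        period: frame.drop(columns=time_column).reset_index(drop=True)
        for period, frame in df.groupby(time_column, sort=True)
    }

flows = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-31", "2024-01-31", "2024-02-29"]),
    "source": ["Solar", "Wind", "Solar"],
    "target": ["Grid", "Grid", "Grid"],
    "value": [110.0, 95.0, 120.0],
})
frames = split_by_period(flows, "date")
for period, frame in frames.items():
    print(period.date(), frame["value"].sum())
```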

In [4]:
# Demo 2: Time Series with Animation
generation_data = df[df['type'] == 'generation'].copy()

diagram = SankeyDiagram(generation_data, time_column='date')
fig = diagram.render('source', 'target', 'value',
                     title="Demo 2: Energy Generation Over Time (Click Play!)",
                     show_timeline=True,
                     show_histogram=False)
fig.show()

TypeError: SankeyDiagram._create_timeline_figure() takes 7 positional arguments but 9 were given
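
The counts in this message include the implicit `self`: the method declares six parameters besides `self` but was called with eight positional arguments. The library source is not part of this notebook, so the toy class below only reproduces the same arithmetic; its name and parameters are made up for illustration.

```python
class Toy:
    # Six explicit parameters plus self: "takes 7 positional arguments".
    def build(self, a, b, c, d, e, f):
        return (a, b, c, d, e, f)

try:
    # Eight explicit arguments plus self: "but 9 were given".
    Toy().build(1, 2, 3, 4, 5, 6, 7, 8)
except TypeError as exc:
    message = str(exc)
    print(message)
```

A mismatch like this usually means the call site and the method signature have drifted apart, for example when new options were added to the call but not to the method.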

## Demo 3: Histogram Edge Coloring 🌈

Edges are colored based on flow values:
- **Blue** = Low values
- **Red** = High values

In [None]:
# Demo 3: Histogram Edge Coloring
first_month = df['date'].min()
monthly_data = df[df['date'] == first_month].copy()

diagram = SankeyDiagram(monthly_data)
fig = diagram.render('source', 'target', 'value',
                     title="Demo 3: Energy Flows with Value-Based Coloring",
                     show_histogram=True,
                     show_timeline=False)
fig.show()
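
The exact colormap behind `show_histogram=True` is not documented in this notebook. As a rough sketch of the idea, the code below min-max normalizes the values and interpolates linearly between blue and red; the function name `value_to_rgb` is an assumption.

```python
def value_to_rgb(value, vmin, vmax):
    """Hypothetical helper: map a value to an 'rgb(r, g, b)' string, blue (low) to red (high)."""
    # Guard against a zero range so a constant column maps to the low color.
    t = 0.0 if vmax == vmin else (value - vmin) / (vmax - vmin)
    t = min(max(t, 0.0), 1.0)  # clamp values outside [vmin, vmax]
    red = round(255 * t)
    blue = round(255 * (1 - t))
    return f"rgb({red}, 0, {blue})"

values = [10, 20, 15, 5, 25]
colors = [value_to_rgb(v, min(values), max(values)) for v in values]
print(colors)
```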

## Demo 4: Interactive Filters 🔍

Multiple filters applied:
1. Only flows with value > 100
2. Only renewable sources (Solar, Wind)

In [None]:
# Demo 4: Filtered Sankey
diagram = SankeyDiagram(df, time_column='date')

# Add filter to show only high-value flows
diagram.add_filter('high_value', lambda data: data[data['value'] > 100])

# Add filter to show only renewable sources
diagram.add_filter('renewable', 
                  lambda data: data[data['source'].isin(['Solar', 'Wind'])])

fig = diagram.render('source', 'target', 'value',
                     title="Demo 4: Filtered - High-Value Renewable Flows Only",
                     show_timeline=True)
fig.show()
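
The notebook does not show how `add_filter` combines filters; the description above implies they are applied together, so a row must pass every filter. A sketch of that behavior, chaining named filter functions with `functools.reduce` (the `apply_filters` helper is hypothetical):

```python
from functools import reduce
import pandas as pd

def apply_filters(df, filters):
    """Hypothetical helper: apply each named filter in insertion order."""
    return reduce(lambda frame, fn: fn(frame), filters.values(), df)

flows = pd.DataFrame({
    "source": ["Solar", "Wind", "Coal", "Solar"],
    "target": ["Grid", "Grid", "Grid", "Storage"],
    "value": [120.0, 90.0, 140.0, 101.0],
})
filters = {
    "high_value": lambda data: data[data["value"] > 100],
    "renewable": lambda data: data[data["source"].isin(["Solar", "Wind"])],
}
filtered = apply_filters(flows, filters)
print(filtered)
```

Because both filters here are row-wise conditions, the order in which they run does not change the result.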

## Demo 5: Custom Metrics 📐

Custom functions define:
- Node sizing based on total throughput
- Edge coloring based on value distribution

In [None]:
# Demo 5: Custom Metrics
first_month_data = df[df['date'] == df['date'].min()].copy()

diagram = SankeyDiagram(first_month_data)

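# Custom node metric: total flow per node.
# Nodes that are both source and target (Grid, Storage, Direct) keep their
# inflow sum, because the later dict in the merge overrides the earlier one.
def node_metric(data):
    sources = data.groupby('source')['value'].sum().to_dict()
    targets = data.groupby('target')['value'].sum().to_dict()
    return {**sources, **targets}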

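# Custom edge metric: spread (sample standard deviation) of flow values per edge.
# A single month has one row per edge, so std() yields NaN there; fill with 0.0.
def edge_metric(data):
    return data.groupby(['source', 'target'])['value'].std().fillna(0.0).to_dict()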

diagram.set_node_metric(node_metric)
diagram.set_edge_metric(edge_metric)

fig = diagram.render('source', 'target', 'value',
                     title="Demo 5: Sankey with Custom Metrics",
                     show_histogram=True,
                     show_timeline=False)
fig.show()
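
When a node appears as both a source and a target, as `Grid`, `Storage`, and `Direct` do in this data, there is a choice in how to combine the per-source and per-target sums. Merging two dicts with `{**a, **b}` keeps only the second dict's value for shared keys, while `Series.add(..., fill_value=0)` sums inflow and outflow. Which definition of throughput the library expects is not stated here, so the sketch below only contrasts the two on a toy table.

```python
import pandas as pd

flows = pd.DataFrame({
    "source": ["Solar", "Wind", "Grid"],
    "target": ["Grid", "Grid", "Residential"],
    "value": [100.0, 80.0, 150.0],
})
outflow = flows.groupby("source")["value"].sum()
inflow = flows.groupby("target")["value"].sum()

# Dict merge: for shared keys the right-hand dict wins, so Grid keeps inflow only.
merged = {**outflow.to_dict(), **inflow.to_dict()}

# Series.add with fill_value=0: nodes missing on one side count as 0 there.
summed = outflow.add(inflow, fill_value=0).to_dict()

print(merged["Grid"], summed["Grid"])
```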

## Demo 6: Complete Interactive Dashboard 🚀

**ALL FEATURES COMBINED!**
- ✅ Timeline animation
- ✅ Histogram coloring
- ✅ Filters applied
- ✅ Custom metrics

This example combines every feature above in a single energy-flow visualization built on the simulated data.

In [None]:
# Demo 6: Everything Combined!
diagram = SankeyDiagram(df, time_column='date')

# Add filter to remove small flows
diagram.add_filter('remove_small', lambda data: data[data['value'] > 50])

# Custom node metric: total outgoing flow per source node
diagram.set_node_metric(
    lambda data: data.groupby('source')['value'].sum().to_dict()
)

fig = diagram.render('source', 'target', 'value',
                     title="Demo 6: Complete Interactive Energy Flow Dashboard",
                     show_histogram=True,
                     show_timeline=True)

# Add annotation
fig.add_annotation(
    text="Use the timeline slider to see changes over time<br>" +
         "Click Play to animate | Edges colored by flow value",
    xref="paper", yref="paper",
    x=0.5, y=-0.15,
    showarrow=False,
    font=dict(size=10, color="gray")
)

fig.show()

## Summary

### Key Features Demonstrated:

1. ✨ **Time Series Support** - Animate flows over time with Play/Pause controls
2. 📊 **Histogram Edges** - Value-based color gradients (blue→red)
3. 🎯 **Dynamic Metrics** - Custom functions for node sizes and edge properties
4. 🔍 **Interactive Filters** - Apply multiple filters simultaneously
5. 🎨 **Rich Customization** - Full control over appearance and behavior

### Quick API Reference:

```python
# Basic usage
diagram = SankeyDiagram(data)
diagram.render('source', 'target', 'value').show()

# With timeline
diagram = SankeyDiagram(data, time_column='date')
diagram.render('source', 'target', 'value', show_timeline=True).show()

# Add filters
diagram.add_filter('name', lambda df: df[df['value'] > 100])

# Custom metrics
diagram.set_node_metric(lambda df: df.groupby('source')['value'].sum().to_dict())

# Histogram coloring
diagram.render('source', 'target', 'value', show_histogram=True).show()
```

### Real-World Applications:

- **Energy Flow Analysis** - Power generation and distribution
- **Supply Chain Visualization** - Material flows through manufacturing
- **Financial Flows** - Money movement between accounts
- **Network Traffic** - Data flow analysis
- **Resource Allocation** - Budget distribution over time
- **Migration Patterns** - Population movement visualization