# Sapheneia TimesFM Comprehensive Demo

This notebook demonstrates the complete Sapheneia TimesFM implementation with:

- **Model Loading**: Both HuggingFace checkpoints and local model paths
- **Data Processing**: CSV loading with flexible data definitions
- **Basic Forecasting**: Point forecasts with TimesFM
- **Quantile Forecasting**: Using experimental_quantile_forecast (Marcelo's approach)
- **Covariates Support**: Dynamic and static, numerical and categorical covariates
- **Professional Visualization**: Publication-quality plots with prediction intervals
- **Bootstrap Intervals**: Uncertainty quantification through bootstrap sampling

## Prerequisites

Make sure you have run the setup script:
```bash
./setup_environment.sh
```

Alternatively, activate an existing environment:
```bash
source .venv/bin/activate
```

## Setup and Imports

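Import the Sapheneia TimesFM modules along with the supporting scientific Python libraries. The `src` directory is appended to `sys.path` so the modules can be imported directly from the notebook.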

In [None]:
# Configure notebook environment
import warnings
warnings.filterwarnings('ignore')

import sys
import os

# Add src to path for imports
sys.path.append('../src')

# Core imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
import json

# Sapheneia TimesFM imports
from model import TimesFMModel
from data import DataProcessor
from forecast import Forecaster
from visualization import Visualizer

print("🚀 Sapheneia TimesFM Environment Ready!")
print(f"📅 Demo Date: {datetime.now().strftime('%Y-%m-%d %H:%M')}")

## Configuration Parameters

Define the key parameters for our TimesFM forecasting setup. These can be adjusted based on your specific requirements.

In [None]:
# =============================================================================
# CONFIGURATION PARAMETERS
# =============================================================================

# Model Configuration
CONTEXT_LEN = 100          # Context length for input time series
HORIZON_LEN = 24           # Forecast horizon in number of steps
DEVICE_BACKEND = "cpu"     # Options: "cpu", "gpu", "tpu"

# Checkpoint Configuration - Choose ONE:
# Option 1: HuggingFace checkpoint
HUGGINGFACE_CHECKPOINT = "google/timesfm-2.0-500m-pytorch"  # or "google/timesfm-2.0-500m-jax"
LOCAL_MODEL_PATH = None

# Option 2: Local model path (uncomment to use)
# HUGGINGFACE_CHECKPOINT = None 
# LOCAL_MODEL_PATH = "/path/to/your/local/timesfm/model"  # Update this path

# Data Configuration
DATA_FILE = "../data/sample_financial_data.csv"  # Will be created if doesn't exist
FREQ = 0  # TimesFM frequency category: 0 = high (up to daily), 1 = medium (weekly/monthly), 2 = low (quarterly/yearly)

# Forecasting Configuration
USE_COVARIATES = True      # Whether to use covariates in forecasting
USE_QUANTILES = True       # Whether to compute quantile forecasts (always True in webapp)

# Quantile Configuration (aligned with webapp)
QUANTILE_INDICES = [1, 9]  # Default quantile indices to visualize (10% and 90%)
QUANTILE_COLORS = ['#ff9999', '#99ccff', '#99ff99', '#ffcc99', '#cc99ff', '#ffff99']

# Legacy parameters (kept for compatibility)
BOOTSTRAP_SAMPLES = 50     # Number of bootstrap samples for prediction intervals
CONFIDENCE_LEVELS = [0.5, 0.8, 0.95]  # Confidence levels for prediction intervals

print("⚙️ Configuration Summary:")
print(f"   Context Length: {CONTEXT_LEN}")
print(f"   Horizon Length: {HORIZON_LEN}")
print(f"   Backend: {DEVICE_BACKEND}")
print(f"   Checkpoint: {HUGGINGFACE_CHECKPOINT or LOCAL_MODEL_PATH}")
print(f"   Use Covariates: {USE_COVARIATES}")
print(f"   Use Quantiles: {USE_QUANTILES}")

## Generate Sample Data

For this demo, we'll generate synthetic financial data with realistic correlations and covariates. In practice, you would load your own CSV data file.
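
The generator below builds each price series as a base level multiplied by the exponential of trend, seasonal, and noise terms, so log returns are just differences of the log prices. The following is a minimal sketch of that construction on a tiny series; the names `base`, `trend`, `noise`, and `log_returns` are illustrative and not part of the Sapheneia library.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
base = 20000.0
trend = np.linspace(0.0, 0.1, n)   # cumulative log growth
noise = rng.normal(0.0, 0.08, n)   # log-scale shock per period
price = base * np.exp(trend + noise)

# Prepending the first log price makes the first return exactly 0.
# With prepend=0 the first "return" would be log(price[0]), a spuriously
# large value.
log_price = np.log(price)
log_returns = np.diff(log_price, prepend=log_price[0])

print(log_returns.round(4))
```

Summing the returns recovers the log price relative to its starting value, which is a quick sanity check on any synthetic series built this way.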

In [None]:
# =============================================================================
# SAMPLE DATA GENERATION
# =============================================================================

def generate_sample_financial_data(num_periods=200, save_path=None):
    """
    Generate sample financial time series data with covariates.
    This creates realistic financial data similar to Lucas's approach.
    """
    print("📊 Generating sample financial data...")
    
    np.random.seed(42)
    
    # Create date range
    start_date = datetime(2020, 1, 1)
    dates = [start_date + timedelta(weeks=i) for i in range(num_periods)]
    
    # Bitcoin price (target variable)
    btc_base = 20000
    btc_trend = np.linspace(0, 0.8, num_periods)  # 80% growth
    btc_volatility = np.random.normal(0, 0.08, num_periods)  # 8% weekly volatility
    btc_seasonal = 0.1 * np.sin(2 * np.pi * np.arange(num_periods) / 52)
    btc_price = btc_base * np.exp(btc_trend + btc_volatility + btc_seasonal)
    
    # Ethereum price (dynamic numerical covariate)
    eth_factor = 0.06  # ETH ≈ 6% of BTC
    eth_noise = np.random.normal(0, 0.05, num_periods)
    eth_price = btc_price * eth_factor * (1 + eth_noise)
    
    # VIX volatility index (dynamic numerical covariate - inverse correlation)
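    log_btc = np.log(btc_price)
    # Prepend the first log price so the first return is 0 rather than log(price)
    btc_returns = np.diff(log_btc, prepend=log_btc[0])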
    vix_base = 20
    vix_price = vix_base - 2 * btc_returns * 10 + np.random.normal(0, 5, num_periods)
    vix_price = np.clip(vix_price, 10, 80)
    
    # S&P 500 (dynamic numerical covariate)
    sp500_base = 3500
    sp500_trend = np.linspace(0, 0.3, num_periods)  # 30% growth
    sp500_crypto_corr = 0.3 * (btc_price / btc_price[0] - 1) * 0.1
    sp500_noise = np.random.normal(0, 0.02, num_periods)
    sp500_price = sp500_base * np.exp(sp500_trend + sp500_crypto_corr + sp500_noise)
    
    # Quarter (dynamic categorical covariate)
    quarters = [(pd.Timestamp(d).month - 1) // 3 + 1 for d in dates]
    
    # Market regime (dynamic categorical covariate)
    regimes = []
    for i, price in enumerate(btc_price):
        if i < 10:
            regime = "bull" if price > btc_base else "bear"
        else:
            sma_10 = np.mean(btc_price[i-10:i])
            regime = "bull" if price > sma_10 * 1.05 else ("bear" if price < sma_10 * 0.95 else "neutral")
        regimes.append(regime)
    
    # Asset category (static categorical covariate)
    asset_category = "cryptocurrency"
    
    # Base volatility (static numerical covariate)
    base_volatility = 0.08  # 8% weekly volatility
    
    # Create DataFrame
    df = pd.DataFrame({
        'date': dates,
        'btc_price': btc_price,
        'eth_price': eth_price,
        'vix_index': vix_price,
        'sp500_price': sp500_price,
        'quarter': quarters,
        'market_regime': regimes,
        'asset_category': asset_category,
        'base_volatility': base_volatility
    })
    
    # Save to CSV if requested
    if save_path:
        df.to_csv(save_path, index=False)
        print(f"📁 Sample data saved to: {save_path}")
    
    return df

# Generate or load sample data
if not os.path.exists(DATA_FILE):
    # Create data directory if it doesn't exist
    os.makedirs(os.path.dirname(DATA_FILE), exist_ok=True)
    sample_data = generate_sample_financial_data(save_path=DATA_FILE)
else:
    sample_data = pd.read_csv(DATA_FILE)
    sample_data['date'] = pd.to_datetime(sample_data['date'])
    print(f"📁 Loaded existing data from: {DATA_FILE}")

print(f"\n📋 Data Summary:")
print(f"   Shape: {sample_data.shape}")
print(f"   Date Range: {sample_data['date'].min().date()} to {sample_data['date'].max().date()}")
print(f"   Columns: {list(sample_data.columns)}")

# Display first few rows
print("\n📊 Sample Data Preview:")
sample_data.head()

## Define Data Structure

Create the data definition that specifies how each column should be interpreted by TimesFM:

- **target**: The main time series to forecast
- **dynamic_numerical**: Numerical covariates that vary over time
- **dynamic_categorical**: Categorical covariates that vary over time
- **static_numerical**: Numerical covariates that are constant per time series
- **static_categorical**: Categorical covariates that are constant per time series
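
Before handing a definition like this to the data processor, it can help to check it against the DataFrame. The sketch below is a hypothetical helper (`check_data_definition` is not part of the Sapheneia library), and the rule that there is exactly one target is an assumption based on this demo.

```python
import pandas as pd

VALID_ROLES = {
    "target",
    "dynamic_numerical",
    "dynamic_categorical",
    "static_numerical",
    "static_categorical",
}

def check_data_definition(df, definition):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for column, role in definition.items():
        if role not in VALID_ROLES:
            problems.append(f"{column}: unknown role '{role}'")
        elif column not in df.columns:
            problems.append(f"{column}: column missing from data")
        elif role.startswith("static") and df[column].nunique() != 1:
            problems.append(f"{column}: static covariate is not constant")
    # Assumption for this demo: exactly one target column.
    targets = [c for c, r in definition.items() if r == "target"]
    if len(targets) != 1:
        problems.append(f"expected exactly one target, found {len(targets)}")
    return problems

toy = pd.DataFrame({
    "price": [1.0, 2.0, 3.0],
    "volume": [10, 12, 9],
    "sector": ["tech", "tech", "tech"],
})
toy_definition = {
    "price": "target",
    "volume": "dynamic_numerical",
    "sector": "static_categorical",
}
print(check_data_definition(toy, toy_definition))
```

Returning a list of problems rather than raising on the first one makes it easier to fix a definition in a single pass.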

In [None]:
# =============================================================================
# DATA DEFINITION
# =============================================================================

# Define how each column should be interpreted
data_definition = {
    "btc_price": "target",                    # Main target to forecast
    "eth_price": "dynamic_numerical",         # Ethereum price over time
    "vix_index": "dynamic_numerical",         # VIX volatility index over time
    "sp500_price": "dynamic_numerical",       # S&P 500 price over time
    "quarter": "dynamic_categorical",         # Quarter of the year
    "market_regime": "dynamic_categorical",   # Market regime (bull/bear/neutral)
    "asset_category": "static_categorical",   # Asset category (constant)
    "base_volatility": "static_numerical"     # Base volatility (constant)
}

print("🏗️ Data Definition:")
for column, dtype in data_definition.items():
    print(f"   {column:20} -> {dtype}")

# Save data definition for reference
data_def_path = "../data/sample_data_definition.json"
os.makedirs(os.path.dirname(data_def_path), exist_ok=True)
with open(data_def_path, 'w') as f:
    json.dump(data_definition, f, indent=2)
    
print(f"\n📁 Data definition saved to: {data_def_path}")

## Initialize TimesFM Model

Load the TimesFM model from either a HuggingFace checkpoint or a local model path. The wrapper is configured from the parameters defined above; `num_layers` must match the checkpoint (50 for TimesFM 2.0-500m).
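
Since the configuration asks you to choose one source, a small guard can catch the case where both or neither are set. This is a sketch only: `resolve_model_source` is a hypothetical helper, and whether `TimesFMModel` performs a similar check internally is not shown in this notebook.

```python
def resolve_model_source(huggingface_checkpoint=None, local_model_path=None):
    """Return (kind, value), requiring exactly one of the two sources."""
    candidates = (
        ("huggingface", huggingface_checkpoint),
        ("local", local_model_path),
    )
    given = [(kind, value) for kind, value in candidates if value]
    if len(given) != 1:
        raise ValueError("Set exactly one of the checkpoint or the local path")
    return given[0]

source = resolve_model_source(
    huggingface_checkpoint="google/timesfm-2.0-500m-pytorch"
)
print(source)
```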

In [None]:
# =============================================================================
# MODEL INITIALIZATION
# =============================================================================

print("🤖 Initializing TimesFM Model...")

# Create model wrapper
model_wrapper = TimesFMModel(
    backend=DEVICE_BACKEND,
    context_len=CONTEXT_LEN,
    horizon_len=HORIZON_LEN,
    checkpoint=HUGGINGFACE_CHECKPOINT,
    local_model_path=LOCAL_MODEL_PATH,
    num_layers=50  # Must match TimesFM 2.0-500m checkpoint
)

# Load the model
timesfm_model = model_wrapper.load_model()

# Display model information
model_info = model_wrapper.get_model_info()
print("\n📋 Model Information:")
for key, value in model_info.items():
    if key != 'capabilities':
        print(f"   {key:20} = {value}")

print("\n🔧 Model Capabilities:")
for capability, available in model_info['capabilities'].items():
    status = "✅" if available else "❌"
    print(f"   {status} {capability}")

## Process Data for Forecasting

Load and process the CSV data, applying appropriate type conversions and preparing the data structure required by TimesFM.
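
The notebook treats the first `CONTEXT_LEN` rows as history and the next `HORIZON_LEN` rows as held-out actuals. A minimal sketch of that split follows; `split_context_horizon` is a hypothetical helper, and how `prepare_forecast_data` slices internally is an assumption here.

```python
import numpy as np

def split_context_horizon(series, context_len, horizon_len):
    """Split a 1-D series into a context window and, if enough data
    follows it, the horizon_len actual values after the context."""
    series = np.asarray(series, dtype=float)
    if len(series) < context_len:
        raise ValueError(f"need {context_len} points, have {len(series)}")
    context = series[:context_len]
    future = series[context_len:context_len + horizon_len]
    if len(future) < horizon_len:
        future = None  # not enough held-out data to evaluate against
    return context, future

values = np.arange(10.0)
context, future = split_context_horizon(values, context_len=6, horizon_len=3)
print(context, future)
```

Returning `None` for an incomplete horizon keeps later evaluation code from comparing a forecast against a shorter array.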

In [None]:
# =============================================================================
# DATA PROCESSING
# =============================================================================

print("📊 Processing data for TimesFM...")

# Initialize data processor
data_processor = DataProcessor()

# Load and process CSV data
processed_data = data_processor.load_csv_data(DATA_FILE, data_definition)

# Get data summary
data_summary = data_processor.get_data_summary()
print("\n📋 Data Processing Summary:")
print(f"   Total periods: {data_summary['date_range']['total_periods']}")
print(f"   Date range: {data_summary['date_range']['start']} to {data_summary['date_range']['end']}")

# Check if we have sufficient data
total_required = CONTEXT_LEN + HORIZON_LEN
if len(processed_data) < total_required:
    print(f"⚠️  Warning: Insufficient data. Need {total_required} periods, have {len(processed_data)}")
    # Adjust parameters for available data
    CONTEXT_LEN = max(20, len(processed_data) - HORIZON_LEN - 10)
    print(f"   Adjusted context length to: {CONTEXT_LEN}")
else:
    print(f"✅ Sufficient data available: {len(processed_data)} periods")

# Prepare forecast inputs
target_inputs, covariates = data_processor.prepare_forecast_data(
    processed_data,
    context_len=CONTEXT_LEN,
    horizon_len=HORIZON_LEN,
    target_column="btc_price"
)

print(f"\n🎯 Forecast Preparation:")
print(f"   Target inputs length: {len(target_inputs)}")
print(f"   Target value range: ${min(target_inputs):,.0f} - ${max(target_inputs):,.0f}")
print(f"   Covariate types: {len(covariates)}")

for cov_type, cov_data in covariates.items():
    if cov_data:
        print(f"      {cov_type}: {list(cov_data.keys())}")

# Validate inputs
validation_passed = data_processor.validate_forecast_inputs(
    target_inputs, covariates, CONTEXT_LEN, HORIZON_LEN
)
print(f"\n✅ Data validation: {'Passed' if validation_passed else 'Failed'}")

## Perform Forecasting

This section runs three kinds of forecast:

1. **Basic Point Forecasting**: Standard TimesFM forecasting
2. **Quantile Forecasting**: Using experimental_quantile_forecast (Marcelo's approach)
3. **Covariates-Enhanced Forecasting**: Incorporating exogenous variables
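
The quantile step turns a matrix of quantile forecasts into lower and upper bands. A minimal sketch on a toy matrix is shown here, assuming the `(horizon, num_quantiles)` layout that the cell below normalizes to; `make_band` is a hypothetical helper and the numbers are made up.

```python
import numpy as np

def make_band(q_mat, lower_idx, upper_idx):
    """Pick two columns of a (horizon, num_quantiles) matrix as one band."""
    q_mat = np.asarray(q_mat, dtype=float)
    if not (0 <= lower_idx < upper_idx < q_mat.shape[1]):
        raise IndexError("need 0 <= lower_idx < upper_idx < num_quantiles")
    return q_mat[:, lower_idx], q_mat[:, upper_idx]

# Toy matrix: 4 forecast steps, 3 quantile columns in ascending order.
q_mat = np.array([
    [90.0, 100.0, 110.0],
    [92.0, 103.0, 115.0],
    [93.0, 105.0, 120.0],
    [95.0, 108.0, 126.0],
])
lower, upper = make_band(q_mat, 0, 2)
print((upper - lower).mean())
```

The band widens with the forecast step in this toy data, which is the usual pattern for quantile forecasts over a longer horizon.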

In [None]:
# =============================================================================
# FORECASTING
# =============================================================================

print("🔮 Performing TimesFM Forecasting...")

# Initialize forecaster
forecaster = Forecaster(timesfm_model)

# Store all forecast results
forecast_results = {}

# 1. BASIC POINT FORECASTING
print("\n1️⃣ Basic Point Forecasting...")
point_forecast, point_metadata = forecaster.forecast_basic(
    inputs=target_inputs,
    freq=FREQ
)
forecast_results['basic_point'] = point_forecast[0]  # Extract single series
print(f"   ✅ Point forecast completed: {len(forecast_results['basic_point'])} periods")

# 2. QUANTILE FORECASTING (Updated approach aligned with webapp)
if USE_QUANTILES:
    print("\n2️⃣ Quantile Forecasting (webapp-aligned approach)...")
    try:
        # Use the updated forecast method that returns both point and quantiles
        point_forecast_q, quantile_forecast = forecaster.forecast(
            inputs=target_inputs,
            freq=FREQ
        )
        forecast_results['quantile_point'] = point_forecast_q[0] if point_forecast_q.ndim > 1 else point_forecast_q
        if quantile_forecast is not None:
            forecast_results['quantile_forecast'] = quantile_forecast[0] if quantile_forecast.ndim > 2 else quantile_forecast
            print(f"   ✅ Quantile forecast completed: {quantile_forecast.shape}")
            
            # Process quantiles to create multiple bands (aligned with webapp)
            quantiles = np.array(forecast_results['quantile_forecast'])
            if quantiles.ndim == 2:
                # Ensure quantiles are sorted ascending
                try:
                    if quantiles.shape[1] < quantiles.shape[0]:
                        # shape (horizon, num_q)
                        order = np.argsort(np.nanmedian(quantiles, axis=0))
                        quantiles = quantiles[:, order]
                    else:
                        # shape (num_q, horizon)
                        order = np.argsort(np.nanmedian(quantiles, axis=1))
                        quantiles = quantiles[order, :]
                except Exception as e:
                    print(f"   ⚠️  Could not sort quantiles: {e}")
                
                # Create quantile bands based on selected indices
                if quantiles.shape[1] < quantiles.shape[0]:
                    q_mat = quantiles
                else:
                    q_mat = quantiles.T  # Make shape (horizon, num_q)
                
                num_q = q_mat.shape[1]
                print(f"   📊 Available quantiles: {num_q} (indices 0-{num_q-1})")
                
                # Create bands for selected quantile indices
                selected_sorted = sorted([i for i in QUANTILE_INDICES if 0 <= i < num_q])
                if len(selected_sorted) >= 2:
                    quantile_bands = {}
                    for i in range(len(selected_sorted) - 1):
                        lower_idx = selected_sorted[i]
                        upper_idx = selected_sorted[i + 1]
                        band_name = f'quantile_band_{i}'
                        quantile_bands[f'{band_name}_lower'] = q_mat[:, lower_idx].tolist()
                        quantile_bands[f'{band_name}_upper'] = q_mat[:, upper_idx].tolist()
                        # Create readable labels
                        lower_pct = int(round(100 * (lower_idx / (num_q - 1))))
                        upper_pct = int(round(100 * (upper_idx / (num_q - 1))))
                        quantile_bands[f'{band_name}_label'] = f"Q{lower_pct}–Q{upper_pct}"
                    
                    forecast_results['quantile_bands'] = quantile_bands
                    print(f"   ✅ Created {len(selected_sorted)-1} quantile bands from indices: {selected_sorted}")
                else:
                    print(f"   ⚠️  Insufficient valid quantile indices: {selected_sorted}")
        else:
            print("   ⚠️  Quantile forecasting not available")
    except Exception as e:
        print(f"   ❌ Quantile forecasting failed: {str(e)}")
        import traceback
        traceback.print_exc()
else:
    print("\n2️⃣ Quantile forecasting skipped (USE_QUANTILES=False)")

# 3. COVARIATES-ENHANCED FORECASTING
if USE_COVARIATES and any(covariates.values()):
    print("\n3️⃣ Covariates-Enhanced Forecasting...")
    try:
        enhanced_forecast, linear_forecast = forecaster.forecast_with_covariates(
            inputs=target_inputs,
            dynamic_numerical_covariates=covariates.get('dynamic_numerical_covariates'),
            dynamic_categorical_covariates=covariates.get('dynamic_categorical_covariates'),
            static_numerical_covariates=covariates.get('static_numerical_covariates'),
            static_categorical_covariates=covariates.get('static_categorical_covariates'),
            freq=FREQ,
            xreg_mode="xreg + timesfm",  # Following the successful approach
            normalize_xreg_target_per_input=True
        )
        forecast_results['enhanced_covariates'] = enhanced_forecast[0]
        forecast_results['linear_model'] = linear_forecast[0]
        print(f"   ✅ Covariates forecasting completed")
        print(f"      Enhanced forecast range: ${min(enhanced_forecast[0]):,.0f} - ${max(enhanced_forecast[0]):,.0f}")
    except Exception as e:
        print(f"   ❌ Covariates forecasting failed: {str(e)}")
        USE_COVARIATES = False
else:
    print("\n3️⃣ Covariates forecasting skipped")
    USE_COVARIATES = False

print(f"\n🎯 Forecasting Summary:")
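for method, forecast in forecast_results.items():
    # Skip non-series entries: the 2-D quantile matrix and the bands dict
    if method in ('quantile_forecast', 'quantile_bands'):
        continue
    print(f"   {method:20}: {len(forecast)} periods, range ${min(forecast):,.0f} - ${max(forecast):,.0f}")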

## Generate Prediction Intervals

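Create prediction intervals using bootstrap sampling to quantify forecast uncertainty. The forecast is repeated on noise-perturbed copies of the input (`noise_scale=0.05` below), and the spread of the resulting forecasts gives prediction bands around the point forecast.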
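
To make the idea concrete, here is a minimal sketch of input-noise bootstrapping with a stand-in model. It is not the implementation behind `generate_prediction_intervals`: `naive_forecast`, `bootstrap_intervals`, and the multiplicative Gaussian noise scheme are all assumptions for illustration.

```python
import numpy as np

def naive_forecast(history, horizon_len):
    """Stand-in model: repeat the mean of the last 4 observations."""
    return np.full(horizon_len, np.mean(history[-4:]))

def bootstrap_intervals(history, horizon_len, n_samples=200,
                        noise_scale=0.05, level=0.8, seed=0):
    """Forecast on noisy copies of the input and take empirical quantiles."""
    rng = np.random.default_rng(seed)
    history = np.asarray(history, dtype=float)
    paths = np.empty((n_samples, horizon_len))
    for s in range(n_samples):
        # Multiplicative Gaussian noise on the inputs (an assumed scheme).
        noisy = history * (1.0 + rng.normal(0.0, noise_scale, history.shape))
        paths[s] = naive_forecast(noisy, horizon_len)
    tail = (1.0 - level) / 2.0
    lower = np.quantile(paths, tail, axis=0)
    upper = np.quantile(paths, 1.0 - tail, axis=0)
    return lower, upper

history = 100.0 + np.arange(20.0)
lower, upper = bootstrap_intervals(history, horizon_len=3)
print(lower.round(1), upper.round(1))
```

Intervals built this way reflect sensitivity to input noise only, so they tend to understate total forecast uncertainty compared with model-based quantiles.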

In [None]:
# =============================================================================
# PREDICTION INTERVALS
# =============================================================================

print("📊 Generating Prediction Intervals...")

# Choose the best available forecast for interval generation
if 'enhanced_covariates' in forecast_results:
    base_method = 'enhanced_covariates'
    covariates_for_intervals = covariates if USE_COVARIATES else None
elif 'basic_point' in forecast_results:
    base_method = 'basic_point'
    covariates_for_intervals = None
else:
    print("❌ No forecasts available for interval generation")
    base_method = None  # keeps the check below from raising NameError
    covariates_for_intervals = None

if base_method:
    print(f"Using '{base_method}' forecast for interval generation")
    
    try:
        prediction_intervals = forecaster.generate_prediction_intervals(
            inputs=target_inputs,
            freq=FREQ,
            covariates=covariates_for_intervals,
            num_bootstrap_samples=BOOTSTRAP_SAMPLES,
            confidence_levels=CONFIDENCE_LEVELS,
            noise_scale=0.05  # 5% input noise for bootstrap
        )
        
        print("\n📈 Prediction Intervals Generated:")
        for key in prediction_intervals:
            # Report each level once, keyed off its lower bound
            if key.startswith('lower_'):
                conf_level = key.split('_', 1)[1]
                upper_key = f'upper_{conf_level}'
                if upper_key in prediction_intervals:
                    width = np.mean(np.asarray(prediction_intervals[upper_key]) - np.asarray(prediction_intervals[key]))
                    print(f"   {conf_level}% Interval Width: ${width:,.0f}")
        
    except Exception as e:
        print(f"❌ Prediction interval generation failed: {str(e)}")
        prediction_intervals = None
else:
    prediction_intervals = None

## Prepare Visualization Data

Extract and prepare the data for professional visualization, including historical context and future actuals for comparison.
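
When the dataset ends at the forecast origin, the future dates have to be generated. A minimal sketch using `pandas.date_range` follows; `future_dates` is a hypothetical helper, and the weekly step matches the sample data in this demo.

```python
import pandas as pd

def future_dates(last_date, horizon_len, step="7D"):
    """Return horizon_len timestamps after last_date, spaced by step."""
    offset = pd.Timedelta(step)
    return pd.date_range(
        start=pd.Timestamp(last_date) + offset,
        periods=horizon_len,
        freq=offset,
    )

dates = future_dates("2020-01-01", horizon_len=3)
print([d.date().isoformat() for d in dates])
```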

In [None]:
# =============================================================================
# VISUALIZATION DATA PREPARATION
# =============================================================================

print("🎨 Preparing visualization data...")

# Extract historical data for plotting
historical_data = target_inputs

# Extract actual future data for comparison (if available)
if len(processed_data) >= CONTEXT_LEN + HORIZON_LEN:
    actual_future = processed_data['btc_price'].iloc[CONTEXT_LEN:CONTEXT_LEN + HORIZON_LEN].values
else:
    actual_future = None
    print("⚠️  No actual future data available for comparison")

# Prepare dates for plotting
dates_historical = processed_data['date'].iloc[:CONTEXT_LEN].tolist()
if len(processed_data) >= CONTEXT_LEN + HORIZON_LEN:
    dates_future = processed_data['date'].iloc[CONTEXT_LEN:CONTEXT_LEN + HORIZON_LEN].tolist()
else:
    # Generate synthetic future dates
    last_date = dates_historical[-1]
    dates_future = [last_date + timedelta(weeks=i+1) for i in range(HORIZON_LEN)]

# Prepare covariates data for visualization
covariates_viz_data = {}
if USE_COVARIATES:
    # Extract dynamic numerical covariates
    for cov_name, cov_values in covariates.get('dynamic_numerical_covariates', {}).items():
        if cov_values and len(cov_values[0]) >= CONTEXT_LEN + HORIZON_LEN:
            historical_cov = cov_values[0][:CONTEXT_LEN]
            future_cov = cov_values[0][CONTEXT_LEN:CONTEXT_LEN + HORIZON_LEN]
            covariates_viz_data[cov_name] = {
                'historical': historical_cov,
                'future': future_cov
            }

print(f"📊 Visualization Data Summary:")
print(f"   Historical periods: {len(historical_data)}")
print(f"   Forecast periods: {HORIZON_LEN}")
print(f"   Actual future available: {'Yes' if actual_future is not None else 'No'}")
print(f"   Covariates for viz: {list(covariates_viz_data.keys())}")
print(f"   Date range: {dates_historical[0].date()} to {dates_future[-1].date()}")

## Professional Visualizations

Create publication-quality visualizations of our forecasting results using the Sapheneia professional styling.

In [None]:
# =============================================================================
# PROFESSIONAL VISUALIZATIONS
# =============================================================================

print("🎨 Creating professional visualizations...")

# Initialize visualizer
visualizer = Visualizer(style="professional")

# Choose the best forecast for main visualization
if 'enhanced_covariates' in forecast_results:
    main_forecast = forecast_results['enhanced_covariates']
    main_title = "Bitcoin Price Forecast with Covariates Enhancement"
elif 'basic_point' in forecast_results:
    main_forecast = forecast_results['basic_point']
    main_title = "Bitcoin Price Forecast (Basic TimesFM)"
else:
    print("❌ No forecast data available for visualization")
    main_forecast = None

if main_forecast is not None:
    # 1. MAIN FORECAST VISUALIZATION
    print("\n1️⃣ Creating main forecast visualization...")
    
    # Prepare intervals for visualization (prioritize quantile bands over bootstrap intervals)
    viz_intervals = {}
    if 'quantile_bands' in forecast_results:
        print("   📊 Using quantile bands for visualization...")
        viz_intervals = forecast_results['quantile_bands']
    elif prediction_intervals:
        print("   📊 Using bootstrap prediction intervals for visualization...")
        viz_intervals = prediction_intervals
    
    fig1 = visualizer.plot_forecast_with_intervals(
        historical_data=historical_data,
        forecast=main_forecast,
        intervals=viz_intervals,
        actual_future=actual_future,
        dates_historical=dates_historical,
        dates_future=dates_future,
        title=main_title,
        target_name="Bitcoin Price ($)",
        save_path="forecast_main.png"
    )
    plt.show()
    
    # 2. COMPREHENSIVE VISUALIZATION WITH COVARIATES
    if covariates_viz_data:
        print("\n2️⃣ Creating comprehensive visualization with covariates...")
        
        fig2 = visualizer.plot_forecast_with_covariates(
            historical_data=historical_data,
            forecast=main_forecast,
            covariates_data=covariates_viz_data,
            intervals=prediction_intervals,
            actual_future=actual_future,
            dates_historical=dates_historical,
            dates_future=dates_future,
            title="Bitcoin Forecast with Comprehensive Covariates Analysis",
            target_name="Bitcoin Price ($)",
            save_path="forecast_comprehensive.png"
        )
        plt.show()
    
    # 3. FORECAST METHODS COMPARISON
    if len(forecast_results) > 1:
        print("\n3️⃣ Creating forecast methods comparison...")
        
        # Filter forecast results for comparison (exclude auxiliary and non-series entries)
        comparison_forecasts = {k: v for k, v in forecast_results.items() 
                              if k not in ['linear_model', 'quantile_forecast', 'quantile_bands']}
        
        if len(comparison_forecasts) > 1:
            fig3 = visualizer.plot_forecast_comparison(
                forecasts_dict=comparison_forecasts,
                historical_data=historical_data,
                actual_future=actual_future,
                title="TimesFM Forecasting Methods Comparison",
                save_path="forecast_comparison.png"
            )
            plt.show()

print("\n✅ All visualizations completed!")
print("📁 Saved plots:")
for plot_file in ['forecast_main.png', 'forecast_comprehensive.png', 'forecast_comparison.png']:
    if os.path.exists(plot_file):
        print(f"   - {plot_file}")

## Quantile Band Analysis and Data Export

Analyze the quantile bands and export the forecast data including quantiles for further analysis.
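
The analysis below reports each band's mean width, both in dollars and relative to the mean point forecast, and then writes one row per forecast period. A minimal sketch of those two steps on made-up numbers (the column names here are illustrative):

```python
import numpy as np
import pandas as pd

point = np.array([100.0, 102.0, 104.0])
lower = np.array([90.0, 91.0, 92.0])
upper = np.array([110.0, 114.0, 118.0])

width = float(np.mean(upper - lower))                  # average absolute width
relative_width = width / float(np.mean(point)) * 100   # as % of mean forecast

export = pd.DataFrame({
    "Period": np.arange(1, len(point) + 1),
    "Point_Forecast": point,
    "Lower": lower,
    "Upper": upper,
})
print(f"width={width:.1f}, relative={relative_width:.1f}%")
print(export.to_csv(index=False))
```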


In [None]:
# =============================================================================
# QUANTILE BAND ANALYSIS AND DATA EXPORT
# =============================================================================

print("📊 Quantile Band Analysis and Data Export")
print("=" * 50)

# Analyze quantile bands if available
if 'quantile_bands' in forecast_results:
    print("\n🎯 QUANTILE BANDS ANALYSIS:")
    quantile_bands = forecast_results['quantile_bands']
    
    # Extract band information
    band_info = []
    for key in quantile_bands.keys():
        if key.endswith('_label'):
            band_name = key.replace('_label', '')
            if f'{band_name}_lower' in quantile_bands and f'{band_name}_upper' in quantile_bands:
                lower_values = quantile_bands[f'{band_name}_lower']
                upper_values = quantile_bands[f'{band_name}_upper']
                width = np.mean(np.array(upper_values) - np.array(lower_values))
                relative_width = width / np.mean(main_forecast) * 100
                
                band_info.append({
                    'band': band_name,
                    'label': quantile_bands[key],
                    'width': width,
                    'relative_width': relative_width
                })
                
                print(f"   {quantile_bands[key]:15} Width: ${width:,.0f} ({relative_width:.1f}% of forecast)")
    
    # Create comprehensive forecast data for export
    print("\n📁 Creating comprehensive forecast data...")
    
    # Prepare data for CSV export
    forecast_data = []
    for i in range(len(main_forecast)):
        row = {
            'Period': i + 1,
            'Point_Forecast': main_forecast[i]
        }
        
        # Add quantile bands
        for band in band_info:
            band_name = band['band']
            row[f'{band["label"]}_Lower'] = quantile_bands[f'{band_name}_lower'][i]
            row[f'{band["label"]}_Upper'] = quantile_bands[f'{band_name}_upper'][i]
        
        # Add other forecast methods
        for method, forecast in forecast_results.items():
            if method not in ['quantile_forecast', 'quantile_bands'] and isinstance(forecast, (list, np.ndarray)):
                if len(forecast) > i:
                    method_name = method.replace('_', ' ').replace('basic point', 'Basic Point').replace('quantile point', 'Quantile Point')
                    row[method_name] = forecast[i]
        
        forecast_data.append(row)
    
    # Convert to DataFrame and save
    forecast_df = pd.DataFrame(forecast_data)
    csv_filename = f"forecast_data_with_quantiles_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv"
    forecast_df.to_csv(csv_filename, index=False)
    
    print(f"✅ Forecast data exported to: {csv_filename}")
    print(f"   Rows: {len(forecast_df)}")
    print(f"   Columns: {list(forecast_df.columns)}")
    
    # Display sample of exported data
    print("\n📊 Sample of exported data:")
    print(forecast_df.head())
    
    # Create a dedicated quantile band visualization
    if len(band_info) > 0:
        print("\n🎨 Creating dedicated quantile band visualization...")
        
        fig, ax = plt.subplots(figsize=(12, 8))
        
        # Plot historical data
        ax.plot(range(len(historical_data)), historical_data, 
               color='blue', linewidth=2, label='Historical Data')
        
        # Plot point forecast
        forecast_x = range(len(historical_data), len(historical_data) + len(main_forecast))
        ax.plot(forecast_x, main_forecast, 
               color='red', linestyle='--', linewidth=2.5, label='Point Forecast')
        
        # Plot quantile bands with different colors
        band_colors = QUANTILE_COLORS[:len(band_info)]
        for i, band in enumerate(band_info):
            band_name = band['band']
            lower_values = quantile_bands[f'{band_name}_lower']
            upper_values = quantile_bands[f'{band_name}_upper']
            
            alpha = 0.3 + (0.2 * (1 - i / max(1, len(band_info) - 1)))
            color = band_colors[i % len(band_colors)]
            
            ax.fill_between(forecast_x, lower_values, upper_values, 
                           alpha=alpha, color=color, 
                           label=band['label'])
        
        # Add forecast start line
        ax.axvline(x=len(historical_data), color='gray', linestyle=':', 
                  alpha=0.7, linewidth=1.5, label='Forecast Start')
        
        # Styling
        ax.set_title('Bitcoin Price Forecast with Quantile Bands', fontsize=16, fontweight='bold')
        ax.set_xlabel('Time Period', fontsize=12)
        ax.set_ylabel('Bitcoin Price ($)', fontsize=12)
        ax.grid(True, alpha=0.3)
        ax.legend(loc='upper left', fontsize=10)
        
        plt.tight_layout()
        plt.savefig('quantile_bands_visualization.png', dpi=300, bbox_inches='tight')
        plt.show()
        
        print("✅ Quantile bands visualization saved as: quantile_bands_visualization.png")

else:
    print("⚠️  No quantile bands available for analysis")
    print("   This may be due to:")
    print("   - Quantile forecasting not enabled")
    print("   - Model not returning quantiles")
    print("   - Invalid quantile indices selected")


## Forecast Analysis and Summary

Summarize each forecast (mean, range, standard deviation) and, where held-out actual values are available, report MAE, RMSE, and MAPE.
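
The error metrics used in this section can be written as one small function. This is a sketch with a hypothetical name (`forecast_errors`); it converts inputs to arrays first so that plain lists work as well.

```python
import numpy as np

def forecast_errors(actual, forecast):
    """Compute MAE, RMSE, and MAPE (in percent) for two equal-length series."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if actual.shape != forecast.shape:
        raise ValueError("actual and forecast must have the same shape")
    err = forecast - actual
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        # MAPE is undefined when any actual value is 0.
        "mape": float(np.mean(np.abs(err / actual)) * 100),
    }

metrics = forecast_errors(actual=[100, 200, 400], forecast=[110, 190, 400])
print(metrics)
```

MAE and RMSE are in the units of the target (dollars here), while MAPE is scale-free, which makes it the easier one to compare across assets.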

In [None]:
# =============================================================================
# FORECAST ANALYSIS AND SUMMARY
# =============================================================================

print("📊 Forecast Analysis and Summary")
print("=" * 50)

# Calculate basic statistics for each forecast method
analysis_results = {}

for method, forecast in forecast_results.items():
    if method in ['linear_model', 'quantile_forecast', 'quantile_bands']:  # Skip auxiliary (non-array) results
        continue
        
    forecast_summary = forecaster.get_forecast_summary(forecast, prediction_intervals)
    analysis_results[method] = forecast_summary
    
    print(f"\n🔮 {method.upper()} FORECAST:")
    stats = forecast_summary['forecast_statistics']
    print(f"   Mean Forecast:     ${stats['mean']:,.0f}")
    print(f"   Forecast Range:    ${stats['min']:,.0f} - ${stats['max']:,.0f}")
    print(f"   Forecast Std:      ${stats['std']:,.0f}")
    
    # Calculate performance metrics if actual future data is available
    if actual_future is not None and len(actual_future) == len(forecast):
        mae = np.mean(np.abs(forecast - actual_future))
        mse = np.mean((forecast - actual_future) ** 2)
        rmse = np.sqrt(mse)
        mape = np.mean(np.abs((actual_future - forecast) / actual_future)) * 100
        
        print(f"   Performance Metrics:")
        print(f"      MAE:  ${mae:,.0f}")
        print(f"      RMSE: ${rmse:,.0f}")
        print(f"      MAPE: {mape:.2f}%")
        
        analysis_results[method]['performance'] = {
            'mae': mae,
            'rmse': rmse,
            'mape': mape
        }

# Display quantile bands analysis
if 'quantile_bands' in forecast_results:
    print(f"\n📈 QUANTILE BANDS ANALYSIS:")
    quantile_bands = forecast_results['quantile_bands']
    
    # Extract band information
    band_info = []
    for key in quantile_bands.keys():
        if key.endswith('_label'):
            band_name = key.replace('_label', '')
            if f'{band_name}_lower' in quantile_bands and f'{band_name}_upper' in quantile_bands:
                lower_values = quantile_bands[f'{band_name}_lower']
                upper_values = quantile_bands[f'{band_name}_upper']
                width = np.mean(np.array(upper_values) - np.array(lower_values))
                relative_width = width / np.mean(main_forecast) * 100
                
                band_info.append({
                    'band': band_name,
                    'label': quantile_bands[key],
                    'width': width,
                    'relative_width': relative_width
                })
                
                print(f"   {quantile_bands[key]:15} Width: ${width:,.0f} ({relative_width:.1f}% of forecast)")

# Display prediction intervals analysis (fallback)
elif prediction_intervals:
    print(f"\n📈 PREDICTION INTERVALS ANALYSIS:")
    for conf_level in CONFIDENCE_LEVELS:
        conf_pct = int(conf_level * 100)
        if f'lower_{conf_pct}' in prediction_intervals:
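            # Convert to arrays so the subtraction and comparisons also work for plain lists
            lower = np.asarray(prediction_intervals[f'lower_{conf_pct}'])
            upper = np.asarray(prediction_intervals[f'upper_{conf_pct}'])
            width = np.mean(upper - lower)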
            relative_width = width / np.mean(main_forecast) * 100
            
            print(f"   {conf_pct}% Confidence Interval:")
            print(f"      Average Width: ${width:,.0f} ({relative_width:.1f}% of forecast)")
            
            # Check coverage if actual data available
            if actual_future is not None and len(actual_future) == len(lower):
                coverage = np.mean((actual_future >= lower) & (actual_future <= upper)) * 100
                print(f"      Actual Coverage: {coverage:.1f}% (target: {conf_pct}%)")

# Model capabilities summary
print(f"\n🔧 MODEL CAPABILITIES SUMMARY:")
capabilities = model_wrapper.get_model_info()['capabilities']
for capability, available in capabilities.items():
    status = "✅ Available" if available else "❌ Not Available"
    print(f"   {capability:25} {status}")

# Configuration summary
print(f"\n⚙️ CONFIGURATION SUMMARY:")
print(f"   Context Length:        {CONTEXT_LEN} periods")
print(f"   Horizon Length:        {HORIZON_LEN} periods")
print(f"   Backend:               {DEVICE_BACKEND}")
print(f"   Covariates Used:       {'Yes' if USE_COVARIATES else 'No'}")
print(f"   Quantiles Computed:    {'Yes' if USE_QUANTILES else 'No'}")
print(f"   Bootstrap Samples:     {BOOTSTRAP_SAMPLES}")
print(f"   Confidence Levels:     {', '.join(f'{int(c * 100)}%' for c in CONFIDENCE_LEVELS)}")

print("\n" + "=" * 50)
print("🎯 SAPHENEIA TIMESFM DEMO COMPLETED SUCCESSFULLY!")
print("=" * 50)

## Experiment with Different Configurations

This section lets you experiment with different model configurations and parameters. Uncomment and modify the blocks in the cell below to explore the options.
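
As a rough illustration of a horizon-length experiment, the sketch below holds out the last `horizon` points of a series and scores a forecast against them. It is a minimal sketch under stated assumptions: `naive_forecast` is a hypothetical placeholder that repeats the last observed value and stands in for the real model call, and the series is synthetic.

```python
import numpy as np

def naive_forecast(history, horizon):
    """Placeholder model (hypothetical): repeat the last observed value."""
    return np.full(horizon, history[-1], dtype=float)

# Synthetic series: linear trend plus a 12-step seasonal cycle
t = np.arange(200)
series = 0.5 * t + 10.0 * np.sin(2 * np.pi * t / 12)

results = {}
for horizon in [12, 24, 48]:
    # Hold out the last `horizon` points as the evaluation window
    history, actual = series[:-horizon], series[-horizon:]
    forecast = naive_forecast(history, horizon)
    results[horizon] = float(np.mean(np.abs(forecast - actual)))

for horizon, mae in results.items():
    print(f"horizon={horizon:>2}  MAE={mae:.2f}")
```

Swapping the placeholder for the actual forecasting call keeps the rest of the loop unchanged, so results for different horizons stay comparable.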

In [None]:
# =============================================================================
# EXPERIMENTAL SECTION (OPTIONAL)
# =============================================================================

# Uncomment the sections below to experiment with different configurations:

# # 1. Try different horizon lengths
# print("🧪 Experimenting with different horizon lengths...")
# horizon_experiments = [12, 24, 48]
# 
# for horizon in horizon_experiments:
#     print(f"\nTesting horizon length: {horizon}")
#     # Update model horizon
#     model_wrapper.update_horizon(horizon)
#     # Would need to reload model here in practice
#     # ... perform forecasting with new horizon

# # 2. Test different covariate combinations
# print("🧪 Testing different covariate combinations...")
# covariate_combinations = [
#     {'dynamic_numerical_covariates': {'eth_price': covariates['dynamic_numerical_covariates']['eth_price']}},
#     {'dynamic_categorical_covariates': {'quarter': covariates['dynamic_categorical_covariates']['quarter']}}
# ]
# 
# for i, cov_subset in enumerate(covariate_combinations):
#     print(f"\nTesting covariate set {i+1}: {list(cov_subset.keys())}")
#     # ... perform forecasting with subset of covariates

# # 3. Compare different xreg_modes
# print("🧪 Comparing different xreg_modes...")
# xreg_modes = ["timesfm + xreg", "xreg + timesfm"]
# 
# for mode in xreg_modes:
#     print(f"\nTesting xreg_mode: {mode}")
#     # ... perform covariates forecasting with different modes

print("💡 Experimental section ready - uncomment blocks above to run experiments")

## Next Steps and Usage Guide

This completes the comprehensive Sapheneia TimesFM demo. Here's how to use this implementation in your own projects:

### 1. **Prepare Your Data**
- Save your time series data as a CSV file with a 'date' column
- Create a data definition JSON specifying column types
- Ensure sufficient data for your desired context and horizon lengths
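
The checks above could be sketched as follows. This is a hedged example, not the library's loader: the data definition layout (column name mapped to a role string), the role names, the column names, and the `check_dataset` helper are all assumptions made for illustration. The row requirement of `CONTEXT_LEN + HORIZON_LEN` is likewise an assumption that fits a setup where the last horizon is held out or covered by future covariates.

```python
import io
import json

import pandas as pd

CONTEXT_LEN = 100   # same values as the configuration cell
HORIZON_LEN = 24

# Hypothetical data definition: maps each column to a role
data_definition = json.loads("""
{
  "btc_price": "target",
  "eth_price": "dynamic_numerical",
  "quarter": "dynamic_categorical"
}
""")

def check_dataset(df, definition, context_len, horizon_len):
    """Return a list of problems; an empty list means the data looks usable."""
    problems = []
    if "date" not in df.columns:
        problems.append("missing 'date' column")
    for column in definition:
        if column not in df.columns:
            problems.append(f"column '{column}' is in the data definition but not in the CSV")
    required_rows = context_len + horizon_len
    if len(df) < required_rows:
        problems.append(f"need at least {required_rows} rows, found {len(df)}")
    return problems

# Synthetic CSV built in memory so the example runs without files on disk
dates = pd.date_range("2024-01-01", periods=130, freq="D")
csv_text = pd.DataFrame({
    "date": dates,
    "btc_price": range(130),
    "eth_price": range(130),
    "quarter": dates.quarter,
}).to_csv(index=False)

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
print(check_dataset(df, data_definition, CONTEXT_LEN, HORIZON_LEN))
print(check_dataset(df.head(50), data_definition, CONTEXT_LEN, HORIZON_LEN))
```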

### 2. **Configure the Model**
- Choose between HuggingFace checkpoints or local model paths
- Set appropriate context and horizon lengths for your use case
- Select the backend based on your hardware (CPU/GPU/TPU)

### 3. **Customize Forecasting**
- Enable/disable covariates based on your data availability
- Adjust bootstrap samples and confidence levels as needed
- Experiment with different xreg_modes for covariates

### 4. **Integration Options**
- Use the Python modules directly in scripts
- Integrate with web applications using the provided webapp
- Deploy to cloud platforms using the setup scripts

### 5. **Key Files Created**
- **`/src/`**: Complete Python library for TimesFM
- **Data files**: Sample data and data definitions
- **Visualizations**: Professional forecast plots
- **This notebook**: Complete working example

For more information, see the README.md and setup documentation.

## Updated Features (Webapp-Aligned)

This notebook has been updated to align with the webapp implementation and includes:

### 🎯 **Enhanced Quantile Forecasting**
- **Multiple Quantile Bands**: Create multiple quantile bands based on selected indices
- **Proper Sorting**: Quantiles are automatically sorted to ensure Q0 ≤ Q1 ≤ ... ≤ Q9
- **Readable Labels**: Bands are labeled as "Q10–Q90", "Q20–Q80", etc.
- **Color Coding**: Each band gets a unique color for easy distinction
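
One way the sorting and band construction described above might work is sketched below. The `_lower`, `_upper`, and `_label` key suffixes follow the dictionary layout used in this notebook; everything else is an assumption: the `build_quantile_bands` name, the `band_0`-style band names, the rule of pairing selected indices from the outside in, and the convention that column `i` holds the `i * 10`% quantile.

```python
import numpy as np

def build_quantile_bands(quantiles, indices):
    """Build a bands dict from a (horizon, n_quantiles) array.

    Assumes column i holds the (i * 10)% quantile. Selected indices are
    paired from the outside in; an unpaired middle index is ignored.
    """
    # Sort each time step so the quantiles are non-decreasing
    q = np.sort(np.asarray(quantiles, dtype=float), axis=1)
    idx = sorted(set(indices))
    bands = {}
    for b in range(len(idx) // 2):
        lo, hi = idx[b], idx[-(b + 1)]
        name = f"band_{b}"  # hypothetical naming scheme
        bands[f"{name}_lower"] = q[:, lo]
        bands[f"{name}_upper"] = q[:, hi]
        bands[f"{name}_label"] = f"Q{lo * 10}–Q{hi * 10}"
    return bands

# Toy quantile array: 3 forecast steps, 10 quantile columns
quantiles = np.tile(np.arange(10.0), (3, 1)) + np.array([[100.0], [110.0], [120.0]])
quantiles[1, [1, 9]] = quantiles[1, [9, 1]]  # introduce a quantile crossing at step 1

bands = build_quantile_bands(quantiles, [1, 9, 2, 8])
for key, value in bands.items():
    print(key, value)
```

Sorting per time step repairs quantile crossings, so every lower bound stays at or below its upper bound.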

### 📊 **Improved Visualization**
- **Quantile Band Priority**: Quantile bands take priority over bootstrap intervals
- **Dedicated Quantile Plot**: Separate visualization focusing on quantile bands
- **Professional Styling**: Consistent with webapp visualization standards

### 💾 **Data Export**
- **CSV Export**: Complete forecast data with all quantile bands
- **Structured Format**: Period, Point Forecast, and all quantile band boundaries
- **Timestamped Files**: Automatic file naming with timestamps
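
A minimal sketch of such an export, assuming the bands dictionary layout used in this notebook. The `forecast_to_frame` helper, the `forecast_export_` filename prefix, and the toy values are hypothetical. The file is written to a temporary directory so the example leaves nothing behind.

```python
import os
import tempfile
from datetime import datetime

import numpy as np
import pandas as pd

def forecast_to_frame(point_forecast, quantile_bands):
    """Combine the point forecast and all band boundaries into one table."""
    frame = pd.DataFrame({
        "Period": np.arange(1, len(point_forecast) + 1),
        "Point Forecast": np.asarray(point_forecast, dtype=float),
    })
    for key, values in quantile_bands.items():
        # Keep only the numeric boundaries; labels are plain strings
        if key.endswith("_lower") or key.endswith("_upper"):
            frame[key] = np.asarray(values, dtype=float)
    return frame

# Toy values for illustration
point_forecast = [100.0, 102.0, 104.0]
quantile_bands = {
    "band_0_lower": [95.0, 96.0, 97.0],
    "band_0_upper": [105.0, 108.0, 111.0],
    "band_0_label": "Q10–Q90",
}

frame = forecast_to_frame(point_forecast, quantile_bands)
filename = f"forecast_export_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, filename)
    frame.to_csv(path, index=False)
    reloaded = pd.read_csv(path)

print(filename)
print(reloaded)
```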

### ⚙️ **Configuration**
- **Quantile Indices**: Configurable quantile selection (default: [1, 9])
- **Color Palette**: 6 distinct colors for quantile bands
- **Error Handling**: Robust error handling and logging

### 🔄 **Webapp Alignment**
- **Same Logic**: Uses identical forecasting and visualization logic as webapp
- **Consistent Results**: Results should match the webapp output for the same data and configuration
- **Easy Migration**: Code can be easily adapted for webapp use


In [None]:
# =============================================================================
# USAGE EXAMPLES AND EXPERIMENTATION
# =============================================================================

print("🧪 Usage Examples and Experimentation")
print("=" * 50)

print("\n💡 EXPERIMENT WITH DIFFERENT QUANTILE INDICES:")
print("   # Try different quantile combinations:")
print("   QUANTILE_INDICES = [0, 2, 4, 6, 8]  # More bands")
print("   QUANTILE_INDICES = [1, 5, 9]        # Fewer, wider bands")
print("   QUANTILE_INDICES = [2, 8]           # 20%-80% band only")

print("\n💡 EXPERIMENT WITH DIFFERENT CONFIGURATIONS:")
print("   # Try different context/horizon lengths:")
print("   CONTEXT_LEN = 50   # Shorter context")
print("   HORIZON_LEN = 48   # Longer forecast")

print("\n💡 EXPERIMENT WITH DIFFERENT DATA:")
print("   # Load your own data:")
print("   DATA_FILE = 'path/to/your/data.csv'")
print("   # Update data_definition accordingly")

print("\n💡 INTEGRATION WITH WEBAPP:")
print("   # The webapp uses the same logic as this notebook")
print("   # Results should be identical between webapp and notebook")
print("   # Use this notebook for development and debugging")

print("\n✅ NOTEBOOK READY FOR EXPERIMENTATION!")
print("   Edit the configuration cell, or uncomment the blocks in the experimental cell above, to try different settings.")
