# DeepCAR Experiment: ALPIN-Enhanced DeepAR Forecasting

This notebook implements and compares **baseline DeepAR** vs **ALPIN-enhanced DeepAR** using the **BatchCP** method from the DeepCAR paper.

## Background: DeepCAR (Changepoint-Aware DeepAR)

The DeepCAR paper proposes a simple but effective approach to improve probabilistic forecasting:

1. **Problem**: Standard DeepAR training uses all available data, including windows that span regime changes (changepoints). Training on these "contaminated" batches teaches the model incorrect temporal patterns.

2. **Solution - BatchCP**: Filter out training batches whose encoder windows contain or overlap with detected changepoints. This ensures the model only learns from "clean" homogeneous segments.

3. **Key Insight**: By using ALPIN for accurate changepoint detection, we can identify and exclude problematic training samples, leading to better forecast accuracy.
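
The filtering rule in step 2 reduces to an interval-overlap test. The following is a minimal sketch rather than the paper's code: the helper name `encoder_window_is_clean` and the `tolerance` padding are illustrative assumptions, chosen to mirror the check used later in this notebook.

```python
from typing import Sequence


def encoder_window_is_clean(
    start: int, end: int, changepoints: Sequence[int], tolerance: int = 0
) -> bool:
    """Return True if no changepoint lies in [start - tolerance, end + tolerance]."""
    return not any(start - tolerance <= cp <= end + tolerance for cp in changepoints)


# Hypothetical series with changepoints at t=100 and t=250
cps = [100, 250]
print(encoder_window_is_clean(0, 59, cps))                  # window ends well before t=100
print(encoder_window_is_clean(60, 119, cps))                # window spans t=100
print(encoder_window_is_clean(101, 160, cps, tolerance=2))  # t=100 falls inside the margin
```

A window that starts just after a changepoint is clean on its own but is rejected once a safety margin is applied, which is the role `CP_TOLERANCE` plays in the configuration below.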

## Experiment Goal

Compare forecast accuracy (MAE/RMSE) between:
- **Baseline DeepAR**: Trained on all data
- **ALPIN-Enhanced DeepAR**: Trained with BatchCP filtering

We expect the ALPIN-enhanced version to produce better forecasts, especially on signals with clear regime changes.
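
For reference, both metrics can be computed by hand on a tiny example. This is a plain-Python sketch with made-up numbers, not output from the experiment; the function names `mae` and `rmse` are local to the example.

```python
import math
from typing import Sequence


def mae(predictions: Sequence[float], actuals: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(actuals)


def rmse(predictions: Sequence[float], actuals: Sequence[float]) -> float:
    """Root mean squared error; penalizes large errors more than MAE does."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals))


preds = [1.0, 2.0, 3.0]
actuals = [2.0, 2.0, 5.0]
print(f"MAE:  {mae(preds, actuals):.4f}")   # absolute errors are 1, 0, 2
print(f"RMSE: {rmse(preds, actuals):.4f}")  # squared errors are 1, 0, 4
```

Because RMSE squares each error, the two metrics can move in different directions when a model trades a few large errors for many small ones.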

## 1. Setup and Imports

**Important**: Before running this notebook, ensure dependencies are installed:

```bash
uv sync
```

This will install `pytorch-forecasting`, `pytorch-lightning`, and `torch`.

In [None]:
import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from typing import Dict, List, Iterator, Any

# PyTorch
import torch
from torch.utils.data import DataLoader

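# PyTorch Lightning
# Note: pytorch-forecasting >= 1.0 is built on `lightning.pytorch`. If `Trainer.fit`
# rejects the DeepAR model as not being a LightningModule, use
# `import lightning.pytorch as pl` instead.
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping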

# PyTorch Forecasting
from pytorch_forecasting import TimeSeriesDataSet, DeepAR
from pytorch_forecasting.data import GroupNormalizer
from pytorch_forecasting.metrics import MAE, RMSE

# ALPIN
from alpin import ALPIN
from alpin.data.synthetic import generate_synthetic_signals, alpin_signals_to_deepar_df
from alpin.metrics import evaluate_all
from alpin.visualization import plot_signal

# Reproducibility
SEED = 42
np.random.seed(SEED)
torch.manual_seed(SEED)
pl.seed_everything(SEED)

# Plotting style
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['figure.dpi'] = 100

print("Setup complete!")
print(f"PyTorch version: {torch.__version__}")
print(f"PyTorch Lightning version: {pl.__version__}")

## 2. Configuration

Define hyperparameters for the experiment.

In [None]:
# Data configuration
N_SIGNALS = 10          # Number of synthetic signals
N_SAMPLES = 500         # Samples per signal
NOISE_STD = 1.0         # Noise standard deviation

# DeepAR configuration
MAX_ENCODER_LENGTH = 60   # Context window
MAX_PREDICTION_LENGTH = 20  # Forecast horizon
BATCH_SIZE = 32

# Training configuration
MAX_EPOCHS = 20
LEARNING_RATE = 1e-3
HIDDEN_SIZE = 32
RNN_LAYERS = 2
DROPOUT = 0.1

# BatchCP configuration
CP_TOLERANCE = 2  # Safety margin around changepoints

# Train/Val split
TRAIN_RATIO = 0.8

print("Configuration:")
print(f"  Signals: {N_SIGNALS} x {N_SAMPLES} samples")
print(f"  Encoder length: {MAX_ENCODER_LENGTH}")
print(f"  Prediction length: {MAX_PREDICTION_LENGTH}")
print(f"  Batch size: {BATCH_SIZE}")
print(f"  Max epochs: {MAX_EPOCHS}")

## 3. Generate Synthetic Data

We generate piecewise constant signals with known changepoints using ALPIN's synthetic data generator.
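
To make the data model concrete, here is a standard-library sketch of a piecewise-constant signal with Gaussian noise. It is a hypothetical stand-in, not ALPIN's `generate_synthetic_signals`: the function name, the uniform range for segment levels, and the way changepoints are drawn are all assumptions for illustration.

```python
import random
from typing import List, Tuple


def make_piecewise_constant(
    n_samples: int, n_changepoints: int, noise_std: float, seed: int
) -> Tuple[List[float], List[int]]:
    """Hypothetical generator: one constant mean per segment plus Gaussian noise."""
    rng = random.Random(seed)
    # Draw distinct changepoint positions strictly inside the signal
    changepoints = sorted(rng.sample(range(1, n_samples), n_changepoints))
    boundaries = [0] + changepoints + [n_samples]
    signal: List[float] = []
    for left, right in zip(boundaries[:-1], boundaries[1:]):
        level = rng.uniform(-5.0, 5.0)  # segment mean (arbitrary illustrative range)
        signal.extend(level + rng.gauss(0.0, noise_std) for _ in range(right - left))
    return signal, changepoints


signal, cps = make_piecewise_constant(n_samples=500, n_changepoints=3, noise_std=1.0, seed=42)
print(len(signal), cps)
```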

In [None]:
# Generate synthetic signals with ground truth changepoints
signals, changepoints = generate_synthetic_signals(
    n_signals=N_SIGNALS,
    n_samples=N_SAMPLES,
    noise_std=NOISE_STD,
    seed=SEED
)

print(f"Generated {len(signals)} signals, each with {N_SAMPLES} samples")
print(f"Changepoints per signal: {[len(cp) for cp in changepoints]}")
print(f"Total changepoints: {sum(len(cp) for cp in changepoints)}")

In [None]:
# Convert to DeepAR-compatible DataFrame
df = alpin_signals_to_deepar_df(signals, changepoints)

print(f"DataFrame shape: {df.shape}")
print(f"Columns: {list(df.columns)}")
print(f"Series IDs: {df['series_id'].unique().tolist()[:5]}...")
df.head(10)

In [None]:
# Split into train/validation by time
train_cutoff = int(N_SAMPLES * TRAIN_RATIO)

# For time series, we split by time index
train_df = df[df['time_idx'] < train_cutoff].copy()
val_df = df[df['time_idx'] >= train_cutoff - MAX_ENCODER_LENGTH].copy()  # Include encoder context

print(f"Train samples: {len(train_df)} (time_idx < {train_cutoff})")
print(f"Val samples: {len(val_df)} (time_idx >= {train_cutoff - MAX_ENCODER_LENGTH})")
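
The `- MAX_ENCODER_LENGTH` offset in the validation split exists because every forecast window needs a full encoder context before its first predicted step. The sketch below counts how many sliding windows a split of a given length can supply. `count_windows` is a hypothetical helper, and it assumes fixed-length windows advanced one step at a time, which may differ from how `TimeSeriesDataSet` is configured.

```python
def count_windows(n_steps: int, encoder_length: int, prediction_length: int) -> int:
    """Number of fixed-length (encoder + prediction) windows in a series of n_steps."""
    return max(n_steps - (encoder_length + prediction_length) + 1, 0)


n_samples, train_ratio = 500, 0.8
encoder_length, prediction_length = 60, 20
train_cutoff = int(n_samples * train_ratio)
val_steps = n_samples - train_cutoff

train_windows = count_windows(train_cutoff, encoder_length, prediction_length)
val_without_context = count_windows(val_steps, encoder_length, prediction_length)
val_with_context = count_windows(val_steps + encoder_length, encoder_length, prediction_length)
print(train_windows, val_without_context, val_with_context)
```

Under these assumptions, prepending the encoder context to the validation split raises the number of usable validation windows per series considerably, since the first forecast can then start right at the cutoff.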

In [None]:
# Visualize example signals with changepoints
fig, axes = plt.subplots(3, 1, figsize=(14, 10), sharex=True)

for i, ax in enumerate(axes):
    signal = signals[i]
    cps = changepoints[i]
    
    ax.plot(signal, color='#2C3E50', linewidth=1.2, alpha=0.8, label='Signal')
    
    for j, cp in enumerate(cps):
        label = 'Changepoint' if j == 0 else None
        ax.axvline(x=cp, color='#E74C3C', linestyle='--', linewidth=2, alpha=0.7, label=label)
    
    # Mark train/val split
    ax.axvline(x=train_cutoff, color='#27AE60', linestyle='-', linewidth=2, alpha=0.9, label='Train/Val Split')
    
    ax.set_ylabel(f'Signal {i}')
    ax.legend(loc='upper right')
    ax.set_title(f'Series {i}: {len(cps)} changepoints', fontweight='bold')

axes[-1].set_xlabel('Time Index')
plt.suptitle('Synthetic Signals with Ground Truth Changepoints', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

## 4. ALPIN Changepoint Detection

Fit ALPIN on the training portion of each signal, using the ground-truth changepoints before the train/validation cutoff as labels. The changepoints it then detects are used for BatchCP filtering.

In [None]:
# Use only training portion of signals for ALPIN training
train_signals = [s[:train_cutoff] for s in signals]
train_changepoints = [[cp for cp in cps if cp < train_cutoff] for cps in changepoints]

# Train ALPIN model
alpin_model = ALPIN()
alpin_model.fit(train_signals, train_changepoints)

print(f"ALPIN learned optimal beta: {alpin_model.beta_opt:.4f}")

In [None]:
# Predict changepoints on all signals (full length for completeness)
detected_changepoints = {}
for i, signal in enumerate(signals):
    series_id = f"series_{i}"
    detected = alpin_model.predict(signal)
    detected_changepoints[series_id] = detected

print("Detected changepoints per series:")
for sid, cps in detected_changepoints.items():
    print(f"  {sid}: {cps}")

In [None]:
# Evaluate ALPIN detection quality
print("\nALPIN Detection Metrics:")
print("-" * 50)

all_metrics = []
for i in range(N_SIGNALS):
    series_id = f"series_{i}"
    detected = detected_changepoints[series_id]
    ground_truth = changepoints[i]
    
    metrics = evaluate_all(detected, ground_truth, N_SAMPLES, tolerance=10)
    all_metrics.append(metrics)
    
    print(f"Series {i}: Precision={metrics['precision']:.2f}, Recall={metrics['recall']:.2f}")

# Average metrics
avg_metrics = pd.DataFrame(all_metrics).mean()
print("-" * 50)
print(f"Average: Precision={avg_metrics['precision']:.2f}, Recall={avg_metrics['recall']:.2f}")
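
Precision and recall for changepoints are usually computed with a matching tolerance, since a detection a few samples off is still useful. The sketch below shows one common convention (greedy one-to-one matching within `tolerance`). It is an assumption for illustration and not necessarily how `alpin.metrics.evaluate_all` computes its values; the function name `precision_recall` is hypothetical.

```python
from typing import Dict, Sequence


def precision_recall(
    detected: Sequence[int], ground_truth: Sequence[int], tolerance: int
) -> Dict[str, float]:
    """Match each true changepoint to at most one detection within +/- tolerance."""
    unmatched = sorted(detected)
    true_positives = 0
    for cp in sorted(ground_truth):
        # Still-unmatched detections close enough to this true changepoint
        candidates = [d for d in unmatched if abs(d - cp) <= tolerance]
        if candidates:
            best = min(candidates, key=lambda d: abs(d - cp))
            unmatched.remove(best)
            true_positives += 1
    # Convention assumed here: an empty list scores 1.0 rather than raising
    precision = true_positives / len(detected) if detected else 1.0
    recall = true_positives / len(ground_truth) if ground_truth else 1.0
    return {"precision": precision, "recall": recall}


# Hypothetical detections: 98 and 255 match the truth, 400 is a false alarm
result = precision_recall(detected=[98, 255, 400], ground_truth=[100, 250, 330], tolerance=10)
print(result)
```

For BatchCP, low recall matters most: a missed changepoint leaves contaminated windows in the training set, whereas a false alarm only discards some clean data.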

In [None]:
# Visualize ALPIN detection on one signal
example_idx = 0
plot_signal(
    signals[example_idx],
    true_changepoints=changepoints[example_idx],
    pred_changepoints=detected_changepoints[f'series_{example_idx}'],
    title=f'ALPIN Detection on Series {example_idx}'
)

## 5. Baseline DeepAR Training

Train DeepAR on all available training data without any filtering.

In [None]:
# Create TimeSeriesDataSet for training
training = TimeSeriesDataSet(
    train_df,
    time_idx="time_idx",
    target="value",
    group_ids=["series_id"],
    max_encoder_length=MAX_ENCODER_LENGTH,
    max_prediction_length=MAX_PREDICTION_LENGTH,
    time_varying_unknown_reals=["value"],
    add_relative_time_idx=True,
    add_target_scales=True,
    target_normalizer=GroupNormalizer(groups=["series_id"]),
)

# Create validation dataset from training parameters
validation = TimeSeriesDataSet.from_dataset(training, val_df, stop_randomization=True)

print(f"Training samples: {len(training)}")
print(f"Validation samples: {len(validation)}")

In [None]:
# Create DataLoaders
train_dataloader = training.to_dataloader(train=True, batch_size=BATCH_SIZE, num_workers=0)
val_dataloader = validation.to_dataloader(train=False, batch_size=BATCH_SIZE, num_workers=0)

print(f"Train batches: {len(train_dataloader)}")
print(f"Val batches: {len(val_dataloader)}")

In [None]:
# Create baseline DeepAR model
baseline_deepar = DeepAR.from_dataset(
    training,
    hidden_size=HIDDEN_SIZE,
    rnn_layers=RNN_LAYERS,
    dropout=DROPOUT,
    learning_rate=LEARNING_RATE,
    log_interval=10,
    log_val_interval=1,
    reduce_on_plateau_patience=3,
)

print(f"Baseline DeepAR parameters: {sum(p.numel() for p in baseline_deepar.parameters()):,}")

In [None]:
# Train baseline model
baseline_trainer = pl.Trainer(
    max_epochs=MAX_EPOCHS,
    gradient_clip_val=0.1,
    limit_train_batches=50,  # Limit for quick demo
    limit_val_batches=20,
    enable_progress_bar=True,
    enable_model_summary=False,
    logger=False,
)

print("Training Baseline DeepAR...")
baseline_trainer.fit(
    baseline_deepar,
    train_dataloaders=train_dataloader,
    val_dataloaders=val_dataloader,
)
print("Baseline training complete!")

In [None]:
# Generate baseline predictions
baseline_predictions = baseline_deepar.predict(val_dataloader, return_y=True, mode="prediction")

# Extract predictions and actuals
baseline_preds = baseline_predictions.output
baseline_actuals = baseline_predictions.y[0]

print(f"Baseline predictions shape: {baseline_preds.shape}")
print(f"Baseline actuals shape: {baseline_actuals.shape}")

In [None]:
# Calculate baseline metrics
def calculate_forecast_metrics(predictions: torch.Tensor, actuals: torch.Tensor) -> Dict[str, float]:
    """Calculate MAE and RMSE for forecasts."""
    preds = predictions.cpu().numpy().flatten()
    actual = actuals.cpu().numpy().flatten()
    
    mae = np.mean(np.abs(preds - actual))
    rmse = np.sqrt(np.mean((preds - actual) ** 2))
    
    return {'MAE': mae, 'RMSE': rmse}

baseline_metrics = calculate_forecast_metrics(baseline_preds, baseline_actuals)
print("Baseline DeepAR Metrics:")
print(f"  MAE:  {baseline_metrics['MAE']:.4f}")
print(f"  RMSE: {baseline_metrics['RMSE']:.4f}")

In [None]:
# Plot baseline forecast example
fig, ax = plt.subplots(figsize=(12, 5))

# Get first validation sample for visualization
sample_idx = 0
pred_sample = baseline_preds[sample_idx].cpu().numpy()
actual_sample = baseline_actuals[sample_idx].cpu().numpy()

time_axis = np.arange(len(actual_sample))

ax.plot(time_axis, actual_sample, 'o-', color='#2E86AB', label='Actual', markersize=4)
ax.plot(time_axis, pred_sample, 's-', color='#E74C3C', label='Baseline Forecast', markersize=4)

ax.set_xlabel('Forecast Horizon')
ax.set_ylabel('Value')
ax.set_title('Baseline DeepAR: Example Forecast vs Actual', fontweight='bold')
ax.legend()
ax.grid(True, linestyle=':', alpha=0.6)

plt.tight_layout()
plt.show()

## 6. ALPIN-Enhanced DeepAR (BatchCP)

Implement the BatchCP filtering method and train DeepAR with changepoint-aware batch selection.

In [None]:
class ChangePointAwareDataLoader:
    """
    Wrapper that filters out batches containing changepoints in encoder window.
    
    Implements the BatchCP method from DeepCAR: batches where the encoder window
    overlaps with a detected changepoint are skipped during training.
    
    Parameters
    ----------
    dataloader : DataLoader
        Original PyTorch DataLoader from TimeSeriesDataSet
    changepoints_dict : Dict[str, List[int]]
        Mapping from series_id to list of changepoint indices
    encoder_length : int
        Length of the encoder window
    tolerance : int
        Safety margin around changepoints
    """
    
    def __init__(
        self,
        dataloader: DataLoader,
        changepoints_dict: Dict[str, List[int]],
        encoder_length: int,
        tolerance: int = 2
    ):
        self.dataloader = dataloader
        self.changepoints_dict = changepoints_dict
        self.encoder_length = encoder_length
        self.tolerance = tolerance
        self.filtered_count = 0
        self.total_count = 0
        
    def _batch_contains_changepoint(self, batch: tuple) -> bool:
        """
        Check if any sample in batch has a changepoint in its encoder window.
        
        Args:
            batch: Tuple of (x_dict, y_tuple) from TimeSeriesDataSet
            
        Returns:
            True if batch should be filtered out
        """
        x_dict, y = batch
        
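        # Determine each sample's encoder window. pytorch-forecasting batches do not
        # normally carry 'encoder_time_idx'; they expose 'decoder_time_idx' and
        # 'encoder_lengths', from which the window can be reconstructed.
        if 'encoder_time_idx' in x_dict:
            encoder_times = x_dict['encoder_time_idx']
        elif 'decoder_time_idx' in x_dict and 'encoder_lengths' in x_dict:
            decoder_start = x_dict['decoder_time_idx'][:, 0]
            encoder_start = decoder_start - x_dict['encoder_lengths']
            # Shape (batch_size, 2): first and last time index of each encoder window
            encoder_times = torch.stack([encoder_start, decoder_start - 1], dim=1)
        else:
            return False  # Cannot determine the window, don't filter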
        
        # Get series groups to identify which series each sample belongs to
        groups = x_dict.get('groups', None)
        if groups is None:
            return False  # Cannot determine series, don't filter
        
        batch_size = encoder_times.shape[0]
        
        for i in range(batch_size):
            # Get time range of this sample's encoder window
            sample_times = encoder_times[i].cpu().numpy()
            start_time = int(sample_times.min())
            end_time = int(sample_times.max())
            
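            # Get series ID for this sample. `groups` holds label-encoded IDs, so this
            # assumes code k maps to "series_k" (holds for the sorted IDs series_0..series_9).
            series_idx = int(groups[i, 0].item())  # First group dimension is series
            series_id = f"series_{series_idx}"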
            
            # Check if any changepoint falls within encoder window
            if series_id in self.changepoints_dict:
                for cp in self.changepoints_dict[series_id]:
                    # Check if changepoint (with tolerance) overlaps encoder window
                    if (start_time - self.tolerance) <= cp <= (end_time + self.tolerance):
                        return True
        
        return False
    
    def __iter__(self) -> Iterator:
        """Iterate over batches, skipping those with changepoints."""
        self.filtered_count = 0
        self.total_count = 0
        
        for batch in self.dataloader:
            self.total_count += 1
            
            if self._batch_contains_changepoint(batch):
                self.filtered_count += 1
                continue  # Skip this batch
            
            yield batch
    
    def __len__(self) -> int:
        """Return length of underlying dataloader (upper bound)."""
        return len(self.dataloader)
    
    def get_filtering_stats(self) -> Dict[str, Any]:
        """Get statistics about batch filtering."""
        return {
            'total_batches': self.total_count,
            'filtered_batches': self.filtered_count,
            'kept_batches': self.total_count - self.filtered_count,
            'filter_ratio': self.filtered_count / max(self.total_count, 1)
        }

print("ChangePointAwareDataLoader class defined.")
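
The wrapper pattern used above (skip inside `__iter__`, count what was dropped) can be tried without PyTorch. The toy below is a hypothetical reduction: `SkippingLoader`, the list-of-start-indices batches, and the single changepoint at t=100 are all invented for illustration.

```python
from typing import Callable, Iterable, Iterator, List


class SkippingLoader:
    """Hypothetical minimal wrapper: yield only batches that the predicate accepts."""

    def __init__(self, batches: Iterable[List[int]], should_skip: Callable[[List[int]], bool]):
        self.batches = batches
        self.should_skip = should_skip
        self.total_count = 0
        self.filtered_count = 0

    def __iter__(self) -> Iterator[List[int]]:
        # Counters restart on every pass, so they describe the latest epoch only
        self.total_count = 0
        self.filtered_count = 0
        for batch in self.batches:
            self.total_count += 1
            if self.should_skip(batch):
                self.filtered_count += 1
                continue
            yield batch


ENCODER_LENGTH = 60
CHANGEPOINT = 100


def hits_changepoint(batch: List[int]) -> bool:
    """True if any encoder window [start, start + ENCODER_LENGTH - 1] contains the changepoint."""
    return any(start <= CHANGEPOINT <= start + ENCODER_LENGTH - 1 for start in batch)


# Each toy "batch" lists encoder start indices for its samples
batches = [[0, 10, 20], [30, 50, 200], [150, 160, 170]]
loader = SkippingLoader(batches, hits_changepoint)
kept = list(loader)
print(kept, loader.total_count, loader.filtered_count)
```

Note that the second batch is dropped because of a single contaminated sample (start 50), even though its other two samples are clean. Batch-level filtering therefore discards more data than a per-sample filter would.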

In [None]:
# Create filtered dataloader for ALPIN-enhanced training
filtered_train_dataloader = ChangePointAwareDataLoader(
    dataloader=train_dataloader,
    changepoints_dict=detected_changepoints,
    encoder_length=MAX_ENCODER_LENGTH,
    tolerance=CP_TOLERANCE
)

print(f"Created ChangePointAwareDataLoader")
print(f"  Detected changepoints: {sum(len(cps) for cps in detected_changepoints.values())}")
print(f"  Encoder length: {MAX_ENCODER_LENGTH}")
print(f"  Tolerance: {CP_TOLERANCE}")

In [None]:
# Create ALPIN-enhanced DeepAR model (same architecture as baseline)
alpin_deepar = DeepAR.from_dataset(
    training,
    hidden_size=HIDDEN_SIZE,
    rnn_layers=RNN_LAYERS,
    dropout=DROPOUT,
    learning_rate=LEARNING_RATE,
    log_interval=10,
    log_val_interval=1,
    reduce_on_plateau_patience=3,
)

print(f"ALPIN-Enhanced DeepAR parameters: {sum(p.numel() for p in alpin_deepar.parameters()):,}")

In [None]:
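# Custom training loop with filtered dataloader
# Note: ChangePointAwareDataLoader is a plain iterable rather than a torch DataLoader,
# so this version is trained with a manual loop instead of pl.Trainer. Unlike the
# baseline run, the loop has no validation pass and no learning-rate scheduling,
# so the two training setups are not strictly identical.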

print("Training ALPIN-Enhanced DeepAR with BatchCP filtering...")

optimizer = torch.optim.Adam(alpin_deepar.parameters(), lr=LEARNING_RATE)
alpin_deepar.train()

training_losses = []

for epoch in range(MAX_EPOCHS):
    epoch_loss = 0.0
    batch_count = 0
    
    for batch in filtered_train_dataloader:
        optimizer.zero_grad()
        
        x, y = batch
        
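        # Forward pass: the network output wraps the distribution parameters under
        # "prediction", which is what the loss expects
        output = alpin_deepar(x)
        loss = alpin_deepar.loss(output["prediction"], y)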
        
        # Check for NaN
        if torch.isnan(loss):
            continue
        
        # Backward pass
        loss.backward()
        torch.nn.utils.clip_grad_norm_(alpin_deepar.parameters(), 0.1)
        optimizer.step()
        
        epoch_loss += loss.item()
        batch_count += 1
        
        # Limit batches for quick demo
        if batch_count >= 50:
            break
    
    avg_loss = epoch_loss / max(batch_count, 1)
    training_losses.append(avg_loss)
    
    if (epoch + 1) % 5 == 0 or epoch == 0:
        print(f"  Epoch {epoch+1}/{MAX_EPOCHS}: Loss = {avg_loss:.4f}, Batches used = {batch_count}")

# Get filtering statistics (counters reset on every pass, so these describe the final epoch only)
filter_stats = filtered_train_dataloader.get_filtering_stats()
print("\nBatchCP Filtering Statistics (final epoch):")
print(f"  Total batches processed: {filter_stats['total_batches']}")
print(f"  Filtered out: {filter_stats['filtered_batches']} ({filter_stats['filter_ratio']*100:.1f}%)")
print(f"  Batches used: {filter_stats['kept_batches']}")

print("\nALPIN-Enhanced training complete!")

In [None]:
# Generate ALPIN-enhanced predictions
alpin_deepar.eval()

alpin_predictions = alpin_deepar.predict(val_dataloader, return_y=True, mode="prediction")

alpin_preds = alpin_predictions.output
alpin_actuals = alpin_predictions.y[0]

print(f"ALPIN predictions shape: {alpin_preds.shape}")
print(f"ALPIN actuals shape: {alpin_actuals.shape}")

In [None]:
# Calculate ALPIN-enhanced metrics
alpin_metrics = calculate_forecast_metrics(alpin_preds, alpin_actuals)
print("ALPIN-Enhanced DeepAR Metrics:")
print(f"  MAE:  {alpin_metrics['MAE']:.4f}")
print(f"  RMSE: {alpin_metrics['RMSE']:.4f}")

In [None]:
# Plot ALPIN-enhanced forecast example
fig, ax = plt.subplots(figsize=(12, 5))

sample_idx = 0
pred_sample = alpin_preds[sample_idx].cpu().numpy()
actual_sample = alpin_actuals[sample_idx].cpu().numpy()

time_axis = np.arange(len(actual_sample))

ax.plot(time_axis, actual_sample, 'o-', color='#2E86AB', label='Actual', markersize=4)
ax.plot(time_axis, pred_sample, 's-', color='#27AE60', label='ALPIN-Enhanced Forecast', markersize=4)

ax.set_xlabel('Forecast Horizon')
ax.set_ylabel('Value')
ax.set_title('ALPIN-Enhanced DeepAR: Example Forecast vs Actual', fontweight='bold')
ax.legend()
ax.grid(True, linestyle=':', alpha=0.6)

plt.tight_layout()
plt.show()

## 7. Comparison & Analysis

Compare the performance of baseline DeepAR vs ALPIN-enhanced DeepAR.
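
The improvement figure reported below is the relative reduction in error with respect to the baseline. A small sketch of that formula follows; the error values are hypothetical and `relative_improvement` is a name local to this example.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Percentage reduction in error relative to the baseline (positive = new model is better)."""
    return (baseline_error - new_error) / baseline_error * 100


# Hypothetical error values, not results from this experiment
print(f"{relative_improvement(1.20, 1.08):+.2f}%")  # lower error than baseline -> positive
print(f"{relative_improvement(1.20, 1.26):+.2f}%")  # higher error than baseline -> negative
```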

In [None]:
# Create comparison table
comparison_df = pd.DataFrame({
    'Metric': ['MAE', 'RMSE'],
    'Baseline DeepAR': [f"{baseline_metrics['MAE']:.4f}", f"{baseline_metrics['RMSE']:.4f}"],
    'ALPIN-Enhanced': [f"{alpin_metrics['MAE']:.4f}", f"{alpin_metrics['RMSE']:.4f}"],
})

# Calculate improvement
mae_improvement = (baseline_metrics['MAE'] - alpin_metrics['MAE']) / baseline_metrics['MAE'] * 100
rmse_improvement = (baseline_metrics['RMSE'] - alpin_metrics['RMSE']) / baseline_metrics['RMSE'] * 100

comparison_df['Improvement'] = [f"{mae_improvement:+.2f}%", f"{rmse_improvement:+.2f}%"]

print("=" * 60)
print("FORECAST ACCURACY COMPARISON")
print("=" * 60)
display(comparison_df)
print("\n(Positive improvement = ALPIN-Enhanced is better)")

In [None]:
# Visualization: Side-by-side metric comparison
fig, ax = plt.subplots(figsize=(10, 6))

metrics = ['MAE', 'RMSE']
x = np.arange(len(metrics))
width = 0.35

baseline_vals = [baseline_metrics['MAE'], baseline_metrics['RMSE']]
alpin_vals = [alpin_metrics['MAE'], alpin_metrics['RMSE']]

bars1 = ax.bar(x - width/2, baseline_vals, width, label='Baseline DeepAR', color='#E74C3C', edgecolor='white')
bars2 = ax.bar(x + width/2, alpin_vals, width, label='ALPIN-Enhanced', color='#27AE60', edgecolor='white')

ax.set_ylabel('Error')
ax.set_title('Forecast Accuracy: Baseline vs ALPIN-Enhanced DeepAR', fontweight='bold', fontsize=14)
ax.set_xticks(x)
ax.set_xticklabels(metrics, fontsize=12)
ax.legend(loc='upper right', fontsize=11)
ax.bar_label(bars1, fmt='%.3f', padding=3, fontsize=10)
ax.bar_label(bars2, fmt='%.3f', padding=3, fontsize=10)

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.grid(axis='y', linestyle=':', alpha=0.5)

plt.tight_layout()
plt.show()

In [None]:
# Side-by-side forecast comparison
fig, axes = plt.subplots(2, 1, figsize=(14, 8), sharex=True)

sample_idx = 0
baseline_sample = baseline_preds[sample_idx].cpu().numpy()
alpin_sample = alpin_preds[sample_idx].cpu().numpy()
actual_sample = baseline_actuals[sample_idx].cpu().numpy()  # Same actuals
time_axis = np.arange(len(actual_sample))

# Top: Baseline
axes[0].plot(time_axis, actual_sample, 'o-', color='#2C3E50', label='Actual', markersize=5)
axes[0].plot(time_axis, baseline_sample, 's-', color='#E74C3C', label='Baseline Forecast', markersize=5)
axes[0].set_ylabel('Value')
axes[0].set_title('Baseline DeepAR Forecast', fontweight='bold')
axes[0].legend(loc='upper right')
axes[0].grid(True, linestyle=':', alpha=0.5)
axes[0].spines['top'].set_visible(False)
axes[0].spines['right'].set_visible(False)

# Bottom: ALPIN-Enhanced
axes[1].plot(time_axis, actual_sample, 'o-', color='#2C3E50', label='Actual', markersize=5)
axes[1].plot(time_axis, alpin_sample, 's-', color='#27AE60', label='ALPIN-Enhanced Forecast', markersize=5)
axes[1].set_xlabel('Forecast Horizon')
axes[1].set_ylabel('Value')
axes[1].set_title('ALPIN-Enhanced DeepAR Forecast (BatchCP)', fontweight='bold')
axes[1].legend(loc='upper right')
axes[1].grid(True, linestyle=':', alpha=0.5)
axes[1].spines['top'].set_visible(False)
axes[1].spines['right'].set_visible(False)

plt.suptitle('Forecast Comparison: Example Prediction', fontsize=14, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

In [None]:
# BatchCP Filtering Statistics Summary
print("=" * 60)
print("BATCHCP FILTERING SUMMARY")
print("=" * 60)
print(f"\nDetected Changepoints:")
for sid, cps in detected_changepoints.items():
    print(f"  {sid}: {len(cps)} changepoints at indices {cps}")

print(f"\nFiltering Configuration:")
print(f"  Encoder window: {MAX_ENCODER_LENGTH} samples")
print(f"  Tolerance margin: ±{CP_TOLERANCE} samples")

print(f"\nFiltering Results:")
print(f"  Total batches seen: {filter_stats['total_batches']}")
print(f"  Batches filtered out: {filter_stats['filtered_batches']} ({filter_stats['filter_ratio']*100:.1f}%)")
print(f"  Clean batches used: {filter_stats['kept_batches']}")

In [None]:
# Final Analysis
print("=" * 60)
print("EXPERIMENT CONCLUSIONS")
print("=" * 60)

if alpin_metrics['MAE'] < baseline_metrics['MAE']:
    print("\n✅ ALPIN-Enhanced DeepAR OUTPERFORMS Baseline")
    print(f"   MAE improved by {mae_improvement:.2f}%")
    print(f"   RMSE change: {rmse_improvement:+.2f}% (positive = improvement)")
    print("\n   The BatchCP method successfully reduced forecast error by")
    print("   filtering out training batches that span regime changes.")
else:
    print("\n⚠️ ALPIN-Enhanced DeepAR shows similar or worse performance")
    print(f"   MAE difference: {mae_improvement:.2f}%")
    print(f"   RMSE difference: {rmse_improvement:.2f}%")
    print("\n   Possible reasons:")
    print("   - Synthetic data may not have strong regime effects")
    print("   - Training data reduced too much by filtering")
    print("   - Model needs more epochs to converge")

print("\n" + "-" * 60)
print("Key Findings:")
print("-" * 60)
print(f"1. ALPIN detected {sum(len(cps) for cps in detected_changepoints.values())} changepoints across {N_SIGNALS} signals")
print(f"2. BatchCP filtered {filter_stats['filter_ratio']*100:.1f}% of training batches")
print(f"3. Baseline MAE: {baseline_metrics['MAE']:.4f}, ALPIN-Enhanced MAE: {alpin_metrics['MAE']:.4f}")
print(f"4. Baseline RMSE: {baseline_metrics['RMSE']:.4f}, ALPIN-Enhanced RMSE: {alpin_metrics['RMSE']:.4f}")

---

## Summary

In this notebook, we:

1. **Generated synthetic signals** with known changepoints using ALPIN's data generator
2. **Trained ALPIN** to detect changepoints in the synthetic data
3. **Trained baseline DeepAR** on all available training data
4. **Implemented BatchCP filtering** via `ChangePointAwareDataLoader`
5. **Trained ALPIN-enhanced DeepAR** using only "clean" batches without changepoints
6. **Compared forecast accuracy** between both approaches

### Key Takeaways

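- **BatchCP is a simple training-time filter** that can improve forecast quality by avoiding regime-spanning training samples; whether it did so here is shown by the comparison in Section 7
- **BatchCP depends on changepoint detection quality**: the filter is only as good as ALPIN's detections, whose precision and recall are reported in Section 4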
- **The improvement depends on data characteristics**: signals with strong regime changes benefit most
- **Trade-off**: Filtering reduces training data, which may hurt if changepoints are very frequent
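
The trade-off in the last point can be quantified roughly. Because a batch is dropped as soon as one of its samples touches a changepoint, the share of surviving batches shrinks quickly with batch size. The sketch below assumes samples in a batch are independent and each is clean with probability `p_clean`; both the independence assumption and the value 0.9 are illustrative, not measured from this experiment.

```python
def expected_kept_fraction(p_clean: float, batch_size: int) -> float:
    """Chance that every sample in a batch is clean, assuming independent samples."""
    return p_clean ** batch_size


for batch_size in (1, 8, 32):
    kept = expected_kept_fraction(p_clean=0.9, batch_size=batch_size)
    print(f"batch_size={batch_size:>2}: about {kept:.1%} of batches survive")
```

If the filtering statistics show that almost every batch is rejected, smaller batches or filtering individual samples before batching are options worth testing.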

### Next Steps

- Test on real financial data with actual regime changes
- Compare with other changepoint-aware methods
- Experiment with different filtering tolerances
- Evaluate on longer forecast horizons