# Change Point Detection

This notebook demonstrates change point detection methods in TimeSmith.

## What You'll Learn

- Creating data with change points
- PELT (Pruned Exact Linear Time) detector
- CUSUM (Cumulative Sum) detector
- Visualizing detected change points

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Try to import change point detectors
try:
    from timesmith.core.changepoint import PELTDetector, CUSUMDetector
    HAS_CHANGEPOINT = True
except ImportError:
    HAS_CHANGEPOINT = False
    print("Change point detectors not available (requires optional 'ruptures' package)")

np.random.seed(42)
# Only report success when the import above actually worked
if HAS_CHANGEPOINT:
    print("Change point detection tools loaded!")

## 1. Create Data with Change Points

Let's create a time series with known change points.

In [None]:
# Create data with change points
dates = pd.date_range('2020-01-01', periods=200, freq='D')

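# Four random-walk segments (cumulative sums of Gaussian noise),
# each offset to a different baseline level
y1 = np.random.randn(50).cumsum() + 100
y2 = np.random.randn(50).cumsum() + 120  # Level shift at index 50
y3 = np.random.randn(50).cumsum() + 90   # Level shift at index 100
y4 = np.random.randn(50).cumsum() + 110  # Level shift at index 150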

y = pd.Series(np.concatenate([y1, y2, y3, y4]), index=dates)

true_changepoints = [50, 100, 150]
print(f"True change points at indices: {true_changepoints}")
print(f"True change point dates: {[dates[i] for i in true_changepoints]}")

# Visualize
plt.figure(figsize=(14, 6))
plt.plot(y.index, y.values, linewidth=2, label='Time Series', color='steelblue')
for cp_idx in true_changepoints:
    plt.axvline(x=dates[cp_idx], color='red', linestyle='--', 
               linewidth=2, alpha=0.7, label='True Change Point' if cp_idx == true_changepoints[0] else '')
plt.title('Time Series with Change Points', fontsize=14, fontweight='bold')
plt.xlabel('Date')
plt.ylabel('Value')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## 2. PELT Detector (if available)

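PELT (Pruned Exact Linear Time) finds the set of change points that minimizes a segment cost plus a penalty for each change point. It discards candidate positions that can no longer be optimal, which keeps the search exact while making the running time close to linear in the series length under favorable conditions. A larger `penalty` yields fewer change points.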
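To make the idea behind PELT concrete, here is a minimal NumPy sketch of penalized segmentation with the PELT pruning rule, using a squared-error cost that targets shifts in the mean. It is not TimeSmith's or `ruptures`' implementation: the function name `pelt_mean_shift`, the cost choice, and the example penalty are assumptions made for illustration only.

```python
import numpy as np

def pelt_mean_shift(signal, penalty):
    """Hypothetical helper: return the start index of every new segment."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    # Prefix sums give the squared-error cost of any segment in O(1).
    s1 = np.concatenate([[0.0], np.cumsum(x)])
    s2 = np.concatenate([[0.0], np.cumsum(x ** 2)])

    def cost(start, end):
        # Sum of squared deviations of x[start:end] from its own mean.
        total = s1[end] - s1[start]
        return (s2[end] - s2[start]) - total * total / (end - start)

    best = np.full(n + 1, np.inf)   # best[t]: optimal penalized cost of x[:t]
    best[0] = -penalty              # so the first segment carries no penalty
    last = np.zeros(n + 1, dtype=int)
    candidates = [0]
    for end in range(1, n + 1):
        values = [best[s] + cost(s, end) + penalty for s in candidates]
        k = int(np.argmin(values))
        best[end] = values[k]
        last[end] = candidates[k]
        # Pruning: keep a start s only if best[s] + cost(s, end) <= best[end].
        candidates = [s for s, v in zip(candidates, values)
                      if v - penalty <= best[end]]
        candidates.append(end)

    # Walk back through the stored segment starts to recover the change points.
    change_points = []
    end = n
    while last[end] > 0:
        change_points.append(int(last[end]))
        end = last[end]
    return sorted(change_points)

# Example on a noisy piecewise-constant signal with shifts at 50 and 100.
rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(0, 1, 50),
                         rng.normal(10, 1, 50),
                         rng.normal(-5, 1, 50)])
print(pelt_mean_shift(signal, penalty=20.0))
```

Raising `penalty` in this sketch makes each extra change point more expensive, so fewer are reported; lowering it has the opposite effect.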

In [None]:
if HAS_CHANGEPOINT:
    # PELT change point detection
    pelt = PELTDetector(penalty=10)
    pelt.fit(y)
    changepoints = pelt.predict(y)
    
    detected_indices = y.index[changepoints].tolist() if changepoints.sum() > 0 else []
    
    print("PELT detection:")
    print(f"  Change points detected: {changepoints.sum()}")
    if detected_indices:
        print(f"  Change point dates: {detected_indices}")
    
    # Visualize
    plt.figure(figsize=(14, 6))
    plt.plot(y.index, y.values, linewidth=2, label='Time Series', color='steelblue')
    for cp_idx in true_changepoints:
        plt.axvline(x=dates[cp_idx], color='green', linestyle=':', 
                   linewidth=2, alpha=0.5, label='True CP' if cp_idx == true_changepoints[0] else '')
    if detected_indices:
        for cp_date in detected_indices:
            plt.axvline(x=cp_date, color='red', linestyle='--', 
                       linewidth=2, alpha=0.7, label='Detected CP' if cp_date == detected_indices[0] else '')
    plt.title('PELT Change Point Detection', fontsize=14, fontweight='bold')
    plt.xlabel('Date')
    plt.ylabel('Value')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
else:
    print("PELT detector not available. Install the 'ruptures' package to use it.")

## 3. CUSUM Detector (if available)

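CUSUM (Cumulative Sum) tracks the running sum of deviations from a reference level and signals a change when that sum exceeds a threshold, which makes it sensitive to sustained shifts in the mean.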
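The accumulation step can be illustrated with a small two-sided CUSUM written in plain NumPy. This is a sketch under stated assumptions, not the logic inside `CUSUMDetector`: the function name `cusum_alarms` and the `target` and `drift` parameters are hypothetical, and the threshold here is in data units, which may differ from how TimeSmith interprets `threshold=3.0`.

```python
import numpy as np

def cusum_alarms(signal, target, drift, threshold):
    """Hypothetical helper: indices where either cumulative sum exceeds threshold."""
    x = np.asarray(signal, dtype=float)
    upper = 0.0   # accumulates evidence of an upward shift
    lower = 0.0   # accumulates evidence of a downward shift
    alarms = []
    for i, value in enumerate(x):
        # Subtracting drift keeps small fluctuations from accumulating.
        upper = max(0.0, upper + (value - target) - drift)
        lower = max(0.0, lower + (target - value) - drift)
        if upper > threshold or lower > threshold:
            alarms.append(i)
            # Restart both sums after an alarm.
            upper = 0.0
            lower = 0.0
    return alarms

# Example: a level that steps from 0 to 2 at index 20.
step = np.concatenate([np.zeros(20), np.full(20, 2.0)])
print(cusum_alarms(step, target=0.0, drift=0.5, threshold=3.0))
```

Because `target` is never re-estimated in this sketch, alarms keep firing for as long as the level stays away from it; a practical detector would typically update the reference level after each alarm.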

In [None]:
if HAS_CHANGEPOINT:
    # CUSUM detector
    cusum = CUSUMDetector(threshold=3.0)
    cusum.fit(y)
    cusum_changepoints = cusum.predict(y)
    
    detected_indices = y.index[cusum_changepoints].tolist() if cusum_changepoints.sum() > 0 else []
    
    print("CUSUM detection:")
    print(f"  Change points detected: {cusum_changepoints.sum()}")
    if detected_indices:
        print(f"  Change point dates: {detected_indices}")
    
    # Visualize
    plt.figure(figsize=(14, 6))
    plt.plot(y.index, y.values, linewidth=2, label='Time Series', color='steelblue')
    for cp_idx in true_changepoints:
        plt.axvline(x=dates[cp_idx], color='green', linestyle=':', 
                   linewidth=2, alpha=0.5, label='True CP' if cp_idx == true_changepoints[0] else '')
    if detected_indices:
        for cp_date in detected_indices:
            plt.axvline(x=cp_date, color='red', linestyle='--', 
                       linewidth=2, alpha=0.7, label='Detected CP' if cp_date == detected_indices[0] else '')
    plt.title('CUSUM Change Point Detection', fontsize=14, fontweight='bold')
    plt.xlabel('Date')
    plt.ylabel('Value')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
else:
    print("CUSUM detector not available. Install the 'ruptures' package to use it.")

## Summary

You've learned:
- How to create data with known change points
- How to use the PELT detector for efficient change point detection (if available)
- How to use the CUSUM detector for mean change detection (if available)
- How to visualize detected change points

**Key Points:**
- Change point detection helps identify structural breaks in time series
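- PELT minimizes a penalized segmentation cost efficiently; a larger `penalty` yields fewer change points
- CUSUM accumulates deviations from a reference level, which suits it to detecting sustained mean shifts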
- These methods require the optional 'ruptures' package