# Advanced Pipelines and Composition

TimeSmith's composition system lets you chain transformers, featurizers, and forecasters into a single object. This notebook shows how to build, persist, and tune reusable pipelines.

## Key Features

- **Flexible Composition**: Chain any transformers and forecasters
- **Type Safety**: Automatic scitype conversion with adapters
- **Feature Unions**: Combine multiple featurizers
- **Pipeline Persistence**: Save entire pipelines
- **Reusable Components**: Build once, use everywhere


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from timesmith import (
    Pipeline,
    ForecasterPipeline,
    FeatureUnion,
    make_pipeline,
    make_forecaster_pipeline,
    save_model,
    load_model,
)
from timesmith.core import (
    LagFeaturizer,
    RollingFeaturizer,
    TimeFeaturizer,
    MissingValueFiller,
    OutlierRemover,
)
from timesmith.examples import LogTransformer, NaiveForecaster
from timesmith import SimpleMovingAverageForecaster

np.random.seed(42)
print("Pipeline composition tools loaded!")


## 1. Simple Pipeline: Preprocessing + Forecasting

Chain transformers ahead of a forecaster so that preprocessing and forecasting are fitted and applied as one unit.


In [None]:
# Create data with some issues
dates = pd.date_range("2020-01-01", periods=100, freq="D")
y = pd.Series(
    np.abs(np.random.randn(100).cumsum()) + 10 + np.random.normal(0, 2, 100),
    index=dates
)

# Add some outliers
y.iloc[20] = y.iloc[20] * 3
y.iloc[50] = y.iloc[50] * 2.5

# Build pipeline: remove outliers → log transform → forecast
outlier_remover = OutlierRemover(method="iqr")
log_transformer = LogTransformer(offset=1.0)
forecaster = SimpleMovingAverageForecaster(window=7)

pipeline = make_forecaster_pipeline(
    outlier_remover,
    log_transformer,
    forecaster=forecaster
)

print("Pipeline created:")
print(f"  Steps: {len(pipeline.steps)}")
print(f"  Forecaster: {type(pipeline.forecaster).__name__}")

# Fit and predict
pipeline.fit(y)
forecast = pipeline.predict(fh=14)

print("\nPipeline executed successfully!")
print(f"  Forecast shape: {forecast.y_pred.shape}")
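
To make the chaining idea concrete, here is a minimal, library-free sketch of what a forecaster pipeline does conceptually: transformers run in order before the forecaster is fitted, and their inverses run in reverse order on the prediction. The classes `LogTransform`, `MeanOfLastK`, and `ToyForecasterPipeline` are hypothetical stand-ins, not TimeSmith's implementation, and whether TimeSmith inverts transforms at predict time is an assumption of this sketch.

```python
import math


class LogTransform:
    """Toy log transform with an offset; stands in for a real transformer."""

    def __init__(self, offset=1.0):
        self.offset = offset

    def fit(self, y):
        return self  # stateless: nothing to learn

    def transform(self, y):
        return [math.log(v + self.offset) for v in y]

    def inverse_transform(self, y):
        return [math.exp(v) - self.offset for v in y]


class MeanOfLastK:
    """Toy forecaster: repeats the mean of the last `window` observations."""

    def __init__(self, window=7):
        self.window = window

    def fit(self, y):
        tail = y[-self.window:]
        self.level_ = sum(tail) / len(tail)
        return self

    def predict(self, fh):
        return [self.level_] * fh


class ToyForecasterPipeline:
    """Apply transforms in order before fitting; undo them in reverse after predicting."""

    def __init__(self, transformers, forecaster):
        self.transformers = transformers
        self.forecaster = forecaster

    def fit(self, y):
        for t in self.transformers:
            y = t.fit(y).transform(y)
        self.forecaster.fit(y)
        return self

    def predict(self, fh):
        y_pred = self.forecaster.predict(fh)
        for t in reversed(self.transformers):
            y_pred = t.inverse_transform(y_pred)
        return y_pred


toy = ToyForecasterPipeline([LogTransform(offset=1.0)], MeanOfLastK(window=3))
toy_forecast = toy.fit([9.0, 9.0, 9.0, 9.0]).predict(fh=2)
print(toy_forecast)  # forecasts come back on the original scale
```

Keeping the transformers and the forecaster in one object means the same preprocessing is guaranteed to be applied at fit time and at predict time.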


## 2. Feature Union: Combine Multiple Featurizers

`FeatureUnion` applies several featurizers to the same series and combines their outputs into a single feature table.


In [None]:
# Create multiple featurizers
lag_featurizer = LagFeaturizer(lags=[1, 2, 3, 7])
rolling_featurizer = RollingFeaturizer(window=7, functions=["mean", "std"])
# The series is daily, so an "hour" feature would be constant; use calendar features only
time_featurizer = TimeFeaturizer(features=["day_of_week", "month"])

# Combine them with FeatureUnion
feature_union = FeatureUnion([
    ("lags", lag_featurizer),
    ("rolling", rolling_featurizer),
    ("time", time_featurizer),
])

# Fit and transform
feature_union.fit(y)
features = feature_union.transform(y)

print("Feature Union Results:")
print(f"  Original shape: {y.shape}")
print(f"  Feature shape: {features.shape}")
print(f"  Feature columns: {list(features.columns)[:10]}...")  # Show first 10
print(f"\nTotal features created: {features.shape[1]}")

# Show sample features
print("\nSample features:")
print(features.head())
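
The idea behind a feature union can be sketched without any library: each featurizer sees the same input series, and the resulting columns are placed side by side. The helper names below (`lag_features`, `rolling_mean_features`, `toy_feature_union`) and the `name__column` prefixing scheme are assumptions made for illustration; TimeSmith's actual column names and window alignment may differ.

```python
def lag_features(y, lags):
    """Return {column_name: values}; positions without enough history are None."""
    return {
        f"lag_{k}": [y[i - k] if i >= k else None for i in range(len(y))]
        for k in lags
    }


def rolling_mean_features(y, window):
    """Trailing-window mean including the current point; None until the window is full."""
    out = []
    for i in range(len(y)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = y[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return {f"mean_{window}": out}


def toy_feature_union(y, named_featurizers):
    """Run each featurizer on the same series and merge the columns side by side."""
    columns = {}
    for name, featurize in named_featurizers:
        for col, values in featurize(y).items():
            columns[f"{name}__{col}"] = values  # prefix avoids name clashes
    return columns


series = [1.0, 2.0, 3.0, 4.0, 5.0]
table = toy_feature_union(
    series,
    [
        ("lags", lambda y: lag_features(y, lags=[1, 2])),
        ("rolling", lambda y: rolling_mean_features(y, window=3)),
    ],
)
for col, values in table.items():
    print(col, values)
```

Every output column has the same length as the input, which is what allows the columns to be aligned row by row. The leading `None` entries mark rows where a lag or window has no history yet.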


## 3. Complex Pipeline: Multiple Steps

Compose a named preprocessing pipeline, then attach a forecaster to it.


In [None]:
# Build a complex preprocessing pipeline
preprocessing_pipeline = make_pipeline(
    ("fill_missing", MissingValueFiller(method="forward")),
    ("remove_outliers", OutlierRemover(method="iqr")),
    ("log_transform", LogTransformer(offset=1.0)),
)

# Then add forecasting
complex_pipeline = make_forecaster_pipeline(
    preprocessing_pipeline,
    forecaster=SimpleMovingAverageForecaster(window=7)
)

print("Complex Pipeline Structure:")
print(f"  Preprocessing steps: {len(complex_pipeline.steps)}")
for i, (name, step) in enumerate(complex_pipeline.steps, 1):
    print(f"    {i}. {name}: {type(step).__name__}")
print(f"  Forecaster: {type(complex_pipeline.forecaster).__name__}")

# Use the pipeline
complex_pipeline.fit(y)
forecast = complex_pipeline.predict(fh=14)

print("\nComplex pipeline executed!")
print(f"  Forecast points: {len(forecast.y_pred)}")
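
The order of the preprocessing steps matters: gaps are filled first because quantile-based outlier detection needs complete data. The sketch below illustrates forward filling followed by Tukey's IQR fences using only the standard library. It is an assumption about what `method="forward"` and `method="iqr"` mean, not TimeSmith's code, and the replacement policy for flagged points (carry the last good value forward) is just one possible choice.

```python
import statistics


def forward_fill(y):
    """Replace each None with the most recent non-missing value."""
    filled, last = [], None
    for v in y:
        if v is not None:
            last = v
        filled.append(last)  # stays None if the series starts with a gap
    return filled


def iqr_outlier_mask(y, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(y, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v < low or v > high for v in y]


raw = [10.0, 11.0, None, 12.0, 13.0, 100.0, 12.0, 11.0]
filled = forward_fill(raw)  # gaps must be filled before computing quantiles
mask = iqr_outlier_mask(filled)

# One possible policy: blank out flagged points, then carry the last good value forward
cleaned = forward_fill([None if flagged else v for v, flagged in zip(filled, mask)])
print(cleaned)
```

Replacing rather than dropping flagged points keeps the series evenly spaced, which most forecasters expect.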


## 4. Save and Load Entire Pipelines

Pipelines can be serialized just like individual models.


In [None]:
import tempfile
from pathlib import Path

# Save the pipeline
with tempfile.TemporaryDirectory() as tmpdir:
    pipeline_path = Path(tmpdir) / "complex_pipeline.pkl"
    save_model(complex_pipeline, pipeline_path, include_metadata=True)
    
    print(f"Pipeline saved to: {pipeline_path}")
    
    # Load it back
    loaded_pipeline = load_model(pipeline_path)
    
    print("Pipeline loaded successfully!")
    print(f"  Type: {type(loaded_pipeline).__name__}")
    print(f"  Steps: {len(loaded_pipeline.steps)}")
    print(f"  Is fitted: {loaded_pipeline.is_fitted}")
    
    # Use the loaded pipeline
    new_forecast = loaded_pipeline.predict(fh=14)
    print(f"\nForecast from loaded pipeline: {len(new_forecast.y_pred)} points")
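
The `.pkl` extension suggests pickle-based serialization, although the internals of `save_model` and `load_model` are not shown in this notebook. As a rough sketch of the round trip under that assumption, the example below pickles a plain dictionary standing in for a fitted pipeline's state; `fitted_state` and its keys are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for the state a fitted pipeline would carry
fitted_state = {
    "steps": ["fill_missing", "remove_outliers", "log_transform"],
    "forecaster": {"window": 7, "level_": 2.0},
}

with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / "toy_pipeline.pkl"

    # Write the object to disk as bytes
    with path.open("wb") as f:
        pickle.dump(fitted_state, f)

    # Read it back; only unpickle files from sources you trust
    with path.open("rb") as f:
        restored = pickle.load(f)

print(restored == fitted_state)
```

Because unpickling can execute arbitrary code, pickled pipelines should only be loaded from trusted locations.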


## 5. Pipeline Parameters: Get and Set

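Inspect parameters with `get_params` and change them with `set_params`. Nested parameters are addressed as `component__parameter`, for example `forecaster__window`.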


In [None]:
# Get all parameters
params = complex_pipeline.get_params(deep=True)
print("Pipeline Parameters:")
print("=" * 60)
for key, value in list(params.items())[:10]:  # Show first 10
    if not isinstance(value, (dict, list)) or len(str(value)) < 50:
        print(f"  {key}: {value}")

# Modify parameters using set_params
print("\n" + "=" * 60)
print("Modifying forecaster window...")
complex_pipeline.set_params(forecaster__window=14)
print(f"  New window: {complex_pipeline.forecaster.window}")

# Refit with new parameters
complex_pipeline.fit(y)
new_forecast = complex_pipeline.predict(fh=14)
print(f"  New forecast generated with window={complex_pipeline.forecaster.window}")
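
The `forecaster__window` key follows the double-underscore convention popularized by scikit-learn: the part before `__` names a component, and the part after names a parameter on it. The sketch below shows one way such routing can work. `ToyForecaster` and `ToyPipeline` are hypothetical, and TimeSmith's own `get_params` and `set_params` may behave differently.

```python
class ToyForecaster:
    """Hypothetical forecaster with a single hyperparameter."""

    def __init__(self, window=7):
        self.window = window


class ToyPipeline:
    """Hypothetical pipeline exposing nested parameters as component__parameter."""

    def __init__(self, forecaster):
        self.forecaster = forecaster

    def get_params(self, deep=True):
        params = {"forecaster": self.forecaster}
        if deep:
            # Expose the component's own parameters under a prefixed key
            for key, value in vars(self.forecaster).items():
                params[f"forecaster__{key}"] = value
        return params

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            # With a "__" separator, route to the named component; otherwise set on self
            target = getattr(self, name, None) if sep else self
            attr = sub_key if sep else name
            if not hasattr(target, attr):
                raise ValueError(f"Unknown parameter: {key}")
            setattr(target, attr, value)
        return self


toy_pipeline = ToyPipeline(ToyForecaster(window=7))
toy_pipeline.set_params(forecaster__window=14)
print(toy_pipeline.get_params()["forecaster__window"])
```

Changing a hyperparameter this way does not update anything learned during `fit`, which is why the cell above refits the pipeline before forecasting again.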


## Summary

This notebook covered the main parts of TimeSmith's composition system:
- **Simple Pipelines**: Chain transformers and forecasters
- **Feature Unions**: Combine multiple featurizers
- **Complex Pipelines**: Build sophisticated workflows
- **Pipeline Persistence**: Save and load entire pipelines
- **Parameter Management**: Get and set parameters at any level

**Key Benefits:**
- Reusable components
- Type-safe composition
- Easy experimentation
- Production-ready workflows

Together, these pieces let you move from a one-off experiment to a reusable, persisted forecasting workflow.

