# How to Create a Custom Forecaster

This guide shows you how to create your own custom forecasting estimator in sktime.

## Overview

Creating a custom forecaster involves:
1. Inheriting from the appropriate base class
2. Implementing required methods
3. Setting appropriate tags
4. Testing your implementation

## 1. Simple Custom Forecaster

Let's create a simple moving average forecaster.

In [None]:
import numpy as np
import pandas as pd
from sktime.forecasting.base import BaseForecaster
from sktime.utils.validation.forecasting import check_fh

class MovingAverageForecaster(BaseForecaster):
    """Simple moving average forecaster.
    
    Parameters
    ----------
    window_length : int, default=5
        Length of the moving average window.
    """
    
    # Define class tags
    _tags = {
        "scitype:y": "univariate",  # univariate forecaster
        "ignores-exogeneous-X": True,  # doesn't use exogenous variables
        "handles-missing-data": False,  # doesn't handle missing data
        "y_inner_mtype": "pd.Series",  # expects pandas Series
        "requires-fh-in-fit": False,  # doesn't need fh in fit
        "X-y-must-have-same-index": True,
        "enforce_index_type": None,
    }
    
    def __init__(self, window_length=5):
        self.window_length = window_length
        super().__init__()
    
    def _fit(self, y, X=None, fh=None):
        """Fit forecaster to training data.
        
        Parameters
        ----------
        y : pd.Series
            Target time series to which to fit the forecaster.
        X : pd.DataFrame, optional (default=None)
            Exogenous variables.
        fh : ForecastingHorizon, optional (default=None)
            Forecasting horizon.
            
        Returns
        -------
        self : MovingAverageForecaster
            Reference to self.
        """
        # Store the last `window_length` values for prediction
        self._last_window = y.iloc[-self.window_length:]
        return self
    
    def _predict(self, fh, X=None):
        """Forecast time series at future horizon.
        
        Parameters
        ----------
        fh : ForecastingHorizon
            Forecasting horizon.
        X : pd.DataFrame, optional (default=None)
            Exogenous variables.
            
        Returns
        -------
        y_pred : pd.Series
            Forecasted values.
        """
        # Calculate moving average
        mean_value = self._last_window.mean()
        
        # Convert the horizon to an absolute pandas index
        fh = check_fh(fh)
        pred_index = fh.to_absolute(self.cutoff).to_pandas()
        
        # Return constant forecast (moving average)
        y_pred = pd.Series([mean_value] * len(pred_index), index=pred_index)
        y_pred.name = self._y.name if hasattr(self._y, 'name') else None
        
        return y_pred

print("Custom MovingAverageForecaster created!")
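
The arithmetic behind `_fit` and `_predict` above can be checked by hand. The following is a minimal sketch in plain Python, independent of sktime; the function name `moving_average_forecast` and the sample values are made up for illustration.

```python
from statistics import fmean


def moving_average_forecast(history, window_length, horizon):
    """Repeat the mean of the last `window_length` values `horizon` times."""
    if window_length < 1:
        raise ValueError("window_length must be at least 1")
    # Slicing keeps whatever is available if the history is shorter than the window
    last_window = history[-window_length:]
    level = fmean(last_window)
    return [level] * horizon


history = [10.0, 12.0, 11.0, 13.0, 14.0, 15.0]
forecast = moving_average_forecast(history, window_length=3, horizon=4)
print(forecast)  # [14.0, 14.0, 14.0, 14.0]
```

The forecast is flat because the mean of the last window is the only state the model keeps.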

## 2. Test the Custom Forecaster

In [None]:
from sktime.datasets import load_airline
from sktime.utils.plotting import plot_series
import matplotlib.pyplot as plt

# Load data
y = load_airline()
y_train = y.iloc[:-12]
y_test = y.iloc[-12:]

# Test our custom forecaster
forecaster = MovingAverageForecaster(window_length=10)
forecaster.fit(y_train)
y_pred = forecaster.predict(fh=range(1, 13))

print(f"Forecast generated: {y_pred.shape}")
print(f"Forecast values: {y_pred.head()}")

# Plot results
plot_series(y_train.iloc[-24:], y_test, y_pred, 
           labels=["Training", "Actual", "Custom MA Forecast"])
plt.title("Custom Moving Average Forecaster")
plt.legend()
plt.show()

# Evaluate performance
from sktime.performance_metrics.forecasting import mean_absolute_percentage_error
mape = mean_absolute_percentage_error(y_test, y_pred)
print(f"MAPE: {mape:.2%}")
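
For reference, the metric used above can be reproduced in a few lines. This is a sketch of the non-symmetric MAPE definition (mean of absolute errors divided by absolute actual values), written in plain Python with made-up numbers; it is not sktime's implementation, and the helper name `mape` is hypothetical.

```python
def mape(y_true, y_pred):
    """Mean of |actual - forecast| / |actual|, as a fraction (0.1 means 10%)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    ratios = [abs(a - f) / abs(a) for a, f in zip(y_true, y_pred)]
    return sum(ratios) / len(ratios)


actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
score = mape(actual, predicted)
print(f"{score:.2%}")  # 6.67%
```

The result is a fraction, which is why the cell above formats it with `:.2%`.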

## 3. Advanced Custom Forecaster with Parameters

Let's create a more sophisticated forecaster with tunable parameters.
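
With the trend enabled, the update rules coded in `_fit` and `_predict` below correspond to the following recursions, commonly known as Holt's linear trend method. The symbols $\ell_t$ (level) and $b_t$ (trend) are notation introduced here; $T$ is the last training time point and $h$ the number of steps ahead.

```latex
\begin{aligned}
\ell_1 &= y_1, \qquad b_1 = y_2 - y_1 \\
\ell_t &= \alpha\, y_t + (1 - \alpha)\,(\ell_{t-1} + b_{t-1}), \qquad t = 2, \dots, T \\
b_t &= \beta\,(\ell_t - \ell_{t-1}) + (1 - \beta)\, b_{t-1} \\
\hat{y}_{T+h} &= \ell_T + h\, b_T
\end{aligned}
```

With the trend disabled, the forecast is flat at $\ell_T$.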

In [None]:
class ExponentialMovingAverageForecaster(BaseForecaster):
    """Exponential moving average forecaster with trend.
    
    Parameters
    ----------
    alpha : float, default=0.3
        Smoothing parameter for level (0 < alpha <= 1).
    beta : float, default=0.1
        Smoothing parameter for trend (0 <= beta <= 1).
    include_trend : bool, default=True
        Whether to include trend component.
    """
    
    _tags = {
        "scitype:y": "univariate",
        "ignores-exogeneous-X": True,
        "handles-missing-data": False,
        "y_inner_mtype": "pd.Series",
        "requires-fh-in-fit": False,
        "X-y-must-have-same-index": True,
        "enforce_index_type": None,
    }
    
    def __init__(self, alpha=0.3, beta=0.1, include_trend=True):
        self.alpha = alpha
        self.beta = beta
        self.include_trend = include_trend
        super().__init__()
        
        # Parameter validation
        if not (0 < alpha <= 1):
            raise ValueError(f"alpha must satisfy 0 < alpha <= 1, got {alpha}")
        if not (0 <= beta <= 1):
            raise ValueError(f"beta must satisfy 0 <= beta <= 1, got {beta}")
    
    def _fit(self, y, X=None, fh=None):
        """Fit exponential smoothing model."""
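        # Initialize level and trend; the trend stays at zero when it is disabled,
        # so the level update below reduces to simple exponential smoothing
        self._level = y.iloc[0]
        use_trend = self.include_trend and len(y) > 1
        self._trend = (y.iloc[1] - y.iloc[0]) if use_trend else 0.0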
        
        # Apply exponential smoothing
        for i in range(1, len(y)):
            prev_level = self._level
            
            # Update level
            self._level = self.alpha * y.iloc[i] + (1 - self.alpha) * (prev_level + self._trend)
            
            # Update trend if enabled
            if self.include_trend:
                self._trend = self.beta * (self._level - prev_level) + (1 - self.beta) * self._trend
            
        return self
    
    def _predict(self, fh, X=None):
        """Generate forecasts."""
        fh = check_fh(fh)
        pred_index = fh.to_absolute(self.cutoff).to_pandas()
        
        # Generate forecasts from the relative steps ahead (1, 2, ...)
        forecasts = []
        for h in fh.to_relative(self.cutoff).to_pandas():
            if self.include_trend:
                forecast = self._level + h * self._trend
            else:
                forecast = self._level
            forecasts.append(forecast)
        
        y_pred = pd.Series(forecasts, index=pred_index)
        y_pred.name = self._y.name if hasattr(self._y, 'name') else None
        
        return y_pred
    
    # get_params and set_params are inherited from BaseForecaster, which reads the
    # parameters from the __init__ signature, so they need no override here.

print("Advanced ExponentialMovingAverageForecaster created!")
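
To see the level and trend recursion at work without any sktime machinery, here is a small plain-Python sketch of the trend-enabled case. The function name `holt_forecast` and the sample series are illustrative only.

```python
def holt_forecast(y, alpha, beta, horizon):
    """Run the level/trend recursion over y, then extrapolate `horizon` steps."""
    if len(y) < 2:
        raise ValueError("need at least two observations to initialise the trend")
    level = y[0]
    trend = y[1] - y[0]
    for value in y[1:]:
        prev_level = level
        # Blend the new observation with the one-step-ahead projection
        level = alpha * value + (1 - alpha) * (prev_level + trend)
        # Blend the latest change in level with the previous trend
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# A perfectly linear series: the recursion should simply continue the line.
series = [2.0, 4.0, 6.0, 8.0, 10.0]
print(holt_forecast(series, alpha=0.5, beta=0.5, horizon=3))  # [12.0, 14.0, 16.0]
```

On noisy data the level and trend lag behind the observations by an amount controlled by `alpha` and `beta`, which is what the tuning step later searches over.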

## 4. Test Advanced Forecaster with Parameter Tuning

In [None]:
# Test the advanced forecaster
ema_forecaster = ExponentialMovingAverageForecaster(alpha=0.7, beta=0.2)
ema_forecaster.fit(y_train)
y_pred_ema = ema_forecaster.predict(fh=range(1, 13))

# Compare both custom forecasters
ma_forecaster = MovingAverageForecaster(window_length=12)
ma_forecaster.fit(y_train)
y_pred_ma = ma_forecaster.predict(fh=range(1, 13))

# Calculate performance
mape_ma = mean_absolute_percentage_error(y_test, y_pred_ma)
mape_ema = mean_absolute_percentage_error(y_test, y_pred_ema)

print(f"Moving Average MAPE: {mape_ma:.2%}")
print(f"Exponential MA MAPE: {mape_ema:.2%}")

# Plot comparison
plot_series(y_train.iloc[-24:], y_test, y_pred_ma, y_pred_ema,
           labels=["Training", "Actual", "Moving Average", "Exponential MA"])
plt.title("Custom Forecasters Comparison")
plt.legend()
plt.show()

## 5. Using Custom Forecaster with sktime Ecosystem
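
Before running a grid search, it can help to count the candidates it will try. The sketch below enumerates the parameter grid used in this section with `itertools.product`; it only illustrates what an exhaustive grid expands to and does not call sktime.

```python
from itertools import product

param_grid = {
    "alpha": [0.1, 0.3, 0.5, 0.7, 0.9],
    "beta": [0.0, 0.1, 0.2, 0.3],
    "include_trend": [True, False],
}

# Cartesian product of all value lists, one dict per candidate
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]
print(len(candidates))  # 40

# Candidates with the trend switched off differ only in alpha in practice
no_trend = [c for c in candidates if not c["include_trend"]]
print(len(no_trend))  # 20
```

Each candidate is fitted once per cross-validation split. When `include_trend` is `False` the trend update is skipped, so `beta` has no effect and the 20 no-trend candidates reduce to 5 distinct models.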

In [None]:
# Test with cross-validation
from sktime.forecasting.model_evaluation import evaluate
from sktime.split import ExpandingWindowSplitter

# Set up cross-validation
cv = ExpandingWindowSplitter(
    initial_window=60,
    step_length=12,
    fh=[1, 3, 6, 12]
)

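# Evaluate custom forecaster; `scoring` takes a metric object, not a string
from sktime.performance_metrics.forecasting import MeanAbsolutePercentageError

print("Evaluating custom forecaster with cross-validation...")
result = evaluate(
    forecaster=ema_forecaster,
    y=y,
    cv=cv,
    scoring=MeanAbsolutePercentageError()
)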

# evaluate returns a DataFrame with one row per cross-validation split
print("\nCross-validation results:")
print(result.describe())

# Test with hyperparameter tuning
from sktime.forecasting.model_selection import ForecastingGridSearchCV

param_grid = {
    "alpha": [0.1, 0.3, 0.5, 0.7, 0.9],
    "beta": [0.0, 0.1, 0.2, 0.3],
    "include_trend": [True, False]
}

print("\nTesting hyperparameter tuning...")
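# `scoring` takes a metric object, not a string
from sktime.performance_metrics.forecasting import MeanAbsolutePercentageError

tuner = ForecastingGridSearchCV(
    forecaster=ExponentialMovingAverageForecaster(),
    cv=cv,
    param_grid=param_grid,
    scoring=MeanAbsolutePercentageError(),
    n_jobs=1
)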

# Fit on a subset for speed
tuner.fit(y.iloc[:100])

print(f"Best parameters: {tuner.best_params_}")
print(f"Best score: {tuner.best_score_:.4f}")

# Test with pipelines
from sktime.forecasting.compose import TransformedTargetForecaster
from sktime.transformations.series.boxcox import BoxCoxTransformer

print("\nTesting with transformation pipeline...")
pipeline = TransformedTargetForecaster([
    ("boxcox", BoxCoxTransformer()),
    ("forecaster", ExponentialMovingAverageForecaster(alpha=0.5, beta=0.1))
])

pipeline.fit(y_train)
y_pred_pipeline = pipeline.predict(fh=range(1, 13))
mape_pipeline = mean_absolute_percentage_error(y_test, y_pred_pipeline)

print(f"Pipeline MAPE: {mape_pipeline:.2%}")
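
The expanding-window scheme used for cross-validation above can be pictured with a simplified model in plain Python. This sketch assumes positional indices and a 1-based relative horizon; it is not sktime's implementation, which may differ in edge cases, and the function name `expanding_window_splits` is hypothetical. The length 144 matches the airline series.

```python
def expanding_window_splits(n_timepoints, initial_window, step_length, fh):
    """Yield (train_positions, test_positions) with a growing training window."""
    max_step = max(fh)
    train_end = initial_window  # exclusive end of the training window
    while train_end + max_step <= n_timepoints:
        train = list(range(train_end))
        # Each horizon step h points h positions past the last training point
        test = [train_end + step - 1 for step in fh]
        yield train, test
        train_end += step_length


splits = list(
    expanding_window_splits(144, initial_window=60, step_length=12, fh=[1, 3, 6, 12])
)
print(len(splits))  # 7
print(len(splits[0][0]), splits[0][1])  # 60 [60, 62, 65, 71]
print(len(splits[-1][0]), splits[-1][1])  # 132 [132, 134, 137, 143]
```

Every split trains on all data up to its cutoff, so later splits are more expensive to fit than earlier ones.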

## 6. Best Practices for Custom Forecasters

In [None]:
print("Best Practices for Custom Forecasters:")
print("=" * 38)

print("\n1. INHERITANCE:")
print("   ✓ Always inherit from BaseForecaster")
print("   ✓ Call super().__init__() in constructor")
print("   ✓ Implement required methods: _fit, _predict")

print("\n2. TAGS:")
print("   ✓ Set appropriate tags for your forecaster")
print("   ✓ Common tags:")
print("     - scitype:y: 'univariate' or 'multivariate'")
print("     - ignores-exogeneous-X: True/False")
print("     - handles-missing-data: True/False")
print("     - y_inner_mtype: preferred data format")

print("\n3. PARAMETER VALIDATION:")
print("   ✓ Validate parameters in __init__")
print("   ✓ Provide clear error messages")
print("   ✓ Document parameter constraints")

print("\n4. METHOD IMPLEMENTATION:")
print("   ✓ _fit: Store necessary state for prediction")
print("   ✓ _predict: Generate forecasts for given horizon")
print("   ✓ Handle forecast horizon correctly")
print("   ✓ Return properly indexed Series/DataFrame")

print("\n5. TESTING:")
print("   ✓ Test with different data types")
print("   ✓ Verify compatibility with sktime ecosystem")
print("   ✓ Test edge cases (short series, missing data)")
print("   ✓ Check parameter tuning works")

print("\n6. DOCUMENTATION:")
print("   ✓ Provide clear docstrings")
print("   ✓ Document parameters and their constraints")
print("   ✓ Include usage examples")
print("   ✓ Specify input/output formats")

# Example of proper testing
print("\n\nExample Test Function:")
print("=" * 22)

def test_custom_forecaster():
    """Test custom forecaster implementation."""
    from sktime.utils._testing.forecasting import make_forecasting_problem
    
    # Test basic functionality on a synthetic series, holding out the last 10 points
    y = make_forecasting_problem(n_timepoints=50)
    y_train, y_test = y.iloc[:40], y.iloc[40:]
    
    forecaster = ExponentialMovingAverageForecaster()
    forecaster.fit(y_train)
    y_pred = forecaster.predict(fh=[1, 2, 3])
    
    # Basic checks
    assert len(y_pred) == 3, "Prediction length should match forecast horizon"
    assert isinstance(y_pred, pd.Series), "Should return pandas Series"
    assert not y_pred.isna().any(), "Should not contain NaN values"
    
    # Test parameter getting and setting
    params = forecaster.get_params()
    assert params["alpha"] == 0.3, "get_params should report the default alpha"
    forecaster.set_params(alpha=0.5)
    assert forecaster.alpha == 0.5, "Parameter setting should work"
    
    print("✓ All tests passed!")

# Run the test
test_custom_forecaster()

## Summary

You've learned how to create custom forecasters in sktime:

1. **Basic Structure**: Inherit from `BaseForecaster` and implement `_fit` and `_predict`
2. **Tags**: Set appropriate tags to describe your forecaster's capabilities
3. **Parameters**: Implement parameter validation and tuning support
4. **Integration**: Ensure compatibility with sktime's ecosystem (CV, tuning, pipelines)
5. **Testing**: Thoroughly test your implementation

## Key Points

- Always inherit from `BaseForecaster`
- Implement required methods: `_fit`, `_predict`
- Set appropriate tags for your forecaster
- Validate parameters and handle edge cases
- Test compatibility with sktime ecosystem
- Document your implementation clearly

Forecasters built this way work with the sktime tools shown in this guide: cross-validation, hyperparameter tuning, and pipelines.