# Combined Forecast

When a forecast model is created, it produces a hypothesis and fits its parameters to the data available at the time the instance is created.

To combine two forecast models, we use the semantics that a combined instance is created from individual forecast models.

So we define a combined class that contains one or more forecast models (the fitted hypotheses).


Previously we defined the TimeSeriesCluster class and the ForecastModel class, which handle clustering and forecasting respectively.


A GroupForecast instance (implemented below as `ForecastAsGroup`) can share data points across its individual members. This requires the data to be standardized so that it makes sense to combine them.


To combine data points, use MinMaxScaler to normalize the data.
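
As a minimal sketch of this step, assuming scikit-learn's `MinMaxScaler` and two made-up series `y_a` and `y_b` (hypothetical names, not from the notebook), each series can be rescaled to [0, 1] independently so that shapes are comparable regardless of level:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Two illustrative series on very different scales (made-up data).
t = np.arange(100)
y_a = 5.0 * t + 100.0
y_b = 1.0 * t + 50.0

# MinMaxScaler works column-wise, so stack the series as columns: shape (100, 2).
Y = np.column_stack([y_a, y_b])

scaler = MinMaxScaler()              # default feature_range is (0, 1)
Y_scaled = scaler.fit_transform(Y)   # each column now spans [0, 1]

# The transformation is invertible, so scaled results can be mapped back.
Y_restored = scaler.inverse_transform(Y_scaled)
```

Keeping the fitted `scaler` object around is what makes the inverse step possible later.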


Specification:

1. MSE  
2. 80/20 train/test split  
3. MinMax scaler  
4. A more complex data path for simulation, such as a rotated sine wave or a superposition of two basis paths
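
Item 4 could be prototyped along the following lines. This is only a sketch: the function names `rotated_sine` and `superposition`, the amplitudes, periods and rotation angle are all assumptions, and "rotated sine wave" is interpreted here as a geometric rotation of the curve, resampled back onto the shared timeline.

```python
import numpy as np

t = np.arange(100)  # shared timeline, as in the rest of the notebook

def rotated_sine(t, amplitude=10.0, period=25.0, angle=0.2):
    """Sine wave rotated counter-clockwise by `angle` radians, resampled onto t."""
    # Sample the unrotated curve on a longer, denser range so that the
    # rotated x-coordinates still cover the whole timeline.
    s = np.linspace(0.0, 1.2 * t.max() + 1.0, 10 * t.size)
    w = amplitude * np.sin(2.0 * np.pi * s / period)
    x_rot = s * np.cos(angle) - w * np.sin(angle)
    y_rot = s * np.sin(angle) + w * np.cos(angle)
    # np.interp needs increasing x; this holds only for mild rotations.
    if not np.all(np.diff(x_rot) > 0):
        raise ValueError("angle too steep for this amplitude/period")
    return np.interp(t, x_rot, y_rot)

def superposition(t, weights=(1.0, 1.0)):
    """Weighted sum of two basis paths: a linear trend and a sine wave."""
    trend = 0.5 * t
    wave = 10.0 * np.sin(2.0 * np.pi * t / 25.0)
    return weights[0] * trend + weights[1] * wave

# Add IID noise, mirroring the simple generator used later in the notebook.
rng = np.random.default_rng(0)
y_rot = rotated_sine(t) + rng.normal(0, 1, t.size)
y_sup = superposition(t) + rng.normal(0, 1, t.size)
```

Both generators return arrays aligned with `t`, so they can be passed to the same model classes as the linear test series.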


In [18]:
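import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage  # used by TimeSeriesCluster
from scipy.spatial.distance import pdist  # used by TimeSeriesCluster
from fastdtw import fastdtw  # third-party package providing approximate DTW
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score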

# Data access class 
class TimeSeriesDataset:
    """All time series share the same timeline."""
    def __init__(self, lookback='3m', lookahead='7d'):
        if lookback=='3m':
            lookback = 60
        if lookahead=='7d':
            lookahead = 7
        # TODO: make lookback and lookahead argument support more shorthand notations.
        self.lookback = lookback
        self.lookahead = lookahead
        # TODO: need semantic for t.
        t = np.arange(0,100)
        self.t = t # the main timeline.
    @property
    def timeline(self):
        """The main timeline. Used for the clustering. This timeline is shared among all time series."""
        return self.t
    @property
    def series(self):
        """Time-series data as list. For clustering."""
        items = []
        for k, s in self.series_itemset.items():
            items.append(s)
        return items
    @property
    def labels(self):
        """Item labels. For clustering."""
        labels = []
        for k, s in self.series_itemset.items():
            labels.append(k)
        return labels
    @property
    def series_itemset(self):
        """The item set of series."""
        return {}  # must be a mapping of label -> series, since callers use .items()
        
    def get_series(self, series_id):
        """Get data series by ID. All series share same timeline. Use lookback/lookahead to set up a time line."""
        


# Two classes from 2_Forecasting.ipynb are defined in condensed form here: TimeSeriesCluster and ForecastModel
class TimeSeriesCluster:
    """Main class for time-series clustering. See original for reference."""
    def __init__(self, timeline, series, labels):
        """Receive timeline, time-series items, and object labels.
            Ex.
            timeline = np.arange(100)
            labels = ['y1', 'y2', 'y3', 'y4']
            items = [y1, y2, y3, y4]
        """
        # choose distance function
        d_func = self.dtw_dist_ret
        self.items = series # map series to a list of item. Define series as argument as the list of individual series.
        self.labels = labels # object labels
        self.D = pdist(self.items, d_func) # compute distance table
        self.Z = linkage(self.D, method='single')
        self.k = 2
        # clusters # the output labels
        self.clusters = fcluster(self.Z, self.k, criterion='maxclust')
    @classmethod
    def dtw_dist_ret(cls, u, v):
        # use FastDTW (approximation of DTW) to measure dissimilarity. This generally works on price series.
        # If series received is returns series, use cumsum() to reconstruct price series.
        # 
        # If want to explore further transformation, try: compute_daily_returns and my_dtw_dist
        return fastdtw(u,v)[0]

class ForecastModel:
    """This is a template forecast model."""
    def __init__(self, timeline, data):
        """timeline is shared among all models. data is unique to each model.
        - Main accuracy metric is MSE.
        - Validation set is the last 20% of the data.
        - data argument is the training/validation data we can use to fit the model.
        
        - y_prime is the output of the make_hypothesis mockup.
        - y_hat is the hypothesis to be tested in the hold-out data.
        """
        self.t = timeline.astype(int)  # the np.int alias was removed in NumPy 1.24; use the builtin int
        self.y = data
        y_prime, slope, intercept = self.make_hypothesis(data, timeline)
        self.y_hat = y_prime.reshape((self.y.size,)) # make data same dimension
        self.slope = slope[0] # keep just the first coefficient. Note that for multivariate regression this will be vectors
        self.intercept = intercept
        self.errors = self.fitting_errors()
        self.avg_error = np.average(self.errors)
    def make_hypothesis(self, data, t):
        """Returns a linear forecast model."""
        # data is np.array of values of quantities observed in each time step.
        t, v = t[:, np.newaxis], data  # np.newaxis makes t a 2-D column, the shape LinearRegression.fit expects
        # Create linear regression object
        regr = linear_model.LinearRegression()
        regr.fit(t, v) # find slope and intercept
        # Our model H is y = m * x + c
        # - m is the slope which is the coefficient
        # - c is the bias or in this case the first data point (left-most value).
        m = regr.coef_
        c = v[0]
        y_hat = m * t + c  # this is the model for linear forecast.  it will be subject to IID error.
        return y_hat, m, c
    def fitting_errors(self):
        return np.abs(self.y - self.y_hat).sum()
    def forecast(self, y_train, horizon=60):
        """Forecast `horizon` steps past the end of the timeline.

        Stores the result in self.t_forecast and self.y_forecast and returns
        the forecast values.
        """
        steps = horizon # time steps to forecast
        # Use the linear model to make the forecast.
        last_value = y_train[-1]  # currently unused. TODO: consider renaming y to y_train
        c = self.intercept # use the intercept from the fitting stage!
        # Needing y_train as input feels weird, but without it we can get
        # confused about whether self.y has future data or not.
        m = self.slope # slope comes from the hypothesis already fitted.
        # Time index for the forecast period: starts one step after the last
        # observed time and spans `steps` points, so the horizon argument is honored.
        self.t_forecast = np.arange(steps) + self.t[-1] + 1
        
        # y_hat = m * t + c    
        self.y_forecast = m * self.t_forecast + c
        return self.y_forecast
    def __repr__(self):
        avg_err = self.avg_error
        return """LinearModel slope=%s intercept=%s error=%s""" % (self.slope, self.intercept, avg_err)


class ForecastAsGroup:
    """ForecastAsGroup combines individual forecast models and makes forecasts as a group."""
    def __init__(self, models=None):
        models = [] if models is None else list(models)  # avoid a shared mutable default argument
        self.avg_error = 0.0
        for m in models:
            assert isinstance(m, ForecastModel) # each model must be of type ForecastModel
            self.avg_error += m.avg_error # Sum of the members' errors. See individual model.
        self.models = models
    def __repr__(self):
        avg_err = self.avg_error
        return """GroupForecastModel error=%s n=%s""" % (avg_err, len(self.models))

In [19]:
x = np.arange(0,100)
iid_noise = lambda : np.random.normal(0,5,x.size)
m_std = 0.1

def time_series_data_type1(m, c, m_std = 0.1):
    m = m + np.random.normal(0,m_std,100)
    c = c + iid_noise()
    y = m * x + c
    return y
dgf_1 = time_series_data_type1
y1 = dgf_1(m=5, c=100)
y2 = dgf_1(m=5, c=130)
y3 = dgf_1(m=1, c=50)
y4 = dgf_1(m=1, c=75)
fm1 = ForecastModel(x, y1)
fm2 = ForecastModel(x, y2)
fm3 = ForecastModel(x, y3)
fm4 = ForecastModel(x, y4)

gf1 = ForecastAsGroup([fm1, fm2, fm3, fm4])
gf1

GroupForecastModel error=3270.8040955849647 n=4

## Forecast with 80/20 split

We use an 80/20 split for the forecast model. Because a time sequence is linearly ordered, it is common to use forward chaining to set up the partition between training and validation data. This means using the first 80% of the data to train the model and the remaining 20% as the validation set. The validation set is used for optimizing the model's hyperparameters.
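
A minimal sketch of such a chronological split, assuming a helper named `forward_chain_split` (a hypothetical name, not part of the notebook's classes):

```python
import numpy as np

def forward_chain_split(t, y, train_fraction=0.8):
    """Split (t, y) chronologically: first part for training, the rest for validation."""
    n_train = int(len(t) * train_fraction)
    return (t[:n_train], y[:n_train]), (t[n_train:], y[n_train:])

# Made-up linear series on the notebook's 100-step timeline.
t = np.arange(100)
y = 5.0 * t + 100.0
(t_train, y_train), (t_val, y_val) = forward_chain_split(t, y)
```

No shuffling is involved, so every validation point lies strictly after every training point.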

To choose hyperparameters during model tuning, we use the training set and the validation set together to calibrate the model. This means fitting parameters on the training set and observing, on the validation set, whether the model-in-training improves or degrades. In this setting we can exhaustively search the solution space for the optimal parameter combination. (Of course, this naive approach needs a good heuristic guide to be successful.)
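
To illustrate the exhaustive search, the sketch below treats the polynomial degree as the hyperparameter. This is an assumption made purely for illustration: the notebook's `ForecastModel` is linear and has no such setting, and the data here is made up.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(100, dtype=float)
y = 0.05 * t**2 + rng.normal(0, 5, t.size)   # made-up quadratic data

# Forward-chaining 80/20 split.
n_train = int(0.8 * t.size)
t_train, y_train = t[:n_train], y[:n_train]
t_val, y_val = t[n_train:], y[n_train:]

# Exhaustive search: fit on the training set, score on the validation set.
results = {}
for degree in [1, 2, 3]:
    coeffs = np.polyfit(t_train, y_train, degree)
    y_pred = np.polyval(coeffs, t_val)
    results[degree] = np.mean((y_val - y_pred) ** 2)  # validation MSE

best_degree = min(results, key=results.get)
```

With more than one hyperparameter the loop becomes a nested grid, which is where a heuristic to prune the search starts to matter.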

In this section we simply develop a generic class implementation for this step.

Where to place the MSE calculation method?
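
One possible placement, sketched under assumptions, is a method on the model itself that scores any `(t, y)` pair, so the same call serves both training and validation data. The class `LinearForecastWithMSE` is hypothetical, and unlike `ForecastModel` above it uses the fitted intercept from scikit-learn rather than the first data point.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

class LinearForecastWithMSE:
    """Hypothetical variant of ForecastModel that can score itself on given data."""
    def __init__(self, t_train, y_train):
        self.regr = LinearRegression()
        self.regr.fit(t_train[:, np.newaxis], y_train)

    def predict(self, t):
        return self.regr.predict(t[:, np.newaxis])

    def mse(self, t, y):
        """Mean squared error of the fitted line against observations (t, y)."""
        return mean_squared_error(y, self.predict(t))

# Made-up noisy linear data, split 80/20 in time order.
t = np.arange(100)
y = 5.0 * t + 100.0 + np.random.default_rng(0).normal(0, 5, t.size)
model = LinearForecastWithMSE(t[:80], y[:80])
train_mse = model.mse(t[:80], y[:80])
val_mse = model.mse(t[80:], y[80:])
```

Keeping the metric separate from the constructor means the model never needs to know which portion of the data it is being scored on.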


Where to do the 80/20 split?


What is the procedure for standardizing/normalizing the data and inverting the transformation? Describe step by step where this happens and when the transformation back occurs.
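
One plausible ordering of the steps is sketched below; it is an assumption, not the notebook's settled design. The scaler is fitted on the training portion only, the model is fitted and makes its forecast in the scaled space, and the inverse transform is applied just before errors are reported in original units.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler

# Made-up noisy linear data on the notebook's 100-step timeline.
t = np.arange(100)
y = 5.0 * t + 100.0 + np.random.default_rng(0).normal(0, 5, t.size)
n_train = int(0.8 * t.size)

# Step 1: fit the scaler on the training portion only, so no information
# from the validation period leaks into the scaling parameters.
scaler = MinMaxScaler()
y_train_scaled = scaler.fit_transform(y[:n_train].reshape(-1, 1)).ravel()

# Step 2: fit the model in the scaled space.
regr = LinearRegression().fit(t[:n_train, np.newaxis], y_train_scaled)

# Step 3: forecast the validation period, still in the scaled space.
# Values above 1 are expected here because the series trends upward.
y_val_scaled_pred = regr.predict(t[n_train:, np.newaxis])

# Step 4: map the forecast back to original units before computing errors.
y_val_pred = scaler.inverse_transform(y_val_scaled_pred.reshape(-1, 1)).ravel()
mse = np.mean((y[n_train:] - y_val_pred) ** 2)
```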