## Components and Residuals

* series = components + residuals<br>
&nbsp;(components: trend, seasons, cycles; residuals: error)
1) First learn the trend and subtract it from the series,
2) then learn the seasonality from the detrended residuals and subtract the seasons out,
3) then learn the cycles from the deseasonalized residuals and subtract the cycles out,
4) until only the unpredictable error remains.
5) Add together all the components we learned to get the complete model. (This is essentially what linear regression does when trained on a complete set of features for trend, seasons, and cycles.)



```python
# 1. Train and predict with first model
model_1.fit(X_train_1, y_train)
y_pred_1 = model_1.predict(X_train_1)

# 2. Train and predict with second model on residuals
model_2.fit(X_train_2, y_train - y_pred_1)
y_pred_2 = model_2.predict(X_train_2)

# 3. Add to get overall predictions
y_pred = y_pred_1 + y_pred_2
```
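* A minimal runnable sketch of the three steps above on a made-up monthly series. The synthetic data, the choice of `DecisionTreeRegressor` as the second model, and the names `time`, `month`, `mse_trend_only`, and `mse_hybrid` are assumptions for illustration, not part of the lesson.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Synthetic series (illustrative only): linear trend plus a repeating yearly pattern
n_months = 48
time = np.arange(n_months)
month = time % 12
y_train = 10.0 + 0.5 * time + 3.0 * np.sin(2 * np.pi * month / 12)

X_train_1 = time.reshape(-1, 1)   # feature for the trend model
X_train_2 = month.reshape(-1, 1)  # feature for the residual model

model_1 = LinearRegression()
model_2 = DecisionTreeRegressor(random_state=0)

# 1. Train and predict with the first model
model_1.fit(X_train_1, y_train)
y_pred_1 = model_1.predict(X_train_1)

# 2. Train and predict with the second model on the residuals
model_2.fit(X_train_2, y_train - y_pred_1)
y_pred_2 = model_2.predict(X_train_2)

# 3. Add to get the overall predictions
y_pred = y_pred_1 + y_pred_2

# Compare in-sample error of the trend alone against the hybrid
mse_trend_only = np.mean((y_train - y_pred_1) ** 2)
mse_hybrid = np.mean((y_train - y_pred) ** 2)
print(mse_trend_only, mse_hybrid)
```

Each model gets its own feature set here: the time index for the trend, the month number for the seasonal residuals.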

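* Feature-transforming algorithms (linear regression, neural nets): learn a mathematical function that takes the features as input and combines and transforms them to produce an output matching the target values. Given appropriate features, they can extrapolate beyond the training set.
* Target-transforming algorithms (decision trees, nearest neighbors): use the features to group the target values in the training set and predict by averaging the values in a group. Their predictions are always bounded by the range of the training targets.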

* This difference is what motivates the hybrid design in this lesson: use linear regression to extrapolate the trend, transform the target to remove the trend, and apply XGBoost to the detrended residuals.
* To hybridize a neural net (a feature transformer), you could instead include the predictions of another model as a feature, which the neural net would then use in making its own predictions. Fitting to residuals is the same method the gradient boosting algorithm uses, so we call these boosted hybrids; using predictions as features is known as "stacking", so we call these stacked hybrids.
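* A small sketch of why the trend goes to the linear model: on a steadily rising series, a decision tree cannot forecast above the largest value it saw in training, while linear regression continues the line. The data and the names `time_train`, `time_future`, `linear_forecast`, and `tree_forecast` are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Noiseless upward trend over 50 time steps (illustrative data)
time_train = np.arange(50).reshape(-1, 1)
time_future = np.arange(50, 60).reshape(-1, 1)
y_train = 2.0 * time_train.ravel() + 1.0

linear = LinearRegression().fit(time_train, y_train)
tree = DecisionTreeRegressor(random_state=0).fit(time_train, y_train)

linear_forecast = linear.predict(time_future)  # keeps following the trend
tree_forecast = tree.predict(time_future)      # stays within the training range

print("largest training value:", y_train.max())
print("linear forecast:", linear_forecast)
print("tree forecast:  ", tree_forecast)
```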

* To predict multiple series at once with XGBoost, convert the series from wide format (one time series per column) to long format (one row per date and series, with the series identified by a category in the row index).
```python
    # The `stack` method converts column labels to row labels, pivoting from wide format to long
    X = retail.stack()  # pivot dataset wide to long
    display(X.head())
    y = X.pop('Sales')  # grab target series
```
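* A toy illustration of the wide-to-long pivot, assuming `retail` has two-level column labels (`Sales` on top, industry names below), as the `pop('Sales')` call suggests. The frame `retail_toy`, its industry names, and its numbers are made up for illustration.

```python
import pandas as pd

# Wide format: one column per industry under a top-level 'Sales' label
dates = pd.date_range("2019-01-01", periods=3, freq="MS")
columns = pd.MultiIndex.from_product(
    [["Sales"], ["BuildingMaterials", "FoodAndBeverage"]],
    names=[None, "Industries"],
)
retail_toy = pd.DataFrame(
    [[10.0, 20.0], [11.0, 21.0], [12.0, 22.0]],
    index=dates,
    columns=columns,
)
retail_toy.index.name = "Month"

# stack() moves the innermost column level ('Industries') into the row index
X_long = retail_toy.stack()
y_long = X_long.pop("Sales")  # long-format target; X_long now has no columns left

print(y_long)
```

After the pivot, each row is identified by a (`Month`, `Industries`) pair, which is why the industry label can later be turned into an ordinary feature column.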

<img src="05_before.png" alt="title" width="300"/> $\Rightarrow$ <img src="05_after.png" alt="title" width="300"/>

* Turn the row labels for 'Industries' into a categorical feature with a label encoding. Also create a feature for annual seasonality by pulling the month numbers out of the time index.
```python
    # Turn row labels into categorical feature columns with a label encoding
    X = X.reset_index('Industries')
    # Label encoding for 'Industries' feature
    for colname in X.select_dtypes(["object", "category"]):
        X[colname], _ = X[colname].factorize()

    # Label encoding for annual seasonality
    X["Month"] = X.index.month  # values are 1, 2, ..., 12

    # Create splits
    X_train, X_test = X.loc[idx_train, :], X.loc[idx_test, :]
    y_train, y_test = y.loc[idx_train], y.loc[idx_test]
```
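* A toy sketch of the two encodings, using a hypothetical long-format frame `X_toy` (not the lesson's data). It factorizes the `Industries` column directly rather than looping over dtypes, which is a simplification for illustration.

```python
import pandas as pd

# Long-format index: one row per (date, industry) pair (made-up values)
dates = pd.date_range("2019-01-01", periods=3, freq="MS")
index = pd.MultiIndex.from_product(
    [dates, ["BuildingMaterials", "FoodAndBeverage"]],
    names=["Date", "Industries"],
)
X_toy = pd.DataFrame(index=index)

# Move the 'Industries' row labels into a regular column
X_toy = X_toy.reset_index("Industries")

# Label encoding: each distinct industry gets an integer code
X_toy["Industries"], industry_names = X_toy["Industries"].factorize()

# Annual seasonality feature: month number taken from the date index
X_toy["Month"] = X_toy.index.month

print(X_toy)
print(list(industry_names))
```

`factorize` also returns the original labels (`industry_names` here), so the integer codes can be mapped back to industry names later.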

* Convert the trend predictions made earlier to long format and subtract them from the original series. This gives the detrended (residual) series that XGBoost can learn.
```python
    from xgboost import XGBRegressor

    # Pivot wide to long (stack) and convert DataFrame to Series (squeeze)
    y_fit = y_fit.stack().squeeze()    # trend from training set
    y_pred = y_pred.stack().squeeze()  # trend from test set

    # Create residuals (the collection of detrended series) from the training set
    y_resid = y_train - y_fit

    # Train XGBoost on the residuals
    xgb = XGBRegressor()
    xgb.fit(X_train, y_resid)

    # Add the predicted residuals onto the predicted trends
    y_fit_boosted = xgb.predict(X_train) + y_fit
    y_pred_boosted = xgb.predict(X_test) + y_pred
```
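* A toy sketch of the `stack().squeeze()` step and the subtraction that follows it, assuming the wide frames carry two-level column labels as above. The frames `y_wide` and `trend_wide` and their values are made up. The point is that the subtraction aligns on the (date, industry) row labels, so both series must be in the same long format.

```python
import pandas as pd

dates = pd.date_range("2019-01-01", periods=2, freq="MS")
columns = pd.MultiIndex.from_product(
    [["Sales"], ["A", "B"]], names=[None, "Industries"]
)

# Observed values and a pretend trend fit, both in wide format (illustrative)
y_wide = pd.DataFrame([[10.0, 20.0], [14.0, 26.0]], index=dates, columns=columns)
trend_wide = pd.DataFrame([[11.0, 21.0], [13.0, 25.0]], index=dates, columns=columns)

# stack() gives a one-column DataFrame; squeeze() turns it into a Series
y_long = y_wide.stack().squeeze()
trend_long = trend_wide.stack().squeeze()

# Detrended residuals, matched row by row on the (date, industry) index
resid = y_long - trend_long
print(resid)
```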

## Exercise Notes

* Create a boosted hybrid
    ```python
    class BoostedHybrid:
        def __init__(self, model_1, model_2):
            self.model_1 = model_1
            self.model_2 = model_2
            self.y_columns = None  # store column names from fit method
    ```

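* Add a method to an existing class by assigning a function to a class attribute<br>
```python
    # General pattern: the function's first parameter must be `self`
    classname.method = method
    # e.g. BoostedHybrid.predict = predict
```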
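* A hedged sketch of what `fit` and `predict` could look like when attached this way. The method bodies are not given in these notes, so everything below is an assumption: `y` is taken to be a wide DataFrame with one column per series, `X_2` is taken to be long format with rows in the same order as `y.stack()`, and `DecisionTreeRegressor` stands in for `XGBRegressor` to keep the example dependency-light. The toy data is made up.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor


class BoostedHybrid:
    def __init__(self, model_1, model_2):
        self.model_1 = model_1
        self.model_2 = model_2
        self.y_columns = None  # store column names from fit method


def fit(self, X_1, X_2, y):
    # First model learns the wide-format targets directly
    self.model_1.fit(X_1, y)
    y_fit = pd.DataFrame(self.model_1.predict(X_1), index=X_1.index, columns=y.columns)
    # Second model learns the residuals in long format
    y_resid = (y - y_fit).stack()
    self.model_2.fit(X_2, y_resid)
    self.y_columns = y.columns


def predict(self, X_1, X_2):
    y_pred = pd.DataFrame(
        self.model_1.predict(X_1), index=X_1.index, columns=self.y_columns
    ).stack()
    # Add the predicted residuals, then pivot back to wide format
    y_pred = y_pred + self.model_2.predict(X_2)
    return y_pred.unstack()


# Attach the functions as methods
BoostedHybrid.fit = fit
BoostedHybrid.predict = predict

# Toy data (illustrative): two series with a trend and a yearly pattern
dates = pd.date_range("2018-01-01", periods=24, freq="MS")
time = np.arange(24)
season = 5.0 * np.sin(2 * np.pi * dates.month.to_numpy() / 12)
y = pd.DataFrame({"A": 100 + 2.0 * time + season, "B": 50 + 1.0 * time - season}, index=dates)

X_1 = pd.DataFrame({"time": time}, index=dates)
long_index = y.stack().index  # same row order as the stacked residuals
X_2 = pd.DataFrame(
    {
        "series": pd.factorize(long_index.get_level_values(1))[0],
        "month": long_index.get_level_values(0).month.to_numpy(),
    },
    index=long_index,
)

model = BoostedHybrid(LinearRegression(), DecisionTreeRegressor(random_state=0))
model.fit(X_1, X_2, y)
y_hat = model.predict(X_1, X_2)

trend_only = pd.DataFrame(model.model_1.predict(X_1), index=y.index, columns=y.columns)
mse_trend_only = ((y - trend_only) ** 2).to_numpy().mean()
mse_hybrid = ((y - y_hat) ** 2).to_numpy().mean()
print(mse_trend_only, mse_hybrid)
```

Storing `y_columns` during `fit` is what lets `predict` rebuild a labeled wide frame from the first model's raw array output.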

* Train boosted hybrid
    ```python
    from sklearn.linear_model import LinearRegression
    from xgboost import XGBRegressor

    # Create LinearRegression + XGBRegressor hybrid with BoostedHybrid
    model = BoostedHybrid(
        model_1=LinearRegression(),
        model_2=XGBRegressor(),
    )

    # Fit and predict
    model.fit(X_1, X_2, y)
    y_pred = model.predict(X_1, X_2)

    # Sales cannot be negative, so clip predictions at zero
    y_pred = y_pred.clip(0.0)
    ```