# Gradient Boosting Regressor

**Gradient Boosting Regressor** is an ensemble machine learning algorithm that builds a model stage by stage, as other boosting methods do, and generalizes them by allowing optimization of an arbitrary differentiable loss function.

## Concept

Gradient Boosting involves three main components:
1. **Loss Function to be optimized:** Gradient Boosting works with any differentiable loss function. For regression tasks, it typically uses squared error or absolute error.
2. **Weak Learner to make predictions:** Gradient Boosting uses decision trees as the weak learners. Trees are added one at a time, and existing trees in the model are not changed.
3. **Additive Model to add weak learners:** Each new tree is fit so that adding it reduces the loss of the current ensemble, which amounts to gradient descent on the loss.
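
The three components above can be summarized in symbols. The notation is introduced here for illustration and is an assumption, not part of the original text: $F_m$ is the ensemble after $m$ stages, $h_m$ is the tree added at stage $m$, $\nu$ is the learning rate, and $L$ is the loss.

$$
F_m(x) = F_{m-1}(x) + \nu \, h_m(x), \qquad
r_{im} = -\left[ \frac{\partial L\big(y_i, F(x_i)\big)}{\partial F(x_i)} \right]_{F = F_{m-1}}
$$

Each tree $h_m$ is fit to the pseudo-residuals $r_{im}$. For the squared-error loss $L(y, F) = \tfrac{1}{2}(y - F)^2$, these reduce to the ordinary residuals $r_{im} = y_i - F_{m-1}(x_i)$.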

## How It Works

The algorithm builds the model in a stage-wise fashion:
1. Start with a simple initial model, e.g., a constant that predicts the mean of the target variable.
2. Apply the current model to the data and calculate the error residuals.
3. Fit a new decision tree to the residuals from the previous step.
4. Add this new decision tree, scaled by the learning rate, to the ensemble to update the model.
5. Repeat steps 2-4 until the specified number of trees has been added or adding a new tree changes the loss only minimally.
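
The steps above can be sketched by hand for the squared-error case. This is a minimal illustration, not scikit-learn's internal implementation: the function names `fit_gradient_boosting` and `predict_gradient_boosting` and the synthetic sine data are assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_gradient_boosting(X, y, n_estimators=50, learning_rate=0.1, max_depth=2):
    """Fit a squared-error boosting ensemble by hand (illustrative helper)."""
    # Step 1: start from a constant prediction, the mean of the target.
    initial_prediction = float(np.mean(y))
    current = np.full(len(y), initial_prediction)
    trees = []
    for _ in range(n_estimators):
        # Step 2: residuals of the current model.
        residuals = y - current
        # Step 3: fit a small tree to those residuals.
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residuals)
        # Step 4: add the tree, shrunk by the learning rate, to the ensemble.
        current = current + learning_rate * tree.predict(X)
        trees.append(tree)
    return initial_prediction, trees


def predict_gradient_boosting(X, initial_prediction, trees, learning_rate=0.1):
    """Sum the constant and the shrunken tree predictions."""
    prediction = np.full(X.shape[0], initial_prediction)
    for tree in trees:
        prediction = prediction + learning_rate * tree.predict(X)
    return prediction


# Synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 6, size=(200, 1))
y_demo = np.sin(X_demo[:, 0]) + rng.normal(0, 0.1, size=200)

init, trees = fit_gradient_boosting(X_demo, y_demo)
pred = predict_gradient_boosting(X_demo, init, trees)
baseline_mse = np.mean((y_demo - init) ** 2)
boosted_mse = np.mean((y_demo - pred) ** 2)
print(f"Training MSE with the mean only: {baseline_mse:.4f}, after boosting: {boosted_mse:.4f}")
```

The same `learning_rate` must be used for fitting and prediction, since it is part of the model rather than of the individual trees.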

## Key Parameters

- $\text{n\_estimators}$: The number of boosting stages to perform. More stages increase the model complexity and the training time.
- $\text{learning\_rate}$: Shrinks the contribution of each tree by the learning rate. There is a trade-off between learning rate and number of stages: a smaller learning rate usually needs more stages to reach the same fit.
- $\text{max\_depth}$: Limits the depth of each individual decision tree, which bounds its number of nodes. Used to control over-fitting, as a higher depth allows the model to learn relations very specific to particular training samples.
- $\text{min\_samples\_split}$: The minimum number of samples required to split an internal node.
- $\text{min\_samples\_leaf}$: The minimum number of samples required to be at a leaf node.
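
To make the trade-off between learning rate and number of stages concrete, the sketch below tracks the hold-out error after every stage with `staged_predict`. The synthetic data, the two learning rates, and the variable names are assumptions chosen for illustration; the resulting numbers say nothing about the bead-area data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Synthetic data, used only to exercise the parameters.
rng = np.random.default_rng(0)
X_syn = rng.uniform(0, 6, size=(300, 1))
y_syn = np.sin(X_syn[:, 0]) + rng.normal(0, 0.1, size=300)
X_tr, X_te = X_syn[:200], X_syn[200:]
y_tr, y_te = y_syn[:200], y_syn[200:]

stage_errors = {}
for lr in (0.05, 0.5):
    model = GradientBoostingRegressor(
        n_estimators=200,
        learning_rate=lr,
        max_depth=2,
        min_samples_split=2,
        min_samples_leaf=1,
        random_state=0,
    )
    model.fit(X_tr, y_tr)
    # staged_predict yields the ensemble's predictions after each boosting stage.
    stage_errors[lr] = [mean_squared_error(y_te, p) for p in model.staged_predict(X_te)]
    best_stage = int(np.argmin(stage_errors[lr])) + 1
    print(f"learning_rate={lr}: lowest hold-out MSE at stage {best_stage}")
```

Plotting each list in `stage_errors` against the stage number gives a quick view of how many stages a given learning rate needs before the hold-out error stops improving.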

## Advantages

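- Can handle heterogeneous features (numeric and categorical; scikit-learn's `GradientBoostingRegressor` expects categorical features to be encoded numerically first).
- Robust to outliers in output space (via robust loss functions such as absolute error or Huber loss).
- Provides prediction intervals by way of quantile regression.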
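
As a sketch of the quantile-regression point, two models trained with `loss="quantile"` can bracket the target. The synthetic data and the choice of the 10% and 90% quantiles are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic data: a linear trend with unit-variance noise.
rng = np.random.default_rng(0)
X_q = rng.uniform(0, 10, size=(500, 1))
y_q = X_q[:, 0] + rng.normal(0, 1.0, size=500)

quantile_models = {}
for alpha in (0.1, 0.9):
    # alpha selects which quantile of the target the model estimates.
    quantile_model = GradientBoostingRegressor(
        loss="quantile", alpha=alpha, n_estimators=100, max_depth=2, random_state=0
    )
    quantile_models[alpha] = quantile_model.fit(X_q, y_q)

lower = quantile_models[0.1].predict(X_q)
upper = quantile_models[0.9].predict(X_q)
coverage = np.mean((y_q >= lower) & (y_q <= upper))
print(f"Share of training targets inside the 10%-90% band: {coverage:.2f}")
```

The two models are trained independently, so nothing forces the lower estimate to stay below the upper one at every point; checking this on the data at hand is a reasonable precaution.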

## Applications

Gradient Boosting can be used for:
- Demand forecasting in retail,
- Price prediction in real estate,
- Predicting customer lifetime value in various industries.


# Implementation

### Import Libraries

In [1]:
import warnings

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, clear_output, HTML

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Silence library warnings to keep the notebook output readable
warnings.filterwarnings("ignore")

### Import and Show Data

In [2]:
data = pd.read_csv('./Data/BeadArea.csv')
display(data.head())
print("The data is composed of ", data.shape[0], " rows and ", data.shape[1], " columns.")

| | Time | Position Feedback | Velocity Feedback from Axis 1 | Velocity Feedback from Axis 2 | Temperature | Current Feedback | Interpolated Bead Area |
|---|---|---|---|---|---|---|---|
| 0 | 0.056 | 15.567 | 11.178 | 6.4952 | 1.1173 | 1.2309 | 1.1184 |
| 1 | 0.057 | 15.579 | 11.257 | 6.4952 | 1.1175 | 1.2735 | 1.1191 |
| 2 | 0.058 | 15.59 | 11.375 | 3.2476 | 1.1175 | 1.1567 | 1.1198 |
| 3 | 0.059 | 15.601 | 11.324 | 6.4952 | 1.1173 | 1.2975 | 1.1205 |
| 4 | 0.06 | 15.613 | 11.285 | 6.4952 | 1.1173 | 1.2168 | 1.1212 |


The data is composed of  295568  rows and  7  columns.


### Data Preprocessing

In [3]:
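# Add the three previous bead-area values as lag features
data['Lag1'] = data['Interpolated Bead Area'].shift(1)
data['Lag2'] = data['Interpolated Bead Area'].shift(2)
data['Lag3'] = data['Interpolated Bead Area'].shift(3)
# shift() leaves NaN in the first rows; drop every row that contains a missing value
data.dropna(inplace=True)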
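
To see what the lag features look like, the same `shift` and `dropna` steps can be applied to a tiny frame. The frame `toy` and its column name `target` are hypothetical and stand in for the bead-area column.

```python
import pandas as pd

# Hypothetical five-row series standing in for 'Interpolated Bead Area'.
toy = pd.DataFrame({"target": [10.0, 11.0, 12.0, 13.0, 14.0]})

# Lag k holds the value observed k rows earlier.
for lag in (1, 2, 3):
    toy[f"Lag{lag}"] = toy["target"].shift(lag)

# Rows 0-2 lack a complete three-step history and are dropped.
toy_clean = toy.dropna()
print(toy_clean)
```

Only rows 3 and 4 survive, and their original index labels are kept, which is why the index of the real data starts at 3 after preprocessing.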

### Predict Bead Area

In [4]:

# Define widgets with adjusted layout
index_range_slider = widgets.IntRangeSlider(
    value=[0, min(500, len(data))],
    min=0,
    max=len(data),
    step=1,
    description='Index Range:',
    layout=widgets.Layout(width='600px'),  # Increase width for better readability
    style={'description_width': '150px'},  # Increase description width
    continuous_update=False
)

feature_select = widgets.SelectMultiple(
    options=['Lag1', 'Lag2', 'Lag3', 'Time', 'Position Feedback', 'Velocity Feedback from Axis 1', 'Velocity Feedback from Axis 2', 'Temperature', 'Current Feedback'],
    value=['Lag1', 'Lag2', 'Lag3', 'Time', 'Position Feedback', 'Velocity Feedback from Axis 1', 'Velocity Feedback from Axis 2', 'Temperature', 'Current Feedback'],
    description='Features:',
    layout=widgets.Layout(width='600px', height='180px'),  # Increase width and height
    style={'description_width': '150px'},  # Increase description width
    disabled=False
)

train_size_slider = widgets.IntSlider(
    value=80,
    min=50,
    max=95,
    step=1,
    description='Train %:',
    layout=widgets.Layout(width='600px'),  # Increase width
    style={'description_width': '150px'},  # Increase description width
    continuous_update=False
)

# Gradient Boosting parameter sliders
n_estimators_slider = widgets.IntSlider(
    value=100,
    min=10,
    max=500,
    step=10,
    description='Number of Estimators:',
    layout=widgets.Layout(width='600px'),
    style={'description_width': '160px'},
    continuous_update=False
)

max_depth_slider = widgets.IntSlider(
    value=3,
    min=1,
    max=20,
    step=1,
    description='Maximum Depth:',
    layout=widgets.Layout(width='600px'),
    style={'description_width': '150px'},
    continuous_update=False
)

learning_rate_slider = widgets.FloatSlider(
    value=0.1,
    min=0.01,
    max=1,
    step=0.01,
    description='Learning Rate:',
    layout=widgets.Layout(width='600px'),
    style={'description_width': '150px'},
    continuous_update=False
)

apply_button = widgets.Button(description="Apply Changes", layout=widgets.Layout(width='800px'))

# Define the function to apply changes and update the plots
def apply_changes(b):
    with output:
        clear_output(wait=True)
        
        # Extract the parameters from widgets
        index_range = index_range_slider.value
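        selected_features = list(feature_select.value)
        if not selected_features:
            # Fitting needs at least one feature column
            print("Select at least one feature.")
            return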
        train_size_pct = train_size_slider.value / 100
        n_estimators = n_estimators_slider.value
        max_depth = max_depth_slider.value
        learning_rate = learning_rate_slider.value
        
        # Slice the data by position (the slider values are row positions, not index labels)
        df = data.iloc[index_range[0]:index_range[1]]
        
        # Prepare the features and the target
        X = df[selected_features]
        y = df['Interpolated Bead Area']
        
        # Chronological train-test split: the first rows train, the remaining rows test
        train_size = int(len(df) * train_size_pct)
        X_train, X_test = X.iloc[:train_size], X.iloc[train_size:]
        y_train, y_test = y.iloc[:train_size], y.iloc[train_size:]
        
        # Train the model
        model = GradientBoostingRegressor(
            n_estimators=n_estimators,
            max_depth=max_depth,
            learning_rate=learning_rate,
            random_state=42
        )
        model.fit(X_train, y_train)
        
        # Predict on test data
        y_pred = model.predict(X_test)
        mse = mean_squared_error(y_test, y_pred)
        display(HTML(f'<b>Mean Squared Error: {mse:.5f}</b>'))  # Display MSE in bold
        
        # Plot predicted vs actual
        plt.figure(figsize=(10, 6))
        plt.plot(y_train.index, y_train, label='Training', color='green')
        plt.plot(y_test.index, y_test, label='Actual', color='blue')
        plt.plot(y_test.index, y_pred, label='Predicted', color='red', linestyle='--')
        plt.xlabel('Sample Index')
        plt.ylabel('Interpolated Bead Area')
        plt.title('Actual vs Predicted Interpolated Bead Area')
        plt.legend()
        plt.show()
        
        # Calculate the squared error for each test point
        pointwise_squared_error = (y_test - y_pred) ** 2
        
        # Plot the pointwise squared error alongside the actual and predicted values
        plt.figure(figsize=(10, 6))
        plt.plot(y_test.index, y_test, label='Actual', color='blue')
        plt.plot(y_test.index, y_pred, label='Predicted', color='red', linestyle='--')
        plt.plot(y_test.index, pointwise_squared_error, label='Pointwise Squared Error', color='orange')
        plt.xlabel('Sample Index')
        plt.ylabel('Bead Area / Squared Error')
        plt.title('Pointwise Squared Error of Predicted vs Actual Interpolated Bead Area')
        plt.legend()
        plt.show()

# Link the apply button to the function
apply_button.on_click(apply_changes)

# Display the widgets and the output area
output = widgets.Output()

display(index_range_slider, feature_select, train_size_slider, n_estimators_slider, max_depth_slider, learning_rate_slider, apply_button, output)
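
For use outside a notebook, the core of `apply_changes` can be sketched without widgets. The helper name `chronological_fit_and_score`, the synthetic series, and its column name `Area` are assumptions made for illustration; they stand in for the bead-area data, which is not reproduced here.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error


def chronological_fit_and_score(frame, features, target, train_fraction=0.8, **gbr_params):
    """Train on the first rows of the frame and score on the remaining rows."""
    split = int(len(frame) * train_fraction)
    train, test = frame.iloc[:split], frame.iloc[split:]
    model = GradientBoostingRegressor(random_state=42, **gbr_params)
    model.fit(train[features], train[target])
    predictions = model.predict(test[features])
    return model, mean_squared_error(test[target], predictions)


# Synthetic stand-in for the bead-area signal: a noisy sine wave.
rng = np.random.default_rng(0)
t = np.arange(600)
series = pd.DataFrame({
    "Time": t * 0.001,
    "Area": 1.0 + 0.1 * np.sin(t / 20) + rng.normal(0, 0.005, size=600),
})
for lag in (1, 2, 3):
    series[f"Lag{lag}"] = series["Area"].shift(lag)
series = series.dropna()

model, mse = chronological_fit_and_score(
    series, ["Lag1", "Lag2", "Lag3"], "Area",
    n_estimators=100, max_depth=3, learning_rate=0.1,
)
print(f"Hold-out MSE: {mse:.6f}")
```

Because the lag features carry the most recent true values, this setup evaluates one-step-ahead prediction; forecasting further ahead would require feeding the model's own predictions back in as lags.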

