# MLP Regressor

A **Multi-Layer Perceptron (MLP) Regressor** is a feedforward artificial neural network used in supervised learning tasks where the goal is to predict a continuous dependent variable. By stacking layers of neurons, an MLP can learn complex, non-linear relationships between the features and the target.

## Concept

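An MLP consists of at least three layers of nodes: an input layer, one or more hidden layers, and an output layer. Each hidden node is a neuron that applies a nonlinear activation function; in regression, the output nodes usually apply no nonlinearity, so predictions are not confined to a fixed range. The network is trained with backpropagation, which computes the gradient of the loss with respect to every weight and bias so that an optimizer can update them.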

## Architecture

- **Input Layer:** Receives the feature set as inputs which are then passed to the first hidden layer.
- **Hidden Layers:** Each hidden layer transforms the inputs from the previous layer based on weights, biases, and activation functions. MLP can have one or more hidden layers.
- **Output Layer:** Produces the final output of the network. In the case of regression, it typically has a single neuron for single-target regression or multiple neurons for multi-target regression.
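
As a rough illustration of how these layers fit together, the following NumPy sketch runs one forward pass through a network with a single hidden layer. The layer sizes, the random weights, and the `forward` helper are assumptions made for the example, not part of this notebook's model.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    """One forward pass through a single-hidden-layer MLP for regression."""
    hidden = np.maximum(0.0, X @ W1 + b1)  # hidden layer: affine map followed by ReLU
    return hidden @ W2 + b2                # output layer: affine map, no nonlinearity

rng = np.random.default_rng(0)
n_samples, n_features, n_hidden = 5, 3, 8  # illustrative sizes only

X = rng.normal(size=(n_samples, n_features))   # input layer: the feature matrix
W1 = rng.normal(size=(n_features, n_hidden))   # input -> hidden weights
b1 = np.zeros(n_hidden)                        # hidden biases
W2 = rng.normal(size=(n_hidden, 1))            # hidden -> output weights
b2 = np.zeros(1)                               # output bias

y_hat = forward(X, W1, b1, W2, b2)
print(y_hat.shape)  # (5, 1): one prediction per sample
```

For multi-target regression, the only change in this sketch would be the second dimension of `W2` and the length of `b2`.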

## Key Parameters

- $n_{\text{hidden}}$: Number of neurons in the hidden layers.
- $\text{activation}$: The activation function for the neurons, typically ReLU, sigmoid, or tanh.
- $\text{solver}$: The algorithm for weight optimization, such as SGD, Adam, or LBFGS.
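- $\text{learning\_rate}$: Step size taken at each iteration while moving toward a minimum of the loss function. In scikit-learn's `MLPRegressor`, the initial step size is set by `learning_rate_init`, while `learning_rate` selects the step-size schedule (`constant`, `invscaling`, or `adaptive`) used by the SGD solver.
- $\text{max\_iter}$: Maximum number of iterations. The solver iterates until convergence or until this number of iterations is reached. For the SGD and Adam solvers in scikit-learn, one iteration is one pass over the training data (an epoch).
- $\alpha$: L2 penalty (regularization term) parameter. A higher value of $\alpha$ increases the regularization strength, which helps reduce overfitting but, if set too high, can cause the network to underfit the training data.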
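
The sketch below shows how these parameters might be passed to scikit-learn's `MLPRegressor`, the estimator used later in this notebook. The specific values are illustrative assumptions, not recommendations.

```python
from sklearn.neural_network import MLPRegressor

model = MLPRegressor(
    hidden_layer_sizes=(64, 32),  # two hidden layers with 64 and 32 neurons
    activation="relu",            # also: "identity", "logistic" (sigmoid), "tanh"
    solver="adam",                # also: "sgd", "lbfgs"
    learning_rate_init=0.001,     # initial step size, used by "sgd" and "adam"
    max_iter=500,                 # epochs for "sgd"/"adam", iterations for "lbfgs"
    alpha=0.0001,                 # L2 penalty strength
    random_state=0,               # fixed seed for reproducible weight initialization
)

print(model.get_params()["hidden_layer_sizes"])  # (64, 32)
```

Nothing is trained until `model.fit(X, y)` is called; the constructor only stores the settings.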


## Training Process

1. **Forward Propagation:** Compute the predicted output $\hat{y}$ using the current weights and biases in the network layers.
2. **Calculate Loss:** Measure the difference between the actual output $y$ and the predicted output $\hat{y}$ with a loss function, typically Mean Squared Error (MSE) for regression.
3. **Backpropagation:** Compute the gradient of the loss with respect to each weight and bias, then let the solver update them in the direction that reduces the loss. The three steps repeat until the loss converges or the iteration limit is reached.
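
The three steps above can be sketched in plain NumPy for a network with one `tanh` hidden layer trained by full-batch gradient descent. This is a minimal sketch under stated assumptions: the toy data, the hidden size, the learning rate, and the number of epochs are all chosen for illustration, and scikit-learn's solvers are more elaborate than this loop.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X)                       # toy target, chosen only for illustration

n_hidden, lr = 16, 0.05                   # illustrative hidden size and step size
W1 = rng.normal(scale=0.5, size=(1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
b2 = np.zeros(1)

losses = []
for epoch in range(500):
    # 1. Forward propagation
    h = np.tanh(X @ W1 + b1)              # hidden activations
    y_hat = h @ W2 + b2                   # linear output for regression

    # 2. Loss: mean squared error
    err = y_hat - y
    losses.append(float(np.mean(err ** 2)))

    # 3. Backpropagation: apply the chain rule layer by layer
    d_yhat = 2.0 * err / len(X)           # dLoss/dy_hat
    dW2 = h.T @ d_yhat
    db2 = d_yhat.sum(axis=0)
    d_z1 = (d_yhat @ W2.T) * (1.0 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_z1
    db1 = d_z1.sum(axis=0)

    # Gradient-descent update of all weights and biases
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

print(f"MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```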

## Advantages

- Capable of modeling highly non-linear functions.
- Flexible to model various types of responses through the choice of activation and loss functions.
- Suitable for large datasets and complex feature interactions.

## Applications

MLP is widely used in:
- Predicting energy consumption,
- Stock market predictions,
- Real estate price forecasting,
- Other complex regression tasks in quantitative finance and economics.


# Implementation

### Import Libraries

**Press ▶ to import the libraries.**

In [None]:
import io
import warnings

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import ipywidgets as widgets
from IPython.display import display, clear_output, HTML

from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

# Suppress library warnings (including scikit-learn convergence warnings) to keep the output readable
warnings.filterwarnings("ignore")

print("Libraries are imported.")

### Import and Show Data

**Press ▶ to load the data.**

In [None]:
import os
import pandas as pd
import ipywidgets as widgets
from IPython.display import display, clear_output

# List all CSV and Excel files in the ./Data directory
supported_extensions = ['.csv', '.xlsx', '.xls']
files = [f for f in os.listdir('./Data') if any(f.endswith(ext) for ext in supported_extensions)]

# Create a dropdown widget
dropdown = widgets.Dropdown(
    options=files,
    description='Files:',
    disabled=False,
)

# Create a button widget
button = widgets.Button(
    description='Select',
    disabled=False,
    button_style='',
    tooltip='Click to select file',
    icon='check'
)

# Output widget to display messages
output = widgets.Output()

# Function to handle button click
def on_button_click(b):
    with output:
        clear_output()
        selected_file = dropdown.value
        global data
        # Read the selected file according to its extension
        file_path = os.path.join('./Data', selected_file)
        if selected_file.endswith('.csv'):
            data = pd.read_csv(file_path)
        elif selected_file.endswith(('.xlsx', '.xls')):
            data = pd.read_excel(file_path)
        print(f"File '{selected_file}' loaded into `data`.")

# Attach the function to the button widget
button.on_click(on_button_click)

# Display the dropdown, button widgets, and initial message within the output widget
with output:
    print("Please select a file from the dropdown and click 'Select'.")
display(output)
display(dropdown)
display(button)



**Press ▶ to display the data.**

In [None]:
display(data.head())
print(f"The data is composed of {data.shape[0]} rows and {data.shape[1]} columns.")

### Data Preprocessing

**Press ▶ to specify the target column.**

In [None]:
import ipywidgets as widgets
import pandas as pd

# Create a Dropdown widget for column selection
dropdown = widgets.Dropdown(
    options=data.columns.tolist(),
    value=data.columns[0],
    description='Select Target Column:',
    disabled=False,
    layout=widgets.Layout(width='500px'),
    style={'description_width': '200px'}
)

# Create a Button widget
button = widgets.Button(
    description='Select',
    button_style='',  # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Click to select the target column as the last column',
    icon='check'  # FontAwesome icon names (without 'fa-')
)

# Create an Output widget for displaying messages
output = widgets.Output()

# Function to handle button click that rearranges the DataFrame
def on_button_clicked(b):
    with output:
        output.clear_output()
        global data
        # Get the selected column name
        selected_column = dropdown.value
        # Reorder the DataFrame columns
        new_columns = [col for col in data.columns if col != selected_column] + [selected_column]
        data = data[new_columns]
        print(f"Column '{selected_column}' has been moved to the last position.")

# Link the button click event to the function
button.on_click(on_button_clicked)

# Display the widgets and output
display(widgets.VBox([dropdown, button, output]))


**Press ▶ to create a lagged target column.**

In [None]:
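# The target is the last column (moved there in the previous step)
target = data.columns[-1]
# Add the previous row's target value as an extra feature
data['Target_Lag1'] = data[target].shift(1)

# The first row has no previous value; this also drops any other row containing NaN
data.dropna(inplace=True)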
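
To make the effect of the lag step concrete, here is a small sketch on a made-up four-row frame. The column names and values are hypothetical and exist only for this example.

```python
import pandas as pd

toy = pd.DataFrame({"y": [10.0, 11.0, 13.0, 12.0]})  # hypothetical target values
toy["y_lag1"] = toy["y"].shift(1)  # each row receives the previous row's value
toy = toy.dropna()                 # the first row has no predecessor and is dropped

print(toy)
```

After this step each remaining row pairs the current target with the value observed one step earlier, which is what lets the model use the previous observation as a feature.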

### Predict Bead Area

## Parameters

### Hidden Layer Sizes
Hidden layer sizes refer to the number of neurons in each hidden layer. This determines the network's complexity and capacity. More neurons and layers allow the model to learn complex patterns but can increase overfitting risk. Properly configuring hidden layers balances complexity and generalization. In this notebook, the slider sets the number of neurons in a single hidden layer.
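
One way to see how capacity grows with the hidden layer sizes is to count the trainable parameters of a fully connected network. The helper `count_parameters` below is a hypothetical function written for this illustration, and the feature count of 4 is an assumption.

```python
def count_parameters(n_features, hidden_layer_sizes, n_outputs=1):
    """Number of weights and biases in a fully connected MLP."""
    sizes = [n_features, *hidden_layer_sizes, n_outputs]
    # Each layer has (inputs x outputs) weights plus one bias per output.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

print(count_parameters(4, (100,)))      # 601
print(count_parameters(4, (100, 100)))  # 10701
```

Adding a second hidden layer of the same width multiplies the parameter count many times over, which is one reason deeper or wider networks need more data or stronger regularization.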

### Activation Function
The activation function applies a non-linear transformation to each neuron's input. Common functions include ReLU, sigmoid, and tanh. The choice of activation function affects the model's ability to capture non-linear relationships. Selecting the right activation function is crucial for effective learning of data patterns.
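
For reference, the four activation names offered by scikit-learn's `MLPRegressor` can be written out in NumPy as follows. This is only a sketch of the mathematical definitions, not the library's internal code.

```python
import numpy as np

activations = {
    "identity": lambda z: z,                         # no transformation (linear)
    "logistic": lambda z: 1.0 / (1.0 + np.exp(-z)),  # sigmoid, output in (0, 1)
    "tanh": np.tanh,                                 # output in (-1, 1)
    "relu": lambda z: np.maximum(0.0, z),            # zero for negative inputs
}

z = np.array([-2.0, 0.0, 2.0])
for name, fn in activations.items():
    print(name, np.round(fn(z), 3))
```

Note that `identity` makes every layer linear, so a network using it in the hidden layers can only represent a linear function of the inputs.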

### L2 Regularization
L2 regularization penalizes large weights to prevent overfitting by adding a term to the loss function proportional to the sum of squared weights, controlled by a parameter (alpha). Higher alpha values increase regularization, reducing overfitting risk but potentially causing underfitting. Tuning L2 regularization helps balance bias and variance.
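
In its generic form, the penalized objective is the data loss plus `alpha` times the sum of squared weights. The sketch below implements that generic form; the helper `penalized_loss` and the numbers are assumptions for illustration, and libraries such as scikit-learn scale the two terms with additional constants.

```python
import numpy as np

def penalized_loss(y_true, y_pred, weights, alpha):
    """Mean squared error plus an L2 penalty on all weight matrices."""
    mse = np.mean((y_true - y_pred) ** 2)
    l2 = sum(np.sum(W ** 2) for W in weights)  # sum of squared weights
    return mse + alpha * l2

y_true = np.array([1.0, 2.0])
y_pred = np.array([1.0, 4.0])
weights = [np.array([[1.0, -2.0]])]            # a single hypothetical weight matrix

print(penalized_loss(y_true, y_pred, weights, alpha=0.0))  # 2.0
print(penalized_loss(y_true, y_pred, weights, alpha=0.1))  # 2.5
```

With the same predictions, a larger `alpha` makes large weights more costly, so the optimizer is pushed toward smaller weights and smoother fits.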


**Press ▶ to select the independent variables, the train/test split, and the model parameters, then fit the model and forecast the data.**

In [None]:
# Define widgets with adjusted layout
index_range_slider = widgets.IntRangeSlider(
    value=[0, min(5000, len(data))],
    min=0,
    max=len(data),
    step=1,
    description='Index Range:',
    layout=widgets.Layout(width='600px'),  # Increase width for better readability
    style={'description_width': '150px'},  # Increase description width
    continuous_update=False
)

feature_select = widgets.SelectMultiple(
    options=tuple(col for col in data.columns if col != target),
    value=tuple(col for col in data.columns if col != target),
    description='Features:',
    layout=widgets.Layout(width='600px', height='180px'),  # Increase width and height
    style={'description_width': '150px'},  # Increase description width
    disabled=False
)

train_size_slider = widgets.IntSlider(
    value=80,
    min=50,
    max=95,
    step=1,
    description='Train %:',
    layout=widgets.Layout(width='600px'),  # Increase width
    style={'description_width': '150px'},  # Increase description width
    continuous_update=False
)

# MLP Regressor parameter sliders
hidden_layer_sizes_slider = widgets.IntSlider(
    value=100,
    min=10,
    max=500,
    step=10,
    description='Hidden Layer Sizes:',
    layout=widgets.Layout(width='600px'),
    style={'description_width': '150px'},
    continuous_update=False
)

activation_dropdown = widgets.Dropdown(
    options=['identity', 'logistic', 'tanh', 'relu'],
    value='relu',
    description='Activation Function:',
    layout=widgets.Layout(width='600px'),
    style={'description_width': '150px'}
)

alpha_slider = widgets.FloatSlider(
    value=0.0001,
    min=0.00001,
    max=0.1,
    step=0.00001,
    description='L2 regularization:',
    layout=widgets.Layout(width='600px'),
    style={'description_width': '150px'},
    continuous_update=False
)

apply_button = widgets.Button(description="Apply Changes", layout=widgets.Layout(width='800px'))

# Define the function to apply changes and update the plots
def apply_changes(b):
    with output:
        clear_output(wait=True)
        
        # Extract the parameters from widgets
        index_range = index_range_slider.value
        selected_features = list(feature_select.value)
        train_size_pct = train_size_slider.value / 100
        hidden_layer_sizes = hidden_layer_sizes_slider.value
        activation = activation_dropdown.value
        alpha = alpha_slider.value
        
        # Slice the data by row position
        df = data.iloc[index_range[0]:index_range[1]]
        
        # Separate the selected features from the target column
        X = df[selected_features]
        y = df[target]
        
        # Chronological train-test split (no shuffling): the first rows train, the rest test
        train_size = int(len(df) * train_size_pct)
        X_train, X_test = X.iloc[:train_size], X.iloc[train_size:]
        y_train, y_test = y.iloc[:train_size], y.iloc[train_size:]
        
        # Train the model
        model = MLPRegressor(
            hidden_layer_sizes=(hidden_layer_sizes,),
            activation=activation,
            alpha=alpha,
            random_state=42,
            max_iter=100000
        )
        model.fit(X_train, y_train)
        
        # Predict on test data
        y_pred = model.predict(X_test)
        mse = mean_squared_error(y_test, y_pred)
        display(HTML(f'<b>Mean Squared Error: {mse:.5f}</b>'))  # Display MSE in bold
        
        # Plot predicted vs actual
        plt.figure(figsize=(10, 6))
        plt.plot(y_train.index, y_train, label='Training', color='green')
        plt.plot(y_test.index, y_test, label='Actual', color='blue')
        plt.plot(y_test.index, y_pred, label='Predicted', color='red', linestyle='--')
        plt.xlabel('Time')
        plt.ylabel(target)
        plt.title('Actual vs Predicted '+target)
        plt.legend()
        plt.show()
        
        # Squared error of each test prediction (its mean is the MSE reported above)
        squared_error = (y_test - y_pred) ** 2
        
        # Plot actual values, predictions, and the pointwise squared error
        plt.figure(figsize=(10, 6))
        plt.plot(y_test.index, y_test, label='Actual', color='blue')
        plt.plot(y_test.index, y_pred, label='Predicted', color='red', linestyle='--')
        plt.plot(y_test.index, squared_error, label='Squared Error', color='orange')
        plt.xlabel('Time')
        plt.ylabel('Value / Squared Error')
        plt.title('Pointwise Squared Error of Predicted vs Actual '+target)
        plt.legend()
        plt.show()

# Link the apply button to the function
apply_button.on_click(apply_changes)

# Display the widgets and the output area
output = widgets.Output()

display(index_range_slider, feature_select, train_size_slider, hidden_layer_sizes_slider, activation_dropdown, alpha_slider, apply_button, output)
