# Module 1: Introduction to Scikit-Learn

## Section 2: Supervised Learning Algorithms

### Part 19: Neural Networks

In this section, we will explore Neural Networks using Scikit-Learn's Multi-Layer Perceptron (MLP), a powerful algorithm for complex learning tasks. MLP is a type of artificial neural network that consists of multiple layers of interconnected nodes (neurons).

Scikit-learn primarily focuses on traditional machine learning algorithms and is not a deep learning library. Its neural network support is limited mainly to the multi-layer perceptron estimators in the `sklearn.neural_network` module, and it offers no GPU support, so it is not intended for large-scale deep learning. It is still a convenient place to learn the perceptron and the MLP, which are fundamental concepts in deep learning.

If you're interested in working with neural networks and deep learning, consider using specialized deep learning libraries like TensorFlow, Keras, or PyTorch, which offer a wide range of tools and models for neural network development.

### 19.1 Understanding Perceptron

The perceptron is the simplest form of a neural network, consisting of a single artificial neuron. It takes a weighted sum of input features, adds a bias term, and passes the result through an activation function (often a step function). The output is binary, typically 0 or 1.
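The computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not scikit-learn's implementation: the function name `perceptron_output` and the hand-picked weights and bias (chosen so the neuron behaves like a logical AND) are assumptions made for the example.

```python
import numpy as np

def perceptron_output(x, weights, bias):
    """Weighted sum of the inputs plus bias, passed through a step function."""
    z = np.dot(weights, x) + bias
    return 1 if z > 0 else 0

# Hypothetical hand-picked parameters that make the neuron act like logical AND.
weights = np.array([1.0, 1.0])
bias = -1.5

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_outputs = [perceptron_output(np.array(x), weights, bias) for x in inputs]
print(and_outputs)  # [0, 0, 0, 1]
```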

The perceptron is a binary linear classifier that learns to separate data points into two classes using a linear decision boundary. Perceptrons are limited to linearly separable problems; they cannot solve problems that require a non-linear decision boundary, such as XOR. They played a significant role in the history of neural networks but are rarely used on their own today, as modern deep learning relies on more complex architectures such as multi-layer feedforward networks and convolutional neural networks.
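To make the linear-separability limit concrete, the sketch below trains a perceptron with the classic perceptron learning rule on two tiny problems: AND (linearly separable) and XOR (not linearly separable). The helper names `train_perceptron` and `accuracy`, the learning rate, and the number of epochs are illustrative assumptions.

```python
import numpy as np

def train_perceptron(X, y, epochs=50, lr=1.0):
    """Perceptron learning rule: weights change only on misclassified samples."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if np.dot(w, xi) + b > 0 else 0
            update = lr * (target - pred)  # zero when the prediction is correct
            w += update * xi
            b += update
    return w, b

def accuracy(X, y, w, b):
    """Fraction of samples classified correctly by the linear boundary."""
    preds = (X @ w + b > 0).astype(int)
    return float(np.mean(preds == y))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1])
y_xor = np.array([0, 1, 1, 0])

acc_and = accuracy(X, y_and, *train_perceptron(X, y_and))
acc_xor = accuracy(X, y_xor, *train_perceptron(X, y_xor))
print(f"AND accuracy: {acc_and:.2f}")
print(f"XOR accuracy: {acc_xor:.2f}")
```

The AND problem is learned perfectly, while no single line can classify all four XOR points correctly, so the XOR accuracy stays below 1.0 regardless of how long training runs.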

Scikit-learn does include a single-layer perceptron, `sklearn.linear_model.Perceptron`, but because it is a linear classifier it can only handle linearly separable problems. For more complex tasks, scikit-learn provides other classifiers such as Support Vector Machines (SVMs) with non-linear kernels and multi-layer perceptrons (MLPs).

### 19.2 Understanding Neural Networks

Neural networks are a class of machine learning models inspired by the structure and functioning of the human brain. They consist of interconnected layers of artificial neurons, also called nodes or units. Neural networks have the ability to model complex relationships in data, making them suitable for various tasks such as image recognition, natural language processing, and more.

Key components of neural networks include:
- Input Layer: The input layer receives the initial data or features.
- Hidden Layers: Intermediate layers between the input and output layers, where transformations and feature extraction occur.
- Output Layer: The final layer that produces the network's predictions or outputs.
- Weights and Biases: Each connection between neurons is associated with a weight, and each neuron has an associated bias. These parameters are learned during training.
- Activation Functions: Non-linear functions applied to the output of neurons. Common activation functions include sigmoid, ReLU (Rectified Linear Unit), and softmax.
- Loss Function: A function that measures the difference between the predicted and actual values. The goal during training is to minimize this loss.
- Backpropagation: A learning algorithm that calculates gradients of the loss with respect to weights and biases, enabling weight updates to minimize the loss.
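The components listed above can be tied together in a small NumPy sketch: a network with one hidden layer, a forward pass, a log loss, and a backpropagation step followed by a gradient descent update. The 2-4-1 layer sizes, the tanh and sigmoid activations, the learning rate, and the XOR data are assumptions made for illustration; this is not how scikit-learn implements training internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: a small problem that needs a hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Weights and biases for a 2-4-1 network (sizes chosen for illustration).
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = np.tanh(X @ W1 + b1)   # hidden layer activations
    p = sigmoid(h @ W2 + b2)   # output layer: probability of class 1
    return h, p

def log_loss(p, y):
    eps = 1e-12  # avoids log(0)
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

initial_loss = log_loss(forward(X)[1], y)

lr = 0.5
for _ in range(2000):
    h, p = forward(X)
    # Backpropagation: chain rule from the output layer back to the input.
    dz2 = (p - y) / len(X)             # gradient w.r.t. output pre-activation
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * (1 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)
    # Gradient descent update of weights and biases.
    W2 -= lr * dW2
    b2 -= lr * db2
    W1 -= lr * dW1
    b1 -= lr * db1

final_loss = log_loss(forward(X)[1], y)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```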

### 19.3 Understanding Multi-Layer Perceptron (MLP)

A Multi-Layer Perceptron (MLP) is a feedforward neural network architecture, in which information flows from the input layer through one or more hidden layers to the output layer. Each layer consists of multiple neurons, and each hidden neuron applies a non-linear activation function to the weighted sum of its inputs. Having at least one hidden layer is what gives an MLP the capacity to learn complex, non-linear patterns in data.
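One way to see this layered structure is to fit a small model and inspect its learned weight matrices. The sketch below assumes the Iris dataset and an arbitrary `hidden_layer_sizes=(8, 4)`; `coefs_` and `n_layers_` are attributes of a fitted `MLPClassifier`.

```python
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)  # 4 features, 3 classes

# Two hidden layers with 8 and 4 neurons (sizes chosen for illustration).
mlp = MLPClassifier(hidden_layer_sizes=(8, 4), max_iter=2000, random_state=0)
mlp.fit(X, y)

# One weight matrix per pair of consecutive layers:
# input -> hidden 1, hidden 1 -> hidden 2, hidden 2 -> output.
layer_shapes = [w.shape for w in mlp.coefs_]
print(layer_shapes)   # [(4, 8), (8, 4), (4, 3)]
print(mlp.n_layers_)  # 4 (input, two hidden, output)
```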

MLP can learn complex patterns and relationships in the data, making it suitable for a wide range of tasks, including classification and regression.

To train an MLP model, we need a labeled dataset with the target variable and the corresponding feature values. The model learns by adjusting the weights and biases of the connections between neurons using an optimization algorithm, such as stochastic gradient descent (SGD) or Adam, which is the default solver in scikit-learn's MLP estimators.
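As a rough illustration of this training process, the sketch below fits an `MLPClassifier` with `solver="sgd"` and reads the per-iteration training loss from the fitted model's `loss_curve_` attribute. The dataset, layer size, and learning rate are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)

# Train with stochastic gradient descent; values are illustrative.
mlp = MLPClassifier(
    hidden_layer_sizes=(16,),
    solver="sgd",
    learning_rate_init=0.05,
    max_iter=500,
    random_state=0,
)
mlp.fit(X, y)

# loss_curve_ holds the training loss recorded at each iteration.
loss_curve = mlp.loss_curve_
print(f"first loss: {loss_curve[0]:.3f}, last loss: {loss_curve[-1]:.3f}")
print(f"iterations run: {mlp.n_iter_}")
```

A loss that falls and then flattens suggests that the optimizer has converged; a loss that is still dropping when `max_iter` is reached suggests that more iterations may help.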

Once trained, we can evaluate the model's performance using evaluation metrics suitable for classification or regression tasks, such as accuracy, precision, recall, F1-score, mean squared error, or area under the ROC curve (AUC-ROC).

Scikit-Learn provides the MLPClassifier class for classification tasks and the MLPRegressor class for regression tasks.

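MLP models can be prone to overfitting, especially when the model is too complex or when the dataset is small. Techniques like L2 regularization and early stopping can help mitigate overfitting and improve generalization; scikit-learn's MLP estimators expose these through the `alpha` and `early_stopping` parameters. Dropout and L1 regularization, which are common in deep learning libraries, are not available in scikit-learn's MLP estimators.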
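A minimal sketch of the two overfitting controls that scikit-learn's MLP estimators do offer: the `alpha` parameter (L2 penalty strength) and `early_stopping`, which holds out part of the training data as a validation set. The synthetic dataset and all parameter values here are assumptions for illustration, not recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic data, generated only for this example.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

mlp = MLPClassifier(
    hidden_layer_sizes=(64, 64),
    alpha=1e-2,               # L2 regularization strength
    early_stopping=True,      # stop when the validation score stops improving
    validation_fraction=0.2,  # share of training data held out for validation
    n_iter_no_change=10,
    max_iter=500,
    random_state=0,
)
mlp.fit(X_train, y_train)

test_score = mlp.score(X_test, y_test)
print(f"stopped after {mlp.n_iter_} iterations")
print(f"best validation accuracy: {mlp.best_validation_score_:.2f}")
print(f"test accuracy: {test_score:.2f}")
```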

Scaling features is important for MLP models, as it helps ensure that all features contribute equally to the learning process. It is recommended to scale the features to a similar range, such as using StandardScaler or MinMaxScaler, before training the MLP model.
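One convenient way to apply scaling is to put the scaler and the model in a single pipeline, so that the scaler is fitted only on the training portion of each cross-validation fold. The sketch below assumes the Wine dataset, whose features have very different ranges, and an arbitrary single hidden layer of 32 neurons.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

# The pipeline standardizes features, then trains the MLP on the scaled values.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
)

scores = cross_val_score(model, X, y, cv=5)
print(f"mean cross-validated accuracy: {scores.mean():.2f}")
```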

MLP models have several hyperparameters that can be tuned to improve performance. These include the number of hidden layers, the number of neurons in each layer, the activation function, the optimization algorithm, the learning rate, and more.

Hyperparameter tuning can be performed using techniques like grid search or randomized search. Scikit-Learn provides tools like GridSearchCV and RandomizedSearchCV to efficiently search through the hyperparameter space.
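A small grid search over an MLP might look like the sketch below. The grid values, the step names `scaler` and `mlp`, and the use of the Iris dataset are assumptions chosen to keep the search fast; a realistic search would usually cover more values.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("mlp", MLPClassifier(max_iter=2000, random_state=0)),
])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
param_grid = {
    "mlp__hidden_layer_sizes": [(10,), (20, 10)],
    "mlp__alpha": [1e-4, 1e-2],
}

search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)

print("best parameters:", search.best_params_)
print(f"best cross-validated accuracy: {search.best_score_:.2f}")
```

`RandomizedSearchCV` takes a similar form but samples a fixed number of parameter combinations instead of trying them all, which tends to scale better as the search space grows.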

#### MLPClassifier Example

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the Iris dataset (4 features, 3 classes).
iris = load_iris()
X = iris.data
y = iris.target

# Hold out 30% of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Standardize features; the scaler is fitted on the training data only.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train an MLP with two hidden layers of 50 and 20 neurons.
classifier = MLPClassifier(hidden_layer_sizes=(50, 20), max_iter=1000, random_state=42)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Evaluate on the held-out test set.
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
classification_rep = classification_report(y_test, y_pred, target_names=iris.target_names)
print("Classification Report:\n", classification_rep)
```

In this example we load the Iris dataset, a classic multi-class classification problem, and split it into training and testing sets. The features are standardized, with the scaler fitted on the training set only, which is generally recommended for neural networks. We then create an MLP classifier using MLPClassifier from scikit-learn. The hidden_layer_sizes parameter specifies the number of neurons in each hidden layer; here, (50, 20) means two hidden layers with 50 and 20 neurons. After training the model on the training data, we make predictions on the test data and print the accuracy and a classification report to evaluate the model's performance.

This code demonstrates how to use an MLP classifier for multi-class classification in scikit-learn. You can adjust the number of hidden layers, neurons, and other hyperparameters to suit your specific problem.

#### MLPRegressor Example

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Generate a small synthetic regression dataset.
X, y = make_regression(n_samples=100, n_features=2, noise=0.1, random_state=42)

# Hold out 30% of the data for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Standardize features; the scaler is fitted on the training data only.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train an MLP with two hidden layers of 50 and 20 neurons.
regressor = MLPRegressor(hidden_layer_sizes=(50, 20), max_iter=10000, random_state=42)
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)

# Evaluate on the held-out test set.
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.2f}")
r2 = r2_score(y_test, y_pred)
print(f"R-squared: {r2:.2f}")
```

In this example we generate synthetic regression data using make_regression from scikit-learn and split it into training and testing sets. The features are standardized, with the scaler fitted on the training set only. We then create an MLP regressor using MLPRegressor from scikit-learn; as in the classifier example, hidden_layer_sizes=(50, 20) specifies two hidden layers with 50 and 20 neurons. After training the model on the training data, we make predictions on the test data and calculate the mean squared error (MSE) and R-squared (coefficient of determination) to evaluate the model's performance.

This code demonstrates how to use an MLP regressor for regression tasks in scikit-learn. You can adjust the architecture and hyperparameters to suit your specific regression problem.

### 19.4 Summary

Multi-Layer Perceptron (MLP) is a powerful neural network algorithm for complex learning tasks. It consists of multiple interconnected layers of neurons and can learn complex patterns and relationships in the data. Scikit-Learn provides the necessary classes to implement MLP easily.