# <span style="color:darkblue;">[LDATS2350] - DATA MINING</span>

### <span style="color:darkred;">Python16 - Multi-Layer Perceptron (MLP)</span>

**Prof. Robin Van Oirbeek**  

<br/>

**<span style="color:darkgreen;">Guillaume Deside</span>** (<span style="color:gray;">guillaume.deside@uclouvain.be</span>)

---

## **🔹 What is a Multi-Layer Perceptron (MLP)?**  
A **Multi-Layer Perceptron (MLP)** is a type of artificial neural network (ANN) made up of **multiple layers** of neurons. Unlike a single perceptron, which can only solve linearly separable problems, an MLP can model **complex, non-linear relationships**.

MLPs are widely used for **classification and regression tasks** in machine learning.

---

### **🚀 Why Use MLP?**
✅ Can model **non-linear** relationships  
✅ Uses **backpropagation** for learning  
✅ Suitable for **both classification & regression**  
✅ Works well with **structured and unstructured data**  

---

## **🔹 Structure of an MLP**  
An MLP consists of the following layers:

1️⃣ **Input Layer**  
   - Receives raw input features (e.g., numerical values from a dataset).

2️⃣ **Hidden Layers**  
   - Apply transformations to extract meaningful patterns.  
   - Use activation functions like **ReLU**, **sigmoid**, or **tanh**.  

3️⃣ **Output Layer**  
   - Produces final predictions.  
   - For **classification**, applies **softmax** (multi-class) or **sigmoid** (binary) to produce probabilities.  
   - For **regression**, outputs continuous values.
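
To make the layer structure concrete, here is a minimal sketch of how layer sizes translate into weight and bias shapes in a fully connected network. The helper `layer_shapes` is hypothetical (it is not part of any library), and the single output unit is an assumption for a binary problem; the 8 inputs and the `(10, 5)` hidden layers mirror values used elsewhere in this notebook.

```python
def layer_shapes(n_inputs, hidden_layer_sizes, n_outputs):
    """Return (weight_shape, bias_shape) for each layer of a fully connected MLP."""
    sizes = [n_inputs, *hidden_layer_sizes, n_outputs]
    # Each layer connects `fan_in` units to `fan_out` units.
    return [((fan_in, fan_out), (fan_out,))
            for fan_in, fan_out in zip(sizes[:-1], sizes[1:])]


# Assumed setup: 8 input features, hidden layers of 10 and 5 neurons, 1 output unit.
shapes = layer_shapes(n_inputs=8, hidden_layer_sizes=(10, 5), n_outputs=1)
for i, (w_shape, b_shape) in enumerate(shapes, start=1):
    print(f"layer {i}: weights {w_shape}, biases {b_shape}")

# Total number of trainable parameters (weights plus biases).
n_params = sum(w[0] * w[1] + b[0] for w, b in shapes)
print("trainable parameters:", n_params)
```

Even this small architecture has 151 trainable parameters, which is one reason regularization matters on small datasets.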

---

### **🖼️ MLP Architecture**
![MLP Neural Network](https://www.simplilearn.com/ice9/free_resources_article_thumb/MultilayerANN_1.jpg)

---

## **🔹 MLP Forward Propagation**  
Each neuron applies a transformation:


$z = W \cdot X + b$

where:
- $ W $ = **Weights**
- $ X $ = **Input features**
- $ b $ = **Bias**

Then, an **activation function** is applied:

$a = f(z)$

where $ f $ can be:
- **ReLU**: $ f(x) = \max(0, x) $
- **Sigmoid**: $ f(x) = \frac{1}{1+e^{-x}} $ (commonly used at the output layer for binary classification)
- **Softmax**: Converts scores into probabilities for multi-class classification.
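
The forward pass described above can be sketched in NumPy. This is an illustration under stated assumptions, not how any particular library implements it: the layer sizes, the random weights, and the helper names `relu` and `softmax` are chosen for the example. Samples are stored as rows, so $z = W \cdot X + b$ is written as `X @ W + b`.

```python
import numpy as np


def relu(z):
    # ReLU: f(x) = max(0, x), applied element-wise.
    return np.maximum(0.0, z)


def softmax(z):
    # Subtract the row-wise maximum for numerical stability, then normalize.
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))    # 4 samples, 3 input features (illustrative)
W1 = rng.normal(size=(3, 5))   # input -> hidden layer with 5 neurons
b1 = np.zeros(5)
W2 = rng.normal(size=(5, 2))   # hidden -> output layer with 2 classes
b2 = np.zeros(2)

z1 = X @ W1 + b1               # linear transformation of the inputs
a1 = relu(z1)                  # hidden activations
z2 = a1 @ W2 + b2              # output scores
probs = softmax(z2)            # class probabilities, one row per sample

print(probs.shape)
print(probs.sum(axis=1))       # each row sums to 1
```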

---

## **🔹 Implementing MLP with Scikit-Learn**
We will now train an **MLP classifier** using `MLPClassifier` from `sklearn.neural_network`.


## **🔹 Key Parameters in `MLPClassifier`**
- `hidden_layer_sizes=(10, 5)`: Defines the number of neurons in each hidden layer.
- `activation='relu'`: Specifies the activation function.
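- `solver='adam'`: Optimizer used for the weight updates (`'lbfgs'`, `'sgd'`, or `'adam'`).
- `max_iter=500`: Maximum number of training iterations. For the `'sgd'` and `'adam'` solvers, this is the number of epochs.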
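
The parameters listed above can be combined as in the following sketch. It uses synthetic data from `make_classification` as a stand-in (an assumption, so the example runs without any file); the variable names are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic binary classification data (stand-in for a real dataset).
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.30, stratify=y_demo, random_state=0
)

# Two hidden layers with 10 and 5 neurons, ReLU activations, Adam optimizer.
mlp = MLPClassifier(
    hidden_layer_sizes=(10, 5),
    activation="relu",
    solver="adam",
    max_iter=500,
    random_state=0,
)
mlp.fit(X_tr, y_tr)

test_accuracy = mlp.score(X_te, y_te)
print("test accuracy:", test_accuracy)
# One weight matrix per layer transition: input->10, 10->5, 5->output.
print("weight matrix shapes:", [w.shape for w in mlp.coefs_])
```

If training stops before converging, scikit-learn emits a `ConvergenceWarning`; raising `max_iter` or scaling the features usually helps.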

---

## **🔹 Applications of MLP**
✅ **Image Recognition**  
✅ **Spam Detection**  
✅ **Medical Diagnosis**  
✅ **Fraud Detection**  
✅ **Stock Market Prediction**  

---

## **🎯 Key Takeaways**
✔️ MLP is a powerful **neural network model** for classification and regression.  
✔️ It consists of **input, hidden, and output layers**.  
✔️ **Backpropagation & gradient descent** help optimize weights.  
✔️ Can handle **non-linearity** and **complex data patterns**.  
✔️ **Used in deep learning applications** across various domains.  

# Data loading

```python
# Import libraries
import pandas as pd
from sklearn.model_selection import train_test_split

# Load the data: every column except the last is a feature, the last is the target
data = pd.read_csv('diabetes.csv')
X = data.iloc[:, :-1]
column_names = list(X.columns)
y = data.iloc[:, -1]

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.30,    # the default is 0.25 (75%-25% split)
    stratify=y,        # keep the class proportions in both sets
    random_state=123,  # fix the random seed for reproducibility
)                      # shuffle=True by default

print(X_train.shape)
```

```
(537, 8)
```


# MLP model

<div style="border: 2px solid darkblue; padding: 10px; background-color: #89D9F5;">

### **Exercise: Multi-Layer Perceptron (MLP) with GridSearchCV**


#### **Instructions**
1. **Set up the MLP Classifier**
   - Use `MLPClassifier()` from `sklearn.neural_network`.
   - Define a grid of hyperparameters:
     - `hidden_layer_sizes`: Different neural network architectures.
     - `max_iter`: Maximum number of training iterations.
     - `alpha`: Regularization strength.

2. **Perform Grid Search**
   - Use `GridSearchCV` to search for the best combination of hyperparameters.
   - Use **3-fold cross-validation** and **F1-score** as the scoring metric.

3. **Train the Best Model**
   - Fit the best model on the training data.
   - Predict on both **training** and **test** sets.

4. **Evaluate the Performance**
   - Print **classification report** for precision, recall, and F1-score.
   - Compute and visualize the **ROC curve**.
   - Compute the **AUC (Area Under Curve) score**.
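
One possible way to organize these steps is sketched below. It is a starting point rather than a reference solution: it runs on synthetic data from `make_classification` (an assumption, so the example is self-contained), and the grid values are arbitrary illustrative choices. Replace the synthetic split with `X_train`, `X_test`, `y_train`, `y_test` to apply it to the diabetes data.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, roc_auc_score, roc_curve
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data with 8 features.
X_syn, y_syn = make_classification(n_samples=400, n_features=8, random_state=123)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.30, stratify=y_syn, random_state=123
)

# 1. Hyperparameter grid (illustrative values).
param_grid = {
    "hidden_layer_sizes": [(10,), (10, 5)],
    "max_iter": [500, 1000],
    "alpha": [1e-4, 1e-2],
}

# 2. Grid search with 3-fold cross-validation and F1 as the scoring metric.
grid = GridSearchCV(MLPClassifier(random_state=123), param_grid, cv=3, scoring="f1")
grid.fit(X_tr, y_tr)
print("best parameters:", grid.best_params_)

# 3. With refit=True (the default), the best model is refitted on all training data.
best_model = grid.best_estimator_

# 4. Classification reports on both sets.
for name, X_part, y_part in [("train", X_tr, y_tr), ("test", X_te, y_te)]:
    print(f"--- {name} ---")
    print(classification_report(y_part, best_model.predict(X_part)))

# ROC curve and AUC from the predicted probability of the positive class.
proba_test = best_model.predict_proba(X_te)[:, 1]
fpr, tpr, thresholds = roc_curve(y_te, proba_test)
auc = roc_auc_score(y_te, proba_test)
print(f"test AUC: {auc:.3f}")

# To visualize the curve (requires matplotlib):
# import matplotlib.pyplot as plt
# plt.plot(fpr, tpr, label=f"AUC = {auc:.3f}")
# plt.plot([0, 1], [0, 1], linestyle="--")
# plt.xlabel("False positive rate"); plt.ylabel("True positive rate")
# plt.legend(); plt.show()
```

MLPs are sensitive to feature scaling, so on real data it is worth standardizing the features (for example with `StandardScaler`) before fitting.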

---

### **💡 Key Questions to Answer**
1. **Which set of hyperparameters yielded the best results?**
2. **How does the F1-score on the test set compare to the training set?**
3. **What does the ROC curve indicate about the classifier’s performance?**
4. **What does the AUC score tell us about the model?**

---

### **🚀 Challenge**
- Try different activation functions (`relu`, `tanh`, `logistic`).
- Experiment with different `hidden_layer_sizes`.
- Test on another dataset and compare the results.
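
As a starting point for the challenge, the sketch below compares the three activation functions with 3-fold cross-validated F1-scores. The synthetic data and the fixed architecture `(10, 5)` are assumptions made for illustration; results on a real dataset may rank the activations differently.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data; swap in your own X and y.
X_syn, y_syn = make_classification(n_samples=300, n_features=8, random_state=0)

results = {}
for activation in ("relu", "tanh", "logistic"):
    clf = MLPClassifier(
        hidden_layer_sizes=(10, 5),
        activation=activation,
        max_iter=1000,
        random_state=0,
    )
    # Mean F1-score over 3 cross-validation folds.
    scores = cross_val_score(clf, X_syn, y_syn, cv=3, scoring="f1")
    results[activation] = scores.mean()

for activation, mean_f1 in results.items():
    print(f"{activation:>8}: mean F1 = {mean_f1:.3f}")
```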
