# Multilayer Perceptron

This notebook demonstrates how to use the `MultilayerPerceptron` class from the `multilayer_perceptron` module of the `rice2025.supervised_learning` library.

## Setup
Import the necessary modules and load the data. This example uses the Wine dataset from scikit-learn.

The Wine dataset is a small classification dataset that has:

- **Samples:** 178  
- **Features:** 13 numeric chemical properties of wines  
- **Classes:** 3 types of wine  

**Goal:** Predict the type of wine based on its chemical features.  

In [56]:
# import library
from rice2025.supervised_learning import multilayer_perceptron
import rice2025.utilities as util

# load dataset
from sklearn.datasets import load_wine
data = load_wine()
X, y = data.data, data.target

## Data Pre-Processing
Before training, we split the dataset into **training** and **test** sets using `train_test_split`, and verify the split by printing the shape of each feature matrix. We then scale both sets with `fit_transform_split`.

In [57]:
# split dataset
X_train, X_test, y_train, y_test = util.train_test_split(X, y, test_size=.2)
print(f"Train size: {X_train.shape}, Test size: {X_test.shape}")

# scale dataset
X_train, X_test = util.fit_transform_split(X_train, X_test)

Train size: (142, 13), Test size: (36, 13)
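
The internals of `fit_transform_split` are not shown in this notebook. As a rough sketch of what such a step commonly does, the snippet below standardizes features using the mean and standard deviation of the training set only, so that no information from the test set leaks into preprocessing. The helper name `standardize_split` and the random demo arrays are hypothetical and only mirror the shapes printed above.

```python
import numpy as np

def standardize_split(X_train, X_test):
    """Scale both splits using statistics computed on the training split only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # leave constant features unscaled instead of dividing by zero
    return (X_train - mean) / std, (X_test - mean) / std

# Hypothetical demo data with the same shapes as the split above
rng = np.random.default_rng(0)
X_train_demo = rng.normal(loc=5.0, scale=2.0, size=(142, 13))
X_test_demo = rng.normal(loc=5.0, scale=2.0, size=(36, 13))

X_train_scaled, X_test_scaled = standardize_split(X_train_demo, X_test_demo)
print(X_train_scaled.shape, X_test_scaled.shape)
```

After this step each training feature has mean 0 and standard deviation 1, while the test features are only approximately standardized because they reuse the training statistics.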


## Initializing and Training the MLP Model

We will use the default parameters for `MultilayerPerceptron`:
- `n_hidden` = 100
- `lr` = .01
- `n_iter` = 1000  

Use the `fit()` method to train the model on the training data.

In [58]:
model = multilayer_perceptron.MultilayerPerceptron()
returns = model.fit(X_train, y_train)
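
To give a sense of what a model of this kind computes, here is a minimal NumPy sketch of the forward pass of a perceptron with one hidden layer. It is not the library's implementation: it assumes that `n_hidden` is the width of a single hidden layer, and the `tanh` activation, the softmax output, and the random weights are illustrative choices that may differ from what `MultilayerPerceptron` does.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(X, W1, b1, W2, b2):
    """One hidden layer (tanh, an assumed activation) followed by a softmax output."""
    hidden = np.tanh(X @ W1 + b1)
    return softmax(hidden @ W2 + b2)

rng = np.random.default_rng(0)
n_features, n_hidden, n_classes = 13, 100, 3  # 13 features and 3 classes, as in Wine

# Untrained, randomly initialised weights (illustrative only)
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)

X_demo = rng.normal(size=(5, n_features))  # five hypothetical scaled samples
probs = forward(X_demo, W1, b1, W2, b2)    # class probabilities, one row per sample
labels = probs.argmax(axis=1)              # predicted class = most probable class
print(probs.shape, labels.shape)
```

Training would then adjust the weights over `n_iter` gradient steps of size `lr` to reduce a loss on the training labels.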

## Making Predictions
Once the model is trained, the `predict()` method can be used to classify new data points.

In [59]:
y_pred = model.predict(X_test)

## Evaluating the Model

The model's performance can be summarized with a single **accuracy** score or with a more detailed **classification report**.
scikit-learn provides the `accuracy_score` and `classification_report` functions for this.

In [60]:
from sklearn.metrics import accuracy_score, classification_report

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy on test set: {accuracy:.2f}")

# Detailed report
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=data.target_names))

Accuracy on test set: 0.81

Classification Report:
              precision    recall  f1-score   support

     class_0       1.00      0.93      0.96        14
     class_1       0.70      1.00      0.82        16
     class_2       0.00      0.00      0.00         6

    accuracy                           0.81        36
   macro avg       0.57      0.64      0.59        36
weighted avg       0.70      0.81      0.74        36
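
The figures in this report can be checked by hand. The sketch below uses a confusion matrix reconstructed from the report's precision, recall, and support values (the notebook itself does not print one) and recomputes the per-class metrics and the overall accuracy. The helper `per_class_metrics` is hypothetical and written only for this illustration.

```python
# Rows are true classes, columns are predicted classes.
# Reconstructed from the classification report, not produced by the notebook.
confusion = [
    [13, 1, 0],   # class_0: 13 of 14 correct, 1 predicted as class_1
    [0, 16, 0],   # class_1: all 16 correct
    [0, 6, 0],    # class_2: all 6 predicted as class_1
]
names = ["class_0", "class_1", "class_2"]

def per_class_metrics(cm):
    """Return a (precision, recall) pair for each class of a square confusion matrix."""
    n = len(cm)
    results = []
    for k in range(n):
        true_pos = cm[k][k]
        predicted = sum(cm[i][k] for i in range(n))  # column sum
        actual = sum(cm[k])                          # row sum (support)
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        results.append((precision, recall))
    return results

metrics = per_class_metrics(confusion)
acc_check = sum(confusion[k][k] for k in range(3)) / sum(map(sum, confusion))

for name, (p, r) in zip(names, metrics):
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
print(f"accuracy={acc_check:.2f}")
```

Here `class_1` has precision 16/23 because 23 samples were predicted as `class_1` and only 16 of them truly belong to it, and the accuracy is 29/36.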





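The model never predicts `class_2`: all 6 of its test samples are misclassified, which gives zero recall, and scikit-learn reports precision as 0.00 when a class is never predicted. `class_2` is the smallest class in the dataset and is especially scarce in this test split, and underrepresented classes are generally harder for a model to learn. With only 6 test samples, however, the per-class scores are noisy and should be read with caution.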
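
One way to keep each class represented proportionally in both sets is a stratified split. Whether `util.train_test_split` supports stratification is not shown in this notebook, so the following is a standalone NumPy sketch with a hypothetical helper, `stratified_split`, applied to toy labels that use the Wine class sizes (59, 71, and 48).

```python
import numpy as np

def stratified_split(X, y, test_size=0.2, seed=0):
    """Split so that each class contributes about `test_size` of its samples to the test set."""
    rng = np.random.default_rng(seed)
    test_idx = []
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))  # shuffle this class's indices
        n_test = max(1, int(round(test_size * len(idx))))
        test_idx.extend(idx[:n_test])
    test_mask = np.zeros(len(y), dtype=bool)
    test_mask[test_idx] = True
    return X[~test_mask], X[test_mask], y[~test_mask], y[test_mask]

# Toy labels with the Wine class sizes; the features are placeholders
y_demo = np.repeat([0, 1, 2], [59, 71, 48])
X_demo = np.arange(len(y_demo) * 2).reshape(-1, 2)

X_tr, X_te, y_tr, y_te = stratified_split(X_demo, y_demo)
print(np.bincount(y_te))  # [12 14 10]
```

With this split the test set holds 10 `class_2` samples rather than the 6 seen in the report above, which would make the per-class scores somewhat less noisy.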