# Multilayer Perceptrons
You should build an end-to-end machine learning pipeline using a multilayer perceptron model. In particular, you should do the following:
- Load the `mnist` dataset using [Pandas](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html). You can find this dataset in the datasets folder.
- Split the dataset into training and test sets using [Scikit-Learn](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html).
- Build an end-to-end machine learning pipeline, including a [multilayer perceptron](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html) model.
- Optimize your pipeline by validating your design decisions.
- Test the best pipeline on the test set and report various [evaluation metrics](https://scikit-learn.org/0.15/modules/model_evaluation.html).  
- Check the documentation to identify the most important hyperparameters, attributes, and methods of the model. Use them in practice.

# Task
Build an end-to-end machine learning pipeline using a multilayer perceptron model on the "mnist.csv" dataset. The pipeline should include data loading, splitting into training and test sets, model training, and evaluation. Optimize the pipeline and report the performance on the test set using various evaluation metrics. Additionally, identify and demonstrate the use of important hyperparameters, attributes, and methods of the `MLPClassifier` model.

## Load the dataset

### Subtask:
Load the `mnist` dataset from the provided CSV file into a pandas DataFrame.


**Reasoning**:
Load the CSV with `pd.read_csv`, then inspect the first rows and the column summary to confirm the layout: an `id` column, a `class` label column, and 784 pixel columns.



In [1]:
import pandas as pd

df = pd.read_csv('/content/mnist-2.csv')
display(df.head())
df.info()  # info() prints its summary directly and returns None, so display() is not needed

Unnamed: 0,id,class,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,...,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783,pixel784
0,31953,5,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,34452,8,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,60897,5,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,36953,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,1981,3,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4000 entries, 0 to 3999
Columns: 786 entries, id to pixel784
dtypes: int64(786)
memory usage: 24.0 MB
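
Before moving on, it can help to confirm that the loaded frame matches the layout shown above. The sketch below uses a hypothetical helper, `check_mnist_frame`, which is not part of the original notebook. It runs on a small synthetic frame with the same column names, because the CSV itself is not available here, and it assumes pixel intensities lie between 0 and 255, which the source does not state explicitly.

```python
import numpy as np
import pandas as pd


def check_mnist_frame(df: pd.DataFrame, n_pixels: int = 784) -> dict:
    """Return basic facts about an MNIST-style frame (hypothetical helper)."""
    pixel_cols = [f"pixel{i}" for i in range(1, n_pixels + 1)]
    missing_cols = [c for c in ["id", "class", *pixel_cols] if c not in df.columns]
    if missing_cols:
        raise ValueError(f"missing columns: {missing_cols[:5]}")
    pixels = df[pixel_cols].to_numpy()
    return {
        "n_rows": len(df),
        "n_missing": int(df.isna().sum().sum()),
        "pixel_min": int(pixels.min()),
        "pixel_max": int(pixels.max()),
        "classes": sorted(df["class"].unique().tolist()),
    }


# Synthetic stand-in with the same column layout as the notebook's DataFrame.
rng = np.random.default_rng(0)
n_rows = 20
demo = pd.DataFrame(
    rng.integers(0, 256, size=(n_rows, 784)),
    columns=[f"pixel{i}" for i in range(1, 785)],
)
demo.insert(0, "class", np.arange(n_rows) % 10)
demo.insert(0, "id", np.arange(n_rows))

summary = check_mnist_frame(demo)
print(summary)
```

On the real data, the same call would be `check_mnist_frame(df)`; a nonzero `n_missing` or a `pixel_max` above 255 would suggest the file needs cleaning before training.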

## Prepare the data

### Subtask:
Separate the features (images) and the target (labels) from the DataFrame.


**Reasoning**:
The target `y` is the `class` column. The features `X` are the 784 pixel columns, obtained by dropping `id` (a row identifier, not a pixel) and `class` from the DataFrame.



In [2]:
X = df.drop(['id', 'class'], axis=1)
y = df['class']

display(X.head())
display(y.head())

Unnamed: 0,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,pixel9,pixel10,...,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783,pixel784
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Unnamed: 0,class
0,5
1,8
2,5
3,0
4,3


## Split the data

### Subtask:
Split the dataset into training and testing sets.


**Reasoning**:
Hold out 20% of the rows as a test set with `train_test_split`, fixing `random_state=42` so the split is reproducible, and print the resulting shapes.



In [3]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print("Shape of X_train:", X_train.shape)
print("Shape of X_test:", X_test.shape)
print("Shape of y_train:", y_train.shape)
print("Shape of y_test:", y_test.shape)

Shape of X_train: (3200, 784)
Shape of X_test: (800, 784)
Shape of y_train: (3200,)
Shape of y_test: (800,)


## Build the pipeline

### Subtask:
Create a machine learning pipeline that includes an `MLPClassifier`.


**Reasoning**:
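Wrap an `MLPClassifier` with default settings in a scikit-learn `Pipeline` under the step name `mlp`. Preprocessing steps can then be added later, and the model's hyperparameters can be addressed as `mlp__<name>` during tuning.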



In [4]:
from sklearn.pipeline import Pipeline
from sklearn.neural_network import MLPClassifier

pipeline = Pipeline([
    ('mlp', MLPClassifier())
])
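
The pipeline above feeds raw pixel values to the network. Gradient-based training usually behaves better on inputs of a similar, small scale, so a scaling step is a natural candidate to validate. The sketch below is one possible variant, not the notebook's method: it puts a `MinMaxScaler` in front of the classifier and uses scikit-learn's bundled 8x8 digits as a stand-in dataset, since the `mnist` CSV is not available here. The names `scaled_pipeline` and `demo_accuracy`, and the `max_iter=300` setting, are choices made for this example.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Stand-in data: scikit-learn's bundled 8x8 digits (not the notebook's mnist CSV).
X_demo, y_demo = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

scaled_pipeline = Pipeline([
    ("scaler", MinMaxScaler()),  # rescale each pixel column to [0, 1] using training-set ranges
    ("mlp", MLPClassifier(max_iter=300, random_state=42)),
])
scaled_pipeline.fit(X_tr, y_tr)

demo_accuracy = scaled_pipeline.score(X_te, y_te)
print(f"Test accuracy on the stand-in data: {demo_accuracy:.4f}")
```

Because the scaler lives inside the pipeline, it is fitted only on the training portion of each cross-validation fold, which avoids leaking test-fold statistics into training.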

## Train the model

### Subtask:
Train the pipeline on the training data.


**Reasoning**:
Fit the pipeline to the training data.



In [5]:
pipeline.fit(X_train, y_train)
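
The task also asks for the model's important attributes and methods to be used in practice. Once a pipeline is fitted, the underlying estimator is reachable through `named_steps`, and its fitted attributes (those ending in an underscore) describe what was learned. The sketch below shows a few of them on scikit-learn's bundled digits as a stand-in dataset; the small `hidden_layer_sizes=(32,)` and `max_iter=50` are chosen only to keep the demo quick and may trigger a convergence warning.

```python
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline

# Stand-in data: scikit-learn's bundled 8x8 digits (64 features, 10 classes).
X_demo, y_demo = load_digits(return_X_y=True)

demo_pipeline = Pipeline([
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=50, random_state=0)),
])
demo_pipeline.fit(X_demo, y_demo)

mlp = demo_pipeline.named_steps["mlp"]  # the fitted estimator inside the pipeline
print("classes_:", mlp.classes_)
print("n_layers_ (input + hidden + output):", mlp.n_layers_)
print("weight matrix shapes (coefs_):", [w.shape for w in mlp.coefs_])
print("iterations run (n_iter_):", mlp.n_iter_)
print("final training loss (loss_):", round(mlp.loss_, 4))

# predict_proba returns one probability per class for each sample.
proba = demo_pipeline.predict_proba(X_demo[:3])
print("predict_proba shape:", proba.shape)
```

For the notebook's own model, the equivalent starting point would be `pipeline.named_steps['mlp']`; its first weight matrix would have 784 rows, one per pixel column.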

## Evaluate the model

### Subtask:
Evaluate the trained model on the test set and report various evaluation metrics.


**Reasoning**:
Predict on the held-out test set and compute accuracy, support-weighted precision, recall, and F1, plus the confusion matrix.



In [6]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

y_pred = pipeline.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')
conf_matrix = confusion_matrix(y_test, y_pred)

print(f"Accuracy: {accuracy:.4f}")
print(f"Precision (weighted): {precision:.4f}")
print(f"Recall (weighted): {recall:.4f}")
print(f"F1 Score (weighted): {f1:.4f}")
print("Confusion Matrix:")
print(conf_matrix)

Accuracy: 0.8550
Precision (weighted): 0.8576
Recall (weighted): 0.8550
F1 Score (weighted): 0.8555
Confusion Matrix:
[[66  0  0  1  0  3  0  0  0  0]
 [ 0 94  0  1  0  2  1  0  2  0]
 [ 1  1 57  3  1  0  4  3  3  0]
 [ 1  1  2 69  0  7  0  3  2  1]
 [ 0  1  1  0 69  1  1  0  3  4]
 [ 1  0  0  2  1 54  0  1  4  1]
 [ 0  0  3  0  2  5 78  1  1  0]
 [ 1  0  0  1  0  0  0 59  1  5]
 [ 0  1  3  5  1  4  2  1 74  3]
 [ 2  0  0  2  4  0  0  3  1 64]]
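
The summary metrics can be recovered from the confusion matrix itself, which also shows which digits are confused most often. In scikit-learn's convention, rows are true classes and columns are predicted classes. The sketch below copies the matrix printed above and recomputes accuracy and the weighted scores with NumPy; the variable names are chosen for this example.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true class, columns: predicted class).
cm = np.array([
    [66,  0,  0,  1,  0,  3,  0,  0,  0,  0],
    [ 0, 94,  0,  1,  0,  2,  1,  0,  2,  0],
    [ 1,  1, 57,  3,  1,  0,  4,  3,  3,  0],
    [ 1,  1,  2, 69,  0,  7,  0,  3,  2,  1],
    [ 0,  1,  1,  0, 69,  1,  1,  0,  3,  4],
    [ 1,  0,  0,  2,  1, 54,  0,  1,  4,  1],
    [ 0,  0,  3,  0,  2,  5, 78,  1,  1,  0],
    [ 1,  0,  0,  1,  0,  0,  0, 59,  1,  5],
    [ 0,  1,  3,  5,  1,  4,  2,  1, 74,  3],
    [ 2,  0,  0,  2,  4,  0,  0,  3,  1, 64],
])

support = cm.sum(axis=1)    # number of true samples per class
predicted = cm.sum(axis=0)  # number of predictions per class
correct = np.diag(cm)       # correctly classified samples per class

accuracy = correct.sum() / cm.sum()
recall_per_class = correct / support
precision_per_class = correct / predicted

# "weighted" averaging weights each per-class score by that class's support.
weighted_recall = np.average(recall_per_class, weights=support)
weighted_precision = np.average(precision_per_class, weights=support)

print(f"Accuracy: {accuracy:.4f}")
print(f"Recall (weighted): {weighted_recall:.4f}")
print(f"Precision (weighted): {weighted_precision:.4f}")
print("Lowest-precision class:", int(np.argmin(precision_per_class)))
```

The diagonal sums to 684 of 800 test samples, which gives the reported accuracy of 0.8550. Support-weighted recall is algebraically the same quantity, which is why the two values coincide in the output above. Digit 5 has the lowest precision: only 54 of the 76 samples predicted as 5 are correct.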


## Optimize the pipeline

### Subtask:
Explore hyperparameters and potentially add preprocessing steps to improve the model's performance.


**Reasoning**:
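Define a parameter grid for `GridSearchCV` and search it for the best `MLPClassifier` hyperparameters within the pipeline. The grid has 4 × 2 × 2 × 3 = 48 combinations, so 5-fold cross-validation trains 240 models, plus one final refit of the best combination on the full training set.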



In [None]:
from sklearn.model_selection import GridSearchCV

param_grid = {
    'mlp__hidden_layer_sizes': [(50,), (100,), (50, 50), (100, 50)],
    'mlp__activation': ['tanh', 'relu'],
    'mlp__solver': ['sgd', 'adam'],
    'mlp__alpha': [0.0001, 0.001, 0.01]
}

# 5-fold cross-validation over every grid combination; n_jobs=-1 uses all available CPU cores
grid_search = GridSearchCV(pipeline, param_grid, cv=5, scoring='accuracy', n_jobs=-1)

grid_search.fit(X_train, y_train)

best_pipeline = grid_search.best_estimator_

print("Best parameters found: ", grid_search.best_params_)
print("Best cross-validation accuracy: ", grid_search.best_score_)
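
Beyond the single best combination, the fitted search object keeps the score of every combination in `cv_results_`, which helps in judging how much each design decision matters. The sketch below runs a deliberately tiny grid on a 600-sample slice of scikit-learn's bundled digits so that it finishes quickly. The data, the grid values, and the names `demo_search` and `small_grid` are assumptions for this example rather than the notebook's settings.

```python
import pandas as pd
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline

# Stand-in data: a 600-sample slice of scikit-learn's bundled digits, to keep the demo quick.
X_demo, y_demo = load_digits(return_X_y=True)
X_demo, y_demo = X_demo[:600], y_demo[:600]

demo_pipeline = Pipeline([("mlp", MLPClassifier(max_iter=100, random_state=0))])
small_grid = {
    "mlp__hidden_layer_sizes": [(16,), (32,)],
    "mlp__alpha": [0.0001, 0.01],
}
demo_search = GridSearchCV(demo_pipeline, small_grid, cv=3, scoring="accuracy")
demo_search.fit(X_demo, y_demo)

# One row per parameter combination, best-ranked first.
results = pd.DataFrame(demo_search.cv_results_)
columns = ["params", "mean_test_score", "std_test_score", "rank_test_score"]
print(results[columns].sort_values("rank_test_score").to_string(index=False))
```

If the top few rows differ by less than their `std_test_score`, the corresponding choices are probably not distinguishable on this data, and the simpler or faster configuration is a reasonable pick.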

## Final evaluation

### Subtask:
Evaluate the optimized pipeline on the test set and report the metrics.


**Reasoning**:
Use the best pipeline to make predictions on the test set and calculate the required evaluation metrics.



In [None]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

y_pred_optimized = best_pipeline.predict(X_test)

accuracy_optimized = accuracy_score(y_test, y_pred_optimized)
precision_optimized = precision_score(y_test, y_pred_optimized, average='weighted')
recall_optimized = recall_score(y_test, y_pred_optimized, average='weighted')
f1_optimized = f1_score(y_test, y_pred_optimized, average='weighted')
conf_matrix_optimized = confusion_matrix(y_test, y_pred_optimized)

print("Optimized Pipeline Metrics:")
print(f"Accuracy: {accuracy_optimized:.4f}")
print(f"Precision (weighted): {precision_optimized:.4f}")
print(f"Recall (weighted): {recall_optimized:.4f}")
print(f"F1 Score (weighted): {f1_optimized:.4f}")
print("Confusion Matrix:")
print(conf_matrix_optimized)
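
Weighted averages can hide a weak class, so a per-class breakdown is a useful complement to the metrics above. scikit-learn's `classification_report` produces one in a single call. The sketch below uses tiny hand-made labels purely for illustration; they are not results from the notebook.

```python
from sklearn.metrics import classification_report

# Tiny hand-made labels (illustrative only, not results from the notebook).
y_true_demo = [0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 1, 1, 1, 2, 0]

# Text table with per-class precision, recall, F1, and support.
print(classification_report(y_true_demo, y_pred_demo))

# The same numbers as a nested dict, convenient for further processing.
report = classification_report(y_true_demo, y_pred_demo, output_dict=True)
print("Recall for class 0:", report["0"]["recall"])
```

For the optimized model, the corresponding call would be `classification_report(y_test, y_pred_optimized)`.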