# Multilayer Perceptrons
You should build an end-to-end machine learning pipeline using a multilayer perceptron model. In particular, you should do the following:
- Load the `mnist` dataset using [Pandas](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html). You can find this dataset in the datasets folder.
- Split the dataset into training and test sets using [Scikit-Learn](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html).
- Build an end-to-end machine learning pipeline, including a [multilayer perceptron](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html) model.
- Optimize your pipeline by validating your design decisions.
- Test the best pipeline on the test set and report various [evaluation metrics](https://scikit-learn.org/stable/modules/model_evaluation.html).
- Check the documentation to identify the most important hyperparameters, attributes, and methods of the model. Use them in practice.
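
The steps listed above can be chained into a single scikit-learn `Pipeline`, so that scaling and the classifier are fitted together. The following is a minimal sketch rather than the notebook's own solution: it uses a small synthetic dataset from `make_classification` as a stand-in for MNIST, and the step names `scaler` and `mlp` are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: replaces the MNIST CSV for illustration).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

# Scaling and the MLP live in one estimator, so fit() scales with training statistics only.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("mlp", MLPClassifier(hidden_layer_sizes=(20,), max_iter=300, random_state=0)),
])
pipe.fit(X_tr, y_tr)

demo_accuracy = pipe.score(X_te, y_te)
print(f"Demo accuracy: {demo_accuracy:.3f}")
```

When such a pipeline is passed to a cross-validated search, hyperparameters are addressed with the step name as a prefix, for example `mlp__alpha`.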

In [22]:
import pandas as pd
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix
from sklearn.preprocessing import StandardScaler

In [13]:
df = pd.read_csv('https://raw.githubusercontent.com/m-mahdavi/teaching/refs/heads/main/datasets/mnist.csv')
df.head()

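| | id | class | pixel1 | pixel2 | pixel3 | pixel4 | pixel5 | pixel6 | pixel7 | pixel8 | ... | pixel775 | pixel776 | pixel777 | pixel778 | pixel779 | pixel780 | pixel781 | pixel782 | pixel783 | pixel784 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 31953 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 34452 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 60897 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 36953 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 1981 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |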


In [16]:
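# Separate the features from the target label.
# Note: X still includes the `id` column (785 columns = 784 pixels + id).
X = df.drop('class', axis=1)
y = df['class']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)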

(3200, 785) (800, 785) (3200,) (800,)
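
The split above does not pass `stratify`, so the class proportions in the training and test sets can drift slightly from those of the full dataset. As a hedged illustration of the alternative, the sketch below uses toy labels (an assumption, not the MNIST data) to show how `stratify` keeps each class's share the same in both subsets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy labels (assumption: 3 imbalanced classes standing in for the digit labels).
y_toy = np.array([0] * 60 + [1] * 30 + [2] * 10)
X_toy = np.arange(len(y_toy)).reshape(-1, 1)

X_a, X_b, y_a, y_b = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

# With stratify, each class keeps its 60/30/10 share in both subsets.
train_counts = np.bincount(y_a)
test_counts = np.bincount(y_b)
print(train_counts, test_counts)
```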


In [17]:
# Fit the scaler on the training set only, then apply the same transform to the test set.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
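
To make the scaling step concrete, here is a small sketch on toy data (an assumption, not the MNIST pixels). It shows that the test rows are transformed with the training mean and scale, and that a constant column, such as an always-blank border pixel, is given a scale of 1.0 instead of causing a division by zero.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data (assumption): column 0 is constant, like an always-blank border pixel.
X_tr = np.array([[0.0, 1.0], [0.0, 3.0], [0.0, 5.0]])
X_te = np.array([[0.0, 7.0]])

demo_scaler = StandardScaler()
X_tr_scaled = demo_scaler.fit_transform(X_tr)  # statistics come from the training rows only
X_te_scaled = demo_scaler.transform(X_te)      # test rows reuse the training mean and scale

print(demo_scaler.mean_)   # per-column training means
print(demo_scaler.scale_)  # zero-variance columns get a scale of 1.0
print(X_te_scaled)
```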

In [18]:
# Baseline: one hidden layer of 100 units, other settings left at their defaults.
mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500, random_state=42)
mlp.fit(X_train, y_train)
y_pred = mlp.predict(X_test)

In [19]:
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')
conf_matrix = confusion_matrix(y_test, y_pred)

print(f"Accuracy: {accuracy}")
print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"F1-Score: {f1}")
print(f"Confusion Matrix:\n{conf_matrix}")

Accuracy: 0.90875
Precision: 0.9106607924738745
Recall: 0.90875
F1-Score: 0.9087860144744804
Confusion Matrix:
[[69  0  0  0  0  1  0  0  0  0]
 [ 0 95  1  1  1  1  0  0  1  0]
 [ 0  2 64  2  2  0  0  1  2  0]
 [ 1  0  3 75  1  2  1  1  2  0]
 [ 0  0  1  0 77  0  1  0  1  0]
 [ 0  0  0  1  1 59  0  0  2  1]
 [ 0  0  2  0  0  3 84  1  0  0]
 [ 0  0  1  0  1  0  0 61  0  4]
 [ 1  2  3  2  1  2  0  0 81  2]
 [ 0  0  2  1  8  0  0  2  1 62]]
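
The printed recall equals the printed accuracy, which follows from `average='weighted'`: weighting each class's recall by its support and summing gives the total number of correct predictions divided by the number of samples. The sketch below recomputes the metrics with NumPy from the confusion matrix printed above (rows are true classes, columns are predicted classes); the variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true class, columns: predicted).
cm = np.array([
    [69,  0,  0,  0,  0,  1,  0,  0,  0,  0],
    [ 0, 95,  1,  1,  1,  1,  0,  0,  1,  0],
    [ 0,  2, 64,  2,  2,  0,  0,  1,  2,  0],
    [ 1,  0,  3, 75,  1,  2,  1,  1,  2,  0],
    [ 0,  0,  1,  0, 77,  0,  1,  0,  1,  0],
    [ 0,  0,  0,  1,  1, 59,  0,  0,  2,  1],
    [ 0,  0,  2,  0,  0,  3, 84,  1,  0,  0],
    [ 0,  0,  1,  0,  1,  0,  0, 61,  0,  4],
    [ 1,  2,  3,  2,  1,  2,  0,  0, 81,  2],
    [ 0,  0,  2,  1,  8,  0,  0,  2,  1, 62],
])

support = cm.sum(axis=1)    # true samples per class
predicted = cm.sum(axis=0)  # predictions per class
correct = np.diag(cm)       # correctly classified samples per class

recall_per_class = correct / support
precision_per_class = correct / predicted

# Weighted averages use each class's share of the true labels as the weight.
weights = support / support.sum()
weighted_recall = float(np.sum(weights * recall_per_class))
weighted_precision = float(np.sum(weights * precision_per_class))
accuracy_from_cm = float(correct.sum() / cm.sum())

print(accuracy_from_cm, weighted_recall, weighted_precision)
```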


In [32]:
# Search space: 2 layer layouts x 2 learning-rate schedules x 2 iteration caps = 8 combinations.
param_dist = {
    'hidden_layer_sizes': [(50, 50), (100, 50)],
    'activation': ['identity'],
    'solver': ['sgd'],
    'alpha': [0.001],
    'learning_rate': ['invscaling', 'adaptive'],
    'learning_rate_init': [0.01],
    'max_iter': [100, 300],
    'shuffle': [False],
    'random_state': [42],
}
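
One detail of this search space is worth checking: `activation` is fixed to `'identity'`, which scikit-learn documents as the no-op activation f(x) = x. Without a nonlinearity, stacked layers compose into a single linear map. The NumPy sketch below illustrates this with hypothetical random weights for a 4 -> 3 -> 2 network; it is an illustration of the algebra, not a reproduction of the model trained here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for a 4 -> 3 -> 2 network with identity activations.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=3)
W2, b2 = rng.normal(size=(3, 2)), rng.normal(size=2)
x = rng.normal(size=(5, 4))

# Two-layer forward pass with f(z) = z between the layers.
two_layer = (x @ W1 + b1) @ W2 + b2

# Equivalent single linear layer: W = W1 W2, b = b1 W2 + b2.
W, b = W1 @ W2, b1 @ W2 + b2
one_layer = x @ W + b

same_output = bool(np.allclose(two_layer, one_layer))
print(same_output)
```

Adding a nonlinear option such as `'relu'` or `'tanh'` to the `activation` list would let the search compare genuinely nonlinear networks as well.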

In [33]:
random_search = RandomizedSearchCV(
    estimator=MLPClassifier(random_state=42),
    param_distributions=param_dist,
    n_iter=10,
    scoring='accuracy',
    cv=5,
    verbose=2,
    random_state=42,
    n_jobs=-1
)

In [34]:
random_search.fit(X_train, y_train)

Fitting 5 folds for each of 8 candidates, totalling 40 fits
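
The log reports 8 candidates even though `n_iter=10` was requested. When every entry in `param_distributions` is a list, `RandomizedSearchCV` samples without replacement, so it cannot try more settings than the grid contains. The sketch below counts the distinct combinations with `ParameterGrid`; the dictionary is a copy of the search space defined earlier, under the hypothetical name `param_dist_demo`.

```python
from sklearn.model_selection import ParameterGrid

# Copy of the search space used in this notebook.
param_dist_demo = {
    'hidden_layer_sizes': [(50, 50), (100, 50)],
    'activation': ['identity'],
    'solver': ['sgd'],
    'alpha': [0.001],
    'learning_rate': ['invscaling', 'adaptive'],
    'learning_rate_init': [0.01],
    'max_iter': [100, 300],
    'shuffle': [False],
    'random_state': [42],
}

# Only three parameters have two options each, so the grid has 2 * 2 * 2 entries.
n_candidates = len(ParameterGrid(param_dist_demo))
print(n_candidates)  # 8
```

With a space this small, the randomized search is effectively an exhaustive grid search.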




In [35]:
best_params = random_search.best_params_
print("Best Hyperparameters:", best_params)

Best Hyperparameters: {'solver': 'sgd', 'shuffle': False, 'random_state': 42, 'max_iter': 100, 'learning_rate_init': 0.01, 'learning_rate': 'adaptive', 'hidden_layer_sizes': (50, 50), 'alpha': 0.001, 'activation': 'identity'}


In [37]:
# Retrain a fresh model with the selected hyperparameters on the full training set.
best_mlp = MLPClassifier(**best_params)
best_mlp.fit(X_train, y_train)
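
As a possible shortcut, a fitted search object already holds a retrained model: with the default `refit=True`, `best_estimator_` is the best candidate refitted on all the data passed to `fit`, and `best_score_` is its mean cross-validated score. The sketch below demonstrates both attributes on a small synthetic dataset (an assumption, chosen so the example runs quickly); the parameter values are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (assumption: not the MNIST CSV).
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=0)

search = RandomizedSearchCV(
    estimator=MLPClassifier(max_iter=200, random_state=0),
    param_distributions={"hidden_layer_sizes": [(5,), (10,)], "alpha": [1e-4, 1e-2]},
    n_iter=3,
    cv=3,
    random_state=0,
)
search.fit(X_demo, y_demo)

# refit=True (the default) retrains the best candidate on all of X_demo, y_demo.
refit_model = search.best_estimator_
print(search.best_params_)
print(f"Mean CV accuracy of the best candidate: {search.best_score_:.3f}")
```

Comparing `best_score_` with a cross-validated score of the baseline model gives a like-for-like check before touching the test set.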



In [38]:
y_pred = best_mlp.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy with Best Hyperparameters:", accuracy)

Accuracy with Best Hyperparameters: 0.87125


In [39]:
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')
conf_matrix = confusion_matrix(y_test, y_pred)

print(f"Accuracy: {accuracy}")
print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"F1-Score: {f1}")
print(f"Confusion Matrix:\n{conf_matrix}")

Accuracy: 0.87125
Precision: 0.8744062273920824
Recall: 0.87125
F1-Score: 0.8712454756584113
Confusion Matrix:
[[68  0  0  0  0  2  0  0  0  0]
 [ 0 97  1  1  0  0  0  0  1  0]
 [ 0  3 59  3  2  0  3  1  2  0]
 [ 0  0  5 72  1  6  0  0  1  1]
 [ 0  0  1  0 76  0  1  0  1  1]
 [ 0  1  1  1  2 53  0  0  5  1]
 [ 0  1  6  0  1  2 79  1  0  0]
 [ 0  1  0  1  1  0  0 61  0  3]
 [ 1  2  5  3  3  5  0  0 71  4]
 [ 0  0  1  1  9  0  0  2  2 61]]
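
Both confusion matrices in this notebook come from the same 800 test rows, so their diagonals can be compared class by class. The sketch below uses only the numbers printed above to show where the tuned model gains or loses correct predictions relative to the first model; the variable names are illustrative.

```python
import numpy as np

# First model (hidden_layer_sizes=(100,)), copied from its printed confusion matrix.
baseline_cm = np.array([
    [69,  0,  0,  0,  0,  1,  0,  0,  0,  0],
    [ 0, 95,  1,  1,  1,  1,  0,  0,  1,  0],
    [ 0,  2, 64,  2,  2,  0,  0,  1,  2,  0],
    [ 1,  0,  3, 75,  1,  2,  1,  1,  2,  0],
    [ 0,  0,  1,  0, 77,  0,  1,  0,  1,  0],
    [ 0,  0,  0,  1,  1, 59,  0,  0,  2,  1],
    [ 0,  0,  2,  0,  0,  3, 84,  1,  0,  0],
    [ 0,  0,  1,  0,  1,  0,  0, 61,  0,  4],
    [ 1,  2,  3,  2,  1,  2,  0,  0, 81,  2],
    [ 0,  0,  2,  1,  8,  0,  0,  2,  1, 62],
])

# Model refitted with the searched hyperparameters, copied from its printed matrix.
tuned_cm = np.array([
    [68,  0,  0,  0,  0,  2,  0,  0,  0,  0],
    [ 0, 97,  1,  1,  0,  0,  0,  0,  1,  0],
    [ 0,  3, 59,  3,  2,  0,  3,  1,  2,  0],
    [ 0,  0,  5, 72,  1,  6,  0,  0,  1,  1],
    [ 0,  0,  1,  0, 76,  0,  1,  0,  1,  1],
    [ 0,  1,  1,  1,  2, 53,  0,  0,  5,  1],
    [ 0,  1,  6,  0,  1,  2, 79,  1,  0,  0],
    [ 0,  1,  0,  1,  1,  0,  0, 61,  0,  3],
    [ 1,  2,  5,  3,  3,  5,  0,  0, 71,  4],
    [ 0,  0,  1,  1,  9,  0,  0,  2,  2, 61],
])

# Change in correctly classified samples per digit (tuned minus baseline).
change = np.diag(tuned_cm) - np.diag(baseline_cm)
worst_digit = int(np.argmin(change))

print(change)
print(f"Net change in correct predictions: {change.sum()}")
print(f"Largest drop is for digit {worst_digit}")
```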
