# Multilayer Perceptrons
You should build an end-to-end machine learning pipeline using a multilayer perceptron model. In particular, you should do the following:
- Load the `mnist` dataset using [Pandas](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html). You can find this dataset in the datasets folder.
- Split the dataset into training and test sets using [Scikit-Learn](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html).
- Build an end-to-end machine learning pipeline, including a [multilayer perceptron](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html) model.
- Optimize your pipeline by validating your design decisions.
- Test the best pipeline on the test set and report various [evaluation metrics](https://scikit-learn.org/0.15/modules/model_evaluation.html).  
- Check the documentation to identify the most important hyperparameters, attributes, and methods of the model. Use them in practice.
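
As a starting point for the last item, here is a minimal sketch of the fitted attributes and methods that `MLPClassifier` exposes. It assumes scikit-learn's built-in `load_digits` data (8x8 images) as a small stand-in for the `mnist` CSV; the names `X_demo`, `y_demo`, and `demo_mlp` are illustrative and not part of the assignment.

```python
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Assumption: load_digits stands in for the mnist CSV loaded later in the notebook.
X_demo, y_demo = load_digits(return_X_y=True)
X_demo = StandardScaler().fit_transform(X_demo)

# One hidden layer with 32 units (an arbitrary choice for the demo)
demo_mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=0)
demo_mlp.fit(X_demo, y_demo)

# Fitted attributes (available only after calling fit)
layer_shapes = [w.shape for w in demo_mlp.coefs_]  # one weight matrix per layer transition
print("number of layers:", demo_mlp.n_layers_)     # counts input, hidden, and output layers
print("weight shapes:", layer_shapes)
print("iterations run:", demo_mlp.n_iter_)
print("final training loss:", demo_mlp.loss_curve_[-1])

# Methods
proba = demo_mlp.predict_proba(X_demo[:5])         # class probabilities, one row per sample
print("predicted classes:", demo_mlp.predict(X_demo[:5]))
print("training accuracy:", demo_mlp.score(X_demo, y_demo))
```

`loss_curve_` is useful for judging whether `max_iter` was large enough: if the loss is still falling at the last iteration, the model probably stopped before converging.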

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from scipy.stats import uniform, randint


In [2]:
# 1. Load the dataset

df = pd.read_csv("https://raw.githubusercontent.com/m-mahdavi/teaching/main/datasets/mnist.csv")


In [3]:
# 2. Drop 'id' column
df = df.drop('id', axis=1)

In [4]:
# 3. Split features and target
X = df.drop("class", axis=1)
y = df["class"]

In [5]:
# 4. Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [6]:
# 5. Build pipeline
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('mlp', MLPClassifier(max_iter=100, random_state=42))
])


In [7]:
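# 6. Define hyperparameter search space
param_distributions = {
    # Note: randint.rvs(...) draws a single integer when this cell runs, so this
    # list holds two fixed candidate architectures (one with one hidden layer,
    # one with two). The draws are not seeded here, so they can differ between runs.
    'mlp__hidden_layer_sizes': [(randint.rvs(80, 150),), (randint.rvs(100, 200), randint.rvs(50, 150))],
    'mlp__activation': ['relu', 'tanh'],
    'mlp__solver': ['adam', 'sgd'],
    # scipy's uniform(loc, scale) samples from [loc, loc + scale]
    'mlp__alpha': uniform(0.0001, 0.01),               # range [0.0001, 0.0101]
    'mlp__learning_rate_init': uniform(0.001, 0.01)    # range [0.001, 0.011]
}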
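
The search space above relies on SciPy's frozen distributions, whose arguments are easy to misread. The sketch below checks what `uniform(0.0001, 0.01)` actually covers. It also shows `loguniform` as one possible alternative for parameters such as `alpha` that span several orders of magnitude. The `loguniform` bounds and the variable names are assumptions for illustration, not values taken from this notebook.

```python
from scipy.stats import loguniform, randint, uniform

# uniform(loc, scale) covers [loc, loc + scale], not [loc, scale]
alpha_dist = uniform(0.0001, 0.01)
low, high = alpha_dist.support()
print("uniform bounds:", low, high)

# loguniform(a, b) samples between a and b, spread evenly on a log scale
# (hypothetical bounds, chosen only for this demo)
log_alpha_dist = loguniform(1e-5, 1e-1)
samples = log_alpha_dist.rvs(size=1000, random_state=0)
print("loguniform sample range:", samples.min(), samples.max())

# A frozen randint(low, high) draws integers from low up to high - 1
# each time it is sampled
width_dist = randint(80, 150)
widths = width_dist.rvs(size=1000, random_state=0)
print("randint sample range:", widths.min(), widths.max())
```

`RandomizedSearchCV` samples uniformly from a list and calls `rvs` on a distribution object. Since `hidden_layer_sizes` expects a tuple, an explicit list of tuples is a common way to express candidate architectures.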


In [8]:
# 7. Randomized search
random_search = RandomizedSearchCV(
    pipeline,
    param_distributions=param_distributions,
    n_iter=10,
    cv=3,
    scoring='accuracy',
    random_state=42,
    n_jobs=-1,
    verbose=1
)

In [9]:
# 8. Train model
random_search.fit(X_train, y_train)

Fitting 3 folds for each of 10 candidates, totalling 30 fits
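
To validate design decisions, it helps to look at every candidate in the search, not only the winner. The sketch below runs a deliberately small stand-in search on `load_digits` (an assumption, so the block runs on its own) and ranks the candidates from `cv_results_`. The same inspection should apply to the `random_search` object fitted above.

```python
import pandas as pd
from sklearn.datasets import load_digits
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: a small demo search on load_digits, not the mnist search above.
X_demo, y_demo = load_digits(return_X_y=True)
demo_pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("mlp", MLPClassifier(max_iter=200, random_state=0)),
])
demo_search = RandomizedSearchCV(
    demo_pipeline,
    param_distributions={
        "mlp__hidden_layer_sizes": [(16,), (32,), (32, 16)],
        "mlp__activation": ["relu", "tanh"],
    },
    n_iter=4,
    cv=3,
    scoring="accuracy",
    random_state=0,
)
demo_search.fit(X_demo, y_demo)

# cv_results_ is a dict of arrays with one entry per sampled candidate
results = pd.DataFrame(demo_search.cv_results_)
columns = ["params", "mean_test_score", "std_test_score", "rank_test_score"]
ranked = results[columns].sort_values("rank_test_score")
print(ranked.to_string(index=False))
```

A large `std_test_score` relative to the gap between candidates suggests the ranking is not reliable, in which case more folds or more iterations may be worth the cost.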


In [12]:
# 9. Evaluate
# best_estimator_ is the best pipeline, refit on the full training set (refit=True by default)
best_model = random_search.best_estimator_

In [13]:
y_pred = best_model.predict(X_test)


In [15]:
print("✅ Best Parameters:", random_search.best_params_)
print("✅ Accuracy:", accuracy_score(y_test, y_pred))
print("✅ Classification Report:\n", classification_report(y_test, y_pred))


✅ Best Parameters: {'mlp__activation': 'relu', 'mlp__alpha': np.float64(0.008065429868602328), 'mlp__hidden_layer_sizes': (122,), 'mlp__learning_rate_init': np.float64(0.008319939418114052), 'mlp__solver': 'adam'}
✅ Accuracy: 0.91125
✅ Classification Report:
               precision    recall  f1-score   support

           0       0.97      0.97      0.97        70
           1       0.94      0.96      0.95       100
           2       0.85      0.90      0.87        73
           3       0.94      0.86      0.90        86
           4       0.87      0.95      0.91        80
           5       0.86      0.95      0.90        64
           6       0.97      0.92      0.94        90
           7       0.97      0.93      0.95        67
           8       0.87      0.85      0.86        94
           9       0.89      0.83      0.86        76

    accuracy                           0.91       800
   macro avg       0.91      0.91      0.91       800
weighted avg       0.91      0.91   
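
The notebook imports `confusion_matrix` but never calls it. The sketch below shows how it complements the classification report, using small made-up label lists (`y_true_demo` and `y_pred_demo` are hypothetical) so the block runs on its own. With the real run, `y_test` and `y_pred` would be passed instead.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration only
y_true_demo = [0, 0, 1, 1, 2, 2, 2]
y_pred_demo = [0, 1, 1, 1, 2, 0, 2]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_demo, y_pred_demo)
print(cm)

# Per-class recall: correct predictions (diagonal) divided by true counts (row sums)
recall_per_class = cm.diagonal() / cm.sum(axis=1)
print("recall per class:", recall_per_class)
```

Off-diagonal cells show which pairs of digits the model confuses, which the per-class precision and recall figures alone do not reveal.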