
Cell 1: Importing Libraries and Loading the Data

In [None]:
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score, KFold
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Load the MNIST dataset
mnist = fetch_openml("mnist_784", version=1)
X = mnist.data
y = mnist.target


Explanation:

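In this cell, we import the libraries needed for data handling, model selection, and evaluation.
We load the MNIST dataset with fetch_openml from scikit-learn, storing the features in X and the labels in y. MNIST contains 70,000 handwritten digit images of 28x28 pixels, so each row of X has 784 pixel features, and each label in y is a digit from 0 to 9.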
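
To make the shape of the loaded data more concrete, the sketch below uses a tiny hand-made stand-in instead of the real download. It shows two preprocessing steps that are often applied to MNIST: converting labels to integers (fetch_openml typically returns the mnist_784 labels as strings) and scaling pixel values to [0, 1]. The names X_demo, y_demo, X_scaled, and y_int are hypothetical; the notebook itself uses the values as loaded.

```python
import numpy as np

# Tiny stand-in for MNIST: 3 "images" with 4 pixels each, values in 0..255.
# (Assumption for illustration only; the real data has 784 pixel columns.)
X_demo = np.array([[0, 128, 255, 64],
                   [255, 255, 0, 0],
                   [32, 16, 8, 4]], dtype=np.float64)
y_demo = np.array(['5', '0', '4'])  # labels stored as strings

# Convert the string labels to integers
y_int = y_demo.astype(int)

# Scale pixel intensities from [0, 255] to [0, 1]
X_scaled = X_demo / 255.0

print(y_int)
print(X_scaled.min(), X_scaled.max())
```

Scaling is optional here, but it often helps iterative solvers such as the one behind LogisticRegression converge in fewer iterations.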

Cell 2: Defining Models and Hyperparameter Grids

In [None]:
# Define models
models = {
    'Logistic Regression': LogisticRegression(),
    'K-Nearest Neighbors': KNeighborsClassifier()
}

# Define hyperparameter grids for each model
param_grids = {
    'Logistic Regression': {
        'C': [0.001, 0.01, 0.1, 1, 10, 100]
    },
    'K-Nearest Neighbors': {
        'n_neighbors': [3, 5, 7, 9],
        'weights': ['uniform', 'distance']
    }
}


Explanation:

In this cell, we define two machine learning models, Logistic Regression and K-Nearest Neighbors, and store them in the models dictionary.
We also specify a hyperparameter grid for each model: the inverse regularization strength C for Logistic Regression, and n_neighbors and weights for K-Nearest Neighbors.
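
A grid search tries every combination of the listed values, so the size of each grid determines the training cost. The sketch below counts the combinations with a plain Cartesian product. The helper count_candidates is a hypothetical name, and num_folds = 5 is assumed to match the value set in Cell 5.

```python
from itertools import product

# Same grids as in Cell 2
param_grids = {
    'Logistic Regression': {
        'C': [0.001, 0.01, 0.1, 1, 10, 100]
    },
    'K-Nearest Neighbors': {
        'n_neighbors': [3, 5, 7, 9],
        'weights': ['uniform', 'distance']
    }
}

num_folds = 5  # assumption: the fold count used later in Cell 5


def count_candidates(grid):
    """Count the hyperparameter combinations in a grid (Cartesian product)."""
    return len(list(product(*grid.values())))


for name, grid in param_grids.items():
    n_candidates = count_candidates(grid)
    n_fits = n_candidates * num_folds
    print(f"{name}: {n_candidates} candidates, {n_fits} cross-validation fits")
```

Each candidate is trained once per fold, which is why the number of fits grows quickly as more values are added to a grid.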

Cell 3: Defining Evaluation Metrics

In [None]:
# Define evaluation metrics (scikit-learn scorer names)
scoring = {
    'Accuracy': 'accuracy',
    'F1 Score': 'f1_macro',
    'Precision': 'precision_macro',
    'Recall': 'recall_macro'
}


Explanation:

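In this cell, we define a dictionary named scoring that maps a display name for each evaluation metric to the corresponding scikit-learn scorer string. The metrics are Accuracy and the macro-averaged F1 Score, Precision, and Recall; macro averaging computes the metric for each digit class separately and then takes the unweighted mean.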
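
To show what the "_macro" suffix means, the sketch below computes accuracy and macro-averaged recall by hand on a small made-up set of labels. The arrays y_true and y_pred are illustrative values, not results from the notebook.

```python
import numpy as np

# Made-up labels for a 3-class problem with unequal class sizes
y_true = np.array([0, 0, 0, 0, 1, 1, 2])
y_pred = np.array([0, 0, 0, 1, 1, 2, 2])

# Accuracy: fraction of all samples predicted correctly
accuracy = np.mean(y_true == y_pred)

# Recall per class: fraction of that class's samples predicted correctly
classes = np.unique(y_true)
per_class_recall = np.array(
    [np.mean(y_pred[y_true == c] == c) for c in classes]
)

# Macro average: unweighted mean of the per-class values
macro_recall = per_class_recall.mean()

print("Accuracy:", round(accuracy, 4))
print("Per-class recall:", per_class_recall)
print("Macro recall:", round(macro_recall, 4))
```

Because every class counts equally in the macro average, a small class with poor recall lowers the score as much as a large one would.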

Cell 4: Splitting Data into Training and Testing Sets

In [None]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


Explanation:

Here, we split the dataset into training and testing sets using train_test_split. The training set (X_train and y_train) contains 80% of the data (56,000 samples), and the testing set (X_test and y_test) contains the remaining 20% (14,000 samples). The random_state parameter fixes the shuffle so the split is reproducible.
In the next cells, we perform model selection and hyperparameter tuning using k-fold cross-validation, and then evaluate the models using several metrics.
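
The behavior of test_size and random_state can be checked on a small synthetic dataset, as in the sketch below. The arrays X_demo and y_demo are randomly generated stand-ins. The final call with stratify is an optional variant that the notebook does not use; it keeps the class proportions similar in both subsets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 1,000 samples, 4 features, 10 classes
rng = np.random.default_rng(0)
X_demo = rng.random((1000, 4))
y_demo = rng.integers(0, 10, size=1000)

# 80/20 split, as in Cell 4
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
print(X_tr.shape, X_te.shape)

# Repeating the call with the same random_state reproduces the split
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
print("Same split:", np.array_equal(X_te, X_te2))

# Optional variant (not used in the notebook): stratify on the labels
X_tr_s, X_te_s, y_tr_s, y_te_s = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)
print(np.bincount(y_te_s))
```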

Cell 5: Model Selection with Cross-Validation

In [None]:
# Define the number of folds for cross-validation
num_folds = 5

# Create a KFold cross-validation splitter
kf = KFold(n_splits=num_folds, shuffle=True, random_state=42)

# Store the best estimator found for each model
best_models = {}

# Loop over each model
for model_name, model in models.items():
    param_grid = param_grids[model_name]

    # Create a GridSearchCV object with k-fold cross-validation
    grid_search = GridSearchCV(model, param_grid, cv=kf, scoring=scoring, refit='Accuracy')
    grid_search.fit(X_train, y_train)

    best_model = grid_search.best_estimator_
    best_models[model_name] = best_model

    print(f"Best {model_name} Model:")
    print(f"Best Parameters: {grid_search.best_params_}")
    print(f"Best CV Accuracy: {grid_search.best_score_:.4f}")
    print()


Explanation:

In this cell, we set the number of folds for k-fold cross-validation to 5 using num_folds.
We create a KFold object, kf, which will be used to split the training data into folds for cross-validation. The shuffle parameter ensures that the data is shuffled before splitting, and random_state ensures reproducibility.
We loop over each model defined earlier and create a GridSearchCV object for hyperparameter tuning with k-fold cross-validation. Because scoring is a dictionary, every metric is recorded for each hyperparameter combination, while refit='Accuracy' selects the combination with the highest mean cross-validated accuracy and refits it on the full training set.
The best estimator for each model is stored in the best_models dictionary, and the best parameters and cross-validated accuracy are printed for each model.
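
With a scoring dictionary, the per-metric results of the search are available in cv_results_ under keys of the form mean_test_<name>. The sketch below shows how they might be read. To keep it fast, it assumes the small digits dataset bundled with scikit-learn (load_digits, 8x8 images) as a stand-in for MNIST, a reduced grid, and 3 folds; none of these choices come from the notebook.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsClassifier

# Assumption: bundled 8x8 digits data as a small, offline stand-in for MNIST
X_demo, y_demo = load_digits(return_X_y=True)

scoring = {'Accuracy': 'accuracy', 'F1 Score': 'f1_macro'}
kf = KFold(n_splits=3, shuffle=True, random_state=42)

grid_search = GridSearchCV(
    KNeighborsClassifier(),
    {'n_neighbors': [3, 5], 'weights': ['uniform', 'distance']},
    cv=kf,
    scoring=scoring,
    refit='Accuracy',  # the winner is chosen by mean CV accuracy
)
grid_search.fit(X_demo, y_demo)

# Look up every recorded metric for the selected candidate
results = grid_search.cv_results_
best = grid_search.best_index_
print("Best params:", grid_search.best_params_)
print("Mean CV accuracy:", results['mean_test_Accuracy'][best])
print("Mean CV macro F1:", results['mean_test_F1 Score'][best])
```

Here best_score_ corresponds to the mean_test_Accuracy entry at best_index_, since Accuracy is the metric named in refit.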

Cell 6: Evaluating Models on Test Data

In [None]:
# Evaluate the best models on the test data
for model_name, model in best_models.items():
    y_pred = model.predict(X_test)
    print(f"Evaluation for {model_name}:")
    print(f"Accuracy: {accuracy_score(y_test, y_pred):.4f}")
    print(f"F1 Score: {f1_score(y_test, y_pred, average='macro'):.4f}")
    print(f"Precision: {precision_score(y_test, y_pred, average='macro'):.4f}")
    print(f"Recall: {recall_score(y_test, y_pred, average='macro'):.4f}")
    print()


Explanation:

In this cell, we evaluate the best models selected during hyperparameter tuning on the test data.
We use each model to make predictions on the test set and calculate Accuracy together with the macro-averaged F1 Score, Precision, and Recall.
The results for each model are printed, providing a comprehensive evaluation of their performance on the test data.
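
The four metric calls in the loop can also be collected into a small helper that returns a dictionary, which makes it easier to compare models afterwards. The sketch below is one way to do this; evaluate_predictions is a hypothetical helper name, and the labels are made-up strings rather than notebook output.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score


def evaluate_predictions(y_true, y_pred):
    """Return accuracy and macro-averaged F1, precision, and recall as a dict."""
    return {
        'Accuracy': accuracy_score(y_true, y_pred),
        'F1 Score': f1_score(y_true, y_pred, average='macro'),
        'Precision': precision_score(y_true, y_pred, average='macro'),
        'Recall': recall_score(y_true, y_pred, average='macro'),
    }


# Made-up string labels; one '3' is misclassified as '8'
y_true_demo = ['3', '3', '8', '8', '8', '5']
y_pred_demo = ['3', '8', '8', '8', '8', '5']

metrics = evaluate_predictions(y_true_demo, y_pred_demo)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In the notebook's setting, the same helper could be called with y_test and the output of model.predict(X_test) for each entry in best_models.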