<div align="center">

# Assignment 5
##  _Adel Movahedian_
## **400102074**  
## _Sharif University of Technology_

</div>


# Accuracy Measures Using MNIST

In this notebook we perform the following tasks with the MNIST dataset:

1. **Regression Accuracy Metrics:**  
   - We treat the digit label (0–9) as a continuous variable.
   - We fit a linear regression model to predict the digit.
   - We compute:
     - Mean Squared Error (MSE)
     - Mean Absolute Error (MAE)
     - Mean Absolute Percentage Error (MAPE)
     - R² Score

2. **Binary Classification Accuracy Metrics:**  
   - We create a binary classification task by labeling each digit as **even** (1) or **odd** (0).
   - We fit a logistic regression model.
   - We compute:
     - Precision
     - Recall
     - F1-Score

3. **Multi-class Classification Accuracy Metrics:**  
   - We use the original 10 classes (digits 0–9).
   - We fit a multinomial logistic regression model.
   - We compute:
     - Class-specific precision and recall
     - Macro, Weighted, and Micro-averaged F1-Scores

4. **Multi-label Classification Simulation:**  
   - We simulate a multi-label scenario by deriving four binary attributes for each sample:
     - **Label 1:** Digit is even.
     - **Label 2:** Digit is a prime number (2, 3, 5, or 7).
     - **Label 3:** Digit is less than 5.
     - **Label 4:** Digit is greater than 5.
   - We train a multi-label classifier (using One-vs-Rest strategy) and compute:
     - Hamming Loss
     - Sample-averaged F1-Score

Each cell includes explanations to clarify the purpose and findings.


# Import Libraries and Load MNIST

In [1]:
# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import (mean_squared_error, mean_absolute_error,
                             mean_absolute_percentage_error, r2_score,
                             precision_score, recall_score, f1_score,
                             hamming_loss)

# Load the MNIST dataset from OpenML
# Note: This may take a moment as the dataset is large.
mnist = fetch_openml('mnist_784', version=1, as_frame=False)
X = mnist.data    # 70000 samples, 784 features (28x28 images flattened)
X = X[:10000]
y = mnist.target.astype(int)  # Convert string labels to integers
y = y[:10000]
# Print basic information
print("MNIST dataset loaded:")
print("Shape of X:", X.shape)
print("Shape of y:", y.shape)

# Explanation:
# - We load the MNIST dataset using fetch_openml.
# - The features (X) represent pixel intensities and y contains the digit labels (0–9).
# - We keep only the first 10,000 samples for the experiments below.


MNIST dataset loaded:
Shape of X: (10000, 784)
Shape of y: (10000,)


# Regression Task Using MNIST

In [2]:
# For regression, we treat the digit label as a continuous variable.
# Split the data into training and testing sets (using 20% of the data for testing)
X_train_reg, X_test_reg, y_train_reg, y_test_reg = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train a linear regression model
regressor = LinearRegression()
regressor.fit(X_train_reg, y_train_reg)

# Predict the digit labels (as continuous values) on the test set
y_pred_reg = regressor.predict(X_test_reg)

# Compute regression metrics
mse = mean_squared_error(y_test_reg, y_pred_reg)
mae = mean_absolute_error(y_test_reg, y_pred_reg)
mape = mean_absolute_percentage_error(y_test_reg, y_pred_reg)
r2 = r2_score(y_test_reg, y_pred_reg)

# Display the results
print("Regression Metrics (Predicting digit as continuous value):")
print(f"Mean Squared Error (MSE): {mse:.4f}")
print(f"Mean Absolute Error (MAE): {mae:.4f}")
print(f"Mean Absolute Percentage Error (MAPE): {mape:.4f}")
print(f"R² Score: {r2:.4f}")

# Explanation:
# - We use a linear regression model to predict the digit label.
# - Although digit labels are discrete, treating them as continuous allows us to compute standard regression metrics.


Regression Metrics (Predicting digit as continuous value):
Mean Squared Error (MSE): 4.6868
Mean Absolute Error (MAE): 1.4820
Mean Absolute Percentage Error (MAPE): 666409616012723.8750
R² Score: 0.4281


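<font color= "green">These regression metrics show that linear regression on raw MNIST pixels is a poor fit for predicting digit values on this data subset. An MAE of about 1.5 means predictions miss the true digit by roughly one and a half units on average, and an R² of 0.43 leaves more than half of the label variance unexplained. The enormous MAPE is an artifact rather than a meaningful score: the metric divides each error by the true value, and samples with the label 0 make that denominator effectively zero. Digit labels are categories rather than quantities, so models tailored for classification (as we used in the other parts) are the proper approach, which is why those parts yield reasonable results.</font>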
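
The MAPE value printed above is on the order of 10^14, which comes from test samples whose true label is 0. The sketch below reimplements MAPE by hand on a few hypothetical values to show the effect. It assumes the denominator is floored at machine epsilon, which is our understanding of how scikit-learn avoids division by zero; the helper `toy_mape` and the sample values are illustrative and not part of the notebook.

```python
import sys

def toy_mape(y_true, y_pred, eps=sys.float_info.epsilon):
    """Mean absolute percentage error with the denominator floored at eps.

    Assumption: this mirrors the floor scikit-learn applies when y_true is 0.
    """
    terms = [abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    return sum(terms) / len(terms)

# Hypothetical labels and predictions, chosen only for illustration.
toy_true = [0, 2, 4]
toy_pred = [1.0, 2.0, 5.0]

# One zero-valued label is enough to dominate the average.
mape_with_zero = toy_mape(toy_true, toy_pred)

# The same data with the zero-valued label dropped.
pairs = [(t, p) for t, p in zip(toy_true, toy_pred) if t != 0]
mape_without_zero = toy_mape([t for t, _ in pairs], [p for _, p in pairs])

print(f"MAPE with a zero label:   {mape_with_zero:.4e}")
print(f"MAPE without zero labels: {mape_without_zero:.4f}")
```

Because the digit 0 is a legitimate class in MNIST, MAPE is not an informative metric for this task, whereas MSE, MAE, and R² are unaffected by zero-valued targets.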

# Binary Classification Task Using MNIST

In [3]:
# For binary classification, we label digits as even (1) and odd (0)
y_binary = (y % 2 == 0).astype(int)  # 1 for even, 0 for odd

# Split the data (80% training, 20% testing)
X_train_bin, X_test_bin, y_train_bin, y_test_bin = train_test_split(X, y_binary, test_size=0.2, random_state=42)

# Scale the features to help with convergence
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_bin = scaler.fit_transform(X_train_bin)
X_test_bin = scaler.transform(X_test_bin)

# Initialize and train a logistic regression classifier with increased max_iter
clf_bin = LogisticRegression(max_iter=20000, random_state=42)
clf_bin.fit(X_train_bin, y_train_bin)

# Predict on the test set
y_pred_bin = clf_bin.predict(X_test_bin)

# Compute binary classification metrics
precision_bin = precision_score(y_test_bin, y_pred_bin)
recall_bin = recall_score(y_test_bin, y_pred_bin)
f1_bin = f1_score(y_test_bin, y_pred_bin)

# Display the results
print("Binary Classification Metrics (Even vs. Odd):")
print(f"Precision: {precision_bin:.4f}")
print(f"Recall: {recall_bin:.4f}")
print(f"F1-Score: {f1_bin:.4f}")

# Explanation:
# - We convert the digit labels into a binary format: even (1) and odd (0).
# - Scaling the data helps the optimization algorithm converge faster and more reliably.
# - Raising max_iter to 20000 gives the solver ample headroom to converge; it does not by itself guarantee convergence.


Binary Classification Metrics (Even vs. Odd):
Precision: 0.8978
Recall: 0.8710
F1-Score: 0.8842
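
To make the three scores concrete, here is a minimal sketch that derives precision, recall, and F1 from true-positive, false-positive, and false-negative counts. The helper `binary_prf` and the toy labels are hypothetical. The last lines check that the reported F1-Score is the harmonic mean of the reported precision and recall.

```python
def binary_prf(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive class, from raw counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical even (1) / odd (0) labels, chosen only for illustration.
toy_true = [1, 1, 1, 0, 0, 1]
toy_pred = [1, 0, 1, 1, 0, 1]
toy_precision, toy_recall, toy_f1 = binary_prf(toy_true, toy_pred)
print(f"toy precision={toy_precision:.2f} recall={toy_recall:.2f} f1={toy_f1:.2f}")

# Consistency check on the values printed by the notebook cell above.
reported_precision, reported_recall = 0.8978, 0.8710
f1_check = (2 * reported_precision * reported_recall
            / (reported_precision + reported_recall))
print(f"F1 recomputed from reported precision and recall: {f1_check:.4f}")
```

Precision being higher than recall here means the classifier misses even digits somewhat more often than it mislabels odd digits as even.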


# Multi-class Classification Task Using MNIST

In [4]:
# For multi-class classification, we use the original digit labels (0-9)
# Split the data (80% training, 20% testing)
X_train_mc, X_test_mc, y_train_mc, y_test_mc = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train a logistic regression classifier
# Removed the 'multi_class' parameter to avoid the FutureWarning.
clf_mc = LogisticRegression(solver='lbfgs', max_iter=2000, random_state=42)
clf_mc.fit(X_train_mc, y_train_mc)

# Predict on the test set
y_pred_mc = clf_mc.predict(X_test_mc)

# Compute class-specific precision and recall (returned as an array for each class)
precision_each = precision_score(y_test_mc, y_pred_mc, average=None)
recall_each = recall_score(y_test_mc, y_pred_mc, average=None)

# Compute aggregated F1-scores using different averaging methods
f1_macro = f1_score(y_test_mc, y_pred_mc, average='macro')
f1_weighted = f1_score(y_test_mc, y_pred_mc, average='weighted')
f1_micro = f1_score(y_test_mc, y_pred_mc, average='micro')

# Display the results
print("Multi-class Classification Metrics (Digits 0-9):")
print(f"Precision for each class: {precision_each}")
print(f"Recall for each class: {recall_each}")
print(f"Macro-averaged F1-Score: {f1_macro:.4f}")
print(f"Weighted-averaged F1-Score: {f1_weighted:.4f}")
print(f"Micro-averaged F1-Score: {f1_micro:.4f}")

# Explanation:
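# - We train a logistic regression model on the unscaled pixel values to classify all 10 digits.
# - Class-specific precision and recall help evaluate performance for each digit.
# - Aggregated F1-scores summarize overall performance: macro averages the per-class F1 equally,
#   weighted averages it by class support, and micro pools all decisions before computing F1.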
# - By removing the 'multi_class' parameter, we avoid the FutureWarning while retaining the default multinomial behavior.


Multi-class Classification Metrics (Digits 0-9):
Precision for each class: [0.9478673  0.93693694 0.86528497 0.83       0.93170732 0.83050847
 0.9321267  0.88181818 0.78658537 0.86096257]
Recall for each class: [0.96618357 0.96296296 0.81862745 0.86458333 0.90521327 0.83522727
 0.93636364 0.89814815 0.77710843 0.83854167]
Macro-averaged F1-Score: 0.8802
Weighted-averaged F1-Score: 0.8842
Micro-averaged F1-Score: 0.8845
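
The three averages differ only in how per-class results are combined. The following sketch computes them by hand on a small, deliberately imbalanced example. The helper names and toy labels are hypothetical; the arithmetic follows the standard definitions rather than any scikit-learn internals.

```python
def per_class_counts(y_true, y_pred, labels):
    """Return {class: (tp, fp, fn)} using a one-vs-rest view of each class."""
    pairs = list(zip(y_true, y_pred))
    counts = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        counts[c] = (tp, fp, fn)
    return counts

def f1_from_counts(tp, fp, fn):
    """F1 expressed directly in counts: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical 3-class labels with supports 6, 3, and 1.
toy_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
toy_pred = [0, 0, 0, 0, 0, 1, 1, 1, 2, 2]
labels = sorted(set(toy_true))

counts = per_class_counts(toy_true, toy_pred, labels)
class_f1 = {c: f1_from_counts(*counts[c]) for c in labels}
support = {c: toy_true.count(c) for c in labels}

# Macro: every class counts equally, regardless of size.
toy_macro = sum(class_f1.values()) / len(labels)
# Weighted: each class contributes in proportion to its support.
toy_weighted = sum(class_f1[c] * support[c] for c in labels) / len(toy_true)
# Micro: pool the counts over classes first, then compute one F1.
total_tp = sum(tp for tp, _, _ in counts.values())
total_fp = sum(fp for _, fp, _ in counts.values())
total_fn = sum(fn for _, _, fn in counts.values())
toy_micro = f1_from_counts(total_tp, total_fp, total_fn)

toy_accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
print(f"macro={toy_macro:.4f} weighted={toy_weighted:.4f} micro={toy_micro:.4f}")
print(f"accuracy={toy_accuracy:.4f}")
```

In a single-label multi-class problem every misclassification is one false positive for the predicted class and one false negative for the true class, so micro-averaged F1 equals plain accuracy. The macro score sits slightly below the other two in the notebook output, which suggests that the weaker classes (such as digit 8) pull it down more than they affect the pooled scores.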


# Multi-label Classification Simulation Using MNIST

In [5]:
# For the multi-label classification simulation, we derive 4 binary attributes from each digit:
# Label 1: Digit is even.
# Label 2: Digit is prime (prime digits: 2, 3, 5, 7).
# Label 3: Digit is less than 5.
# Label 4: Digit is greater than 5.

# Define helper functions for each attribute
def is_even(x):
    return int(x % 2 == 0)

def is_prime(x):
    return int(x in [2, 3, 5, 7])

def less_than_5(x):
    return int(x < 5)

def greater_than_5(x):
    return int(x > 5)

# Create multi-label targets for all samples
y_multilabel = np.column_stack((
    np.array([is_even(val) for val in y]),
    np.array([is_prime(val) for val in y]),
    np.array([less_than_5(val) for val in y]),
    np.array([greater_than_5(val) for val in y])
))

# Split the data into training and testing sets
X_train_ml, X_test_ml, y_train_ml, y_test_ml = train_test_split(X, y_multilabel, test_size=0.2, random_state=42)

# Scale the features to help with convergence
from sklearn.preprocessing import StandardScaler
scaler_ml = StandardScaler()
X_train_ml = scaler_ml.fit_transform(X_train_ml)
X_test_ml = scaler_ml.transform(X_test_ml)

# For multi-label classification, we use One-vs-Rest strategy with logistic regression
clf_ml = OneVsRestClassifier(LogisticRegression(max_iter=20000, random_state=42))
clf_ml.fit(X_train_ml, y_train_ml)

# Predict multi-label outputs on the test set
y_pred_ml = clf_ml.predict(X_test_ml)

# Compute multi-label metrics
hloss = hamming_loss(y_test_ml, y_pred_ml)
f1_ml = f1_score(y_test_ml, y_pred_ml, average='samples')

# Display the results
print("Multi-label Classification Metrics (Simulated Attributes from MNIST):")
print(f"Hamming Loss: {hloss:.4f}")
print(f"Sample-averaged F1-Score: {f1_ml:.4f}")

# Explanation:
# - We derive four binary labels from the digit label based on different criteria.
# - We scale the features to improve the convergence of the logistic regression solver.
# - Increasing max_iter to 20000 gives the solver more iterations to converge.
# - The Hamming Loss is the fraction of individual label assignments that are wrong (lower is better).
# - The sample-averaged F1-Score computes F1 over each sample's four labels and then averages across samples.


Multi-label Classification Metrics (Simulated Attributes from MNIST):
Hamming Loss: 0.1212
Sample-averaged F1-Score: 0.8471
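
As a rough illustration of the two multi-label metrics, the sketch below computes them by hand for three digits. `digit_to_labels` restates the four attribute rules used in this notebook; the predictions are hypothetical, and scoring a sample with no true and no predicted labels as 0 is an assumption.

```python
def digit_to_labels(d):
    """Four binary attributes from this notebook: even, prime, < 5, > 5."""
    return [int(d % 2 == 0), int(d in (2, 3, 5, 7)), int(d < 5), int(d > 5)]

def toy_hamming_loss(Y_true, Y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    wrong = sum(t != p
                for row_t, row_p in zip(Y_true, Y_pred)
                for t, p in zip(row_t, row_p))
    total = sum(len(row) for row in Y_true)
    return wrong / total

def toy_samples_f1(Y_true, Y_pred):
    """F1 computed per sample over its labels, then averaged over samples."""
    scores = []
    for row_t, row_p in zip(Y_true, Y_pred):
        tp = sum(1 for t, p in zip(row_t, row_p) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(row_t, row_p) if t == 0 and p == 1)
        fn = sum(1 for t, p in zip(row_t, row_p) if t == 1 and p == 0)
        denom = 2 * tp + fp + fn
        # Assumption: an empty sample (no true, no predicted labels) scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# True label vectors for the digits 0, 5, and 9.
toy_Y_true = [digit_to_labels(d) for d in (0, 5, 9)]
# Hypothetical predictions: perfect, one extra label, and entirely wrong.
toy_Y_pred = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 0, 0]]

toy_hloss = toy_hamming_loss(toy_Y_true, toy_Y_pred)
toy_f1_samples = toy_samples_f1(toy_Y_true, toy_Y_pred)
print(f"Hamming loss: {toy_hloss:.4f}")
print(f"Sample-averaged F1: {toy_f1_samples:.4f}")
```

With these definitions the digit 5 carries only the "prime" label, since it is neither less than nor greater than 5. Every digit from 0 to 9 has at least one positive label, so the empty-sample case does not arise for the true labels in this notebook.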


# Final Remarks

In this notebook we adapted the MNIST dataset to perform several tasks:

1. **Regression:**  
   - Treated the digit label as a continuous variable.
   - Computed MSE, MAE, MAPE, and R² Score using linear regression.

2. **Binary Classification:**  
   - Defined a binary task (even vs. odd digits).
   - Evaluated performance with precision, recall, and F1-Score.

3. **Multi-class Classification:**  
   - Used the original 10-class problem.
   - Computed class-specific precision/recall and overall F1-Scores (macro, weighted, micro).

4. **Multi-label Classification Simulation:**  
   - Generated four artificial binary labels from the digit information.
   - Employed a One-vs-Rest logistic regression classifier.
   - Measured performance using Hamming Loss and sample-averaged F1-Score.

This shows that MNIST, although designed for digit recognition, can be relabeled to exercise regression, binary, multi-class, and multi-label accuracy measures on a single dataset.
