# Homework 2: Evaluation of From‑Scratch Classifiers
## Installation
To start the homework, first install the requirements. We recommend using a conda environment.

Assuming you have a freshly installed Ubuntu 22.04 machine, use the following commands:

1. `apt-get update`
2. `apt-get install -y curl gcc`
3. `curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh`
4. `bash Miniconda3-latest-Linux-x86_64.sh`
5. `source ~/.bashrc`
6. `conda create -n lfdhw2 python=3.10.13 numpy=2.2.4 scipy=1.15.1 scikit-learn=1.6.1`
7. `conda activate lfdhw2`
8. `conda install -c conda-forge notebook pandas -y`
9. `pip install mnist1d`


This notebook automatically **trains and evaluates** your own implementations of Logistic Regression, Support Vector Machine, and Multi‑Layer Perceptron, located in `logreg.py`, `svm.py`, and `mlp.py`. You also need to implement a one-vs-rest classifier in `ovr_logreg.py`: fit a separate logistic regression classifier for each class, so that each one distinguishes its class from the rest.
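
The one-vs-rest idea can be sketched as follows. This is an illustration under stated assumptions, not the expected solution: scikit-learn's `LogisticRegression` stands in for the binary classifier (the homework expects your own implementation from `logreg.py`), the iris dataset is used only as a small demo, and the class name `OneVsRestSketch` is hypothetical.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression  # stand-in binary classifier (assumption)


class OneVsRestSketch:
    """Hypothetical one-vs-rest wrapper: one binary classifier per class."""

    def __init__(self, max_iter=1000):
        self.max_iter = max_iter

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.models_ = []
        for c in self.classes_:
            # Relabel: 1 for the current class, 0 for all other classes.
            binary_y = (y == c).astype(int)
            model = LogisticRegression(max_iter=self.max_iter)
            model.fit(X, binary_y)
            self.models_.append(model)
        return self

    def predict(self, X):
        # Column k holds the probability that a sample belongs to classes_[k].
        scores = np.column_stack([m.predict_proba(X)[:, 1] for m in self.models_])
        # Pick the class whose binary classifier is most confident.
        return self.classes_[np.argmax(scores, axis=1)]


# Small demo on iris (3 classes); the dataset choice is an assumption for illustration.
X_demo, y_demo = load_iris(return_X_y=True)
ovr_demo = OneVsRestSketch().fit(X_demo, y_demo)
pred_demo = ovr_demo.predict(X_demo)
train_acc_demo = float(np.mean(pred_demo == y_demo))
print(f"Training accuracy on iris: {train_acc_demo:.3f}")
```

Comparing the positive-class probabilities across the per-class models is one common way to turn the binary decisions into a single prediction; other scoring rules are possible.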

Follow the comments in each code cell if you would like to modify hyper‑parameters or datasets.

**Your accuracy scores should not be more than 12% lower than those from the original scikit-learn library.**
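
A quick way to check this requirement is sketched below. It assumes that "12% lower" means an absolute gap of 0.12 in accuracy (12 percentage points); if the requirement is meant relative to the scikit-learn score, the check is stricter. The helper name `within_tolerance` is hypothetical, and the accuracy pairs are the ones printed in the cells of this notebook.

```python
def within_tolerance(own_acc, reference_acc, tol=0.12):
    """Return True if own_acc is at most `tol` below reference_acc.

    Assumption: the 12% is an absolute gap in accuracy (percentage points).
    """
    return (reference_acc - own_acc) <= tol


# (from-scratch accuracy, scikit-learn accuracy) pairs reported in this notebook.
reported = {
    "Logistic Regression": (0.9440559440559441, 0.986013986013986),
    "One-vs-rest LogReg (MNIST-1D)": (0.2325, 0.3113),
    "SVM": (0.986013986013986, 0.986013986013986),
    "MLP (MNIST-1D)": (0.4925, 0.52875),
}

results = {name: within_tolerance(own, ref) for name, (own, ref) in reported.items()}
for name, ok in results.items():
    print(f"{name}: {'OK' if ok else 'too low'}")
```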

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import numpy as np
import pandas as pd

from mnist1d.data import make_dataset, get_dataset_args
from sklearn.datasets import load_breast_cancer

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report

from sklearn.linear_model import LogisticRegression as SkLogisticRegression
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

from logreg import LogisticRegression   
from svm import SVM
from mlp import MLP
from ovr_logreg import OneVsRestLogisticRegression

from scipy.stats import ttest_rel  # paired t-test

import warnings; warnings.filterwarnings('ignore')

In [3]:
# 1. Load dataset (binary classification)
data_bc = load_breast_cancer()
X = pd.DataFrame(data_bc.data, columns=data_bc.feature_names)
y = pd.Series(data_bc.target)  # y is 0/1

# 2. Standardize features, for simplicity, we intentionally ....(complete the sentence).
scaler = StandardScaler()
X = scaler.fit_transform(X)

# 3. Train / test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42, stratify=y)

print(f'Train samples: {X_train.shape[0]}, Test samples: {X_test.shape[0]}')

Train samples: 426, Test samples: 143


# Logistic Regression

In [4]:
# === Logistic Regression (from scratch) ===
logreg = LogisticRegression(max_iter=1000, random_state=42)
logreg.fit(X_train, y_train)
y_pred_lr = logreg.predict(X_test)
print('Logistic Regression Accuracy:', accuracy_score(y_test, y_pred_lr))
print(classification_report(y_test, y_pred_lr))

Logistic Regression Accuracy: 0.9440559440559441
              precision    recall  f1-score   support

           0       0.96      0.89      0.92        53
           1       0.94      0.98      0.96        90

    accuracy                           0.94       143
   macro avg       0.95      0.93      0.94       143
weighted avg       0.94      0.94      0.94       143



In [5]:
# --- scikit‑learn Logistic Regression ---
sk_logreg = SkLogisticRegression(max_iter=1000, solver='saga', random_state=42)
sk_logreg.fit(X_train, y_train)
y_pred_lr_sk = sk_logreg.predict(X_test)
print('Scikit-learn Logistic Regression Accuracy:', accuracy_score(y_test, y_pred_lr_sk))


Scikit-learn Logistic Regression Accuracy: 0.986013986013986


## Now, we'll use the MNIST-1D dataset
###### https://github.com/greydanus/mnist1d

In [6]:
defaults = get_dataset_args()
data_mnist1d = make_dataset(defaults)
X_mnist1d, y_mnist1d, t_mnist1d = data_mnist1d['x'], data_mnist1d['y'], data_mnist1d['t']

X_train_mnist1d, X_test_mnist1d, y_train_mnist1d, y_test_mnist1d = \
    train_test_split(X_mnist1d, y_mnist1d, test_size=0.2, random_state=42)

In [7]:
sk_logreg_mnist1d = SkLogisticRegression(max_iter=1000, solver='saga', 
                                         multi_class='multinomial')
# Note: the multi_class parameter was deprecated in scikit-learn 1.5
# and will be removed in 1.7.
# From then on, the recommended 'multinomial' will always be used for n_classes >= 3.
# Solvers that do not support 'multinomial' will raise an error.
# Use sklearn.multiclass.OneVsRestClassifier(LogisticRegression()) if you still want to use OvR.
# For simplicity, we'll use multi_class='multinomial'.

sk_logreg_mnist1d.fit(X_train_mnist1d, y_train_mnist1d)

y_pred_test_mnist1d = sk_logreg_mnist1d.predict(X_test_mnist1d)
accuracy_test_mnist1d = accuracy_score(y_test_mnist1d, y_pred_test_mnist1d)
print(f"Test accuracy Mnist: {accuracy_test_mnist1d:.2%}")

Test accuracy Mnist: 31.13%


In [8]:
ovr_mnist1d = OneVsRestLogisticRegression(max_iter=1000)

ovr_mnist1d.fit(X_train_mnist1d, y_train_mnist1d)

y_pred_test_mnist1d = ovr_mnist1d.predict(X_test_mnist1d)
accuracy_test_mnist1d = accuracy_score(y_test_mnist1d, y_pred_test_mnist1d)
print(f"Test accuracy Mnist: {accuracy_test_mnist1d:.2%}")

Test accuracy Mnist: 23.25%
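
The scikit-learn model in cell 7 is multinomial, while the from-scratch model is one-vs-rest. For a like-for-like baseline, the `OneVsRestClassifier` wrapper mentioned in the comment of cell 7 could be used. The sketch below runs on synthetic data from `make_classification` as a stand-in (an assumption, so that the block is self-contained); in the notebook you would pass the MNIST-1D arrays instead.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Synthetic 4-class data as a stand-in for MNIST-1D (assumption for this sketch).
X_syn, y_syn = make_classification(n_samples=600, n_features=20, n_informative=10,
                                   n_classes=4, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.2, random_state=42)

# One binary logistic regression per class, wrapped by scikit-learn.
sk_ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
sk_ovr.fit(X_tr, y_tr)

y_pred_ovr = sk_ovr.predict(X_te)
ovr_acc = accuracy_score(y_te, y_pred_ovr)
print(f"scikit-learn OvR accuracy: {ovr_acc:.2%}")
```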


# SVM

In [10]:
# === Support Vector Machine (from scratch) ===
# Convert labels to {-1, 1} for SVM
y_train_svm = np.where(y_train == 0, -1, 1)
y_test_svm = np.where(y_test == 0, -1, 1)

svm = SVM(C=1.0, max_iter=1000, random_state=42)
svm.fit(X_train, y_train_svm)
y_pred_svm = svm.predict(X_test)

# Map back to 0/1 for metrics
y_pred_svm_bin = np.where(y_pred_svm == -1, 0, 1)
print('SVM Accuracy:', accuracy_score(y_test, y_pred_svm_bin))
print(classification_report(y_test, y_pred_svm_bin))

SVM Accuracy: 0.986013986013986
              precision    recall  f1-score   support

           0       0.98      0.98      0.98        53
           1       0.99      0.99      0.99        90

    accuracy                           0.99       143
   macro avg       0.99      0.99      0.99       143
weighted avg       0.99      0.99      0.99       143



In [11]:
# --- scikit‑learn SVM ---
sk_svm = SVC(kernel='linear')
sk_svm.fit(X_train, y_train)
y_pred_svm_sk = sk_svm.predict(X_test)
print('Scikit-learn SVM Accuracy:', accuracy_score(y_test, y_pred_svm_sk))

Scikit-learn SVM Accuracy: 0.986013986013986


# MLP

In [13]:
# === Multi‑Layer Perceptron (from scratch) ===
# Prepare one‑hot labels for MLP
def one_hot(y, num_classes):
    out = np.zeros((y.size, num_classes))
    out[np.arange(y.size), y] = 1
    return out

y_train_oh = one_hot(y_train_mnist1d, 10)

mlp = MLP(input_size=X_train_mnist1d.shape[1], hidden_sizes=[32, 16], output_size=10,
          activation='relu', output_activation='softmax', learning_rate=0.05)
mlp.fit(X_train_mnist1d, y_train_oh, epochs=100, batch_size=32, verbose=False)

y_pred_probs = mlp.predict_proba(X_test_mnist1d)
y_pred_mlp = np.argmax(y_pred_probs, axis=1)
print('MLP Accuracy:', accuracy_score(y_test_mnist1d, y_pred_mlp))
print(classification_report(y_test_mnist1d, y_pred_mlp))

MLP Accuracy: 0.4925
              precision    recall  f1-score   support

           0       0.90      0.81      0.86        81
           1       0.39      0.21      0.27        87
           2       0.38      0.26      0.31        85
           3       0.63      0.73      0.68        90
           4       0.15      0.09      0.11        70
           5       0.30      0.46      0.36        76
           6       0.81      0.95      0.87        73
           7       0.45      0.47      0.46        93
           8       0.52      0.54      0.53        79
           9       0.26      0.38      0.31        66

    accuracy                           0.49       800
   macro avg       0.48      0.49      0.48       800
weighted avg       0.49      0.49      0.48       800
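
To see what the `one_hot` helper in the cell above produces, here is a tiny round trip on made-up labels (the arrays `labels`, `encoded`, and `decoded` are illustrative names only). Taking `np.argmax` along the class axis, as the cell does for the predicted probabilities, recovers the integer labels.

```python
import numpy as np


def one_hot(y, num_classes):
    # Same helper as in the MLP cell: row i gets a 1 in column y[i].
    out = np.zeros((y.size, num_classes))
    out[np.arange(y.size), y] = 1
    return out


labels = np.array([0, 2, 1, 2])   # made-up integer labels
encoded = one_hot(labels, 3)      # shape (4, 3), exactly one 1 per row
print(encoded)

# argmax along the class axis maps each one-hot row back to its label.
decoded = np.argmax(encoded, axis=1)
print(decoded)
```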



In [14]:
# --- scikit‑learn MLP ---
sk_mlp = MLPClassifier(hidden_layer_sizes=(32,16), activation='relu', 
                        max_iter=100, random_state=42)
sk_mlp.fit(X_train_mnist1d, y_train_mnist1d)
y_pred_mlp_sk = sk_mlp.predict(X_test_mnist1d)

print('Scikit-learn MLP Accuracy:', accuracy_score(y_test_mnist1d, y_pred_mlp_sk))

Scikit-learn MLP Accuracy: 0.52875


# Apply T-Test
##### In this section we apply a paired t-test, which checks whether the difference between the mean scores of two groups is statistically significant.
##### For a concise explanation of the t-test, see this YouTube channel: https://www.youtube.com/@tilestats
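
For reference, the paired t-test computed in the cells below can be written as follows. The symbols are introduced here for illustration: $d_i$ is the difference between the two accuracy scores for seed $i$, $n$ is the number of seeds, and $T_{n-1}$ is a Student's t variable with $n-1$ degrees of freedom.

```latex
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(d_i - \bar{d}\right)^2}, \qquad
t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad
p = 2\,P\left(T_{n-1} \ge |t|\right)
```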

In [15]:
seeds = range(10)
accuracies_1 = []
accuracies_2 = []

for seed in seeds:
    model1 = DecisionTreeClassifier(random_state=seed)
    model2 = DecisionTreeClassifier(random_state=seed+1000)
    model1.fit(X_train, y_train)
    model2.fit(X_train, y_train)
    
    acc1 = accuracy_score(y_test, model1.predict(X_test))
    acc2 = accuracy_score(y_test, model2.predict(X_test))
    
    accuracies_1.append(acc1)
    accuracies_2.append(acc2)

acc1 = np.array(accuracies_1)
acc2 = np.array(accuracies_2)


In [16]:
print(acc1)
print(acc2)

[0.93706294 0.93706294 0.93706294 0.93706294 0.91608392 0.91608392
 0.93006993 0.92307692 0.90909091 0.93706294]
[0.93006993 0.91608392 0.93006993 0.92307692 0.92307692 0.91608392
 0.93706294 0.93006993 0.91608392 0.93006993]


##### Apply a t-test to the two sets of accuracy scores and analyze the results

In [18]:
import math 

diff      = acc1 - acc2
n         = diff.size
mean1     = acc1.mean()
mean2     = acc2.mean()
mean_diff = diff.mean()
std_diff  = diff.std(ddof=1)
t_stat    = mean_diff / (std_diff / math.sqrt(n))
df        = n - 1

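def t_pdf(x: float, v: int) -> float:
    """Density of Student's t-distribution with v degrees of freedom at x."""
    num   = math.gamma((v + 1) / 2)
    denom = math.sqrt(v * math.pi) * math.gamma(v / 2)
    return num / denom * (1 + x**2 / v) ** (-(v + 1) / 2)

def t_survival(x: float, v: int, step: float = 1e-4, upper: float = 100) -> float:
    """Approximate P(T >= |x|) by integrating t_pdf over [|x|, upper] with Simpson's rule."""
    x = abs(x)
    if x == 0:
        return 0.5  # by symmetry, half of the probability mass lies above zero
    n_steps = int((upper - x) / step)
    if n_steps % 2 == 1:
        n_steps += 1  # Simpson's rule needs an even number of intervals
    h  = (upper - x) / n_steps
    s  = t_pdf(x, v) + t_pdf(upper, v)  # endpoint terms
    for k in range(1, n_steps):
        factor = 4 if k % 2 else 2  # odd interior points weigh 4, even ones 2
        s += factor * t_pdf(x + k * h, v)
    return (h / 3) * s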

p_one = t_survival(t_stat, df)
p_val = 2 * p_one

print(f"Mean Accuracy (Model 1): {mean1}")
print(f"Mean Accuracy (Model 2): {mean2}")
print(f"T-statistic: {t_stat}")
print(f"P-value: {p_val}")

Mean Accuracy (Model 1): 0.9279720279720278
Mean Accuracy (Model 2): 0.9251748251748252
T-statistic: 0.8846517369293798
P-value: 0.3993613590441641
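
As a sanity check on the manual computation, the same test can be run with `scipy.stats.ttest_rel`, which the notebook already imports. The sketch below is self-contained: it copies the accuracies printed by cell 16 (rounded to eight decimals), so small differences in the last digits are expected. The names `t_check` and `p_check` are illustrative.

```python
import numpy as np
from scipy.stats import ttest_rel

# Accuracies as printed by cell 16 (rounded to eight decimals).
acc1 = np.array([0.93706294, 0.93706294, 0.93706294, 0.93706294, 0.91608392,
                 0.91608392, 0.93006993, 0.92307692, 0.90909091, 0.93706294])
acc2 = np.array([0.93006993, 0.91608392, 0.93006993, 0.92307692, 0.92307692,
                 0.91608392, 0.93706294, 0.93006993, 0.91608392, 0.93006993])

# Paired (related-samples) t-test; the default alternative is two-sided.
t_check, p_check = ttest_rel(acc1, acc2)
print(f"T-statistic: {t_check:.6f}")
print(f"P-value: {p_check:.6f}")
```

Both values should agree with the manual result above to several decimal places.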


Interpretation: