<h3>Solution</h3>
Team members:
<ul>
    <li>Laith Mimi: 213923931 </li>
    <li>Mohamad Dweik: 213543010</li>
    <li>Amro tarter: 326697109</li>
</ul>

<h3>Step 1: Import necessary libraries</h3>

In [1]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, log_loss, f1_score, confusion_matrix

import numpy as np
import time

<h3>Step 2: Load CIFAR-10 feature and label data</h3>

In [2]:
X = np.load('cifar10_features.npy')  # 50K images, 16 features each
y = np.load('cifar10_labels.npy')    # labels 0–9

# Split the dataset into training and testing sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

<h3>Step 3: Manually implement One-vs-All (OvA) training using binary logistic regression</h3>

In [None]:
def train_ova_model(X_train, y_train):
    classes = range(10)
    ova_models = {}
    start = time.time()
    
    for cls in classes:
        y_binary = (y_train == cls).astype(int) # 1 if class matches, else 0
        model = LogisticRegression(max_iter=1000)
        model.fit(X_train, y_binary)
        ova_models[cls] = model

    print("OvA Training time:", round(time.time() - start, 2), "seconds")
    return ova_models

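def predict_ova(ova_models, X):
    """Return (predicted labels, per-class scores) for a dict of OvA models."""
    # Column k holds P(class k vs. rest); iterate in sorted class order so that
    # the column index always matches the class label
    probas = [ova_models[cls].predict_proba(X)[:, 1] for cls in sorted(ova_models)]
    probas = np.array(probas).T  # shape (n_samples, n_classes)
    y_pred = np.argmax(probas, axis=1)  # class with the highest score wins
    return y_pred, probas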
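
As a cross-check of the manual loop, the sketch below compares a hand-written OvA scheme with scikit-learn's `OneVsRestClassifier` wrapper. It runs on synthetic data from `make_blobs`, not on the CIFAR-10 features, and the names `X_demo`, `y_demo`, `manual_models` and `ovr` are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic stand-in for the feature matrix: 3 classes, 4 features (assumption)
X_demo, y_demo = make_blobs(n_samples=300, centers=3, n_features=4,
                            cluster_std=1.0, random_state=0)

# Manual OvA: one binary classifier per class
manual_models = {}
for cls in np.unique(y_demo):
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_demo, (y_demo == cls).astype(int))
    manual_models[cls] = clf

# Stack the "class vs. rest" probabilities column by column, in class order
manual_scores = np.column_stack(
    [manual_models[cls].predict_proba(X_demo)[:, 1] for cls in sorted(manual_models)])
manual_pred = np.argmax(manual_scores, axis=1)

# Library wrapper that applies the same one-vs-rest idea
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr.fit(X_demo, y_demo)
library_pred = ovr.predict(X_demo)

agreement = np.mean(manual_pred == library_pred)
print("Agreement between manual and library OvA:", agreement)
```

If the two approaches disagree on more than a handful of samples, the column order of the manual score matrix is the first thing to check.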

<h3>Step 4: Train the OvA and Softmax models</h3>

In [4]:
ova_models = train_ova_model(X_train, y_train)

start = time.time()
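# Note: with the default lbfgs solver, multinomial (softmax) is already used for
# multiclass targets; the multi_class parameter is deprecated since scikit-learn 1.5
model_softmax = LogisticRegression(multi_class='multinomial', max_iter=1000)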
model_softmax.fit(X_train, y_train)
print("Softmax Training time:", round(time.time() - start, 2), "seconds")

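OvA Training time: 0.66 seconds
Softmax Training time: 1.39 seconds

Note: in this single run, OvA trained about twice as fast as Softmax (0.66 s vs. 1.39 s), even though OvA fits ten separate binary models.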

<hr>

<h3>Step 5: Evaluate OvA and Softmax models using accuracy, log-loss, and F1-score</h3>


In [5]:
# OvA prediction and probabilities
y_pred_ova, y_proba_ova = predict_ova(ova_models, X_test)
# Softmax predictions
y_pred_softmax = model_softmax.predict(X_test)
y_proba_softmax = model_softmax.predict_proba(X_test)
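# Evaluation
print("OvA Accuracy:", accuracy_score(y_test, y_pred_ova))
# OvA scores come from independent binary models, so a row need not sum to 1;
# rescale each row into a probability distribution before computing log-loss
y_proba_ova_norm = y_proba_ova / y_proba_ova.sum(axis=1, keepdims=True)
print("OvA Log-loss:", log_loss(y_test, y_proba_ova_norm))
print("OvA F1-score:", f1_score(y_test, y_pred_ova, average='macro'))  # macro-averaged F1

print("Softmax Accuracy:", accuracy_score(y_test, y_pred_softmax))
print("Softmax Log-loss:", log_loss(y_test, y_proba_softmax))
print("Softmax F1-score:", f1_score(y_test, y_pred_softmax, average='macro'))  # macro-averaged F1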


OvA Accuracy: 0.961
OvA Log-loss: 0.1581440904770372
OvA F1-score: 0.961210892245097
Softmax Accuracy: 0.9631333333333333
Softmax Log-loss: 0.10772019567595487
Softmax F1-score: 0.9633582976769715




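Note: Softmax is slightly ahead of OvA on all three metrics: accuracy 0.9631 vs. 0.9610, macro F1 0.9634 vs. 0.9612, and log-loss 0.108 vs. 0.158 (lower is better).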
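
Log-loss assumes each row of predicted probabilities sums to 1, which OvA scores do not guarantee because each class score comes from a separate binary model. The sketch below uses made-up scores for 4 samples and 3 classes (the values and the names `ova_scores`, `ova_probs` are illustrative assumptions) to show row normalization and a by-hand log-loss that can be compared with `sklearn.metrics.log_loss`.

```python
import numpy as np
from sklearn.metrics import log_loss

# Hypothetical OvA scores for 4 samples and 3 classes; rows need not sum to 1
ova_scores = np.array([
    [0.90, 0.20, 0.10],
    [0.30, 0.80, 0.30],
    [0.10, 0.40, 0.70],
    [0.60, 0.50, 0.20],
])
y_true = np.array([0, 1, 2, 0])

# Rescale each row so that the scores form a probability distribution
ova_probs = ova_scores / ova_scores.sum(axis=1, keepdims=True)

# Log-loss by hand: mean negative log-probability assigned to the true class
manual_ll = -np.mean(np.log(ova_probs[np.arange(len(y_true)), y_true]))
sklearn_ll = log_loss(y_true, ova_probs, labels=[0, 1, 2])

print("Row sums before:", ova_scores.sum(axis=1))
print("Row sums after: ", ova_probs.sum(axis=1))
print("Manual log-loss:", manual_ll, "| sklearn log-loss:", sklearn_ll)
```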

<hr>

<h3>Step 6: Compute and print the confusion matrix for the OvA model</h3>


In [6]:
cm = confusion_matrix(y_test, y_pred_ova)
print("Confusion Matrix - OvA:")
print(cm)

Confusion Matrix - OvA:
[[1402    2   13    5    5    3    2   10   17    5]
 [   2 1464    4    1    0    1    0    1    5    7]
 [  11    1 1370   15   16   15    5    3    1    3]
 [  14    5   15 1450   12   51    5   10    3    4]
 [   6    0   13   15 1466    5    4    9    0    1]
 [   2    3   13   45   12 1437    8   11    2    1]
 [   5    5    9   17    4    6 1415    0    1    1]
 [   3    0    4   17    9    7    0 1456    0    1]
 [  13    2    1    5    1    0    2    0 1480    6]
 [  12    7    5    6    0    4    2    3    5 1475]]


Note: the largest off-diagonal entries are between classes 3 and 5: 51 class-3 samples were predicted as class 5, and 45 class-5 samples were predicted as class 3.
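
The most confused pair can also be located programmatically instead of by eye. The sketch below copies the OvA confusion matrix printed above and adds each off-diagonal entry to its mirror entry, so that confusion in both directions counts toward the same pair. The variable names are illustrative.

```python
import numpy as np

# OvA confusion matrix copied from the output above (rows: true, columns: predicted)
cm = np.array([
    [1402,    2,   13,    5,    5,    3,    2,   10,   17,    5],
    [   2, 1464,    4,    1,    0,    1,    0,    1,    5,    7],
    [  11,    1, 1370,   15,   16,   15,    5,    3,    1,    3],
    [  14,    5,   15, 1450,   12,   51,    5,   10,    3,    4],
    [   6,    0,   13,   15, 1466,    5,    4,    9,    0,    1],
    [   2,    3,   13,   45,   12, 1437,    8,   11,    2,    1],
    [   5,    5,    9,   17,    4,    6, 1415,    0,    1,    1],
    [   3,    0,    4,   17,    9,    7,    0, 1456,    0,    1],
    [  13,    2,    1,    5,    1,    0,    2,    0, 1480,    6],
    [  12,    7,    5,    6,    0,    4,    2,    3,    5, 1475],
])

# Sum confusion in both directions (a->b plus b->a) and ignore correct predictions
pair_errors = cm + cm.T
np.fill_diagonal(pair_errors, 0)

# Index of the largest symmetric off-diagonal entry
i, j = np.unravel_index(np.argmax(pair_errors), pair_errors.shape)
most_confused = (int(i), int(j))
n_confusions = int(pair_errors[i, j])
print("Most confused pair:", most_confused, "with", n_confusions, "mix-ups")
# prints: Most confused pair: (3, 5) with 96 mix-ups
```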

In [None]:
from sklearn.metrics import confusion_matrix, f1_score  # repeated so this cell runs on its own

# Step 1: Keep only samples whose true label is 3 or 5
mask = (y_test == 3) | (y_test == 5)
y_true_35 = y_test[mask]
y_pred_35 = y_pred_ova[mask]

# Step 2: Of those, keep only samples predicted as 3 or 5
# (samples predicted as any other class are dropped from this sub-matrix)
mask_pred = (y_pred_35 == 3) | (y_pred_35 == 5)
y_true_35 = y_true_35[mask_pred]
y_pred_35 = y_pred_35[mask_pred]

# Step 3: Compute confusion matrix for classes 3 and 5
cm_35 = confusion_matrix(y_true_35, y_pred_35, labels=[3, 5])

# Step 4: Print
print("Confusion Matrix - OvA Model (classes 3 and 5):")
print(cm_35)
print("OVA F1-score:", f1_score(y_true_35, y_pred_35, average='macro'))


Confusion Matrix - OvA Model (classes 3 and 5):
[[1450   51]
 [  45 1437]]
OVA F1-score: 0.9678170220226294


<hr>

<h3>Step 7: Train and evaluate a binary logistic regression model on the most confused classes (3 vs 5)</h3>

In [13]:
confused_classes = [3, 5]

# Filter training data
mask = (y_train == 3) | (y_train == 5)
X_sub = X_train[mask]
y_sub = y_train[mask]

# Train binary logistic regression on these two classes
# (default settings, i.e. max_iter=100 rather than the 1000 used in Steps 3 and 4)
model_sub = LogisticRegression()
model_sub.fit(X_sub, y_sub)

# Filter test data
mask_test = (y_test == 3) | (y_test == 5)
X_test_sub = X_test[mask_test]
y_test_sub = y_test[mask_test]

y_pred_sub = model_sub.predict(X_test_sub)

# Confusion matrix
cm_sub = confusion_matrix(y_test_sub, y_pred_sub)
print("Confusion Matrix:\n", cm_sub)
print("binary F1-score:", f1_score(y_test_sub, y_pred_sub, average='macro'))


Confusion Matrix:
 [[1507   62]
 [  55 1479]]
binary F1-score: 0.9622914832789338


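Note: on classes 3 and 5, the OvA model's macro F1 (0.9678) is slightly higher than the dedicated binary model's (0.9623). The two figures are not computed on the same samples, though: the OvA sub-matrix covers only the 2,983 test samples that were also predicted as 3 or 5, while the binary model is scored on all 3,103 test samples whose true label is 3 or 5. The comparison is therefore indicative rather than exact.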
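
One possible way to score both models on the same samples is to force the OvA model to choose between the two classes for every sample, using only the two corresponding score columns. The helper `predict_pair_from_ova` and the demo scores below are hypothetical; this is a sketch, not part of the original solution.

```python
import numpy as np

def predict_pair_from_ova(probas, class_a, class_b):
    """Pick whichever of class_a / class_b has the higher OvA score in each row."""
    probas = np.asarray(probas)
    return np.where(probas[:, class_a] >= probas[:, class_b], class_a, class_b)

# Made-up OvA scores for 3 samples and 6 classes (columns = classes 0..5)
demo_probas = np.array([
    [0.05, 0.02, 0.10, 0.70, 0.03, 0.40],  # class 3 scores higher than class 5
    [0.01, 0.03, 0.05, 0.30, 0.02, 0.85],  # class 5 scores higher than class 3
    [0.60, 0.02, 0.05, 0.20, 0.03, 0.35],  # class 0 is top overall; restricted to the pair, 5 wins
])

pair_pred = predict_pair_from_ova(demo_probas, 3, 5)
print(pair_pred)  # prints: [3 5 5]
```

Applied to the score matrix returned by `predict_ova` for all test samples whose true label is 3 or 5, this would give a confusion matrix over the same sample set that the binary model is scored on.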

<hr>

<h3>Step 8: Define a testmymodel function to evaluate any model on new, unseen data</h3>

In [8]:
def testmymodel(model, features_path, labels_path):
    # Load test data
    X = np.load(features_path)
    y = np.load(labels_path)

    # Check if model is a dict (OvA case)
    if isinstance(model, dict):
        y_pred, _ = predict_ova(model, X)
    else:
        y_pred = model.predict(X)

    return accuracy_score(y, y_pred)
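
For the score to reflect unseen data, the files passed in should not contain any training samples. The sketch below shows one way to write a held-out split to `.npy` files and score it in the same load-then-predict fashion. It uses synthetic data, and the file names `heldout_features.npy` and `heldout_labels.npy` are hypothetical.

```python
import os
import tempfile

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 400 samples, 3 classes, 4 features
X_demo, y_demo = make_blobs(n_samples=400, centers=3, n_features=4, random_state=0)
X_tr, X_ho, y_tr, y_ho = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42)

# Fit on the training part only
demo_model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Save the held-out part to disk, as an evaluator with separate files would see it
    features_path = os.path.join(tmp_dir, "heldout_features.npy")
    labels_path = os.path.join(tmp_dir, "heldout_labels.npy")
    np.save(features_path, X_ho)
    np.save(labels_path, y_ho)

    # Load from disk and score, mirroring the load-then-predict flow
    X_loaded = np.load(features_path)
    y_loaded = np.load(labels_path)
    heldout_accuracy = accuracy_score(y_loaded, demo_model.predict(X_loaded))

print("Held-out accuracy:", heldout_accuracy)
```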



<hr>

In [21]:
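# These are the same files loaded in Step 2 (training and test samples together),
# so the rates below are a sanity check of testmymodel rather than a score on unseen data
print("Softmax success rate:", testmymodel(model_softmax, 'cifar10_features.npy', 'cifar10_labels.npy'))
print("OvA success rate:", testmymodel(ova_models, 'cifar10_features.npy', 'cifar10_labels.npy'))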

Softmax success rate: 0.96218
OvA success rate: 0.96016
