# Assignment 5.1 - Model Validation

Please submit your solution to this notebook in the Whiteboard under the corresponding Assignment entry, both as an .ipynb file and as a .pdf. <br><br>
Please do **NOT** rename the file!

#### State both names of your group members here:
[Jane and John Doe]

In [1]:
# Daniel Thompson, Paola Gega

---

## Grading Info/Details - Assignment 5.1:

The assignment will be graded semi-automatically, which means that your code will be tested against a set of predefined test cases and qualitatively assessed by a human. This will speed up the grading process for us.

* For passing the test scripts: 
    - Please make sure to **NOT** alter predefined class or function names, as this would lead to failing of the test scripts.
    - Please do **NOT** rename the files before uploading to the Whiteboard!

* **(RESULT)** tags indicate checkpoints that will be specifically assessed by a human.

* You will pass the assignment if you pass the majority of test cases and we can at least confirm effort regarding the **(RESULT)**-tagged checkpoints per task.

---

## Task 5.1.1 - Binary Classification Evaluation

* Use the `sklearn` (or another library's) implementations of Logistic Regression and SVM for classification. Train both models on the `Breast Cancer` dataset (see the given imports). **(RESULT)**
* Evaluate the performance of both models using appropriate classification metrics and implement them using `numpy` only. Report at least on the following: accuracy, precision, recall, F1-score. **(RESULT)**

In [2]:
# Useful imports
import numpy as np
from sklearn.datasets import load_breast_cancer, load_diabetes, load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, SVR
from sklearn.linear_model import LinearRegression, LogisticRegression
import matplotlib.pyplot as plt

In [3]:
def metrics_binary_classification(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) with label 1 as the positive class."""
    # calculate TP, TN, FP, FN
    TP = np.sum((y_true == 1) & (y_pred == 1))
    TN = np.sum((y_true == 0) & (y_pred == 0))
    FP = np.sum((y_true == 0) & (y_pred == 1))
    FN = np.sum((y_true == 1) & (y_pred == 0))

    # calculate metrics
    accuracy = (TP + TN) / len(y_true) if len(y_true) > 0 else 0
    precision = TP / (TP + FP) if (TP + FP) > 0 else 0
    recall = TP / (TP + FN) if (TP + FN) > 0 else 0
    f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0

    return accuracy, precision, recall, f1
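The counts behind these metrics can be checked by hand on a small example. The sketch below uses made-up toy labels (an assumption for illustration, not values from the Breast Cancer data) and treats label 1 as the positive class. It skips the zero-division guards because every denominator is non-zero for this input.

```python
import numpy as np

# Toy labels, made up for illustration (not taken from the Breast Cancer data).
y_true = np.array([1, 1, 1, 0, 0, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 1])

# Confusion-matrix counts with label 1 as the positive class.
TP = int(np.sum((y_true == 1) & (y_pred == 1)))
TN = int(np.sum((y_true == 0) & (y_pred == 0)))
FP = int(np.sum((y_true == 0) & (y_pred == 1)))
FN = int(np.sum((y_true == 1) & (y_pred == 0)))

# Metrics derived from the four counts.
accuracy = (TP + TN) / len(y_true)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

print(f"TP={TP} TN={TN} FP={FP} FN={FN}")
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

With one false positive and one false negative, precision and recall are both 4/5, so the F1-score equals them as well.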

In [4]:
breast_cancer_data = load_breast_cancer()
X = breast_cancer_data.data
y = breast_cancer_data.target

In [5]:
# print(breast_cancer_data.DESCR)

In [6]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print("Logistic Regression")
log = LogisticRegression()
log.fit(X_train, y_train)
y_pred_log = log.predict(X_test)
log_metrics = metrics_binary_classification(y_test, y_pred_log)
print(f"Accuracy: {log_metrics[0]:.4f}")
print(f"Precision: {log_metrics[1]:.4f}")
print(f"Recall: {log_metrics[2]:.4f}")
print(f"F1-score: {log_metrics[3]:.4f}")
print()

print("SVM Classifier")
svc = SVC()
svc.fit(X_train, y_train)
y_pred_svc = svc.predict(X_test)
svc_metrics = metrics_binary_classification(y_test, y_pred_svc)
print(f"Accuracy: {svc_metrics[0]:.4f}")
print(f"Precision: {svc_metrics[1]:.4f}")
print(f"Recall: {svc_metrics[2]:.4f}")
print(f"F1-score: {svc_metrics[3]:.4f}")


Logistic Regression
Accuracy: 0.9737
Precision: 0.9722
Recall: 0.9859
F1-score: 0.9790

SVM Classifier
Accuracy: 0.9825
Precision: 0.9726
Recall: 1.0000
F1-score: 0.9861


## Task 5.1.2 - Multi-Class Classification Evaluation

* Repeat Task 5.1.1 for the multiclass `Iris` problem. Report on the performance metrics: accuracy, precision, recall, F1-score. **(RESULT)**


In [7]:
def metrics_multiclass_classification(y_true, y_pred, k=3):
    # per-class true positives: samples of class i that were predicted as class i
    TC = np.empty(k)
    for i in range(k):
        TC[i] = np.sum((y_true == i) & (y_pred == i))

    # calculate metrics
    accuracy = np.sum(TC) / len(y_true) if len(y_true) > 0 else 0
    # precision: TP_i / number of samples predicted as class i
    precision = np.empty(k)
    for i in range(k):
        precision[i] = TC[i] / np.sum(y_pred == i) if np.sum(y_pred == i) > 0 else 0
    # recall: TP_i / number of samples that truly belong to class i
    recall = np.empty(k)
    for i in range(k):
        recall[i] = TC[i] / np.sum(y_true == i) if np.sum(y_true == i) > 0 else 0
    # F1: harmonic mean of precision and recall, set to 0 where both are 0
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros(k), where=denom > 0)

    return accuracy, precision, recall, f1
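Per-class precision and recall can also be read off a confusion matrix, which makes the one-vs-rest definitions easy to verify. The following is a minimal sketch on made-up toy labels (an assumption for illustration, not Iris predictions). The macro average at the end is one common way to condense the per-class values into a single number and is not required by the task.

```python
import numpy as np

# Toy labels for a 3-class problem, made up for illustration.
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 0])
k = 3

# cm[t, p] counts samples with true class t and predicted class p.
cm = np.zeros((k, k), dtype=int)
np.add.at(cm, (y_true, y_pred), 1)

tp = np.diag(cm)                 # correctly predicted samples per class
precision = tp / cm.sum(axis=0)  # column sums: samples predicted as each class
recall = tp / cm.sum(axis=1)     # row sums: samples truly in each class
f1 = 2 * precision * recall / (precision + recall)

print(cm)
print("precision:", np.round(precision, 4))
print("recall:   ", np.round(recall, 4))
print("f1:       ", np.round(f1, 4))
print("macro F1: ", round(float(f1.mean()), 4))
```

Recall divides by the row sum (all samples that truly belong to the class), whereas dividing the true negatives by the number of samples outside the class would give the specificity instead.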

In [8]:
iris_data = load_iris()
X = iris_data.data
y = iris_data.target

In [9]:
# print(iris_data.DESCR)

In [10]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print("Logistic Regression")
log = LogisticRegression()
log.fit(X_train, y_train)
y_pred_log = log.predict(X_test)
log_metrics = metrics_multiclass_classification(y_test, y_pred_log)
print(f"Accuracy: {log_metrics[0]:.4f}")
for i in range(3):
    print(f"Class {iris_data.target_names[i]} Precision: {log_metrics[1][i]:.4f}")
    print(f"Class {iris_data.target_names[i]} Recall: {log_metrics[2][i]:.4f}")
    print(f"Class {iris_data.target_names[i]} F1-score: {log_metrics[3][i]:.4f}")
print()

print("SVM Classifier")
svc = SVC()
svc.fit(X_train, y_train)
y_pred_svc = svc.predict(X_test)
svc_metrics = metrics_multiclass_classification(y_test, y_pred_svc)
print(f"Accuracy: {svc_metrics[0]:.4f}")
for i in range(3):  
    print(f"Class {iris_data.target_names[i]} Precision: {svc_metrics[1][i]:.4f}")
    print(f"Class {iris_data.target_names[i]} Recall: {svc_metrics[2][i]:.4f}")
    print(f"Class {iris_data.target_names[i]} F1-score: {svc_metrics[3][i]:.4f}")

Logistic Regression
Accuracy: 1.0000
Class setosa Precision: 1.0000
Class setosa Recall: 1.0000
Class setosa F1-score: 1.0000
Class versicolor Precision: 1.0000
Class versicolor Recall: 1.0000
Class versicolor F1-score: 1.0000
Class virginica Precision: 1.0000
Class virginica Recall: 1.0000
Class virginica F1-score: 1.0000

SVM Classifier
Accuracy: 1.0000
Class setosa Precision: 1.0000
Class setosa Recall: 1.0000
Class setosa F1-score: 1.0000
Class versicolor Precision: 1.0000
Class versicolor Recall: 1.0000
Class versicolor F1-score: 1.0000
Class virginica Precision: 1.0000
Class virginica Recall: 1.0000
Class virginica F1-score: 1.0000


## Task 5.1.3 - Regression Evaluation

* Now evaluate a trained `Linear Regression` and `SVM` model for the Regression task `Diabetes`. Report on the performance metrics: MSE, RMSE, MAE, R². **(RESULT)**
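The four regression metrics can be gathered in one small `numpy` helper. This is a sketch only: the name `regression_metrics` and the toy values are assumptions for illustration and are not part of the assignment template.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return (mse, rmse, mae, r2). Hypothetical helper, not part of the template."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    mse = np.mean(residuals ** 2)   # mean squared error
    rmse = np.sqrt(mse)             # same unit as the target
    mae = np.mean(np.abs(residuals))
    ss_res = np.sum(residuals ** 2)                   # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return mse, rmse, mae, r2


# Toy values, made up for illustration (not Diabetes predictions).
mse, rmse, mae, r2 = regression_metrics([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
print(f"MSE: {mse:.4f}  RMSE: {rmse:.4f}  MAE: {mae:.4f}  R^2: {r2:.4f}")
```

R² compares the model's squared error with that of always predicting the mean of `y_true`, so a value of 0 corresponds to that constant baseline and negative values are possible.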

In [11]:
diabetes_data = load_diabetes()
X = diabetes_data.data
y = diabetes_data.target

In [12]:
# print(diabetes_data.DESCR)

In [13]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print("Linear Regression")
lin = LinearRegression()
lin.fit(X_train, y_train)
y_pred_lin = lin.predict(X_test)
mse_lin = np.mean((y_pred_lin - y_test)**2)
ss_total = np.sum((y_test - np.mean(y_test))**2)
print("MSE:", mse_lin)
print("RMSE:", np.sqrt(mse_lin))
print("MAE:", np.mean(np.abs(y_pred_lin - y_test)))
print("R^2:", 1 - np.sum((y_pred_lin - y_test)**2)/ss_total)
print()

print("SVM")
svr = SVR(kernel='sigmoid', C=10, epsilon=0.01, tol=1e-5)
svr.fit(X_train, y_train)
y_pred_svr = svr.predict(X_test)
mse_svr = np.mean((y_pred_svr - y_test)**2)
print("MSE:", mse_svr)
print("RMSE:", np.sqrt(mse_svr))
print("MAE:", np.mean(np.abs(y_pred_svr - y_test)))
print("R^2:", 1 - np.sum((y_pred_svr - y_test)**2)/ss_total)

Linear Regression
MSE: 2900.193628493482
RMSE: 53.85344583676593
MAE: 42.79409467959994
R^2: 0.45260276297191937

SVM
MSE: 2958.305360860106
RMSE: 54.39030576178172
MAE: 43.29488016926615
R^2: 0.4416344602269302


## Task 5.1.4 - Cross-Validation (BONUS)

* Set up a cross-validation pipeline for the `Linear Regression` and `SVM` models on the `Diabetes` dataset. (Regression) **(RESULT)**
* Set up a cross-validation pipeline for the `Logistic Regression` and `SVM` models on the `Iris` dataset. (Classification) **(RESULT)**
* Report the performance metrics on all folds (minimum 5-fold) for each model and dataset. **(RESULT)**
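One possible way to set up such a pipeline is sketched below with `sklearn`'s `cross_validate`. The helper `run_cv`, the fold settings (5 folds, shuffled, `random_state=42`) and the chosen scorers are assumptions for illustration. The `SVR` settings are copied from Task 5.1.3. Note that `sklearn` reports error metrics as negated scores (`neg_mean_squared_error`), so their sign has to be flipped when reporting MSE or MAE.

```python
import numpy as np
from sklearn.datasets import load_diabetes, load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, SVR


def run_cv(models, X, y, cv, scoring):
    """Return {model name: {metric: per-fold scores}}. Hypothetical helper."""
    results = {}
    for name, model in models.items():
        # The scaler is part of the pipeline, so it is re-fit on each training fold.
        pipe = make_pipeline(StandardScaler(), model)
        scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring)
        results[name] = {metric: scores[f"test_{metric}"] for metric in scoring}
    return results


# Regression: Diabetes
X_reg, y_reg = load_diabetes(return_X_y=True)
reg_results = run_cv(
    {
        "Linear Regression": LinearRegression(),
        "SVM": SVR(kernel="sigmoid", C=10, epsilon=0.01, tol=1e-5),
    },
    X_reg,
    y_reg,
    cv=KFold(n_splits=5, shuffle=True, random_state=42),
    scoring=["neg_mean_squared_error", "neg_mean_absolute_error", "r2"],
)

# Classification: Iris (stratified folds keep the class proportions per fold)
X_clf, y_clf = load_iris(return_X_y=True)
clf_results = run_cv(
    {"Logistic Regression": LogisticRegression(), "SVM": SVC()},
    X_clf,
    y_clf,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
    scoring=["accuracy", "precision_macro", "recall_macro", "f1_macro"],
)

# Report every fold plus the mean for each model and metric.
for task_results in (reg_results, clf_results):
    for name, metrics in task_results.items():
        print(name)
        for metric, per_fold in metrics.items():
            print(f"  {metric}: {np.round(per_fold, 4)} (mean {per_fold.mean():.4f})")
```

Putting `StandardScaler` inside the pipeline avoids leaking statistics of the validation fold into training, which would happen if the whole dataset were scaled once before splitting.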

## Congratz, you made it! :)