
# Regression and Classification Metrics

This notebook covers common metrics for regression and classification, giving the formula and a Python implementation for each. It closes with Type 1 and Type 2 errors and how to read them off a confusion matrix.



## Regression Metrics

### 1. Mean Absolute Error (MAE)
- **Formula:**  
  \[ MAE = \frac{1}{n} \sum_{i=1}^{n} | y_i - \hat{y}_i | \]
- **Python Implementation:**


In [None]:

from sklearn.metrics import mean_absolute_error

# Example data
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Compute MAE
mae = mean_absolute_error(y_true, y_pred)
print("Mean Absolute Error:", mae)
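
To connect the formula to the library call, here is a minimal sketch that computes MAE by hand in plain Python on the same example data. The names `abs_errors` and `mae_manual` are illustrative and not part of scikit-learn.

```python
# Same example data as in the cell above
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Absolute error for each observation: |y_i - y_hat_i|
abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]

# MAE is the mean of the absolute errors
mae_manual = sum(abs_errors) / len(abs_errors)
print("Manual MAE:", mae_manual)  # 0.5
```

The result should agree with `mean_absolute_error` on the same inputs.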



### 2. Mean Squared Error (MSE)
- **Formula:**  
  \[ MSE = \frac{1}{n} \sum_{i=1}^{n} ( y_i - \hat{y}_i )^2 \]
- **Python Implementation:**


In [None]:

from sklearn.metrics import mean_squared_error

# Compute MSE
mse = mean_squared_error(y_true, y_pred)
print("Mean Squared Error:", mse)



### 3. Root Mean Squared Error (RMSE)
- **Formula:**  
  \[ RMSE = \sqrt{MSE} \]
- **Python Implementation:**


In [None]:

import numpy as np

# Compute RMSE
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print("Root Mean Squared Error:", rmse)
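
The MSE and RMSE formulas can also be checked by hand. The sketch below uses only the standard library and the same example data; `mse_manual` and `rmse_manual` are illustrative names.

```python
import math

# Same example data as in the MAE cell
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Squared error for each observation: (y_i - y_hat_i)^2
squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]

# MSE is the mean of the squared errors; RMSE is its square root
mse_manual = sum(squared_errors) / len(squared_errors)
rmse_manual = math.sqrt(mse_manual)

print("Manual MSE:", mse_manual)              # 0.375
print("Manual RMSE:", round(rmse_manual, 4))  # 0.6124
```

Because RMSE is the square root of MSE, it is expressed in the same units as the target, which is why it is often easier to interpret than MSE.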



### 4. R² Score (Coefficient of Determination)
- **Formula:**  
  \[ R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \]
  where \( \bar{y} \) is the mean of the observed values \( y_i \).
- **Python Implementation:**


In [None]:

from sklearn.metrics import r2_score

# Compute R² Score
r2 = r2_score(y_true, y_pred)
print("R² Score:", r2)
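
As a cross-check of the formula, this sketch computes R² from the residual and total sums of squares in plain Python. The names `ss_res`, `ss_tot`, and `r2_manual` are illustrative.

```python
# Same example data as in the MAE cell
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Mean of the observed values (y-bar in the formula)
y_mean = sum(y_true) / len(y_true)

# Residual sum of squares (numerator) and total sum of squares (denominator)
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
ss_tot = sum((t - y_mean) ** 2 for t in y_true)

r2_manual = 1 - ss_res / ss_tot
print("Manual R² Score:", round(r2_manual, 4))  # 0.9486
```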



### 5. Adjusted R² Score
- **Formula:**  
  \[ \text{Adjusted } R^2 = 1 - \left( \frac{(1 - R^2) (n - 1)}{n - k - 1} \right) \]
  where:
  - \( R^2 \) is the standard R² score,
  - \( n \) is the number of observations,
  - \( k \) is the number of independent variables (features).
- **Python Implementation:**


In [None]:

def adjusted_r2_score(r2, n, k):
    return 1 - ((1 - r2) * (n - 1) / (n - k - 1))

# Example values
n = len(y_true)  # Number of observations
k = 1  # Assuming one feature for simplicity
adjusted_r2 = adjusted_r2_score(r2, n, k)

print("Adjusted R² Score:", adjusted_r2)
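
The formula penalizes additional features: for a fixed R² and sample size, the adjusted score falls as \( k \) grows, and it is undefined when \( n - k - 1 \le 0 \). The sketch below illustrates both points. The function name `adjusted_r2_checked` and the values `r2_fixed = 0.90` and `n_obs = 50` are hypothetical, chosen only for illustration.

```python
def adjusted_r2_checked(r2, n, k):
    """Adjusted R² with a guard for the undefined case n - k - 1 <= 0."""
    if n - k - 1 <= 0:
        raise ValueError("Adjusted R² requires n > k + 1")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical values, not taken from the example data above
r2_fixed, n_obs = 0.90, 50

# Same R², increasing number of features: the adjusted score decreases
results = {k: adjusted_r2_checked(r2_fixed, n_obs, k) for k in (1, 5, 10)}
for k_features, value in results.items():
    print(f"k={k_features}: adjusted R² = {value:.4f}")
```

With only four observations, as in the example data, the guard would trigger for any model with three or more features.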



## Classification Metrics

### 1. Accuracy
- **Formula:**  
  \[ Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \]
- **Python Implementation:**


In [None]:

from sklearn.metrics import accuracy_score

# Example classification data
y_true_class = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred_class = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# Compute Accuracy
accuracy = accuracy_score(y_true_class, y_pred_class)
print("Accuracy:", accuracy)
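
To make the TP, TN, FP, and FN terms in the formula concrete, the following sketch counts them directly from the example labels and recomputes accuracy. The lowercase count variables and `accuracy_manual` are illustrative names, and the positive class is assumed to be `1`.

```python
# Same example classification data as in the cell above
y_true_class = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred_class = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]

pairs = list(zip(y_true_class, y_pred_class))

# Count each outcome, treating label 1 as the positive class
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

accuracy_manual = (tp + tn) / (tp + tn + fp + fn)
print("TP, TN, FP, FN:", tp, tn, fp, fn)     # 5 3 1 1
print("Manual Accuracy:", accuracy_manual)   # 0.8
```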



### 2. Precision, Recall, and F1 Score
- **Precision Formula:**  
  \[ Precision = \frac{TP}{TP + FP} \]
- **Recall Formula:**  
  \[ Recall = \frac{TP}{TP + FN} \]
- **F1 Score Formula:**  
  \[ F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall} \]
- **Python Implementation:**


In [None]:

from sklearn.metrics import precision_score, recall_score, f1_score

# Compute Precision, Recall, and F1 Score
precision = precision_score(y_true_class, y_pred_class)
recall = recall_score(y_true_class, y_pred_class)
f1 = f1_score(y_true_class, y_pred_class)

print("Precision:", precision)
print("Recall:", recall)
print("F1 Score:", f1)
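
The three formulas can be applied by hand to the same labels. This is a minimal sketch that assumes label `1` is the positive class (which matches the default `pos_label=1` of the scikit-learn functions); the variable names are illustrative.

```python
# Same example classification data as in the accuracy cell
y_true_class = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred_class = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]

pairs = list(zip(y_true_class, y_pred_class))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

# Precision: share of predicted positives that are correct
precision_manual = tp / (tp + fp)
# Recall: share of actual positives that were found
recall_manual = tp / (tp + fn)
# F1: harmonic mean of precision and recall
f1_manual = 2 * precision_manual * recall_manual / (precision_manual + recall_manual)

print("Manual Precision:", round(precision_manual, 4))  # 0.8333
print("Manual Recall:", round(recall_manual, 4))        # 0.8333
print("Manual F1 Score:", round(f1_manual, 4))          # 0.8333
```

Here FP and FN happen to be equal, so precision and recall coincide and F1 equals both.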



### 3. Confusion Matrix
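- **Layout:** A binary confusion matrix is conventionally laid out as:

  |                     | **Predicted Positive** | **Predicted Negative** |
  |---------------------|------------------------|------------------------|
  | **Actual Positive** | TP (True Positive)     | FN (False Negative)    |
  | **Actual Negative** | FP (False Positive)    | TN (True Negative)     |

  Note that scikit-learn's `confusion_matrix` sorts the labels, so for 0/1 labels its output is ordered `[[TN, FP], [FN, TP]]`, with the negative class first.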

- **Python Implementation:**


In [None]:

from sklearn.metrics import confusion_matrix

# Compute Confusion Matrix
cm = confusion_matrix(y_true_class, y_pred_class)
print("Confusion Matrix:\n", cm)
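
Because the printed matrix follows scikit-learn's row and column convention (rows are actual labels, columns are predicted labels, both in sorted label order), it helps to see how the cells are filled. The sketch below rebuilds the matrix in plain Python and then rearranges it into the positive-first layout of the table above. The names `matrix` and `positive_first` are illustrative.

```python
# Same example classification data as in the accuracy cell
y_true_class = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred_class = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# matrix[actual][predicted], with labels in sorted order (0, then 1)
matrix = [[0, 0], [0, 0]]
for t, p in zip(y_true_class, y_pred_class):
    matrix[t][p] += 1

print(matrix)  # [[3, 1], [1, 5]]

# Unpack in the sorted-label order: first row is actual 0, second is actual 1
(tn, fp), (fn, tp) = matrix

# Rearranged so the positive class comes first, as in the table
positive_first = [[tp, fn], [fp, tn]]
print(positive_first)  # [[5, 1], [1, 3]]
```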



### 4. ROC-AUC Score
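- **Definition:** The area under the ROC curve, which plots the true positive rate \( TP / (TP + FN) \) against the false positive rate \( FP / (FP + TN) \) as the decision threshold varies. A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking.
- **Python Implementation:**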


In [None]:

from sklearn.metrics import roc_auc_score

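# Compute ROC-AUC Score
# Note: roc_auc_score is normally given scores or probabilities (for example
# from predict_proba). With hard 0/1 predictions, as here, the ROC curve has
# only a single operating point, so the result is a coarse summary.
roc_auc = roc_auc_score(y_true_class, y_pred_class)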
print("ROC-AUC Score:", roc_auc)
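
ROC-AUC can also be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, counting ties as one half. The sketch below computes it that way in plain Python. The helper `pairwise_auc` and the `y_score_hypothetical` values are assumptions made for illustration: the scores are invented so that thresholding them at 0.5 reproduces `y_pred_class`, and they do not come from a real model.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true_class = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred_class = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# Hypothetical scores, invented for illustration only
y_score_hypothetical = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.6, 0.3, 0.95, 0.85]

auc_from_scores = pairwise_auc(y_true_class, y_score_hypothetical)
auc_from_labels = pairwise_auc(y_true_class, y_pred_class)

print("AUC from scores:", round(auc_from_scores, 4))       # 0.9583
print("AUC from hard labels:", round(auc_from_labels, 4))  # 0.7917
```

The hard-label value is lower because the many ties between equal 0/1 predictions discard the ranking information that continuous scores carry.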



## Type 1 and Type 2 Errors

### 1. Type 1 Error (False Positive)
- Occurs when we **reject a true null hypothesis** (detecting something that isn’t there).
- Example: A COVID test incorrectly says a healthy person has COVID.

### 2. Type 2 Error (False Negative)
- Occurs when we **fail to reject a false null hypothesis** (missing something that is there).
- Example: A cancer test fails to detect cancer in a patient who actually has it.

### 3. Python Implementation of Type 1 & Type 2 Errors


In [None]:

import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# Example classification data
y_true_class = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred_class = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# Compute Confusion Matrix
cm = confusion_matrix(y_true_class, y_pred_class)

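# Extract values. For 0/1 labels scikit-learn orders the matrix as
# [[TN, FP], [FN, TP]], so ravel() yields TN, FP, FN, TP in that order.
TN, FP, FN, TP = cm.ravel()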

# Calculate Type 1 and Type 2 Error Rates
type1_error_rate = FP / (FP + TN)  # False Positive Rate (α)
type2_error_rate = FN / (FN + TP)  # False Negative Rate (β)

print("Type 1 Error Rate (False Positive Rate):", type1_error_rate)
print("Type 2 Error Rate (False Negative Rate):", type2_error_rate)

# Visualizing Confusion Matrix
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", xticklabels=["Predicted Negative", "Predicted Positive"],
            yticklabels=["Actual Negative", "Actual Positive"])
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
