Title: Understanding Regression Metrics

Task 1: Calculate MAE and MSE on test predictions and compare errors.

In [1]:
# Task 1: Calculate MAE and MSE on test predictions and compare errors

from sklearn.metrics import mean_absolute_error, mean_squared_error
import numpy as np

# Example actual and predicted values (test set)
y_test = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 4.9, 2.7, 7.1])

# Calculate Mean Absolute Error (MAE)
mae = mean_absolute_error(y_test, y_pred)
print(f"Mean Absolute Error (MAE): {mae:.2f}")

# Calculate Mean Squared Error (MSE)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error (MSE): {mse:.2f}")

# Comparison: MSE is in squared units, so compare RMSE (same units as MAE) with MAE
rmse = np.sqrt(mse)
print(f"Root Mean Squared Error (RMSE): {rmse:.2f}")
if np.isclose(rmse, mae):
    print("RMSE equals MAE: every error has the same magnitude.")
else:
    print("RMSE is greater than MAE: squaring gives larger errors more weight.")

Mean Absolute Error (MAE): 0.15
Mean Squared Error (MSE): 0.03
Root Mean Squared Error (RMSE): 0.16
RMSE is greater than MAE: squaring gives larger errors more weight.
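
To make the effect of squaring concrete, the following is a minimal sketch that computes MAE and MSE directly from their definitions with NumPy and then repeats the calculation with one deliberately bad prediction. The helper name `mae_mse` and the array `y_pred_outlier` are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def mae_mse(y_true, y_pred):
    """Return (MAE, MSE) computed directly from their definitions."""
    errors = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(errors)), np.mean(errors ** 2)

# Same test values as in Task 1
y_test = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.8, 4.9, 2.7, 7.1])
mae, mse = mae_mse(y_test, y_pred)

# Hypothetical variant: the last prediction is off by 3.0 instead of 0.1
y_pred_outlier = np.array([2.8, 4.9, 2.7, 10.0])
mae_out, mse_out = mae_mse(y_test, y_pred_outlier)

print(f"Original:     MAE={mae:.3f}, MSE={mse:.3f}")
print(f"With outlier: MAE={mae_out:.3f}, MSE={mse_out:.3f}")
print(f"MAE grew {mae_out / mae:.1f}x, MSE grew {mse_out / mse:.1f}x")
```

A single large error raises the MSE by a much larger factor than the MAE, which is why MSE is usually described as the more outlier-sensitive of the two.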


Task 2: Evaluate R2 Score on varying datasets and discuss significance.

In [2]:
# Task 2: Evaluate R2 Score on varying datasets and discuss significance

from sklearn.metrics import r2_score
import numpy as np

# Perfect prediction
y_true1 = np.array([10, 20, 30, 40])
y_pred1 = np.array([10, 20, 30, 40])
r2_1 = r2_score(y_true1, y_pred1)
print(f"Perfect prediction R2 Score: {r2_1:.2f} (Perfect fit)")

# Good prediction
y_true2 = np.array([10, 20, 30, 40])
y_pred2 = np.array([12, 19, 29, 41])
r2_2 = r2_score(y_true2, y_pred2)
print(f"Good prediction R2 Score: {r2_2:.2f} (High, but not perfect)")

# Poor prediction
y_true3 = np.array([10, 20, 30, 40])
y_pred3 = np.array([30, 10, 40, 20])
r2_3 = r2_score(y_true3, y_pred3)
print(f"Poor prediction R2 Score: {r2_3:.2f} (Low or negative)")

# Discussion:
print("\nR2 Score (coefficient of determination) compares the model's squared error with that of always predicting the mean of the actual values.")
print("R2 = 1 means perfect prediction, R2 = 0 means the model predicts no better than the mean, and R2 < 0 means the model is worse than predicting the mean.")

Perfect prediction R2 Score: 1.00 (Perfect fit)
Good prediction R2 Score: 0.99 (High, but not perfect)
Poor prediction R2 Score: -1.00 (Low or negative)

R2 Score (coefficient of determination) compares the model's squared error with that of always predicting the mean of the actual values.
R2 = 1 means perfect prediction, R2 = 0 means the model predicts no better than the mean, and R2 < 0 means the model is worse than predicting the mean.
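
The three reference points for R2 follow from its definition, R2 = 1 - SS_res / SS_tot. The sketch below reproduces the Task 2 scores from that formula and adds a mean-only baseline. The helper name `r2_manual` is an assumption for illustration, and unlike a library implementation it does not handle a constant `y_true` (where SS_tot is zero).

```python
import numpy as np

def r2_manual(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot. Assumes y_true is not constant (SS_tot > 0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # squared error of the model
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # squared error of predicting the mean
    return 1.0 - ss_res / ss_tot

y_true = np.array([10, 20, 30, 40])

r2_good = r2_manual(y_true, [12, 19, 29, 41])
r2_poor = r2_manual(y_true, [30, 10, 40, 20])
# Baseline: always predict the mean of the actual values
r2_mean = r2_manual(y_true, np.full(len(y_true), y_true.mean()))

print(f"Good prediction: {r2_good:.3f}")
print(f"Poor prediction: {r2_poor:.3f}")
print(f"Mean baseline:   {r2_mean:.3f}")
```

The poor predictions have twice the squared error of the mean baseline (SS_res = 1000 against SS_tot = 500), which is exactly what an R2 of -1 expresses.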


Task 3: Use a sample dataset, compute all three metrics, and deduce model performance.

In [3]:
# Task 3: Use a sample dataset, compute all three metrics, and deduce model performance

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import numpy as np

# Sample dataset: actual and predicted values
y_true = np.array([100, 150, 200, 250, 300])
y_pred = np.array([110, 140, 195, 260, 290])

# Compute metrics
mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

print(f"Mean Absolute Error (MAE): {mae:.2f}")
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"R2 Score: {r2:.2f}")

# Deduction (the 0.8 and 0.5 thresholds are illustrative rules of thumb, not universal cutoffs)
if r2 > 0.8:
    print("Model performance is strong: predictions closely match actual values.")
elif r2 > 0.5:
    print("Model performance is moderate: predictions are reasonably close to actual values.")
else:
    print("Model performance is weak: predictions do not match actual values well.")

print("MAE and MSE indicate the average absolute error and the average squared error, respectively. Lower values mean better performance.")

Mean Absolute Error (MAE): 9.00
Mean Squared Error (MSE): 85.00
R2 Score: 0.98
Model performance is strong: predictions closely match actual values.
MAE and MSE indicate the average absolute error and the average squared error, respectively. Lower values mean better performance.
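
MAE and MSE are expressed in the units of the target (MSE in squared units), so "lower" only means something relative to a reference point. One possible way to provide that context, sketched here with NumPy only, is to report the same metrics for a baseline that always predicts the mean. The helper name `regression_report` and the inclusion of RMSE (the square root of MSE) are assumptions added for illustration.

```python
import numpy as np

def regression_report(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R2 as a dict (R2 assumes y_true is not constant)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    mse = np.mean(errors ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "MAE": np.mean(np.abs(errors)),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "R2": 1.0 - np.sum(errors ** 2) / ss_tot,
    }

# Same sample dataset as in Task 3
y_true = np.array([100, 150, 200, 250, 300])
y_pred = np.array([110, 140, 195, 260, 290])

model = regression_report(y_true, y_pred)
# Baseline: always predict the mean of the actual values
baseline = regression_report(y_true, np.full(len(y_true), y_true.mean()))

for name in model:
    print(f"{name:>4}: model={model[name]:8.2f}  baseline={baseline[name]:8.2f}")
```

Read side by side, an MAE of 9 against a baseline MAE of 60 says the same thing as the high R2: the predictions remove most of the error that a mean-only guess would make.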
