# Regression and Classification Metrics

This notebook demonstrates how to calculate common evaluation metrics for both regression and classification tasks using scikit-learn.

In [None]:
# Import necessary libraries
import numpy as np
from sklearn.metrics import (
    r2_score, mean_squared_error, mean_absolute_error,
    accuracy_score, precision_score, recall_score, f1_score, confusion_matrix,
)

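# Dummy data so the notebook runs end to end.
# Replace these with the y_test and y_pred from your own model.
rng = np.random.default_rng(42)  # fixed seed so the results are reproducible

# Regression: continuous targets and predictions in [0, 1)
y_test = rng.random(100)
y_pred = rng.random(100)

# Binary classification: labels drawn from {0, 1}
y_test_clf = rng.integers(0, 2, 100)
y_pred_clf = rng.integers(0, 2, 100)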

## Regression Metrics

### 1. R-Squared (R²)
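R-Squared (the coefficient of determination) is the proportion of variance in the target that the model explains. A value of 1 means a perfect fit, 0 means the model does no better than always predicting the mean, and a negative value means it does worse. Because the dummy data above is random, expect a negative value here.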
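
To make the definition concrete, here is a minimal sketch of the usual formula, R² = 1 - SS_res / SS_tot, written with NumPy. The toy arrays and the helper name `r2_manual` are illustrative assumptions, not part of the notebook's data.

```python
import numpy as np

# Hypothetical toy values, chosen only to keep the arithmetic easy to follow
actual = np.array([3.0, 5.0, 7.0, 9.0])
good_pred = np.array([2.5, 5.0, 7.5, 9.0])
mean_pred = np.full_like(actual, actual.mean())  # always predicts the mean (6.0)
bad_pred = actual[::-1]                          # same values in reversed order

def r2_manual(y_true, y_hat):
    """R^2 = 1 - SS_res / SS_tot, assuming y_true is not constant."""
    ss_res = np.sum((y_true - y_hat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1 - ss_res / ss_tot

print(f"{r2_manual(actual, good_pred):.3f}")  # 0.975
print(f"{r2_manual(actual, mean_pred):.3f}")  # 0.000
print(f"{r2_manual(actual, bad_pred):.3f}")   # -3.000
```

The last line shows that R² is not bounded below by 0: predictions worse than the mean baseline give a negative score.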

In [None]:
r2 = r2_score(y_test, y_pred)
print(f"R-Squared: {r2:.4f}")

### 2. Mean Squared Error (MSE)
MSE is the average of the squared differences between predicted and actual values. Squaring makes large errors count disproportionately, and it puts the result in squared units of the target.
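
The effect of squaring can be seen in a small sketch. The arrays below are made-up values for illustration; doubling every error quadruples the MSE.

```python
import numpy as np

# Hypothetical toy values
actual = np.array([10.0, 20.0, 30.0, 40.0])
predicted = np.array([11.0, 18.0, 33.0, 40.0])

errors = actual - predicted               # [-1, 2, -3, 0]
mse_manual = np.mean(errors ** 2)         # (1 + 4 + 9 + 0) / 4
mse_doubled = np.mean((2 * errors) ** 2)  # same errors, each twice as large

print(mse_manual)   # 3.5
print(mse_doubled)  # 14.0
```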

In [None]:
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error (MSE): {mse:.4f}")

### 3. Mean Absolute Error (MAE)
MAE is the average of the absolute differences between predicted and actual values. It tells you how far off your predictions are on average, in the same units as the target.
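
As a rough illustration of what MAE does and does not capture, the sketch below compares two hypothetical sets of predictions with the same average miss. The helper names `mae_manual` and `mse_manual` and the arrays are assumptions made for this example.

```python
import numpy as np

actual = np.array([10.0, 20.0, 30.0, 40.0])
even_pred = np.array([12.0, 18.0, 32.0, 38.0])   # every prediction off by 2
spiky_pred = np.array([10.0, 20.0, 30.0, 48.0])  # one prediction off by 8

def mae_manual(y_true, y_hat):
    return np.mean(np.abs(y_true - y_hat))

def mse_manual(y_true, y_hat):
    return np.mean((y_true - y_hat) ** 2)

for name, pred in [("even", even_pred), ("spiky", spiky_pred)]:
    print(f"{name}: MAE={mae_manual(actual, pred):.1f}, "
          f"MSE={mse_manual(actual, pred):.1f}")
# even: MAE=2.0, MSE=4.0
# spiky: MAE=2.0, MSE=16.0
```

Both sets have an MAE of 2.0, so MAE alone cannot tell them apart; the squared-error metric reacts to the single large miss.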

In [None]:
mae = mean_absolute_error(y_test, y_pred)
print(f"Mean Absolute Error (MAE): {mae:.4f}")

### 4. Root Mean Squared Error (RMSE)
RMSE is the square root of MSE, which brings the error back to the same units as the target variable. Like MSE, it penalizes large errors more heavily than MAE does.
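
One way to check the point about units is to rescale the target and watch how each metric responds. This is a sketch with hypothetical house prices; the helper `mse_rmse` is an illustrative name, not a library function.

```python
import numpy as np

# Hypothetical house prices in thousands of dollars
actual_k = np.array([200.0, 250.0, 300.0])
pred_k = np.array([210.0, 240.0, 300.0])

def mse_rmse(y_true, y_hat):
    """Return (MSE, RMSE) for two equal-length arrays."""
    mse_value = np.mean((y_true - y_hat) ** 2)
    return mse_value, np.sqrt(mse_value)

mse_k, rmse_k = mse_rmse(actual_k, pred_k)
# Same data expressed in dollars instead of thousands of dollars
mse_usd, rmse_usd = mse_rmse(actual_k * 1000, pred_k * 1000)

print(f"RMSE ratio: {rmse_usd / rmse_k:.0f}")  # 1000
print(f"MSE ratio:  {mse_usd / mse_k:.0f}")    # 1000000
```

RMSE scales linearly with the target, as a quantity in the target's units should, while MSE scales with the square.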

In [None]:
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"Root Mean Squared Error (RMSE): {rmse:.4f}")

## Classification Metrics

### 1. Accuracy
Accuracy is the fraction of all predictions that were correct. It is the most basic classification metric, but it can be misleading when the classes are imbalanced.
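
A short sketch of how accuracy is computed, and of why it deserves care when one class dominates. The label counts are made up: 95 negatives, 5 positives, and a "model" that always predicts the negative class.

```python
import numpy as np

# Hypothetical imbalanced labels: 95 negatives, 5 positives
labels = np.array([0] * 95 + [1] * 5)
always_negative = np.zeros_like(labels)  # predicts class 0 for every sample

accuracy_manual = np.mean(labels == always_negative)
positives_found = np.sum((labels == 1) & (always_negative == 1))

print(accuracy_manual)   # 0.95
print(positives_found)   # 0
```

The score looks strong even though not a single positive case was found.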

In [None]:
accuracy = accuracy_score(y_test_clf, y_pred_clf)
print(f"Accuracy: {accuracy:.4f}")

### 2. Precision
Precision is the fraction of predicted positives that were actually positive: TP / (TP + FP). It matters most when the cost of false positives is high.
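
Counting the cases by hand shows where the number comes from. The labels below are toy values, and the sketch assumes at least one positive prediction so the denominator is not zero.

```python
import numpy as np

# Hypothetical binary labels (1 = positive class)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_hat = np.array([1, 1, 1, 0, 0, 1, 1, 0])

tp = np.sum((y_hat == 1) & (y_true == 1))  # predicted positive, actually positive
fp = np.sum((y_hat == 1) & (y_true == 0))  # predicted positive, actually negative
precision_manual = tp / (tp + fp)

print(tp, fp)            # 3 2
print(precision_manual)  # 0.6
```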

In [None]:
precision = precision_score(y_test_clf, y_pred_clf)
print(f"Precision: {precision:.4f}")

### 3. Recall
Recall is the fraction of actual positives that the model correctly identified: TP / (TP + FN). It matters most when the cost of false negatives is high.
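
The sketch below computes recall by hand and shows why it is usually read together with precision: a hypothetical classifier that labels everything positive reaches perfect recall. The arrays and helper names are assumptions for illustration.

```python
import numpy as np

labels = np.array([1, 0, 1, 1, 0, 0, 1, 0])
model_pred = np.array([1, 1, 1, 0, 0, 1, 1, 0])
all_positive = np.ones_like(labels)  # flags every sample as positive

def recall_manual(y_true, y_hat):
    tp = np.sum((y_hat == 1) & (y_true == 1))
    fn = np.sum((y_hat == 0) & (y_true == 1))  # positives the model missed
    return tp / (tp + fn)

def precision_manual(y_true, y_hat):
    tp = np.sum((y_hat == 1) & (y_true == 1))
    fp = np.sum((y_hat == 1) & (y_true == 0))
    return tp / (tp + fp)

print(recall_manual(labels, model_pred))       # 0.75
print(recall_manual(labels, all_positive))     # 1.0
print(precision_manual(labels, all_positive))  # 0.5
```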

In [None]:
recall = recall_score(y_test_clf, y_pred_clf)
print(f"Recall: {recall:.4f}")

### 4. F1-Score
The F1-score is the harmonic mean of precision and recall. It gives a single number that is high only when both precision and recall are high.
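
To see why the harmonic mean is used rather than a plain average, consider a deliberately lopsided pair of values. The precision and recall below are made-up numbers, not outputs of the notebook.

```python
# Hypothetical scores: perfect precision, very poor recall
precision_toy = 1.0
recall_toy = 0.1

arithmetic_mean = (precision_toy + recall_toy) / 2
f1_manual = 2 * precision_toy * recall_toy / (precision_toy + recall_toy)

print(f"{arithmetic_mean:.4f}")  # 0.5500
print(f"{f1_manual:.4f}")        # 0.1818
```

The harmonic mean stays close to the weaker of the two values, so a model cannot earn a good F1 by excelling at only one of them.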

In [None]:
f1 = f1_score(y_test_clf, y_pred_clf)
print(f"F1-Score: {f1:.4f}")

### 5. Confusion Matrix
A confusion matrix shows the counts of true positives, true negatives, false positives, and false negatives. In scikit-learn, rows are the actual classes and columns are the predicted classes, so for binary labels 0 and 1 the layout is [[TN, FP], [FN, TP]].
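
The matrix can be built by hand in a few lines, which makes the cell positions easier to remember. This is a NumPy sketch on toy labels that follows the rows-are-actual, columns-are-predicted convention; it is not how scikit-learn implements the function internally.

```python
import numpy as np

labels = np.array([1, 0, 1, 1, 0, 0, 1, 0])
preds = np.array([1, 1, 1, 0, 0, 1, 1, 0])

# Rows index the actual class, columns index the predicted class
cm = np.zeros((2, 2), dtype=int)
np.add.at(cm, (labels, preds), 1)  # add 1 to cell (actual, predicted) per sample

# Flattening row by row gives the four counts in this order
tn, fp, fn, tp = cm.ravel()

print(cm)
# [[2 2]
#  [1 3]]
print(tn, fp, fn, tp)  # 2 2 1 3
```

From these four counts, precision is `tp / (tp + fp)` and recall is `tp / (tp + fn)`.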

In [None]:
conf_matrix = confusion_matrix(y_test_clf, y_pred_clf)
print(f"Confusion Matrix:\n{conf_matrix}")