# Performance Metrics in Machine Learning

*  Model performance metrics help assess how well a model is performing and are critical for model selection and tuning.
* The choice of metric depends on the type of task: 
   * Regression,
   * Classification,
   * Unsupervised Learning.

![image.png](attachment:image.png)

## Regression Evaluation Metrics

* In a **regression task**, the goal is to predict a continuous target variable. The following metrics are commonly used to evaluate such models:

   * Mean Absolute Error (MAE)
   * Mean Squared Error (MSE)
   * Root Mean Squared Error (RMSE)
   * Root Mean Squared Logarithmic Error (RMSLE)
   * R² Score

### 1. Mean Absolute Error (MAE)

* Mean Absolute Error (MAE) represents the average absolute difference between the predicted values and the actual values.

* It provides a straightforward interpretation of how far off, on average, the predictions are from the true outcomes.
* However, one limitation of MAE is that it doesn’t indicate whether the model is consistently **under-predicting or over-predicting**, as it considers only the magnitude of the errors, not their direction.

**Formula:**

&nbsp;&nbsp;![image-2.png](attachment:image-2.png)
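
As a minimal sketch of the definition above (the function name and the sample values are illustrative assumptions, not part of the original notes), MAE can be computed in plain Python. The two prediction sets below err in opposite directions, yet MAE reports the same value for both, which illustrates that it ignores the direction of the error.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all samples."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: the model is off by 2 every time, in either direction.
y_true = [10.0, 10.0, 10.0, 10.0]
over_predictions = [12.0, 12.0, 12.0, 12.0]
under_predictions = [8.0, 8.0, 8.0, 8.0]

mae_over = mean_absolute_error(y_true, over_predictions)
mae_under = mean_absolute_error(y_true, under_predictions)
print(mae_over, mae_under)  # both are 2.0
```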

---

### 2. Mean Squared Error (MSE)

* MSE, like MAE, measures the average error between predicted and actual values, but it squares each error before averaging. 

* This means larger errors are penalized more heavily than smaller ones.

* One key advantage of MSE is that it is **differentiable** everywhere (unlike MAE, whose absolute value has no derivative at zero), which makes it easier to optimize with **gradient-based learning** methods.

**Formula:**

&emsp;&emsp;![image.png](attachment:image.png)
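
The effect of squaring can be seen with a small sketch (function name and numbers are assumptions chosen for illustration). Both prediction sets below have the same total absolute error, but the one that concentrates the error in a single sample gets a much higher MSE.

```python
def mean_squared_error(y_true, y_pred):
    """Average of (actual - predicted)^2 over all samples."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [3.0, 5.0, 2.0, 7.0]
small_errors = [2.0, 4.0, 3.0, 6.0]     # every prediction is off by 1
one_large_error = [3.0, 5.0, 2.0, 3.0]  # a single prediction is off by 4

mse_small = mean_squared_error(y_true, small_errors)     # (1 + 1 + 1 + 1) / 4
mse_large = mean_squared_error(y_true, one_large_error)  # (0 + 0 + 0 + 16) / 4
print(mse_small, mse_large)
```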

---

### 3. Root Mean Squared Error (RMSE)

* RMSE is simply the square root of the MSE. 

* It provides an error metric in the same units as the target variable, making it easier to interpret. 

* However, like MSE, it is **sensitive to outliers**, since larger errors contribute more to the final value.

**Formula:**

![image.png](attachment:image.png)
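
A short sketch of RMSE, assuming the same illustrative data as a single large miss of 4 units (names and values are not from the original notes). Taking the square root brings the result back to the units of the target.

```python
import math


def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [3.0, 5.0, 2.0, 3.0]  # one prediction is off by 4

rmse = root_mean_squared_error(y_true, y_pred)  # sqrt(16 / 4)
print(rmse)
```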

---

### 4. Root Mean Squared Logarithmic Error (RMSLE)

* RMSLE is useful when the target variable spans a wide range of values and when under-predictions are more costly than over-predictions, because for the same absolute error it penalizes an under-prediction more heavily than an over-prediction.

* It applies a logarithmic transformation to the predicted and actual values before calculating the RMSE, which reduces the impact of large absolute differences and focuses on the relative error.

* This makes it a good fit when the ratio between prediction and target matters more than their absolute difference.

**Formula:**

&emsp;&emsp;&emsp;&emsp;&emsp;&emsp; ![image.png](attachment:image.png)
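
The sketch below assumes the commonly used form of RMSLE that takes `log(1 + x)` of both values (this form is an assumption, since the formula above is only available as an image). It compares an under-prediction and an over-prediction of the same absolute size.

```python
import math


def rmsle(y_true, y_pred):
    """RMSE computed on log(1 + value); assumes all values are > -1."""
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Same absolute error of 50 in both cases (hypothetical values).
under = rmsle([100.0], [50.0])   # prediction too low
over = rmsle([100.0], [150.0])   # prediction too high
print(under, over)  # the under-prediction receives the larger penalty
```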

---

### 5. R² Score (Coefficient of Determination)

* The R² score quantifies the proportion of variance in the dependent variable that is predictable from the independent variable(s). 

* It evaluates how well the regression model fits the data. An R² of 1 indicates perfect prediction, an R² of 0 indicates the model does no better than always predicting the mean of the target values, and a negative R² indicates it does worse than that baseline.

**Formula:**

&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;![image-2.png](attachment:image-2.png)
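
A minimal sketch, assuming the standard definition R² = 1 - SS_res / SS_tot (the function name and data are illustrative). It reproduces the reference points described above: a perfect model and a model that always predicts the mean.

```python
def r2_score(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares).

    Undefined when all targets are identical (total sum of squares is 0).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
r2_perfect = r2_score(y_true, [1.0, 2.0, 3.0, 4.0])    # exact predictions
r2_mean_only = r2_score(y_true, [2.5, 2.5, 2.5, 2.5])  # always predicts the mean
r2_reversed = r2_score(y_true, [4.0, 3.0, 2.0, 1.0])   # worse than the mean
print(r2_perfect, r2_mean_only, r2_reversed)
```

The last case shows that R² can drop below 0 when the predictions are worse than the mean baseline.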

---

## Classification Metrics

* In classification tasks, the goal is to predict discrete labels for a target variable. To evaluate the performance of such models, the following metrics are commonly used:

  * Accuracy
  * Logarithmic Loss (Log Loss)
  * Area Under the Curve (AUC)
  * Precision
  * Recall
  * F1 Score
  * Confusion Matrix

### 1. Accuracy

* Accuracy is a basic yet widely used metric that indicates how often the model makes correct predictions. It is calculated as:

&emsp; &emsp;&emsp; &emsp;`Accuracy = (Number of Correct Predictions) / (Total Number of Predictions)`

* While accuracy provides a quick snapshot of performance, it can be misleading in cases of class imbalance.

* For example, if 90% of the data belongs to class A and only 10% to class B, a model that always predicts class A still achieves 90% accuracy while failing to recognize class B entirely. In such cases, metrics such as precision, recall, and F1 score provide better insight.
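
The imbalance example can be reproduced with a small sketch (the function name and the synthetic labels are assumptions for illustration). A model that always predicts class A scores 90% accuracy without finding a single class B sample.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# 90 samples of class A, 10 samples of class B.
y_true = ["A"] * 90 + ["B"] * 10
y_pred = ["A"] * 100  # the model always predicts the majority class

acc = accuracy(y_true, y_pred)
class_b_found = sum(1 for t, p in zip(y_true, y_pred) if t == "B" and p == "B")
print(acc, class_b_found)
```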

### 2. Logarithmic Loss (Log Loss)

* Log Loss evaluates the predicted probabilities rather than only the predicted labels: a confident wrong prediction is penalized far more heavily than an uncertain one. It is widely used in multi-class classification problems.

The formula is:

![image.png](attachment:image.png)

Where:

* yᵢⱼ = 1 if sample i belongs to class j, otherwise 0
* pᵢⱼ = predicted probability of sample i being in class j

Log Loss ranges from 0 to ∞. A value closer to 0 indicates better model performance. Minimizing log loss often results in higher overall accuracy.
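
The following sketch assumes the usual multi-class form, the negative mean of Σⱼ yᵢⱼ · log(pᵢⱼ). Because yᵢⱼ is 1 only for the true class, each sample contributes only the log of the probability assigned to its true class. The function name, the clipping constant `eps`, and the probabilities are illustrative assumptions.

```python
import math


def log_loss(y_true, probabilities, eps=1e-15):
    """y_true holds class indices; probabilities holds one list per sample."""
    total = 0.0
    for label, probs in zip(y_true, probabilities):
        # Clip to avoid log(0) when a probability is exactly 0 or 1.
        p_true_class = min(max(probs[label], eps), 1 - eps)
        total += -math.log(p_true_class)
    return total / len(y_true)


# One sample whose true class is 0, scored by two hypothetical models.
loss_confident_right = log_loss([0], [[0.90, 0.05, 0.05]])
loss_confident_wrong = log_loss([0], [[0.05, 0.90, 0.05]])
print(loss_confident_right, loss_confident_wrong)
```

The confidently wrong prediction receives a loss roughly 28 times larger than the confidently right one.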

### 3. Area Under Curve (AUC)

AUC is a performance metric for binary classification problems. It represents the probability that a randomly chosen positive instance is ranked higher than a randomly chosen negative instance.

**Key terms:**

* **True Positive Rate** (TPR / Sensitivity):

  **TPR = TP / (TP + FN)**

* **True Negative Rate** (TNR / Specificity):

  **TNR = TN / (TN + FP)**

* **False Positive Rate** (FPR):

  **FPR = FP / (FP + TN)**

AUC is the area under the **ROC (Receiver Operating Characteristic) curve**, which plots **TPR against FPR** as the classification threshold varies. An AUC of 0.5 corresponds to random ranking, and a higher AUC (closer to 1) signifies a better-performing model.

![image.png](attachment:image.png)
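
The ranking interpretation of AUC can be sketched directly: compare every positive score with every negative score and count how often the positive one is higher, with ties counted as half. This brute-force version is an illustrative assumption rather than how libraries compute it, and the labels and scores are made up.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(y_true, scores)  # 3 of the 4 positive/negative pairs are ordered correctly
print(auc)
```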

### 4. Precision

 * Precision measures the proportion of instances predicted as positive that are actually positive.

&emsp;&emsp;&emsp;&emsp;`Precision = TP / (TP + FP)`

It is especially important when the cost of false positives is high.
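
A minimal sketch of precision computed from labels (the function name, the `positive` parameter, and the data are illustrative assumptions):

```python
def precision(y_true, y_pred, positive=1):
    """TP / (TP + FP); returns 0.0 if nothing was predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]

prec = precision(y_true, y_pred)  # 5 predicted positives, 3 of them correct
print(prec)
```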

### 5. Recall

* Recall is the ratio of correctly predicted positive instances to the total actual positive instances.

* It measures how well the model captures all **relevant positive cases**.


&emsp;&emsp;&emsp;&emsp;`Recall = TP / (TP + FN)` 
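
A matching sketch for recall, again with an illustrative function name and made-up labels:

```python
def recall(y_true, y_pred, positive=1):
    """TP / (TP + FN); returns 0.0 if there are no actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]

rec = recall(y_true, y_pred)  # 4 actual positives, 3 of them found
print(rec)
```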

### 6. F1 Score

* The F1 Score is the harmonic mean of Precision and Recall.

* It provides a balance between the two, especially when there's a trade-off between them.

 &emsp; &emsp; &emsp;![image.png](attachment:image.png)

A higher F1 Score indicates a model that is both precise and robust in identifying positives.
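
The sketch below assumes the standard harmonic-mean form, F1 = 2 · Precision · Recall / (Precision + Recall), with illustrative input values. It also shows why the harmonic mean is used: when precision and recall are far apart, F1 stays close to the lower of the two, while a plain average would hide the weakness.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1_balanced = f1_score(0.6, 0.75)  # precision and recall are close
f1_skewed = f1_score(1.0, 0.1)     # perfect precision, very low recall
arithmetic_mean = (1.0 + 0.1) / 2  # plain average of the skewed pair

print(round(f1_balanced, 3), round(f1_skewed, 3), arithmetic_mean)
```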

### 7. Confusion Matrix

* A Confusion Matrix is a table used to visualize the performance of a classification algorithm.

* For binary classification, it is typically a 2×2 matrix:

|                    | **Predicted: YES** | **Predicted: NO** | **Total (Actual)** |
|--------------------|--------------------|-------------------|---------------------|
| **Actual: YES**    | 100 (TP)           | 5 (FN)            | 105                 |
| **Actual: NO**     | 10 (FP)            | 50 (TN)           | 60                  |
| **Total (Predicted)** | 110              | 55                | **165**             |


![image-2.png](attachment:image-2.png)
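
The earlier metrics can all be derived from the four cells of the table above. The sketch below plugs in the table's counts; the variable names are illustrative.

```python
# Counts taken from the confusion matrix table.
tp, fn = 100, 5
fp, tn = 10, 50

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(total)                # 165
print(round(accuracy, 3))   # 0.909
print(round(precision, 3))  # 0.909
print(round(recall, 3))     # 0.952
print(round(f1, 3))         # 0.93
```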


---

## Unsupervised Learning Evaluation Metrics

* Unsupervised learning involves tasks like clustering, dimensionality reduction, and anomaly detection, where labels are not provided.

* Because of this, traditional metrics like accuracy or precision aren't directly applicable unless ground truth labels are available for validation purposes.

### 1. Silhouette Score

* The Silhouette Score measures how well each data point fits within its assigned cluster compared with other clusters.

* Range: [-1, 1]

  * **1**: The point is well-matched to its own cluster and poorly matched to neighboring clusters.
  * **0**: The point is on or very close to the decision boundary between two clusters.
  * **Negative values**: The point may have been assigned to the wrong cluster.

**Formula:**


&nbsp; &nbsp;` Silhouette Score (s) = (b - a) / max(a, b)`

where:

 &nbsp; &nbsp;`a = average intra-cluster distance (mean distance to other points in the same cluster)`

 &nbsp; &nbsp;`b = average nearest-cluster distance (mean distance to points in the next closest cluster)`
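
A brute-force sketch of the per-point silhouette formula, using Euclidean distance (`math.dist`, Python 3.8+). The function name, the toy points, and the convention of returning 0 for a point alone in its cluster are assumptions made for illustration.

```python
import math


def silhouette_samples(points, labels):
    """Return s = (b - a) / max(a, b) for every point."""
    scores = []
    for i, (point, label) in enumerate(zip(points, labels)):
        # Group distances from this point to all other points by cluster.
        distances = {}
        for j, (other, other_label) in enumerate(zip(points, labels)):
            if i != j:
                distances.setdefault(other_label, []).append(math.dist(point, other))
        own = distances.pop(label, [])
        if not own or not distances:
            scores.append(0.0)  # singleton cluster or only one cluster (assumed convention)
            continue
        a = sum(own) / len(own)                                  # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in distances.values())     # nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_samples(points, [0, 0, 1, 1])  # clusters match the two groups
bad = silhouette_samples(points, [0, 1, 0, 1])   # clusters mix the two groups
print(good, bad)
```

With the sensible labeling every score is close to 1, while the mixed labeling produces negative scores.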

 ---

### 2. Davies–Bouldin Index (DBI)

* DBI measures the average similarity between each cluster and the cluster most similar to it.

* Lower values are better and indicate compact, well-separated clusters.

**Formula:**

&nbsp; &nbsp; `DBI = (1 / n) * Σ_i max_{j ≠ i} ((Si + Sj) / Dij)`

where:

&nbsp; &nbsp; `n = number of clusters`

&nbsp; &nbsp; `Si and Sj = average distance between points in cluster i (or j) and their cluster centroid`

&nbsp; &nbsp; ` Dij = distance between centroids of clusters i and j`
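
A small sketch of the index as defined above, using Euclidean distance and centroids computed as coordinate means. The function name and toy points are illustrative assumptions, and the code expects at least two clusters with distinct centroids.

```python
import math


def davies_bouldin(points, labels):
    """Mean over clusters of the worst-case (Si + Sj) / Dij ratio."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    # Centroid: coordinate-wise mean of the points in each cluster.
    centroids = {
        label: tuple(sum(coord) / len(pts) for coord in zip(*pts))
        for label, pts in clusters.items()
    }
    # Si: average distance from the cluster's points to its centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in pts) / len(pts)
        for label, pts in clusters.items()
    }

    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
dbi_good = davies_bouldin(points, [0, 0, 1, 1])  # compact, far-apart clusters
dbi_bad = davies_bouldin(points, [0, 1, 0, 1])   # stretched, overlapping clusters
print(dbi_good, dbi_bad)
```

The well-separated labeling gives a much lower index than the mixed one, in line with the rule that lower is better.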