# Classification Evaluation Metrics

### Confusion matrix

![image-4.png](attachment:image-4.png)
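The four cells of a binary confusion matrix can be counted directly from the labels. The following is a minimal sketch in plain Python; the helper name `confusion_counts`, the `positive` parameter, and the sample labels are all made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary problem (hypothetical helper)."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, tn, fp, fn


# Illustrative labels, not taken from any real dataset.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)  # 3 3 1 1
```

Every metric in this section can be derived from these four counts.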


### Accuracy
Accuracy measures the proportion of correct predictions out of all the predictions made by the model. A high accuracy score indicates that most predictions are correct, while a low accuracy score indicates that the model is making too many incorrect predictions.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

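Accuracy is useful when the target classes are well balanced. On an imbalanced dataset it can be misleading, because a model that always predicts the majority class still scores high.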
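To see why class balance matters, here is a small sketch of the accuracy formula applied to two sets of confusion counts. The function name `accuracy` and both sets of counts are hypothetical; returning `0.0` for an empty input is an assumed convention.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    # Assumption: report 0.0 when there are no predictions at all.
    return (tp + tn) / total if total else 0.0


# Roughly balanced case: 4 actual positives, 4 actual negatives.
print(accuracy(tp=3, tn=3, fp=1, fn=1))   # 0.75

# Imbalanced case: 95 negatives, 5 positives, and a model that
# always predicts "negative". It never finds a positive instance,
# yet its accuracy looks excellent.
print(accuracy(tp=0, tn=95, fp=0, fn=5))  # 0.95
```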

### Recall
Recall is a performance metric that measures the proportion of positive instances that are correctly identified by a binary classification model out of all the actual positive instances.

Recall = TP / (TP + FN)

Recall is also known as sensitivity or the true positive rate (TPR). A high recall score indicates that the model identifies a large proportion of the actual positive instances, while a low recall score indicates that the model is missing many of them.
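As a quick illustration, recall can be computed from the TP and FN counts alone. This is a sketch with made-up counts; the function name `recall` is hypothetical, and returning `0.0` when there are no actual positives is an assumed convention (the metric is undefined in that case).

```python
def recall(tp, fn):
    """Recall = TP / (TP + FN)."""
    actual_positives = tp + fn
    # Assumption: report 0.0 when there are no actual positives.
    return tp / actual_positives if actual_positives else 0.0


# 4 actual positives, of which the model found 3 and missed 1.
print(recall(tp=3, fn=1))  # 0.75
```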

### Precision
Precision measures the proportion of positive predictions that are actually positive. A high precision score indicates that most of the instances the model labels as positive really are positive, while a low precision score indicates that the model is making too many false positive (FP) predictions.

Precision = TP / (TP + FP)
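The same formula as a short sketch, using made-up counts. The function name `precision` is hypothetical, and returning `0.0` when the model makes no positive predictions is an assumed convention.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP)."""
    predicted_positives = tp + fp
    # Assumption: report 0.0 when nothing was predicted positive.
    return tp / predicted_positives if predicted_positives else 0.0


# The model flagged 10 instances as positive; 8 were correct.
print(precision(tp=8, fp=2))  # 0.8
```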

### F1-Score

The F1-score is the harmonic mean of precision and recall. It takes both false positives and false negatives into account, so it is more informative than accuracy on an imbalanced dataset.
![image-3.png](attachment:image-3.png)
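The harmonic mean is pulled toward the smaller of its two inputs, which a short sketch makes visible. The function name `f1_score` and the input values are illustrative only; returning `0.0` when both inputs are zero is an assumed convention.

```python
def f1_score(precision, recall):
    """F1 = 2 * precision * recall / (precision + recall)."""
    if precision + recall == 0:
        # Assumption: report 0.0 when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical model with perfect recall but mediocre precision.
p, r = 0.5, 1.0
print(round(f1_score(p, r), 3))  # 0.667
print((p + r) / 2)               # 0.75, the arithmetic mean for comparison
```

The F1-score (about 0.667) sits below the arithmetic mean (0.75), so a model cannot reach a high F1 by excelling at only one of the two metrics.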



![image-2.png](attachment:image-2.png)

Even when the data is imbalanced, we can still judge whether the model is working well: TPR and TNR should be high, and FPR and FNR should be as low as possible.
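A sketch of the four rates computed from confusion counts, using the standard definitions. The helper names `safe_ratio` and `rates` and the counts are hypothetical; returning `0.0` for a zero denominator is an assumed convention.

```python
def safe_ratio(numerator, denominator):
    # Assumption: report 0.0 when the denominator is zero.
    return numerator / denominator if denominator else 0.0


def rates(tp, tn, fp, fn):
    """Return TPR, TNR, FPR and FNR as a dictionary."""
    return {
        "TPR": safe_ratio(tp, tp + fn),  # true positive rate (recall)
        "TNR": safe_ratio(tn, tn + fp),  # true negative rate (specificity)
        "FPR": safe_ratio(fp, fp + tn),  # false positive rate
        "FNR": safe_ratio(fn, fn + tp),  # false negative rate
    }


# Imbalanced data (95 negatives, 5 positives) with a model that
# always predicts "negative": accuracy is 0.95, but the rates
# show that every positive instance is missed.
print(rates(tp=0, tn=95, fp=0, fn=5))
# {'TPR': 0.0, 'TNR': 1.0, 'FPR': 0.0, 'FNR': 1.0}
```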


Both precision and recall are crucial in information retrieval, where the positive class matters more than the negative class.

# Regression Evaluation Metrics

### Mean Absolute Error
The mean absolute error, or MAE, is a popular metric for regression problems because the error is expressed in the same units as the target variable. For instance, if the target variable is in USD, the error is also in USD.

![image.png](attachment:image.png)

It is a particularly useful metric when we want every error to contribute in proportion to its size, with no extra penalty for large errors.
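A minimal sketch of the MAE in plain Python, assuming the usual definition (the average of the absolute differences between actual and predicted values). The function name and the sample values are made up for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all samples."""
    absolute_errors = [abs(a - p) for a, p in zip(y_true, y_pred)]
    return sum(absolute_errors) / len(absolute_errors)


# Hypothetical prices in USD.
y_true = [100.0, 150.0, 200.0, 250.0]
y_pred = [110.0, 140.0, 230.0, 250.0]

# Absolute errors are 10, 10, 30 and 0, so the MAE is 50 / 4.
print(mean_absolute_error(y_true, y_pred))  # 12.5 (USD)
```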

### Mean Squared Error
The mean squared error, or MSE, is frequently used when the goal is to penalize larger errors more than smaller ones.

![image-2.png](attachment:image-2.png)

Because the MSE squares the difference between actual and predicted values, large errors contribute disproportionately to the result.
The MSE is expressed in squared units of the target variable, which is rather unintuitive and can often lead to confusion among stakeholders.
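The effect of squaring shows up in a small sketch, assuming the usual definition of the MSE (the average of the squared differences). The function name and sample values are illustrative only.

```python
def mean_squared_error(y_true, y_pred):
    """Average of (actual - predicted)^2 over all samples."""
    squared_errors = [(a - p) ** 2 for a, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical prices in USD.
y_true = [100.0, 150.0, 200.0, 250.0]
y_pred = [110.0, 140.0, 230.0, 250.0]

# Squared errors are 100, 100, 900 and 0: the single error of 30
# accounts for 900 of the 1100 total.
print(mean_squared_error(y_true, y_pred))  # 275.0 (squared USD)
```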

### Root Mean Square Error
The root mean squared error, or RMSE, like the MSE, is a popular metric when the goal is to penalize larger errors more than smaller ones. In addition, it’s a more intuitive metric because its unit is the same as that of the target variable. In other words, if the target variable is in USD, the RMSE is in USD as well.

![image-3.png](attachment:image-3.png)
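A short sketch of the RMSE as the square root of the mean of the squared errors. The function name and the sample values are hypothetical, chosen so the arithmetic is easy to follow.

```python
import math


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared error."""
    squared_errors = [(a - p) ** 2 for a, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical prices in USD.
y_true = [100.0, 150.0, 200.0, 250.0]
y_pred = [101.0, 143.0, 205.0, 245.0]

# Squared errors are 1, 49, 25 and 25, so the mean is 25.
print(root_mean_squared_error(y_true, y_pred))  # 5.0 (USD)
```

For the same data, the mean of the absolute errors (1, 7, 5, 5) is 4.5, so the RMSE of 5.0 is slightly higher because the largest error carries more weight.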

### R-Squared
The R-squared metric, or R², provides an indication of how well a regression model fits the data.
R-squared, also known as the coefficient of determination, represents the proportion of the variance in the target variable that is explained by the predictors.

![image-4.png](attachment:image-4.png)

Here, SSR is the sum of squared residuals and SST is the total sum of squares.
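With SSR and SST defined this way, R-squared is commonly computed as 1 - SSR / SST. The sketch below assumes that formula; the function name `r_squared` and the sample values are made up for illustration, and the target is assumed not to be constant (otherwise SST is zero).

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SSR / SST (assumes y_true is not constant)."""
    mean_y = sum(y_true) / len(y_true)
    # SSR: squared differences between actual and predicted values.
    ssr = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    # SST: squared differences between actual values and their mean.
    sst = sum((a - mean_y) ** 2 for a in y_true)
    return 1 - ssr / sst


y_true = [1.0, 2.0, 3.0, 4.0, 5.0]

# Predictions that are off by 1 on the last sample: SSR = 1, SST = 10.
print(r_squared(y_true, [1.0, 2.0, 3.0, 4.0, 6.0]))

# Always predicting the mean explains none of the variance.
print(r_squared(y_true, [3.0] * 5))  # 0.0
```

The first call gives a value of about 0.9, meaning the predictions account for roughly 90% of the variance in the target.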


## Least Squares Method

