### Measuring the success of a model 

##### Recall

- Measure of how well a model identifies all relevant instances of a particular class
- Recall is seen as the percentage of actual positive cases that the model correctly predicted as positive

$$\text{Recall} = \frac{TP}{TP + FN}$$

- TP (True Positives) are the instances correctly identified as positive by the model
- FN (False Negatives) are the instances that are positive but incorrectly identified as negative by the model 
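
As a minimal sketch, recall can be computed directly from the two counts in the formula. The function name `recall` and the counts used (50 true positives, 10 false negatives) are illustrative assumptions, and returning 0.0 when there are no actual positives is a convention chosen here rather than part of the definition.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that the model predicted as positive."""
    actual_positives = tp + fn
    if actual_positives == 0:
        # Recall is undefined without any actual positives; 0.0 is an assumed convention.
        return 0.0
    return tp / actual_positives


# Illustrative counts: 50 positives found, 10 positives missed.
print(round(recall(tp=50, fn=10), 3))  # 0.833
```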

##### Precision

- Measure of the accuracy of the positive predictions made by the model.
- It tells you how many of the instances that the model predicted as positives are actually positives

$$\text{Precision} = \frac{TP}{TP + FP}$$

- TP is the number of positive instances correctly identified by the model 
- FP is the number of negative instances that the model incorrectly identified as positive
- A high precision means that when the model predicts a positive result, it is likely to be correct
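
The sketch below counts TP and FP from label lists and applies the precision formula. The function name `precision`, the 0/1 label encoding and the sample labels are assumptions made for illustration, as is returning 0.0 when the model predicts no positives.

```python
def precision(y_true, y_pred) -> float:
    """Fraction of predicted positives that are actually positive (labels are 0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fp == 0:
        # The model predicted no positives; 0.0 is an assumed convention.
        return 0.0
    return tp / (tp + fp)


# Made-up labels: the model predicts four positives, three of which are right.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 1, 0, 0]
print(precision(y_true, y_pred))  # 0.75
```

Note that the missed positive in the last position does not affect precision; it would lower recall instead.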

##### F1 score

- The F1 score is the harmonic mean of precision and recall, and is often used to measure a model's performance when the classes are imbalanced.

$$\text{F1 Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

- The F1 score ranges from 0 to 1, where 1 is the best possible score and 0 is the worst 
- A high F1 score indicates that the model has a good balance of precision and recall 
- The metric is especially useful when you need a single measure to compare models
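
To illustrate what "a good balance of precision and recall" means in practice, here is a small sketch of the formula. The function name `f1_score` and the input values are illustrative assumptions; returning 0.0 when both inputs are zero is a convention chosen here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Combine precision and recall into a single score between 0 and 1."""
    if precision + recall == 0:
        # Avoid dividing by zero; 0.0 is an assumed convention.
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


# Equal precision and recall give an F1 equal to both.
print(round(f1_score(0.8, 0.8), 3))  # 0.8
# A large gap pulls F1 toward the lower value; a plain average would be 0.55.
print(round(f1_score(1.0, 0.1), 3))  # 0.182
```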

##### Confusion matrix

- Used in machine learning to visualise the performance of a classification algorithm
- It's useful for understanding how well a model is performing in classifying different categories
- TP: Number of positive instances correctly classified as positive
- FN: Number of positive instances incorrectly classified as negative
- FP: Number of negative instances incorrectly classified as positive
- TN: Number of negative instances correctly classified as negative

<br>

Summary view

|                     | **Predicted Positive** | **Predicted Negative** |
|---------------------|------------------------|------------------------|
| **Actual Positive** | True Positive (TP)     | False Negative (FN)    |
| **Actual Negative** | False Positive (FP)    | True Negative (TN)     |


<br>

With numbers

|                     | **Predicted Positive** | **Predicted Negative** |
|---------------------|------------------------|------------------------|
| **Actual Positive** | TP: 50 (50%)           | FN: 10 (10%)           |
| **Actual Negative** | FP: 5 (5%)             | TN: 35 (35%)           |
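
The counts in the table above can be reproduced from raw labels. The following is a minimal sketch; the function name `confusion_counts` and the way the 100 example rows are rebuilt from the table are assumptions for illustration, not a reference implementation.

```python
from collections import Counter


def confusion_counts(y_true, y_pred) -> dict:
    """Count TP, FN, FP and TN for binary labels (1 = positive, 0 = negative)."""
    names = {(1, 1): "TP", (1, 0): "FN", (0, 1): "FP", (0, 0): "TN"}
    counts = Counter(names[(t, p)] for t, p in zip(y_true, y_pred))
    # Report every cell, including cells that never occurred.
    return {name: counts.get(name, 0) for name in ("TP", "FN", "FP", "TN")}


# Rebuild the 100 example rows from the table as (actual, predicted) pairs.
pairs = [(1, 1)] * 50 + [(1, 0)] * 10 + [(0, 1)] * 5 + [(0, 0)] * 35
y_true = [actual for actual, _ in pairs]
y_pred = [predicted for _, predicted in pairs]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 50, 'FN': 10, 'FP': 5, 'TN': 35}
```

All of the other metrics in these notes that are defined with TP, FN, FP and TN can be derived from these four counts.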


##### AUC

AUC, the area under the curve, refers to the area under the receiver operating characteristic (ROC) curve.

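1. TPR
- The ROC curve is a graphical plot that illustrates the diagnostic ability of a binary classifier as its classification threshold is varied
- It plots the TPR (true positive rate) against the FPR (false positive rate)
- The TPR measures how many actual positives are correctly identified; it is the same quantity as recall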

$$ TPR = \frac{TP}{TP+FN} $$ 

2. FPR
- It measures how many actual negatives are incorrectly identified as positive

$$ FPR = \frac{FP}{FP+TN} $$

3. AUC 
- Measures the entire two-dimensional area underneath the ROC curve
- Provides an aggregate measure of performance across all possible classification thresholds
- AUC ranges from 0 to 1
- AUC is used as a summary of the model's ability to distinguish between positive and negative classes: the higher the AUC, the better the model is at predicting the right classes
- It measures the quality of the model's predictions irrespective of what classification threshold is chosen
- AUC is good for evaluating classifiers on imbalanced datasets, where accuracy can be misleading
- A model that perfectly separates the positive and negative classes will have an AUC of 1.0
- A model with an AUC less than 0.5 is performing worse than random guessing
- A model with an AUC of about 0.5 is essentially making random guesses: it is no more likely to score a positive sample above a negative one than the reverse
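
The three steps above (TPR, FPR, then the area) can be sketched in plain Python. The function names `roc_points` and `auc`, the sample labels and the scores are illustrative assumptions; the sketch uses the trapezoidal rule for the area and assumes the data contains at least one positive and one negative instance.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per threshold, starting at (0, 0)."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Lowering the threshold step by step lets more instances count as positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up labels and predicted scores; one negative (0.7) outranks one positive (0.6).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auc(roc_points(y_true, scores)), 3))  # 0.889
```

With perfectly separated scores, for example `[0.9, 0.8, 0.3, 0.1]` for labels `[1, 1, 0, 0]`, the same functions return an area of 1.0.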



##### ROC_AUC

- It is used as a summary of the model's ability to distinguish between the positive and negative classes: the higher the AUC, the better the model is at predicting 0s as 0s and 1s as 1s
- It measures the quality of the model's predictions irrespective of which classification threshold is chosen
- AUC evaluates a model's performance across all possible thresholds, whereas accuracy evaluates performance at a specific threshold
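
The contrast between AUC and accuracy can be seen on a small made-up example. In this sketch the function names `accuracy_at` and `pairwise_auc`, the labels, the scores and the thresholds are all assumptions for illustration. AUC is computed here through its equivalent pairwise-ranking interpretation (the share of positive/negative pairs that are ordered correctly, with ties counting half) rather than by integrating the ROC curve.

```python
def accuracy_at(threshold, y_true, scores):
    """Accuracy when scores at or above the threshold are labelled positive."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(y_true, predictions) if y == p)
    return correct / len(y_true)


def pairwise_auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# Accuracy changes with the threshold: prints 0.667, 0.833, 0.667.
for threshold in (0.3, 0.5, 0.85):
    print(threshold, round(accuracy_at(threshold, y_true, scores), 3))

# AUC is a single number for the whole set of scores.
print("AUC", round(pairwise_auc(y_true, scores), 3))  # AUC 0.889
```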


##### Log loss

- A performance metric used in classification models 
- The form given here is for binary models
- It measures the uncertainty of the probabilities predicted by a model with respect to the true labels 

$$ \text{Log Loss} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(p_i) + (1-y_i)\log(1-p_i)\right] $$

- Here N is the number of instances, y_i is the true label (0 or 1) and p_i is the predicted probability of the positive class for instance i
- Log loss penalises both types of errors, but especially those that are confident and wrong
- A perfect model would have a log loss of 0
- Log loss takes the uncertainty of the predictions into account, which means it uses the predicted probabilities rather than the final class labels
- A model that is more confident in its wrong predictions will be penalised more heavily
- Log loss is particularly useful when you need a measure of how far off the predictions are in terms of the probability, not just in terms of the final classification.
- It is widely used in probabilistic models, where understanding the uncertainty of predictions is important.
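
A short sketch shows how the penalty grows for confident mistakes. The function name `log_loss`, the clipping constant `eps` and the sample probabilities are assumptions made for illustration; the natural logarithm is used, and probabilities are clipped so that a prediction of exactly 0 or 1 does not produce `log(0)`.

```python
import math


def log_loss(y_true, probs, eps=1e-15):
    """Mean binary log loss; probs are predicted probabilities of the positive class."""
    total = 0.0
    for y, p in zip(y_true, probs):
        # Clip to avoid log(0) when a prediction is exactly 0 or 1.
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


y_true = [1, 0, 1]
print(round(log_loss(y_true, [0.9, 0.1, 0.8]), 3))   # 0.145, confident and right
print(round(log_loss(y_true, [0.6, 0.4, 0.6]), 3))   # 0.511, hesitant but right
print(round(log_loss(y_true, [0.9, 0.1, 0.01]), 3))  # 1.605, one confident mistake
```

All three sets of probabilities in the first two lines give the same class labels at a 0.5 threshold, yet the log loss differs, which is the sense in which it measures probabilities rather than final classifications.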

##### Accuracy

The formula for accuracy is:

$$ Accuracy = \frac{\text {Number of Correct Predictions}}{\text {Total number of predictions}} $$

- Accuracy ranges from 0 to 1, where 1 indicates perfect accuracy and 0 indicates that every prediction is wrong
- An accuracy of 0.8 means that 80% of the predictions made by the model are correct
- Good for general purposes and when the classes are balanced and the costs of false positives and false negatives are roughly the same
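
The sketch below applies the accuracy formula and shows why balanced classes matter. The function name `accuracy` and both sets of labels are made up for illustration.

```python
def accuracy(y_true, y_pred) -> float:
    """Number of correct predictions divided by the total number of predictions."""
    correct = sum(1 for y, p in zip(y_true, y_pred) if y == p)
    return correct / len(y_true)


# Balanced example: 8 of 10 predictions are correct.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
print(accuracy(y_true, y_pred))  # 0.8

# Imbalanced example: 95 negatives, 5 positives, and a model that always predicts negative.
y_true_imbalanced = [0] * 95 + [1] * 5
y_pred_imbalanced = [0] * 100
print(accuracy(y_true_imbalanced, y_pred_imbalanced))  # 0.95
```

The second model never finds a single positive instance, yet scores 0.95, which is why accuracy alone can be misleading when one class dominates.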