# 1. Confusion Matrix

### Example 1
![confusion_matrix.PNG](attachment:confusion_matrix.PNG)

### Example 2
![confusion_matrix2.PNG](attachment:confusion_matrix2.PNG)

### Type 1 and Type 2 Errors
- Sometimes in the literature, you'll see __False Positives__ and __False Negatives__ as Type 1 and Type 2 errors. Here is the correspondence:

1. __Type 1 Error__ (Error of the first kind, or False Positive): In the medical example, this is when we misdiagnose a healthy patient as sick.
2. __Type 2 Error__ (Error of the second kind, or False Negative): In the medical example, this is when we misdiagnose a sick patient as healthy.

### Understanding the Confusion Matrix
1. __True Positive (TP)__ is the number of correct predictions that an example is positive. This means a positive example is correctly identified as positive.
`Example: The actual class is spam and the classifier correctly predicts spam.`
2. __False Negative (FN)__ is the number of incorrect predictions that an example is negative. This means a positive example is incorrectly identified as negative.
`Example: The actual class is spam, but the classifier incorrectly predicts non-spam.`
3. __False Positive (FP)__ is the number of incorrect predictions that an example is positive. This means a negative example is incorrectly identified as positive.
`Example: The actual class is non-spam, but the classifier incorrectly predicts spam.`
4. __True Negative (TN)__ is the number of correct predictions that an example is negative. This means a negative example is correctly identified as negative.
`Example: The actual class is non-spam and the classifier correctly predicts non-spam.`
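
The four cells can be counted directly from a list of actual labels and a list of predicted labels. The following is a minimal sketch in plain Python; the function name `confusion_counts` and the sample labels are hypothetical and used only for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FN, FP, TN) for binary labels."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive and predicted == positive:
            tp += 1  # positive correctly identified as positive
        elif actual == positive:
            fn += 1  # positive incorrectly identified as negative
        elif predicted == positive:
            fp += 1  # negative incorrectly identified as positive
        else:
            tn += 1  # negative correctly identified as negative
    return tp, fn, fp, tn


# Hypothetical labels: 1 = spam, 0 = non-spam
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(tp, fn, fp, tn)  # 3 1 1 3
```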

# 2. Accuracy

- __Accuracy__ is the proportion of the total number of predictions that are correct.
##  $Accuracy = \frac{Correctly \ classified \ points}{All \ points}$

![accuracy.PNG](attachment:accuracy.PNG)

![accuracy2.PNG](attachment:accuracy2.PNG)

### Question: When does Accuracy not work?

### Example: Model 1: All transactions are good
![problem_w_accuracy.PNG](attachment:problem_w_accuracy.PNG)

### Example: Model 2: All transactions are fraudulent

![problem_w_accuracy2.PNG](attachment:problem_w_accuracy2.PNG)
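
To see why accuracy can mislead on imbalanced data, here is a small sketch with made-up numbers (990 good and 10 fraudulent transactions; these counts are an assumption and are not taken from the figures). A model that labels everything as good reaches very high accuracy while catching no fraud at all.

```python
def accuracy(y_true, y_pred):
    """Proportion of predictions that match the actual labels."""
    correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
    return correct / len(y_true)


# Hypothetical imbalanced dataset: 0 = good, 1 = fraudulent
y_true = [0] * 990 + [1] * 10

all_good = [0] * 1000   # Model 1: every transaction predicted good
all_fraud = [1] * 1000  # Model 2: every transaction predicted fraudulent

print(accuracy(y_true, all_good))   # 0.99, yet no fraud is caught
print(accuracy(y_true, all_fraud))  # 0.01, yet all fraud is caught
```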

# 3. False Negatives & False Positives

__Question:__ In the medical example, what is worse, a False Positive, or a False Negative?
<br> __Answer:__ A False Positive implies sending a healthy person to get more tests. This is slightly inconvenient, but ok. A False Negative implies sending a sick person home, which can be disastrous!

![medical.png](attachment:medical.png)

__Question:__ In the email example, what is worse, a False Positive, or a False Negative?
<br> __Answer:__ A False Negative implies a spam message will make its way into your inbox. This is slightly inconvenient, but ok. A False Positive implies missing an e-mail from your dear grandma, which can be disastrous!

![email.png](attachment:email.png)

# 4. Precision and Recall

![precision_recall.PNG](attachment:precision_recall.PNG)

![confusionMatrxiUpdated.jpg](attachment:confusionMatrxiUpdated.jpg)

## A. Precision

- __Precision__ is the ratio of the number of correctly classified positive examples to the total number of predicted positive examples. It shows how correct the positive predictions are.
##  $Precision = \frac{TP}{TP + \ FP}$

![precision_1.PNG](attachment:precision_1.PNG)

![precision2.PNG](attachment:precision2.PNG)
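
As a quick illustration of the precision formula, here is a minimal sketch. The counts are hypothetical, and returning 0.0 when there are no positive predictions is a convention assumed here, since the ratio is undefined in that case.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP)."""
    if tp + fp == 0:
        return 0.0  # assumption: treat "no positive predictions" as 0.0
    return tp / (tp + fp)


# Hypothetical counts: 30 emails correctly flagged as spam, 10 flagged by mistake
print(precision(tp=30, fp=10))  # 0.75
```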

## B. Recall

- __Sensitivity__ is also referred to as __True Positive Rate__ or __Recall__. It measures the proportion of actual positive examples that the classifier labels as positive. Higher is better.
- For instance, the proportion of spam emails flagged as spam among all spam emails.
##  $Recall = \frac{TP}{TP + \ FN}$

![recall1.PNG](attachment:recall1.PNG)

![recall2.PNG](attachment:recall2.PNG)
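
The recall formula can be sketched the same way. The counts below are hypothetical, and returning 0.0 when there are no actual positives is an assumed convention for the undefined case.

```python
def recall(tp, fn):
    """Recall = TP / (TP + FN)."""
    if tp + fn == 0:
        return 0.0  # assumption: treat "no actual positives" as 0.0
    return tp / (tp + fn)


# Hypothetical counts: 30 spam emails caught, 20 spam emails missed
print(recall(tp=30, fn=20))  # 0.6
```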

# 5. F1 Score

- __F1 score__ combines __recall (sensitivity)__ and __precision__ into a single number (their harmonic mean). It is a good choice when you want to balance Precision and Recall.


![f%20score.jpg](attachment:f%20score.jpg)

![f1_score.PNG](attachment:f1_score.PNG)

### Question: Why is the simple average of recall and precision not a good idea?

### Example: Model 1: All transactions are good

![f1_1.PNG](attachment:f1_1.PNG)

### Example: Model 2: All transactions are fraudulent

![f1_2.PNG](attachment:f1_2.PNG)

### F1 score = Harmonic Mean of Precision and Recall

![hm.PNG](attachment:hm.PNG)

### Example: Model 1: All transactions are good

![hm_1.PNG](attachment:hm_1.PNG)

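- __Notice__: The F1 score is lower than the arithmetic average of Recall and Precision because the harmonic mean is pulled toward the smaller of the two values. A model cannot score well by maximizing only one of them, which makes F1 the better metric here.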
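
The difference between the two averages can be checked numerically. This sketch uses hypothetical values (precision 0.01, recall 1.0, roughly what a "flag everything" model would get on very imbalanced data); they are not taken from the figures.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumption: define F1 as 0.0 when both are zero
    return 2 * precision * recall / (precision + recall)


# Hypothetical values for a model that flags every transaction
p, r = 0.01, 1.0

arithmetic_mean = (p + r) / 2
harmonic_mean = f1_score(p, r)

print(round(arithmetic_mean, 4))  # 0.505
print(round(harmonic_mean, 4))    # 0.0198
```

The arithmetic mean looks respectable even though the precision is terrible, while the harmonic mean stays close to the smaller value.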

# 6. F-beta Score

- The __F-beta score__ is the weighted harmonic mean of precision and recall, reaching its optimal value at 1 and its worst value at 0

![F_beta_formula.PNG](attachment:F_beta_formula.PNG)

### Understanding F-beta score
- When beta=1, the F-beta score is equivalent to the F-1 Score.
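- When beta=0.5, this score is the F-0.5 score, which weights precision more heavily. When beta=2, it is the F-2 score, which weights recall more heavily, and so on.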

![F_beta.PNG](attachment:F_beta.PNG)
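
A small sketch of the standard F-beta formula, $F_\beta = (1 + \beta^2) \cdot \frac{Precision \cdot Recall}{\beta^2 \cdot Precision + Recall}$, shows how beta shifts the balance. The precision and recall values are hypothetical.

```python
def fbeta_score(precision, recall, beta):
    """Weighted harmonic mean of precision and recall."""
    beta_sq = beta ** 2
    denominator = beta_sq * precision + recall
    if denominator == 0:
        return 0.0  # assumption: define the score as 0.0 when both are zero
    return (1 + beta_sq) * precision * recall / denominator


# Hypothetical model with low precision and high recall
p, r = 0.5, 1.0

for beta in (0.5, 1, 2):
    print(beta, round(fbeta_score(p, r, beta), 4))
# 0.5 0.5556  (closer to precision)
# 1 0.6667    (F1)
# 2 0.8333    (closer to recall)
```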

### Question: Out of the following three models, which one should use a beta of 2, 1, and 0.5 in its F-beta score? Match each model with its corresponding beta.

1. Detecting malfunctioning parts in a spaceship
- For the spaceship model, we can't really afford any malfunctioning parts, and it's ok if we overcheck some of the parts that are working well. Therefore, this is a high recall model, so we associate it with beta = 2.

2. Sending phone notifications about videos a user may like
- For the notifications model, since it's free to send them, we won't get harmed too much if we send them to more people than we need to. But we also shouldn't overdo it, since it will annoy the users. We also would like to find as many interested users as we can. Thus, this is a model which should have a decent precision and a decent recall. Beta = 1 should work here.

3. Sending promotional material in the mail to potential clients
- For the promotional material model, since it costs us to send the material, we really don't want to send it to many people who won't be interested. Thus, this is a high precision model, so beta = 0.5 will work here.

![Boundaries_of_F_beta_score.PNG](attachment:Boundaries_of_F_beta_score.PNG)

# 7. Specificity

- __Specificity__ is also known as the True Negative Rate. It measures the proportion of actual negative examples that the classifier labels as negative. Higher is better. For instance, the proportion of non-spam emails classified as non-spam among all non-spam emails.
## $Specificity = \frac{TN}{TN + \ FP}$
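
A minimal sketch of the specificity formula, with hypothetical counts. Returning 0.0 when there are no actual negatives is an assumed convention for the undefined case.

```python
def specificity(tn, fp):
    """Specificity = TN / (TN + FP)."""
    if tn + fp == 0:
        return 0.0  # assumption: treat "no actual negatives" as 0.0
    return tn / (tn + fp)


# Hypothetical counts: 90 non-spam emails kept, 10 non-spam emails flagged as spam
print(specificity(tn=90, fp=10))  # 0.9
```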

# 8. Receiver Operating Characteristic (ROC) Curve

![Roc1.PNG](attachment:Roc1.PNG)

### Example:
- Consider one split (say, a good split) and compute its TPR and FPR

![roc2.PNG](attachment:roc2.PNG)

- We will move the divider around the line and try to find the __True Positive Rate__ and __False Positive Rate__ for every possible split.

![roc3.PNG](attachment:roc3.PNG)

- Now, we try to plot this: __TPR as a function of FPR__
- Then, we calculate the area under this curve, which is called the __AUC (Area Under the Curve)__. Here, AUC = 0.8
![roc_full.PNG](attachment:roc_full.PNG)
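
The procedure above (try every split, record TPR and FPR, then measure the area) can be sketched in plain Python. The labels and scores are hypothetical and do not correspond to the figures, so the resulting AUC differs from 0.8; the area is approximated with the trapezoidal rule.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per candidate split."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Move the divider from the highest score to the lowest;
    # everything with score >= threshold is predicted positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under (x, y) points ordered by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels (1 = positive) and model scores
y_true = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]

print(round(area_under_curve(roc_points(y_true, scores)), 4))  # 0.8889
```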

### Question: What happens when the split is Perfect?
- AUC = 1
![perfect_split.PNG](attachment:perfect_split.PNG)

### Question: What happens when the split is Random (Bad)?
- AUC = 0.5
![random_split.PNG](attachment:random_split.PNG)

### Summarizing
![roc_summary.PNG](attachment:roc_summary.PNG)

__Conclusion:__
- The closer the area under the ROC curve is to 1, the better the model is.

### Question: Can the area under the ROC curve be less than 0.5?
- Yes, it can go all the way down to zero. An AUC of 0 corresponds to a perfect split with the labels flipped: every negative point is ranked above every positive point.
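
One way to see this is through the pairwise interpretation of AUC: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The sketch below uses that interpretation on hypothetical scores; the function name `pairwise_auc` is made up for illustration.

```python
def pairwise_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical perfectly separated scores
y_true = [0, 0, 0, 1, 1, 1]
perfect = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
flipped = [1 - s for s in perfect]  # same split, ranking reversed

print(pairwise_auc(y_true, perfect))  # 1.0
print(pairwise_auc(y_true, flipped))  # 0.0
```

In practice, a model with AUC well below 0.5 can be turned into a useful one by inverting its predictions.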

# 9. Regression Metrics

## A. Mean Absolute Error

- MAE is calculated by taking the mean of the absolute values of the errors.
- Error = Real output minus Predicted output
- MAE has the drawback that the absolute value is not differentiable at zero, which can be problematic if we want to use methods such as __Gradient Descent__
![MAE.PNG](attachment:MAE.PNG)

![MAE_1.PNG](attachment:MAE_1.PNG)
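
A minimal sketch of the MAE calculation in plain Python. The sample values are hypothetical and are not taken from the figures.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean of |real output - predicted output|."""
    errors = [abs(actual - predicted) for actual, predicted in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


# Hypothetical real and predicted outputs
y_true = [3, 5, 2, 7]
y_pred = [2.5, 5, 4, 8]

print(mean_absolute_error(y_true, y_pred))  # 0.875
```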

## B. Mean Squared Error

- MSE is calculated by summing the squared prediction errors (real output minus predicted output) and dividing by the number of data points. It gives a single non-negative number, in squared units of the output, that measures how much your predictions deviate from the actual values.
![MSE_formula.png](attachment:MSE_formula.png)

![MSE.PNG](attachment:MSE.PNG)

## C. RMSE

- __Root Mean Square Error (RMSE)__ is the square root of MSE. It is often preferred over MSE because:
- Firstly, the MSE value can sometimes be too big to compare easily.
- Secondly, MSE is calculated from the squared errors, so taking the square root brings it back to the same units as the prediction error and makes it easier to interpret.
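
The relationship between MSE and RMSE can be sketched as follows, using hypothetical values. Note how RMSE is back on the scale of the original errors.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Mean of (real output - predicted output) squared."""
    squared = [(actual - predicted) ** 2 for actual, predicted in zip(y_true, y_pred)]
    return sum(squared) / len(squared)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the MSE, in the same units as the output."""
    return math.sqrt(mean_squared_error(y_true, y_pred))


# Hypothetical real and predicted outputs
y_true = [3, 5, 2, 7]
y_pred = [2.5, 5, 4, 8]

print(mean_squared_error(y_true, y_pred))                 # 1.3125
print(round(root_mean_squared_error(y_true, y_pred), 4))  # 1.1456
```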

## D. R-Squared

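- __R Squared__ measures how much of the variability in the dependent variable can be explained by the model. In simple linear regression it equals the square of the Correlation Coefficient (R), which is why it is called R Square.
- R Square is calculated as 1 minus the ratio of the sum of squared prediction errors to the total sum of squares, where the total sum of squares replaces the prediction with the mean of the actual values. For a linear model with an intercept evaluated on its training data, R Square lies between 0 and 1, and a bigger value indicates a better fit between prediction and actual value. It can be negative when the model fits worse than always predicting the mean.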

![R_squared.png](attachment:R_squared.png)

![R2_eg.PNG](attachment:R2_eg.PNG)

![R2_python.PNG](attachment:R2_python.PNG)
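
For reference, here is a library-free sketch of the R Squared calculation, $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$. The function name `r_squared` and the sample values are hypothetical; this is an illustration of the formula, not necessarily the code shown in the figure.

```python
def r_squared(y_true, y_pred):
    """1 - (sum of squared errors) / (total sum of squares around the mean)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((actual - predicted) ** 2 for actual, predicted in zip(y_true, y_pred))
    ss_tot = sum((actual - mean_true) ** 2 for actual in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical real and predicted outputs
y_true = [3, 5, 2, 7]
y_pred = [2.5, 5, 4, 8]

print(round(r_squared(y_true, y_pred), 4))  # 0.6441

# Always predicting the mean of y_true gives exactly 0
mean_only = [sum(y_true) / len(y_true)] * len(y_true)
print(r_squared(y_true, mean_only))  # 0.0

# A model worse than the mean gives a negative value
print(r_squared(y_true, [7, 2, 5, 3]) < 0)  # True
```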

## E. Adjusted R-Squared

- __R Square__ is a good measure of how well the model fits the dependent variable. However, it does not account for overfitting. If your regression model has many independent variables, it may be too complex: it can fit the training data very well but perform badly on test data. Adjusted R Square addresses this by penalizing additional independent variables added to the model, so the metric does not improve merely because more variables were added.

![Adj_r2_formula.PNG](attachment:Adj_r2_formula.PNG)
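
The penalty can be illustrated with the commonly used formula $Adjusted \ R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$, where $n$ is the number of data points and $p$ the number of independent variables. The sketch below assumes this form of the formula, and the numbers are hypothetical.

```python
def adjusted_r_squared(r2, n_samples, n_features):
    """Adjusted R Square for n_samples data points and n_features independent variables."""
    if n_samples - n_features - 1 <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical: the same R Square of 0.90 on 50 data points,
# reached with 5 versus 20 independent variables
print(f"{adjusted_r_squared(0.90, 50, 5):.4f}")   # 0.8886
print(f"{adjusted_r_squared(0.90, 50, 20):.4f}")  # 0.8310
```

With the same R Square, the model that needed more independent variables receives the lower adjusted score.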