# Classification Extras

- In this notebook, we look at some extra topics that help you evaluate a classification model and judge how well it performs.
    1. Confusion Matrix
    2. Calculate two rates
    3. Accuracy Paradox
    4. CAP Curve
    5. CAP Curve Analysis

## Confusion Matrix

<img src="../static/confusion_matrix.png" alt="confusion_matrix.png" width="500">

- What does the 🟧 orange box denote?
    - **True Negatives**: the actual value is *False* and we correctly predicted *False*.
- What does the 🟦 blue box denote?
    - **True Positives**: the actual value is *True* and we correctly predicted *True*.
- What does the 🟪 purple box denote?
    - **False Positives**: the actual value is *False*, but we predicted *True*.
    - It is also known as a **Type I Error**.
- What does the 🟨 yellow box denote?
    - **False Negatives**: the actual value is *True*, but we predicted *False*.
    - It is also known as a **Type II Error**.
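
The four boxes can be counted directly from paired actual and predicted labels. The following is a minimal sketch in plain Python; the function name `confusion_counts` and the toy labels are made up for illustration and are not part of the notebook's dataset.

```python
def confusion_counts(actual, predicted):
    """Return (TN, FP, FN, TP) for two equal-length sequences of booleans."""
    tn = fp = fn = tp = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1          # actual True, predicted True
        elif a and not p:
            fn += 1          # actual True, predicted False (Type II Error)
        elif not a and p:
            fp += 1          # actual False, predicted True (Type I Error)
        else:
            tn += 1          # actual False, predicted False
    return tn, fp, fn, tp


# Hypothetical toy labels, for illustration only
actual    = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

tn, fp, fn, tp = confusion_counts(actual, predicted)
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```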

## Calculate two rates

We can perform a simple calculation on the confusion matrix to find the rate of correct predictions (Accuracy Rate) and the rate of wrong predictions (Error Rate).

- From the confusion matrix above we can see,
    ```
    Total data (🟧 + 🟦 + 🟪 + 🟨) = 35 + 50 + 5 + 10 = 100
    ```
- Accuracy Rate :-
    ```
    AR = No. of Correct Predictions / Total
    AR = (🟧 + 🟦) / 100 
    AR = (35 + 50) / 100 
    AR = 0.85 
    AR = 85%
    ```
- Error Rate :- 
    ```
    ER = No. of Wrong Predictions / Total
    ER = (🟪 + 🟨) / 100 
    ER = (5 + 10) / 100 
    ER = 0.15
    ER = 15%
    ```
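
The same arithmetic can be written as a small helper. This is only a sketch: the function name `rates` is hypothetical, and the counts are the ones read from the confusion matrix above.

```python
def rates(tn, tp, fp, fn):
    """Return (accuracy rate, error rate) from the four confusion-matrix counts."""
    total = tn + tp + fp + fn
    accuracy = (tn + tp) / total   # correct predictions / total
    error = (fp + fn) / total      # wrong predictions / total
    return accuracy, error


# Counts from the confusion matrix above: 🟧=35, 🟦=50, 🟪=5, 🟨=10
accuracy_rate, error_rate = rates(tn=35, tp=50, fp=5, fn=10)
print(f"AR = {accuracy_rate:.2%}, ER = {error_rate:.2%}")
```

The two rates always sum to 1, so computing one of them is enough in practice.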

## Accuracy Paradox

- Let's assume we have a confusion matrix like this

<img src="../static/accuracy_paradox_confusion_matrix.png" alt="accuracy_paradox_confusion_matrix.png" width="500">

- The Accuracy Rate is,
    ```
    Total data (🟧 + 🟦 + 🟪 + 🟨) = 97 + 10 + 15 + 5 = 127
    AR = No. of Correct Predictions / Total
    AR = (🟧 + 🟦) / 127 
    AR = (97 + 10) / 127 
    AR = 107 / 127 
    AR = 0.84251968503
    AR = 84.251968503 %
    ```

- Now suppose we stop trying to predict True at all and predict False for every case.
    - Every prediction that used to be positive moves into the predicted-False column, as we can see in the confusion matrix below.

<img src="../static/accuracy_paradox_confusion_matrix_without_true.png" alt="accuracy_paradox_confusion_matrix_without_true.png" width="500">

- The Accuracy Rate is,
    ```
    Total data (🟧 + 🟦 + 🟪 + 🟨) = 112 + 0 + 0 + 15 = 127
    AR = No. of Correct Predictions / Total
    AR = (🟧 + 🟦) / 127 
    AR = (112 + 0) / 127 
    AR = 112 / 127 
    AR = 0.88188976378
    AR = 88.188976378 % ⬆️
    ```

- Wow 😮, our accuracy increased by about 3.94 percentage points even though we made no effort to predict the True case.
- So by this measure, all our effort to find the True cases looks like a waste! That is the paradox: accuracy alone should not be trusted to judge a model. 😤
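
The paradox is easy to reproduce from the counts in the two matrices. This is a minimal sketch; the helper name `accuracy` is an assumption for illustration.

```python
def accuracy(tn, tp, fp, fn):
    """Share of correct predictions among all predictions."""
    return (tn + tp) / (tn + tp + fp + fn)


# Counts from the first matrix above
tn, tp, fp, fn = 97, 10, 15, 5
with_model = accuracy(tn, tp, fp, fn)

# Always predicting False: every FP becomes a TN and every TP becomes a FN
always_false = accuracy(tn=tn + fp, tp=0, fp=0, fn=fn + tp)

gain = (always_false - with_model) * 100  # in percentage points
print(f"model: {with_model:.4f}, always False: {always_false:.4f}, gain: {gain:.2f} points")
```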

## CAP Curve

- CAP stands for Cumulative Accuracy Profile. Let's dig into it.
- Consider a case where we send an offer to 100,000 users and approximately 10% of them accept and purchase it.

### Random curve
- If we pick 20,000 users at random, then ⇾ about 2,000 accept the offer.
- If we pick 40,000 users at random, then ⇾ about 4,000 accept the offer.
- Similarly, for any random group of users, we can expect about 10% of them to accept the offer.

|Random User|Accepted User|
|:---:|:---:|
|0|0|
|20,000|2,000|
|40,000|4,000|
|60,000|6,000|
|80,000|8,000|
|100,000|10,000|

<br>

<img src="../static/random_curve.png" alt="random_curve.png" width="500">
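
The table above can be regenerated with one line of arithmetic per row. This sketch assumes the acceptance rate is exactly 10%; the constant and function names are made up for illustration.

```python
TOTAL_USERS = 100_000
ACCEPT_RATE = 0.10  # assumption: exactly 10% of all users accept


def random_curve(contacted):
    """Expected number of acceptances when `contacted` users are chosen at random."""
    return round(contacted * ACCEPT_RATE)


# One point every 20,000 users, matching the rows of the table
points = [(n, random_curve(n)) for n in range(0, TOTAL_USERS + 1, 20_000)]
for contacted, accepted in points:
    print(f"{contacted:>7,} -> {accepted:>6,}")
```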

### Model CAP Curve
- Let's apply our classification model to predict whether each user accepts the offer or not. (Yes | No)
- If we contact the 20,000 users that the model ranks as most likely to accept, then ⇾ 5,000 of them accept the offer, far more than the 2,000 we get from a random pick.
- Similarly, for every group of users selected by the trained model, the number of users who accept the offer is given in the table below.

|Trained Model Users|Accepted User|
|:---:|:---:|
|0|0|
|20,000|5,000|
|40,000|8,100|
|60,000|9,100|
|80,000|9,800|
|100,000|10,000|

<br>

<img src="../static/good_model_cap_curve.png" alt="good_model_cap_curve.png" width="500">

- The <span style="color:red">RED</span> curve is the CAP curve of the model we just trained.
- With some other dataset or random state, the model may be poorly trained. Its CAP curve then looks like the <span style="color:green">GREEN</span> line shown below.

<img src="../static/poor_model_cap_curve.png" alt="poor_model_cap_curve.png" width="500">
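
A model CAP curve is built by ranking users from the highest predicted score to the lowest and accumulating the actual acceptances in that order. The sketch below uses plain Python; the function name `cap_points` and the ten scored users are hypothetical, not the data behind the plots above.

```python
def cap_points(scores, accepted):
    """Cumulative acceptances when users are contacted from highest score down.

    Element k of the result is the number of acceptances among the k
    top-ranked users, so the list has len(scores) + 1 entries.
    """
    # Sort the actual outcomes by descending model score
    ranked = [a for _, a in sorted(zip(scores, accepted),
                                   key=lambda pair: pair[0],
                                   reverse=True)]
    cumulative, running = [0], 0
    for a in ranked:
        running += a
        cumulative.append(running)
    return cumulative


# Hypothetical scores and outcomes for 10 users (1 = accepted the offer)
scores   = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
accepted = [1,   1,   0,   1,   0,   0,   0,   0,   0,   0]

curve = cap_points(scores, accepted)
print(curve)
```

The faster this list reaches its final value, the further the curve sits above the random line.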

> How can I say that the <span style="color:green">GREEN</span> CAP curve is a poor model & the <span style="color:red">RED</span> CAP curve is a good model?

- Let's find out 😜

### Perfect Curve (Crystal Ball)
- Before we can judge whether a CAP curve is poor or good, we need to learn one more curve, called the Perfect Curve.
- Take the same case, where we send an offer to 100,000 users and approximately 10% of them accept and purchase it.
- The Perfect Curve will look like this:

<img src="../static/perfect_curve.png" alt="perfect_curve.png" width="500">

- Here we contact exactly the users who purchase the offer, so all buyers are captured within the first 10% of users contacted. **(Perfect Case)**
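
In the perfect case every contacted user accepts until all buyers are exhausted, after which the curve is flat. A minimal sketch, assuming exactly 10,000 of the 100,000 users accept; the names are made up for illustration.

```python
TOTAL_USERS = 100_000
TOTAL_ACCEPTED = 10_000  # assumption: exactly 10% of users accept


def perfect_curve(contacted):
    """Acceptances when a crystal-ball model contacts all buyers first."""
    return min(contacted, TOTAL_ACCEPTED)


points = [(n, perfect_curve(n)) for n in (0, 5_000, 10_000, 20_000, TOTAL_USERS)]
print(points)
```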

### How can we say that the <span style="color:green">GREEN</span> CAP curve is a poor model & the <span style="color:red">RED</span> CAP curve is a good model?

- <span style="color:green">GREEN</span> is closer to the **_Random Curve_** and farther from the **_Perfect Curve_**. So, it is a poor curve.
- <span style="color:red">RED</span> is farther from the **_Random Curve_** and closer to the **_Perfect Curve_**. So, it is a good curve.

## CAP Analysis

There are several ways to judge model performance from the CAP curve.
1. We can judge it directly by plotting the graph.
    - If the Model CAP Curve is closer to the **_Random Curve_**, then it is a poor model.
    - If the Model CAP Curve is closer to the **_Perfect Curve_**, then it is a good model.
2. We can find the ratio of the area between the Model CAP Curve and the Random Curve to the area between the Perfect Curve and the Random Curve.
    - If the ratio is close to 0, then it is a poor model.
    - If the ratio is close to 1, then it is a good model.
3. We can take the 50% mark on the users axis and read off X %, the share of all offer purchases that the model captures at that point.

    | X % | Model Performance |
    |:---:|:---:|
    | 90% < X < 100% | Too Good |
    | 80% < X < 90% | Very Good |
    | 70% < X < 80% | Good |
    | 60% < X < 70% | Poor |
    | X < 60% | Rubbish |

    <br>

    <img src="../static/cap_analysis.png" alt="cap_analysis.png" width="500">
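
Methods 2 and 3 can be sketched numerically using the six points from the Model CAP Curve table, rescaled to fractions. This is a rough illustration only: the helper names are hypothetical, the curve is treated as piecewise linear between the table rows, and with so few points the area is only an approximation of the true curve.

```python
def area(points):
    """Trapezoidal area under a piecewise-linear curve given as (x, y) pairs."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def value_at(points, x):
    """Linearly interpolate the curve at x (assumes x lies within the curve)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the range of the curve")


# x = fraction of users contacted, y = fraction of all acceptances captured
model   = [(0.0, 0.0), (0.2, 0.5), (0.4, 0.81), (0.6, 0.91), (0.8, 0.98), (1.0, 1.0)]
random_ = [(0.0, 0.0), (1.0, 1.0)]
perfect = [(0.0, 0.0), (0.1, 1.0), (1.0, 1.0)]  # all buyers found in the first 10%

# Method 2: areas are measured above the random line
a_random = area(random_)
accuracy_ratio = (area(model) - a_random) / (area(perfect) - a_random)

# Method 3: share of acceptances captured after contacting 50% of users
x_at_half = value_at(model, 0.5)

print(f"accuracy ratio ~ {accuracy_ratio:.3f}, X at 50% ~ {x_at_half:.0%}")
```

With these interpolated values, X at the 50% mark falls in the 80% to 90% band of the table above.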


## How do I know which model to choose for my problem?

✅ If your problem is **linear**, you can go for
- Logistic Regression
- SVM

✅ If your problem is **non-linear**, you can go for
- K-NN
- Naive Bayes
- Decision Tree
- Random Forest

## Pros & Cons


<img src="../static/classification_pros_cons.png" alt="classification_pros_cons.png" width="800">
