
Problem Statement
**Predict number of asteroids that will hit the earth**

Accuracy: 99.9% if I simply predict that 0 asteroids hit the earth on any given day.

So my model will be accurate but not really valuable.

What should we be doing in such cases?

Different accuracy metrics

#### Accuracy, Precision, and Recall

### Confusion Matrix
<img src='img/cm.png'>

#### Accuracy = (TP + TN) / (TP + FP + TN + FN)

**When to use Accuracy?**

* When there is no skew in classes, i.e. the class imbalance problem doesn't exist.


* When checking for overfitting by comparing training accuracy with test accuracy.


**Caveat**

If someone says that no asteroid will ever hit, he or she will be correct 99.9% of the time.
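The caveat can be checked in a few lines of plain Python. This is only a minimal sketch on made-up labels: the ratio of 1 hit in 1,000 days is an assumption chosen to match the 99.9% figure, and `always_no` is a hypothetical baseline rather than a trained model.

```python
# Hypothetical labels: 1 = asteroid hit, 0 = no hit (1 hit in 1,000 days).
y_actual = [0] * 999 + [1]

# A "model" that always answers "no hit".
always_no = [0] * len(y_actual)

# Accuracy = fraction of predictions that match the actual label.
correct = sum(a == p for a, p in zip(y_actual, always_no))
accuracy = correct / len(y_actual)

# How many real hits did the baseline catch?
hits_caught = sum(a == 1 and p == 1 for a, p in zip(y_actual, always_no))

print(f"Accuracy: {accuracy:.3f}")    # 0.999
print(f"Hits caught: {hits_caught}")  # 0
```

The accuracy looks excellent, yet the one event we care about is missed.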

### Asteroid Problem - Will Hit (1) / Won't Hit (0)

df['Hit'].value_counts()

Will Hit - 10
Won't Hit - 10000000000

Class Imbalance exists


#### Chatbot Problem - Utterances / Responses 

Right Response / Wrong Response

Right Response - 1000000
Wrong Response - 999999

#### Cat / Dog image classification

Cat - 1000
Dog - 1000

Class imbalance doesn't exist
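One rough way to compare the three examples is to look at the share of the rarest class. The sketch below uses the counts listed above; `minority_share` is a hypothetical helper name, and the idea that "close to 0.5 means balanced" only applies to two-class problems.

```python
# Class counts taken from the three examples above.
problems = {
    "asteroid": {"Will Hit": 10, "Won't Hit": 10_000_000_000},
    "chatbot": {"Right Response": 1_000_000, "Wrong Response": 999_999},
    "cat_dog": {"Cat": 1000, "Dog": 1000},
}

def minority_share(counts):
    """Return the fraction of samples that belong to the rarest class."""
    return min(counts.values()) / sum(counts.values())

for name, counts in problems.items():
    print(f"{name}: minority share = {minority_share(counts):.10f}")
```

The asteroid problem has a minority share near 0, while the chatbot and cat / dog problems sit at roughly 0.5.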

#### Precision - what is the proportion of predicted positives that is truly positive?

Precision = TP / (TP+FP)

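In the asteroid problem, precision will be 0 because we said that no asteroid will hit the earth, so TP = 0. Strictly speaking TP + FP is also 0, so precision is undefined here; it is reported as 0 by convention.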


**When to use precision?**

Precision is the metric to use when we want to be very sure of our positive predictions.

e.g. we are building a system that scores whether a customer is creditworthy or not. Based on this score we will increase or decrease the customer's credit limit.

In this case precision is important because if we classify a customer who is not creditworthy as creditworthy (a false positive), we extend credit to someone who is likely to default, which impacts the business badly.

#### Recall - what is the proportion of actual positives that are correctly classified?

Recall = (TP) / (TP + FN)

Asteroid prediction problem - Recall is 0 because TP = 0


**When to use recall?**

Recall is used as an evaluation metric when we want to capture as many positives as possible. e.g. if we are trying to predict a disease, then recall is important because missing a sick patient (a false negative) is costly.

### F1 Score - between 0 and 1, harmonic mean of precision and recall

**F1 = 2 * (precision * recall) / (precision + recall)**

**Asteroid prediction problem**

If we say "No" for the whole training dataset, then:


* precision = 0

* recall = 0

* accuracy = around 99.9%

* F1 Score = 0

A classifier with 99.9% accuracy is worthless in our case.
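These numbers can be reproduced from the confusion-matrix counts. The sketch below assumes "will hit" is the positive class and reuses the asteroid counts listed earlier (10 hits, 10,000,000,000 non-hits). `safe_div` is a hypothetical helper that returns 0 when a ratio is undefined, which is a common convention and not the only possible one.

```python
def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is 0 (a common convention)."""
    return numerator / denominator if denominator else 0.0

# Always predicting "No" on the asteroid counts:
# all 10 real hits are missed, and every non-hit is predicted correctly.
tp, fp = 0, 0
fn, tn = 10, 10_000_000_000

accuracy = safe_div(tp + tn, tp + fp + tn + fn)
precision = safe_div(tp, tp + fp)   # 0/0, reported as 0.0 by convention
recall = safe_div(tp, tp + fn)      # 0/10
f1 = safe_div(2 * precision * recall, precision + recall)

print(f"accuracy={accuracy:.9f} precision={precision} recall={recall} f1={f1}")
```

With these exact counts the accuracy is even higher than 99.9%, while precision, recall and F1 all stay at 0.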

**When to use F1 score?**

We want to have a model with good precision and good recall.

F1 score is a metric that maintains the balance between precision and recall.
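To see why the harmonic mean is used for this balance, it helps to compare it with a plain average. The precision / recall pairs below are hypothetical values picked only for illustration; each pair has the same arithmetic mean of 0.5.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Hypothetical (precision, recall) pairs, chosen only for illustration.
pairs = [(0.5, 0.5), (0.9, 0.1), (1.0, 0.0)]

for p, r in pairs:
    arithmetic = (p + r) / 2
    print(f"precision={p} recall={r} mean={arithmetic:.2f} F1={f1(p, r):.2f}")
```

The arithmetic mean stays at 0.50 for all three pairs, while F1 drops from 0.50 to 0.18 to 0.00 as precision and recall drift apart.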

If you are a CBI agent, then you have 2 goals:

1) Catch the right person, i.e. a person who really is a criminal - Precision


2) Maximise the catches, i.e. catch all of the criminals - Recall


The balance between precision and recall is the F1 score.

In [3]:
from sklearn.metrics import f1_score, accuracy_score

y_actual = [0,1,1,1,0,1,1,0]
y_pred =   [0,0,1,1,0,1,0,1]

print(f'F1 Score: {f1_score(y_actual,y_pred)}')
print(f'Accuracy Score: {accuracy_score(y_actual,y_pred)}')


F1 Score: 0.6666666666666665
Accuracy Score: 0.625
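The two scores above can be verified by hand. The sketch below counts TP, FP, FN and TN from the same two lists in plain Python, assuming 1 is the positive class, and then applies the formulas from the earlier sections.

```python
y_actual = [0, 1, 1, 1, 0, 1, 1, 0]
y_pred   = [0, 0, 1, 1, 0, 1, 0, 1]

pairs = list(zip(y_actual, y_pred))
tp = sum(a == 1 and p == 1 for a, p in pairs)  # 3
fp = sum(a == 0 and p == 1 for a, p in pairs)  # 1
fn = sum(a == 1 and p == 0 for a, p in pairs)  # 2
tn = sum(a == 0 and p == 0 for a, p in pairs)  # 2

precision = tp / (tp + fp)  # 3/4
recall = tp / (tp + fn)     # 3/5
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / len(pairs)

print(f"F1 Score: {f1:.4f}")              # 0.6667
print(f"Accuracy Score: {accuracy:.3f}")  # 0.625
```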


In [5]:
import pandas as pd
import numpy as np



In [6]:
df = pd.read_csv('dataset/diabetes.csv')

df.head()

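| Unnamed: 0 | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |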


In [7]:
features = ['Pregnancies','Insulin','BMI']

In [8]:
X = df[features]
y = df['Outcome']

In [10]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.2,random_state=42)

In [11]:
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr.fit(X_train,y_train)

LogisticRegression()

In [12]:
y_pred = lr.predict(X_test)

In [13]:
y_pred

array([0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
       0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
       0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1,
       0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1])

In [14]:
y_test

668    0
324    0
624    0
690    0
473    0
      ..
355    1
534    0
344    0
296    1
462    0
Name: Outcome, Length: 154, dtype: int64

In [15]:
from sklearn import metrics

print(metrics.accuracy_score(y_test,y_pred))

0.6688311688311688


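#### Null Accuracy

Null accuracy is the accuracy we would get by always predicting the most frequent class. It is the baseline that a model has to beat.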

In [16]:
df['Outcome'].value_counts()

0    500
1    268
Name: Outcome, dtype: int64

In [17]:
y_test.value_counts()

0    99
1    55
Name: Outcome, dtype: int64

In [18]:
500/768

0.6510416666666666

In [21]:
99/154

0.6428571428571429
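The comparison the cells above are building towards can be written out explicitly. This sketch hard-codes the `y_test` class counts and the model accuracy from the earlier outputs, so it runs without the dataset.

```python
# Class counts of y_test, copied from the output above.
test_counts = {0: 99, 1: 55}

# Null accuracy: always predict the most frequent class.
null_accuracy = max(test_counts.values()) / sum(test_counts.values())

# Accuracy of the logistic regression model, from the earlier cell.
model_accuracy = 0.6688311688311688

print(f"Null accuracy:  {null_accuracy:.4f}")   # 0.6429
print(f"Model accuracy: {model_accuracy:.4f}")  # 0.6688
print(f"Gain over baseline: {model_accuracy - null_accuracy:.4f}")
```

The model beats the always-predict-0 baseline on the test set by only a small margin.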

In [22]:
print(metrics.confusion_matrix(y_test,y_pred))

[[85 14]
 [37 18]]


In [24]:
#51 wrong, 103 correct

(85+18)/(154)

0.6688311688311688

#### Confusion matrix

* TP - 18 - correctly predicted those who have diabetes
* TN - 85 - correctly predicted those who don't have diabetes
* FP - Type I Error - 14 - incorrect prediction that person has diabetes
* FN - Type II Error - 37 - incorrect prediction that person doesn't have diabetes
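The four values can be read straight off the printed matrix. The sketch below copies the matrix into a plain nested list and assumes scikit-learn's layout for binary labels: rows are actual classes, columns are predicted classes, both ordered 0 then 1.

```python
# Matrix printed above; rows are actual classes, columns are predicted
# classes, both ordered 0 then 1.
cm = [[85, 14],
      [37, 18]]

(tn, fp), (fn, tp) = cm

total = tn + fp + fn + tp
print(f"TN={tn} FP={fp} FN={fn} TP={tp} total={total}")
print(f"Wrong: {fp + fn}, correct: {tp + tn}")  # 51 wrong, 103 correct
```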

##### Precision

In [25]:
metrics.precision_score(y_test,y_pred)

0.5625

In [27]:
18/(18+14)

0.5625

##### Recall

In [28]:
metrics.recall_score(y_test,y_pred)

0.32727272727272727

In [30]:
18/(18+37)

0.32727272727272727
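With precision and recall in hand, the F1 score of the diabetes model follows from the formula given earlier. This is a small sketch using the confusion-matrix counts rather than the fitted model; the second expression is the same formula rewritten in terms of counts.

```python
tp, fp, fn = 18, 14, 37  # counts from the confusion matrix above

precision = tp / (tp + fp)  # 0.5625
recall = tp / (tp + fn)     # about 0.3273
f1 = 2 * precision * recall / (precision + recall)

# Equivalent form written directly in terms of the counts.
f1_from_counts = 2 * tp / (2 * tp + fp + fn)

print(f"F1 Score: {f1:.4f}")  # 0.4138
```

The low recall pulls F1 well below the 0.67 accuracy, which is a sign that the model misses many diabetic patients.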

##### ICMR said in its advisory that it had evaluated the antigen kit’s performance in two labs, and found its sensitivity to be 50.6% and 84%, and specificity 99.3% and 100%, respectively

#### Sensitivity - Recall or True Positive Rate

- when the actual value is positive, how often is the prediction correct?

###### Sensitivity = TP / (TP + FN)

#### Specificity - True Negative Rate

- when the actual value is negative, how often is the prediction correct?

###### Specificity = TN / (TN + FP)

###### Precision = TP / (TP+FP)

A 100% specificity would mean that the test never mistakes other antibodies for the novel coronavirus, i.e. it gives no false positives.

A 98% sensitivity would mean that if the test kit checks 100 samples that truly contain the antibodies, it would detect them in about 98 of the samples.
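The same two quantities can be computed for the diabetes model from earlier. This sketch reuses the confusion-matrix counts from that section and treats "has diabetes" as the positive class.

```python
tn, fp, fn, tp = 85, 14, 37, 18  # diabetes confusion matrix from earlier

sensitivity = tp / (tp + fn)  # same as recall: 18 / 55
specificity = tn / (tn + fp)  # 85 / 99

print(f"Sensitivity: {sensitivity:.4f}")  # 0.3273
print(f"Specificity: {specificity:.4f}")  # 0.8586
```

The model is much better at recognising people without diabetes than at finding those who have it.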

# Great Job !