# Evaluating Logistic Regression Models - Lab

## Introduction

In regression, you predict continuous values, so it makes sense to measure error as the distance between the estimates and the true values. When classifying a binary variable, however, each prediction is either correct or incorrect. As a result, we tend to quantify performance in terms of how many false positives and false negatives the model produces. In this lab, we'll review precision, recall, accuracy, and F1-score and use them to evaluate a logistic regression model.


## Objectives
You will be able to:  
* Understand and assess precision, recall, and accuracy of classifiers
* Evaluate classification models using various metrics

## Terminology Review  

Let's take a moment and review some classification evaluation metrics:  


$Precision = \frac{\text{Number of True Positives}}{\text{Number of Predicted Positives}}$    
  

$Recall = \frac{\text{Number of True Positives}}{\text{Number of Actual Total Positives}}$  
  
$Accuracy = \frac{\text{Number of True Positives + True Negatives}}{\text{Total Observations}}$

$\text{F1-Score} = 2\ \frac{Precision \times Recall}{Precision + Recall}$
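
To make the formulas concrete, here is a small worked sketch in Python. The four counts are hypothetical numbers chosen for illustration; they are not results from this lab's data.

```python
# Hypothetical confusion-matrix counts (illustrative only, not from heart.csv)
tp, fp, fn, tn = 40, 10, 20, 30

precision_value = tp / (tp + fp)                    # 40 / 50
recall_value = tp / (tp + fn)                       # 40 / 60
accuracy_value = (tp + tn) / (tp + fp + fn + tn)    # 70 / 100
f1_value = 2 * precision_value * recall_value / (precision_value + recall_value)

print(f"precision={precision_value:.3f}, recall={recall_value:.3f}, "
      f"accuracy={accuracy_value:.3f}, f1={f1_value:.3f}")
```

With these counts, precision is 0.8, recall is about 0.667, accuracy is 0.7, and the F1-score is about 0.727, which sits between precision and recall.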


At times, it may be preferable to tune a classification algorithm to optimize precision or recall rather than overall accuracy. For example, imagine predicting whether a patient is at risk for cancer and should be brought in for additional testing. In cases such as this, we often want to cast a slightly wider net, so it is preferable to optimize for recall, the percentage of actual cancer-positive cases that the model identifies, rather than precision, the percentage of our predicted cancer-risk patients who are indeed positive.
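
One common way to "cast a wider net" is to lower the probability cutoff at which a case is labeled positive (for a binary logistic regression, `predict` corresponds to a cutoff of 0.5). The sketch below uses hypothetical labels and scores, and a helper named `precision_recall_at` that is introduced here purely for illustration. It assumes at least one predicted positive and one actual positive, so no division by zero occurs.

```python
def precision_recall_at(threshold, labels, scores):
    """Return (precision, recall) when scores >= threshold are labeled positive."""
    predictions = [1 if score >= threshold else 0 for score in scores]
    pairs = list(zip(labels, predictions))
    tp = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 1)
    fp = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 1)
    fn = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 0)
    return tp / (tp + fp), tp / (tp + fn)


# Hypothetical true labels and predicted probabilities (illustrative only)
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.55, 0.4, 0.35, 0.2, 0.1]

for threshold in (0.5, 0.3):
    precision_value, recall_value = precision_recall_at(threshold, labels, scores)
    print(f"threshold={threshold}: precision={precision_value:.2f}, recall={recall_value:.2f}")
```

In this toy example, lowering the cutoff from 0.5 to 0.3 raises recall from 0.75 to 1.0 while precision drops from 0.75 to about 0.67.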

## 1. Split the data into train and test sets

In [3]:
import pandas as pd
df = pd.read_csv('heart.csv')
df

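| Unnamed: 0 | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 1 | 3 | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 41 | 0 | 1 | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 56 | 1 | 1 | 120 | 236 | 0 | 1 | 178 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 57 | 0 | 0 | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |
| 5 | 57 | 1 | 0 | 140 | 192 | 0 | 1 | 148 | 0 | 0.4 | 1 | 0 | 1 | 1 |
| 6 | 56 | 0 | 1 | 140 | 294 | 0 | 0 | 153 | 0 | 1.3 | 1 | 0 | 2 | 1 |
| 7 | 44 | 1 | 1 | 120 | 263 | 0 | 1 | 173 | 0 | 0.0 | 2 | 0 | 3 | 1 |
| 8 | 52 | 1 | 2 | 172 | 199 | 1 | 1 | 162 | 0 | 0.5 | 2 | 0 | 3 | 1 |
| 9 | 57 | 1 | 2 | 150 | 168 | 0 | 1 | 174 | 0 | 1.6 | 2 | 0 | 2 | 1 |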


In [2]:
from sklearn.model_selection import train_test_split

In [20]:
X_train, X_test, y_train, y_test = train_test_split(df[['age', 'sex', 'cp', 'trestbps', 
                                                        'chol', 'fbs', 'restecg', 
                                                        'thalach','exang', 'oldpeak', 
                                                        'slope', 'ca', 'thal']], 
                                                    df['target'], test_size=0.2)
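
Because `train_test_split` shuffles the rows, repeated runs give different splits and therefore different metric values. A minimal sketch of a reproducible, class-balanced split is shown below. The toy arrays are hypothetical stand-ins for the lab's data, and the choice of `random_state=42` is arbitrary.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 rows, 2 features, balanced binary labels
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.array([0, 1] * 5)

# random_state fixes the shuffle; stratify keeps the class ratio in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42, stratify=y_toy
)

print(X_tr.shape, X_te.shape)
```

Whether to stratify is a judgment call; it tends to matter most when one class is rare.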


## 2. Create a standard logistic regression model

In [34]:
from sklearn.linear_model import LogisticRegression

logistic_model = LogisticRegression()
logistic_model.fit(X_train, y_train)
y_hat_train = logistic_model.predict(X_train)
y_hat_test = logistic_model.predict(X_test)



## 3. Write a function to calculate the precision

In [28]:
from sklearn.metrics import confusion_matrix

In [71]:
def precision(y_hat, y):
    tn, fp, fn, tp = confusion_matrix(y, y_hat).ravel()
    # true positives (correctly predicted positives)
    # divided by all predicted positives (true positives + false positives)
    return tp / (tp + fp)
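
For binary labels 0 and 1, `confusion_matrix(y, y_hat).ravel()` yields the counts in the order `tn, fp, fn, tp`. The sketch below recomputes those four counts by hand on hypothetical labels so the unpacking order is easier to follow; `confusion_counts` is a helper name introduced here for illustration, not part of sklearn.

```python
def confusion_counts(y_true, y_pred):
    """Count (tn, fp, fn, tp) for binary 0/1 labels, mirroring the ravel() order."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Hypothetical labels and predictions (illustrative only)
y_true_toy = [1, 1, 1, 0, 0, 1]
y_pred_toy = [1, 1, 0, 1, 0, 0]

tn, fp, fn, tp = confusion_counts(y_true_toy, y_pred_toy)
print(tn, fp, fn, tp)
print(tp / (tp + fp))  # precision from the hand-counted values
```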

## 4. Write a function to calculate the recall

In [62]:
def recall(y_hat, y):
    tn, fp, fn, tp = confusion_matrix(y, y_hat).ravel()
    # true positives (correctly predicted positives)
    # divided by all actual positives (true positives + false negatives)
    return tp / (tp + fn)

## 5. Write a function to calculate the accuracy

In [63]:
def accuracy(y_hat, y):
    tn, fp, fn, tp = confusion_matrix(y, y_hat).ravel()
    return (tp+tn)/(tp+tn+fp+fn)
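
Accuracy alone can be misleading when the classes are imbalanced. The sketch below uses hypothetical labels (5 positives out of 100) and a trivial "model" that always predicts the negative class, to show high accuracy alongside zero recall.

```python
# Hypothetical imbalanced labels: 5 positives, 95 negatives
y_true_imb = [1] * 5 + [0] * 95
y_pred_imb = [0] * 100  # always predict the negative class

pairs = list(zip(y_true_imb, y_pred_imb))
tp_imb = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 1)
tn_imb = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 0)
fn_imb = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 0)

accuracy_imb = (tp_imb + tn_imb) / len(y_true_imb)
recall_imb = tp_imb / (tp_imb + fn_imb)

print(accuracy_imb, recall_imb)
```

Here accuracy is 0.95 even though not a single positive case is found, which is why recall and precision are reported alongside it.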

## 6. Write a function to calculate the F1-score

In [64]:
def f1_score(y_hat,y):
    prec = precision(y_hat, y)
    rec = recall(y_hat,y)
    return 2*(prec*rec)/(prec+rec)
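
The F1 formula divides by `Precision + Recall`, which is zero when the model produces no true positives. One possible guard, sketched below, works directly from the counts using the algebraically equivalent form `2*tp / (2*tp + fp + fn)`. The name `safe_f1` and the choice to return 0.0 in the undefined case are assumptions made for this example.

```python
def safe_f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when the score is undefined (a choice made here)."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts (illustrative only)
tp_count, fp_count, fn_count = 40, 10, 20
prec_value = tp_count / (tp_count + fp_count)
rec_value = tp_count / (tp_count + fn_count)

# Both forms agree when precision and recall are defined
print(safe_f1(tp_count, fp_count, fn_count))
print(2 * prec_value * rec_value / (prec_value + rec_value))
print(safe_f1(0, 0, 0))
```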

## 7. Calculate the precision, recall, accuracy, and F1-score of your classifier.

Do this for both the train and the test set

In [72]:
p_train = precision(y_hat_train, y_train)
p_test = precision(y_hat_test, y_test)

r_train = recall(y_hat_train, y_train)
r_test = recall(y_hat_test, y_test)


a_train = accuracy(y_hat_train, y_train)
a_test = accuracy(y_hat_test, y_test)


f_train = f1_score(y_hat_train, y_train)
f_test = f1_score(y_hat_test, y_test)

Great Job! Now it's time to check your work with sklearn. 

## 8. Calculating Metrics with sklearn

Each of the metrics we calculated above is also available inside the `sklearn.metrics` module.  

In the cell below, import the following functions:

* `precision_score`
* `recall_score`
* `accuracy_score`
* `f1_score`

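Compare the results of your own metric functions with the sklearn functions listed above. Calculate these values for both your train and test sets. Note that the sklearn functions expect the true labels as the first argument and the predictions as the second, which is the reverse of the argument order used by the functions written earlier in this lab.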

In [73]:
# Note: importing f1_score here replaces the f1_score function defined in step 6
from sklearn.metrics import precision_score, recall_score, accuracy_score, f1_score

In [75]:
print(p_test)
print(precision_score(y_test, y_hat_test))

0.675
0.675
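
When a hand-written metric and its sklearn counterpart disagree, argument order is a common cause: `precision_score` takes `(y_true, y_pred)`, and swapping the two turns false positives into false negatives, so the result equals recall instead. The sketch below shows this on hypothetical labels.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels and predictions (illustrative only)
y_true_demo = [1, 1, 1, 0, 0, 1]
y_pred_demo = [1, 1, 0, 1, 0, 0]

correct_order = precision_score(y_true_demo, y_pred_demo)  # tp / (tp + fp)
swapped_order = precision_score(y_pred_demo, y_true_demo)  # arguments reversed
recall_demo = recall_score(y_true_demo, y_pred_demo)       # tp / (tp + fn)

print(correct_order, swapped_order, recall_demo)
```

In this example the swapped call returns the same value as `recall_score`, not the precision.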


## 9. Comparing Precision, Recall, Accuracy, and F1-Score of Test vs Train Sets


Calculate and then plot the precision, recall, accuracy, and F1-score for the test and train splits using different training set sizes. What do you notice?

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
# Features and target for the repeated splits (same 13 features as in step 1)
X = df.drop(columns=['Unnamed: 0', 'target'])
y = df['target']

training_Precision = []
testing_Precision = []
training_Recall = []
testing_Recall = []
training_Accuracy = []
testing_Accuracy = []
training_F1 = []
testing_F1 = []

for i in range(10,95):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size= None) #replace the "None" here
    logreg = LogisticRegression(fit_intercept = False, C = 1e12)
    model_log = None
    y_hat_test = None
    y_hat_train = None

# Your code here
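
One possible way to fill in this loop is sketched below; it is not the only valid solution. It uses `i / 100` as the test-set fraction and sklearn's own metric functions. The function name `metrics_by_test_size`, the `seed` and `max_iter=1000` settings, and the synthetic data from `make_classification` (used so the block runs without `heart.csv`) are all assumptions made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split


def metrics_by_test_size(X, y, test_percentages, seed=0):
    """Fit one model per test-set percentage and collect train/test metrics."""
    scorers = {
        "precision": precision_score,
        "recall": recall_score,
        "accuracy": accuracy_score,
        "f1": f1_score,
    }
    results = {f"{split}_{name}": [] for name in scorers for split in ("train", "test")}
    for pct in test_percentages:
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=pct / 100, random_state=seed
        )
        # max_iter is raised here (an assumption) to reduce convergence warnings
        model = LogisticRegression(fit_intercept=False, C=1e12, max_iter=1000)
        model.fit(X_train, y_train)
        y_hat_train = model.predict(X_train)
        y_hat_test = model.predict(X_test)
        for name, scorer in scorers.items():
            # sklearn scorers take the true labels first, then the predictions
            results[f"train_{name}"].append(scorer(y_train, y_hat_train))
            results[f"test_{name}"].append(scorer(y_test, y_hat_test))
    return results


# Synthetic stand-in data (an assumption; the lab itself uses heart.csv)
X_demo, y_demo = make_classification(n_samples=300, n_features=5, random_state=0)
demo_results = metrics_by_test_size(X_demo, y_demo, range(10, 95))
print(len(demo_results["test_f1"]))
```

Collecting the scores in a dictionary keyed by split and metric keeps the later plotting code short; separate lists, as in the scaffold, work equally well.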

Create four scatter plots: train and test precision in the first, train and test recall in the second, train and test accuracy in the third, and train and test F1-score in the fourth.

In [None]:
# code for test and train precision

In [None]:
# code for test and train recall

In [None]:
# code for test and train accuracy

In [None]:
# code for test and train F1-score
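
As a rough sketch of the plotting step, the function below draws all four train-versus-test scatter plots on one 2x2 figure rather than in four separate cells. The function name `plot_train_vs_test`, the dictionary layout, and the metric values are hypothetical placeholders, not results from this lab.

```python
import matplotlib.pyplot as plt


def plot_train_vs_test(sizes, results, metric_names=("precision", "recall", "accuracy", "f1")):
    """Draw one train-vs-test scatter plot per metric on a 2x2 grid."""
    fig, axes = plt.subplots(2, 2, figsize=(12, 8))
    for ax, name in zip(axes.ravel(), metric_names):
        ax.scatter(sizes, results[f"train_{name}"], label=f"train {name}")
        ax.scatter(sizes, results[f"test_{name}"], label=f"test {name}")
        ax.set_xlabel("test set size (%)")
        ax.set_ylabel(name)
        ax.legend()
    fig.tight_layout()
    return fig


# Hypothetical placeholder values (illustrative only)
sizes_demo = [10, 20, 30]
results_demo = {
    f"{split}_{name}": [0.80, 0.82, 0.79]
    for name in ("precision", "recall", "accuracy", "f1")
    for split in ("train", "test")
}
fig = plot_train_vs_test(sizes_demo, results_demo)
```

Putting train and test points on the same axes makes any gap between them, a possible sign of overfitting, easier to spot.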

## Summary

Nice! In this lab, you gained some extra practice with evaluation metrics for classification algorithms. You also got some further Python practice by coding these functions yourself, which gives you a deeper understanding of how they work. Going forward, continue to think about scenarios in which you might prefer to optimize one of these metrics over another.