# Rainfall Prediction - Evaluating a Binary Classifier
## Accuracy Score and Confusion Matrix
As always, we start by loading the libraries and data we will be using. In this case we load the training and testing datasets created in previous notebooks.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

X_train = pd.read_csv("X_train.csv", header=None)
X_test = pd.read_csv("X_test.csv", header=None)
y_train = pd.read_csv("y_train.csv")
y_test = pd.read_csv("y_test.csv")

Now, as in the last notebook, we can train a logistic regression model on our training data and evaluate it on our test data.

In [2]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

model = LogisticRegression(solver='liblinear')
fit = model.fit(X_train, y_train['RainTomorrow'].values.ravel())
y_pred = model.predict(X_test)

accuracy_score(y_test['RainTomorrow'], y_pred)

0.8328000281303843

We can now compare this with the performance of the model on the training data.

In [3]:
y_train_pred = model.predict(X_train)
accuracy_score(y_train['RainTomorrow'], y_train_pred)

0.8333069606343513

We can see that the difference in accuracy between the training and test sets is minimal, roughly $0.05$ percentage points, so there is little sign of overfitting. However, since our data isn't balanced, i.e. it rains on far fewer days than it doesn't, it's worth comparing the performance of our model against a static prediction that always outputs the same class.

In [4]:
rain_model = np.repeat(1.0,len(y_test['RainTomorrow']))
accuracy_score(y_test['RainTomorrow'], rain_model)

0.22128063574668588

In [5]:
dry_model = np.repeat(0.0,len(y_test['RainTomorrow']))
accuracy_score(y_test['RainTomorrow'], dry_model)

0.7787193642533141
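
The accuracy of a static prediction is simply the share of the test set that belongs to the predicted class. As a quick check, the sketch below recomputes both baselines from the class counts of the test set (the support values shown in the classification report in this notebook); the variable names are illustrative only.

```python
# Class counts in the test set, copied from the "support" column of the
# classification report in this notebook.
n_dry, n_rain = 22146, 6293
n_total = n_dry + n_rain

# A model that always predicts one class is correct exactly as often
# as that class occurs.
always_dry_accuracy = n_dry / n_total
always_rain_accuracy = n_rain / n_total

print(f"always dry:  {always_dry_accuracy:.4f}")
print(f"always rain: {always_rain_accuracy:.4f}")
```

Any useful classifier on this data therefore has to beat the always-dry accuracy of roughly $0.78$, not $0.5$.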

We can see from the output above that our model performs better than either of these static models, but it beats the always-dry prediction by only about $5$ percentage points ($0.833$ versus $0.779$). To get a better understanding of how our model is performing we can use a confusion matrix.

In [6]:
from sklearn.metrics import confusion_matrix, classification_report

confusion_matrix(y_test['RainTomorrow'], y_pred)

array([[20760,  1386],
       [ 3369,  2924]], dtype=int64)
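
To make the matrix easier to read, the sketch below unpacks its four cells and derives the usual metrics by hand. It relies on scikit-learn's convention of true classes in rows and predicted classes in columns, so for the labels `0` and `1` the layout is `[[TN, FP], [FN, TP]]`. The counts are copied from the output above; the variable names are illustrative only.

```python
# Counts copied from the confusion matrix above.
# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
tn, fp = 20760, 1386   # truly dry days: predicted dry / predicted rain
fn, tp = 3369, 2924    # truly rainy days: predicted dry / predicted rain

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)             # share of rain predictions that were right
recall = tp / (tp + fn)                # share of rainy days that were caught
false_negative_rate = fn / (fn + tp)   # share of rainy days missed (1 - recall)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy:  {accuracy:.4f}")
print(f"precision: {precision:.4f}")
print(f"recall:    {recall:.4f}")
print(f"FNR:       {false_negative_rate:.4f}")
print(f"F1:        {f1:.4f}")
```

The accuracy reproduces the score reported earlier, while the recall shows that the model catches fewer than half of the rainy days.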

We can also use `classification_report` to generate several per-class metrics for evaluating model performance, such as precision, recall, and $F_1$ score.

In [7]:
print(classification_report(y_test['RainTomorrow'], y_pred))

              precision    recall  f1-score   support

         0.0       0.86      0.94      0.90     22146
         1.0       0.68      0.46      0.55      6293

    accuracy                           0.83     28439
   macro avg       0.77      0.70      0.72     28439
weighted avg       0.82      0.83      0.82     28439
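
The two summary rows differ only in how the per-class scores are combined: the macro average weights both classes equally, while the weighted average weights each class by its support. The sketch below illustrates this for the $F_1$ score, recomputed from the confusion matrix counts in this notebook rather than from the rounded report values; the helper `f1_from_counts` is a hypothetical name introduced here.

```python
def f1_from_counts(tp, fp, fn):
    """F1 score for one class, written directly in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)

# Counts from the confusion matrix [[20760, 1386], [3369, 2924]].
# For class 0 (dry) the "positives" are dry days, so the roles swap.
f1_dry = f1_from_counts(tp=20760, fp=3369, fn=1386)
f1_rain = f1_from_counts(tp=2924, fp=1386, fn=3369)

support_dry, support_rain = 22146, 6293
total = support_dry + support_rain

# Macro average: plain mean over classes.
macro_f1 = (f1_dry + f1_rain) / 2
# Weighted average: mean over classes weighted by class support.
weighted_f1 = (f1_dry * support_dry + f1_rain * support_rain) / total

print(f"macro F1:    {macro_f1:.2f}")
print(f"weighted F1: {weighted_f1:.2f}")
```

Because dry days dominate the test set, the weighted average sits close to the dry-class score, whereas the macro average is pulled down by the weaker rain class.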



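The recall of $0.46$ for the rain class means the model misses more than half of the rainy days. To reduce these type II errors (false negatives), we can lower the probability threshold for predicting rain from the default of $0.5$ to $0.25$.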

In [8]:
# Predict rain (1) whenever the estimated probability of rain exceeds 0.25
new_pred = [1 if x > 0.25 else 0 for x in model.predict_proba(X_test)[:,1]]
confusion_matrix(y_test['RainTomorrow'], new_pred)

array([[17624,  4522],
       [ 1570,  4723]], dtype=int64)
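
Lowering the threshold trades false negatives for false positives. As a rough comparison, the sketch below computes accuracy, precision, and recall for the rain class from the two confusion matrices shown in this notebook; `summarize` is a hypothetical helper introduced only for this illustration.

```python
def summarize(tn, fp, fn, tp):
    """Return accuracy, precision and recall for the positive (rain) class."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

# Counts copied from the two confusion matrices in this notebook.
default_threshold = summarize(tn=20760, fp=1386, fn=3369, tp=2924)  # threshold 0.5
lower_threshold = summarize(tn=17624, fp=4522, fn=1570, tp=4723)    # threshold 0.25

for name in ("accuracy", "precision", "recall"):
    print(f"{name:9s}  0.50: {default_threshold[name]:.3f}"
          f"   0.25: {lower_threshold[name]:.3f}")
```

With the lower threshold, recall on rainy days rises from about $0.46$ to about $0.75$, while precision falls from about $0.68$ to about $0.51$ and accuracy drops to about $0.79$, close to the always-dry baseline. Whether that trade is worthwhile depends on the relative cost of a missed rainy day versus a false alarm.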