# Brier Score

In this tutorial you will learn:
* What is the Brier score?
* How is the Brier score calculated?
* Implementation using scikit-learn's built-in function

## What is the Brier score?

The Brier score is a method for scoring predicted probabilities. It is similar to log loss, but it penalizes predictions that are far from the actual outcome less severely than log loss does.

The Brier score is the mean squared error between the predicted probabilities and the actual outcomes. For binary outcomes the score is always between 0.0 and 1.0, where a model with perfect skill has a score of 0.0.
Predictions that are further from the actual outcome are penalized more heavily, but less severely than under log loss.
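To make the comparison with log loss concrete, here is a minimal sketch that scores a single prediction both ways. The helper names `squared_error` and `log_loss_single` are hypothetical and exist only for this illustration.

```python
import math

def squared_error(y_true, p):
    """Brier-style penalty for one prediction: (p - y)^2."""
    return (p - y_true) ** 2

def log_loss_single(y_true, p):
    """Log loss penalty for one predicted positive-class probability p."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# The true outcome is 1; the predicted probability gets progressively worse.
for p in (0.9, 0.5, 0.1, 0.01):
    se = squared_error(1, p)
    ll = log_loss_single(1, p)
    print(f"p={p:<5} squared error={se:.4f}  log loss={ll:.4f}")
```

The squared error can never exceed 1 for a binary outcome, whereas log loss keeps growing as the predicted probability of the true class approaches 0.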

The skill of a model can be summarized as the average Brier score across all probabilities predicted for a test dataset. The lower the Brier score, the better the performance or skill of your model.

A lower Brier score means that your model's predicted probabilities have not deviated much from the actual outcomes.

Let's see how the Brier score is calculated.

## How is the Brier score calculated?

By definition, the Brier score is calculated by subtracting the predicted probability from the actual outcome, squaring the difference, and then taking the mean over all observations in the dataset.
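Written as a formula, the description above corresponds to the following, where the symbols are notation chosen for this sketch: $N$ is the number of observations, $p_i$ is the predicted probability for observation $i$, and $o_i$ is its actual outcome (0 or 1).

$$
\mathrm{BS} = \frac{1}{N} \sum_{i=1}^{N} (p_i - o_i)^2
$$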

Let's see how this looks in code. We will define a small made-up array of actual outcomes `y` and an array of predicted probabilities `ypreds`.

In [2]:
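import numpy as np

# Actual outcomes (1 = event occurred) and predicted probabilities of the event
y = np.array([1, 0, 1, 1, 1, 0, 0, 1, 1, 1])
ypreds = np.array([0.31, 0.22, 0.83, 0.74, 0.91, 0.23, 0.56, 0.76, 0.73, 0.97])
losses = (y - ypreds) ** 2      # squared error per observation
brier_score = losses.mean()     # average over all observations
brier_score, losses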

(0.11269999999999998,
 array([0.4761, 0.0484, 0.0289, 0.0676, 0.0081, 0.0529, 0.3136, 0.0576,
        0.0729, 0.0009]))

We get a Brier score of about 0.113. This is well below the 0.25 that a constant prediction of 0.5 would score, so the predictions show reasonable skill. But these were made-up arrays. How do we calculate the Brier score for an actual classification problem?

For that purpose we can use scikit-learn's built-in function, `brier_score_loss()`.
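The manual calculation can also be wrapped in a small reusable function. The sketch below assumes a hypothetical helper named `brier_score` (not a library function) and checks which observations contribute most to the total loss.

```python
import numpy as np

def brier_score(y_true, y_prob):
    """Mean squared difference between outcomes (0/1) and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    if y_true.shape != y_prob.shape:
        raise ValueError("y_true and y_prob must have the same shape")
    return float(np.mean((y_true - y_prob) ** 2))

y = np.array([1, 0, 1, 1, 1, 0, 0, 1, 1, 1])
ypreds = np.array([0.31, 0.22, 0.83, 0.74, 0.91, 0.23, 0.56, 0.76, 0.73, 0.97])

score = brier_score(y, ypreds)

# Share of the total loss contributed by each observation.
losses = (y - ypreds) ** 2
share = losses / losses.sum()
worst = np.argsort(losses)[::-1][:2]  # indices of the two largest losses
print(round(score, 4), worst, share[worst].round(2))
```

Here the two worst predictions (0.31 for an actual 1, and 0.56 for an actual 0) account for roughly 70% of the total loss, which shows how a few confident misses can dominate the score.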

## Implementation using scikit-learn's built-in function

To try out the scikit-learn function we will use a more realistic classification dataset called 'admissions.csv'. This dataset contains students' GRE and TOEFL scores, their university ratings, and their SOP, LOR and CGPA scores.

We will predict the probability that each student has done research during their studies.

First, we will import all relevant libraries and the dataset.

In [51]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline 
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import brier_score_loss

In [85]:
df = pd.read_csv('admissions.csv')
df.set_index('Serial No.', drop=True, inplace=True)
df = df.drop('Chance of Admit ', axis=1)
df

| Serial No. | GRE Score | TOEFL Score | University Rating | SOP | LOR | CGPA | Research |
|---|---|---|---|---|---|---|---|
| 1 | 337 | 118 | 4 | 4.5 | 4.5 | 9.65 | 1 |
| 2 | 324 | 107 | 4 | 4.0 | 4.5 | 8.87 | 1 |
| 3 | 316 | 104 | 3 | 3.0 | 3.5 | 8.00 | 1 |
| 4 | 322 | 110 | 3 | 3.5 | 2.5 | 8.67 | 1 |
| 5 | 314 | 103 | 2 | 2.0 | 3.0 | 8.21 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 396 | 324 | 110 | 3 | 3.5 | 3.5 | 9.04 | 1 |
| 397 | 325 | 107 | 3 | 3.0 | 3.5 | 9.11 | 1 |
| 398 | 330 | 116 | 4 | 5.0 | 4.5 | 9.45 | 1 |
| 399 | 312 | 103 | 3 | 3.5 | 4.0 | 8.78 | 0 |


As you can see, we have 400 entries in the dataset and the 'Research' column is binary. Keep in mind that the Brier score is used to evaluate classification models that output probabilities.

Let's divide the dataset into train and test sets and calculate the Brier score using the `brier_score_loss()` function from scikit-learn.
The `brier_score_loss()` function takes the true labels and the predicted probabilities for the positive class only, and returns the average score.
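Before applying it to the admissions data, a quick sanity check is useful. This sketch reuses the toy arrays from the manual calculation and assumes scikit-learn is installed; it suggests that `brier_score_loss()` gives the same value as the hand-written mean squared error.

```python
import numpy as np
from sklearn.metrics import brier_score_loss

y = np.array([1, 0, 1, 1, 1, 0, 0, 1, 1, 1])
ypreds = np.array([0.31, 0.22, 0.83, 0.74, 0.91, 0.23, 0.56, 0.76, 0.73, 0.97])

manual = np.mean((y - ypreds) ** 2)   # hand-written Brier score
sk = brier_score_loss(y, ypreds)      # true labels first, then positive-class probabilities
print(manual, sk)
```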

In [97]:
X = df.drop("Research", axis=1)

y = df["Research"]

In [98]:
np.random.seed(42)

# Split into train & test set
X_train, X_test, y_train, y_test = train_test_split(X,
                                                    y,
                                                    test_size=0.2)

The next step is to fit a logistic regression model to the training data.

In [99]:
lr = LogisticRegression()
lr.fit(X_train, y_train)



LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='warn', n_jobs=None, penalty='l2',
                   random_state=None, solver='warn', tol=0.0001, verbose=0,
                   warm_start=False)

In [100]:
lr.score(X_test, y_test)

0.725

The accuracy of our model without any tuning is 72.5%. But our aim is to find the Brier score, so we will first calculate the predicted probabilities for each entry in `X_test` using the `predict_proba()` function.

In [101]:
probs = lr.predict_proba(X_test)

In [104]:
probs = probs[:, 1]  # keep only the probabilities for the positive class (label 1)
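Column index 1 works here because `predict_proba()` orders its columns by the fitted model's `classes_` attribute, which is `[0, 1]` for this kind of binary target. A minimal sketch on synthetic stand-in data (an assumption; this is not the admissions dataset) that looks up the positive-class column explicitly:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data, generated only for this illustration.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = (X_demo[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

clf = LogisticRegression().fit(X_demo, y_demo)
proba = clf.predict_proba(X_demo)  # shape (n_samples, n_classes)

# Columns follow the order of clf.classes_, so find the positive label's column.
pos_col = list(clf.classes_).index(1)
pos_probs = proba[:, pos_col]
print(clf.classes_, proba.shape, pos_col)
```

Looking up the column this way avoids silently picking the wrong class if the labels are not simply 0 and 1.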

In [110]:
loss = brier_score_loss(y_test, probs)
loss

0.18828291612850948

The Brier score for the above model is about 0.188. Note that this is a mean squared error on the probability scale, not a percentage.
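A Brier score is easier to interpret next to a simple reference forecast. One common option is the Brier skill score, which compares the model with a constant forecast. The sketch below uses a hypothetical helper `brier_skill_score` and the toy arrays from the manual calculation. For simplicity it takes the reference probability from the same labels being scored; in practice it would more usually come from the training labels.

```python
import numpy as np

def brier_skill_score(y_true, y_prob, ref_prob):
    """1 - BS_model / BS_reference; positive values beat the reference forecast."""
    y_true = np.asarray(y_true, dtype=float)
    bs_model = np.mean((y_true - np.asarray(y_prob, dtype=float)) ** 2)
    bs_ref = np.mean((y_true - ref_prob) ** 2)
    return float(1.0 - bs_model / bs_ref)

y = np.array([1, 0, 1, 1, 1, 0, 0, 1, 1, 1])
ypreds = np.array([0.31, 0.22, 0.83, 0.74, 0.91, 0.23, 0.56, 0.76, 0.73, 0.97])

base_rate = y.mean()  # constant forecast equal to the observed positive rate
bss = brier_skill_score(y, ypreds, base_rate)
print(bss)
```

A skill score of 0 means the model is no better than the reference, 1 means perfect predictions, and negative values mean the model does worse than the constant forecast.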

## References

[A Gentle Introduction to Probability Scoring Methods in Python by Jason Brownlee](https://machinelearningmastery.com/how-to-score-probability-predictions-in-python/)