# Evaluate your Model using the F-score

The F-score (or F-measure) is a metric that summarizes the performance of a classification model by combining its precision and recall into a single number. After obtaining the precision and recall on your test set, you can calculate the $F_{1}$-score with the following formula:  

$$
F_{1} = 2 \cdot \frac{\textrm{precision} \cdot \textrm{recall}}{\textrm{precision} + \textrm{recall}}
$$  

The maximum possible value is 1, which indicates perfect precision and recall. If either precision or recall is 0, the $F_{1}$-score is 0 as well.  
The formula above weights precision and recall equally, which is what the subscript 1 in $F_{1}$ denotes. This template uses that version of the metric via `scikit-learn`'s `f1_score()` function. If one of the two matters more for your application, you can weight precision or recall with the [${F_{\beta}}$-score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.fbeta_score.html#:~:text=The%20F%2Dbeta%20score%20is,recall%20in%20the%20combined%20score.).
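
To make the relationship between $F_{1}$ and $F_{\beta}$ concrete, here is a minimal pure-Python sketch of the general formula $F_{\beta} = (1 + \beta^2) \cdot \frac{\textrm{precision} \cdot \textrm{recall}}{\beta^2 \cdot \textrm{precision} + \textrm{recall}}$. The helper name `f_beta` and the precision and recall values are made up for illustration; they are not part of this template or of `scikit-learn`. Returning 0 when the denominator is 0 is a convention assumed here.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Hypothetical helper: F-beta score from precision and recall.

    beta = 1 gives the F1-score; beta > 1 weights recall more heavily,
    beta < 1 weights precision more heavily.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    beta_sq = beta ** 2
    denominator = beta_sq * precision + recall
    if denominator == 0:
        # Assumed convention: the score is 0 when precision and recall are both 0.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


# Made-up example values, not taken from the dataset in this template.
precision, recall = 0.75, 0.6

f1 = f_beta(precision, recall)                # precision and recall weighted equally
f2 = f_beta(precision, recall, beta=2.0)      # recall counts more
f_half = f_beta(precision, recall, beta=0.5)  # precision counts more

print(f"F1 = {f1:.3f}, F2 = {f2:.3f}, F0.5 = {f_half:.3f}")
```

Because recall is the lower of the two values in this example, the score drops when recall is emphasized ($\beta = 2$) and rises when precision is emphasized ($\beta = 0.5$).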

In [1]:
# Load packages
import numpy as np 
import pandas as pd 
from sklearn.metrics import f1_score
%config InlineBackend.figure_format = 'retina'

In [2]:
# Load data from the csv file
df = pd.read_csv("F1-score_data.csv")
df.head()

Unnamed: 0,PROBABILITY,ACTUAL LABEL
0,0.95,1
1,0.03,0
2,0.37,1
3,0.87,1
4,0.35,0


First, we need to convert the probabilities predicted by our model into binary predictions. In this example, we use a threshold of 0.65: probabilities below the threshold are mapped to 0, and all others (including exactly 0.65) are mapped to 1.

In [3]:
THRESHOLD = 0.65  # Threshold at which to calculate the F-score

# Convert a probability predicted by the model to a binary prediction
def convert_to_pred(x):
    if x < THRESHOLD:  # Below the threshold: map to 0
        return 0
    else:              # At or above the threshold: map to 1
        return 1

# Pass the function directly; wrapping it in a lambda is unnecessary
df['PREDICTION'] = df['PROBABILITY'].apply(convert_to_pred)

df.head()

Unnamed: 0,PROBABILITY,ACTUAL LABEL,PREDICTION
0,0.95,1,1
1,0.03,0,0
2,0.37,1,0
3,0.87,1,1
4,0.35,0,0


Now we can go ahead and calculate the $F_{1}$-score using the `scikit-learn` package.

In [5]:
f1_score(df['ACTUAL LABEL'], df['PREDICTION'])

0.7058823529411764
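
To show where such a score comes from, the sketch below recomputes precision, recall, and $F_{1}$ by hand from true positives, false positives, and false negatives. It uses only the five preview rows printed by `df.head()` above, not the full CSV file, so its result is not expected to match the score for the whole dataset. The variable names are illustrative.

```python
# The five preview rows shown above (ACTUAL LABEL and PREDICTION columns).
# This is NOT the full dataset, so the score differs from the full-data result.
actual = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 0]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives

# Guard against empty denominators; 0.0 is an assumed fallback.
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = (
    2 * precision * recall / (precision + recall)
    if (precision + recall)
    else 0.0
)

print(f"TP={tp}, FP={fp}, FN={fn}")
print(f"precision={precision:.3f}, recall={recall:.3f}, F1={f1:.3f}")
```

On these five rows the model makes no false positives but misses one positive example (probability 0.37, below the threshold), which lowers recall and therefore the $F_{1}$-score.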

The $F_{1}$-score is about 0.71. That is pretty good, but the score depends on the threshold you use, because the threshold determines the binary predictions and therefore the precision and recall. Try other values of `THRESHOLD` to see whether you can get a better classification model.
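
One way to explore the effect of the threshold is to compute the score for several candidate values and compare them. The sketch below does this in pure Python on the five preview rows shown earlier, which stand in for the full dataset purely for illustration. The helper `f1_at_threshold` and the candidate thresholds are assumptions made for this example. It uses the equivalent count-based form $F_{1} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN}$.

```python
def f1_at_threshold(probabilities, actual, threshold):
    """Hypothetical helper: F1-score after thresholding the probabilities."""
    # Same rule as in the template: below the threshold -> 0, otherwise -> 1.
    predicted = [1 if prob >= threshold else 0 for prob in probabilities]
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    denominator = 2 * tp + fp + fn
    # Assumed convention: the score is 0 when there are no positives at all.
    return 2 * tp / denominator if denominator else 0.0


# The five preview rows from above, used only as illustrative data.
probabilities = [0.95, 0.03, 0.37, 0.87, 0.35]
actual = [1, 0, 1, 1, 0]

candidates = [0.2, 0.36, 0.65, 0.9]  # arbitrary thresholds to compare
scores = {t: f1_at_threshold(probabilities, actual, t) for t in candidates}
best_threshold = max(scores, key=scores.get)

for threshold, score in scores.items():
    print(f"threshold={threshold:.2f}  F1={score:.3f}")
print(f"best candidate: {best_threshold}")
```

On this tiny sample, a threshold that is too low introduces a false positive and one that is too high misses positive examples, so the best candidate lies in between.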