# Metrics
## P-value
Let us generate two random samples of size 100 from normal distributions. The first has mean 10 and variance 1. The second has mean 0 and standard deviation 10 (that is, variance 100).

In [1]:
import numpy as np
from scipy import stats

In [2]:
data_1 = np.random.randn(100) + 10
data_2 = 10*np.random.randn(100)

Now suppose we want to test whether the two samples have the same mean using Student's t-test. We need to calculate the pooled standard deviation, the t-value, and finally the p-value. Note that this pooled form of the test assumes the two populations have equal variance.
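
For two samples of equal size $n$, the quantities computed in the next cell can be written as below. The notation is an assumption introduced here for illustration: $s_1^2, s_2^2$ are the unbiased sample variances, $\bar{x}_1, \bar{x}_2$ the sample means, and $F_{2n-2}$ the CDF of the t-distribution with $2n-2$ degrees of freedom.

$$
s_p = \sqrt{\frac{s_1^2 + s_2^2}{2}}, \qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{2/n}}, \qquad
p = 2\left(1 - F_{2n-2}(|t|)\right)
$$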

In [3]:
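# Unbiased sample variances (ddof=1)
var1 = data_1.var(ddof=1)
var2 = data_2.var(ddof=1)
# Pooled standard deviation for two samples of equal size
s = np.sqrt((var1 + var2)/2)
# t-statistic for the difference in means
t = (data_1.mean() - data_2.mean())/(s*np.sqrt(2/100))
# One-sided tail probability; abs(t) makes this valid for either sign of t
p = 1 - stats.t.cdf(abs(t), df=2*100 - 2)

# Double the tail probability for a two-sided test
print("p-value is: " + str(2*p))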

p-value is: 2.22044604925e-16


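Now we can compare our calculation with the built-in function in SciPy. Note that `stats.ttest_ind` already returns a two-sided p-value, so it should not be doubled.

In [4]:
results = stats.ttest_ind(data_1, data_2)

# results.pvalue is already two-sided
print("p-value is: " + str(results.pvalue))

p-value is: 5.6704178386e-17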


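As the p-value is very small, we can reject the null hypothesis that the two samples come from populations with the same mean. This makes sense since we know the means differ (10 versus 0). Keep in mind that the t-test compares means only, so it says nothing about the difference in variance. The manually computed value of 2.22e-16 is not exact: it equals the double-precision machine epsilon, which is the resolution limit of `1 - cdf(t)` when the CDF is this close to 1. That is why it differs from SciPy's result.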
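
Since the two samples have very different variances, the equal-variance assumption of the pooled test does not really hold here. The following is a minimal sketch, not a definitive recipe, of two refinements: computing the tail probability with the survival function `stats.t.sf` to avoid the precision loss of `1 - cdf`, and running Welch's t-test through `equal_var=False`. The fixed seed and the names `sample_a`, `sample_b` are assumptions added here only to make the sketch reproducible.

```python
import numpy as np
from scipy import stats

# Assumption: a fixed seed, added only so the sketch is reproducible
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=10, scale=1, size=100)   # mean 10, std 1
sample_b = rng.normal(loc=0, scale=10, size=100)   # mean 0, std 10

# Pooled (equal-variance) t-statistic, as in the manual calculation
pooled_sd = np.sqrt((sample_a.var(ddof=1) + sample_b.var(ddof=1)) / 2)
t_manual = (sample_a.mean() - sample_b.mean()) / (pooled_sd * np.sqrt(2 / 100))

# sf(x) = 1 - cdf(x), but it keeps precision for very small tail probabilities
p_manual = 2 * stats.t.sf(abs(t_manual), df=2 * 100 - 2)

# SciPy's pooled test; its p-value is already two-sided
pooled = stats.ttest_ind(sample_a, sample_b)
# Welch's t-test does not assume equal variances
welch = stats.ttest_ind(sample_a, sample_b, equal_var=False)

print("manual p-value:", p_manual)
print("pooled p-value:", pooled.pvalue)
print("Welch p-value: ", welch.pvalue)
```

With equal sample sizes the Welch and pooled t-statistics coincide; only the degrees of freedom, and therefore the p-value, differ.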

## Metrics and Confusion Matrix
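Suppose you have trained a model for a two-class supervised learning problem. You now have an array `target` (the true y values) and an array `prediction` (the model's predicted probabilities for the positive class). Pass these two arrays to the first function below to count true positives, false positives, false negatives and true negatives at a threshold of 0.5, then pass those counts to the second function to compute accuracy, precision, recall and the F1 measure: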

In [5]:
def confusion_matrix(target, prediction):
    TP = FP = FN = TN = 0
    for i in range(len(target)):
        if prediction[i] > 0.5 and target[i] == 1:
            TP += 1
        elif prediction[i] > 0.5 and target[i] == 0:
            FP += 1
        elif target[i] == 1:
            FN += 1
        else:
            TN += 1
    return TP, FP, FN, TN

In [6]:
def metrics(TP, FP, FN, TN, verbose=True):
    # Accuracy
    accuracy = (TP+TN)/(TP + FP + FN + TN)
    # Precision
    precision = TP/(TP+FP)
    # Recall
    recall = TP/(TP+FN)
    # F1-Measure
    f1 = 2*precision*recall/(precision+recall)
    if verbose:
        print("Accuracy: " + str(accuracy))
        print("Precision: " + str(precision))
        print("Recall: " + str(recall))
        print("F1 measure: " + str(f1))
    return accuracy, precision, recall, f1
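
One edge case is worth noting: if the model never predicts the positive class, `TP + FP` is zero and the precision line raises `ZeroDivisionError`. A minimal sketch of a guarded variant follows. The name `safe_metrics` and the convention of returning 0.0 for an undefined ratio are assumptions made here for illustration, not part of the original code.

```python
def safe_metrics(TP, FP, FN, TN):
    """Hypothetical variant of metrics() that returns 0.0 for an undefined ratio
    instead of raising ZeroDivisionError."""
    def ratio(num, den):
        # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0
        return num / den if den else 0.0

    accuracy = ratio(TP + TN, TP + FP + FN + TN)
    precision = ratio(TP, TP + FP)
    recall = ratio(TP, TP + FN)
    f1 = ratio(2 * precision * recall, precision + recall)
    return accuracy, precision, recall, f1


# A model that never predicts the positive class: TP = FP = 0
print(safe_metrics(TP=0, FP=0, FN=3, TN=7))  # (0.7, 0.0, 0.0, 0.0)
```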

Below we run a simple logistic regression example written by Daniel Godoy [here](https://towardsdatascience.com/understanding-binary-cross-entropy-log-loss-a-visual-explanation-a3ac6025181a).

In [7]:
from sklearn.linear_model import LogisticRegression
import numpy as np

In [8]:
x = np.array([-2.2, -1.4, -.8, .2, .4, .8, 1.2, 2.2, 2.9, 4.6])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])

logr = LogisticRegression(solver='lbfgs')
logr.fit(x.reshape(-1, 1), y)

y_pred = logr.predict_proba(x.reshape(-1, 1))[:, 1].ravel()

In [9]:
TP, FP, FN, TN = confusion_matrix(y, y_pred)
metrics(TP, FP, FN, TN)

Accuracy: 0.8
Precision: 0.8571428571428571
Recall: 0.8571428571428571
F1 measure: 0.8571428571428571


(0.8, 0.8571428571428571, 0.8571428571428571, 0.8571428571428571)
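
As a cross-check, the same quantities can be computed with the functions in `sklearn.metrics`. This is a sketch under one assumption: the probabilities are converted to hard labels with the same 0.5 threshold used in the hand-written `confusion_matrix` above, because scikit-learn's classification metrics expect class labels rather than probabilities. The names `y_true`, `y_prob` and `y_label` are introduced here for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix as sk_confusion_matrix,
                             f1_score, precision_score, recall_score)

x = np.array([-2.2, -1.4, -.8, .2, .4, .8, 1.2, 2.2, 2.9, 4.6])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
y_true = y.astype(int)

logr = LogisticRegression(solver='lbfgs')
logr.fit(x.reshape(-1, 1), y_true)
y_prob = logr.predict_proba(x.reshape(-1, 1))[:, 1]

# Threshold the probabilities at 0.5 to obtain hard class labels
y_label = (y_prob > 0.5).astype(int)

# For binary labels, ravel() yields the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = sk_confusion_matrix(y_true, y_label).ravel()
print("TP, FP, FN, TN:", tp, fp, fn, tn)

print("Accuracy:", accuracy_score(y_true, y_label))
print("Precision:", precision_score(y_true, y_label))
print("Recall:", recall_score(y_true, y_label))
print("F1 measure:", f1_score(y_true, y_label))
```

Note the different ordering: scikit-learn's matrix flattens to TN, FP, FN, TP, whereas the function above returns TP, FP, FN, TN.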

## Error Functions
The easiest way to use error functions is through the `metrics` module in scikit-learn. Here is some sample code you can use to calculate the log loss. Other error functions such as MSE and MAE are also implemented in scikit-learn. I recommend checking the API documentation to make sure the metric's formula is the one you expect.

In [10]:
from sklearn.metrics import log_loss

In [11]:
loss = log_loss(y, y_pred)
print('Log Loss is : ' + str(loss))

Log Loss is : 0.332912987074
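
One way to check that a metric's formula is the one you expect is to compute it by hand and compare. The sketch below does this for binary log loss, and also calls the MSE and MAE functions mentioned above. The labels and probabilities are hypothetical values chosen only for illustration.

```python
import numpy as np
from sklearn.metrics import log_loss, mean_absolute_error, mean_squared_error

# Hypothetical labels and predicted probabilities of the positive class
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8])

# Binary log loss: mean of -[y*log(p) + (1 - y)*log(1 - p)]
manual = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
print("Manual log loss: ", manual)
print("sklearn log loss:", log_loss(y_true, y_prob))

# MSE and MAE between the labels and the predicted probabilities
print("MSE:", mean_squared_error(y_true, y_prob))
print("MAE:", mean_absolute_error(y_true, y_prob))
```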
