# Evaluation

## Classification

### Accuracy

$$ Accuracy = \frac{(TP + TN)} {(TP + TN + FP + FN)} $$

### Precision

$$ Precision = \frac {(TP)} {(TP + FP)} $$

### Recall

$$Recall = \frac {(TP)} {(TP + FN)} $$
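
As a quick check of the three formulas above, the sketch below counts TP, TN, FP and FN by hand for the same toy labels used in the F1 cell. The helper name `confusion_counts` is hypothetical and only serves this illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 0, 1, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / (tp + tn + fp + fn)  # 4 / 6
precision = tp / (tp + fp)                  # 2 / 2
recall = tp / (tp + fn)                     # 2 / 4
```

Here the classifier never raises a false alarm (precision 1.0) but misses half of the positives (recall 0.5).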

### F1 Score

$$ F1 = 2 \cdot \frac{precision \cdot recall} {precision + recall} $$

In [2]:
from sklearn.metrics import f1_score
y_true = [0,1,1,0,1,1]
y_pred = [0,0,1,0,0,1]
f1_score(y_true, y_pred)

0.6666666666666666

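#### F-beta Score

F1 is the harmonic mean of precision and recall (see https://deepai.org/machine-learning-glossary-and-terms/harmonic-mean). The F-beta score generalizes it: $\beta > 1$ weights recall more heavily, $\beta < 1$ weights precision more heavily, and $\beta = 1$ gives F1.

$$ F_{\beta} = (1 + \beta^2) \cdot \frac{precision \cdot recall} {\beta^2 \cdot precision + recall} $$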

In [13]:
from sklearn.metrics import fbeta_score
fbeta_score(y_true, y_pred, beta=0.5)

0.8333333333333334
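
The two scikit-learn results above can be reproduced directly from the formula. This is a minimal sketch: `f_beta` is a hypothetical helper, and the precision and recall values are the ones the toy labels yield (2 true positives, 0 false positives, 2 false negatives).

```python
def f_beta(precision, recall, beta):
    """F-beta score computed straight from precision and recall."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


precision, recall = 1.0, 0.5  # from the toy labels: TP=2, FP=0, FN=2

f1 = f_beta(precision, recall, beta=1.0)    # matches f1_score above
f_half = f_beta(precision, recall, beta=0.5)  # matches fbeta_score above
f2 = f_beta(precision, recall, beta=2.0)    # recall-weighted variant
```

Because precision is higher than recall here, the score rises as beta shrinks and falls as beta grows.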

In [16]:
# fbeta_score?
# f1_score?

### Log Loss/Binary Cross Entropy

$$ -\left(y \log(p) + (1-y)\log(1-p)\right) $$

In [25]:
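from sklearn.metrics import log_loss

# y_pred holds hard 0/1 labels here, not probabilities, so each wrong
# prediction is clipped to eps and contributes -log(1e-15) to the mean.
# Note: the eps argument is deprecated in newer scikit-learn releases.
log_loss(y_true, y_pred, eps=1e-15)
# log_loss?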

11.512925464970229
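
Log loss is normally fed predicted probabilities rather than hard labels. The sketch below implements the formula with NumPy; `binary_cross_entropy` is a hypothetical helper, the probabilities in `y_prob` are made up for illustration, and the clipping constant mirrors the `eps=1e-15` used above.

```python
import numpy as np


def binary_cross_entropy(y_true, p, eps=1e-15):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    y = np.asarray(y_true, dtype=float)
    # Clip so that log(0) never occurs for fully confident predictions.
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))


y_true = [0, 1, 1, 0, 1, 1]
y_prob = [0.1, 0.4, 0.8, 0.2, 0.3, 0.9]  # made-up probabilities of class 1

loss_prob = binary_cross_entropy(y_true, y_prob)

# Hard labels as probabilities: the two wrong predictions dominate the mean.
loss_hard = binary_cross_entropy(y_true, [0, 0, 1, 0, 0, 1])
```

With hard labels the two misclassified samples each cost about 34.5, which is where the value of roughly 11.51 comes from.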

### Categorical Cross Entropy

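$$ Log loss = -\frac {1} {N}\sum_{i=1}^{N} \sum_{j=1}^{m} y_{ij} \log(p_{ij}) $$

Here $N$ is the number of samples, $m$ the number of classes, $y_{ij}$ is 1 if sample $i$ belongs to class $j$ (0 otherwise), and $p_{ij}$ is the predicted probability of class $j$ for sample $i$.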
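
A minimal NumPy sketch of the same sum, assuming one-hot targets with samples as rows (index i) and classes as columns (index j). The helper name `categorical_cross_entropy` and the example probabilities are made up for illustration.

```python
import numpy as np


def categorical_cross_entropy(y_onehot, p, eps=1e-15):
    """Average over samples of -sum_j y_ij * log(p_ij)."""
    y = np.asarray(y_onehot, dtype=float)
    # Clip from below so log(0) never occurs.
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    return float(-np.sum(y * np.log(p)) / y.shape[0])


# Three samples, three classes; each row of p sums to 1.
y_onehot = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]
p = [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1],
     [0.2, 0.3, 0.5]]

loss = categorical_cross_entropy(y_onehot, p)
```

Only the probability assigned to the true class of each sample contributes, so the loss here is the mean of -log(0.7), -log(0.8) and -log(0.5).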

### AUC

$$ sensitivity = TPR = Recall = \frac {TP} {TP + FN} $$

$$ 1 - specificity = FPR = \frac {FP} {TN +FP} $$
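
The ROC curve is traced by sweeping a decision threshold over the scores and recording (FPR, TPR) at each step, where TPR divides TP by all actual positives (TP + FN). The sketch below does this by hand for the four scores used in the next cell; `tpr_fpr` is a hypothetical helper, and the choice of `>=` for the threshold comparison is an assumption of this sketch.

```python
import numpy as np


def tpr_fpr(y_true, y_scores, threshold):
    """Return (TPR, FPR) when predicting positive for score >= threshold."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_scores) >= threshold).astype(int)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    return tp / (tp + fn), fp / (fp + tn)


y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

# One (TPR, FPR) point per distinct score, from strictest to loosest threshold.
thresholds = [0.8, 0.4, 0.35, 0.1]
points = [tpr_fpr(y_true, y_scores, t) for t in thresholds]
```

Lowering the threshold moves the operating point from the bottom-left of the ROC plot towards the top-right; the area under the resulting step curve is the AUC.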

In [10]:
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

y_true = np.array([0,0,1,1])
y_scores = np.array([0.1,0.4,0.35,0.8])
print(roc_auc_score(y_true, y_scores))
# roc_auc_score?
roc_curve?

0.75

In [None]:
# Hinge

In [None]:
# Huber

In [17]:
# Kullback-Leibler

In [18]:
# MAE (L1)

In [19]:
# MSE (L2)