Defining scoring strategies
===

Basic notation
---

* Functions whose names end in `_score` return a value to be maximized (higher is better).

* Functions whose names end in `_error` or `_loss` return a value to be minimized (lower is better).
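
The naming convention above can be checked with a small sketch. The toy data and the choice of `LinearRegression` are assumptions made only for illustration. The sketch compares an `_error` metric with its built-in scorer, which scikit-learn exposes under the name `"neg_mean_squared_error"` so that, like any scorer, a higher value is better.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import get_scorer, mean_squared_error, r2_score

# Toy regression data (hypothetical, for illustration only).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 2.0, 4.0])

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

mse = mean_squared_error(y, y_pred)  # *_error: lower is better
r2 = r2_score(y, y_pred)             # *_score: higher is better

# The built-in scorer reports the negated error, so that
# model selection can always maximize the scorer's value.
neg_mse = get_scorer("neg_mean_squared_error")(model, X, y)

print(mse, r2, neg_mse)
```

Here `neg_mse` should equal `-mse`, which is the same sign flip that `make_scorer(..., greater_is_better=False)` applies to a custom loss.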

Converting metrics into scoring functions with make_scorer
---

In [1]:
from sklearn.metrics import fbeta_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

ftwo_scorer = make_scorer(
    # -------------------------------------------------------------------------
    # Score function (or loss function) with signature
    # score_func(y, y_pred, **kwargs).
    score_func=fbeta_score,
    # -------------------------------------------------------------------------
    # Whether score_func is a score function (default), meaning high is good,
    # or a loss function, meaning low is good. In the latter case, the scorer
    # object will sign-flip the outcome of the score_func.
    greater_is_better=True,
    # -------------------------------------------------------------------------
    # Additional parameters to be passed to score_func
    beta=2,
)


grid = GridSearchCV(
    LinearSVC(),
    param_grid={"C": [1, 10]},
    scoring=ftwo_scorer,
    cv=5,
)
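
The cell above only builds the search object; nothing is fitted. As a minimal sketch of how it might be used, the block below fits an equivalent grid on a synthetic dataset. The dataset from `make_classification`, the `random_state` values, and the raised `max_iter` are assumptions added for reproducibility and convergence, not part of the original example.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import fbeta_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Synthetic binary classification data (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Same scorer as above: F-beta with beta=2, higher is better.
ftwo_scorer = make_scorer(fbeta_score, beta=2)

grid = GridSearchCV(
    LinearSVC(random_state=0, max_iter=10000),
    param_grid={"C": [1, 10]},
    scoring=ftwo_scorer,
    cv=5,
)
grid.fit(X, y)

# Best value of C and its mean cross-validated F2 score.
print(grid.best_params_)
print(grid.best_score_)
```

Because the scorer is built with the default `greater_is_better=True`, `best_score_` is the mean F2 score itself and lies between 0 and 1.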

In [2]:
import numpy as np
from sklearn.dummy import DummyClassifier


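def my_custom_loss_func(y_true, y_pred):
    # Log of (1 + largest absolute error). np.asarray lets plain
    # Python lists be passed for either argument.
    diff = np.abs(np.asarray(y_true) - np.asarray(y_pred)).max()
    return np.log1p(diff)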

#
# The scorer negates the return value of my_custom_loss_func because
# greater_is_better=False. For the X and y defined below, the loss is
# np.log(2), about 0.693, so the scorer returns about -0.693.
#
score = make_scorer(my_custom_loss_func, greater_is_better=False)

X = [[1], [1]]
y = [0, 1]

clf = DummyClassifier(
    strategy="most_frequent",
    random_state=0,
)

clf = clf.fit(X, y)

print(my_custom_loss_func(y, clf.predict(X)))
print(score(clf, X, y))

0.6931471805599453
-0.6931471805599453
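
The value in the output above can be reproduced by hand. The sketch below assumes the dummy classifier predicts class 0 for both samples. With `strategy="most_frequent"` it predicts a single class for every sample, and whichever class the tie resolves to, exactly one of the two samples is misclassified, so the result is the same.

```python
import numpy as np

y_true = np.array([0, 1])
# Assumed constant prediction; [1, 1] would give the same loss.
y_pred = np.array([0, 0])

diff = np.abs(y_true - y_pred).max()  # max(|0 - 0|, |1 - 0|) = 1
loss = np.log1p(diff)                 # log(1 + 1) = log(2)

print(loss)   # the raw loss
print(-loss)  # what a scorer built with greater_is_better=False reports
```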
