# SkPro Loss Function Interface

SkPro provides an object-oriented implementation of loss functions: each loss is a class whose instances can be called like a function.


In [4]:
#@title IMPORTS
import numpy as np

from sklearn.linear_model import LinearRegression

from skpro.workflow.manager import DataManager
from skpro.distributions.distribution_base import Mode
from skpro.baselines.classical_baselines import ClassicalBaseline
from skpro.metrics.classical_loss import SquaredError

data = DataManager('boston')

## 1. Classical Losses

The 'classical losses' map a vector of __scalar predictions__ and a vector of targets to a vector of real-valued losses. They all inherit from a common abstract base class, `LossFunction`, and implement a `__call__` method (i.e. they override the `()` operator) that returns the vector of losses, one entry per sample.

In [6]:
baseline =  ClassicalBaseline(LinearRegression()).fit(data.X_train, data.y_train)
estimator = LinearRegression().fit(data.X_train, data.y_train)

loss_func = SquaredError()
losses = loss_func(estimator.predict(data.X_test), data.y_test)

print('setting: ' + str(loss_func.type()))
print('mean squared error: ' + str(np.mean(losses)))

setting: classical
mean squared error: 33.44897999767653
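
The class structure described in this section can be sketched without skpro. The following is a minimal, dependency-free illustration and not skpro's actual source: the names `LossFunctionSketch` and `SquaredErrorSketch` are hypothetical, and plain Python lists stand in for NumPy arrays.

```python
from abc import ABC, abstractmethod
from statistics import mean


class LossFunctionSketch(ABC):
    """Hypothetical stand-in for an abstract loss base class."""

    @abstractmethod
    def type(self):
        """Return a label for the setting, e.g. 'classical'."""

    @abstractmethod
    def __call__(self, predictions, targets):
        """Return one loss value per (prediction, target) pair."""


class SquaredErrorSketch(LossFunctionSketch):
    """Element-wise squared error on scalar predictions."""

    def type(self):
        return "classical"

    def __call__(self, predictions, targets):
        # One loss per sample; aggregation is left to the caller.
        return [(p - t) ** 2 for p, t in zip(predictions, targets)]


loss_sketch = SquaredErrorSketch()
sketch_losses = loss_sketch([3.0, 5.0], [1.0, 5.0])
print(sketch_losses)        # [4.0, 0.0]
print(mean(sketch_losses))  # 2.0
```

Returning the per-sample vector rather than a single number keeps the aggregation step (mean, sum, or anything else) in the caller's hands, which is why the cell above applies `np.mean` explicitly.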


## 2. Probabilistic Losses

The 'probabilistic losses' map a vector of __distribution predictions__ (i.e. an skpro distribution object) and a vector of targets to a vector of real-valued losses. Like the classical losses, they inherit from the common abstract base class `LossFunction` and implement a `__call__` method (i.e. they override the `()` operator) that returns the vector of losses.

Note: the distribution mode must be set to `ELEMENT_WISE` before the loss is evaluated, so that the loss vector is computed on a one-for-one basis: the i-th predicted distribution is evaluated against the i-th target only.

In [10]:
from skpro.baselines.classical_baselines import ClassicalBaseline

from skpro.metrics.proba_loss_cont import LogLossClipped
from skpro.metrics.proba_scorer import ProbabilisticScorer

baseline =  ClassicalBaseline(LinearRegression()).fit(data.X_train, data.y_train)
dist = baseline.predict_proba(data.X_test) 
dist.setMode(Mode.ELEMENT_WISE)

loss_func = LogLossClipped(cap = np.exp(-23))
losses = loss_func(dist, data.y_test)

print('setting: ' + str(loss_func.type()))
print('mean log-loss: ' + str(np.mean(losses)))

setting: probabilistic
mean log-loss: 3.3375354432750513
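
To make the idea of a clipped log-loss concrete, here is a minimal sketch using only the standard library. It rests on two assumptions that the text above does not confirm: that each predictive distribution is Gaussian, and that clipping means replacing any density below `cap` with `cap` before taking the negative logarithm. The function names are hypothetical and this is not skpro's `LogLossClipped` implementation.

```python
import math


def normal_pdf(y, mean, std):
    """Density of a normal distribution N(mean, std^2) at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def clipped_log_loss_sketch(means, stds, targets, cap=math.exp(-23)):
    """Element-wise negative log-density, with densities floored at `cap`.

    Assumption: the i-th distribution is evaluated only at the i-th
    target, mirroring the one-for-one evaluation described above.
    """
    losses = []
    for m, s, y in zip(means, stds, targets):
        density = max(normal_pdf(y, m, s), cap)  # floor avoids log(0)
        losses.append(-math.log(density))
    return losses


# First target sits at the predicted mean; the second is far in the tail.
example_losses = clipped_log_loss_sketch([0.0, 0.0], [1.0, 1.0], [0.0, 100.0])
print(example_losses)
```

Under these assumptions, a cap of `exp(-23)` bounds every individual loss at roughly 23, so a single badly missed target cannot make the mean loss infinite.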


A `Scorer` can be used to simplify the procedure above. It is a wrapper class that evaluates the error directly, given a loss functor and a probabilistic estimator. The loss function is passed to the Scorer constructor.

Its `__call__` method returns the score. It takes four arguments: a probabilistic estimator, an array of test samples (`X`), an array of targets (`y`), and a string specifying the format of the output (`mode`, one of `'average'` or `'absolute'`).

In [11]:

scorer = ProbabilisticScorer(LogLossClipped(cap = np.exp(-23)))
score = scorer(estimator = baseline, X = data.X_test, y = data.y_test, mode = 'average')
print('mean log-loss from scorer: ' + str(score))


mean log-loss from scorer: 3.3375354432750513
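
The wrapper pattern behind the scorer can be sketched as follows. Everything here is hypothetical and self-contained: `ScorerSketch`, `ConstantGaussianEstimator`, and `neg_log_density` are illustrative names, not skpro classes. In particular, the text does not say what `'absolute'` returns; this sketch assumes it means the unaveraged sum of the per-sample losses.

```python
import math


def neg_log_density(predictions, targets):
    """Toy probabilistic loss: -log N(y | mean, std^2), one value per sample."""
    losses = []
    for (mean, std), y in zip(predictions, targets):
        z = (y - mean) / std
        losses.append(0.5 * z * z + math.log(std * math.sqrt(2.0 * math.pi)))
    return losses


class ConstantGaussianEstimator:
    """Toy estimator predicting the same (mean, std) pair for every sample."""

    def __init__(self, mean, std):
        self.mean, self.std = mean, std

    def predict_proba(self, X):
        return [(self.mean, self.std) for _ in X]


class ScorerSketch:
    """Bundles a loss callable and reduces its output to a single score."""

    def __init__(self, loss_func):
        self.loss_func = loss_func

    def __call__(self, estimator, X, y, mode="average"):
        losses = self.loss_func(estimator.predict_proba(X), y)
        if mode == "average":
            return sum(losses) / len(losses)
        if mode == "absolute":
            # Assumption: 'absolute' is taken to mean the plain sum.
            return sum(losses)
        raise ValueError("mode must be 'average' or 'absolute'")


toy_scorer = ScorerSketch(neg_log_density)
toy_estimator = ConstantGaussianEstimator(mean=0.0, std=1.0)
print(toy_scorer(toy_estimator, X=[[1.0], [2.0]], y=[0.0, 0.0], mode="average"))
```

The design choice illustrated is that the scorer owns the predict-then-evaluate sequence, so the caller passes the estimator and data once instead of handling the intermediate distribution object.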
