## Loss function:

This section shares my understanding of the following two formulas and why I think they are equivalent. The two formulas are:

$$ LogLoss = - \frac{1}{n} \sum_{i=1}^n [y_i \cdot \log_e\hat{y}_i + (1-y_i) \cdot \log_e(1-\hat{y}_i)],\ where\ \hat{y}_i = \frac{1}{1+\exp(-\phi(\textbf{w},\textbf{x}_i))}$$

vs

$$ Loss = \frac{1}{n} \sum_{i=1}^n \log(1+\exp(-\bar{y_i}\phi(\textbf{w},\textbf{x}_i)))$$


The two formulas are actually the same, provided the $\bar{y_i}$ in the loss function from the paper is related to $y_i$ as follows:  
$$\bar{y_i} = 
\begin{cases}
    1,\ if\ y_i=1\\
    -1,\ if\ y_i=0\\
\end{cases}$$

### More detailed derivation:

<img src="derivation.png">
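
In text form, the two cases can be sketched as follows. This assumes, as in the code below, that the predicted probability is the sigmoid of the model output, $\hat{y}_i = 1/(1+\exp(-\phi(\textbf{w},\textbf{x}_i)))$. The shorthand $\phi_i$ for $\phi(\textbf{w},\textbf{x}_i)$ is introduced here for brevity and is not notation from the paper.

$$
y_i = 1,\ \bar{y_i} = 1:\quad -\log_e\hat{y}_i = \log_e(1+\exp(-\phi_i)) = \log_e(1+\exp(-\bar{y_i}\phi_i))
$$

$$
y_i = 0,\ \bar{y_i} = -1:\quad -\log_e(1-\hat{y}_i) = -\log_e\frac{\exp(-\phi_i)}{1+\exp(-\phi_i)} = \log_e(1+\exp(\phi_i)) = \log_e(1+\exp(-\bar{y_i}\phi_i))
$$

In both cases the per-example term is identical, so averaging over the $n$ examples gives the same loss.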

### Test in code using Connor's example:

In [11]:
import numpy as np

In [35]:
def getCTR(s):
    """calculate CTR using model output"""
    return 1/(1+np.exp(-s))

In [36]:
def getLoss(labels, ctr):
    """calculate loss using Log Loss"""
    return -(labels * np.log(ctr)+(1-labels) * np.log(1-ctr))

In [42]:
def getLossPaper(labels, s):
    """calculate loss using the formula from the paper"""
    #map the labels, from y to yBar
    modified_labels = np.where(labels == 0, -1, labels)
    return np.log(1+np.exp(-modified_labels*s))

In [38]:
#manually input data (copying Connor's example)
s = np.asarray([-0.27,-0.55,-1.05,-0.93,0.05,-0.3,-0.93])
labels = np.asarray([1,0,0,1,1,0,1])

In [39]:
#calculate CTR and the two losses

ctr = getCTR(s)
loss = getLoss(labels, ctr)
lossPaper = getLossPaper(labels, s)

In [41]:
print("Model Output:", s)
print("Label:", labels)
print("CTR:", ctr)

print("="*20)

print("Log Loss:\n", loss)
print("Loss using formula from the paper:\n", lossPaper)
print("Means of the Losses", np.mean(loss), np.mean(lossPaper))

Model Output: [-0.27 -0.55 -1.05 -0.93  0.05 -0.3  -0.93]
Label: [1 0 0 1 1 0 1]
CTR: [0.4329071  0.36586441 0.2592251  0.28292471 0.5124974  0.42555748
 0.28292471]
====================
Log Loss:
 [0.83723214 0.45549248 0.30005848 1.26257444 0.66845965 0.55435524
 1.26257444]
Loss using formula from the paper:
 [0.83723214 0.45549248 0.30005848 1.26257444 0.66845965 0.55435524
 1.26257444]
Means of the Losses 0.7629638392999175 0.7629638392999174
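
The two means above differ only in the last printed digit, which is floating-point rounding rather than a real discrepancy. Below is a minimal sketch of a tolerance-based check on the same data. The names `y_bar`, `paper_loss`, and `losses_match` are chosen here for illustration, and `np.logaddexp(0, z)` is swapped in for `np.log(1 + np.exp(z))` as an assumption about what is a more numerically robust way to evaluate the same expression; it is not what the paper or the cells above use.

```python
import numpy as np

# Same data as in the cells above (Connor's example)
s = np.asarray([-0.27, -0.55, -1.05, -0.93, 0.05, -0.3, -0.93])
labels = np.asarray([1, 0, 0, 1, 1, 0, 1])

# Log Loss computed from the predicted CTR (sigmoid of the model output)
ctr = 1 / (1 + np.exp(-s))
log_loss = -(labels * np.log(ctr) + (1 - labels) * np.log(1 - ctr))

# Map labels from {0, 1} to {-1, +1}: y_bar = 2*y - 1
y_bar = 2 * labels - 1

# log(1 + exp(z)) evaluated as logaddexp(0, z) = log(exp(0) + exp(z))
paper_loss = np.logaddexp(0, -y_bar * s)

# Compare element-wise within floating-point tolerance
losses_match = np.allclose(log_loss, paper_loss)
print("Losses match:", losses_match)
```

For model outputs this small either form works; the `logaddexp` variant mainly matters when `s` is large in magnitude and `np.exp` would overflow.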
